
# Time Series Review

By Terry Griffin, 2014-06-28 19:57

1. Definitions

- Stationary Time Series: A time series is stationary if the properties of the process, such as the mean and variance, are constant throughout time.
  i. If the autocorrelation dies out quickly, the series should be considered stationary.
  ii. If the autocorrelation dies out slowly, this indicates that the process is non-stationary.

- Nonstationarity: A time series is nonstationary if the properties of the process are not constant throughout time.
  i. Unit Root Nonstationarity-
  ii. Random Walk with Drift-

- White Noise: A time series is called white noise if it is a sequence of independent and identically distributed random variables with finite mean and variance, usually denoted $WN(0, \sigma^2)$. White noise has autocovariance $\gamma(h) = 0$ for all $h \neq 0$.
- Backward Shift Operator: shorthand for shifting backward in the time series: $B Y_t = Y_{t-1}$ and $B^p Y_t = Y_{t-p}$.

2. Autocorrelation

- Measures the linear dependence, or correlation, between $r_t$ and $r_{t-l}$ (summarizes serial dependence):

  $\rho_l = \dfrac{\mathrm{Cov}(r_t, r_{t-l})}{\sqrt{\mathrm{Var}(r_t)\,\mathrm{Var}(r_{t-l})}} = \dfrac{\mathrm{Cov}(r_t, r_{t-l})}{\mathrm{Var}(r_t)}$

  where $\mathrm{Var}(r_t) = \mathrm{Var}(r_{t-1})$ for a weakly stationary process.

- A way to check randomness in the data.
- Lag 0 of the autocorrelation is 1 by definition.
  i. If the autocorrelation dies out slowly, this indicates that the process is non-stationary.
  ii. If all the ACFs are close to zero, then the series should be considered white noise.

- No Memory Series
  i. The autocorrelation function is zero.
- Short Memory Series
  i. The autocorrelation function decays exponentially as a function of lag.
- Long Memory Series
  i. The autocorrelation function decays at a polynomial rate.
  ii. The "differencing" exponent $d$ is between $-1/2$ and $1/2$.
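The two decay patterns above can be checked numerically. A minimal numpy sketch (the `acf` helper here is illustrative, not a library function) comparing white noise against a random walk:

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation function for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    denom = np.dot(x, x)  # n * sample variance
    return np.array([np.dot(x[:n - l], x[l:]) / denom for l in range(max_lag + 1)])

rng = np.random.default_rng(0)
noise = rng.normal(size=2000)   # white noise: ACF near zero beyond lag 0
walk = np.cumsum(noise)         # random walk: ACF dies out very slowly

acf_noise = acf(noise, 10)
acf_walk = acf(walk, 10)
```

The white-noise ACF drops to near zero immediately, while the random-walk ACF stays close to 1 at every lag shown, matching rules i and ii above.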

3. Partial Autocorrelation

- The correlation between observations $X_t$ and $X_{t+h}$ after removing the linear relationship of all observations that fall between $X_t$ and $X_{t+h}$.

  $r_t = \phi_{0,1} + \phi_{1,1} r_{t-1} + e_{1t}$
  $r_t = \phi_{0,2} + \phi_{1,2} r_{t-1} + \phi_{2,2} r_{t-2} + e_{2t}$
  $r_t = \phi_{0,3} + \phi_{1,3} r_{t-1} + \phi_{2,3} r_{t-2} + \phi_{3,3} r_{t-3} + e_{3t}$
  $\vdots$

  Each $\hat{\phi}_{p,p}$ is the lag-$p$ PACF.

- The PACF shows the added contribution of $r_{t-p}$ to predicting $r_t$.
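The successive-regressions definition can be sketched directly: each $\hat{\phi}_{p,p}$ is the last OLS coefficient when $r_t$ is regressed on its first $p$ lags. A hypothetical `pacf_via_regressions` helper, applied to a simulated AR(1):

```python
import numpy as np

def pacf_via_regressions(x, max_lag):
    """PACF: phi_{p,p} = last slope in the OLS regression of x_t on x_{t-1..t-p}."""
    x = np.asarray(x, dtype=float)
    out = []
    for p in range(1, max_lag + 1):
        y = x[p:]
        # design matrix: [1, x_{t-1}, ..., x_{t-p}]
        cols = [np.ones(len(y))] + [x[p - i:len(x) - i] for i in range(1, p + 1)]
        X = np.column_stack(cols)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        out.append(beta[-1])  # phi_{p,p}
    return np.array(out)

rng = np.random.default_rng(1)
e = rng.normal(size=3000)
ar1 = np.zeros(3000)
for t in range(1, 3000):
    ar1[t] = 0.6 * ar1[t - 1] + e[t]

pacf = pacf_via_regressions(ar1, 5)
```

For this AR(1), the lag-1 PACF is close to 0.6 and the PACF is close to zero at every higher lag, consistent with the cutoff property discussed in section 8.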

4. Diagnostics and Model Selection

- Residual Diagnostics
  i. The residuals should be stationary white noise.
  ii. The ACF and PACF of the residuals should all be zero.
    a. If there is long memory in the residuals, then the assumptions are violated (nonstationarity of the residuals).

- AIC (Akaike's Information Criterion)
  i. A measure of fit plus a penalty term for the number of parameters.
  ii. Corrected AIC: a stronger penalty term; makes a difference with smaller sample sizes.
  iii. Choose the model that minimizes this adjusted measure of fit.
  iv. $AIC_k = \log(\hat{\sigma}_k^2) + 2k/T$, where $\hat{\sigma}_k^2$ is the MLE estimate of the noise variance, $T$ is the sample size, and $k$ is the number of parameters in the model.

- Portmanteau Test
  i. Tests whether the first $m$ correlations are zero vs. the alternative that at least one differs from zero.
  ii. Based on the sum of the first $m$ squared correlation coefficients.
  iii. $H_0: \rho_1 = \cdots = \rho_m = 0$ vs. $H_a: \rho_i \neq 0$ for some $i \in \{1, \dots, m\}$, where $\rho_i$ is the lag-$i$ autocorrelation.
  iv. Box and Pierce:

    $Q^*(m) = T \sum_{l=1}^{m} \hat{\rho}_l^2$

    $Q^*(m)$ is asymptotically a chi-squared random variable with $m$ degrees of freedom.
  v. Ljung and Box:

    $Q(m) = T(T+2) \sum_{l=1}^{m} \dfrac{\hat{\rho}_l^2}{T-l}$

    Modified Box & Pierce statistic to increase power.
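The Ljung-Box statistic is straightforward to compute by hand. A minimal sketch (the `ljung_box` helper is illustrative; scipy is used only for the chi-squared tail probability):

```python
import numpy as np
from scipy.stats import chi2

def ljung_box(x, m):
    """Q(m) = T(T+2) * sum_{l=1}^m rho_l^2 / (T-l); ~ chi2(m) under H0."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    rho = np.array([np.dot(xc[:T - l], xc[l:]) / denom for l in range(1, m + 1)])
    Q = T * (T + 2) * np.sum(rho**2 / (T - np.arange(1, m + 1)))
    return Q, chi2.sf(Q, df=m)  # statistic and p-value

rng = np.random.default_rng(2)
white = rng.normal(size=500)
ar = np.zeros(500)
for t in range(1, 500):
    ar[t] = 0.8 * ar[t - 1] + rng.normal()

Q_white, p_white = ljung_box(white, 10)  # white noise: should not reject H0
Q_ar, p_ar = ljung_box(ar, 10)           # serially correlated: should reject H0
```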

- Unit Root Test
  i. Derived in 1979 by Dickey and Fuller to test the presence of a unit root vs. a stationary process.
  ii. $p_t = \phi_1 p_{t-1} + e_t$ or, with drift, $p_t = \phi_0 + \phi_1 p_{t-1} + e_t$

    If $\phi_1 = 1$ then the series is said to have a unit root and is not stationary. The unit root test determines if $\hat{\phi}_1$ is significantly close to 1.

    $H_0: \phi_1 = 1$ vs. $H_A: \phi_1 < 1$
  iii. The behavior of the test statistic differs depending on whether it is a random walk with drift or a random walk without drift.

5. Unit Root Nonstationary Process

- Random Walk
  i. The equation for a random walk is $p_t = p_{t-1} + a_t$, where $p_0$ denotes the starting value and $a_t$ is white noise.
  ii. A random walk is not predictable and thus cannot be forecasted.
  iii. All forecasts of a random-walk model are simply the value of the series at the forecast origin.
  iv. The series has a strong memory.

- Random Walk with Drift
  i. $p_t = \mu + p_{t-1} + a_t$, where $\mu = E(p_t - p_{t-1})$

    $p_1 = \mu + p_0 + a_1$
    $p_2 = \mu + p_1 + a_2 = 2\mu + p_0 + a_2 + a_1$
    $\vdots$
    $p_t = t\mu + p_0 + a_t + a_{t-1} + \cdots + a_1$

    A positive $\mu$ implies that the series eventually goes to infinity.
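The accumulation $p_t = t\mu + p_0 + \sum_i a_i$ is easy to simulate. A sketch with assumed values $\mu = 0.05$ and $p_0 = 10$:

```python
import numpy as np

rng = np.random.default_rng(3)
mu, p0, T = 0.05, 10.0, 5000
a = rng.normal(size=T)            # white noise shocks a_t
p = p0 + np.cumsum(mu + a)        # p_t = t*mu + p_0 + a_1 + ... + a_t

# the drift mu is the expected one-step change E(p_t - p_{t-1})
avg_increment = np.diff(np.concatenate([[p0], p])).mean()
```

The average one-step increment estimates the drift $\mu$, and with positive drift the simulated path trends upward over time.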

6. Differencing

- Reasons why the difference is taken:
  i. To transform non-stationary data into a stationary time series.
  ii. To remove seasonal trends:
    a. take the 4th difference for quarterly data
    b. take the 12th difference for monthly data
- First Difference: The first difference of a time series is $z_t = y_t - y_{t-1}$.
  i. A way to handle strong serial correlation in the ACF is to take the first difference.
- Second Difference: The second difference is $z_t = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2})$.
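Both differences are one-liners with `numpy.diff`; a small worked example:

```python
import numpy as np

y = np.array([3.0, 5.0, 8.0, 12.0, 17.0])

z1 = np.diff(y)        # first difference:  z_t = y_t - y_{t-1}
z2 = np.diff(y, n=2)   # second difference: (y_t - y_{t-1}) - (y_{t-1} - y_{t-2})

# seasonal differencing for quarterly data would be y[4:] - y[:-4]
```

Here the first differences are 2, 3, 4, 5 (a linear trend in the differences), and the second differences are constant at 1.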

7. Log Transformation

- Reasons to take the log transformation:
  i. Used to handle exponential growth of a series.
  ii. Used to stabilize the variability.
- Values must all be positive before the log is taken.
  i. If not all values are positive, a positive constant can be added to every data point.
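A sketch of both points; the `safe_log` helper and its `eps` shift constant are assumptions for illustration, not a standard routine:

```python
import numpy as np

def safe_log(x, eps=1.0):
    """Log transform; shift by a constant first if any values are non-positive."""
    x = np.asarray(x, dtype=float)
    if (x <= 0).any():
        x = x + (eps - x.min())   # smallest value becomes eps > 0
    return np.log(x)

exp_growth = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
linear = safe_log(exp_growth)                    # exponential growth becomes linear in logs
shifted = safe_log(np.array([-1.0, 0.0, 1.0]))   # non-positive values handled by shifting
```

A series that doubles each period has log differences that are constant at $\log 2$, which is why the log transform turns exponential growth into a linear trend.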

8. Autoregressive Model

- A regression model in which $r_t$ is predicted using past values $r_{t-1}, r_{t-2}, \dots$
  i. AR(1): $r_t = \phi_0 + \phi_1 r_{t-1} + a_t$, where $a_t$ is a white noise series with zero mean and constant variance.
  ii. AR(p): $r_t = \phi_0 + \phi_1 r_{t-1} + \cdots + \phi_p r_{t-p} + a_t$
- Necessary and sufficient condition for weak stationarity of an AR model:
  i. All of its characteristic roots must be less than 1 in modulus.

- ACF for the Autoregressive Model
  i. The ACF decays exponentially to zero.
    1) For $\phi_1 > 0$, the plot of the ACF for AR(1) should decay exponentially.
    2) For $\phi_1 < 0$, the plot should consist of two alternating exponential decays with rate $\phi_1^2$.
  ii. The ACF for AR(1) satisfies $\rho_l = \phi_1 \rho_{l-1}$, and because $\rho_0 = 1$, $\rho_l = \phi_1^l$. So the ACF for the AR(1) should decay to zero exponentially with rate $\phi_1$, starting at $\rho_0 = 1$.

- PACF for the Autoregressive Model
  i. The PACF is zero after the lag of the AR process.
  ii. $\hat{\phi}_{l,l}$ converges to zero for all $l > p$. Thus for AR(p) the PACF cuts off at lag $p$.
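The relation $\rho_l = \phi_1^l$ can be verified by simulation; a sketch with $\phi_1 = 0.7$:

```python
import numpy as np

rng = np.random.default_rng(4)
phi, n = 0.7, 20000
e = rng.normal(size=n)
r = np.zeros(n)
for t in range(1, n):
    r[t] = phi * r[t - 1] + e[t]   # AR(1): r_t = phi_1 * r_{t-1} + a_t

rc = r - r.mean()
denom = np.dot(rc, rc)
acf_ar = np.array([np.dot(rc[:n - l], rc[l:]) / denom for l in range(6)])
# theory: rho_l = phi^l, so the ACF decays exponentially at rate phi
```

The sample ACF at lags 1 through 5 sits close to $0.7, 0.49, 0.343, \dots$, the exponential decay predicted above.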

9. Moving Average Model

- A linear regression of the current value of the series against the white noise or random shocks of one or more prior values of the series.
  i. MA(1): $X_t = \mu + a_t - \theta_1 a_{t-1}$, where $\mu$ is the mean of the series, the $a_t$ are white noise, and $\theta_1$ is a model parameter.
- The MA model is always stationary, as it is a linear function of uncorrelated or independent random variables.
- The first two moments are time-invariant.
- An MA model can be viewed as an infinite-order AR model.

- ACF for the Moving Average Model
  i. The ACF is zero after the largest lag of the process.
- PACF for the Moving Average Model
  i. The PACF decays to zero.
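The ACF cutoff is easy to see by simulation. A sketch of a zero-mean MA(1) with $\theta_1 = 0.8$; under the $X_t = a_t - \theta_1 a_{t-1}$ convention, the lag-1 autocorrelation is $-\theta_1/(1+\theta_1^2)$:

```python
import numpy as np

rng = np.random.default_rng(5)
theta, n = 0.8, 20000
a = rng.normal(size=n + 1)
x = a[1:] - theta * a[:-1]   # MA(1): X_t = a_t - theta_1 * a_{t-1}  (mu = 0)

xc = x - x.mean()
denom = np.dot(xc, xc)
acf_ma = np.array([np.dot(xc[:n - l], xc[l:]) / denom for l in range(4)])
# theory: rho_1 = -theta/(1 + theta^2), rho_l = 0 for l > 1
```

The sample ACF is near $-0.49$ at lag 1 and near zero at lags 2 and 3: the ACF cuts off after lag $q = 1$.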

10. ARMA [p,q]

- The series $r_t$ is a function of past values plus current and past values of the noise.
- Combines an AR(p) model with an MA(q) model.
- The equation for an ARMA(1,1) is $r_t = \phi_1 r_{t-1} + a_t - \theta_1 a_{t-1}$
- ACF for ARMA
  i. The ACF begins to decay exponentially to zero after the largest lag of the MA component.

11. ARIMA

- $r_t$ is an ARIMA model if the first difference of $r_t$ is an ARMA model.
- In an ARMA model, if the AR polynomial has 1 as a characteristic root, then the model is an ARIMA.
- It is unit-root nonstationary because its AR polynomial has a unit root.
- ARIMA has strong memory.

12. ARFIMA

- A process is a fractional ARMA (ARFIMA) process if the fractionally differenced series follows an ARMA(p,q) process. Thus if the series $(1-B)^d x_t$ follows an ARMA(p,q) model, then the series $x_t$ is an ARFIMA(p,d,q).

13. Forecasting

- The multistep forecast converges to the mean of the series, and the variances of the forecast errors converge to the variance of the series.
- For the AR Model
  i. The 1-step-ahead forecast is the conditional expectation

    $\hat{r}_h(1) = E(r_{h+1} \mid r_h, r_{h-1}, \dots) = \phi_0 + \sum_{i=1}^{p} \phi_i r_{h+1-i}$

  ii. For the multistep-ahead forecast:

    $\hat{r}_h(l) = \phi_0 + \sum_{i=1}^{p} \phi_i \hat{r}_h(l-i)$, where $\hat{r}_h(l-i) = r_{h+l-i}$ if $l - i \le 0$

  iii. The forecast error for 1 step ahead: $e_h(1) = r_{h+1} - \hat{r}_h(1) = a_{h+1}$
  iv. Mean reverting: for a stationary AR(p) model, long-term point forecasts approach the unconditional mean. Also, the variance of the forecast error approaches the unconditional variance of $r_t$.

- For the MA Model
  i. Because the model has finite memory, its point forecasts go to the mean of the series quickly.
  ii. The 1-step-ahead forecast for MA(1) is the conditional expectation

    $\hat{r}_h(1) = E(r_{h+1} \mid r_h, r_{h-1}, \dots) = c_0 - \theta_1 a_h$

    The 2-step-ahead forecast for MA(1):

    $\hat{r}_h(2) = E(r_{h+2} \mid r_h, r_{h-1}, \dots) = c_0$

  iii. For an MA(q) model, the multistep-ahead forecasts go to the mean after the first $q$ steps.
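Mean reversion of AR forecasts can be seen by iterating the recursion $\hat{r}_h(l) = \phi_0 + \phi_1 \hat{r}_h(l-1)$. A sketch with assumed values $\phi_0 = 0.5$, $\phi_1 = 0.8$:

```python
phi0, phi1 = 0.5, 0.8
mean = phi0 / (1 - phi1)   # unconditional mean of the AR(1) = 2.5

r_h = 10.0                 # last observed value at forecast origin h
forecasts = []
f = r_h
for l in range(1, 31):
    f = phi0 + phi1 * f    # r_hat_h(l) = phi_0 + phi_1 * r_hat_h(l-1)
    forecasts.append(f)
# multistep forecasts are mean reverting: they converge to phi0 / (1 - phi1)
```

Starting far above the mean at 10, the 1-step forecast is 8.5 and by 30 steps the forecast has essentially reached the unconditional mean 2.5.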

14. Spectral Density

- A way of representing a time series in terms of harmonic components at various frequencies. Tells the dominant cycles or periods in the series.
- Spectral density is only appropriate for stationary time series data.
- A periodogram value at a particular frequency $\omega$ is proportional to the squared amplitude of the corresponding cosine wave, $a\cos(\omega t) + b\sin(\omega t)$, fitted to the data using least squares.
- For a covariance stationary time series (CSTS) with autocovariance function $\gamma(v)$, $v = 0, \pm 1, \pm 2, \dots$, the spectral density is given by

  $f(\nu) = \sum_{h=-\infty}^{\infty} \gamma(h) e^{-2\pi i \nu h}$, where $\nu \in [-1/2, 1/2]$

  with the inverse relation $\gamma(h) = \int_{-1/2}^{1/2} f(\nu) e^{2\pi i \nu h}\, d\nu$.
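A periodogram can be computed from the discrete Fourier transform. This sketch recovers a dominant cycle at frequency 0.125 (period 8) from a noisy sinusoid:

```python
import numpy as np

n = 1024
t = np.arange(n)
x = np.sin(2 * np.pi * 0.125 * t) + 0.1 * np.random.default_rng(6).normal(size=n)

# periodogram: squared magnitude of the DFT at the Fourier frequencies k/n
spec = np.abs(np.fft.rfft(x - x.mean()))**2 / n
freqs = np.fft.rfftfreq(n)
dominant = freqs[np.argmax(spec)]   # should recover the cycle at frequency 0.125
```

The largest periodogram ordinate lands exactly at 0.125, the frequency of the injected cycle, illustrating how the spectrum reveals dominant periods.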

15. VaR Value at Risk

- Estimates the amount by which an institution's position in a risk category could decline due to general market movements during a given holding period.
- Concerned with market risk.
- In practice, used to assess risk or set margin requirements.
  i. Ensures that financial institutions can still be in business after a catastrophic event.
- Determined via forecasting.
- If multivariate:
  i. $VaR = \sqrt{VaR_1^2 + VaR_2^2 + 2\rho_{12}\, VaR_1\, VaR_2}$
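The two-position formula above, wrapped in a hypothetical `combined_var` helper: with perfect correlation the risks simply add, while with zero correlation diversification reduces the combined risk.

```python
import math

def combined_var(var1, var2, rho):
    """Two-position VaR: sqrt(VaR1^2 + VaR2^2 + 2*rho*VaR1*VaR2)."""
    return math.sqrt(var1**2 + var2**2 + 2 * rho * var1 * var2)

full = combined_var(3.0, 4.0, 1.0)    # perfectly correlated: 3 + 4 = 7
indep = combined_var(3.0, 4.0, 0.0)   # uncorrelated: sqrt(9 + 16) = 5
```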

16. VAR Vector Autoregressive Model

- A vector model used for multivariate time series.
  i. VAR(1): $\mathbf{r}_t = \boldsymbol{\phi}_0 + \Phi \mathbf{r}_{t-1} + \mathbf{a}_t$, where $\boldsymbol{\phi}_0$ is a $k$-dimensional vector, $\Phi$ is a $k \times k$ matrix, and $\{\mathbf{a}_t\}$ is a sequence of serially uncorrelated random vectors with mean zero and covariance matrix $\Sigma$, positive definite.
  ii. VAR(p): $\mathbf{r}_t = \boldsymbol{\phi}_0 + \Phi_1 \mathbf{r}_{t-1} + \cdots + \Phi_p \mathbf{r}_{t-p} + \mathbf{a}_t$
- Can also model VMA and VARMA models.
  i. One issue: VARMA has an identifiability problem (i.e., it may not be uniquely defined).
  ii. When VARMA models are used, you should only entertain lower-order models.
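A sketch of a bivariate VAR(1) with assumed coefficients; the VAR(1) is stationary when every eigenvalue of $\Phi$ lies inside the unit circle, and the sample mean then approaches $(I - \Phi)^{-1}\boldsymbol{\phi}_0$:

```python
import numpy as np

phi0 = np.array([0.1, 0.2])
Phi = np.array([[0.5, 0.1],
                [0.2, 0.3]])   # k x k coefficient matrix (k = 2)

rng = np.random.default_rng(7)
T = 5000
r = np.zeros((T, 2))
for t in range(1, T):
    r[t] = phi0 + Phi @ r[t - 1] + rng.normal(size=2)   # a_t ~ N(0, I)

# implied stationary mean: (I - Phi)^{-1} phi0
model_mean = np.linalg.solve(np.eye(2) - Phi, phi0)
```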

17. Volatility Models

- ARCH
  i. Only an AR term.
  ii. $a_t = \sigma_t \epsilon_t$; ARCH(m): $\sigma_t^2 = \alpha_0 + \alpha_1 a_{t-1}^2 + \cdots + \alpha_m a_{t-m}^2$
  iii. Weaknesses:
    a. Assumes positive and negative shocks have the same effect on volatility (i.e., uses the square of previous shocks); in practice negative shocks (i.e., "bad news") have a larger impact on volatility than positive shocks (i.e., "good news"), a "leverage" effect the model cannot capture.
    b. The model is restrictive (see p. 86, 3.3.2(2)).
    c. Only describes the behavior of the conditional variance; does not explain the source of the variations.
    d. Likely to over-predict the volatility, since it responds slowly to large isolated shocks to the return series.

- GARCH (generalized ARCH)
  i. The mean structure can be described by an ARMA model.
  ii. $a_t = \sigma_t \epsilon_t$; GARCH(m,s): $\sigma_t^2 = \alpha_0 + \sum_{i=1}^{m} \alpha_i a_{t-i}^2 + \sum_{j=1}^{s} \beta_j \sigma_{t-j}^2$
  iii. Same weaknesses as ARCH.
  iv. If the AR component has a unit root, then we have an IGARCH model (i.e., Integrated GARCH, a.k.a. unit-root GARCH model).
  v. EGARCH (i.e., Exponential GARCH) allows for asymmetric effects between positive and negative asset returns. Models the log of the conditional variance as an ARMA. PRO: variances are guaranteed to be positive.

- GARCH-M (GARCH in mean)
  i. Used when the return of a security depends on its volatility.
  ii. GARCH(1,1)-M: $r_t = \mu + c\sigma_t^2 + a_t$, $a_t = \sigma_t \epsilon_t$, $\sigma_t^2 = \alpha_0 + \alpha_1 a_{t-1}^2 + \beta_1 \sigma_{t-1}^2$, where $\mu$ and $c$ are constants. A positive $c$ indicates that the return is positively related to its past volatility.
  iii. Cross-correlation: the squared-return series correlated against the return series; used to determine whether there is a volatility effect in the mean structure.

- Alternative GARCH models
  1) CHARMA: Conditional Heteroscedastic ARMA; uses random coefficients to produce conditional heteroscedasticity.
  2) RCA: Random Coefficient Autoregressive model; accounts for variability among different subjects under study. Better suited for modeling the conditional mean, as it allows the parameters to evolve over time.
  3) SV: Stochastic Volatility model; similar to an EGARCH but incorporates an innovation into the conditional variance equation.
  4) LMSV: Long-Memory SV model; allows for long memory in the volatility.

NOTE: Differencing ONLY affects the mean structure; the log transformation affects the volatility structure.
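The GARCH(1,1) recursion above is easy to simulate. A sketch with assumed parameters $\alpha_0 = 0.1$, $\alpha_1 = 0.1$, $\beta_1 = 0.8$, whose unconditional variance is $\alpha_0/(1-\alpha_1-\beta_1) = 1$:

```python
import numpy as np

alpha0, alpha1, beta1 = 0.1, 0.1, 0.8   # alpha1 + beta1 < 1: finite unconditional variance
rng = np.random.default_rng(8)
T = 50000
a = np.zeros(T)
sig2 = np.zeros(T)
sig2[0] = alpha0 / (1 - alpha1 - beta1)  # start at the unconditional variance
for t in range(1, T):
    sig2[t] = alpha0 + alpha1 * a[t - 1]**2 + beta1 * sig2[t - 1]
    a[t] = np.sqrt(sig2[t]) * rng.normal()   # a_t = sigma_t * eps_t

uncond_var = alpha0 / (1 - alpha1 - beta1)   # = 1.0
```

The simulated shocks show volatility clustering while their long-run sample variance stays near the unconditional value of 1.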

18. MCMC Methods (Markov Chain Monte Carlo)

- Markov chain simulation creates a Markov process on $\theta$ which converges to a stationary transition distribution $P(\theta \mid X)$.
- GIBBS SAMPLING (p. 397)
  o Likelihood unknown; conditional distributions known.
  o Need starting values.
  o Sampling from the conditional distributions converges to sampling from the joint distribution.
  o PRO: Compared to other MCMC methods, Gibbs can decompose a high-dimensional estimation problem into several lower-dimensional ones.
  o CON: When parameters are highly correlated, you should draw them jointly.
  o In practice, repeat several times with different starting values to ensure the algorithm has converged.
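A classic sketch of the idea: Gibbs sampling for a bivariate normal with correlation $\rho$, where each full conditional is a univariate normal (the target distribution here is an assumption chosen for illustration):

```python
import numpy as np

rho = 0.8
rng = np.random.default_rng(9)
n_iter, burn = 20000, 1000
x, y = 0.0, 0.0                   # starting values
draws = np.empty((n_iter, 2))
for i in range(n_iter):
    x = rng.normal(rho * y, np.sqrt(1 - rho**2))   # x | y ~ N(rho*y, 1 - rho^2)
    y = rng.normal(rho * x, np.sqrt(1 - rho**2))   # y | x ~ N(rho*x, 1 - rho^2)
    draws[i] = (x, y)
samples = draws[burn:]            # discard burn-in
```

Alternating draws from the two one-dimensional conditionals reproduces the joint distribution: the retained samples have correlation near 0.8 and mean near zero.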

- BAYESIAN INFERENCE (p. 400)
  o Combines prior belief with data to obtain posterior distributions on which statistical inference is based.
