Time Series Analysis (STAT 569 Lecture Notes)
July 14, 2011
These are class notes for certain topics of the core course STAT 569 offered to the students of M.Sc. (Ag.) Statistics & Mathematics, prepared by Dr. B.S. Kulkarni.
Course No.: STAT 569
Components of Time Series: It is assumed that an observation recorded at any moment of time t, i.e., Ot, is the result (outcome) of mainly three factor-effects, which are referred to as the components of the time series. These are:
1. Trend (T);
Periodic changes, consisting of:
2. Seasonal (S) and
3. Cyclic (C) variations; and
4. Irregular (I) or Random variations.
Characteristics of the Components:
Trend:
It is the general tendency of the data to increase / decrease during a long period of time.
It is the general smooth and long term average tendency. It is not necessary that the increase or
decrease should be in the same direction throughout a given period.
Not every time series need exhibit an upward / downward trend; a series may fluctuate around a constant value (oscillatory type).
The trend may be linear or non-linear type. This can be identified only through a line-graph of the
series.
What counts as "long term" depends on the type of data. In certain situations, a period as small as a week may be fairly long, while in others even a period of 2 years may be very small.
In the context of agricultural data on crop production, the period has to be well over 10 years; while in bacterial studies, counts recorded every 5 minutes over a week may already exhibit a trend.
Periodic Changes: These are classified into Seasonal and Cyclic changes:
Seasonal Changes:
These are short-term changes that operate in a regular and periodic manner within a year; i.e., the period is at most 12 months and the pattern is almost the same every year.
These changes may occur due to natural causes such as climatic / seasonal factors (sale of cold drugs such as Vicks and of rain-coats during the rainy season, or sale of air-coolers / air-conditioners during summer).
These changes may also occur due to man-made (human) causes: habits / customs / conventions of people (sale of jewelry during the marriage season, sale of crackers during Deepawali, etc.).
Cyclic Changes:
The oscillatory movements in a time series with a period of more than one year are termed cyclic variations.
One complete period is a Cycle.
The period (cycle length) varies depending on the data (weather cycle: drought every 5-10 years; business cycle: recession).
The cycle has up-swings and down-swings, commonly referred to as peaks and troughs.
Irregular / Random Changes:
These variations do not belong to any of the above categories. The fluctuations are of a random (erratic) or non-recurring type.
E.g., political instability, natural calamities (earthquakes, floods, etc.).
Analysis of Time Series Data:
Objective: to identify the components affecting the time series and to measure their extent / contribution.
The approach involved in the measurement is isolation of each component in turn.
Models for Measuring the Contribution of Components:
1. Additive Model: O = T + S + C + I
It is assumed that the components affect the observations of the time series independently.
2. Multiplicative (Product) Model: O = T x S x C x I
It is assumed that the components affect the observations of the time series jointly. This model is relatively preferred.
In the additive model it is assumed that the trend and periodic components are determined by separate forces acting independently, so that simple aggregation constitutes the series. However, it is possible that the observation during a year may largely depend on its value in the previous year (lag effect). This is a common phenomenon in economic data.
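As a quick illustration of the two models, the sketch below decomposes a hypothetical monthly series under each assumption using the seasonal_decompose routine of the statsmodels package; the series, its period of 12, and all data values are illustrative assumptions, not from these notes:

```python
# A minimal sketch contrasting the additive and multiplicative models,
# assuming a hypothetical monthly series held in a pandas Series.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

idx = pd.date_range("2000-01", periods=120, freq="MS")
rng = np.random.default_rng(0)
sales = pd.Series(100 + 0.5 * np.arange(120)                      # trend
                  + 10 * np.sin(2 * np.pi * np.arange(120) / 12)  # seasonality
                  + rng.normal(0, 2, 120), index=idx)             # irregular

add = seasonal_decompose(sales, model="additive", period=12)        # O = T + S + I
mult = seasonal_decompose(sales, model="multiplicative", period=12) # O = T x S x I

print(add.seasonal.head(12))   # seasonal component in the units of the data
print(mult.seasonal.head(12))  # seasonal component as ratios around 1
```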
Measurement of Trend: There are no valid automatic techniques for identifying the trend component in the data. However, as long as the trend is monotonic, i.e., consistently increasing or decreasing, the identification and hence the measurement is not difficult. Generally, the trend in time series data is masked by year-to-year fluctuations. Hence the fundamental step in the identification of trend is smoothening.
Smoothening: It involves some form of local averaging of the data such that the non-systematic
components of the individual observations cancel out. The most commonly applied technique is the moving
average smoothening.
Patterns of Trend: The smoothed series may exhibit a variety of patterns, which can be described with a mathematical function. The trend may be linear or non-linear. The most common functional forms of the trend are:
Linear: Yt = a + b t
Polynomial (e.g., quadratic): Yt = a + b t + c t²
Exponential: Yt = a b^t
Modified Exponential: Yt = a + b c^t
Gompertz: Yt = a b^(c^t)
Logistic: Yt = K / (1 + a e^(-b t))
The following methods are generally applied for measuring the trend:
1. Graphical Method
2. Method of Semi-Averages
3. Method of Moving-Averages
4. Method of Curve-fitting
1. Graphical Method:
This method involves obtaining a line-graph of the data and then drawing a smooth freehand curve through it, extended in either direction. The points on this smooth curve determine the trend.
The smoothened graph eliminates the variations due to periodic and irregular components.
It involves no computations. Simplicity in the approach is its greatest advantage.
It is applied for preliminary study of the trend.
2. Method of Semi-Averages:
This method involves dividing the data (series) into two equal groups and computing the semi-averages (means) of these groups (a sketch follows this list).
These semi-averages are then plotted against their corresponding (centered) time values and are joined with a line. This line is then extended either way for forecasting / estimation purposes.
The semi-averages smoothen the fluctuations.
Suitable only when the trend is of linear type.
Not suitable when the number of observations in the series is odd, as the two groups cannot then contain equal numbers of observations.
Not suitable when the number of observations within each group is even, as this creates a centering (location) problem for the semi-averages.
Not suitable when the fluctuations within the groups are heterogeneous.
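A minimal sketch of the semi-average computation, assuming an even-length hypothetical series:

```python
# Method of semi-averages: split the series into two equal halves, average
# each half, and pass a straight line through the two semi-averages.
import numpy as np

y = np.array([52., 55., 59., 60., 64., 67., 71., 74.])  # hypothetical series
t = np.arange(1, len(y) + 1)

half = len(y) // 2
sa1, sa2 = y[:half].mean(), y[half:].mean()   # the two semi-averages
t1, t2 = t[:half].mean(), t[half:].mean()     # their centered time values

b = (sa2 - sa1) / (t2 - t1)                   # slope of the trend line
a = sa1 - b * t1                              # intercept
print(f"Trend line: Y = {a:.2f} + {b:.2f} t") # extend either way to forecast
```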
3. Method of Moving Averages:
This method involves smoothing the fluctuations by computing the moving averages.
These moving averages are then plotted against their corresponding centered time values.
Moving Average refers to average of successive (time dependent) observations of a period. This
period has to be carefully defined on the basis of nature (period) of the year-to-year fluctuations.
Generally, the period for computing the moving average is odd number of years- i.e., 3 or 5 years.
The odd number is convenient for centering (locating) the year of the average.
If the period of fluctuations is not uniform, then the moving averages of the series may themselves exhibit an oscillatory pattern. This is the Slutsky-Yule effect.
If there are outliers in the data, then a moving median, instead of the average, would sufficiently smooth the data.
If fluctuations are very large, then instead of moving averages, exponential smoothening techniques
can be applied. These are based on application of weighted least-squares.
The smoothening procedures filter out the irregular (i.e., noise) variations and transform the series
into a smoothed one, which is free from outliers.
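The smoothing devices mentioned above (centered moving average, moving median for outliers, exponential smoothening) can be sketched with pandas; the window length and smoothing constant below are illustrative choices:

```python
# Centered moving average, moving median, and simple exponential smoothing
# of a hypothetical series -- a sketch with pandas.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
y = pd.Series(50 + np.arange(30) + rng.normal(0, 5, 30))  # hypothetical data

ma3 = y.rolling(window=3, center=True).mean()     # 3-period centered moving average
med3 = y.rolling(window=3, center=True).median()  # moving median, outlier-resistant
ses = y.ewm(alpha=0.3).mean()                     # exponential smoothing (alpha assumed)

print(pd.DataFrame({"y": y, "MA(3)": ma3, "Med(3)": med3, "ExpSm": ses}).head(6))
```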
4. Method of Curve-fitting
This method involves fitting of appropriate trend-curve (equation) to the time series data. The
identification of the trend-patterns can be carried out with a simple line-graph of the smoothed
series or through the mathematical characteristics of the trend curves, as listed above.
Curve-fitting is generally carried out by applying the Least-Squares approach. Least-Squares
method is the most scientific approach for fitting these curves.
The least-squares approach eliminates the element of subjectivity involved in the semi-averages method, particularly when there is an even number of observations for computing the semi-averages.
It is the only scientific method for computing the growth rates.
The approach can be applied only for those curves that can be transformed to linearity. Hence it is
not suitable for non-linear curves such as Modified Exponential, Gompertz and Logistic.
The least squares approach is essentially fitting of a regression equation to the time series data.
The independent variable is the time t.
The linear trend equation can be fitted as a simple regression equation.
The Exponential trend equation can be also fitted as a simple regression equation, after
transforming to linearity through logarithmic transformation.
The Polynomial trend equations, however, can be fitted in the form of a multiple regression equation by taking X1 (= t), X2 (= t²), X3 (= t³), etc., as the independent variables (a least-squares sketch follows).
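The three regressions just described (linear, log-linear for the exponential, and polynomial on t and t²) can be sketched with NumPy's least-squares polynomial fit; the data below are hypothetical:

```python
# Fitting linear, exponential (via log transform) and quadratic trends by
# least squares -- a sketch with NumPy.
import numpy as np

t = np.arange(1, 21, dtype=float)
rng = np.random.default_rng(2)
y = 20 * 1.05 ** t + rng.normal(0, 1, 20)   # hypothetical growth series

# Linear trend: Y = a + b t (np.polyfit returns highest power first)
b_lin, a_lin = np.polyfit(t, y, 1)

# Exponential trend: Y = a b**t  =>  log Y = log a + t log b
logb, loga = np.polyfit(t, np.log(y), 1)
a_exp, b_exp = np.exp(loga), np.exp(logb)

# Quadratic trend: Y = a + b t + c t**2, a multiple regression on t and t**2
c_q, b_q, a_q = np.polyfit(t, y, 2)

print(f"linear:      Y = {a_lin:.2f} + {b_lin:.2f} t")
print(f"exponential: Y = {a_exp:.2f} * {b_exp:.4f}**t")
print(f"quadratic:   Y = {a_q:.2f} + {b_q:.2f} t + {c_q:.3f} t**2")
```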
These trend equations can be fitted by applying any of the following methods:
1. Method of Selected Points and
2. Method of Partial Totals (Sums)
The Method of Selected Points is based on selection of three equidistant points corresponding to
the time axis.
The rationale in the selection of these points is not well defined, and the method does not utilize the complete data.
The Method of Partial Totals involves complete utilization of data that are to be divided into three
successive groups (sequentially ordered as per time axis) with equal number of observations.
This method is therefore preferable to the method of selected points, owing to the absence of subjectivity in the selection of data points.
Fitting of Modified Exponential Trend
1. Method of Selected Points:
Consider the trend equation:
Yt = a + b c^t   ... (1)
Let Y1, Y2 and Y3 be three equidistant points corresponding to the time axis, at t = 0, h and 2h. Equation (1) then implies that:
Y1 = a + b          ... (A)
Y2 = a + b c^h      ... (B)
Y3 = a + b c^(2h)   ... (C)
Re-arrangement gives (B) - (A) = b (c^h - 1) and (C) - (B) = b c^h (c^h - 1), and division leads to:
c^h = (Y3 - Y2) / (Y2 - Y1), i.e., c = [(Y3 - Y2) / (Y2 - Y1)]^(1/h)
On substitution of c and the subsequent terms in (A), (B) or (C), the three parameters are obtained as:
b = (Y2 - Y1) / (c^h - 1) and a = Y1 - b.
The logistic equation, Yt = K / (1 + a e^(-b t)), can be transformed to the modified exponential form by applying the reciprocal transformation: 1/Yt = 1/K + (a/K) e^(-b t).
Hence the method of partial totals for fitting the modified exponential equation can be applied to the transformed data (1/Y) for obtaining the logistic trend.
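The selected-points computation for the modified exponential can be sketched as a small function; the parameter values used in the check are hypothetical:

```python
# Method of selected points for the modified exponential Y = a + b * c**t,
# using the three equidistant observations at t = 0, h and 2h.
def modified_exponential_3pt(y1, y2, y3, h=1):
    """Solve Y = a + b * c**t through (0, y1), (h, y2), (2h, y3)."""
    c = ((y3 - y2) / (y2 - y1)) ** (1.0 / h)  # from the ratio of differences
    b = (y2 - y1) / (c ** h - 1)              # back-substitute into (B) - (A)
    a = y1 - b                                # from (A): y1 = a + b * c**0
    return a, b, c

# Check on exact points generated from a = 100, b = -80, c = 0.7 (hypothetical)
a, b, c = 100, -80, 0.7
pts = [a + b * c ** t for t in (0, 1, 2)]
print(modified_exponential_3pt(*pts))         # recovers (100.0, -80.0, 0.7)
```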
Measurement of Seasonal Variations: Seasonal variations are measured in the form of index. The
following methods are applied for measuring the indices:
1. Method of Simple Averages
2. Ratio-to-Trend Method
3. Ratio-to-Moving Average Method
1. Method of Simple Averages:
This method involves averaging the observations of each season over the years and expressing the seasonal averages as percentages of their overall average. The effect of irregular variations is eliminated on computing the seasonal averages over the years.
2. Ratio-to-Trend Method:
This method is based on the multiplicative model- O = T x S x C x I
It involves computing the Trend (T) values by fitting a suitable trend equation and then computing the ratio:
(O / T) x 100 = [(T x S x C x I) / T] x 100 = (S x C x I) x 100
The C and I components are eliminated by averaging these values over the years, leaving the seasonal indices.
The trend equation is fitted to the yearly data, which is averaged over the seasons. The Trend
values for the seasons are computed by expressing the incremental constant on per season basis.
This method assumes the non-existence of Cyclic and Irregular variations. Hence, if these are present, the averaging will not eliminate these components, and the resulting values would be biased.
3. Ratio-to-Moving Average Method:
This method is also based on the multiplicative model- O = T x S x C x I
Since the period of seasonal variations is a year (12 months or 4 quarters), the Trend and Cyclic variations together can be obtained by computing the 12-month moving averages.
These averages are then centered to coincide with the corresponding month (of the year). These centered averages give T x C, so that:
[O / (T x C)] x 100 = (S x I) x 100
Finally, the I component is eliminated by averaging the percentages over the years.
This method is a refinement of the Ratio-to-Trend method and is relatively more stable.
The only limitation is that it does not utilize the complete data: owing to the computation of the 12-month moving averages, the first 6 months and the last 6 months cannot be included.
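The ratio-to-moving-average computation can be sketched with pandas for monthly data; the series below is hypothetical and the final indices are rescaled to average 100:

```python
# Ratio-to-moving-average seasonal indices for monthly data -- a sketch.
import numpy as np
import pandas as pd

idx = pd.date_range("2001-01", periods=60, freq="MS")
rng = np.random.default_rng(3)
y = pd.Series((100 + np.arange(60))
              * (1 + 0.2 * np.sin(2 * np.pi * np.arange(60) / 12))
              + rng.normal(0, 3, 60), index=idx)  # hypothetical monthly series

# 12-month moving average, centered by a further 2-term average: estimates T x C
centered = y.rolling(12).mean().rolling(2).mean().shift(-6)

ratios = 100 * y / centered                      # O / (T x C) = (S x I) x 100
s = ratios.groupby(ratios.index.month).mean()    # averaging over years removes I
s = s * 100 / s.mean()                           # rescale so the indices average 100
print(s.round(1))                                # one index per calendar month
```

Note how the first and last 6 months drop out as NaN, mirroring the limitation described above.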
Measurement of Cyclic Variations:
Cyclic variations can be measured as percentages through an elimination process, after measuring the Trend and Seasonal variations, as follows:
[O / (T x S)] x 100 = (C x I) x 100; smoothing these percentages eliminates I and leaves C.
The period of the cycle can be identified through periodogram analysis. The method involves assigning values for the trial period λ in the range of 0 to 9 (the choice is arbitrary).
The values of A and B are then computed corresponding to each value of λ:
A = (2/N) Σ Yt cos(2πt/λ) and B = (2/N) Σ Yt sin(2πt/λ), which give: S(λ) = √(A² + B²)
A plot of S(λ) against λ is obtained. This is the Periodogram.
The most significant, i.e., relatively maximum, value of S(λ) and the corresponding λ are then identified from the Periodogram.
This λ represents the period of oscillation, provided no λ is a multiple of another λ.
The corresponding S(λ) represents the amplitude.
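A NumPy sketch of the periodogram computation just described; the series and the range of trial periods are illustrative:

```python
# Periodogram analysis: for each trial period lam compute
#   A = (2/N) sum(y cos(2 pi t / lam)),  B = (2/N) sum(y sin(2 pi t / lam)),
# and the amplitude S = sqrt(A**2 + B**2) -- a sketch.
import numpy as np

N = 120
t = np.arange(N)
rng = np.random.default_rng(4)
y = 5 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, N)  # hidden period of 12

for lam in range(2, 25):                       # trial periods (range illustrative)
    A = (2 / N) * np.sum(y * np.cos(2 * np.pi * t / lam))
    B = (2 / N) * np.sum(y * np.sin(2 * np.pi * t / lam))
    S = np.sqrt(A ** 2 + B ** 2)               # amplitude S(lambda)
    print(f"period {lam:2d}: S = {S:.3f}")     # peak expected near lam = 12
```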
Some of the techniques for identifying and measuring the contribution of time series components have been discussed in this section.
These techniques have their own importance in studying the characteristics of the components. However, they are considered classical (old).
The techniques that are essential for developing statistical forecasting models are presented in the next section, i.e., Time Series Modeling.
The Autocorrelation of lag k measures the correlation between observations of the series that are k periods apart:
rk = [ Σ (Yt - Ȳ)(Yt+k - Ȳ) ] / [ Σ (Yt - Ȳ)² ],
where Ȳ is the mean of the series, the numerator is summed over t = 1, ..., N-k and the denominator over t = 1, ..., N.
The function rk, regarded as a function of the lag k, is the Autocorrelation Function (ACF).
Autocorrelation is also defined as the regression coefficient ρ in the relationship Yt = ρ Y(t-1) + et, -1 < ρ < +1.
The autocorrelation coefficient ρ in this relation is the first order autocorrelation, or autocorrelation of lag 1.
The Partial Autocorrelation of lag k, φkk, measures the correlation between Yt and Y(t-k) after eliminating the effects of the intervening lags.
In particular, if k = 2, φ22 = (r2 - r1²) / (1 - r1²).
If the PACF plot of φkk against k exhibits a significant (tall) spike, i.e., bar, at the p-th lag, then it indicates that the autocorrelation of lag p effectively explains all the higher order autocorrelations.
A plot of rk against k is referred to as the Correlogram.
The ACF and PACF help in identifying the stationarity of the time series data.
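The ACF and PACF can be computed and plotted with the statsmodels package (a sketch on a hypothetical random walk, whose slowly decaying ACF signals non-stationarity):

```python
# Computing and plotting ACF / PACF -- a sketch with statsmodels.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import acf, pacf
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(5)
y = np.cumsum(rng.normal(0, 1, 200))   # a random walk (non-stationary)

r = acf(y, nlags=20)                   # autocorrelations r_k
phi = pacf(y, nlags=20)                # partial autocorrelations phi_kk
print(r[:5].round(2), phi[:5].round(2))

fig, axes = plt.subplots(2, 1)
plot_acf(y, lags=20, ax=axes[0])       # slow decay: non-stationarity
plot_pacf(y, lags=20, ax=axes[1])      # single tall spike at lag 1
plt.show()
```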
The terms Unit Root, Random Walk and Non-Stationarity are thus synonymous in this context.
The concept of the unit root has led to the following statistical test for verifying the stationarity of a series:
Dickey Fuller (DF) Test
The D-F test is based on the concept of unit root.
Let {Yt} be a time series with N observations and let Ut be a white noise (random error) term. Then Yt can be modeled as:
Yt = ρ Yt-1 + Ut;  -1 ≤ ρ ≤ +1   (C)
Subtracting Yt-1 on both sides of (C):
Yt - Yt-1 = ρ Yt-1 - Yt-1 + Ut, or
ΔYt = (ρ - 1) Yt-1 + Ut   (D)
i.e., ΔYt = δ Yt-1 + Ut   (D); where δ = (ρ - 1).
If {Yt} is non-stationary, i.e., a Random Walk Model series, then we know that ΔYt is stationary. Thus testing the Stationarity implies testing the Null Hypothesis H0: δ = 0, i.e., ρ - 1 = 0 or ρ = 1.
To test the null hypothesis, we can fit the regression model (D), i.e., ΔYt = δ Yt-1 + Ut, to the time series data with N observations and verify H0: δ = 0 by computing the test criterion τ = δ̂ / S.E.(δ̂), where δ̂ is the estimated coefficient.
Normally, in regression analysis, the distribution of the above criterion is Student's t with (N - 2) df. In the context of a time dependent series, the distribution of the criterion is not t but τ (tau). Dickey and Fuller have derived the distribution of τ and tabulated the critical values of τ.
Inference: When |τ (Cal)| ≥ |τ (Tab)|, reject the Null Hypothesis, which implies that the series is Stationary.
Similarly, when |τ (Cal)| < |τ (Tab)|, accept the Null Hypothesis, which implies that the series is Non-Stationary.
The D-F test can be applied to the following forms of the regression model:
i)   ΔYt = δ Yt-1 + Ut                    --- Without Intercept
ii)  ΔYt = β0 + δ Yt-1 + Ut               --- With Intercept
iii) ΔYt = β0 + β1 t + δ Yt-1 + Ut        --- With Intercept and Trend
In the above models, the inference about stationarity is, however, drawn only on the basis of testing the Null Hypothesis about δ.
Augmented Dickey Fuller (ADF) Test
It is likely that in the application of the D-F test to the model (D) and its other forms, the error term Ut may be auto-correlated, i.e., not independent. Under this situation, Dickey and Fuller proposed the following modification to Model (iii):
(iv) ΔYt = β0 + β1 t + δ Yt-1 + Σ αi ΔYt-i + Ut,
where ΔYt-i = (Yt-i - Yt-i-1); i = 1, ..., m.
Specifically,
ΔYt-1 = (Yt-1 - Yt-2); ΔYt-2 = (Yt-2 - Yt-3), etc.
The number of differenced variables included in (iv) has to be determined empirically, i.e., stepwise: starting with ΔYt-1, the variables ΔYt-i are added during each step of fitting the model till Ut becomes uncorrelated.
The D-F / ADF test can be also considered as a tool for transforming the non-stationary series to
stationarity, by applying the differencing. (It determines the order of differencing required to transform
the series to Stationarity):
The model to be fitted for this purpose is:
(v) ΔDt = β0 + δ Dt-1 + Ut,
where Dt = ΔYt = (Yt - Yt-1), so that ΔDt = (Dt - Dt-1) and Dt-1 = ΔYt-1 = (Yt-1 - Yt-2).
Suppose {Yt} is verified to be Non-Stationary. Then we can fit Model (v) and test the Null Hypothesis H0: δ = 0.
If H0 is rejected, implying that the differenced series is Stationary, then transform Yt to ΔYt for developing the time series models.
If H0 is accepted, implying that the differenced series is also Non-Stationary, then we can explore whether the second order differenced series is stationary, by taking Δ²Yt = (ΔYt - ΔYt-1) instead of ΔYt (i.e., Dt) in relation (v).
Model (v) involves differenced variables on both sides of the equation, unlike Model (i).
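The D-F / ADF procedure, including the search for the order of differencing, can be sketched with the adfuller routine in statsmodels (the series is a hypothetical random walk; regression="c" corresponds to form (ii) with an intercept, "ct" to form (iii)):

```python
# ADF test and empirical choice of the differencing order d -- a sketch.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(6)
y = np.cumsum(rng.normal(0, 1, 200))   # random walk: has a unit root

d, series = 0, y
while True:
    tau, pvalue = adfuller(series, regression="c", autolag="AIC")[:2]
    print(f"d = {d}: tau = {tau:.2f}, p = {pvalue:.3f}")
    if pvalue < 0.05 or d >= 2:        # reject H0: delta = 0 => stationary
        break
    series = np.diff(series)           # difference once more
    d += 1
print(f"order of differencing required: d = {d}")
```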
Model         Pattern of ACF                     Pattern of PACF
AR (p)        Declines exponentially             Significant spikes up to lag p
MA (q)        Significant spikes up to lag q     Declines exponentially
ARMA (p, q)   Exponential decay                  Exponential decay
In this example, the series is differenced with order 2 to make it stationary, i.e., the ARIMA parameter d = 2. It can be observed from the ACF plot that there is a single significant negative spike corresponding to lag 1, while the PACF plot decays exponentially. Hence it is appropriate to model the series with MA (1), i.e., the parameter q = 1. Thus the model is ARIMA (0, 2, 1):
A: ACF Plot of Second Order Differenced Series
B: PACF Plot of Second Order Differenced Series
Similarly, the characteristics of ARIMA (1, 1, 1) can be interpreted as a first order differenced series whose ACF plot has a significant positive spike at lag 1, and whose PACF plot exhibits both positive and negative spikes, with a significant positive spike at lag 1 followed by exponential decay.
In general, the actual data may not coincide with the patterns listed in the Table. The practical approach is to draw 95% confidence limits and select the parameters on the basis of the rk or φkk values that fall outside these limits.
1. Identification of the Model: The parameters (p, d, q) are identified from the ACF and PACF patterns of the (differenced) series. If the time series data is not stationary, transform it to stationarity by applying first / second order differences, depending on the outcome of the D-F test.
2. Estimation of the Model: On the basis of identification of the parameters (p, d, q) the series is subjected
to fitting of the appropriate ARIMA (p, d, q) model.
The procedure for fitting the model involves transforming the series through appropriate differencing, in case
it is non-stationary, and then subjecting the differenced series to fitting.
The choice of parameters on the basis of significant rk or φkk values is solely on an empirical (trial and error) basis. For example, for an AR process, if the autocorrelations are significant up to lag 4, then one can fit, systematically in a stepwise manner, the models AR (1), AR (2), AR (3) and finally AR (4). If during any step of the fitting, say AR (2), the model satisfies the diagnostic checks, then the series can be modeled as AR (2). The choice of AR (4), as would be made in the straightforward approach, may then not be relevant.
One can even include only those AR variables for which the corresponding autocorrelation coefficients are statistically significant, in case the ACFs are not significant in a systematic order, e.g., significant only at lags 1, 3 and 4.
The estimation procedure for AR models is straightforward, either by ML or by the least-squares approach, as in regression analysis. However, when MA terms are involved in the model, the estimation procedure is complex and requires software.
3. Diagnostic Checking: The estimated models are acceptable only when the residuals are random. For this purpose, several alternative candidate models are fitted, and the ACF and PACF of the residuals of these models are estimated. If the plots of these ACF and PACF exhibit a non-significant pattern, then the corresponding model is considered valid and can be considered for forecasting.
4. Forecasting: The model that satisfies all the diagnostic checks is considered for forecasting. If the model is based on differencing / de-trending transformations, then it has to be re-expressed in terms of the original series; only then can the forecasts be made.
Estimation of ARIMA models is computationally not simple; software packages are essential for fitting them. Popular packages are SAS, SYSTAT and SPSS. A sketch of the estimation and checking cycle follows.
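The estimate-and-check cycle can be sketched with statsmodels' ARIMA class; the series, the order (1, 1, 0) and the Ljung-Box lag are illustrative assumptions:

```python
# Fitting an ARIMA(p, d, q) model and checking residual randomness -- a sketch.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(7)
y = pd.Series(np.cumsum(50 + rng.normal(0, 10, 100)))  # hypothetical trending series

model = ARIMA(y, order=(1, 1, 0)).fit()   # AR(1) on the first differences
print(model.summary())

# Diagnostics: the residuals should be uncorrelated (white noise)
print(acorr_ljungbox(model.resid, lags=[10]))  # large p-value => residuals random

print(model.forecast(steps=3))            # forecasts on the original scale
```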
Practical Illustration
Developing Forecasting Model for A.P. Rice Production:
Data: 50 years, 1955-56 to 2004-05
Step-1: Testing the Stationarity of Data
i) ACF Plot of Production Data (number of lags: N/3 ≈ 17)
PACF Plot of Production Data:
[Figure: ACF plot and PACF plot of the production data; correlations plotted against lags up to 20.]
Dickey-Fuller Test on the production data:
Parameter    Coeff      SE        t Stat
Intercept    1034.92    505.72    2.05
Y(t-1)       -0.13      0.07      -1.91
Comments:
The ACF plot indicates significant autocorrelations up to lag 4; the PACF has a significant spike at lag 1.
This implies that the production data is non-stationary.
Further verification by applying the Dickey-Fuller Test reveals that |τ (Cal)| = 1.91 < |τ (Tab)|, implying the series to be non-stationary.
Step-2: Testing the Stationarity of the First Order Differenced Series D(t):
Dickey-Fuller Test on D(t), where D(t) = Y(t) - Y(t-1):
Parameter    Coeff     SE        t Stat
Intercept    234.36    174.72    1.34
D(t-1)       -1.37     0.13      -10.27
[Figure: line plots of the original production series (about 3000 to 13000) and of the differenced series D(t) (about -4000 to 4000), against time points 1 to 46.]
Comments:
The Dickey-Fuller Test reveals that |τ (Cal)| = 10.27 > |τ (Tab)|, implying that the differenced series is Stationary.
Step-3: Identification of the ARIMA Model:
Since the production data was non-stationary, the first order differencing has transformed it to stationarity. The differenced series D(t) is then subjected to fitting of the ARIMA model.
The ACF and PACF plots of D(t) help in identifying the AR and MA parameters:
[Figure: ACF plot and PACF plot of D(t); correlations plotted against lags up to 20.]
The ACF plot indicates that the autocorrelation corresponding to lag 1 is significant, while the PACF plot indicates that the partial autocorrelations corresponding to lags 1 and 4 are significant (falling outside the confidence limits).
On the basis of the PACF plot, we can infer that the forecasting model may not need any Moving Average (MA) component, and the series can be modeled with only an Auto Regressive (AR) component.
The significant partial correlations in the PACF plot indicate that there could be three options for formulating the ARIMA model:
1. ARIMA with only a single AR term, to account for the significant spike at lag-1
2. ARIMA with two AR terms, to account for the significant spikes at lag-1 and lag-4
3. ARIMA with all four AR terms up to lag-4.
Step-4 Estimation:
Choice-1: ARIMA(1, 1, 0):
Model: D(t) = 179.23 - 0.41 D(t-1)

Parameter    Coeff.    S.E.      t Stat    P-value
Intercept    179.23    174.67    1.03      0.31
D(t-1)       -0.41     0.13      -3.01     0.00
[Figure: ACF and PACF plots of the residuals from the Choice-1 model; correlations plotted against lags up to 20.]
Choice-2: ARIMA with two AR terms (lags 1 and 4):
Model: D(t) = 250.10 - 0.39 D(t-1) - 0.37 D(t-4)

Parameter    Coeff.    SE        t Stat    P-value
Intercept    250.10    179.85    1.39      0.17
D(t-1)       -0.39     0.13      -2.96     0.01
D(t-4)       -0.37     0.16      -2.38     0.02
[Figure: ACF and PACF plots of the residuals from the Choice-2 model; correlations plotted against lags up to 20.]
Choice-3: ARIMA(4, 1, 0):
Model: D(t) = 401.92 - 0.51 D(t-1) - 0.33 D(t-2) - 0.39 D(t-3) - 0.56 D(t-4)

Parameter    Coeff.    SE        t Stat    P-value
Intercept    401.92    185.06    2.17      0.04
D(t-1)       -0.51     0.14      -3.61     0.00
D(t-2)       -0.33     0.16      -2.06     0.05
D(t-3)       -0.39     0.20      -2.00     0.05
D(t-4)       -0.56     0.18      -3.07     0.00

[Figure: ACF and PACF plots of the residuals from the ARIMA(4, 1, 0) model; correlations plotted against lags up to 20.]
The 3 Choices clearly indicate that ARIMA (4, 1, 0) is the appropriate choice, as its residuals are uncorrelated.
Step-5: Forecasting
The ARIMA (4, 1, 0) model can be applied for forecasting. Since the model involves D(t), the differenced series, it has to be transformed back to the original series in terms of Y(t). The transformed model and the forecasts are as follows:
Transformed Model:
Y(t) = Y(t-1) + 401.92 - 0.51 D(t-1) - 0.33 D(t-2) - 0.39 D(t-3) - 0.56 D(t-4), where D(t-k) = Y(t-k) - Y(t-k-1).
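The back-transformation can be sketched as a recursive computation: each one-step forecast of D(t) from the fitted equation is added to the previous level Y(t-1). The coefficients below are those of the fitted model; the seed levels are purely illustrative:

```python
# Recursive one-step forecast from the ARIMA(4, 1, 0) model: forecast D(t)
# and add it back to the previous level -- a sketch.
coef = {"const": 401.92, "d1": -0.51, "d2": -0.33, "d3": -0.39, "d4": -0.56}

def forecast_next(levels):
    """levels: past values of Y, oldest first (needs at least 5 values)."""
    d = [levels[-i] - levels[-i - 1] for i in range(1, 5)]  # D(t-1) .. D(t-4)
    d_next = (coef["const"] + coef["d1"] * d[0] + coef["d2"] * d[1]
              + coef["d3"] * d[2] + coef["d4"] * d[3])
    return levels[-1] + d_next                              # Y(t) = Y(t-1) + D(t)

y = [8600.0, 9000.0, 7327.0, 8953.0, 9601.0]   # illustrative recent levels
print(round(forecast_next(y), 2))
```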
Year       Observed    Forecast    Bias (%)
2002-03    7327        -           -
2003-04    8953        10143.21    -13.29
2004-05    9601        9266.10     3.49
2005-06    11704       11319.47    3.29
2006-07    11872       12437.88    -4.77
2007-08    -           10335.35    -