How to Estimate ARIMA models in EViews
Welcome to a new free EViews tutorial. In this post, I will teach you how to estimate ARIMA models in EViews. Be sure to download the free dataset below and replicate the results I obtain in the tutorial. There is also a FAQ section where I answer many common questions about ARIMA models. Let’s begin!
What is an ARIMA model?
ARIMA stands for Autoregressive Integrated Moving Average and is one of the most popular and widely used techniques for univariate time series forecasting. Some of the variables you can forecast with ARIMA models are: GDP, Consumer Price Index (CPI), and price of stocks or commodities.
We won’t try to forecast future values of a variable (e.g., inflation) by using many other regressors (e.g., GDP, money supply, interest rates). Instead, we will rely on past levels of inflation to forecast future levels of inflation. Knowing how the variable behaved in the past allows us to predict where it will head in the future.
In this tutorial we apply the Box-Jenkins method to select appropriate models and forecast future values of our variable of interest.
What is the Box-Jenkins methodology?
The Box-Jenkins methodology is named after its authors, George Box and Gwilym Jenkins, who proposed a three-step method for selecting appropriate ARIMA models to forecast economic variables. We will try to find a model that fits the data well and produces sensible forecasts. The method consists of three basic steps:
- Stage 1: Identification
- Stage 2: Estimation
- Stage 3: Diagnostics and Forecasting
Some textbooks indicate that the Box-Jenkins method has 4 stages
Don’t be afraid! The original book by Box and Jenkins, “Time Series Analysis: Forecasting and Control”, specifies only three steps. However, some textbooks split the method into four stages, where Stage 3 is Diagnostics and Stage 4 is Forecasting. The analysis remains the same.
ARIMA models in EViews, Stage 1: Identification
Overview of Box-Jenkins Stage 1:
ARIMA is written as ARIMA(p,d,q), where “p” is the order of the autoregressive component, “d” is the number of times we need to difference the variable to achieve stationarity, and “q” is the order of the moving average component.
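Writing $w_t = \Delta^d y_t$ for the series after differencing it $d$ times, the model in standard notation is:
$$w_t = c + \sum_{i=1}^{p} \phi_i w_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}$$
where:
- $p$: order of the autoregressive component
- $q$: order of the moving average component
- $\phi_i$: coefficients of the autoregressive component
- $\theta_j$: coefficients of the moving average component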
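To make the roles of “p”, “d” and “q” concrete, here is a minimal sketch that simulates an ARIMA(1,1,1) series. It assumes the first difference follows w_t = c + phi*w_{t-1} + e_t + theta*e_{t-1} with Gaussian errors. The function name, the parameter values and the starting level are my own choices for illustration and are not taken from the tutorial's dataset.

```python
import random

def simulate_arima_111(n, c=0.0, phi=0.5, theta=0.3, start=100.0, seed=0):
    """Simulate n observations (in levels) of an ARIMA(1,1,1) process."""
    rng = random.Random(seed)
    levels = [start]
    prev_diff, prev_eps = 0.0, 0.0
    for _ in range(n - 1):
        eps = rng.gauss(0.0, 1.0)
        # AR(1) and MA(1) parts act on the first difference of the series
        diff = c + phi * prev_diff + eps + theta * prev_eps
        # "Integrated" with d = 1: the level is the running sum of the differences
        levels.append(levels[-1] + diff)
        prev_diff, prev_eps = diff, eps
    return levels

series = simulate_arima_111(200)
first_diff = [b - a for a, b in zip(series, series[1:])]
print(len(series), len(first_diff))
```

The series in levels wanders like a non-stationary variable, while `first_diff` is the stationary ARMA(1,1) part that we actually model.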
Stage 1 focuses on two tasks. First, we check whether our variable of interest is stationary. Next, we determine the order of the autoregressive and moving average components. In other words, in Stage 1 we determine “p”, “d” and “q”.
In our example, we are trying to fit an ARIMA model for the series “consumer price index – USA“. We have to begin our analysis by checking for stationarity. Why? ARIMA models are estimated on stationary series, so our series needs to be stationary before we can model and forecast it. If our variable is non-stationary in levels, we need to apply the appropriate transformations (logs/differences) to make it stationary.
To check for stationarity, we look at:
- The Graph
- The correlogram
- Formal tests: the Augmented Dickey-Fuller, Phillips-Perron and KPSS tests.
Please watch my stationarity tutorial if you need further clarification on the procedure.
In our example, we verified that CPI is non-stationary in levels but stationary in first differences. Consequently, we use the variable in first differences, so d = 1.
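As a rough sketch of the transformations mentioned above, the snippet below computes first differences and log differences of a series. The index values are hypothetical numbers I made up for illustration (they are not the tutorial's CPI data), and the function names are my own.

```python
import math

def first_difference(series):
    """Return x_t - x_{t-1}; the result has one observation fewer."""
    return [b - a for a, b in zip(series, series[1:])]

def log_difference(series):
    """Return log(x_t) - log(x_{t-1}), an approximate growth rate."""
    return first_difference([math.log(x) for x in series])

# Hypothetical index values, for illustration only
cpi = [100.0, 101.0, 103.0, 106.0, 110.0]
d_cpi = first_difference(cpi)
dlog_cpi = log_difference(cpi)
print(d_cpi)
print([round(v, 4) for v in dlog_cpi])
```

In EViews the same transformations are available directly in expressions as `d(cpi)` and `dlog(cpi)`.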
ARIMA: how to determine the order of "p" and "q"
To identify the order of the autoregressive and moving average components, we focus on the correlogram of “CPI” in first differences, because we have confirmed that “CPI” is stationary in first differences. The aim of this step is to find all the candidate models to estimate.
To determine the order of the autoregressive component (“p”), we look at the partial autocorrelation (PAC) column of the correlogram. The column shows a confidence band on either side of zero. Lags whose spikes exceed the band suggest possible orders for the autoregressive component. Looking at the correlogram, the first lag is highly significant, which suggests an AR(1) component, while lags 2 and 3 sit right on the edge of the band and could also be tested. For the purpose of this example, I will only consider an AR(1) component.
Next, to determine the order of the moving average component (“q”), we look at the autocorrelation (AC) column. Lags 1 and 3 exceed the confidence band. Consequently, there are two possible moving average components: MA(1) and MA(3).
NOTE: You can also try fitting more AR components (lags 2 and 3 of the PAC). However, for this example we will stick to the two selected candidate models: ARIMA(1,1,1) and ARIMA(1,1,3).
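If you want to see what sits behind the correlogram, here is a minimal sketch that computes the sample autocorrelations, the partial autocorrelations (using the Durbin-Levinson recursion) and an approximate two-standard-error band of ±2/√T. The data is a simulated AR(1) series with a coefficient of 0.7 that I chose for illustration, not the CPI series, and the function names are my own.

```python
import math
import random

def acf(x, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def pacf(x, max_lag):
    """Partial autocorrelations for lags 1..max_lag (Durbin-Levinson recursion)."""
    r = acf(x, max_lag)
    result, phi_prev = [], []
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        result.append(phi_kk)
        phi_prev = phi_new
    return result

# Simulated AR(1) data, for illustration only
rng = random.Random(1)
x, prev = [], 0.0
for _ in range(2000):
    prev = 0.7 * prev + rng.gauss(0.0, 1.0)
    x.append(prev)

band = 2.0 / math.sqrt(len(x))   # approximate two-standard-error band
ac = acf(x, 10)
pac = pacf(x, 10)
ar_candidates = [k for k, v in enumerate(pac, start=1) if abs(v) > band]
ma_candidates = [k for k in range(1, 11) if abs(ac[k]) > band]
print("band:", round(band, 3))
print("PAC lags outside the band:", ar_candidates)
print("AC lags outside the band:", ma_candidates)
```

Lags listed in `ar_candidates` are the ones you would consider for “p”, and lags in `ma_candidates` are the ones you would consider for “q”, exactly as we did by eye on the correlogram.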
Stage 2: Estimation
Once we have identified the candidate ARIMA models, we need to estimate them and decide which one is the most appropriate. The two models we decided to estimate are:
- ARIMA(1,1,1)
- ARIMA(1,1,3)
In Stage 2 of the Box-Jenkins method, we:
- Estimate the models we identified in Stage 1
- Compare the models based on the significance of the coefficient estimates
- Compare the models based on information criteria: Akaike, Schwarz and Hannan-Quinn
- Select the model with the smallest information criteria values and the most significant coefficients
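To give an idea of what “estimating” an ARMA model means, here is a simplified sketch that fits an ARMA(1,1) to a differenced series by minimizing the conditional sum of squared residuals over a coarse grid. This is not the estimator EViews uses. It is a deliberately simple stand-in, and the simulated data (true values phi = 0.5 and theta = 0.3), the grid and the function names are all my own assumptions.

```python
import random

def css(w, c, phi, theta):
    """Conditional sum of squares for w_t = c + phi*w_{t-1} + e_t + theta*e_{t-1}."""
    prev_e, total = 0.0, 0.0   # the pre-sample residual is set to zero
    for t in range(1, len(w)):
        e = w[t] - c - phi * w[t - 1] - theta * prev_e
        total += e * e
        prev_e = e
    return total

def fit_arma11(w):
    """Grid search over phi and theta in [-0.95, 0.95] with step 0.05."""
    mean_w = sum(w) / len(w)
    grid = [i / 20 for i in range(-19, 20)]
    best = None
    for phi in grid:
        c = mean_w * (1.0 - phi)   # intercept implied by the sample mean
        for theta in grid:
            s = css(w, c, phi, theta)
            if best is None or s < best[0]:
                best = (s, phi, theta)
    return best[1], best[2]

# Simulated stationary ARMA(1,1) data, for illustration only
rng = random.Random(42)
w, prev_w, prev_e = [], 0.0, 0.0
for _ in range(1000):
    e = rng.gauss(0.0, 1.0)
    prev_w = 0.5 * prev_w + e + 0.3 * prev_e
    prev_e = e
    w.append(prev_w)

phi_hat, theta_hat = fit_arma11(w)
print("phi:", phi_hat, "theta:", theta_hat)
```

A real estimator uses a numerical optimizer instead of a grid and also reports standard errors, which is where the p-values in the EViews output come from.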
How to select the most appropriate ARIMA model
To select the most appropriate model, I recommend building a table like the one below and filling it in with the results from the previous figure (the estimated ARIMA models).
We need to check the following:
- Significance of the ARMA terms: prefer the model with the most significant terms (p-values < 0.05)
- SIGMASQ: the estimated variance of the residuals, a measure of volatility. Prefer the smallest value
- Log likelihood: prefer the largest value, since estimation maximizes the log-likelihood function (in our case, the largest is the least negative value)
- Model selection criteria: prefer the model with the smallest Akaike, Schwarz and Hannan-Quinn values
Conclusion: Model B, the ARIMA(1,1,3), has a better fit than Model A, the ARIMA(1,1,1).
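The comparison by information criteria can be sketched as follows. The formulas assume the per-observation scaling that EViews reports (each criterion divided by the number of observations T). The log-likelihoods, parameter counts and sample size below are hypothetical numbers chosen for illustration. They are not the values from the tutorial's output.

```python
import math

def info_criteria(loglik, k, T):
    """Akaike, Schwarz and Hannan-Quinn criteria, scaled by the sample size T."""
    base = -2.0 * loglik / T
    return {
        "AIC": base + 2.0 * k / T,
        "SC": base + k * math.log(T) / T,
        "HQ": base + 2.0 * k * math.log(math.log(T)) / T,
    }

# Hypothetical estimation results, for illustration only
T = 200
candidates = {
    "ARIMA(1,1,1)": {"loglik": -250.0, "k": 4},
    "ARIMA(1,1,3)": {"loglik": -240.0, "k": 5},
}

table = {name: info_criteria(m["loglik"], m["k"], T) for name, m in candidates.items()}
for crit in ("AIC", "SC", "HQ"):
    best = min(table, key=lambda name: table[name][crit])
    print(crit, "prefers", best)
```

All three criteria reward a higher log likelihood and penalize extra parameters. Schwarz applies the heaviest penalty, so it tends to favor smaller models.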
Stage 3: Diagnostics and Forecasting
We identified possible models and estimated them in Stage 2. We also selected the most appropriate model based on several criteria. Now it is time to make sure the model satisfies the requirements for forecasting future values!
In Stage 3 of the Box-Jenkins method, we:
- Ensure the model satisfies the stability conditions
- Ensure there is no autocorrelation in the residuals
- If the above requirements are met, then we can forecast!
To check for autocorrelation, we display the correlogram of the residuals of the ARIMA(1,1,3) model and look at the Ljung-Box Q-statistics, where the null hypothesis is “residuals are white noise”.
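For reference, the Ljung-Box statistic is Q = T(T+2) Σ r_k² / (T−k), summed over lags k = 1..m, where r_k is the residual autocorrelation at lag k. Below is a minimal sketch that computes Q and compares it with the 5% chi-square critical value instead of computing a p-value. It assumes the degrees of freedom are the number of lags minus the number of estimated ARMA terms, and the residuals are a made-up series, not the ones from our model.

```python
def ljung_box_q(resid, max_lag):
    """Ljung-Box Q statistic using autocorrelations at lags 1..max_lag."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [r - mean for r in resid]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total

# 5% critical values of the chi-square distribution, indexed by degrees of freedom
CHI2_95 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488,
           5: 11.070, 6: 12.592, 7: 14.067, 8: 15.507}

def looks_like_white_noise(resid, max_lag, n_arma_terms=0):
    """True if we cannot reject the white-noise null at the 5% level."""
    df = max_lag - n_arma_terms
    return ljung_box_q(resid, max_lag) < CHI2_95[df]

# Made-up residuals that alternate in sign, so they are clearly autocorrelated
alternating = [1.0, -1.0] * 10
q_stat = ljung_box_q(alternating, 1)
print(round(q_stat, 3), looks_like_white_noise(alternating, 1))
```

A Q-statistic below the critical value corresponds to a p-value above 0.05, which is the result we want to see for the residuals of a well-specified model.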
As we can see in the figure below, the p-values for the Q-statistics are all above 0.05, so we cannot reject the null hypothesis that the residuals are white noise. The last step is to confirm that the inverse AR/MA roots lie inside the unit circle:
- The estimated model is covariance stationary: inverse AR roots should lie inside the unit circle
- The estimated process is invertible: inverse MA roots should lie inside the unit circle
We can see in the figure above that all the inverse roots lie inside the unit circle. Our ARIMA(1,1,3) satisfies the stability conditions and its error terms are white noise. We are now in a good position to forecast future values of the consumer price index. If the model you selected did not satisfy these conditions, you would need to repeat Stages 2 and 3 with another candidate model.
ARIMA(1,1,3) forecast - CPI
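The inverse roots check can be sketched by hand as well. For an AR polynomial 1 − φ1·L − … − φp·L^p, the inverse roots solve z^p − φ1·z^(p−1) − … − φp = 0. For an MA polynomial 1 + θ1·L + … + θq·L^q, they solve z^q + θ1·z^(q−1) + … + θq = 0. The snippet below finds them with a simple Durand-Kerner iteration, which I picked to keep the example free of external libraries. The coefficients are hypothetical and are not our estimates.

```python
def poly_roots(coeffs, iterations=500):
    """Roots of the monic polynomial z^n + coeffs[0]*z^(n-1) + ... + coeffs[n-1]."""
    n = len(coeffs)
    full = [1.0] + list(coeffs)
    roots = [(0.4 + 0.9j) ** k for k in range(n)]   # distinct starting points
    for _ in range(iterations):
        new_roots = []
        for i, z in enumerate(roots):
            value = 0j
            for a in full:
                value = value * z + a               # Horner evaluation at z
            denom = 1 + 0j
            for j, other in enumerate(roots):
                if j != i:
                    denom *= z - other
            new_roots.append(z - value / denom)
        roots = new_roots
    return roots

def inverse_ar_roots(phis):
    return poly_roots([-p for p in phis])

def inverse_ma_roots(thetas):
    return poly_roots(list(thetas))

def all_inside_unit_circle(roots):
    return all(abs(z) < 1.0 for z in roots)

# Hypothetical coefficients, for illustration only
ar_roots = inverse_ar_roots([0.5])
ma_roots = inverse_ma_roots([0.1, -0.2])
print([round(abs(z), 3) for z in ar_roots], all_inside_unit_circle(ar_roots))
print(sorted(round(abs(z), 3) for z in ma_roots), all_inside_unit_circle(ma_roots))
```

If any modulus is 1 or larger, the AR part is not stationary or the MA part is not invertible, and the model should not be used for forecasting.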
We can also plot together the original series and the forecasted values.
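To show how forecasts in levels are built from a model estimated in first differences, here is a minimal sketch. It assumes the difference follows w_t = c + phi*w_{t-1} + e_t + θ1*e_{t-1} + … + θq*e_{t-q}, sets all future errors to zero, and accumulates the forecasted differences onto the last observed level. The function name and all the numbers are hypothetical and are not our CPI estimates.

```python
def forecast_levels(last_level, last_diff, recent_resid, c, phi, thetas, steps):
    """Forecast levels from a model with one AR term and q MA terms on the first difference.

    recent_resid holds the last q in-sample residuals, most recent last.
    """
    q = len(thetas)
    level, diff = last_level, last_diff
    forecasts = []
    for k in range(1, steps + 1):
        ma_part = 0.0
        for j in range(k, q + 1):
            # e_{T+k-j} is a known in-sample residual only when j >= k
            ma_part += thetas[j - 1] * recent_resid[-(j - k + 1)]
        diff = c + phi * diff + ma_part   # forecast of the first difference
        level += diff                     # undo the differencing
        forecasts.append(level)
    return forecasts

# Hypothetical inputs, for illustration only
path = forecast_levels(last_level=100.0, last_diff=2.0, recent_resid=[0.0],
                       c=0.0, phi=0.5, thetas=[0.0], steps=3)
print(path)
```

Notice that the MA terms only affect the first q forecast steps. After that, the forecasted differences are driven by the AR term and the constant alone, which is one reason why ARIMA forecasts are more informative in the short run.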
Thanks for reading!
- If you value the content, please subscribe to my YouTube channel and feel free to share this post on social media. There are share links at the top and bottom of the article.
- Be sure to watch the video to go through the steps. You can download the dataset to replicate the content.
- Finally, you can buy the ARIMA step-by-step guide, along with the slides and the EViews workfile.
buy the material
Elevate your learning experience. You can buy the package for each of the tutorials. Each package contains the slides of the video + Workfile/Do File + Data & Support
Download the dataset for free (direct download, no ads) and replicate the content covered in the video.
ARIMA models - FAQ
Most frequent questions and answers
No. ARIMA models can only be estimated using stationary variables. The “I” stands for “Integrated” and reflects the order of integration, that is, how many times you need to difference the variable for it to become stationary. If your variable is non-stationary in levels, you will have to use logs and/or first differences to achieve stationarity.
No. ARIMA models are univariate models. In other words, you are using the variable's own past values to predict its future values. For example, if I know the marks you got in your last 10 exams, I can use that information to predict your next mark. If you want to estimate multivariate models, you need a VAR model (or a structural VAR). Be sure to watch the tutorials for VAR and SVAR models.
ARIMA models are widely used in finance. They are simple models and an effective way to forecast future stock values. However, they work better for short-run predictions: the further ahead we forecast, the greater the chance of inaccurate results.
SARIMA models incorporate a seasonal component, which some variables have. For example, when predicting hydro consumption, it is common for consumption to spike in summer. You can identify the seasonal component by looking at the AC column in the correlogram. If you notice spikes recurring at regular intervals (for example, every 6 months), that is an indicator that your series has a seasonal component.
ARIMA models are popular for estimating demand for products or services, sales, production quotas, stock prices, and other economic variables such as CPI. Be aware that for most macroeconomic variables (e.g., inflation, money supply, GDP), ARIMA models can provide some insights, but there are more sophisticated models you can estimate, such as VAR models.
Yearly data is generally not recommended for any type of model. Using higher frequency data allows us to identify seasonal components and see fluctuations. If you graph yearly data, you will notice the line is very smooth. However, when you graph monthly or quarterly data, you will notice some spikes in the line. Those spikes reflect the fluctuations within the year.
If you don’t have many observations, or the frequency of your data is low (e.g., yearly data), ARIMA models are not very powerful. ARIMA models use past data to forecast future data, so if you don’t have many past observations, the predictions will be poor.