Cointegration model: Engle and Granger methodology
In univariate models, we have seen that if a variable is non-stationary, we can take first differences and then forecast future values using an ARIMA model. However, when working with multiple variables, the process is not that straightforward. Two variables may be non-stationary, yet there can exist a linear combination of them that is stationary. In such a case, the variables hold a long run relationship, and the model can be estimated using an error correction model.
what is cointegration?
Two or more variables are said to be cointegrated if they are integrated of the same order and there exists a cointegrating vector of coefficients that forms a stationary linear combination of them. A cointegration test is used to determine whether the variables hold a long run relationship (equilibrium). For example, imports and exports in Canada seem to move closely together and be in equilibrium. If imports grow more than exports, in the long run exports will increase and imports will decrease (to keep a positive trade balance). There exists a correction mechanism that drives the disequilibrium back to equilibrium. In this tutorial we will check whether imports and exports in Canada are cointegrated or whether their relationship is just a spurious regression.

what are we going to cover in part 1:
In Part 1, you are going to learn:
- Spurious Regression vs Cointegration
- What happens if variables are non stationary
- What happens if variables are non stationary but hold a long run relationship
- How to test for cointegration with unit root tests
- How to test for cointegration with cointegration tests (e.g., Engle and Granger and Phillips Ouliaris)
- How to estimate the Long Run Model
Cointegration vs spurious Regression
If the variables do not hold a long run relationship, we are in the presence of a spurious regression. But, how do we know if the variables do not hold a long-run equilibrium?
According to Granger and Newbold (1974), the signs of a spurious regression are:
- High $R^2$ and low Durbin Watson statistic. (Rule of thumb: $R^2$ > DW statistic)
- T-Statistics are very high: Variables are highly significant.
- Residuals of the regression between the variables of our model are not stationary.
Implications of spurious regressions
Spurious regressions have no economic interpretation. The model will seem to have significant statistics, but they have no real meaning. Two variables that both have a positive trend may be highly correlated without explaining each other. Spurious regressions lead to nonsensical conclusions.
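The signs listed above can be reproduced with simulated data. The following is a minimal sketch, not the tutorial's EViews workflow: it regresses one random walk on another, independent random walk and reports $R^2$ and the Durbin Watson statistic. The helper names (`simple_ols`, `r_squared`, `durbin_watson`, `random_walk`), the seed and the sample size are all assumptions made for illustration.

```python
import random

def simple_ols(y, x):
    """OLS of y on a constant and x; returns intercept, slope and residuals."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid

def r_squared(y, resid):
    """Share of the variance of y explained by the regression."""
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    ssr = sum(e ** 2 for e in resid)
    return 1.0 - ssr / sst

def durbin_watson(resid):
    """Durbin Watson statistic: near 2 means no first-order autocorrelation."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den

def random_walk(n, rng):
    """Non-stationary I(1) series: cumulative sum of white noise."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

rng = random.Random(42)        # arbitrary seed, for reproducibility only
x = random_walk(500, rng)
y = random_walk(500, rng)      # independent of x by construction

_, slope, resid = simple_ols(y, x)
r2 = r_squared(y, resid)
dw = durbin_watson(resid)
print(f"R^2 = {r2:.3f}, DW = {dw:.3f}, R^2 > DW: {r2 > dw}")
```

Because the two series are unrelated by construction, any apparent fit comes only from their stochastic trends. The very low Durbin Watson value is the more reliable warning sign here; the size of $R^2$ varies a lot from one simulated sample to the next.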
cointegration: money demand example
A classic example in the literature is the money demand model. The classic money market hypothesis states that the market is in equilibrium and clears. (Money Demand = Money Supply). In other words, if money demand increases, money supply will eventually meet the demand. There is a long run relationship.
$$m_t = \beta_0 + \beta_1 p_t + \beta_2 y_t + \beta_3 r_t + e_t$$
Where:
$m_t$ = Money Demand
$p_t$ = Price Level
$y_t$ = Real Income
$r_t$ = Interest Rate
$e_t$ = Stationary Disturbance Term
From classic theory, the estimated terms ($\beta_1$, $\beta_2$, $\beta_3$) should have the following signs:
$\beta_1$ > 0: if the prices in the economy increase, individuals will demand more money to pay for goods and services.
$\beta_2$ > 0: when your income increases, your demand for money increases.
$\beta_3$ < 0: if interest rates increase, the price of money increases. Borrowing money becomes more expensive. Money demand will decrease.
$e_t$: the disturbance term needs to be stationary. Otherwise, any shock in the model would cause a permanent deviation from the equilibrium.
implications of the money demand example
The main issue with the money demand model is that all the variables are non stationary in levels (i.e., they are I(1)).
- How can the model be in equilibrium?
- What are the implications for the model if the variables are non stationary?
If the variables are non stationary, any shock in the model will have a permanent effect. If there are any deviations from the equilibrium, there would be no return. The market would never be in equilibrium and would never clear. If there is a shock in the demand, the demand would increase and supply would never meet the demand.
So how can non-stationary variables be in equilibrium? The answer is simple. There is a linear combination between the variables that makes the residuals stationary.
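$$e_t = m_t - \beta_0 - \beta_1 p_t - \beta_2 y_t - \beta_3 r_t$$
where the disturbance $e_t$, the deviation of money demand $m_t$ from its long run level given prices $p_t$, real income $y_t$ and the interest rate $r_t$, is stationary: $e_t \sim I(0)$.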
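To see why the stationarity of the disturbance matters, consider how a one-unit shock fades over time. This is an illustrative sketch only: it assumes the disturbance follows a simple AR(1) process, $e_t = \rho e_{t-1} + u_t$, and the value $\rho = 0.5$ is an arbitrary choice, not an estimate from the tutorial.

```python
def shock_path(rho, horizons):
    """Remaining effect, h periods later, of a one-unit shock in
    e_t = rho * e_{t-1} + u_t (no further shocks assumed)."""
    return [rho ** h for h in range(horizons + 1)]

unit_root = shock_path(1.0, 8)    # non-stationary case: rho = 1
stationary = shock_path(0.5, 8)   # stationary case: |rho| < 1 (0.5 is illustrative)

for h, (permanent, fading) in enumerate(zip(unit_root, stationary)):
    print(f"h={h}: unit root {permanent:.3f}   stationary {fading:.3f}")
```

With a unit root the shock never dies out, which is the "no return to equilibrium" case described above. With a stationary disturbance the deviation shrinks every period, so the variables are pulled back towards their long run relationship.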
cointegration: engle and granger method
Now that we have covered the difference between cointegration and spurious regressions and have also gone through an example of cointegrated variables, let’s talk about how to verify if two (or more) variables are cointegrated.
Engle and Granger: 2 steps method
Step 1: Long run regression
The first step is to verify that the variables are integrated of the same order. To do so, check for stationarity and ensure that your variables are all I(1) (or all of the same higher order). If the variables have different orders of integration, we cannot proceed.
Once we ensure the variables are integrated of the same order, we can estimate the long run regression. If you are wondering what the long run regression is, it’s just your linear regression model.
In our example, we are estimating a model of the trade balance in Canada. Exports are explained by a constant and imports. What is the intuition behind the model? On the one hand, imports are required for a country to grow and be able to produce goods to export. On the other hand, we know that imports cannot systematically exceed exports. There can be periods when imports grow more than exports (resulting in a negative trade balance), but over the following periods, countries will try to stabilize the external balance. Consequently, there will be a long run relationship between exports and imports. There has to be an equilibrium over time.
long run regression
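The long run regression in our model is represented by:
$$\text{exports}_t = \beta_0 + \beta_1\,\text{imports}_t + e_t$$
Where $\beta_0$ is the constant, $\beta_1$ is the long run coefficient on imports, and $e_t$ is the residual (the deviation from the long run equilibrium).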
save the residuals and check for stationarity
Once we have estimated our model, we can save the residuals and perform the Augmented Dickey-Fuller test on the residual series. We need to ensure that the residuals are stationary.
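The two steps just described can be sketched outside EViews as follows. This is a simplified illustration under stated assumptions: the data are simulated (not the Canadian series), the long run relation `exports = 1.0 + 1.0 * imports + noise` is hypothetical, and the test is a plain Dickey-Fuller regression with no constant and no lagged differences rather than the augmented version. The critical value of about -3.34 is the approximate asymptotic 5% value from MacKinnon's cointegration tables for two variables with a constant.

```python
import math
import random

def simple_ols(y, x):
    """OLS of y on a constant and x; returns intercept, slope and residuals."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid

def df_t_stat(series):
    """t-statistic on rho in: diff(e_t) = rho * e_{t-1} + u_t."""
    lagged = series[:-1]
    diff = [series[t] - series[t - 1] for t in range(1, len(series))]
    sxx = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, diff)) / sxx
    ssr = sum((d - rho * l) ** 2 for l, d in zip(lagged, diff))
    sigma2 = ssr / (len(diff) - 1)
    return rho / math.sqrt(sigma2 / sxx)

rng = random.Random(7)             # arbitrary seed
imports, level = [], 0.0
for _ in range(400):
    level += rng.gauss(0.0, 1.0)   # imports follow a random walk, I(1)
    imports.append(level)
# Hypothetical cointegrated pair: the noise around the relation is stationary.
exports = [1.0 + m + rng.gauss(0.0, 0.5) for m in imports]

# Step 1: long run regression and saved residuals.
intercept, slope, resid = simple_ols(exports, imports)
# Step 2: unit root test on the residuals, judged against cointegration tables.
t_stat = df_t_stat(resid)
critical_5pct = -3.34              # approximate value, see the note above
print(f"long run slope = {slope:.3f}, DF t-stat = {t_stat:.2f}")
print("residuals stationary (cointegration):", t_stat < critical_5pct)
```

A statistic more negative than the critical value rejects the null of a unit root in the residuals, which is the evidence of cointegration the method looks for.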
engle granger cointegration test critical values
If you are using a unit root test (e.g., Augmented Dickey-Fuller) to test the stationarity of the residuals, you need to use the critical values from the cointegration tables. The test statistic reported by the Augmented Dickey-Fuller test is correct; however, the critical values (and the p-value based on them) are not. Why? We cannot use the critical values of the unit root tests for a series that results from an estimated regression (a non-observable series). OLS chooses the coefficients that make the residuals look as stationary as possible, so the test needs more negative critical values.

Using the cointegration tables, we reject the null hypothesis of the augmented Dickey Fuller test. Our residual series is stationary.
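The reason for using separate cointegration tables can be checked with a small Monte Carlo experiment. The sketch below is an illustration under assumptions chosen here (100 observations, 1000 replications, a Dickey-Fuller regression with no constant and no lags): it simulates pairs of independent random walks, so there is no cointegration, and compares the 5% quantile of the test statistic computed on an observed series with the one computed on regression residuals.

```python
import math
import random

def random_walk(n, rng):
    """I(1) series: cumulative sum of white noise."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def ols_residuals(y, x):
    """Residuals from an OLS regression of y on a constant and x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]

def df_t_stat(series):
    """t-statistic on rho in: diff(e_t) = rho * e_{t-1} + u_t."""
    lagged = series[:-1]
    diff = [series[t] - series[t - 1] for t in range(1, len(series))]
    sxx = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, diff)) / sxx
    ssr = sum((d - rho * l) ** 2 for l, d in zip(lagged, diff))
    sigma2 = ssr / (len(diff) - 1)
    return rho / math.sqrt(sigma2 / sxx)

def fifth_percentile(values):
    """Empirical 5% quantile, used as a simulated critical value."""
    ordered = sorted(values)
    return ordered[int(0.05 * len(ordered))]

rng = random.Random(123)           # arbitrary seed
n_obs, n_reps = 100, 1000
observed_stats, residual_stats = [], []
for _ in range(n_reps):
    x = random_walk(n_obs, rng)
    y = random_walk(n_obs, rng)    # independent of x: the null is true
    observed_stats.append(df_t_stat(y))
    residual_stats.append(df_t_stat(ols_residuals(y, x)))

cv_observed = fifth_percentile(observed_stats)
cv_residual = fifth_percentile(residual_stats)
print(f"5% critical value, observed series:      {cv_observed:.2f}")
print(f"5% critical value, regression residuals: {cv_residual:.2f}")
```

The simulated critical value for the residual series is clearly more negative than the one for an observed series. Judging the residuals against ordinary unit root tables would therefore reject the null too often and find cointegration where there is none.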
EViews Tutorial
Part 2
- Estimate the short run model
- Error correction term details
- Model Diagnostics
- In sample Forecast
short run model: error correction model
We have estimated the long run model. Now it’s time to estimate the short run model. For the short run model, the variables need to be in differences (their stationary form), and we need to add the error correction term to the model. The error correction term is the series of residuals from the long run regression, lagged one period.
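error correction term - residuals of the long run regression lagged one period (1)
$$e_{t-1} = \text{exports}_{t-1} - \beta_0 - \beta_1\,\text{imports}_{t-1}$$
where $\beta_0$ and $\beta_1$ are the long run coefficients.
variables in their stationary form (in 1st differences) (2)
$$\Delta\text{exports}_t = \alpha_0 + \alpha_1\,\Delta\text{imports}_t + \gamma\, e_{t-1} + u_t$$
Plug equation (1) into (2) and you obtain the short run model
$$\Delta\text{exports}_t = \alpha_0 + \alpha_1\,\Delta\text{imports}_t + \gamma\,(\text{exports}_{t-1} - \beta_0 - \beta_1\,\text{imports}_{t-1}) + u_t$$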
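The short run model can be estimated in two OLS passes, sketched below on simulated data. Everything specific here is an assumption made for illustration: the data generating process (a true adjustment coefficient of -0.3 and a long run relation `exports = 1.0 + imports`), the variable names, and the small `ols` helper that solves the normal equations by Gauss-Jordan elimination. The coefficient on the lagged residual is called `gamma` in the code.

```python
import random

def ols(y, columns):
    """OLS coefficients of y on the given regressor columns (normal equations)."""
    k = len(columns)
    # Augmented matrix [X'X | X'y].
    a = [[sum(ci * cj for ci, cj in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(ci * yi for ci, yi in zip(columns[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * p for v, p in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

rng = random.Random(11)            # arbitrary seed
n = 500
imports, exports = [0.0], [1.0]
for _ in range(1, n):
    d_imports = rng.gauss(0.0, 1.0)
    gap = exports[-1] - 1.0 - imports[-1]      # last period's disequilibrium
    d_exports = 0.5 * d_imports - 0.3 * gap + rng.gauss(0.0, 0.5)
    imports.append(imports[-1] + d_imports)
    exports.append(exports[-1] + d_exports)

# Step 1: long run regression in levels, save the residuals.
beta0, beta1 = ols(exports, [[1.0] * n, imports])
resid = [x - beta0 - beta1 * m for x, m in zip(exports, imports)]

# Step 2: short run regression in first differences plus the lagged residual.
d_x = [exports[t] - exports[t - 1] for t in range(1, n)]
d_m = [imports[t] - imports[t - 1] for t in range(1, n)]
ect = resid[:-1]                               # residuals lagged one period
alpha0, alpha1, gamma = ols(d_x, [[1.0] * (n - 1), d_m, ect])

print(f"long run:  exports = {beta0:.2f} + {beta1:.2f} * imports")
print(f"short run: d_exports = {alpha0:.2f} + {alpha1:.2f} * d_imports + ({gamma:.2f}) * ect")
```

In this simulation the estimated `gamma` should land near the -0.3 used to generate the data, that is, negative and between -1 and 0.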
cointegration: error correction term interpretation
- $\gamma$ is the estimated coefficient of the error correction term, where $-1 < \gamma < 0$
- Values outside this range point to explosive results. You need to review/re-estimate your model.
- The coefficient determines the “speed of adjustment” towards the long run equilibrium.
- The deviations from the long-run equilibrium are corrected gradually by the error correction term through a series of partial short-run adjustments.
- If the error correction coefficient is close to -1, it means that almost 100% of the deviations are corrected within a period (the period depends on the frequency of your data: daily, monthly, quarterly, etc.).
- If the error correction coefficient is close to 0, the model is very slow to return to equilibrium after a shock.
short run and long run estimated model
long run model
short run model
interpretation of the model
- A 1% increase in imports results in a 1.02% increase in exports. The result is plausible, since in the long run exports grow more than imports.
- The short run model is telling us that 11% of the deviations from the long run equilibrium are corrected within a period (In our example, a period is a quarter since we are working with quarterly data).
- The speed of adjustment in our model is slow. It would take around 9 periods to achieve equilibrium.
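The arithmetic behind the last two points can be sketched as follows. The only number taken from the tutorial is the 11% correction per quarter; the geometric decay used for the remaining gap and the half-life is an assumption (each quarter closes 11% of whatever deviation is left, and no new shocks arrive).

```python
import math

gamma = -0.11                      # error correction coefficient reported above

share_corrected = abs(gamma)                   # share of the gap closed per quarter
periods_rule_of_thumb = 1.0 / share_corrected  # the "around 9 periods" figure
remaining_after_9 = (1.0 + gamma) ** 9         # share of the gap left after 9 quarters
half_life = math.log(0.5) / math.log(1.0 + gamma)

print(f"1 / |gamma|              = {periods_rule_of_thumb:.2f} quarters")
print(f"gap left after 9 periods = {remaining_after_9:.1%}")
print(f"half-life of a deviation = {half_life:.2f} quarters")
```

Under that assumption about a third of the original deviation is still present after 9 quarters, so the figure of 9 periods is best read as 1/0.11, an average adjustment time, rather than the date at which equilibrium is fully restored.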
Final step: model diagnostics
residuals normality test
- Jarque-Bera Statistic for testing normality
- H0= Residuals are Normally distributed
- If p>0.05, we fail to reject H0: the residuals can be treated as normally distributed
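The Jarque-Bera statistic itself is simple to compute from the skewness and kurtosis of the residuals. The sketch below is a hand-rolled illustration, not the EViews output: the function name `jarque_bera` and the two simulated residual series are assumptions. The p-value uses the fact that the statistic is asymptotically chi-square with 2 degrees of freedom, whose upper tail probability is exp(-JB/2).

```python
import math
import random

def jarque_bera(resid):
    """Return the Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((e - mean) ** 2 for e in resid) / n
    m3 = sum((e - mean) ** 3 for e in resid) / n
    m4 = sum((e - mean) ** 4 for e in resid) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # upper tail of a chi-square with 2 df
    return jb, p_value

rng = random.Random(3)             # arbitrary seed
normal_resid = [rng.gauss(0.0, 1.0) for _ in range(1000)]
skewed_resid = [rng.expovariate(1.0) for _ in range(1000)]   # clearly non-normal

jb_normal, p_normal = jarque_bera(normal_resid)
jb_skewed, p_skewed = jarque_bera(skewed_resid)
print(f"normal draws: JB = {jb_normal:.2f}, p = {p_normal:.3f}")
print(f"skewed draws: JB = {jb_skewed:.2f}, p = {p_skewed:.3g}")
```

The strongly skewed series is rejected decisively. For the normal draws the p-value is usually above 0.05, although any single sample can fall below it by chance.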
serial correlation
- As we have lagged variables, the DW statistic is no longer valid.
- You can check for the correlogram – Q statistic
- H0= No serial Correlation
- or, Serial Correlation LM test – Breusch-Godfrey test
- H0= No Serial Correlation
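The Q statistic behind the correlogram can be sketched by hand. This illustration uses the Ljung-Box form of the statistic with 4 lags, both of which are choices made here and not taken from the tutorial. The p-value uses the closed-form upper tail of a chi-square distribution with an even number of degrees of freedom, and treats the degrees of freedom as equal to the number of lags, which is an approximation for regression residuals.

```python
import math
import random

def ljung_box(resid, lags):
    """Ljung-Box Q statistic for autocorrelation up to the given lag."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [e - mean for e in resid]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, lags + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total

def chi2_sf_even(x, df):
    """Upper tail probability of a chi-square with an even number of df."""
    half = x / 2.0
    term, total = 1.0, 0.0
    for j in range(df // 2):
        total += term
        term *= half / (j + 1)
    return math.exp(-half) * total

rng = random.Random(9)             # arbitrary seed
white_noise = [rng.gauss(0.0, 1.0) for _ in range(500)]
correlated = [0.0]
for _ in range(499):
    correlated.append(0.8 * correlated[-1] + rng.gauss(0.0, 1.0))   # AR(1) errors

lags = 4
q_white = ljung_box(white_noise, lags)
q_corr = ljung_box(correlated, lags)
p_white = chi2_sf_even(q_white, lags)
p_corr = chi2_sf_even(q_corr, lags)
print(f"white noise:  Q = {q_white:.2f}, p = {p_white:.3f}")
print(f"AR(1) errors: Q = {q_corr:.2f}, p = {p_corr:.3g}")
```

A small p-value rejects H0 of no serial correlation, as happens for the autocorrelated series in this example.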
heteroskedasticity
- In the presence of heteroskedasticity, the usual OLS standard errors are no longer valid
- Conduct the Breusch-Pagan-Godfrey test
- H0= Homoskedasticity
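The idea of the test can be sketched with an auxiliary regression. The example below is a simplified illustration on simulated data with a single regressor: it regresses the squared residuals on the regressor and uses the number of observations times the auxiliary $R^2$ as the LM statistic, which is asymptotically chi-square with 1 degree of freedom here. The data generating process, where the error variance grows with `x`, and all names are assumptions.

```python
import math
import random

def simple_ols(y, x):
    """OLS of y on a constant and x; returns intercept, slope and residuals."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid

def r_squared(y, resid):
    """Share of the variance of y explained by the regression."""
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    ssr = sum(e ** 2 for e in resid)
    return 1.0 - ssr / sst

rng = random.Random(5)             # arbitrary seed
x = [rng.uniform(1.0, 5.0) for _ in range(500)]
# Heteroskedastic errors by construction: their spread is proportional to x.
y = [1.0 + 2.0 * xi + xi * rng.gauss(0.0, 1.0) for xi in x]

# Main regression, then auxiliary regression of squared residuals on x.
_, _, resid = simple_ols(y, x)
squared = [e * e for e in resid]
_, _, aux_resid = simple_ols(squared, x)

lm_stat = len(x) * r_squared(squared, aux_resid)   # Obs * R-squared
p_value = math.erfc(math.sqrt(lm_stat / 2.0))      # upper tail of chi-square(1)
print(f"LM = {lm_stat:.2f}, p = {p_value:.3g}")
print("reject homoskedasticity:", p_value < 0.05)
```

Since the error variance was built to rise with the regressor, H0 of homoskedasticity is rejected in this example. With more regressors the degrees of freedom equal the number of regressors in the auxiliary regression, excluding the constant.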
in sample forecast
