# Vector autoregression (VAR) in R

In this post, I want to show how to run a vector autoregression (VAR) in `R`. First, I'm gonna explain with the help of a finance example when this method comes in handy and then I'm gonna run one with the help of the `vars` package.

# Some theory

So what exactly is a VAR? Without going into too much detail here, it's basically just a generalization of a univariate autoregression (AR) model. An AR model explains *one* variable linearly with its own previous values, while a VAR explains a *vector* of variables with the vector's previous values. The VAR model is a statistical tool in the sense that it just fits the coefficients that best describe the data at hand. You should still have some economic intuition on why you put the variables in your vector. For instance, you could easily estimate a VAR with a time-series of the number of car sales in Germany and the temperature in Australia. However, it's hard to sell to someone *why* you are doing this, even if you found that one variable helps explain the other...
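To make the comparison concrete, here is how the two models look with one lag. The symbols are my own choice for illustration ($y_t$ for the single variable, $\phi$ for its AR coefficient, $z_t$ for the vector of variables, $c$ for the constants, $\varepsilon$ for the AR shock); $\Gamma$ and $u$ match the names used in the code further below.

$$ y_{t+1} = c + \phi \, y_t + \varepsilon_{t+1} \qquad \text{(AR(1), one variable)} $$

$$ z_{t+1} = c + \Gamma z_t + u_{t+1} \qquad \text{(VAR(1), vector of variables)} $$

In the VAR, $\Gamma$ is a square matrix, so every variable is allowed to depend on the previous value of every variable in the vector.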

Let's look at an example of a VAR often applied in finance (starting with Campbell/Ammer, 1993). Concretely, I implement an approach to decompose unexpected returns into two parts: cash flow (CF) news and discount rate (DR) news. This is an important issue, as pointed out for instance by Chen/Zhao (2009): Return decomposition, whose notation I'm going to use here as well:

> Naturally, financial economists place keen interest in the relative importance of CF news and DR news—the two fundamental components of asset valuation—in determining the time-series and cross-sectional variations of stock returns. Relatively speaking, CF news is more related to firm fundamentals because of its link to production; DR news can reflect time-varying risk aversion or investor sentiment. Their relative importance thus helps greatly to understand how the financial market works, and provides the empirical basis for theoretical modeling.

However, financial economists are also interested in *why* returns didn't turn out to be the same as expected. Well obviously, some news must have arrived in period $t+1$ which led to a revision and adjustment of the stock price, which in turn leads to a different return. The Campbell/Shiller decomposition shows that there are only two relevant components: news about future expected cash flows and news about future expected returns. As the above quote already shows, separating these two is an important issue in financial research.

This derivation looks more complicated than it is. Basically, it's just the application of a perpetuity or infinite geometric series. Why do we have to apply a perpetuity here? Well, the VAR tells us that returns today are explained by returns from last period multiplied by a persistence factor and a random shock. However, returns last period were explained by returns two periods ago and so on. So this means that every shock is not transitory (which means it only has relevance for one period), but is persistent.
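To see the perpetuity logic at work in the simplest possible case, here is a minimal sketch with a single variable instead of a vector. The values of `phi` (persistence) and `rho` (discount factor) are made up for illustration and are not estimated from anything.

```r
# Made-up illustration values (not estimated from any data)
phi <- 0.5   # persistence of a shock from one period to the next
rho <- 0.96  # discount factor

# The hard way: add up the discounted effects of one shock, period by period
j <- 1:200
sum_hard <- sum((rho * phi)^j)

# The perpetuity (geometric series) formula for the same sum
closed_form <- rho * phi / (1 - rho * phi)

all.equal(sum_hard, closed_form)
```

The matrix version used for the VAR follows the same idea, just with a matrix in place of the scalar `rho * phi`.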

Also, maybe some of you are like me and get a headache when dealing with matrix multiplication. For those, I want to explain the computation of $\lambda$ in a little more detail. It relies on a generalization of the geometric series, which is called a Neumann series in mathematics. It states that

$$ (I - A)^{-1} = \sum_{j=0}^{\infty} A^j $$

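This formula only works if the series converges, which is the case exactly when all eigenvalues of $A$ are smaller than 1 in absolute value. (A sufficient, but not necessary, condition is that the absolute values in each row sum to less than 1.) There are two subtleties to note though: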

- $A^j$ does not mean an element-wise operation, but a $j$-times multiplication of the matrix with itself. In `R`, though, if you just write $A^j$, you get the former, not the latter. If you want the latter, you have to use a special operator from the `expm` package (see discussion on SO). (I don't want to confuse you, so to be clear: you don't need that package here because the above formula is so much easier than applying the sum-formula. But if you want to check that the formula is correct, you can't just call $A^j$ in `R`.)
- $A^{-1}$ in `R` is not identical to what is meant here! In `R`, it just returns the reciprocal of each element. In mathematics, it means that the inverse of a matrix is needed ($A A^{-1} = I$).

The big takeaway is that you have to be really careful when implementing matrix formulas in `R`. I don't have a mathematical background, so I always start the most obvious way, i.e. just type $A^j$ and $A^{-1}$, and get completely nonsensical results.
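A quick way to convince yourself of the difference is to compare the element-wise operators with their matrix counterparts in base `R`. This is just a small sketch on a made-up 2x2 matrix `A`:

```r
A <- matrix(c(.1, .2, .3, .4), nrow = 2)

A^2          # element-wise: every entry squared
A %*% A      # matrix product: A multiplied with itself

A^-1         # element-wise: reciprocal of every entry
solve(A)     # matrix inverse

# Only the matrix inverse satisfies A %*% inverse = identity
all.equal(A %*% solve(A), diag(2))
```

So for matrix formulas, `%*%` and `solve()` are the operations you usually want.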

So let's check that the Neumann series formula actually works. Here, I start with $j=1$ instead of $j=0$, so the formula has to be $A (I - A)^{-1}$.

```
library(expm)
```

```
## Loading required package: Matrix
## Loading required package: lattice
## Attaching package: 'expm'
## The following object(s) are masked from 'package:Matrix':
##
## expm
```

```
# Example matrix (filled column by column)
mat <- matrix(c(.1, .2, .3, .4), nrow=2)
# The hard way: add up the matrix powers mat^1 + mat^2 + ... + mat^1000
res <- matrix(0, nrow=2, ncol=2)
for (j in 1:1000) {
  res <- res + (mat %^% j)
}
res
```

```
## [,1] [,2]
## [1,] 0.2500 0.625
## [2,] 0.4167 0.875
```

```
mat %*% solve(diag(2) - mat)
```

```
## [,1] [,2]
## [1,] 0.2500 0.625
## [2,] 0.4167 0.875
```

As you can see, applying the Neumann series formula or doing it the hard way leads to the same result.
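If you don't want to install `expm` just for this check, the same comparison can be sketched in base `R` by multiplying the matrix with itself step by step. The snippet also checks the convergence condition of the Neumann series, namely that the largest eigenvalue in absolute value is below 1. The variable names `spectral_radius`, `power` and `total` are my own.

```r
mat <- matrix(c(.1, .2, .3, .4), nrow = 2)

# Convergence check: largest eigenvalue in absolute value must be below 1
spectral_radius <- max(Mod(eigen(mat)$values))
spectral_radius < 1

# Sum mat^1 + ... + mat^1000 by repeated matrix multiplication
power <- diag(2)
total <- matrix(0, nrow = 2, ncol = 2)
for (j in 1:1000) {
  power <- power %*% mat   # power now holds mat^j
  total <- total + power
}

all.equal(total, mat %*% solve(diag(2) - mat))
```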

(This, by the way, is the big objection Chen/Zhao (2009) have against the return decomposition approach. Most studies model the DR news part directly and the CF news part is backed out. So every modeling error one makes ends up in the residual, which is nothing else than the CF news part. Hence, one can no longer distinguish between modeling noise and true CF news. They support their argument with two nice examples. First, they show that the return decomposition approach results in high CF news for government bonds although those securities do not have any CF news part by definition. Second, they show that this approach yields very different results for stocks, depending on the state variable that is used. This supports the hypothesis that the CF news is mostly modeling noise. If you are interested in this literature, make sure to also read Engsted/Pedersen/Tanggaard (2012): Pitfalls in VAR based return decompositions: A clarification. They respond to the critique brought forward by Chen/Zhao and defend the VAR based return decomposition approach.)

Alternatively, if we include log dividend growth in the state vector as the second element, we can compute the CF news part directly as

where $e2$ is a vector where the second element is 1 and the rest 0.
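Spelled out, and judging from how `CF_news` is computed in the code further below, this should read as follows. The symbols $N_{CF,t+1}$ for the CF news and $u_{t+1}$ for the vector of VAR residuals are my assumption about the notation:

$$ N_{CF,t+1} = e2' (I + \lambda) u_{t+1} $$

The identity matrix $I$ picks up the shock to dividend growth in $t+1$ itself, while $\lambda$ collects the discounted revisions for all later periods.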

# Implementation

Let's try to replicate the results in table 4 of Chen/Da/Zhao (2013): What Drives Stock Price Movements? because they use state variables that are all available in the data set of Amit Goyal.

In the code below, their vector of state variables corresponds to the columns `logretdiv`, `div_growth`, `dp` and `eqis`.

OK, let's read in the data. (You can find more information about the data set in my Goyal/Welch (2008) replication post.)

```
intYear <- 1927
# Use that to check out the other time period
# intYear <- 1946
library(data.table)
library(ggplot2)
library(lubridate)
library(vars)
library(reshape2)
annual <- read.csv2("/home/christophj/Dropbox/FinanceIssues/ReturnPredictability/Data/annual_update_2010.csv",
                    na.strings=c("NaN", "NA"), stringsAsFactors=FALSE)
annual <- as.data.table(annual)
# Index level plus dividends, log dividend-price and log earnings-price ratio
annual <- annual[, IndexDiv := Index + D12]
annual <- annual[, dp := log(D12) - log(Index)]
annual <- annual[, ep := log(E12) - log(Index)]
# Log dividend yield: current dividends relative to last period's index level
vec_dy <- c(NA, annual[2:nrow(annual), log(D12)] - annual[1:(nrow(annual)-1), log(Index)])
annual <- annual[, dy := vec_dy]
# Log returns without and with dividends
annual <- annual[, logret := c(NA, diff(log(Index)))]
vec_logretdiv <- c(NA, annual[2:nrow(annual), log(IndexDiv)] - annual[1:(nrow(annual)-1), log(Index)])
annual <- annual[, logretdiv := vec_logretdiv]
# Log risk-free rate, log excess return and log dividend growth
annual <- annual[, logRfree := log(Rfree + 1)]
annual <- annual[, rp_div := logretdiv - logRfree]
annual <- annual[, div_growth := c(NA, diff(log(D12)))]
# State vector for the VAR, restricted to the chosen sample period
vec_state <- annual[yyyy >= intYear, list(logretdiv, div_growth, dp, eqis)]
```

So now that we have the vector of state variables, we can estimate the VAR. To do so, we use the package `vars` in `R`.

```
var_est <- VAR(vec_state, p=1, type="const")
summary(var_est)
```

```
##
## VAR Estimation Results:
## =========================
## Endogenous variables: logretdiv, div_growth, dp, eqis
## Deterministic variables: const
## Sample size: 83
## Log Likelihood: 531.442
## Roots of the characteristic polynomial:
## 0.949 0.419 0.419 0.407
## Call:
## VAR(y = vec_state, p = 1, type = "const")
##
##
## Estimation results for equation logretdiv:
## ==========================================
## logretdiv = logretdiv.l1 + div_growth.l1 + dp.l1 + eqis.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## logretdiv.l1 0.1231 0.1078 1.14 0.2572
## div_growth.l1 -0.2007 0.1762 -1.14 0.2581
## dp.l1 0.1160 0.0477 2.43 0.0174 *
## eqis.l1 -0.5069 0.1954 -2.59 0.0113 *
## const 0.5719 0.1722 3.32 0.0014 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.185 on 78 degrees of freedom
## Multiple R-Squared: 0.137, Adjusted R-squared: 0.0932
## F-statistic: 3.11 on 4 and 78 DF, p-value: 0.0199
##
##
## Estimation results for equation div_growth:
## ===========================================
## div_growth = logretdiv.l1 + div_growth.l1 + dp.l1 + eqis.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## logretdiv.l1 0.3484 0.0525 6.64 3.8e-09 ***
## div_growth.l1 0.2034 0.0857 2.37 0.02 *
## dp.l1 -0.0199 0.0232 -0.86 0.39
## eqis.l1 -0.1282 0.0951 -1.35 0.18
## const -0.0409 0.0838 -0.49 0.63
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.0899 on 78 degrees of freedom
## Multiple R-Squared: 0.439, Adjusted R-squared: 0.41
## F-statistic: 15.3 on 4 and 78 DF, p-value: 2.94e-09
##
##
## Estimation results for equation dp:
## ===================================
## dp = logretdiv.l1 + div_growth.l1 + dp.l1 + eqis.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## logretdiv.l1 0.2267 0.1212 1.87 0.065 .
## div_growth.l1 0.4237 0.1979 2.14 0.035 *
## dp.l1 0.8928 0.0536 16.65 <2e-16 ***
## eqis.l1 0.3895 0.2196 1.77 0.080 .
## const -0.4819 0.1935 -2.49 0.015 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.208 on 78 degrees of freedom
## Multiple R-Squared: 0.81, Adjusted R-squared: 0.8
## F-statistic: 83 on 4 and 78 DF, p-value: <2e-16
##
##
## Estimation results for equation eqis:
## =====================================
## eqis = logretdiv.l1 + div_growth.l1 + dp.l1 + eqis.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## logretdiv.l1 0.17269 0.05241 3.30 0.0015 **
## div_growth.l1 0.00365 0.08562 0.04 0.9661
## dp.l1 0.04280 0.02320 1.85 0.0688 .
## eqis.l1 0.46124 0.09498 4.86 6e-06 ***
## const 0.23049 0.08369 2.75 0.0073 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.0898 on 78 degrees of freedom
## Multiple R-Squared: 0.36, Adjusted R-squared: 0.328
## F-statistic: 11 on 4 and 78 DF, p-value: 4.04e-07
##
##
##
## Covariance matrix of residuals:
## logretdiv div_growth dp eqis
## logretdiv 0.03415 0.001279 -0.034346 0.001159
## div_growth 0.00128 0.008078 0.006991 0.000426
## dp -0.03435 0.006991 0.043096 -0.000821
## eqis 0.00116 0.000426 -0.000821 0.008064
##
## Correlation matrix of residuals:
## logretdiv div_growth dp eqis
## logretdiv 1.0000 0.0770 -0.895 0.0699
## div_growth 0.0770 1.0000 0.375 0.0527
## dp -0.8953 0.3747 1.000 -0.0440
## eqis 0.0699 0.0527 -0.044 1.0000
```

The function call is pretty much self-explanatory. We estimate a VAR with only one lag. However, let's explain the output of the `summary` function a little.

There are basically four regression summaries stacked up. This makes sense if you check the definition of a VAR further above again; a VAR basically wants to explain every current value of a variable with its previous value (in the case of `p=1`, otherwise with its previous value**s**) and the previous values of the other variables in the vector. Since we only want to allow for linear relations between those variables, we are basically estimating an OLS regression for every variable in the vector. So we can easily replicate the results by running the OLS ourselves. Let's do that for the `eqis` variable in the data set. (I will use the package `dyn` for that because we have to lag the independent variables. To use the package, I have to transform the vector of state variables into a time-series object.)

```
library(dyn)
summary(dyn$lm(eqis ~ lag(logretdiv, -1) +
lag(div_growth, -1) +
lag(dp, -1) +
lag(eqis, -1),
data=ts(vec_state)))
```

```
##
## Call:
## lm(formula = dyn(eqis ~ lag(logretdiv, -1) + lag(div_growth,
## -1) + lag(dp, -1) + lag(eqis, -1)), data = ts(vec_state))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.2830 -0.0504 -0.0152 0.0311 0.3425
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.23049 0.08369 2.75 0.0073 **
## lag(logretdiv, -1) 0.17269 0.05241 3.30 0.0015 **
## lag(div_growth, -1) 0.00365 0.08562 0.04 0.9661
## lag(dp, -1) 0.04280 0.02320 1.85 0.0688 .
## lag(eqis, -1) 0.46124 0.09498 4.86 6e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0898 on 78 degrees of freedom
## (2 observations deleted due to missingness)
## Multiple R-squared: 0.36, Adjusted R-squared: 0.328
## F-statistic: 11 on 4 and 78 DF, p-value: 4.04e-07
```

As you can see, we are getting exactly the same coefficients this way.
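If you want to play with the "one OLS per equation" idea without the Goyal data set, here is a minimal base `R` sketch on simulated data. Everything in it is made up for illustration: the two variables `x` and `y`, the coefficient matrix `Gamma_true`, and the sample size. Lagging is done by hand, by simply dropping the first and the last row.

```r
set.seed(42)

# Hypothetical VAR(1) coefficient matrix for two made-up variables x and y
Gamma_true <- matrix(c(0.5, 0.1,
                       0.2, 0.3), nrow = 2, byrow = TRUE)

# Simulate the process: this period's values depend on last period's values
n <- 500
z <- matrix(0, nrow = n, ncol = 2, dimnames = list(NULL, c("x", "y")))
for (i in 2:n) {
  z[i, ] <- Gamma_true %*% z[i - 1, ] + rnorm(2, sd = 0.1)
}

current <- z[-1, ]   # values in periods 2, ..., n
lagged  <- z[-n, ]   # values in periods 1, ..., n - 1

# One call, but effectively one OLS per column of `current`
fit_all <- lm(current ~ lagged)

# Running the equation for y on its own gives the same coefficients
fit_y <- lm(current[, "y"] ~ lagged)
all.equal(unname(coef(fit_all)[, "y"]), unname(coef(fit_y)))
```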

Next, let's calculate $\lambda$. To do so, $\rho$ is set to $0.96$.

```
rho <- 0.96
Gamma <- t(sapply(coef(var_est), FUN=function(df) {df[1:4, 1]}))
lambda <- (rho * Gamma) %*% solve(diag(4) - rho * Gamma)
```

A short explanation of how `Gamma` is computed. First, remember that $\Gamma$ is the matrix of coefficients that basically completely describes the VAR. So for instance, to explain the log equity return, which is the first element of the state vector, we use the first row of $\Gamma$. The first element in this row is the OLS coefficient of the previous log equity return in the regression for the current log equity return, the second element is the coefficient of the previous log dividend growth in that same regression, and so forth.

However, the function `coef` applied to a `vars` object doesn't return such a matrix, but a list of results, where each element of the list basically holds the results of one OLS. So we want to loop through those list elements and get the coefficients, which are the first four rows in the first column of each list element. This is exactly what is done in the `sapply` call.
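To make the `sapply` step less abstract, here is a small sketch on a hand-made list that mimics the structure described above for a VAR with two made-up variables `x` and `y`. The list `mock_coef` and all numbers in it are invented, and the tables are shortened to two columns to keep things readable.

```r
# Mock of the list structure: one coefficient table per equation
mock_coef <- list(
  x = matrix(c(0.5, 0.1, 0.02,    # column 1: estimates
               0.1, 0.1, 0.05),   # column 2: standard errors
             nrow = 3,
             dimnames = list(c("x.l1", "y.l1", "const"),
                             c("Estimate", "Std. Error"))),
  y = matrix(c(0.2, 0.3, -0.01,
               0.1, 0.1, 0.05),
             nrow = 3,
             dimnames = list(c("x.l1", "y.l1", "const"),
                             c("Estimate", "Std. Error")))
)

# Take the first two estimates (the lag coefficients, not the constant)
# from each table; sapply() returns them as columns, hence the transpose
Gamma <- t(sapply(mock_coef, FUN = function(tab) tab[1:2, 1]))
Gamma
```

After the transpose, each row of `Gamma` belongs to one equation and each column to one lagged variable, which is the layout needed for the matrix formula.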

Now we have all the ingredients to compute both the DR and the CF news. Also, the return news is just the residual:

```
u <- resid(var_est)[, 1]
DR_news <- as.vector(c(1,0,0,0) %*% lambda %*% t(resid(var_est)))
CF_news <- as.vector(c(0,1,0,0) %*% (diag(4) + lambda) %*% t(resid(var_est)))
#Alternatively, backed out
CF_news_backed <- as.vector((c(1,0,0,0) + c(1,0,0,0) %*% lambda) %*% t(resid(var_est)))
#Other ways of writing that
#CF_news_backed <- as.vector(c(1,0,0,0) %*% t(resid(var_est)) + c(1,0,0,0) %*% lambda %*% t(resid(var_est)))
#CF_news_backed <- u + DR_news
```

```
#Regression coefficients as reported in table 4 of Chen/Da/Zhao (2013)
summary(lm(DR_news ~ u))$coef[2, 1]*-1
```

```
## [1] 0.5612
```

```
summary(lm(CF_news_backed ~ u))$coef[2, 1]
```

```
## [1] 0.4388
```

```
summary(lm(CF_news ~ u))$coef[2, 1]
```

```
## [1] 0.4038
```

```
#Variance decomposition; terms have to add to 1
var(DR_news)/var(u)
```

```
## [1] 0.586
```

```
var(CF_news_backed)/var(u)
```

```
## [1] 0.4637
```

```
-2*cov(CF_news_backed, DR_news)/var(u)
```

```
## [1] -0.0497
```

Those results are pretty similar to Chen/Da/Zhao (2013). So for the time period 1927 to 2010, DR and CF news seem to be about equally important. If you set `intYear <- 1946`, however, the regression coefficient of discount rate news on unexpected return is over 1, while CF news has a negative coefficient. This means that positive news on cash flows has a negative impact on returns, which is counterintuitive. As you can see, this approach is quite sensitive to the time period.
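Why do the variance terms add up to 1, and the two regression coefficients (with the sign of the DR coefficient flipped) as well? Because the backed-out CF news is defined as `u + DR_news`, so the unexpected return is always CF news minus DR news. Here is a minimal sketch with simulated stand-ins for the three series; the numbers are random and have nothing to do with the actual data.

```r
set.seed(1)

# Made-up stand-ins for the series computed above
u       <- rnorm(100, sd = 0.2)               # unexpected returns
DR_news <- -0.5 * u + rnorm(100, sd = 0.1)    # discount rate news
CF_news_backed <- u + DR_news                 # backed-out cash flow news

# Variance decomposition: the three terms sum to 1
terms <- c(var(DR_news),
           var(CF_news_backed),
           -2 * cov(CF_news_backed, DR_news)) / var(u)
sum(terms)

# Regression slopes: CF slope minus DR slope equals 1
slope_DR <- coef(lm(DR_news ~ u))[2]
slope_CF <- coef(lm(CF_news_backed ~ u))[2]
unname(slope_CF - slope_DR)
```

Note that this only holds by construction for the backed-out CF news. The directly modeled `CF_news` does not have to satisfy it, which is why its coefficient above differs.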
