
Day 3: Heterogeneity and Dynamics in Data

Dynamic Regression Models

Robert W. Walker

August 8, 2022


Outline for Day 3

  1. Dynamic Models
  2. GLS vs. OLS and fixes

Day 3: Dynamic Models

The ARIMA approach is fundamentally inductive: the workflow uses empirical ACFs and PACFs to drive model selection. Dynamic models instead engage theory/structure, imposing more stringent assumptions to produce estimates.


Time Series Linear Models/Dynamic Models

First, a result: the Aitken Theorem.

In a now-classic paper, Aitken generalized the Gauss-Markov theorem to the class of Generalized Least Squares estimators. It is important to note that these are GLS and not FGLS estimators. What is the difference? The two GLS estimators considered by Stimson are not, strictly speaking, GLS.

Definition: $\hat{\beta}_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y$

Properties

(1) GLS is unbiased.
(2) GLS is consistent.
(3) GLS is asymptotically normal.
(4) GLS is MV(L)UE: the minimum-variance (linear) unbiased estimator.
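To fix ideas, here is a minimal sketch of the Aitken estimator in R with a known AR(1) $\Omega$; all names and parameter values are illustrative, and in practice $\Omega$ is unknown (the FGLS problem discussed below).

# A sketch of Aitken/GLS with a *known* AR(1) Omega; values are illustrative.
set.seed(1)
T <- 100; rho <- 0.6
X <- cbind(1, rnorm(T))                            # design with a constant
e <- as.numeric(arima.sim(list(ar = rho), n = T))  # AR(1) errors
y <- X %*% c(1, 2) + e                             # true beta = (1, 2)
Omega <- toeplitz(rho^(0:(T - 1)))                 # AR(1) correlation structure
Oinv <- solve(Omega)
beta_gls <- solve(t(X) %*% Oinv %*% X) %*% t(X) %*% Oinv %*% y
beta_gls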



A Quick Example

The variance/covariance matrix of the errors for a first-order autoregressive process is useful to derive.

  1. The matrix is banded; observations separated by one point in time are correlated $\rho$. Two periods of separation gives $\rho^2$; the corners are $\rho^{T-1}$. The diagonal is one.

  2. What I have actually described is the correlation; the relevant autocovariances are actually defined by $\frac{\sigma^2 \rho^s}{1-\rho^2}$, where $s$ denotes the time-period separation.

  3. It is also straightforward to prove (tediously, by induction) that this matrix is invertible; it is square and the determinant is non-zero, having assumed that $|\rho| < 1$.

    • The 2x2 determinant is $1 - \rho^2$.
    • The 3x3 is $1(1-\rho^2) - \rho(\rho - \rho^3) + \rho^2(\rho^2 - \rho^2)$. The first term is positive and the second term is non-zero so long as $\rho \neq 0$. But even if $\rho = 0$, we would have an identity matrix, which is invertible.
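A quick numerical check of those determinants (a sketch; $\rho = 0.5$ is an arbitrary choice):

rho <- 0.5
det(toeplitz(c(1, rho)))          # 2x2 determinant: 1 - rho^2 = 0.75
det(toeplitz(c(1, rho, rho^2)))   # 3x3 determinant: (1 - rho^2)^2 = 0.5625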

$$\Phi = \sigma^2\Psi = \sigma^2_e\begin{pmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{T-1} \\ \rho & 1 & \rho & \cdots & \rho^{T-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{T-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \rho^{T-3} & \cdots & 1 \end{pmatrix}$$

given that $e_t = \rho e_{t-1} + \nu_t$. A Toeplitz form....

If the variance is stationary, we can rewrite $\sigma^2_e = \frac{\sigma^2_\nu}{1-\rho^2}$.

A comment on characteristic roots....

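A simulation sketch of that stationary-variance identity (the AR parameter and innovation scale are arbitrary choices):

set.seed(1)
rho <- 0.7
e <- as.numeric(arima.sim(list(ar = rho), n = 1e5, sd = 1))  # sigma2_nu = 1
var(e)            # simulated variance of e
1 / (1 - rho^2)   # theoretical sigma2_nu / (1 - rho^2), about 1.96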

Cochrane-Orcutt

We have the two key elements to implement this, except that we do not know $\rho$; we will have to estimate it, and estimates carry uncertainty. It is important to note that this imposes exactly an AR(1): if the process is incorrectly specified, the optimal properties do not follow. Indeed, the optimal properties also depend on an additional important feature.
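A minimal sketch of a single Cochrane-Orcutt step on simulated data (names and values are illustrative; the orcutt package automates the iteration):

set.seed(42)
T <- 200
x <- rnorm(T)
y <- 1 + 2 * x + as.numeric(arima.sim(list(ar = 0.6), n = T))  # AR(1) errors
ols <- lm(y ~ x)
r <- residuals(ols)
rho_hat <- sum(r[-1] * r[-T]) / sum(r[-T]^2)   # estimate rho from OLS residuals
y_star <- y[-1] - rho_hat * y[-T]              # quasi-difference the data
x_star <- x[-1] - rho_hat * x[-T]
summary(lm(y_star ~ x_star))                   # OLS on transformed data: one C-O step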


What does the feasible do?

We need estimates to replace unknown covariance structures, and coverage will depend on the properties of the estimators of those covariances.

  • Consistent estimators will work, but there is, euphemistically, considerable variation in the class of consistent estimators.

  • Contrasting the Beck and Katz/White approach with the GLS approach is a valid difference in philosophies.1 One takes advantage of OLS and Basu's Theorem; one goes full Aitken. A sketch contrasting the two appears below.

1 We will return to this when we look at Hausman because this is the essential issue.
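A sketch of the two philosophies on simulated AR(1)-error data, assuming the sandwich, lmtest, and nlme packages are installed; the data and names are illustrative:

library(sandwich); library(lmtest); library(nlme)
set.seed(2)
T <- 200
x <- rnorm(T)
y <- 1 + 2 * x + as.numeric(arima.sim(list(ar = 0.6), n = T))
ols <- lm(y ~ x)
coeftest(ols, vcov = NeweyWest(ols))          # keep OLS, fix the standard errors
summary(gls(y ~ x, correlation = corAR1()))   # go full Aitken: model the error process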



Incremental Models

$y_t = a_1 y_{t-1} + \epsilon_t$

is the simplest dynamic model, but it cannot be estimated consistently, in general terms, in the presence of serial correlation. Why? The key condition for unbiasedness is violated because $E(y_{t-1}\epsilon_t) \neq 0$; OLS will not generally work. A simulation sketch follows.
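A simulation sketch (parameter values arbitrary): with $a_1 = 0.5$ and error autocorrelation $\rho = 0.5$, the OLS probability limit is $(a_1 + \rho)/(1 + a_1\rho) = 0.8$.

set.seed(123)
T <- 5000
e <- as.numeric(arima.sim(list(ar = 0.5), n = T))             # AR(1) errors
y <- as.numeric(stats::filter(e, 0.5, method = "recursive"))  # y_t = 0.5*y_{t-1} + e_t
coef(lm(y[-1] ~ y[-T]))   # slope near 0.8, not the true 0.5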

A note on dynamic interpretation.


Incremental Models with Covariates

$y_t = a_1 y_{t-1} + \beta X_t + \epsilon_t$

The problem is one of fitting, and the key criterion is white-noise residuals post-estimation. But we have to assume a structure and implement it; a sketch of the residual check follows.
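A sketch of that post-estimation check on illustrative data, using the base-R Ljung-Box test:

set.seed(7)
T <- 200
x <- rnorm(T)
y <- as.numeric(arima.sim(list(ar = 0.5), n = T)) + 0.8 * x
d <- data.frame(y = y[-1], y_lag = y[-T], x = x[-1])
ldv <- lm(y ~ y_lag + x, data = d)
Box.test(residuals(ldv), lag = 12, type = "Ljung-Box")  # H0: white-noise residuals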



Distributed Lag Models

$y_t = \alpha + \beta_0 X_t + \beta_1 X_{t-1} + \cdots + \epsilon_t$

The impact of $x$ occurs over multiple periods. Specifying the lag length relies on theory, or perhaps on analysis using information criteria/F-tests [owing to quasi-nesting and missing data]. OLS is a fine solution to this problem, but the search space of models is often large.

In response to this problem, we have structured distributed lag models; there are many such schemes.

  • Koyck/Geometric decay: short-run and long-run effects are parametrically identified: $y_t = \alpha + \beta(1-\lambda)\sum_{j=0}^{\infty}\lambda^j X_{t-j} + \epsilon_t$ (a transformation making this estimable is sketched below).
  • Almon (more arbitrary decay): $y_t = \sum_{t_A=0}^{T_F}\rho_{t_A}x_{t-t_A} + \epsilon_t$ with coefficients that are ordinates of some general polynomial of degree $q$, $T_F \gg q$. The $\rho_{t_A} = \sum_{k=0}^{q}\gamma_k t_A^k$.
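One step worth making explicit (standard algebra, not on the slide): lag the Koyck equation, multiply by $\lambda$, and subtract. The infinite sum telescopes, leaving an estimable autoregressive form with an MA(1) error:

$y_t = \alpha(1-\lambda) + \lambda y_{t-1} + \beta(1-\lambda)X_t + (\epsilon_t - \lambda\epsilon_{t-1})$

The short-run effect of $X$ is $\beta(1-\lambda)$ and the long-run effect is $\beta$; the price is a moving-average error correlated with $y_{t-1}$, which is why OLS on this form is not innocent.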

Autoregressive Distributed Lag Models

$y_t = \alpha + \gamma_1 y_{t-1} + \beta_0 X_t + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \cdots + \epsilon_t$

  • OLS is often used when the errors are iid; assuming $\epsilon_t$ is unrelated to $y_{t-1}$ is common, if nonsensical. (A fitting sketch follows this list.)
  • If the errors are not iid, GLS is needed.
  • The authors argue that the lagged dependent variable often yields white-noise residuals for free. As they also note, there is a De Boef and Keele paper showing the relationship between these models and a form of error-correction models. More on that tomorrow.
  • There is substance to the timing of impacts.
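As referenced in the first bullet, a fitting sketch: an ADL with one lag of $y$ and two lags of $x$, fit by OLS via the dynlm package (assumed installed; data simulated for illustration):

library(dynlm)
set.seed(3)
x <- ts(rnorm(200))
y <- ts(as.numeric(arima.sim(list(ar = 0.5), n = 200)) + 0.8 * as.numeric(x))
adl <- dynlm(y ~ L(y, 1) + L(x, 0:2))  # lagged y, plus current x and two lags
summary(adl)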

Structural vs. Non-structural

Data analysis can quite readily yield model comparisons among competing dynamic structures. The key issue is that the analyst must divine the process: what is the relevant error process, and what are the structure and timing of effects, alongside the potential question of incremental adjustment? We need good theory for that.

Given such theory, we can take an equations-as-analysis approach: measure the variables, derive reduced forms, and then recover parameter estimates by deploying simultaneous-equations methods. Very large such systems were a core part of early empirical macroeconomics. The failures of such systems led to the proposal of alternatives.

Chris Sims suggested a more flexible approach: the VAR.



VAR: Vector AutoRegression

  • Choose a relevant set of lag lengths and write each variable in the system as a function of lags of itself and the other variables up to the chosen lengths.1 Implementations exist for Stata and for R.

  • The key insight is that this VAR is the reduced form of some more complicated, as yet unspecified, structural form.

  • The goal is to specify how variables relate to one another and to use the data to discover Granger causality and the responses to impulses injected into the system.

1 A nice blog post with an extended example in R can be found on towardsdatascience. Kit Baum has a similar worked example in slides.

A very simple example

library(forecast)   # as in the original slides; mdeaths/fdeaths ship with base R's datasets
library(tsibble)    # as_tsibble()
library(feasts)     # autoplot() methods for tsibbles
library(dplyr)      # the pipe
library(hrbrthemes) # theme_ipsum_rc()
mdeaths             # monthly male lung-disease deaths in the UK
fdeaths             # the female counterpart
save(mdeaths, fdeaths, file = "./img/LungDeaths.RData")
load(url("https://github.com/robertwwalker/Essex-Data/raw/main/LungDeaths.RData"))
Males <- mdeaths; Females <- fdeaths
Lung.Deaths <- cbind(Males, Females) %>% as_tsibble()
Lung.Deaths %>% autoplot() + theme_ipsum_rc()

[Figure: autoplot of the Males and Females lung-deaths series from the previous slide.]
library(fable)   # VAR(), model(), and report() come from fable/fabletools
lung_deaths <- cbind(mdeaths, fdeaths) %>%
  as_tsibble(pivot_longer = FALSE)   # keep the two series in separate columns
fit <- lung_deaths %>%
  model(VAR(vars(mdeaths, fdeaths) ~ AR(3)))   # VAR with three lags of each series
report(fit)
Series: mdeaths, fdeaths
Model: VAR(3) w/ mean
Coefficients for mdeaths:
lag(mdeaths,1) lag(fdeaths,1) lag(mdeaths,2) lag(fdeaths,2)
0.6675 0.8074 0.3677 -1.4540
s.e. 0.3550 0.8347 0.3525 0.8088
lag(mdeaths,3) lag(fdeaths,3) constant
0.2606 -1.1214 538.7817
s.e. 0.3424 0.8143 137.1047
Coefficients for fdeaths:
lag(mdeaths,1) lag(fdeaths,1) lag(mdeaths,2) lag(fdeaths,2)
0.2138 0.4563 0.0937 -0.3984
s.e. 0.1460 0.3434 0.1450 0.3328
lag(mdeaths,3) lag(fdeaths,3) constant
0.0250 -0.315 202.0027
s.e. 0.1409 0.335 56.4065
Residual covariance matrix:
mdeaths fdeaths
mdeaths 58985.95 22747.94
fdeaths 22747.94 9983.95
log likelihood = -812.35
AIC = 1660.69 AICc = 1674.37 BIC = 1700.9
fit2 <- lung_deaths %>%
  model(VAR(vars(mdeaths, fdeaths) ~ AR(2)))   # the smaller VAR(2) for comparison
report(fit2)
Series: mdeaths, fdeaths
Model: VAR(2) w/ mean
Coefficients for mdeaths:
lag(mdeaths,1) lag(fdeaths,1) lag(mdeaths,2) lag(fdeaths,2) constant
0.9610 0.3340 0.1149 -1.3379 443.8492
s.e. 0.3409 0.8252 0.3410 0.7922 124.4608
Coefficients for fdeaths:
lag(mdeaths,1) lag(fdeaths,1) lag(mdeaths,2) lag(fdeaths,2) constant
0.3391 0.2617 -0.0601 -0.2691 145.0546
s.e. 0.1450 0.3510 0.1450 0.3369 52.9324
Residual covariance matrix:
mdeaths fdeaths
mdeaths 62599.51 24942.79
fdeaths 24942.79 11322.70
log likelihood = -833.17
AIC = 1694.35 AICc = 1701.98 BIC = 1725.83
25
fit %>%
  fabletools::forecast(h = 12) %>%   # forecast twelve months ahead
  autoplot(lung_deaths)


Female

lung_deaths %>%
  model(VAR(vars(mdeaths, fdeaths) ~ AR(3))) %>%
  residuals() %>%
  pivot_longer(., cols = c(mdeaths, fdeaths)) %>%   # tidyr::pivot_longer
  filter(name == "fdeaths") %>%
  as_tsibble(index = index) %>%
  gg_tsdisplay(plot_type = "partial") +   # feasts: series, ACF, and PACF
  labs(title = "Female residuals")


Male

lung_deaths %>%
  model(VAR(vars(mdeaths, fdeaths) ~ AR(3))) %>%
  residuals() %>%
  pivot_longer(., cols = c(mdeaths, fdeaths)) %>%   # tidyr::pivot_longer
  filter(name == "mdeaths") %>%
  as_tsibble(index = index) %>%
  gg_tsdisplay(plot_type = "partial") +   # feasts: series, ACF, and PACF
  labs(title = "Male residuals")



Easy Impulse Response

What happens if I shock one of the series; how does it work through the system?

The idea behind an impulse-response is core to counterfactual analysis with time series. What does our future world look like and what predictions arise from it and the model we have deployed?

Whether for VARs, dynamic linear models, or ADL models, impulse responses are key to interpreting a model in the real world.


Males

VARMF <- cbind(Males, Females)
mod1 <- vars::VAR(VARMF, p = 3, type = "const")   # VAR(3) with a constant, via the vars package
plot(vars::irf(mod1, boot = TRUE, impulse = "Males"))   # bootstrapped responses to a Males shock


Females

plot(vars::irf(mod1, boot = TRUE, impulse = "Females"))   # bootstrapped responses to a Females shock

