
Day 6: Panel Models

Fixed and Random and Some of Both

Robert W. Walker

August 15, 2022

1

Day 6: Models for Heterogeneity

  • With models, we need a model for comparison.
  • Some Useful Notation
  • Fixed and Random Effects
  • Comparing FE and RE with Hausman
  • The Multilevel generalization (Bell and Jones and others)
  • Backfilling details
2

The Dimensions of TSCS Summary

  • Presence of a time dimension gives us a natural ordering.
  • Space is not irrelevant under the same circumstances as time -- nominal indices are irrelevant on some level. Defining space is hard. Ex. targeting of Foreign Direct Investment and defining proximity.
  • ANOVA is informative in this two-dimensional setting.
  • A part of any good data analysis is summary and characterization. The same is true here; let's look at some examples of summary in panel data settings.
3

Basic xt commands

In Stata's language, xt is the way that one naturally refers to CSTS/TSCS data. Consider $NT$ observations on some random variable $y_{it}$ where $i \in N$ and $t \in T$. The TSCS/CSTS commands almost always have this prefix.

  • xtset: Declaring xt data
  • xtdes: Describing xt data structure
  • xtsum: Summarizing xt data
  • xttab: Summarizing categorical xt data.
  • xttrans: Transition matrix for xt data.
  • xtline: Line graphs for xt data.
4

A Primitive Question

Given two-dimensional data, how should we break it down? The most common method is unit-averages; we break each unit's time series on each element into deviations from their own mean. This is called the within transform. The between portion represents deviations between the unit's mean and the overall mean. Stationarity considerations are generically implicit.
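
As a rough illustration of the within and between pieces described above, the following sketch splits a small panel into deviations from unit means and deviations of unit means from the overall mean. The panel values and unit labels are hypothetical, chosen only so the arithmetic is easy to follow.

```python
# Hypothetical balanced panel: 3 units observed for 4 periods each.
panel = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [4.0, 6.0, 8.0, 10.0],
    "C": [0.0, 1.0, 0.0, 3.0],
}


def mean(values):
    return sum(values) / len(values)


# Overall mean across every unit-period observation.
grand_mean = mean([x for series in panel.values() for x in series])

# Within transform: each observation minus its own unit mean.
within = {unit: [x - mean(series) for x in series] for unit, series in panel.items()}

# Between portion: each unit mean minus the overall mean.
between = {unit: mean(series) - grand_mean for unit, series in panel.items()}

print(within["A"])
print(between)
```

By construction the within deviations sum to zero inside every unit, so all level differences across units live in the between portion.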

5

Some Useful Variances and Notation

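  • W(ithin) for unit $i$ [NB: Thus the total within variance would be a summary over all $i \in N$]: $W_i = \sum_{t=1}^{T}(x_{it} - \bar{x}_i)^2$
  • B(etween): $B_T = \sum_{i=1}^{N}(\bar{x}_i - \bar{x})^2$
  • T(otal): $T = \sum_{i=1}^{N}\sum_{t=1}^{T}(x_{it} - \bar{x})^2$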
6

Some Useful Notation

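  • The Kronecker Product $\otimes$: This is a simple way of condensing the notation for sets of matrices. It is important to note that conformability is not required. So, for general matrices $A_{k \times l}$ and $B_{m \times n}$, we can write (shown here for a $3 \times 3$ matrix $A$) $$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & a_{13}B \\ a_{21}B & a_{22}B & a_{23}B \\ a_{31}B & a_{32}B & a_{33}B \end{pmatrix}$$ with a result $C$ of dimension $(km) \times (ln)$.
  • The inverse of a Kronecker product is well defined [under invertibility conditions]: $[A \otimes B]^{-1} = [A^{-1} \otimes B^{-1}]$
  • As are products of Kronecker products: $(A \otimes B)(C \otimes D) = AC \otimes BD$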
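
A minimal sketch of the Kronecker product and the product rule stated above, using plain Python lists. The helper names `kron` and `matmul` and the example matrices are assumptions made for illustration; they are not part of the slides.

```python
def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def matmul(a, b):
    """Ordinary matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


A = [[1, 2], [3, 4]]      # 2 x 2
B = [[0, 1, 2]]           # 1 x 3, so A and B need not be conformable
C = [[0, 1], [1, 0]]      # 2 x 2
D = [[1], [1], [1]]       # 3 x 1

AB = kron(A, B)           # dimension (2*1) x (2*3)
lhs = matmul(kron(A, B), kron(C, D))
rhs = kron(matmul(A, C), matmul(B, D))
print(AB)
print(lhs == rhs)
```

Note that the product rule does require `A` with `C` and `B` with `D` to be conformable for ordinary multiplication, even though the Kronecker product itself does not.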
7

Why is the Notation Useful?

Let A be a variance/covariance matrix across panels and B be the same matrix for a given panel. This is a fairly general way to conceive of a panel data problem.

  • Heteroscedasticity?
  • Temporal Autocorrelation?
  • Spatial Autocorrelation?
8

Heteroscedasticity

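The homoscedastic case is described by $\sigma^2 I$.

The [unit] heteroscedastic case is described, generally, by $\sigma_i^2 I_N \otimes I_T$:

$$\begin{pmatrix} \sigma_1^2 I_T & 0 & \ldots & 0 \\ 0 & \sigma_2^2 I_T & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \sigma_N^2 I_T \end{pmatrix}$$

The ultimate result will be of dimension $NT \times NT$. The first $T$ diagonal entries will be $\sigma_1^2$, entries $T+1$ to $2T$ will be $\sigma_2^2$, and entries $(N-1)T+1$ to $NT$ will be $\sigma_N^2$. If we believed that the heteroscedasticity arose from time points rather than units, replace $N$ with $T$ and vice versa; $i$ becomes $t$.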
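
The block-diagonal structure can be built directly as a Kronecker product of a diagonal matrix of unit variances with $I_T$. The sketch below assumes two units, three periods, and hypothetical variances of 1 and 4.

```python
def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


N, T = 2, 3
unit_variances = [1.0, 4.0]   # hypothetical sigma_i^2, one per unit

# N x N diagonal matrix holding the unit variances.
sigma_diag = [[unit_variances[i] if i == j else 0.0 for j in range(N)] for i in range(N)]
identity_T = [[1.0 if s == t else 0.0 for t in range(T)] for s in range(T)]

# NT x NT error covariance: unit i's variance repeated T times on the diagonal.
omega = kron(sigma_diag, identity_T)
diagonal = [omega[r][r] for r in range(N * T)]
print(diagonal)
```

Swapping the roles of the two factors (an identity of size N and a diagonal matrix of period variances) gives the time-driven version mentioned in the text.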

9

The Manageable Autocorrelation Structure

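$$\Phi = \sigma^2\Psi = \sigma_e^2 \begin{pmatrix} 1 & \rho^1 & \rho^2 & \ldots & \rho^{T-1} \\ \rho^1 & 1 & \rho^1 & \ldots & \rho^{T-2} \\ \rho^2 & \rho^1 & 1 & \ldots & \rho^{T-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \rho^{T-3} & \ldots & 1 \end{pmatrix}$$

given that $e_t = \rho e_{t-1} + \nu_t$. A Toeplitz form....

This allows us to calculate the variance of $e$ using results from basic statistics, i.e. $Var(e_t) = \rho^2 Var(e_{t-1}) + Var(\nu)$. If the variance is stationary, we can rewrite,

$$\sigma_e^2 = \frac{\sigma_\nu^2}{1 - \rho^2}$$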
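
A small sketch of the AR(1) structure: it builds the Toeplitz correlation matrix and the stationary variance from the recursion above. The values of $\rho$ and $\sigma_\nu^2$ are hypothetical and the function names are illustrative.

```python
def ar1_correlation(rho, T):
    """T x T Toeplitz matrix whose (s, t) entry is rho ** |s - t|."""
    return [[rho ** abs(s - t) for t in range(T)] for s in range(T)]


def stationary_variance(sigma2_nu, rho):
    """Variance of e_t = rho * e_{t-1} + nu_t when the process is stationary."""
    if abs(rho) >= 1:
        raise ValueError("a stationary variance requires |rho| < 1")
    return sigma2_nu / (1 - rho ** 2)


rho, sigma2_nu = 0.5, 3.0          # hypothetical values
psi = ar1_correlation(rho, 4)
sigma2_e = stationary_variance(sigma2_nu, rho)

# Full covariance matrix Phi = sigma_e^2 * Psi.
phi = [[sigma2_e * value for value in row] for row in psi]
print(sigma2_e)
print(psi[0])
```

The stationary variance is a fixed point of the recursion: plugging it back into $\rho^2 Var(e_{t-1}) + Var(\nu)$ returns the same number.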

10

Autocorrelation

When discussing heteroscedasticity, we notice that the off-diagonal elements are all zeroes. This is the assumption of no correlation among [somehow] adjacent elements. The somehow takes two forms: (1) spatial and (2) temporal. Just as before where time-induced heteroscedasticity simply involved interchanging N and T and i and t; the same idea prevails here.

11

Aitken's Theorem?

In a now-classic paper, Aitken generalized the Gauss-Markov theorem to the class of Generalized Least Squares estimators. It is important to note that these are GLS and not FGLS estimators. What is the difference? The two GLS estimators considered by Stimson are not strictly speaking GLS.

Definition: $\hat{\beta}_{GLS} = (X^{\prime}\Omega^{-1}X)^{-1}X^{\prime}\Omega^{-1}y$

Properties:

  1. GLS is unbiased.
  2. Consistent.
  3. Asymptotically normal.
  4. MV(L)UE
12

What does the feasible do?

We need to estimate things to replace unknown covariance structures and coverage will depend on properties of the estimators of these covariances. Consistent estimators will work but there is euphemistically considerable variation in the class of consistent estimators. Contrasting the Beck and Katz/White approach with the GLS approach is a valid difference in philosophies. NB: We will return to this when we look at Hausman because this is the essential issue.

13

The Beck and Katz solution

Beck and Katz take a different tack to the general data types in common use (long T). The basic idea is to generate estimates using OLS because GLS can be quite bad. What do we need to be able to do this?

  • Locate a specification to purge serial correlation (in t).
  • [p. 638] Construct the panel corrected standard error. Construct $\Sigma$ ($N \times N$) using $\hat{\Sigma}_{ij} = \frac{\sum_{t=1}^{T} e_{it}e_{jt}}{T}$. Estimate the cross-sectional correlation matrix. Take the Kronecker product of this with $I_T$, remembering how we got $I$.
  • Inference with OLS and PCSE in the spirit of White, really Huber (1967) but the key is separable moments. Brief diversion here about separability; it turns out the result yesterday is what gives rise to the appropriate intuition.
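
The contemporaneous covariance step above can be sketched in a few lines. The residuals are hypothetical stand-ins for OLS residuals from a two-unit, four-period panel; this shows only the $\hat{\Sigma}$ and $\hat{\Sigma} \otimes I_T$ construction, not the full panel corrected standard error.

```python
# Hypothetical OLS residuals e[i][t] for N = 2 units and T = 4 periods.
residuals = [
    [1.0, -1.0, 2.0, -2.0],
    [0.5, -0.5, -1.0, 1.0],
]
N, T = len(residuals), len(residuals[0])

# Sigma-hat: average cross-product of residuals for every pair of units.
sigma_hat = [
    [sum(e_i * e_j for e_i, e_j in zip(residuals[i], residuals[j])) / T for j in range(N)]
    for i in range(N)
]

# Sigma-hat Kronecker I_T: unit pair (i, j) is correlated only within the same period.
omega = [
    [sigma_hat[i][j] if t == s else 0.0 for j in range(N) for s in range(T)]
    for i in range(N)
    for t in range(T)
]
print(sigma_hat)
```

Rows and columns of `omega` are ordered unit by unit, with periods nested inside units, so the $(i, j)$ block is $\hat{\Sigma}_{ij} I_T$.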
14

Thinking about robust and cluster

Every Stata user is familiar with this, it seems. Though not developed by Stata (but Hardin, a student of Huber), the two are synonymous. What would these look like in an application?

  • just robust is unstructured heteroscedastic
  • cluster utilizes the multidimensional axes
15

xtgls and xtpcse

Two significant options of note

  1. panels(iid,heteroscedastic,correlated)
  2. correlation(ar1,psar1,independent)
16

panels

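  • iid $\epsilon\epsilon^{\prime} = \sigma^2 I_{N \times N}$ gives us homoscedasticity and no spatial correlation; $\sigma^2$ is scalar.
  • heteroscedastic $\epsilon\epsilon^{\prime} = \sigma_i^2 I_{N \times N}$ gives us heteroscedasticity and no spatial correlation; $\sigma_i^2$ is an $N$-vector.
  • correlated $$\epsilon\epsilon^{\prime} = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} & \ldots & \sigma_{1N} \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} & \ldots & \sigma_{2N} \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 & \ldots & \sigma_{3N} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma_{N1} & \sigma_{N2} & \sigma_{N3} & \ldots & \sigma_N^2 \end{pmatrix}$$ gives us heteroscedastic and (contemporaneously) spatially correlated errors.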
17

correlation

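  • independent gives us no autoregression: $\epsilon\epsilon^{\prime} = I_{T \times T}$.
  • ar1 gives us a global autoregressive parameter for the errors. In simple terms, all cross-sections share the same level of serial correlation. $$\epsilon\epsilon^{\prime} = \begin{pmatrix} 1 & \rho & \rho^2 & \ldots & \rho^{T-1} \\ \rho & 1 & \rho & \ldots & \rho^{T-2} \\ \rho^2 & \rho & 1 & \ldots & \rho^{T-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \rho^{T-3} & \ldots & 1 \end{pmatrix}$$
  • psar1 gives us an autoregressive parameter for the errors that is unique to each cross-section. Each cross-section has a distinct level of serial correlation. $$\epsilon\epsilon^{\prime} = \begin{pmatrix} 1 & \rho_i & \rho_i^2 & \ldots & \rho_i^{T-1} \\ \rho_i & 1 & \rho_i & \ldots & \rho_i^{T-2} \\ \rho_i^2 & \rho_i & 1 & \ldots & \rho_i^{T-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho_i^{T-1} & \rho_i^{T-2} & \rho_i^{T-3} & \ldots & 1 \end{pmatrix}$$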
18

Unit Heterogeneity

Most discussions of panel data estimators draw on a fixed versus random effects distinction. The subtle distinction is important but perhaps overstated.

19

Definitions

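Let's construct a general model: $y_{it} = \alpha_{it} + X_{it}\beta_{it} + \epsilon_{it}$

  • Pooled Model: $y_{it} = \alpha + X_{it}\beta + \epsilon_{it}$
  • Year Dummies Model: $y_{it} = \alpha_t + X_{it}\beta + \epsilon_{it}$
  • (Two-way) LSDV: $y_{it} = \alpha_i + \alpha_t + X_{it}\beta + \epsilon_{it}$
  • Unit Dummies Model: $y_{it} = \alpha_i + X_{it}\beta + \epsilon_{it}$
  1. Fixed effects: $y_{it} - \bar{y}_i = \Delta_i X_{it}\beta + \Delta_i\epsilon_{it}$
  2. Random effects, $\alpha_i \perp X_{it}$:

$\alpha_i \sim [\alpha, \sigma^2_\alpha]$

$\epsilon_{it} \sim [0, \sigma^2_\epsilon]$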

20

Why does heterogeneity matter?

  • If $\alpha \neq \alpha_i \; \forall i$, then serial correlation is induced in the errors. At a minimum, this implies incorrect standard errors for inference and inefficiency.
  • If $E[X_{it}\alpha_i] \neq 0$, then $(\alpha_i - \alpha)$ is an omitted variable with a consequent bias induced. We can draw a picture of this.

A brief simulation.
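
One way to see the omitted variable problem is a tiny constructed example rather than a full Monte Carlo. In the sketch below (hypothetical data, no random noise) the true slope is 1, but the unit with the larger intercept also has larger values of x, so the pooled slope absorbs the intercept difference while the within slope does not.

```python
# Hypothetical data: y = alpha_i + 1.0 * x, with alpha_A = 0 and alpha_B = 10.
x = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]}
alpha = {"A": 0.0, "B": 10.0}
y = {unit: [alpha[unit] + 1.0 * value for value in x[unit]] for unit in x}


def mean(values):
    return sum(values) / len(values)


def slope(xs, ys):
    """Bivariate OLS slope of ys on xs (with an intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    s_xx = sum((a - x_bar) ** 2 for a in xs)
    return s_xy / s_xx


# Pooled estimator: ignore the units entirely.
pooled_slope = slope([v for u in x for v in x[u]], [v for u in x for v in y[u]])

# Within estimator: demean x and y by unit before pooling.
x_within = [v - mean(x[u]) for u in x for v in x[u]]
y_within = [v - mean(y[u]) for u in x for v in y[u]]
within_slope = slope(x_within, y_within)

print(pooled_slope, within_slope)
```

Because the intercepts are positively related to the unit means of x here, the pooled slope is pushed well above the true value of 1; reversing that relationship would push it below.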

21

Some ANCOVA

  • Pooled Slope and Intercepts
  • Pooled Intercepts
  • Pooled Slopes
22

Constructing Estimators

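  • Pooled Estimator $\hat{\beta}_T = T_{xx}^{-1}T_{xy} = (X^{\prime}X)^{-1}X^{\prime}y$
  • Within Estimator $\hat{\beta}_W = W_{xx}^{-1}W_{xy}$
  • Between Estimator $\hat{\beta}_B = B_{\bar{x}\bar{x}}^{-1}B_{\bar{x}\bar{y}}$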
23

A Variation Identity

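  • $T = W_{xx} + W_{\bar{x}\bar{x}}$
  • In different notation, $T = W + B$ or $S^t = S^w + S^b$.

$$\sum_{i=1}^{N}\sum_{t=1}^{T}(x_{it} - \bar{x})^2 = \sum_{i=1}^{N}\sum_{t=1}^{T}x_{it}^2 - NT\bar{x}^2$$

$$= \sum_{i=1}^{N}\sum_{t=1}^{T}x_{it}^2 - \sum_{i=1}^{N}T_i\bar{x}_i^2 + \sum_{i=1}^{N}T_i\bar{x}_i^2 - NT\bar{x}^2$$

$$= \underbrace{\sum_{i=1}^{N}\sum_{t=1}^{T}(x_{it} - \bar{x}_i)^2}_{W_i} + \underbrace{\sum_{i=1}^{N}T_i(\bar{x}_i - \bar{x})^2}_{B_T}$$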

24

Back to ANCOVA

  1. RSS from $W_i$ with DF = $NT - NK - N$
  2. RSS from $W$ with DF = $NT - N - K$
  3. RSS from $T$ with DF = $NT - K - 1$

For total pooling, we can F-test $\frac{3-1}{1}$, where each number stands for the RSS of the corresponding model above, scaled by the appropriate degrees of freedom. These are the most and least restricted models. If we cannot reject this, pooling is (perhaps) justified. Now let's construct some others. Suppose we reject total pooling. Is it intercepts, slopes, or both? Imposing a slope restriction gives us 2; the F we want is $\frac{2-1}{1}$. What do we get from $\frac{3-2}{2}$? NB: It's conditional. We can also do this with time. This is a good starting point, but it is not as clean as we might like.
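
Written out, the three comparisons take the usual analysis-of-covariance form. This is a sketch under the assumption that $K$ counts slope coefficients excluding the intercept and that the panel is balanced; $RSS_1$, $RSS_2$, and $RSS_3$ refer to the three numbered models, and the numerator degrees of freedom are the differences between the DF listed for them.

```latex
\[
F_{3 \text{ vs } 1} =
  \frac{(RSS_3 - RSS_1) / \left[(N-1)(K+1)\right]}
       {RSS_1 / (NT - NK - N)}
\]
\[
F_{2 \text{ vs } 1} =
  \frac{(RSS_2 - RSS_1) / \left[(N-1)K\right]}
       {RSS_1 / (NT - NK - N)}
\]
\[
F_{3 \text{ vs } 2} =
  \frac{(RSS_3 - RSS_2) / (N-1)}
       {RSS_2 / (NT - N - K)}
\]
```

The last statistic tests common intercepts conditional on common slopes, which is why its interpretation depends on the slope restriction being acceptable.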

25

OLS as Weighted Average

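$$\hat{\beta}_{OLS} = [S^t_{xx}]^{-1}S^t_{xy}$$
$$\hat{\beta}_{OLS} = [S^w_{xx} + S^b_{xx}]^{-1}(S^w_{xy} + S^b_{xy})$$
$$\hat{\beta}_{OLS} = [S^w_{xx} + S^b_{xx}]^{-1}S^w_{xy} + [S^w_{xx} + S^b_{xx}]^{-1}S^b_{xy}$$

Let $F^w = [S^w_{xx} + S^b_{xx}]^{-1}S^w_{xx}$ and $F^b = I - F^w = [S^w_{xx} + S^b_{xx}]^{-1}S^b_{xx}$.

My claim is that $\hat{\beta}_{OLS} = F^w\beta_w + F^b\beta_b$.

$$\hat{\beta}_{OLS} = [S^w_{xx} + S^b_{xx}]^{-1}\underbrace{S^w_{xx}[S^w_{xx}]^{-1}}_{I}S^w_{xy} + [S^w_{xx} + S^b_{xx}]^{-1}\underbrace{S^b_{xx}[S^b_{xx}]^{-1}}_{I}S^b_{xy}$$
$$\hat{\beta}_{OLS} = [S^w_{xx} + S^b_{xx}]^{-1}S^w_{xy} + [S^w_{xx} + S^b_{xx}]^{-1}S^b_{xy}$$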
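
The claim can be checked numerically for a single regressor, where every matrix reduces to a scalar. The data below are hypothetical; the sketch computes within and between sums of squares and cross-products and confirms that the pooled slope is the weighted average of the within and between slopes.

```python
# Hypothetical two-unit, three-period panel with one regressor.
x = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]}
y = {"A": [1.0, 3.0, 2.0], "B": [14.0, 15.0, 19.0]}


def mean(values):
    return sum(values) / len(values)


def cross(u, v, u_bar, v_bar):
    """Sum of cross-products of deviations from the supplied means."""
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))


all_x = [v for unit in x for v in x[unit]]
all_y = [v for unit in x for v in y[unit]]
x_bar, y_bar = mean(all_x), mean(all_y)

# Within pieces: deviations from unit means, summed over units.
s_w_xx = sum(cross(x[u], x[u], mean(x[u]), mean(x[u])) for u in x)
s_w_xy = sum(cross(x[u], y[u], mean(x[u]), mean(y[u])) for u in x)

# Between pieces: unit means around the overall mean, weighted by T_i.
s_b_xx = sum(len(x[u]) * (mean(x[u]) - x_bar) ** 2 for u in x)
s_b_xy = sum(len(x[u]) * (mean(x[u]) - x_bar) * (mean(y[u]) - y_bar) for u in x)

beta_ols = cross(all_x, all_y, x_bar, y_bar) / cross(all_x, all_x, x_bar, x_bar)
beta_w = s_w_xy / s_w_xx
beta_b = s_b_xy / s_b_xx
f_w = s_w_xx / (s_w_xx + s_b_xx)
f_b = 1 - f_w

print(beta_ols, f_w * beta_w + f_b * beta_b)
```

In this example most of the variation in x is between units, so the pooled slope sits much closer to the between slope than to the within slope.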

26

A Random Effects Estimator

  • Assume that the unit means have some distribution rather than being some fixed constant.
  • This allows (under normality) us to partition the global error into components.
  • The method is the same, the difference is the weighting by a covariance matrix with a known structure.
  • As we noted, there is a simple problem with the application of the OLS estimator if the error is correlated with the regressors.
  • How might we think about remedying this?
27

Comparing Fixed and Random Effects

  • The Hausman test: smart and broadly applicable idea. Wish it worked better... See V. E. Troeger.
  • Mundlak's argument merits consideration.
  • Pluemper and Troeger's idea is clever.
28

Hausman's Idea

The basic idea is that the fixed effects estimator is consistent but potentially inefficient. The random effects estimator is only consistent under the null. We can leverage this to form a test in the Hausman family using the result proved in the paper. This is implemented in Stata using model storage capabilities.

  • Estimate a consistent model
  • Store the result as XXX.
  • Estimate an efficient model
  • Store the result as YYY.
  • hausman XXX YYY
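
For intuition about what the test computes, here is a sketch of the statistic for a single coefficient, where the usual quadratic form reduces to a squared difference over a difference in variances. All numbers are hypothetical and the function name is illustrative; this is not a substitute for the stored-estimates workflow listed above.

```python
def hausman_scalar(b_consistent, var_consistent, b_efficient, var_efficient):
    """One-coefficient Hausman statistic: (b_c - b_e)^2 / (V_c - V_e)."""
    variance_gap = var_consistent - var_efficient
    if variance_gap <= 0:
        # Under the null the efficient estimator has the smaller variance;
        # a non-positive gap means the statistic is not well defined here.
        raise ValueError("variance difference is not positive")
    return (b_consistent - b_efficient) ** 2 / variance_gap


# Hypothetical fixed effects (consistent) and random effects (efficient) results.
b_fe, se_fe = 0.30, 0.10
b_re, se_re = 0.20, 0.08

statistic = hausman_scalar(b_fe, se_fe ** 2, b_re, se_re ** 2)
critical_value_5pct = 3.841   # chi-squared with 1 degree of freedom
print(statistic, statistic > critical_value_5pct)
```

With these hypothetical inputs the statistic falls short of the 5 percent critical value, so the random effects restriction would not be rejected. In finite samples the variance gap can turn out non-positive, which is one reason the test can behave poorly.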
29

Mundlak

The basic idea behind Mundlak's paper is that the fixed versus random effects debate is ill conceived. Moreover, there is a right model. Why and how?

  • Conditional versus unconditional inference.
  • FE problem is inefficiency.
  • RE problem can be bias.
  • Maybe we want an MSE criterion?
  • As usual, N and T matter in size. Plug-in estimators in general.
30

Bell, Fairbrother, and Jones

Estimate a variant of the Mundlak model that accommodates all the concerns.

$$y_{it} = \beta_0 + \beta_{1W}(x_{it} - \bar{x}_i) + \beta_{2B}\bar{x}_i + \beta_3 z_i + (\nu_i + \epsilon_{it})$$

31

First-Differences

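Define $\Delta$ to be a difference operator so that we can define $\Delta X = X_{it} - X_{i,t-1}$

$\Delta y = y_{it} - y_{i,t-1}$

Observation: $N(T-1)$ observations if $T_i \geq 2 \; \forall i$. Equality case is interesting. The first-difference estimator is then: $\Delta y = \beta(\Delta X) + \Delta\epsilon_{it}$

And an OLS estimator would simply look like: $\hat{\beta} = (\Delta X^{\prime}\Delta X)^{-1}(\Delta X^{\prime}\Delta y)$ NB: For $T=2$ show that FE is FD.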
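
The $T=2$ equivalence can be checked numerically. In the sketch below (hypothetical data, one regressor, first-difference regression without a constant) each unit's demeaned values are just plus and minus half of its first difference, so the two slopes coincide.

```python
# Hypothetical panel with exactly two periods per unit.
x = {"A": [1.0, 3.0], "B": [2.0, 5.0], "C": [4.0, 4.5]}
y = {"A": [2.0, 5.0], "B": [1.0, 6.0], "C": [3.0, 5.0]}

# First differences: one observation per unit.
dx = [x[u][1] - x[u][0] for u in x]
dy = [y[u][1] - y[u][0] for u in x]
beta_fd = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

# Within (fixed effects) transform: deviations from each unit's own mean.
x_within = [v - sum(x[u]) / 2 for u in x for v in x[u]]
y_within = [v - sum(y[u]) / 2 for u in x for v in y[u]]
beta_fe = sum(a * b for a, b in zip(x_within, y_within)) / sum(a * a for a in x_within)

print(beta_fd, beta_fe)
```

With more than two periods the two transformations weight the data differently and the estimates generally diverge.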

32

First Differences/Fixed Effects

Either transformation removes heterogeneity. The difference is that the two estimators operate at different orders of integration. The difference is not purely convenience; there is substance to this and theory can help. At the same time, the statistics matter.

33

Specification Testing and Interpretation in the Fixed Effects Model

  • F-test of the dummy variables. What does this mean?
  • Above can be done in one- and two- way frameworks.
  • The substance depends on the first-order question. Under what conditions are first-order effects unbiased (we know this)? The RE/GLS approach works when the orthogonality is maintained.
  • Example from Arellano, p. 40
34

Conditional versus unconditional prediction?

The fixed effect model is entirely conditional on the sample. If we do not know a unit fixed effect, the predictions are undefined. The random effects model can sample from the distribution of random effects.

35

Stata Implementation

  • xtreg: contains five estimators. For now, we will skip pa.

  • be: the between effects estimator. $\bar{y}_i = \bar{x}_i\beta + \epsilon_i$

  • fe: the fixed effects or within estimator. $y^C_i = X^C_i\beta + \epsilon_{it}$
  • re: the standard GLS random effects estimator.
  • mle: the maximum likelihood random effects estimator.
36

Random Effects in Estimation

  • The between estimator ignores all within variation ($\psi=0$).
  • OLS is a weighted average of between and within ($\psi=1$).
  • GLS is an optimally determined compromise given the orthogonality assumption ($0 \leq \psi \leq 1$).

The OLS weight is not in any sense optimally determined; it is a function of the relative ratio of the two quantities (all variance counts the same). As Hsiao (p. 37) points out, the random effects estimator is often known as a quasi-demeaning estimator because it is a partial within transformation.

37

Details on Random Effects GLS (FGLS)

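We will start with the model we defined as random effects before. We defined random effects with $\alpha_i \perp X_{it}$: $\alpha_i \sim [\alpha, \sigma^2_\alpha]$ and $\epsilon_{it} \sim [0, \sigma^2_\epsilon]$. Consider $\nu_{it} = \alpha_i + \epsilon_{it}$.

For a single cross-section (remembering the Kronecker product will help us here), with $\mathbf{1}_T$ denoting a $T \times T$ matrix of ones, $$E(\nu_{it}\nu_{it}^{\prime}) = \sigma^2_\epsilon I_T + \sigma^2_\alpha \mathbf{1}_T = \Omega$$ The inverse is given by $$\Omega^{-1} = \frac{1}{\sigma^2_\epsilon}\left[I_T - \frac{\sigma^2_\alpha}{\sigma^2_\epsilon + T\sigma^2_\alpha}\mathbf{1}_T\right]$$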
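
The closed-form inverse is easy to verify numerically. The sketch below builds the single-unit error covariance for hypothetical variance components and $T = 3$, applies the formula, and checks that the product is the identity.

```python
T = 3
sigma2_eps = 2.0      # hypothetical idiosyncratic variance
sigma2_alpha = 0.5    # hypothetical variance of the unit effect

identity = [[1.0 if r == c else 0.0 for c in range(T)] for r in range(T)]
ones = [[1.0] * T for _ in range(T)]   # T x T matrix of ones

# Omega = sigma2_eps * I_T + sigma2_alpha * (matrix of ones).
omega = [
    [sigma2_eps * identity[r][c] + sigma2_alpha * ones[r][c] for c in range(T)]
    for r in range(T)
]

# Closed-form inverse from the formula in the text.
shrink = sigma2_alpha / (sigma2_eps + T * sigma2_alpha)
omega_inv = [
    [(identity[r][c] - shrink * ones[r][c]) / sigma2_eps for c in range(T)]
    for r in range(T)
]

# Multiply the two matrices; the result should be the identity.
product = [
    [sum(omega[r][k] * omega_inv[k][c] for k in range(T)) for c in range(T)]
    for r in range(T)
]
print(product)
```

The result relies on the matrix of ones satisfying $\mathbf{1}_T\mathbf{1}_T = T\mathbf{1}_T$, which is what makes such a simple inverse possible.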

38

We can also estimate this by using ordinary least squares applied to transformed data. The quasi-demeaning can be done in a first-stage with OLS estimates on the quasi-demeaned data. Recall the pooled regression uses no transformation. The within estimator uses complete demeaning. The random effects estimator is somewhere in between.
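
To make "somewhere in between" concrete, the sketch below computes the quasi-demeaning weight using the standard balanced-panel expression $\theta = 1 - \sqrt{\sigma^2_\epsilon / (\sigma^2_\epsilon + T\sigma^2_\alpha)}$, which is an assumption about the transformation rather than something stated in the slides. The inputs are the sigma_u and sigma_e values reported in the first xtreg, re output in these slides, with 15 observations per country; the same values also reproduce the reported rho.

```python
import math

sigma_u = 0.36517121   # sigma_u from the xtreg, re output
sigma_e = 2.0094449    # sigma_e from the xtreg, re output
T = 15                 # observations per group (240 obs / 16 countries)

# Share of the composite error variance due to the unit effect.
rho = sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)

# Quasi-demeaning weight: 0 reproduces pooled OLS, 1 reproduces the within estimator.
theta = 1 - math.sqrt(sigma_e ** 2 / (sigma_e ** 2 + T * sigma_u ** 2))


def quasi_demean(series, weight):
    """Subtract a fraction of the unit mean from each observation."""
    unit_mean = sum(series) / len(series)
    return [value - weight * unit_mean for value in series]


print(round(rho, 8), round(theta, 4))
print(quasi_demean([1.0, 2.0, 3.0], theta))
```

With so little variance attributed to the unit effect, the weight is small and the random effects estimates stay close to pooled OLS.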

39

Random Effects Variance

Breusch and Pagan (modified by Baltagi and Li) have developed a Lagrange multiplier test of whether or not the random effects have a variance. The test statistic is defined as:

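$$LM = \frac{NT}{2(T-1)}\left[\frac{\sum_{i=1}^{N}\left(\sum_{t=1}^{T}\epsilon_{it}\right)^2}{\sum_{i=1}^{N}\sum_{t=1}^{T}\epsilon_{it}^2} - 1\right]^2 \sim \chi^2_1$$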
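
A short sketch of the statistic for a balanced panel, using the standard Breusch and Pagan form with the bracketed term squared. The residuals are hypothetical stand-ins for pooled OLS residuals and the function name is illustrative.

```python
def breusch_pagan_lm(residuals):
    """LM statistic from per-unit lists of pooled OLS residuals (balanced panel)."""
    n = len(residuals)
    t = len(residuals[0])
    if any(len(unit) != t for unit in residuals):
        raise ValueError("this sketch assumes a balanced panel")
    # Sum over units of the squared within-unit residual totals.
    numerator = sum(sum(unit) ** 2 for unit in residuals)
    # Sum of all squared residuals.
    denominator = sum(e ** 2 for unit in residuals for e in unit)
    return n * t / (2 * (t - 1)) * (numerator / denominator - 1) ** 2


# Hypothetical residuals: each unit's errors share a sign, as a unit effect would produce.
example = [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]]
lm = breusch_pagan_lm(example)
print(lm)
```

The value is compared against a chi-squared distribution with one degree of freedom; here it exceeds the 5 percent critical value of 3.841, as expected for residuals that cluster by unit.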

40
. xtreg growth lagg opengdp openex openimp leftc central inter, re
Random-effects GLS regression Number of obs = 240
Group variable (i): country Number of groups = 16
-----------------------------------------------------------------------------
growth | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+---------------------------------------------------------------
lagg1 | .151848 .0865508 1.75 0.079 -.0177884 .3214843
opengdp | .0082889 .0010012 8.28 0.000 .0063267 .0102511
openex | .0019834 .0005903 3.36 0.001 .0008263 .0031404
openimp | -.0047988 .0010474 -4.58 0.000 -.0068518 -.0027459
leftc | -.0268801 .0108211 -2.48 0.013 -.048089 -.0056711
central | -.7428119 .2547157 -2.92 0.004 -1.242045 -.2435784
inter | .0138935 .0041671 3.33 0.001 .0057261 .0220609
_cons | 3.607517 .571187 6.32 0.000 2.488011 4.727023
-------------+---------------------------------------------------------------
sigma_u | .36517121
sigma_e | 2.0094449
rho | .03196908 (fraction of variance due to u_i)
-----------------------------------------------------------------------------
41

R-squareds

. xtreg growth lagg1 opengdp, fe
Fixed-effects (within) regression Number of obs = 240
Group variable (i): country Number of groups = 16
R-sq: within = 0.2562 Obs per group: min = 15
between = 0.0031 avg = 15.0
overall = 0.1563 max = 15
F(2,222) = 38.23
corr(u_i, Xb) = -0.3888 Prob > F = 0.0000
------------------------------------------------------------------------------
growth | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lagg1 | .2647972 .0851979 3.11 0.002 .0968971 .4326972
opengdp | .0094949 .0011229 8.46 0.000 .007282 .0117078
_cons | .5289261 .3719065 1.42 0.156 -.2039929 1.261845
-------------+----------------------------------------------------------------
sigma_u | 1.142546
sigma_e | 2.0889953
rho | .23025918 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(15, 222) = 3.55 Prob > F = 0.0000
42
. reg Cgrowth Clagg1 Copengdp
Source | SS df MS Number of obs = 240
-------------+------------------------------ F( 2, 237) = 40.81
Model | 333.650655 2 166.825327 Prob > F = 0.0000
Residual | 968.786108 237 4.0877051 R-squared = 0.2562
-------------+------------------------------ Adj R-squared = 0.2499
Total | 1302.43676 239 5.4495262 Root MSE = 2.0218
-----------------------------------------------------------------------------
Cgrowth | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+---------------------------------------------------------------
Clagg1 | .2647972 .0824577 3.21 0.002 .1023536 .4272408
Copengdp | .0094949 .0010868 8.74 0.000 .0073539 .0116359
_cons | 1.30e-08 .1305071 0.00 1.000 -.2571021 .2571021
------------------------------------------------------------------------------
43

Betweens

. by country: egen gmean = mean(growth)
. by country: egen glmean = mean(lagg1)
. by country: egen opengdpmean = mean(opengdp)
. gen yhatb = _b[_cons] + _b[lagg1]*glmean + _b[opengdp]*opengdpmean
. reg gmean yhatb
Source | SS df MS Number of obs = 240
-------------+------------------------------ F( 1, 238) = 0.75
Model | .445360906 1 .445360906 Prob > F = 0.3868
Residual | 140.975583 238 .592334381 R-squared = 0.0031
-------------+------------------------------ Adj R-squared = -0.0010
Total | 141.420943 239 .591719429 Root MSE = .76963
------------------------------------------------------------------------------
gmean | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
yhatb | -.0570801 .0658282 -0.87 0.387 -.1867605 .0726003
_cons | 3.185291 .2044862 15.58 0.000 2.782457 3.588125
------------------------------------------------------------------------------
44

Total

. gen yhatT = _b[_cons] + _b[lagg1]*lagg1 + _b[opengdp]*opengdp
. fit growth yhatT
Source | SS df MS Number of obs = 240
-------------+------------------------------ F( 1, 238) = 44.11
Model | 225.744206 1 225.744206 Prob > F = 0.0000
Residual | 1218.11349 238 5.11812392 R-squared = 0.1563
-------------+------------------------------ Adj R-squared = 0.1528
Total | 1443.8577 239 6.0412456 Root MSE = 2.2623
------------------------------------------------------------------------------
growth | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
yhatT | .6927893 .1043154 6.64 0.000 .48729 .8982887
_cons | .9257153 .3465985 2.67 0.008 .2429227 1.608508
------------------------------------------------------------------------------

Extending this basic logic will hold for all xtreg estimators. Basically, think about them as projecting any given model result to the centered data, to group means, and to all data.

45

Random Coefficients

We saw fixed and random effects. The basic idea generalizes to regression coefficients on variables that are not unit-specific factors/indicators.

  • Random Coefficients Specifications (Swamy 1970)

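$$y_{it} = \alpha + (\bar{\beta} + \mu_i)X_{it} + \epsilon_{it}$$ $$E[\mu_i] = 0; \quad E[\mu_i X_{it}] = 0$$ $$E[\mu_i\mu_j] = \begin{cases} \Delta & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$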

Hsiao and Pesaran (2004, IZA DP 136) show that the GLS estimator is a matrix weighted average of the OLS estimator applied to each unit separately with weights inversely proportional to the covariance matrix for the unit.

46

xtrc: Implementing Random Coefficients

xtrc estimates the Swamy random coefficients model and provides us with a test statistic of parameter constancy. If the statistic is significantly different from zero, parameter constancy is rejected. Option betas gives us the unit-specifics. We have vce options here also.

Note, as with many xt commands, the jackknife is unit-based.

47

xtmixed

Stata has a mixed effects module that we can use for some things we have already seen and for extensions. I should say in passing that this also works for dimensions with nesting properties, though we are looking at two-dimensional data structures.

. sum
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
year | 240 1977 4.329523 1970 1984
country | 240 8.5 4.619406 1 16
growth | 240 3.013292 2.457895 -3.6 9.8
lagg1 | 240 3.119855 1.652682 -2.40641 6.683519
opengdp | 240 174.6452 146.2456 -32.1 736.02
----------+--------------------------------------------------------
openex | 240 489.7662 420.4374 30.94 2879.2
openimp | 240 482.8254 267.6722 64.96 1415.2
leftc | 240 34.79583 39.56008 0 100
central | 240 2.02421 .9593759 .4054115 3.618419
inter | 240 91.33376 117.5622 0 361.8419
48
. xtreg growth lagg1 opengdp openimp openex leftc, re
Random-effects GLS regression Number of obs = 240
Group variable (i): country Number of groups = 16
R-sq: within = 0.2960 Obs per group: min = 15
between = 0.2038 avg = 15.0
overall = 0.2811 max = 15
Random effects u_i ~ Gaussian Wald chi2(5) = 92.41
corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000
------------------------------------------------------------------------------
growth | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lagg1 | .2194248 .0875581 2.51 0.012 .0478142 .3910355
opengdp | .0077965 .0009824 7.94 0.000 .005871 .0097219
openimp | -.0053695 .0009868 -5.44 0.000 -.0073035 -.0034355
openex | .0019647 .0006047 3.25 0.001 .0007796 .0031498
leftc | .0030365 .0036142 0.84 0.401 -.0040472 .0101202
_cons | 2.491734 .4633904 5.38 0.000 1.583505 3.399962
-------------+----------------------------------------------------------------
sigma_u | .21759529
sigma_e | 2.0364407
rho | .01128821 (fraction of variance due to u_i)
------------------------------------------------------------------------------
49

An MLE

. xtreg growth lagg1 opengdp openimp openex leftc, mle
Random-effects ML regression Number of obs = 240
Group variable (i): country Number of groups = 16
Random effects u_i ~ Gaussian Obs per group: min = 15
LR chi2(5) = 81.33
Log likelihood = -514.4714 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
growth | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lagg1 | .1873509 .0881362 2.13 0.034 .014607 .3600947
opengdp | .0077706 .0009913 7.84 0.000 .0058276 .0097136
openimp | -.0055243 .0010506 -5.26 0.000 -.0075835 -.0034651
openex | .0020447 .0005936 3.44 0.001 .0008812 .0032082
leftc | .0044378 .0039745 1.12 0.264 -.0033521 .0122277
_cons | 2.583146 .5204807 4.96 0.000 1.563022 3.603269
-------------+----------------------------------------------------------------
/sigma_u | .5100119 .1962033 .2399497 1.084028
/sigma_e | 2.018389 .0957214 1.839233 2.214995
rho | .0600166 .0445522 .0110832 .2056057
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)= 3.56 Prob>=chibar2 = 0.030
50
. xtmixed growth lagg1 opengdp openimp openex leftc || _all: R.country, mle
Mixed-effects ML regression Number of obs = 240
Group variable: _all Number of groups = 1
Wald chi2(5) = 97.44
Log likelihood = -514.4714 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
growth | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lagg1 | .1873501 .0859494 2.18 0.029 .0188925 .3558078
opengdp | .0077706 .0009911 7.84 0.000 .0058281 .009713
openimp | -.0055243 .0010452 -5.29 0.000 -.0075729 -.0034757
openex | .0020447 .0005915 3.46 0.001 .0008854 .0032039
leftc | .0044378 .0038479 1.15 0.249 -.003104 .0119796
_cons | 2.583148 .5173579 4.99 0.000 1.569145 3.597151
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
-----------------------------+------------------------------------------------
_all: Identity |
sd(R.country) | .5100191 .1962046 .2399545 1.084037
-----------------------------+------------------------------------------------
sd(Residual) | 2.018388 .0957229 1.83923 2.214997
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) = 3.56 Prob >= chibar2 = 0.0296
51

General Stata things: , vce()

For virtually all Stata commands, we can acquire multiple variance/covariance matrices of the parameters.

  • , robust sometimes
  • , cluster() sometimes
  • , vce(boot)
  • , vce(jack)
52

xtmixed

Will allow us to do tons of things. In particular, we can play with the residual correlation matrix using the option residuals. One can recreate virtually everything that we have seen so far this way. The remaining task for you in the lab is to figure out what all you can make it do.

  • exchangeable
  • ar
  • ma
  • unstructured
  • banded
  • toeplitz
  • exponential
53

Mixed Effects Models in Stata with xtmixed

Mixed effects models will allow us to estimate many interesting models for xt data.

  • Simple random effects
  • Crossed random effects
  • Random Coefficients
  • Determined random coefficients
54

Examples

For the simple random effects estimator, there are two ways to do it via ML.

  • xtreg depvar indvars, mle
  • xtmixed depvar indvars || _all: R.UnitID, mle
55
. xtreg growth lagg1 opengdp openimp openex leftc, mle
LR chi2(5) = 81.33
Log likelihood = -514.4714 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
growth | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lagg1 | .1873509 .0881362 2.13 0.034 .014607 .3600947
opengdp | .0077706 .0009913 7.84 0.000 .0058276 .0097136
openimp | -.0055243 .0010506 -5.26 0.000 -.0075835 -.0034651
openex | .0020447 .0005936 3.44 0.001 .0008812 .0032082
leftc | .0044378 .0039745 1.12 0.264 -.0033521 .0122277
_cons | 2.583146 .5204807 4.96 0.000 1.563022 3.603269
-------------+----------------------------------------------------------------
/sigma_u | .5100119 .1962033 .2399497 1.084028
/sigma_e | 2.018389 .0957214 1.839233 2.214995
rho | .0600166 .0445522 .0110832 .2056057
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)= 3.56 Prob>=chibar2 = 0.030
. xtmixed growth lagg1 opengdp openimp openex leftc || _all: R.country, mle
Wald chi2(5) = 97.44
Log likelihood = -514.4714 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
growth | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lagg1 | .1873501 .0859494 2.18 0.029 .0188925 .3558078
opengdp | .0077706 .0009911 7.84 0.000 .0058281 .009713
openimp | -.0055243 .0010452 -5.29 0.000 -.0075729 -.0034757
openex | .0020447 .0005915 3.46 0.001 .0008854 .0032039
leftc | .0044378 .0038479 1.15 0.249 -.003104 .0119796
_cons | 2.583148 .5173579 4.99 0.000 1.569145 3.597151
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
-----------------------------+------------------------------------------------
_all: Identity |
sd(R.country) | .5100191 .1962046 .2399545 1.084037
-----------------------------+------------------------------------------------
sd(Residual) | 2.018388 .0957229 1.83923 2.214997
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) = 3.56 Prob >= chibar2 = 0.0296
56

Crossed Random Effects

Mixed-effects ML regression Number of obs = 240
Group variable: _all Number of groups = 1
Obs per group: min = 240
avg = 240.0
max = 240
Wald chi2(5) = 7.18
Log likelihood = -503.45468 Prob > chi2 = 0.2076
------------------------------------------------------------------------------
growth | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lagg1 | .0059048 .1296512 0.05 0.964 -.2482069 .2600164
opengdp | .0001904 .0016087 0.12 0.906 -.0029626 .0033433
openimp | -.0030722 .0015617 -1.97 0.049 -.006133 -.0000114
openex | .002307 .0010185 2.27 0.024 .0003108 .0043032
leftc | .0048234 .0036133 1.33 0.182 -.0022585 .0119053
_cons | 3.147245 .7630121 4.12 0.000 1.651768 4.642721
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
-----------------------------+------------------------------------------------
_all: Identity |
sd(R.country) | .6667379 .1900389 .3813634 1.165658
-----------------------------+------------------------------------------------
_all: Identity |
sd(R.year) | 1.554459 .4033566 .9347738 2.58495
-----------------------------+------------------------------------------------
sd(Residual) | 1.752177 .0885389 1.586961 1.934595
------------------------------------------------------------------------------
LR test vs. linear regression: chi2(2) = 25.59 Prob > chi2 = 0.0000
Note: LR test is conservative and provided only for reference
. estimates store MLEtwowayRE
57
. lrtest MLEtwowayRE MLEunitRE
Likelihood-ratio test LR chibar2(01) = 22.03
(Assumption: MLEunitRE nested in MLEtwowayRE) Prob > chibar2 = 0.0000
. qui xtmixed growth lagg1 opengdp openimp openex leftc || _all: R.year, mle
. lrtest MLEtwowayRE .
Likelihood-ratio test LR chibar2(01) = 10.04
(Assumption: . nested in MLEtwowayRE) Prob > chibar2 = 0.0008
58
. xtmixed growth lagg1 opengdp openimp openex leftc || country: leftc, covariance(unstructured)
Performing EM optimization:
Performing gradient-based optimization:
Iteration 0: log restricted-likelihood = -540.17955
Iteration 1: log restricted-likelihood = -540.15493
Iteration 2: log restricted-likelihood = -540.15472
Iteration 3: log restricted-likelihood = -540.15472
Computing standard errors:
Mixed-effects REML regression Number of obs = 240
Group variable: country Number of groups = 16
Obs per group: min = 15
avg = 15.0
max = 15
Wald chi2(5) = 95.70
Log restricted-likelihood = -540.15472 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
growth | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lagg1 | .170562 .0869219 1.96 0.050 .0001982 .3409259
opengdp | .0078608 .0010053 7.82 0.000 .0058905 .0098312
openimp | -.0055371 .0010763 -5.14 0.000 -.0076465 -.0034277
openex | .0020745 .0005967 3.48 0.001 .0009051 .0032439
leftc | .0039332 .0046265 0.85 0.395 -.0051346 .013001
_cons | 2.570449 .5444497 4.72 0.000 1.503347 3.637551
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
-----------------------------+------------------------------------------------
country: Unstructured |
sd(leftc) | .0089451 .0078813 .0015908 .0502989
sd(_cons) | .6566839 .2658791 .2969756 1.452085
corr(leftc,_cons) | -.6168731 .5300418 -.9835763 .7429732
-----------------------------+------------------------------------------------
sd(Residual) | 2.022226 .098202 1.83863 2.224156
------------------------------------------------------------------------------
LR test vs. linear regression: chi2(3) = 5.40 Prob > chi2 = 0.1445
Note: LR test is conservative and provided only for reference
. * The coefficient is insignificant as is the randomness
59
. estat recovariance
Random-effects covariance matrix for level country
| leftc _cons
-------------+----------------------
leftc | .00008
_cons | -.0036236 .4312338
. capture drop u1 u2
. predict u*, reffects
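The estat recovariance entries can be rebuilt from the sd and corr estimates in the random-effects table, which is a quick way to see how the two displays relate. The sketch below uses plain Python rather than Stata purely for illustration (it is not part of the original session); the variable names are ours and the numbers are copied from the output above.

```python
# Rebuild the random-effects covariance matrix from the reported
# standard deviations and correlation (values copied from the mixed output).
sd_leftc = 0.0089451
sd_cons = 0.6566839
corr_leftc_cons = -0.6168731

var_leftc = sd_leftc ** 2                       # variance of the leftc slope
var_cons = sd_cons ** 2                         # variance of the intercept
cov_leftc_cons = corr_leftc_cons * sd_leftc * sd_cons

print(f"var(leftc)        = {var_leftc:.5f}")
print(f"var(_cons)        = {var_cons:.7f}")
print(f"cov(leftc, _cons) = {cov_leftc_cons:.7f}")
```

Up to rounding, these reproduce the three entries that estat recovariance displays.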
. by country, sort: sum u*
---------------------------------------------------------------------------------------------------------------
-> country = AUL
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
u1 | 15 -.0006591 0 -.0006591 -.0006591
u2 | 15 .1237475 0 .1237475 .1237475
-> country = AUS
u1 | 15 .0005591 0 .0005591 .0005591
u2 | 15 .0125652 0 .0125652 .0125652
-> country = BEL
u1 | 15 -.0000316 0 -.0000316 -.0000316
u2 | 15 -.0002924 0 -.0002924 -.0002924
-> country = CAN
u1 | 15 -.0035756 0 -.0035756 -.0035756
u2 | 15 .4255248 0 .4255248 .4255248
-> country = DEN
u1 | 15 .0019625 0 .0019625 .0019625
u2 | 15 -.462575 0 -.462575 -.462575
-> country = FIN
u1 | 15 .003543 0 .003543 .003543
u2 | 15 .1606634 0 .1606634 .1606634
-> country = FRA
u1 | 15 -.0083416 0 -.0083416 -.0083416
u2 | 15 .3128709 0 .3128709 .3128709
-> country = GER
u1 | 15 .0011514 0 .0011514 .0011514
u2 | 15 -.3119804 0 -.3119804 -.3119804
-> country = IRE
u1 | 15 -.0021854 0 -.0021854 -.0021854
u2 | 15 .3908045 0 .3908045 .3908045
-> country = ITA
u1 | 15 .0002358 0 .0002358 .0002358
u2 | 15 -.1705837 0 -.1705837 -.1705837
-> country = JAP
u1 | 15 -.0090248 0 -.0090248 -.0090248
u2 | 15 1.074025 0 1.074025 1.074025
-> country = NET
u1 | 15 .0031352 0 .0031352 .0031352
u2 | 15 -.2520462 0 -.2520462 -.2520462
-> country = NOR
u1 | 15 .0088704 0 .0088704 .0088704
u2 | 15 .0223926 0 .0223926 .0223926
-> country = SWE
u1 | 15 .002398 0 .002398 .002398
u2 | 15 -.5351107 0 -.5351107 -.5351107
-> country = UK
u1 | 15 .000085 0 .000085 .000085
u2 | 15 -.5665398 0 -.5665398 -.5665398
-> country = USA
u1 | 15 .0018777 0 .0018777 .0018777
u2 | 15 -.2234658 0 -.2234658 -.2234658
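One way to read the predicted random effects is to add them to the fixed coefficients, giving a country-specific slope and intercept. The sketch below does this for three countries in Python (an illustrative choice; the session itself is Stata). It assumes that u1 is the deviation in the leftc slope and u2 the deviation in the intercept, which matches the order of the random-effects table and the magnitudes of sd(leftc) and sd(_cons); the dictionary and variable names are ours.

```python
# Fixed-part estimates copied from the mixed output.
fixed_leftc = 0.0039332
fixed_cons = 2.570449

# (u1, u2) pairs copied from the by-country summaries.
# Assumption: u1 belongs to leftc and u2 to _cons.
blups = {
    "JAP": (-0.0090248, 1.074025),
    "NOR": (0.0088704, 0.0223926),
    "UK": (0.000085, -0.5665398),
}

country_coefs = {
    country: (fixed_leftc + u1, fixed_cons + u2)
    for country, (u1, u2) in blups.items()
}

for country, (slope, intercept) in country_coefs.items():
    print(f"{country}: leftc slope = {slope:.7f}, intercept = {intercept:.6f}")
```

Under that reading, Japan has the most negative implied leftc slope and the highest implied intercept, while Norway has the most positive slope.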

A Plot


Wilson and Butler

  • Survey of papers using TSCS data and methods(?)
  • Vast majority do nothing about space or time.
  • Does it matter?

  • Table 3

  • Table 4

  • What do we do? Raise the bar for positive findings and look at multiple models trying to tease out the role of particular assumptions as necessary and/or sufficient for results.


More on xtpcse


Holding on to data

  • preserve
  • restore

Testing the Null Hypothesis of No Random Effects

. xttest0
Breusch and Pagan Lagrangian multiplier test for random effects:
growth[country,t] = Xb + u[country] + e[country,t]
Estimated results:
| Var sd = sqrt(Var)
---------+-----------------------------
growth | 6.041246 2.457895
e | 4.147091 2.036441
u | .0473477 .2175953
Test: Var(u) = 0
chi2(1) = 4.39
Prob > chi2 = 0.0361
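For intuition about what xttest0 computes, here is a minimal sketch of the Breusch-Pagan LM statistic for a balanced panel, LM = NT / (2(T-1)) * [sum_i (sum_t e_it)^2 / sum_i sum_t e_it^2 - 1]^2, where e_it are pooled OLS residuals. It is written in Python for illustration only. The toy residuals and function names are hypothetical; the last lines simply check that a chi2(1) statistic of 4.39 gives the p-value printed above.

```python
import math

def breusch_pagan_lm(resid_by_unit):
    """LM statistic for Var(u) = 0 from pooled-OLS residuals (balanced panel)."""
    n_units = len(resid_by_unit)
    n_periods = len(next(iter(resid_by_unit.values())))
    # Squared within-unit residual totals, summed over units.
    sum_sq_totals = sum(sum(e) ** 2 for e in resid_by_unit.values())
    # Sum of all squared residuals.
    sum_sq = sum(v * v for e in resid_by_unit.values() for v in e)
    scale = n_units * n_periods / (2.0 * (n_periods - 1))
    return scale * (sum_sq_totals / sum_sq - 1.0) ** 2

def chi2_1_pvalue(stat):
    """Upper-tail probability of a chi-squared(1) variate."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical residuals: units A and B carry persistent unit-level shifts.
toy_resid = {"A": [1.0, 1.0], "B": [-1.0, -1.0], "C": [0.0, 0.0]}
lm_toy = breusch_pagan_lm(toy_resid)
print(f"toy LM = {lm_toy:.2f}, p = {chi2_1_pvalue(lm_toy):.4f}")

# The statistic reported by xttest0 above.
p_reported = chi2_1_pvalue(4.39)
print(f"p-value for chi2(1) = 4.39: {p_reported:.4f}")
```

Residuals that keep the same sign within a unit push the ratio above one, which is what the test picks up as evidence of a unit effect.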

xttest

. xttest1
Tests for the error component model:
growth[country,t] = Xb + u[country] + v[country,t]
v[country,t] = rho v[country,(t-1)] + e[country,t]
Estimated results:
Var sd = sqrt(Var)
---------+-----------------------------
growth | 6.041246 2.457895
e | 4.037869 2.0094449
u | .13335 .36517121
Tests:
Random Effects, Two Sided:
LM(Var(u)=0) = 1.00 Pr>chi2(1) = 0.3174
ALM(Var(u)=0) = 0.54 Pr>chi2(1) = 0.4610
Random Effects, One Sided:
LM(Var(u)=0) = 1.00 Pr>N(0,1) = 0.1587
ALM(Var(u)=0) = 0.74 Pr>N(0,1) = 0.2305
Serial Correlation:
LM(rho=0) = 0.74 Pr>chi2(1) = 0.3906
ALM(rho=0) = 0.28 Pr>chi2(1) = 0.5961
Joint Test:
LM(Var(u)=0,rho=0) = 1.28 Pr>chi2(2) = 0.5271
* We cannot reject the null hypothesis of no variation in the random effects.
There is also no evidence of serial correlation.
Remember, with the lagged endogenous variable on the right-hand side,
the random effects are included if they are there.

xttest1

  • LM test for random effects, assuming no serial correlation
  • Adjusted LM test for random effects, which works even under serial correlation
  • One-sided version of the LM test for random effects
  • One-sided version of the adjusted LM test for random effects
  • LM joint test for random effects and serial correlation
  • LM test for first-order serial correlation, assuming no random effects
  • Adjusted test for first-order serial correlation, which works even under random effects
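The printed xttest1 numbers are consistent with the one-sided statistic being the square root of the two-sided LM statistic, referred to N(0,1) instead of chi2(1). The sketch below checks that reading against the reported LM values in Python (for illustration only). It assumes a positive sign for the root, since the sign cannot be recovered from the squared statistic; the function name is ours.

```python
import math

def upper_tail_normal(z):
    """P(Z > z) for a standard normal variate."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

lm_two_sided = 1.00                      # LM(Var(u)=0) from the output above
lm_one_sided = math.sqrt(lm_two_sided)   # assumed positive root

p_one_sided = upper_tail_normal(lm_one_sided)
# A chi2(1) upper tail equals twice the normal upper tail of the root.
p_two_sided = 2.0 * p_one_sided

print(f"one-sided p = {p_one_sided:.4f}")
print(f"two-sided p = {p_two_sided:.4f}")
```

The one-sided p-value is half the two-sided one, which matches 0.1587 against 0.3174 (and 0.2305 against 0.4610 for the adjusted version) in the output.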

xtgls

  • corr(): t structure, ar1 (common ρ) or psar1 (panel-specific ρ).
  • panels(): i structure (iid, [h]eteroskedastic, [c]orrelated (and [h]))
  • rhotype(): regress (regression using lags), dw (Durbin-Watson), freg (forward regression, uses leads), nagar, theil, tscorr
  • igls (iterate rather than two-step)
  • force: estimate even when the time variable is not equally spaced.

xttest2 and xttest3

After xtreg, fe or xtgls, we have two tests pre-programmed.

  • We have a test of independence (within) in xttest2
  • We have a test of homoscedasticity (within) in xttest3

xtserial

Wooldridge presents a test for serial correlation.

xtcsd

How do we test for cross-sectional dependence?

  • Generally used for small T and large N settings.
  • Three methods: xtcsd, pesaran friedman frees
  • This is the panel correction in PCSE

xtscc

Driscoll and Kraay (1998) describe a robust covariance matrix estimator for pooled and fixed effects regression models that contain a large time dimension. The approach is robust to heteroscedasticity, autocorrelation, and spatial correlation.


We're Here for Fancy Estimators, Why is Everything OLS?

There are limitations imposed by what people have programmed in terms of regression diagnostics. However, if we can fit the same model by OLS, we can use standard regression diagnostics post-estimation to avoid calculating the diagnostics by hand. Many diagnostics are pre-programmed.


OLS Diagnostics

  • We could also use other standard diagnostics in the OLS framework. If you are going to intensively use Stata, books like Statistics with Stata are quite useful.
  • estat ovtest [, rhs] will give us Ramsey's RESET test. The default is a Wald test applied to the regression $y_{it} = X_{it}\beta + \hat{y}^2\gamma_1 + \hat{y}^3\gamma_2 + \hat{y}^4\gamma_3 + \epsilon_{it}$, which uses powers of the fitted values; with option rhs the powers are applied to the right-hand side variables instead.
  • predict ... , dfits and dfbeta: We also have the dfits and dfbeta statistics for use in diagnosing leverage. The dfits statistic is the studentized residual multiplied by $\sqrt{h_j/(1-h_j)}$; basically a scaled measure of the difference between in-sample and out-of-sample predictions. It is obtained as a post-regression prediction using predict. Define dfbeta as $\mathrm{DFBETA}_j = \frac{r_j v_j}{\sqrt{v^2(1-h_j)}}$, where $h_j$ is the $j$th diagonal element of the projection matrix $P$, $r_j$ is the studentized residual, $v_j$ are the residuals from a regression of the regressor in question on the remaining regressors, and $v^2$ is their sum of squares. Suggested cutoffs are $2\sqrt{k/N}$ for dfits and $2/\sqrt{N}$ for dfbeta. There are also Cook's distance (cooksd) and the Welsch distance (welsch).
  • estat hettest [varlist] [, rhs [normal | iid | fstat] mtest[(spec)]] gives us a variety of tests for heteroscedasticity. The rhs option gives structure from covariates. mtest is important because we are doing multiple testing (often).
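To make the dfits definition concrete, the sketch below computes leverage, externally studentized residuals, and dfits for a one-regressor OLS fit, then applies the $2\sqrt{k/N}$ cutoff. It is plain Python for illustration, not Stata. The data are made up (nine points near a line plus one high-leverage outlier), the function name is ours, and the closed-form leverage used here holds only for a regression with an intercept and a single regressor.

```python
import math

def dfits_simple_regression(x, y):
    """Return (leverage, studentized residual, dfits) per observation."""
    n = len(x)
    k = 2  # parameters: intercept and one slope
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)
    rows = []
    for xi, e in zip(x, resid):
        h = 1.0 / n + (xi - mean_x) ** 2 / sxx           # diagonal of P
        s2_deleted = (sse - e * e / (1.0 - h)) / (n - k - 1)
        r = e / math.sqrt(s2_deleted * (1.0 - h))        # studentized residual
        rows.append((h, r, r * math.sqrt(h / (1.0 - h))))
    return rows

# Hypothetical data: y = 1 + 2x with small alternating noise, plus an outlier.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [1 + 2 * xi + (0.1 if i % 2 == 0 else -0.1) for i, xi in enumerate(x[:9])] + [20.0]

rows = dfits_simple_regression(x, y)
cutoff = 2.0 * math.sqrt(2 / len(x))
flagged = [i for i, (_, _, d) in enumerate(rows) if abs(d) > cutoff]
total_leverage = sum(h for h, _, _ in rows)
print("flagged observations:", flagged)
```

The leverages sum to the number of parameters, and only the last observation, which is both far from the mean of x and far from the line, exceeds the cutoff.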

continued

  • estat vif gives us some collinearity diagnostics. The statistic is essentially $\frac{1}{1-R^2_{(k)}}$.
  • estat imtest [, preserve white] where the default is Cameron-Trivedi, we can request White's version, and preserve maintains the original data (saves time often). As a general misspecification test, the Information Matrix test is shown by Hall (1987) to decompose into heteroscedasticity, skewness, and kurtosis of residuals and has some suboptimal properties.

Plots

  • avplot: added-variable plot
  • avplots: all added-variable plots in one image
  • cprplot: component-plus-residual plot
  • lvr2plot: leverage-versus-squared-residual plot
  • rvfplot: residual-versus-fitted plot
  • rvpplot: residual-versus-predictor plot

Panel Unit Root Testing in Stata

  • Levin-Lin-Chu ( xtunitroot llc ): trend nocons (unit specific) demean (within transform) lags. Under (crucial) cross-sectional independence, the test is an advancement on the generic Dickey-Fuller theory that allows the lag lengths to vary by cross-section. The test relies on specifying a kernel (beyond our purposes) and a lag length (upper bound). The test statistic is asymptotically standard normal, with asymptotics requiring $\sqrt{N_T}/T \to 0$ ( T grows faster than N ). The test is of either all series containing unit roots ( $H_0$ ) or all stationary; this is a limitation. It is recommended for moderate to large T and N.

  • Perform separate ADF regressions: $\Delta y_{it} = \rho_i y_{i,t-1} + \sum_{L=1}^{p_i} \theta_{iL} \Delta y_{i,t-L} + \alpha_{mi} d_{mt} + \epsilon_{it}$ with $d_{mt}$ as the vector of deterministic variables (none, drift, drift and trend). Select a max L and use t on $\hat{\theta}_{iL}$ to attempt to simplify. Then regress $\Delta y_{it}$ on $\Delta y_{i,t-L}$ and $d_{mt}$ to obtain residuals.

  • Harris-Tzavalis ( xtunitroot ht ): trend nocons (unit specific) demean (within transform) altt (small-sample adjustment). Similar to the previous test; they show that letting T grow faster than N (rather than holding T fixed) leads to size distortions.

  • Breitung ( xtunitroot breitung ): trend nocons (unit specific) demean (within transform) robust (CSD) lags. Similar to LLC with a common statistic across all i.

  • Im, Pesaran, Shin ( xtunitroot ips ): trend demean (within transform) lags. They free ρ to be ρi and average individual unit root statistics. The null is that all contain unit roots while the alternative specifies at least some to be stationary. The test relies on sequential asymptotics (first T, then N). Better in small samples than LLC, but note the differences in the alternatives.

  • Fisher type tests ( xtunitroot fisher ): dfuller pperron demean lags.

  • Hadri (LM) ( xtunitroot hadri ): trend demean robust

All but the last are null hypothesis unit-root tests. Most assume balance but the fisher and IPS versions can work for unbalanced panels.


ADL/Canonical models

We can consider some very basic time series models.

  • Koyck/Geometric decay: short run and long-run effects are parametrically identified (given M).
  • Almon (more arbitrary decay): $y_{it} = \sum_{t_A=0}^{T_F} \rho_{t_A} x_{i,t-t_A} + \epsilon_{it}$ with coefficients that are ordinates of some general polynomial of degree $q$, where $T_F \gg q$: $\rho_{t_A} = \sum_{k=0}^{q} \gamma_k t_A^k$.
  • Prais-Winsten, etc. are basically FGLS implementations of AR(1).
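The appeal of the Almon scheme is that a handful of polynomial coefficients generate the whole set of lag weights. A minimal Python sketch follows (illustrative only); the polynomial coefficients are hypothetical integers chosen so the arithmetic is easy to follow, and the function name is ours.

```python
def almon_weights(gammas, max_lag):
    """Lag weights rho_j = sum_k gamma_k * j**k for j = 0..max_lag."""
    return [
        sum(g * lag ** k for k, g in enumerate(gammas))
        for lag in range(max_lag + 1)
    ]

# Hypothetical quadratic: rho_j = 6 + j - j^2, evaluated at lags 0..3.
weights = almon_weights([6, 1, -1], 3)
long_run_effect = sum(weights)   # total effect once all lags have played out

print("lag weights:", weights)
print("long-run effect:", long_run_effect)
```

Here three parameters determine four lag weights; with a longer lag horizon the saving in parameters grows accordingly.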

Prais-Winsten/Cochrane-Orcutt

$y_{it} = X_{it}\beta + \epsilon_{it}$ where $\epsilon_{it} = \rho\epsilon_{i,t-1} + \nu_{it}$ and $\nu_{it} \sim N(0, \sigma^2_{\nu})$ with stationarity forcing $|\rho| < 1$. We will use iterated FGLS.

  1. First, estimate the regression recalling our unbiasedness condition.
  2. Then regress $\hat{\epsilon}_{it}$ on $\hat{\epsilon}_{i,t-1}$.
  3. Rinse and repeat until $\rho$ doesn't change. The transformation applied to the first observation is distinct, you can look this up....

In general, the transformed regression is: $y_{it} - \rho y_{i,t-1} = \alpha(1-\rho) + \beta(X_{it} - \rho X_{i,t-1}) + \nu_{it}$ with $\nu$ white noise.
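The iteration can be sketched in a few lines. The Python below (for illustration; not the Stata routine) implements the Cochrane-Orcutt variant, which drops the first observation instead of transforming it, for a single series with one regressor. The simulated data, seed, true parameter values, and function names are all assumptions made for the example.

```python
import random

def ols(x, y):
    """Simple regression of y on x with a constant; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def cochrane_orcutt(x, y, tol=1e-8, max_iter=100):
    intercept, slope = ols(x, y)                      # step 1: plain OLS
    rho = 0.0
    for _ in range(max_iter):
        resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
        # Step 2: regress e_t on e_{t-1} (no constant).
        num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
        den = sum(resid[t - 1] ** 2 for t in range(1, len(resid)))
        new_rho = num / den
        # Quasi-difference the data, dropping the first observation.
        y_star = [y[t] - new_rho * y[t - 1] for t in range(1, len(y))]
        x_star = [x[t] - new_rho * x[t - 1] for t in range(1, len(x))]
        const_star, slope = ols(x_star, y_star)
        intercept = const_star / (1.0 - new_rho)      # recover alpha
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:                                 # step 3: stop when rho settles
            break
    return intercept, slope, rho

# Simulated series: y = 1 + 2x + e, with AR(1) errors and rho = 0.6.
random.seed(6)
x, y, err = [], [], 0.0
for _ in range(500):
    err = 0.6 * err + random.gauss(0.0, 1.0)
    x_t = random.gauss(0.0, 1.0)
    x.append(x_t)
    y.append(1.0 + 2.0 * x_t + err)

alpha_hat, beta_hat, rho_hat = cochrane_orcutt(x, y)
print(f"alpha = {alpha_hat:.3f}, beta = {beta_hat:.3f}, rho = {rho_hat:.3f}")
```

Prais-Winsten differs only in keeping the first observation with its own transformation; a panel version would pool the residual regression across units.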


Beck

  • Static model: Instantaneous impact. $y_{i,t} = X_{i,t}\beta + \nu_{i,t}$

  • Finite distributed lag: lags of x finite horizon impact (defined by lags). $y_{i,t} = X_{i,t}\beta + \sum_{k=1}^{K} X_{i,t-k}\beta_k + \nu_{i,t}$

  • AR(1): Errors decay geometrically, X instantaneous. (Suppose unmeasured x and think this through). $y_{i,t} = X_{i,t}\beta + \nu_{i,t} + \theta\epsilon_{i,t-1}$

  • Lagged dependent variable: lags of y [common geometric decay] $y_{i,t} = X_{i,t}\beta + \phi y_{i,t-1} + \nu_{i,t}$

  • ADL: current and lagged x and lagged y. $y_{i,t} = X_{i,t}\beta + X_{i,t-1}\gamma + \phi y_{i,t-1} + \epsilon_{i,t}$

  • Panel versions of transfer function models from Box and Jenkins time series. (each x has an impact and decay function)
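It may help to see how these specifications nest inside the ADL. The derivation below is a sketch using the slide's symbols: quasi-differencing the AR(1) error model shows it is an ADL with $\phi = \theta$ and the restriction $\gamma = -\phi\beta$.

```latex
% AR(1) errors written as a restricted ADL (sketch; \theta is the error autocorrelation)
\begin{align*}
y_{i,t} &= X_{i,t}\beta + \epsilon_{i,t}, \qquad \epsilon_{i,t} = \theta\epsilon_{i,t-1} + \nu_{i,t} \\
y_{i,t} - \theta y_{i,t-1} &= X_{i,t}\beta - \theta X_{i,t-1}\beta + \nu_{i,t} \\
y_{i,t} &= X_{i,t}\beta + X_{i,t-1}(-\theta\beta) + \theta y_{i,t-1} + \nu_{i,t}
\end{align*}
```

The other cases follow by setting ADL coefficients to zero: $\gamma = 0$ gives the lagged dependent variable model, $\phi = 0$ gives a finite distributed lag with $K = 1$, and $\gamma = \phi = 0$ gives the static model.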

Brief Comment on Hurwicz/Nickell Bias

  • Bias is of stochastic order $1/T$.
  • It becomes less severe as T grows.
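To get a feel for the magnitudes, a commonly cited large-N approximation for the fixed-effects estimate of the coefficient on the lagged dependent variable, attributed to Nickell (1981), is roughly $-(1+\gamma)/(T-1)$. That formula is brought in here as an assumption, not something stated on the slide, and it is only a rough guide for short panels. The Python sketch below tabulates it for a hypothetical $\gamma = 0.5$; the function name is ours.

```python
def nickell_bias_approx(gamma, n_periods):
    """Approximate large-N bias of the within estimator of gamma."""
    return -(1.0 + gamma) / (n_periods - 1)

gamma = 0.5  # hypothetical true persistence
for n_periods in (5, 10, 15, 30):
    bias = nickell_bias_approx(gamma, n_periods)
    print(f"T = {n_periods:2d}: approximate bias = {bias:.4f}")

bias_t15 = nickell_bias_approx(gamma, 15)  # T = 15 matches the panel used above
```

The bias shrinks roughly in proportion to 1/T, so it is far smaller at T = 30 than at T = 5 but does not vanish as N grows.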

Interpretation of dynamic models

  • Do it.
  • Whitten and Williams' dynsim uses Clarify (estimate, set, simulate) to do this. NB: If you do not know what Clarify is, please ask.
  • Their paper is "But Wait, There's More! Maximizing Substantive Inferences from TSCS Models." Easy to find on the web and on the website.

Details

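$y_{it} = \alpha + \gamma y_{i,t-1} + X_{it}\beta + \epsilon_{it}$

$y_{it} = \alpha + \gamma[\alpha + \gamma y_{i,t-2} + X_{i,t-1}\beta + \epsilon_{i,t-1}] + X_{it}\beta + \epsilon_{it}$

$y_{it} = \alpha + \gamma[\alpha + \gamma(\alpha + \gamma y_{i,t-3} + X_{i,t-2}\beta + \epsilon_{i,t-2}) + X_{i,t-1}\beta + \epsilon_{i,t-1}] + X_{it}\beta + \epsilon_{it}$

We can continue substituting through to conclude that we have a geometrically decaying impact, so that the long-run effect of a one-unit change in X is $\frac{\beta}{1-\gamma}$.

But $\gamma$ has uncertainty; it is an estimate. To show the realistic long-run impact, we need to incorporate that uncertainty.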
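One simulation-based way to carry that uncertainty through is to draw the coefficients from their estimated sampling distribution and compute the long-run effect for each draw. The Python sketch below (illustrative only) borrows the opengdp and lagg1 estimates and standard errors from the mixed output earlier in the deck. It assumes normal sampling distributions and zero covariance between the two estimates, which is a simplification because the covariance is not shown; in practice one would draw from the full estimated covariance matrix.

```python
import random

random.seed(20220815)

# Point estimates and standard errors copied from the earlier output.
beta_hat, beta_se = 0.0078608, 0.0010053      # opengdp
gamma_hat, gamma_se = 0.170562, 0.0869219     # lagg1 (lagged growth)

point_estimate = beta_hat / (1.0 - gamma_hat)

# Assumption: independent normal draws (covariance set to zero).
draws = sorted(
    random.gauss(beta_hat, beta_se) / (1.0 - random.gauss(gamma_hat, gamma_se))
    for _ in range(10000)
)
lower = draws[int(0.025 * len(draws))]
upper = draws[int(0.975 * len(draws)) - 1]

print(f"long-run effect: {point_estimate:.6f}")
print(f"simulated 95% interval: [{lower:.6f}, {upper:.6f}]")
```

The interval is not symmetric around the point estimate because the ratio is nonlinear in $\gamma$, which is part of why simulation is convenient here.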
