By Harry Robertson,2014-06-20 00:58

    CHAPTER 14


    My preference is to view the fixed and random effects methods of estimation as applying to the same underlying unobserved effects model. The name “unobserved effect” is neutral to the issue of whether the time-constant effects should be treated as fixed parameters or random variables. With large N and relatively small T, it almost always makes sense to treat them as random variables, since we can just view the unobserved $a_i$ as being drawn from the population along with the observed variables. Especially for undergraduates and master’s students, it seems sensible to not raise the philosophical issues underlying the professional debate. In my mind, the key issue in most applications is whether the unobserved effect is correlated with the observed explanatory variables. The fixed effects transformation eliminates the unobserved effect entirely whereas the random effects transformation accounts for the serial correlation in the composite error via GLS. (Alternatively, the random effects transformation only eliminates a fraction of the unobserved effect.)

    As a practical matter, the fixed effects and random effects estimates are closer when T is large or when the variance of the unobserved effect is large relative to the variance of the idiosyncratic error. I think Example 14.4 is representative of what often happens in applications that apply pooled OLS, random effects, and fixed effects, at least on the estimates of the marriage and union wage premiums. The random effects estimates are below pooled OLS and the fixed effects estimates are below the random effects estimates.
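    To make the "closer when T is large" remark concrete, here is a minimal sketch that evaluates the random effects quasi-demeaning parameter, $\lambda = 1 - [\sigma_u^2/(\sigma_u^2 + T\sigma_a^2)]^{1/2}$, for a few values of T and of the variance ratio. The function name and the variance values are illustrative assumptions, not taken from Example 14.4.

```python
import math

def re_lambda(sigma_a2, sigma_u2, T):
    """Quasi-demeaning parameter: 1 - sqrt(sigma_u^2 / (sigma_u^2 + T * sigma_a^2))."""
    return 1.0 - math.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))

# lambda = 0 corresponds to pooled OLS and lambda = 1 to fixed effects.
# Hypothetical variances: sigma_a^2 = sigma_u^2 = 1, with T growing.
for T in (2, 5, 20, 100):
    print("T =", T, "lambda =", round(re_lambda(1.0, 1.0, T), 3))

# Hypothetical T = 5, with the variance of a_i growing relative to that of u_it.
for sigma_a2 in (0.1, 1.0, 10.0, 100.0):
    print("sigma_a2 =", sigma_a2, "lambda =", round(re_lambda(sigma_a2, 1.0, 5), 3))
```

    In both loops $\lambda$ moves toward one, so the random effects transformation removes nearly all of the unobserved effect and the RE estimates approach the FE estimates.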

    Choosing between the fixed effects transformation and first differencing is harder, although useful evidence can be obtained by testing for serial correlation in the first-difference estimation. If the AR(1) coefficient is significant and negative (say, less than -.3, to pick a not quite arbitrary value), perhaps fixed effects is preferred: serially uncorrelated idiosyncratic errors imply a correlation of -.5 in the differenced errors, as shown in Problem 14.1.

    Matched pairs samples have been profitably used in recent economic applications, and differencing or random effects methods can be applied. In an equation such as (14.12), there is probably no need to allow a different intercept for each sister provided that the labeling of sisters is random. The different intercepts might be needed if a certain feature of a sister that is not included in the observed controls is used to determine the ordering. A statistically significant intercept in the differenced equation would be evidence of this.



14.1 First, for each t > 1, $Var(\Delta u_{it}) = Var(u_{it} - u_{i,t-1}) = Var(u_{it}) + Var(u_{i,t-1}) = 2\sigma_u^2$, where we use the assumptions of no serial correlation in $\{u_t\}$ and constant variance. Next, we find the covariance between $\Delta u_{it}$ and $\Delta u_{i,t+1}$. Because these each have a zero mean, the covariance is

    $E(\Delta u_{it} \Delta u_{i,t+1}) = E[(u_{it} - u_{i,t-1})(u_{i,t+1} - u_{it})] = E(u_{it}u_{i,t+1}) - E(u_{it}^2) - E(u_{i,t-1}u_{i,t+1}) + E(u_{i,t-1}u_{it}) = -E(u_{it}^2) = -\sigma_u^2$

    because of the no serial correlation assumption. Because the variance is constant across t, by Problem 11.1, $Corr(\Delta u_{it}, \Delta u_{i,t+1}) = Cov(\Delta u_{it}, \Delta u_{i,t+1})/Var(\Delta u_{it}) = -\sigma_u^2/(2\sigma_u^2) = -.5$.
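    The result in 14.1 can be checked with a small simulation. The sketch below assumes normally distributed errors with an arbitrary standard deviation of 2 and an arbitrary sample size; only the absence of serial correlation and the constant variance matter for the result.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 200_000, 3                        # illustrative sizes (assumed)
u = rng.normal(scale=2.0, size=(N, T))   # serially uncorrelated, constant variance

du = np.diff(u, axis=1)                  # columns hold the two first differences
corr = np.corrcoef(du[:, 0], du[:, 1])[0, 1]
var_ratio = du[:, 0].var() / u.var()     # should be close to 2

print("Corr of adjacent differences:", round(corr, 3))
print("Var(du) / Var(u):", round(var_ratio, 3))
```

    The sample correlation should be close to -.5 and the variance ratio close to 2, whatever value is chosen for the error variance.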

14.2 (i) The between estimator is just the OLS estimator from the cross-sectional regression of $\bar{y}_i$ on $\bar{x}_i$ (including an intercept). Because we just have a single explanatory variable $\bar{x}_i$ and the error term is $a_i + \bar{u}_i$, we have, from Section 5.1,

    $plim(\tilde{\beta}_1) = \beta_1 + Cov(\bar{x}_i, a_i + \bar{u}_i)/Var(\bar{x}_i).$

    But $E(a_i + \bar{u}_i) = 0$, so $Cov(\bar{x}_i, a_i + \bar{u}_i) = E[\bar{x}_i(a_i + \bar{u}_i)] = E(\bar{x}_i a_i) + E(\bar{x}_i \bar{u}_i) = E(\bar{x}_i a_i)$ because $E(\bar{x}_i \bar{u}_i) = Cov(\bar{x}_i, \bar{u}_i) = 0$ by assumption. Now $E(\bar{x}_i a_i) = T^{-1}\sum_{t=1}^{T} E(x_{it} a_i) = \sigma_{xa}$. Therefore,

    $plim(\tilde{\beta}_1) = \beta_1 + \sigma_{xa}/Var(\bar{x}_i),$

    which is what we wanted to show.

     (ii) If $\{x_{it}\}$ is serially uncorrelated with constant variance $\sigma_x^2$ then $Var(\bar{x}_i) = \sigma_x^2/T$, and so

    $plim\, \tilde{\beta}_1 = \beta_1 + \sigma_{xa}/(\sigma_x^2/T) = \beta_1 + T(\sigma_{xa}/\sigma_x^2).$

     (iii) As part (ii) shows, when the $x_{it}$ are pairwise uncorrelated the magnitude of the inconsistency actually increases linearly with T. The sign depends on the covariance between $x_{it}$ and $a_i$.
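    A simulation sketch of parts (ii) and (iii) follows. The data-generating process is an assumption chosen so that the $x_{it}$ are serially uncorrelated with unit variance while $Cov(x_{it}, a_i) = \sigma_{xa}$ for every t; the function name and all parameter values are hypothetical.

```python
import numpy as np

def between_slope(T, N=100_000, beta1=1.0, sigma_xa=0.1, seed=0):
    """Between estimator: OLS slope from regressing ybar_i on xbar_i."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(N, T))                         # Var(x_it) = 1, no serial correlation
    a = sigma_xa * x.sum(axis=1) + rng.normal(size=N)   # Cov(x_it, a_i) = sigma_xa for all t
    u = rng.normal(size=(N, T))
    y = beta1 * x + a[:, None] + u
    xbar, ybar = x.mean(axis=1), y.mean(axis=1)
    return np.cov(xbar, ybar)[0, 1] / xbar.var(ddof=1)

for T in (2, 5, 10):
    print("T =", T, "between slope:", round(between_slope(T), 3),
          "theory:", 1.0 + T * 0.1)
```

    With $\sigma_x^2 = 1$ the probability limit is $\beta_1 + T\sigma_{xa}$, so the simulated slope should drift further from $\beta_1 = 1$ as T grows.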

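14.3 (i) $E(e_{it}) = E(v_{it} - \lambda\bar{v}_i) = E(v_{it}) - \lambda E(\bar{v}_i) = 0$ because $E(v_{it}) = 0$ for all t.

     (ii) $Var(v_{it} - \lambda\bar{v}_i) = Var(v_{it}) + \lambda^2 Var(\bar{v}_i) - 2\lambda Cov(v_{it}, \bar{v}_i) = \sigma_v^2 + \lambda^2 E(\bar{v}_i^2) - 2\lambda E(v_{it}\bar{v}_i)$.

    Now, $\sigma_v^2 = E(v_{it}^2) = \sigma_a^2 + \sigma_u^2$ and $E(v_{it}\bar{v}_i) = T^{-1}\sum_{s=1}^{T} E(v_{it}v_{is}) = T^{-1}[\sigma_a^2 + \sigma_a^2 + \ldots + (\sigma_a^2 + \sigma_u^2) + \ldots + \sigma_a^2] = \sigma_a^2 + \sigma_u^2/T$. Therefore, $E(\bar{v}_i^2) = T^{-1}\sum_{t=1}^{T} E(v_{it}\bar{v}_i) = \sigma_a^2 + \sigma_u^2/T$. Now, we can collect terms:

    $Var(v_{it} - \lambda\bar{v}_i) = (\sigma_a^2 + \sigma_u^2) - 2\lambda(\sigma_a^2 + \sigma_u^2/T) + \lambda^2(\sigma_a^2 + \sigma_u^2/T).$

    Now, it is convenient to write $\lambda = 1 - \sqrt{\eta}/\sqrt{\gamma}$, where $\eta \equiv \sigma_u^2/T$ and $\gamma \equiv \sigma_a^2 + \sigma_u^2/T$. Then

    $Var(v_{it} - \lambda\bar{v}_i) = (\sigma_a^2 + \sigma_u^2) - 2\lambda(\sigma_a^2 + \sigma_u^2/T) + \lambda^2(\sigma_a^2 + \sigma_u^2/T)$

    $= (\sigma_a^2 + \sigma_u^2) - 2(1 - \sqrt{\eta}/\sqrt{\gamma})\gamma + (1 - \sqrt{\eta}/\sqrt{\gamma})^2\gamma$

    $= (\sigma_a^2 + \sigma_u^2) - 2\gamma + 2\sqrt{\eta}\sqrt{\gamma} + (1 - 2\sqrt{\eta}/\sqrt{\gamma} + \eta/\gamma)\gamma$

    $= (\sigma_a^2 + \sigma_u^2) - 2\gamma + 2\sqrt{\eta}\sqrt{\gamma} + \gamma - 2\sqrt{\eta}\sqrt{\gamma} + \eta$

    $= (\sigma_a^2 + \sigma_u^2) + \eta - \gamma = \sigma_u^2.$

This is what we wanted to show.

     (iii) We must show that $E(e_{it}e_{is}) = 0$ for $t \neq s$. Now $E(e_{it}e_{is}) = E[(v_{it} - \lambda\bar{v}_i)(v_{is} - \lambda\bar{v}_i)] = E(v_{it}v_{is}) - \lambda E(\bar{v}_i v_{is}) - \lambda E(v_{it}\bar{v}_i) + \lambda^2 E(\bar{v}_i^2) = \sigma_a^2 - 2\lambda(\sigma_a^2 + \sigma_u^2/T) + \lambda^2 E(\bar{v}_i^2) = \sigma_a^2 - 2\lambda(\sigma_a^2 + \sigma_u^2/T) + \lambda^2(\sigma_a^2 + \sigma_u^2/T)$. The rest of the proof is very similar to part (ii):

    $E(e_{it}e_{is}) = \sigma_a^2 - 2\lambda(\sigma_a^2 + \sigma_u^2/T) + \lambda^2(\sigma_a^2 + \sigma_u^2/T)$

    $= \sigma_a^2 - 2(1 - \sqrt{\eta}/\sqrt{\gamma})\gamma + (1 - \sqrt{\eta}/\sqrt{\gamma})^2\gamma$

    $= \sigma_a^2 - 2\gamma + 2\sqrt{\eta}\sqrt{\gamma} + (1 - 2\sqrt{\eta}/\sqrt{\gamma} + \eta/\gamma)\gamma$

    $= \sigma_a^2 - 2\gamma + 2\sqrt{\eta}\sqrt{\gamma} + \gamma - 2\sqrt{\eta}\sqrt{\gamma} + \eta$

    $= \sigma_a^2 + \eta - \gamma = 0.$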
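    Parts (ii) and (iii) of 14.3 can also be verified numerically in matrix form. The sketch below builds the variance-covariance matrix of the composite errors for one cross-sectional unit, applies the quasi-demeaning transformation, and checks that the result is $\sigma_u^2$ times the identity matrix. The variance values and T are arbitrary assumptions.

```python
import numpy as np

sigma_a2, sigma_u2, T = 2.0, 1.5, 4     # illustrative values (assumed)
lam = 1.0 - np.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))

ones = np.ones((T, T))
omega = sigma_u2 * np.eye(T) + sigma_a2 * ones   # Var-cov matrix of (v_i1, ..., v_iT)
Q = np.eye(T) - lam * ones / T                   # maps v_i to e_i = v_i - lam * vbar_i
cov_e = Q @ omega @ Q.T                          # Var-cov matrix of (e_i1, ..., e_iT)

print(np.round(cov_e, 10))
```

    The diagonal entries equal $\sigma_u^2$ (part (ii)) and the off-diagonal entries are zero (part (iii)), which is why OLS on the quasi-demeaned data is the GLS estimator.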

    14.4 (i) Men’s athletics are still the most prominent, although women’s sports, especially basketball but also gymnastics, softball, and volleyball, are very popular at some universities. Winning percentages for football and men’s and women’s basketball are good possibilities, as well as indicators for whether teams won conference championships, went to a visible bowl game (football), or did well in the NCAA basketball tournament (such as making the Sweet 16). We must be sure that we use measures of athletic success that are available prior to application deadlines. So, we would probably use football success from the previous school year; basketball success might have to be lagged one more year.

     (ii) Tuition could be important: ceteris paribus, higher tuition should mean fewer applications. Measures of university quality that change over time, such as student/faculty ratios or faculty grant money, could be important.


     (iii) An unobserved effects model is

    $\log(apps_{it}) = \delta_1 d90_t + \delta_2 d95_t + \beta_1 athsucc_{it} + \beta_2 \log(tuition_{it}) + \ldots + a_i + u_{it}, \quad t = 1, 2, 3.$

    The variable $athsucc_{it}$ is shorthand for a measure of athletic success; we might include several measures. If, for example, $athsucc_{it}$ is football winning percentage, then $100\beta_1$ is the percentage change in applications given a one percentage point increase in winning percentage. It is likely that $a_i$ is correlated with athletic success, tuition, and so on, so fixed effects estimation is appropriate. Alternatively, we could first difference to remove $a_i$, as discussed in Chapter 13.

    14.5 (i) For each student we have several measures of performance, typically three or four, the number of classes taken by a student that have final exams. When we specify an equation for each standardized final exam score, the errors in the different equations for the same student are certain to be correlated: students who have more (unobserved) ability tend to do better on all tests.

     (ii) An unobserved effects model is

    $score_{sc} = \theta_c + \beta_1 atndrte_{sc} + \beta_2 major_{sc} + \beta_3 SAT_s + \beta_4 cumGPA_s + a_s + u_{sc},$

    where $a_s$ is the unobserved student effect. Because SAT score and cumulative GPA depend only on the student, and not on the particular class he/she is taking, these do not have a c subscript. The attendance rates do generally vary across class, as does the indicator for whether a class is in the student’s major. The term $\theta_c$ denotes different intercepts for different classes. Unlike with a panel data set, where time is the natural ordering of the data within each cross-sectional unit, and the aggregate time effects apply to all units, intercepts for the different classes may not be needed. If all students took the same set of classes then this is similar to a panel data set, and we would want to put in different class intercepts. But with students taking different courses, the class we label as “1” for student A need have nothing to do with class “1” for student B. Thus, the different class intercepts based on arbitrarily ordering the classes for each student probably are not needed. We can replace $\theta_c$ with $\theta_0$, an intercept constant across classes.

     (iii) Maintaining the assumption that the idiosyncratic error, $u_{sc}$, is uncorrelated with all explanatory variables, we need the unobserved student heterogeneity, $a_s$, to be uncorrelated with $atndrte_{sc}$. The inclusion of SAT score and cumulative GPA should help in this regard, as $a_s$ is the part of ability that is not captured by $SAT_s$ and $cumGPA_s$. In other words, controlling for $SAT_s$ and $cumGPA_s$ could be enough to obtain the ceteris paribus effect of class attendance.

     (iv) If $SAT_s$ and $cumGPA_s$ are not sufficient controls for student ability and motivation, $a_s$ is correlated with $atndrte_{sc}$, and this would cause pooled OLS to be biased and inconsistent. We could use fixed effects instead. Within each student we compute the demeaned data, where, for each student, the means are computed across classes. The variables $SAT_s$ and $cumGPA_s$ drop out of the analysis.
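    The reason student-level variables drop out is mechanical, as the following sketch shows. The toy data (three students with three, two, and four classes) and the helper name are hypothetical; only the demeaning step mirrors the text.

```python
import numpy as np

# Hypothetical toy data: each entry is one (student, class) observation.
student = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
atndrte = np.array([90., 75., 100., 60., 80., 95., 85., 70., 100.])
sat = np.array([1100., 1100., 1100., 950., 950., 1300., 1300., 1300., 1300.])

def demean_within(x, groups):
    """Subtract each student's own mean (computed across classes) from x."""
    sums = np.bincount(groups, weights=x)
    counts = np.bincount(groups)
    return x - (sums / counts)[groups]

print(demean_within(atndrte, student))  # varies across classes, so it survives
print(demean_within(sat, student))      # constant within student, so all zeros
```

    Because the demeaned SAT column is identically zero, its coefficient cannot be estimated by fixed effects; the same holds for cumulative GPA.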

    14.6 (i) The fully robust standard errors are larger in each case, roughly double for the time-constant regressors educ, black, and hispan. On the time-varying explanatory variables married and union, the fully robust standard errors are roughly 60 percent larger. The differences are smaller for exper and $exper^2$ but hardly trivial. We expect this if we think the composite error term, $v_{it}$, contains an unobserved effect, $a_i$. This induces positive serial correlation and, as we saw in Section 12.1 for time series, the usual OLS standard errors tend to understate the actual sampling variation in the OLS estimates. The same holds true for pooled OLS with panel data.

     (ii) On the time-constant explanatory variables educ, black, and hispan, the RE standard errors and the robust standard errors for pooled OLS are roughly the same. (The coefficient estimates are very similar, too.) The main differences arise in the standard errors (and coefficients) on the time-varying explanatory variables. For example, the RE standard errors on the married and union coefficients are .017 and .018, respectively, compared with the robust standard errors for pooled OLS of .026 and .027. We expect this to be true because, under the RE assumptions, RE is more efficient than pooled OLS.


C14.1 (i) This is done in Computer Exercise 13.5(i).

     (ii) See Computer Exercise 13.5(ii).

     (iii) See Computer Exercise 13.5(iii).

     (iv) This is the only new part. The fixed effects estimates, reported in equation form, are

    $\widehat{\log(rent_{it})} = \underset{(.037)}{.386}\, y90_t + \underset{(.088)}{.072}\, \log(pop_{it}) + \underset{(.066)}{.310}\, \log(avginc_{it}) + \underset{(.0041)}{.0112}\, pctstu_{it}$

     N = 64, T = 2.

    (There are N = 64 cities and T = 2 years.) We do not report an intercept because it gets removed by the time demeaning. The coefficient on $y90_t$ is identical to the intercept from the first difference estimation, and the slope coefficients and standard errors are identical to first differencing. We do not report an R-squared because none is comparable to the R-squared obtained from first differencing.
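    The equality of the fixed effects and first-difference point estimates when T = 2 can be illustrated on simulated data. This sketch does not use the rental data; the sample size, coefficients, and error distributions are assumptions made only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500
a = rng.normal(size=N)                            # unobserved effect
x = rng.normal(size=(N, 2)) + a[:, None]          # regressor correlated with a_i
d2 = np.array([0.0, 1.0])                         # second-period dummy
y = 0.3 * d2 + 0.7 * x + a[:, None] + rng.normal(size=(N, 2))

# First differencing: regress dy on a constant and dx.
dy, dx = y[:, 1] - y[:, 0], x[:, 1] - x[:, 0]
fd = np.linalg.lstsq(np.column_stack([np.ones(N), dx]), dy, rcond=None)[0]

# Fixed effects: time-demean y, the period dummy, and x, then pool with no intercept.
yd = (y - y.mean(axis=1, keepdims=True)).ravel()
xd = (x - x.mean(axis=1, keepdims=True)).ravel()
dd = np.tile(d2 - d2.mean(), N)
fe = np.linalg.lstsq(np.column_stack([dd, xd]), yd, rcond=None)[0]

print("FD (intercept, slope):", fd)
print("FE (dummy, slope):    ", fe)
```

    The FD intercept matches the FE coefficient on the period dummy, and the two slopes agree up to rounding error, which is the pattern reported for the rental equation.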

    [Instructor’s Note: Some econometrics packages do report an intercept for fixed effects estimation; if so, it is usually the average of the estimated intercepts for the cross-sectional units, and it is not especially informative. If one obtains the FE estimates via the dummy variable regression, an intercept is reported for the base group, which is usually an arbitrarily chosen cross-sectional unit.]

C14.2 (i) We report the fixed effects estimates in equation form as


    $\widehat{\log(crmrte_{it})} = \underset{(.022)}{.013}\, d82_t - \underset{(.021)}{.079}\, d83_t - \underset{(.022)}{.118}\, d84_t - \underset{(.022)}{.112}\, d85_t$

    $\quad - \underset{(.021)}{.082}\, d86_t - \underset{(.021)}{.040}\, d87_t - \underset{(.032)}{.360}\, \log(prbarr_{it}) - \underset{(.021)}{.286}\, \log(prbconv_{it})$

    $\quad - \underset{(.032)}{.183}\, \log(prbpris_{it}) - \underset{(.0264)}{.0045}\, \log(avgsen_{it}) + \underset{(.026)}{.424}\, \log(polpc_{it})$

     N = 90, T = 7.

    There is no intercept because it gets swept away in the time demeaning. If your econometrics package reports a constant or intercept, it is choosing one of the cross-sectional units as the base group, and then the overall intercept is for the base unit in the base year. This overall intercept is not very informative because, without obtaining each $\hat{a}_i$, we cannot compare across units.

     Remember that the coefficients on the year dummies are not directly comparable with those in the first-differenced equation because we did not difference the year dummies in (13.33). The fixed effects estimates are unbiased estimators of the parameters on the time dummies in the original model.

     The first-difference and fixed effects slope estimates are broadly consistent. The variables that are significant with first differencing are significant in the FE estimation, and the signs are all the same. The magnitudes are also similar, although, with the exception of the insignificant variable log(avgsen), the FE estimates are all larger in absolute value. But we conclude that the estimates across the two methods paint a similar picture.

     (ii) When the nine log wage variables are added and the equation is estimated by fixed effects, very little of importance changes on the criminal justice variables. The following table contains the new estimates and standard errors.

    Independent Variable    Coefficient    Standard Error
    log(prbarr)             -.356          .032
    log(prbconv)            -.286          .021
    log(prbpris)            -.175          .032
    log(avgsen)             -.0029         .026
    log(polpc)               .423          .026

    The changes in these estimates are minor, even though the wage variables are jointly significant. The F statistic, with 9 and $N(T - 1) - k = 90(6) - 20 = 520$ df, is $F \approx 2.47$ with p-value $\approx .0090$.

    C14.3 (i) 135 firms are used in the FE estimation. Because there are three years, we would have a total of 405 observations if each firm had data on all variables for all three years. Instead, due to missing data, we can use only 390 observations in the FE estimation. The fixed effects estimates are


    $\widehat{hrsemp_{it}} = -\underset{(1.98)}{1.10}\, d88_t + \underset{(2.48)}{4.09}\, d89_t + \underset{(2.86)}{34.23}\, grant_{it} + \underset{(4.127)}{.504}\, grant_{i,t-1} - \underset{(4.288)}{.176}\, \log(employ_{it})$

     n = 390, N = 135, T = 3.

     (ii) The coefficient on grant means that if a firm received a grant for the current year, it trained each worker an average of 34.2 hours more than it would have otherwise. This is a practically large effect, and the t statistic is very large.

     (iii) Since a grant last year was used to pay for training last year, it is perhaps not surprising that the grant does not carry over into more training this year. It would if inertia played a role in training workers.

     (iv) The coefficient on the employees variable is very small: a 10% increase in employ lowers predicted hours per employee by only about .018. [Recall: $\Delta\widehat{hrsemp} \approx -(.176/100)(\%\Delta employ)$.] This is very small, and the t statistic is practically zero.

C14.4 (i) Write the equation for times t and t - 1 as

    $\log(uclms_{it}) = a_i + c_i t + \beta_1 ez_{it} + u_{it},$

    $\log(uclms_{i,t-1}) = a_i + c_i(t - 1) + \beta_1 ez_{i,t-1} + u_{i,t-1}$

    and subtract the second equation from the first. The $a_i$ are eliminated and $c_i t - c_i(t - 1) = c_i$. So, for each $t \geq 2$, we have

    $\Delta\log(uclms_{it}) = c_i + \beta_1 \Delta ez_{it} + \Delta u_{it}.$
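    The two-step logic (difference to remove $a_i$, then apply fixed effects to remove $c_i$) can be traced on simulated data. In this sketch the idiosyncratic error is set to zero so that the algebra holds exactly; the sample sizes, the value of $\beta_1$, and the zone indicator are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, beta1 = 50, 6, -0.2                       # illustrative values (assumed)
a = rng.normal(size=(N, 1))                     # city effect a_i
c = rng.normal(size=(N, 1))                     # city-specific trend c_i
t = np.arange(1, T + 1)
ez = (rng.random((N, T)) < 0.3).astype(float)   # hypothetical zone indicator
y = a + c * t + beta1 * ez                      # u_it set to zero for exact algebra

# First differencing removes a_i and turns the trend c_i * t into the level c_i.
dy, dez = np.diff(y, axis=1), np.diff(ez, axis=1)
print(np.allclose(dy - beta1 * dez, c))

# Time-demeaning the differenced data (FE on the differences) removes c_i as well.
dy_dm = dy - dy.mean(axis=1, keepdims=True)
dez_dm = dez - dez.mean(axis=1, keepdims=True)
beta1_hat = (dez_dm * dy_dm).sum() / (dez_dm ** 2).sum()
print(beta1_hat)
```

    With no idiosyncratic noise the final estimate reproduces $\beta_1$ exactly, which shows that neither $a_i$ nor $c_i$ contaminates the estimate.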

     (ii) Because the differenced equation contains the fixed effect $c_i$, we estimate it by FE. We get $\hat{\beta}_1 = -.251$, $se(\hat{\beta}_1) = .121$. The estimate is actually larger in magnitude than we obtain in Example 13.8 [where $\hat{\beta}_1 = -.182$, $se(\hat{\beta}_1) = .078$], but we have not yet included year dummies. In any case, the estimated effect of an EZ is still large and statistically significant.

     (iii) Adding the year dummies reduces the estimated EZ effect, and makes it more comparable to what we obtained without $c_i t$ in the model. Using FE on the first-differenced equation gives $\hat{\beta}_1 = -.192$, $se(\hat{\beta}_1) = .085$, which is fairly similar to the estimates without the city-specific trends.

    C14.5 (i) Different occupations are unionized at different rates, and wages also differ by occupation. Therefore, if we omit binary indicators for occupation, the union wage differential may simply be picking up wage differences across occupations. Because some people change occupation over the period, we should include these in our analysis.


     (ii) Because the nine occupational categories (occ1 through occ9) are exhaustive, we must choose one as the base group. Of course the group we choose does not affect the estimated union wage differential. The fixed effect estimate on union, to four decimal places, is .0804 with standard error = .0194. There is practically no difference between this estimate and standard error and the estimate and standard error without the occupational controls ($\hat{\beta}_{union} = .0800$, se = .0193).

    C14.6 First, the random effects estimate on $union_{it}$ becomes .174 (se $\approx$ .031), while the coefficient on the interaction term $union_{it} \cdot t$ is about -.0155 (se $\approx$ .0057). Therefore, the interaction between the union dummy and time trend is very statistically significant (t statistic $\approx$ -2.72), and is important economically. While at a given point in time there is a large union differential, the projected wage growth is less for unionized workers (on the order of 1.6% less per year).

     The fixed effects estimate on $union_{it}$ becomes .148 (se $\approx$ .031), while the coefficient on the interaction $union_{it} \cdot t$ is about -.0157 (se $\approx$ .0057). Therefore, the story is very similar to that for the random effects estimates.

C14.7 (i) If there is a deterrent effect then $\beta_1 < 0$. The sign of $\beta_2$ is not entirely obvious, although one possibility is that a better economy means less crime in general, including violent crime (such as drug dealing) that would lead to fewer murders. This would imply $\beta_2 > 0$.

     (ii) The pooled OLS estimates using 1990 and 1993 are

    $\widehat{mrdrte_{it}} = -\underset{(4.43)}{5.28} - \underset{(2.14)}{2.07}\, d93_t + \underset{(.263)}{.128}\, exec_{it} + \underset{(0.78)}{2.53}\, unem_{it}$

     N = 51, T = 2, $R^2$ = .102

    There is no evidence of a deterrent effect, as the coefficient on exec is actually positive (though not statistically significant).

     (iii) The first-differenced equation is

    $\widehat{\Delta mrdrte_i} = \underset{(.209)}{.413} - \underset{(.043)}{.104}\, \Delta exec_i - \underset{(.159)}{.067}\, \Delta unem_i$

     n = 51, $R^2$ = .110

    Now, there is a statistically significant deterrent effect: 10 more executions is estimated to reduce the murder rate by 1.04, or one murder per 100,000 people. Is this a large effect? Executions are relatively rare in most states, but murder rates are relatively low on average, too. In 1993, the average murder rate was about 8.7; a reduction of one would be nontrivial. For the (unknown) people whose lives might be saved via a deterrent effect, it would seem important.


     (iv) The heteroskedasticity-robust standard error for $\Delta exec_i$ is .017. Somewhat surprisingly, this is well below the nonrobust standard error. If we use the robust standard error, the statistical evidence for the deterrent effect is quite strong ($t \approx -6.1$). See also Computer Exercise 13.12.
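    The t statistics behind parts (iii) and (iv) are simple ratios of the reported coefficient to each reported standard error. The short sketch below reproduces them; the helper name is hypothetical, and the coefficient is entered with the negative sign implied by a deterrent effect.

```python
def t_stat(coef, se):
    """Ratio of a coefficient estimate to its standard error."""
    return coef / se

coef_exec = -0.104                         # first-difference estimate on the change in exec
print(round(t_stat(coef_exec, 0.043), 2))  # usual standard error: -2.42
print(round(t_stat(coef_exec, 0.017), 2))  # heteroskedasticity-robust: -6.12
```

    Both versions reject a zero effect at the 5% level, but the robust standard error makes the evidence look much stronger.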

     (v) Texas had by far the largest value of exec, 34. The next highest state was Virginia, with 11. These are three-year totals.

     (vi) Without Texas in the estimation, we get the following, with the usual standard errors in parentheses and heteroskedasticity-robust standard errors in brackets:

    $\widehat{\Delta mrdrte_i} = \underset{(.211)\,[.200]}{.413} - \underset{(.105)\,[.079]}{.067}\, \Delta exec_i - \underset{(.160)\,[.146]}{.070}\, \Delta unem_i$

     n = 50, $R^2$ = .013

    Now the estimated deterrent effect is smaller. Perhaps more importantly, the standard error on $\Delta exec_i$ has increased by a substantial amount. This happens because when we drop Texas, we lose much of the variation in the key explanatory variable, $\Delta exec_i$.

     (vii) When we apply fixed effects using all three years of data and all states we get

    $\widehat{mrdrte_{it}} = \underset{(.75)}{1.73}\, d90_t + \underset{(.71)}{1.70}\, d93_t - \underset{(.160)}{.054}\, exec_{it} + \underset{(.285)}{.395}\, unem_{it}$

     N = 51, T = 3, $R^2$ = .068

    The size of the deterrent effect is only about half as big as when 1987 is not used. Plus, the t statistic, about -.34, is very small. The earlier finding of a deterrent effect is not robust to the time period used. Oddly, adding another year of data causes the standard error on the exec coefficient to markedly increase.

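C14.8 (i) The pooled OLS estimates are

    $\widehat{math4} = -\underset{(10.30)}{31.66} + \underset{(.74)}{6.38}\, y94 + \underset{(.79)}{18.65}\, y95 + \underset{(.77)}{18.03}\, y96 + \underset{(.78)}{15.34}\, y97 + \underset{(.78)}{30.40}\, y98$

    $\quad + \underset{(2.428)}{.534}\, \log(rexpp) + \underset{(2.31)}{9.05}\, \log(rexpp_{-1}) + \underset{(.205)}{.593}\, \log(enrol) - \underset{(.014)}{.407}\, lunch$

     N = 550, T = 6, $R^2$ = .505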

     (ii) The lunch variable is the percent of students in the district eligible for free or reduced-price lunches, which is determined by poverty status. Therefore, lunch is effectively a poverty rate. We see that the district poverty rate has a large impact on the math pass rate: a one percentage point increase in lunch reduces the pass rate by about .41 percentage points.

     (iii) I ran the pooled OLS regression $\hat{v}_{it}$ on $\hat{v}_{i,t-1}$ using the years 1994 through 1998 (since the residuals are first available for 1993). The coefficient on $\hat{v}_{i,t-1}$ is $\hat{\rho} = .504$ (se = .017), so there is very strong evidence of positive serial correlation. There are many reasons for positive serial correlation. In the context of panel data, it indicates the presence of a time-constant unobserved effect, $a_i$.

     (iv) The fixed effects estimates are

    $\widehat{math4} = \underset{(.56)}{6.18}\, y94 + \underset{(.69)}{18.09}\, y95 + \underset{(.76)}{17.94}\, y96 + \underset{(.80)}{15.19}\, y97 + \underset{(.84)}{29.88}\, y98$

    $\quad - \underset{(2.458)}{.411}\, \log(rexpp) + \underset{(2.37)}{7.00}\, \log(rexpp_{-1}) + \underset{(1.100)}{.245}\, \log(enrol) + \underset{(.051)}{.062}\, lunch$

     N = 550, T = 6, $R^2$ = .603

    The coefficient on the lagged spending variable has gotten somewhat smaller, but its t statistic is still almost three. Therefore, there is still evidence of a lagged spending effect after controlling for unobserved district effects.

     (v) The change in the coefficient and significance on the lunch variable is most dramatic. Both enrol and lunch are slow to change over time, which means that their effects are largely captured by the unobserved effect, $a_i$. Plus, because of the time demeaning, their coefficients are hard to estimate. The spending coefficients can be estimated more precisely because of a policy change during this period, where spending shifted markedly in 1994 after the passage of Proposal A in Michigan, which changed the way schools were funded.

     (vi) The estimated long-run spending effect is $\hat{\theta}_1 = 6.59$, $se(\hat{\theta}_1) = 2.64$.
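    The long-run effect is the sum of the coefficients on current and lagged spending, and its standard error can be read off a reparameterized regression in which the lag is replaced by the lag minus the current value. The sketch below shows that trick with plain OLS on simulated data; the variable names and data are assumptions, and the actual exercise applies the same substitution within fixed effects.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=n)        # stand-in for log(rexpp)
x_lag = rng.normal(size=n)    # stand-in for lagged log(rexpp)
y = -0.4 * x + 7.0 * x_lag + rng.normal(size=n)

ones = np.ones(n)
b = np.linalg.lstsq(np.column_stack([ones, x, x_lag]), y, rcond=None)[0]
g = np.linalg.lstsq(np.column_stack([ones, x, x_lag - x]), y, rcond=None)[0]

# The coefficient on x in the reparameterized regression is the long-run effect.
print(b[1] + b[2], g[1])

# Point estimate implied by the reported FE coefficients: -.411 + 7.00.
print(round(-0.411 + 7.00, 2))
```

    Because the long-run effect appears as a single coefficient in the second regression, any regression package reports its standard error directly.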

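C14.9 (i) The OLS estimates are

    $\widehat{pctstck} = \underset{(55.17)}{128.54} + \underset{(6.23)}{11.74}\, choice + \underset{(7.23)}{14.34}\, prftshr + \underset{(6.77)}{1.45}\, female - \underset{(.78)}{1.50}\, age$

    $\quad + \underset{(1.20)}{.70}\, educ - \underset{(14.23)}{15.29}\, finc25 + \underset{(14.69)}{.19}\, finc35 - \underset{(14.55)}{3.86}\, finc50$

    $\quad - \underset{(16.02)}{13.75}\, finc75 - \underset{(15.72)}{2.69}\, finc100 - \underset{(17.80)}{25.05}\, finc101 - \underset{(.0128)}{.0026}\, wealth89$

    $\quad + 6.67\, stckin89 - 7.50\, irain89$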

