Economics 1123

By Mario Hawkins,2014-04-30 07:32

    Nonlinear Regression Functions

    (SW Ch. 6)

    • Everything so far has been linear in the X’s

    • The approximation that the regression function is linear might be good for some variables, but not for others.

    • The multiple regression framework can be extended to handle regression functions that are nonlinear in one or more X.


The relation between TestScore and STR looks approximately linear.

But the relation between TestScore and average district income looks nonlinear.


If a relation between Y and X is nonlinear:

    • The effect on Y of a change in X depends on the value of X; that is, the marginal effect of X is not constant.

    • A linear regression is mis-specified: the functional form is wrong.

    • The estimator of the effect on Y of X is biased: it needn’t even be right on average.

    • The solution is to estimate a regression function that is nonlinear in X.
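To make these points concrete, here is a minimal sketch in Python using made-up, noiseless data. The function `true_f`, its coefficients, and the grid of X values are illustrative assumptions, not the California test score data. It fits a straight line by OLS to a quadratic relation and compares the fitted slope with the true effect of a one-unit change in X at low and high X.

```python
# Illustrative only: a made-up concave relation, not the TestScore data.
def true_f(x):
    return 20.0 * x - x ** 2

xs = [float(v) for v in range(11)]   # X = 0, 1, ..., 10
ys = [true_f(x) for x in xs]

# OLS slope and intercept for the (mis-specified) linear fit Y = a + b X.
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

def unit_effect(x):
    """True change in Y when X goes from x to x + 1."""
    return true_f(x + 1) - true_f(x)

effect_low = unit_effect(1.0)    # one-unit change starting at low X
effect_high = unit_effect(8.0)   # one-unit change starting at high X
print(slope, effect_low, effect_high)
```

In this toy setup the linear fit reports a single slope of 10, while the true one-unit effect is 17 at X = 1 and 3 at X = 8. The single slope understates the effect at low X and overstates it at high X.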


     The General Nonlinear Population Regression Function

    $Y_i = f(X_{1i}, X_{2i}, \ldots, X_{ki}) + u_i, \quad i = 1, \ldots, n$


    1. $E(u_i \mid X_{1i}, X_{2i}, \ldots, X_{ki}) = 0$ (same); implies that f is the conditional expectation of Y given the X’s.

    2. $(X_{1i}, \ldots, X_{ki}, Y_i)$ are i.i.d. (same).

    3. “enough” moments exist (same idea; the precise statement depends on specific f).

    4. No perfect multicollinearity (same idea; the precise statement depends on the specific f).
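The step behind assumption 1 can be written out. Taking conditional expectations of both sides of the population regression function and applying assumption 1 gives the first line below. The second line is the resulting expected effect on Y of changing $X_1$ by $\Delta X_1$ while holding the other regressors constant (the notation $\Delta X_1$ is introduced here for illustration).

```latex
\[
E(Y_i \mid X_{1i}, \ldots, X_{ki})
  = f(X_{1i}, \ldots, X_{ki}) + E(u_i \mid X_{1i}, \ldots, X_{ki})
  = f(X_{1i}, \ldots, X_{ki})
\]
\[
\Delta Y = f(X_1 + \Delta X_1, X_2, \ldots, X_k) - f(X_1, X_2, \ldots, X_k)
\]
```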



    Nonlinear Functions of a Single Independent Variable

    (SW Section 6.2)

We’ll look at two complementary approaches:

    1. Polynomials in X

    The population regression function is approximated by a quadratic, cubic, or higher-degree polynomial

    2. Logarithmic transformations

    • Y and/or X is transformed by taking its logarithm

    • this gives a “percentages” interpretation that makes sense in many applications
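The “percentages” interpretation rests on the fact that a small change in the logarithm of X is approximately the proportional change in X: ln(X + ΔX) − ln(X) ≈ ΔX / X. A quick numerical check in Python follows; the values of X and ΔX are arbitrary illustrations.

```python
import math

def log_change(x, dx):
    """Exact change in ln(X) when X moves from x to x + dx."""
    return math.log(x + dx) - math.log(x)

def proportional_change(x, dx):
    """Proportional change dx / x, which the log change approximates."""
    return dx / x

# Compare the two for a 1% change and for a 50% change in X.
small = (log_change(100.0, 1.0), proportional_change(100.0, 1.0))
large = (log_change(100.0, 50.0), proportional_change(100.0, 50.0))
print(small, large)
```

The approximation is close for the 1% change and noticeably worse for the 50% change, so the percentage reading of a log specification is best reserved for small changes.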


1. Polynomials in X

    Approximate the population regression function by a polynomial:


    $Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \cdots + \beta_r X_i^r + u_i$

    • This is just the linear multiple regression model, except that the regressors are powers of X!

    • Estimation, hypothesis testing, etc. proceed as in the multiple regression model using OLS

    • The coefficients are difficult to interpret, but the regression function itself is interpretable
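The idea that a polynomial regression is just multiple regression on powers of X can be sketched in plain Python. Everything here is an illustrative assumption: the helper names `design_matrix`, `solve`, and `ols`, and the noiseless made-up data. The sketch solves the normal equations directly, which is fine for a small example but is not how statistical packages compute OLS.

```python
def design_matrix(x, r):
    """Rows [1, x_i, x_i**2, ..., x_i**r]: the regressors are powers of X."""
    return [[xi ** p for p in range(r + 1)] for xi in x]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    return solve(XtX, Xty)

# Made-up data generated from a known quadratic, with no error term.
x = [float(v) for v in range(1, 11)]
y = [2.0 + 3.0 * xi - 0.5 * xi ** 2 for xi in x]
beta = ols(design_matrix(x, 2), y)   # estimates of (beta_0, beta_1, beta_2)
print(beta)
```

Because the data contain no noise, the fit recovers the generating coefficients 2, 3, and -0.5 up to rounding error. Changing `r` to 3 would add a cubic term with no other change to the estimation step.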


    Example: the relation between TestScore and Income

    $\text{Income}_i$ = average district income in the $i$th district (thousands of dollars per capita)

    Quadratic specification:


    $\text{TestScore}_i = \beta_0 + \beta_1 \text{Income}_i + \beta_2 (\text{Income}_i)^2 + u_i$

    Cubic specification:


    $\text{TestScore}_i = \beta_0 + \beta_1 \text{Income}_i + \beta_2 (\text{Income}_i)^2 + \beta_3 (\text{Income}_i)^3 + u_i$
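One way to see why the individual coefficients are hard to interpret: in the quadratic specification, the expected change in TestScore from a change $\Delta\text{Income}$ depends on the starting level of Income. Differencing the quadratic regression function at $\text{Income} + \Delta\text{Income}$ and at $\text{Income}$ (with $\Delta\text{Income}$ as notation introduced here) gives:

```latex
\[
\Delta \text{TestScore}
  = \bigl[\beta_0 + \beta_1 (\text{Income} + \Delta\text{Income})
          + \beta_2 (\text{Income} + \Delta\text{Income})^2\bigr]
  - \bigl[\beta_0 + \beta_1 \text{Income} + \beta_2 \text{Income}^2\bigr]
\]
\[
  = \beta_1 \, \Delta\text{Income}
  + \beta_2 \bigl[ 2\, \text{Income} \, \Delta\text{Income} + (\Delta\text{Income})^2 \bigr]
\]
```

The term multiplying $\beta_2$ grows with Income, so neither $\beta_1$ nor $\beta_2$ alone is "the" effect of income. If $\beta_2 = 0$ the expression reduces to the constant effect $\beta_1 \, \Delta\text{Income}$ of the linear model.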


    Estimation of the quadratic specification in STATA

    #delimit ;

    generate avginc2 = avginc*avginc;    /* create a new regressor */

    reg testscr avginc avginc2, r;

    Regression with robust standard errors                 Number of obs =     420
                                                           F(  2,   417) =  428.52
                                                           Prob > F      =  0.0000
                                                           R-squared     =  0.5562
                                                           Root MSE      =  12.724

    ------------------------------------------------------------------------------
                 |               Robust
         testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          avginc |   3.850995   .2680941    14.36   0.000      3.32401    4.377979
         avginc2 |  -.0423085   .0047803    -8.85   0.000     -.051705   -.0329119
           _cons |   607.3017   2.901754   209.29   0.000     601.5978    613.0056
    ------------------------------------------------------------------------------


    The t-statistic on $\text{Income}^2$ is -8.85, so the hypothesis of linearity is rejected against the quadratic alternative at the 1% significance level.
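The reported estimates can be checked and put to use with a few lines of arithmetic. The Python sketch below copies the coefficients and standard error from the Stata output above, recomputes the t-statistic on the squared term, and evaluates the predicted change in TestScore for a one-unit ($1,000 per capita) increase in income at three starting levels. The variable names, the helper `predicted_change`, the starting levels 5, 25, and 45, and the large-sample 1% critical value of 2.58 are choices made here for illustration.

```python
# Estimates copied from the Stata output above.
b_income, b_income2 = 3.850995, -0.0423085
se_income2 = 0.0047803

# t-statistic for H0: the coefficient on Income^2 is zero (linearity).
t_income2 = b_income2 / se_income2
CRITICAL_1PCT = 2.58   # two-sided 1% critical value, normal approximation
reject_linearity = abs(t_income2) > CRITICAL_1PCT

def predicted_change(income_from, income_to):
    """Predicted change in TestScore; the intercept cancels in the difference."""
    return (b_income * (income_to - income_from)
            + b_income2 * (income_to ** 2 - income_from ** 2))

# Effect of one extra unit of income at low, middle, and high income levels.
changes = {start: predicted_change(start, start + 1) for start in (5, 25, 45)}
print(round(t_income2, 2), reject_linearity)
print({k: round(v, 2) for k, v in changes.items()})
```

With these estimates the predicted gain from an extra $1,000 of district income is about 3.4 points starting from $5,000, about 1.7 points starting from $25,000, and close to zero starting from $45,000. This is the sense in which the estimated regression function is interpretable even though its individual coefficients are not.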

