Sampling Variance of the OLS estimator

Introduction to Econometrics

  Ekki Syamsulhakim
  Undergraduate Program, Department of Economics, Universitas Padjadjaran

Sampling Variance of the OLS estimator

  • We know that, when the required assumptions hold, the OLS estimator $\hat{\beta}_j$ is not biased
  • The variance of $\hat{\beta}_j$ can be computed using the formula:
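The formula itself is missing from this copy of the slides. As a hedged reconstruction, assuming homoskedastic errors with variance $\sigma^2$, the standard textbook expression for the sampling variance of a slope estimator in a model with $k$ regressors is:

```latex
\operatorname{Var}(\hat{\beta}_j) = \frac{\sigma^2}{SST_j\,(1 - R_j^2)},
\qquad j = 1, \dots, k,
\qquad SST_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2
```

Here $R_j^2$ is the R-squared from regressing $x_j$ on all the other independent variables. Under perfect collinearity $R_j^2 = 1$ and the variance is undefined; with strong multicollinearity $R_j^2$ is close to one and the variance becomes large.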


  A MUST READ

  Estimator Variance, Perfect Collinearity, and Multicollinearity


  Estimator of $\sigma^2$

  Standard Error of $\hat{\beta}_j$
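The formulas on these slides did not survive in this copy. As a sketch under the usual homoskedasticity assumption, the standard textbook estimator of the error variance and the resulting standard error are:

```latex
\hat{\sigma}^2 = \frac{SSR}{n - k - 1},
\qquad
\operatorname{se}(\hat{\beta}_j) = \frac{\hat{\sigma}}{\sqrt{SST_j\,(1 - R_j^2)}}
```

$SSR$ is the sum of squared residuals and $n - k - 1$ is the residual degrees of freedom. $\hat{\sigma}$ is what Stata reports as Root MSE.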

  

Inference

  • We assume that the unobserved error $u$ is normally distributed in the population
    – Called the “Normality Assumption”

  Testing Hypotheses About a Single Population Parameter: The t-Test

  Hypothesis Testing

  • t-test or (later) F-test (individual coefficient vs. overall model tests)
  • Two-sided vs. one-sided test
    – Your hypothesis
    – Check the theory
    – Your research question

  • 2 methods
    – t-stat method
    – p-value method

  

  • The long steps:
    – State the null and alternative hypotheses
    – Choose the level of significance
    – For the t-stat method: observe the t-statistic and compute t-critical
    – For the p-value method: compute the p-value
    – State the decision rule
    – State the conclusion

  Regression

  reg rent room sqrm if rent<4000000 & sqrm<3000 & room<30

        Source |       SS           df       MS      Number of obs   =     11043
  -------------+----------------------------------   F( 2, 11040)    =    434.33
         Model |  5.4331e+14         2  2.7166e+14   Prob > F        =    0.0000
      Residual |  6.9051e+15     11040  6.2546e+11   R-squared       =    0.0729
  -------------+----------------------------------   Adj R-squared   =    0.0728
         Total |  7.4484e+15     11042  6.7455e+11   Root MSE        =   7.9e+05

  ------------------------------------------------------------------------------
          rent |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
  -------------+----------------------------------------------------------------
          room |   88841.68   3935.845    22.57   0.000     81126.72    96556.64
          sqrm |   735.5316   141.3077     5.21   0.000     458.5433     1012.52
         _cons |   277297.8   18998.01    14.60   0.000     240058.3    314537.4
  ------------------------------------------------------------------------------
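As a quick check on how the pieces of this output relate, the sketch below recomputes the residual degrees of freedom, Root MSE, R-squared, and the t column from the printed sums of squares, coefficients, and standard errors. The variable names are illustrative, and the inputs are the rounded figures from the output, so the results match only up to rounding.

```python
import math

# Figures copied from the regression output (rounded as printed there).
n, k = 11043, 2
ssr = 6.9051e15   # Residual SS
sst = 7.4484e15   # Total SS
coefs = {          # name: (coefficient, standard error)
    "room": (88841.68, 3935.845),
    "sqrm": (735.5316, 141.3077),
    "_cons": (277297.8, 18998.01),
}

df_resid = n - k - 1                  # residual degrees of freedom
sigma2_hat = ssr / df_resid           # estimated error variance (Residual MS)
root_mse = math.sqrt(sigma2_hat)      # reported as "Root MSE"
r_squared = 1 - ssr / sst

# Each t statistic is the coefficient divided by its standard error.
t_stats = {name: b / se for name, (b, se) in coefs.items()}

print(f"df = {df_resid}, Root MSE = {root_mse:.3g}, R-squared = {r_squared:.4f}")
for name, t in t_stats.items():
    print(f"t({name}) = {t:.2f}")
```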

  Two-sided t-test (example: t-stat approach)

  • State the hypotheses:
      $H_0: \beta_{room} = 0$ (number of rooms has no impact on rent)
      $H_A: \beta_{room} \neq 0$ (number of rooms has an impact on rent)
  • Choose the significance level: $\alpha = 0.05$
  • Observe the t-stat (the t column of the regression output): t-stat = 22.57
  • Compute t-crit: t-crit($\alpha$, df = n-k-1 = 11043-2-1 = 11040) = 1.960179
  • Rejection criterion: reject $H_0$ if |t-stat| > |t-crit|
  • Conclusion: since |t-stat| > |t-crit| (22.57 > 1.960179), we reject $H_0$.

  Therefore we have sufficient evidence that the number of rooms has an impact on rent
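The decision rule can be sketched in a few lines. This is only an illustration: it assumes that, with 11040 degrees of freedom, the standard normal quantile is an adequate stand-in for the exact t quantile (it gives about 1.960 rather than 1.960179), which lets the example run on the Python standard library alone.

```python
from statistics import NormalDist

alpha = 0.05
t_stat = 22.57            # t statistic for room, from the regression output
df = 11043 - 2 - 1        # n - k - 1

# Assumption: df is large, so the normal quantile approximates the t quantile.
# A two-sided test puts alpha/2 in each tail.
t_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(t_stat) > abs(t_crit)
print(f"df = {df}, t-crit ~ {t_crit:.3f}, reject H0: {reject_h0}")
```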

  Two-sided t-test (example: p-value approach)

  • State the hypotheses:
      $H_0: \beta_{room} = 0$ (number of rooms has no impact on rent)
      $H_A: \beta_{room} \neq 0$ (number of rooms has an impact on rent)
  • Choose the significance level: $\alpha = 0.05$
  • Compute (observe) the p-value: p-value = 0.0000…
  • Rejection criterion: reject $H_0$ if p-value < $\alpha$
  • Conclusion: since our p-value = 0.0000… is less than $\alpha$ = 0.05, we reject $H_0$.

  Therefore we have sufficient evidence that the number of rooms has an impact on rent
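To see why the software prints 0.000 for this p-value, the sketch below computes a two-sided p-value from a t statistic. It assumes a standard normal approximation to the t distribution (reasonable here because df = 11040); the helper name `two_sided_p_normal` is made up for this illustration.

```python
import math

def two_sided_p_normal(t_stat: float) -> float:
    """Two-sided p-value P(|Z| > |t|) under a standard normal approximation.

    Uses the identity 2 * (1 - Phi(x)) = erfc(x / sqrt(2)), which stays
    accurate far out in the tails.
    """
    return math.erfc(abs(t_stat) / math.sqrt(2))

alpha = 0.05
p_room = two_sided_p_normal(22.57)   # t statistic for room
p_sqrm = two_sided_p_normal(5.21)    # t statistic for sqrm

print(f"p(room) = {p_room:.3g}, reject H0: {p_room < alpha}")
print(f"p(sqrm) = {p_sqrm:.3g}, reject H0: {p_sqrm < alpha}")
```

Both values are far below 0.05, so they round to 0.000 at the three decimals shown in the output.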

  

One-sided t-test (example: t-stat approach)

  • As the number of rooms increases, it is sensible to think that the rent also increases (probably based on theory)
  • We can (should) use a one-tailed test
    – We must compute a new t-critical, because the value that goes with the output of STATA / GRETL / EVIEWS / EXCEL is the t-critical for a two-sided test

  

  • State the hypotheses:
      $H_0: \beta_{room} = 0$ (number of rooms has no impact on rent)
      $H_A: \beta_{room} > 0$ (number of rooms has a positive impact on rent)
  • Choose the significance level: $\alpha = 0.05$ (for computing the new t-crit, use $2\alpha = 0.10$, which gives about 1.64)
  • Observe the t-stat: t-stat = 22.57
  • Compute t-crit for the one-sided test: t-crit($2\alpha$, df = n-k-1 = 11043-2-1 = 11040) = 1.645 (positive side)
  • Rejection criterion: reject $H_0$ if t-stat > t-crit
  • Conclusion: since t-stat > t-crit (22.57 > 1.645), we reject $H_0$.

  Therefore we have sufficient evidence that the number of rooms has a positive impact on rent
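The "2α" trick can be checked numerically: a function that returns two-tailed critical values, called with 2α, gives the one-tailed critical value for α. The sketch assumes the normal approximation to the t distribution (df = 11040 is large), and `two_tailed_crit` is a hypothetical helper written for this example, not a function of any package.

```python
from statistics import NormalDist

def two_tailed_crit(prob: float) -> float:
    """Critical value c such that P(|Z| > c) = prob (normal approximation)."""
    return NormalDist().inv_cdf(1 - prob / 2)

alpha = 0.05
t_stat = 22.57

two_sided_crit = two_tailed_crit(alpha)       # about 1.960
one_sided_crit = two_tailed_crit(2 * alpha)   # about 1.645: all of alpha in one tail

# Right-tailed test: reject H0 only for large positive t statistics.
reject_h0 = t_stat > one_sided_crit
print(f"two-sided: {two_sided_crit:.3f}, one-sided: {one_sided_crit:.3f}, reject H0: {reject_h0}")
```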

  

One-sided t-test (example: p-value approach)

  • As the number of rooms increases, it is sensible to think that the rent also increases
  • We can (should) use a one-tailed test
    – We must divide the p-value by 2
  • State the hypotheses:
      $H_0: \beta_{room} = 0$ (number of rooms has no impact on rent)
      $H_A: \beta_{room} > 0$ (number of rooms has a positive impact on rent)
  • Choose the significance level: $\alpha = 0.05$
  • Compute (observe) the p-value: because we are doing a one-tailed test, the p-value given by the econometric software must be divided by 2 (this is valid here because the estimated coefficient has the sign stated in $H_A$); hence the calculated p-value/2 = 0.0000…
  • Rejection criterion: reject $H_0$ if p-value/2 < $\alpha$
  • Conclusion: since our p-value/2 = 0.0000… is less than $\alpha$ = 0.05, we reject $H_0$.

  Therefore we have sufficient evidence that the number of rooms has a positive impact on rent
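Halving the reported p-value is only correct when the estimate lies on the side of the alternative. The helper below, a hypothetical `one_sided_p` written for this illustration, makes that condition explicit for a symmetric test statistic such as t.

```python
def one_sided_p(two_sided_p: float, t_stat: float, alternative: str = "greater") -> float:
    """Convert a two-sided p-value into a one-sided one.

    alternative="greater" tests H_A: beta > 0; "less" tests H_A: beta < 0.
    If the t statistic has the sign predicted by H_A, the one-sided p-value
    is half the two-sided one; otherwise it is one minus that half.
    """
    half = two_sided_p / 2
    if alternative == "greater":
        return half if t_stat > 0 else 1 - half
    if alternative == "less":
        return half if t_stat < 0 else 1 - half
    raise ValueError("alternative must be 'greater' or 'less'")

# Illustrative numbers (not from the rent regression):
print(one_sided_p(0.04, 2.1))    # estimate agrees with H_A: beta > 0
print(one_sided_p(0.04, -2.1))   # estimate contradicts H_A: beta > 0
```

In the rent example the t statistic for room is positive (22.57), so dividing by 2 is appropriate.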

  

Testing Other Hypotheses About $\beta_j$

  • Consider a simple model relating the annual number of crimes on college campuses (crime) to student enrollment (enroll):
      $\log(crime) = \beta_0 + \beta_1 \log(enroll) + u$
  • This is a constant elasticity model, where $\beta_1$ is the elasticity of crime with respect to enrollment

  • It is not much use to test $H_0: \beta_1 = 0$, as we expect the total number of crimes to increase as the size of the campus increases
  • A more interesting hypothesis to test would be that the elasticity of crime with respect to enrollment is one:
      $H_0: \beta_1 = 1$
    – This means that a 1% increase in enrollment leads to, on average, a 1% increase in crime

  

  • A noteworthy alternative is $H_1: \beta_1 > 1$, which implies that a 1% increase in enrollment increases campus crime by more than 1%
  • If $\beta_1 > 1$, then, in a relative sense (not just an absolute sense), crime is more of a problem on larger campuses.

  • The estimated elasticity of crime with respect to enroll, 1.27, is in the direction of the alternative $\beta_1 > 1$.
  • But is there enough evidence to conclude that $\beta_1 > 1$?

  • In general, if the null is stated as $H_0: \beta_j = a_j$, where $a_j$ is our hypothesized value of $\beta_j$, then the appropriate t statistic is
      $t = (\hat{\beta}_j - a_j) / \operatorname{se}(\hat{\beta}_j)$
  • The usual t statistic is obtained when $a_j = 0$.

  • The correct t statistic is
      $t = (1.27 - 1)/0.11 = 0.27/0.11 \approx 2.45$
  • The one-sided 5% critical value for a t distribution with 95 df is about 1.66
  • So we clearly reject $\beta_1 = 1$ in favor of $\beta_1 > 1$ at the 5% level
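A small sketch of the general t statistic against a nonzero hypothesized value. The estimate 1.27 and the critical value 1.66 come from the slides. The standard error 0.11 is an assumption here: it is the value from the textbook example these slides appear to follow and is not legible in this copy. The function name `t_stat_against` is illustrative.

```python
def t_stat_against(beta_hat: float, hypothesized: float, se: float) -> float:
    """t statistic for H0: beta = hypothesized, i.e. (estimate - hypothesized) / se."""
    return (beta_hat - hypothesized) / se

beta_hat = 1.27   # estimated elasticity, from the slides
se = 0.11         # ASSUMPTION: standard error taken from the textbook example
t_crit = 1.66     # one-sided 5% critical value quoted in the slides

t_vs_one = t_stat_against(beta_hat, 1.0, se)    # tests H0: beta_1 = 1
t_vs_zero = t_stat_against(beta_hat, 0.0, se)   # the "usual" t statistic (H0: beta_1 = 0)

print(f"t against 1: {t_vs_one:.2f}, reject H0 in favor of beta_1 > 1: {t_vs_one > t_crit}")
print(f"t against 0: {t_vs_zero:.2f}")
```

The usual t statistic reported by software tests against zero, so it cannot be used directly for $H_0: \beta_1 = 1$.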

F-test (F-stat approach)

  • $H_0: \beta_1 = \beta_2 = 0$: all slope coefficients are zero (or: none of the independent variables affects the dependent variable; or: room and sqrm do not affect rent)
  • $H_A$: at least one $\beta_i$ is NOT zero (or: at least one independent variable affects rent)
  • F-stat = 434.33
  • F-crit($\alpha$ = 0.05, k = 2, n-k-1 = 11040) ≈ 3.00
  • Because F-stat > F-crit, reject $H_0$
  • Conclusion: we have sufficient evidence that at least one of our independent variables is useful in explaining house rent
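For this particular test the critical value and the p-value can be computed without statistical tables, because an F distribution with exactly 2 numerator degrees of freedom has a closed-form tail probability, P(F > f) = (1 + 2f/d2)^(-d2/2). The sketch below relies on that special case only; the helper names are illustrative, and a different number of restrictions would need a statistics library instead.

```python
import math

def f_sf_num2(f: float, d2: int) -> float:
    """P(F > f) for an F(2, d2) distribution (closed form, numerator df = 2 only)."""
    return math.exp(-(d2 / 2) * math.log1p(2 * f / d2))

def f_crit_num2(alpha: float, d2: int) -> float:
    """Critical value c with P(F > c) = alpha for F(2, d2), by inverting f_sf_num2."""
    return (d2 / 2) * math.expm1(-2 * math.log(alpha) / d2)

alpha = 0.05
f_stat = 434.33            # from the regression output
d2 = 11043 - 2 - 1         # denominator df, n - k - 1

f_crit = f_crit_num2(alpha, d2)
p_value = f_sf_num2(f_stat, d2)

print(f"F-crit = {f_crit:.4f}, reject H0: {f_stat > f_crit}")
print(f"p-value = {p_value:.3g}, reject H0: {p_value < alpha}")
```

The p-value is so small that the output displays it as Prob > F = 0.0000.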

  

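F-test (p-value approach)

  • $H_0: \beta_1 = \beta_2 = 0$: all slope coefficients are zero
  • $H_A$: at least one $\beta_i$ is NOT zero
  • Using the p-value approach, we can see that our p-value for the F-test (Prob > F) is 0.000…, which is less than our (default) $\alpha$ = 0.05
  • Hence, reject $H_0$
  • Conclusion: we have sufficient evidence that at least one of our independent variables is useful in explaining house rent

Joint / Multiple hypothesis test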

  • We often test hypotheses involving more than one of the population parameters.
    – test a single hypothesis involving more than one of the $\beta_j$
    – test multiple hypotheses (multiple linear restrictions: the F-test)

  

Testing Multiple Linear Restrictions: The F-Test

  • We begin with the leading case of testing whether a set of independent variables has no partial effect on a dependent variable
    – we want to test whether a group of variables has no effect on the dependent variable.
    – the null hypothesis is that a set of variables has no effect on y, once another set of variables has been controlled for

  

  • Consider the following model that explains major league baseball players’ salaries:
      $\log(salary) = \beta_0 + \beta_1 years + \beta_2 gamesyr + \beta_3 bavg + \beta_4 hrunsyr + \beta_5 rbisyr + u$   (4.28)
    where salary is the 1993 total salary, years is years in the league, gamesyr is average games played per year, bavg is career batting average (for example, bavg = 250), hrunsyr is home runs per year, and rbisyr is runs batted in per year.

  • Suppose we want to test the null hypothesis that, once years in the league and games per year have been controlled for, the statistics measuring performance (bavg, hrunsyr, and rbisyr) have no effect on salary.
  • Essentially, the null hypothesis states that productivity as measured by baseball statistics has no effect on salary.

  

  • In terms of the parameters of the model, the null hypothesis is stated as
      $H_0: \beta_3 = 0,\ \beta_4 = 0,\ \beta_5 = 0$   (4.29)
  • The null (4.29) constitutes three exclusion restrictions: if (4.29) is true, then bavg, hrunsyr, and rbisyr have no effect on log(salary) after years and gamesyr have been controlled for, and therefore should be excluded from the model.

  • What should be the alternative to (4.29)? If what we have in mind is that “performance statistics matter, even after controlling for years in the league and games per year,” then the appropriate alternative is simply
      $H_1: H_0$ is not true   (4.30)
  • The alternative (4.30) holds if at least one of $\beta_3$, $\beta_4$, or $\beta_5$ is different from zero. (Any or all could be different from zero.)

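  • The steps to be done:
    1. Conduct a regression for the unrestricted model (in the example above, the model with all performance variables included)
       – Note the SSR and $R^2$
    2. Conduct a regression for the restricted model (in the example above, the model with none of the performance variables included)
       – Note the SSR and $R^2$
    3. Compute the F statistic:
       $F = \dfrac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)}$
       where $q$ is the numerator degrees of freedom = $df_r - df_{ur}$ (the number of restrictions) and $n-k-1$ is the denominator degrees of freedom = $df_{ur}$

  Example: MLB1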

    • The outcome of the joint test may seem surprising in light of the insignificant t statistics for the three variables.
    • What is happening is that the two variables hrunsyr and rbisyr are highly correlated, and this multicollinearity makes it difficult to uncover the partial effect of each variable; this is reflected in the individual t statistics.

      

    The R-Squared Form of the F Statistic

    • It is often more convenient to have a form of the F statistic that can be computed using the R-squareds from the restricted and unrestricted models.

    • One reason for this is that the R-squared is always between zero and one, whereas the SSRs can be very large depending on the unit of measurement of y, making the calculation based on the SSRs tedious.

      

    • Using the fact that $SSR_r = SST(1 - R_r^2)$ and $SSR_{ur} = SST(1 - R_{ur}^2)$, we can substitute into the F-stat formula above and get the R-squared form of the F statistic:
      $F = \dfrac{(R_{ur}^2 - R_r^2)/q}{(1 - R_{ur}^2)/(n-k-1)}$
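Both forms of the F statistic can be tried on the rent regression shown earlier in these slides. This is a sketch with illustrative function names. It uses the fact that the overall F-test is the special case in which the restricted model contains only an intercept, so that $SSR_r$ equals the total sum of squares and $R_r^2 = 0$. The baseball figures are not reproduced here because the MLB1 output is not legible in this copy.

```python
def f_stat_ssr(ssr_r: float, ssr_ur: float, q: int, df_ur: int) -> float:
    """F statistic from restricted and unrestricted sums of squared residuals."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

def f_stat_r2(r2_r: float, r2_ur: float, q: int, df_ur: int) -> float:
    """R-squared form of the same F statistic."""
    return ((r2_ur - r2_r) / q) / ((1 - r2_ur) / df_ur)

# Rent regression: unrestricted model has room and sqrm; the restricted
# model drops both (q = 2), leaving only the intercept.
q, df_ur = 2, 11040
f_ssr = f_stat_ssr(ssr_r=7.4484e15, ssr_ur=6.9051e15, q=q, df_ur=df_ur)
f_r2 = f_stat_r2(r2_r=0.0, r2_ur=0.0729, q=q, df_ur=df_ur)

print(f"SSR form: F = {f_ssr:.2f}")
print(f"R-squared form: F = {f_r2:.2f}")
```

Both results are close to the reported F( 2, 11040) = 434.33. The small gap in the R-squared form comes from using R-squared rounded to four decimals, which illustrates that the two forms are algebraically identical but sensitive to rounding of their inputs.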

      

    test (bavg=0) (hrunsyr=0) (rbisyr=0)