
Sampling Variance of the OLS estimator

Introduction to Econometrics

Ekki Syamsulhakim
Undergraduate Program, Department of Economics, Universitas Padjadjaran

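• We know when the OLS estimator $\hat{\beta}_j$ is not biased
• The variance of $\hat{\beta}_j$ can be computed using the formula:

$$\operatorname{Var}(\hat{\beta}_j) = \frac{\sigma^2}{SST_j\,(1 - R_j^2)}$$

where $SST_j = \sum_i (x_{ij} - \bar{x}_j)^2$ and $R_j^2$ is the R-squared from regressing $x_j$ on all other independent variables.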
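A minimal sketch of the standard homoskedastic variance formula, $\operatorname{Var}(\hat{\beta}_1) = \sigma^2 / [SST_1 (1 - R_1^2)]$, for the two-regressor case. The function name, the toy data, and the assumption that $\sigma^2$ is known are all hypothetical and only serve to show how correlation between regressors inflates the variance.

```python
def var_beta1_two_regressors(x1, x2, sigma2):
    """Var(beta1_hat) = sigma2 / (SST_1 * (1 - R_1^2)) for y = b0 + b1*x1 + b2*x2 + u.

    With only two regressors, R_1^2 (from regressing x1 on x2) equals the
    squared sample correlation between x1 and x2.
    """
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    sst1 = sum((a - m1) ** 2 for a in x1)
    sst2 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    r1_sq = s12 ** 2 / (sst1 * sst2)
    return sigma2 / (sst1 * (1.0 - r1_sq)), r1_sq

# Hypothetical toy data: x2_high is almost collinear with x1, x2_low is less so.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2_low = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
x2_high = [1.1, 2.0, 2.9, 4.2, 5.0, 6.1]

var_low, r_low = var_beta1_two_regressors(x1, x2_low, sigma2=1.0)
var_high, r_high = var_beta1_two_regressors(x1, x2_high, sigma2=1.0)
print(f"R_1^2 = {r_low:.3f} -> Var = {var_low:.3f}")
print(f"R_1^2 = {r_high:.3f} -> Var = {var_high:.3f}")
```

As $R_1^2$ approaches one the denominator shrinks toward zero, which is the multicollinearity problem; at exactly one (perfect collinearity) the formula is undefined.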

Sampling Variance of the OLS estimator

A MUST READ

Estimator Variance, Perfect Collinearity, and Multicollinearity


Estimator of $\sigma^2$

Estimator of the Standard Error of $\hat{\beta}_j$
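As a worked illustration, the usual unbiased estimator of the error variance is $\hat{\sigma}^2 = SSR/(n-k-1)$, and its square root is what Stata labels Root MSE. The sketch below plugs in the residual sum of squares and sample size from the rent regression output shown later in these slides; the variable names are my own.

```python
import math

# Values taken from the rent regression output in these slides.
ssr = 6.9051e15      # residual sum of squares
n, k = 11043, 2      # observations and number of slope coefficients

df = n - k - 1                     # degrees of freedom
sigma2_hat = ssr / df              # estimator of the error variance
sigma_hat = math.sqrt(sigma2_hat)  # standard error of the regression (Root MSE)

print(df, f"{sigma2_hat:.4e}", f"{sigma_hat:.3e}")
```

The standard error of each slope then follows by replacing $\sigma^2$ with $\hat{\sigma}^2$ in the variance formula and taking the square root.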

Inference

We assume that the unobserved error $u$ is normally distributed in the population

– Called the “Normality Assumption”

Inference

Testing Hypotheses About a Single Population Parameter: The t-Test

Hypothesis Testing

t-test or (later) F-test (individual coefficient vs overall model tests)
Two-sided vs one-sided test, depending on:

– your hypothesis
– the theory
– your research question

2 methods

– t-stat method
– p-value method

Hypothesis Testing

The long steps:

– State the null and alternative hypothesis
– Choose the level of significance
– For t-test method: observe t-statistics and compute t-critical
– For p-value method: compute p-value
– State the decision rule
– State the conclusion
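The steps above can be sketched as a small function that applies both the t-stat method and the p-value method to a two-sided test of $H_0: \beta_j = 0$. This is only an illustration under one stated assumption: it uses the standard normal distribution in place of the t distribution, which is a close approximation when the degrees of freedom are large. The function name and return keys are hypothetical.

```python
from statistics import NormalDist

def two_sided_t_test(t_stat, alpha=0.05):
    """Two-sided test of H0: beta_j = 0 (normal approximation to the t distribution)."""
    z = NormalDist()
    t_crit = z.inv_cdf(1 - alpha / 2)         # critical value, alpha/2 in each tail
    p_value = 2 * (1 - z.cdf(abs(t_stat)))    # two-sided p-value
    return {
        "t_crit": t_crit,
        "p_value": p_value,
        "reject_by_t": abs(t_stat) > t_crit,  # decision rule, t-stat method
        "reject_by_p": p_value < alpha,       # decision rule, p-value method
    }

result = two_sided_t_test(22.57)   # t-stat of room in the rent regression
weak = two_sided_t_test(1.0)       # hypothetical small t-stat for contrast
print(result)
print(weak)
```

Both decision rules always agree, because the p-value is below $\alpha$ exactly when the absolute t-stat exceeds the critical value.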

Regression

. reg rent room sqrm if rent<4000000 & sqrm<3000 & room<30

      Source |       SS           df       MS      Number of obs   =     11043
-------------+----------------------------------   F(2, 11040)     =    434.33
       Model |  5.4331e+14         2  2.7166e+14   Prob > F        =    0.0000
    Residual |  6.9051e+15     11040  6.2546e+11   R-squared       =    0.0729
-------------+----------------------------------   Adj R-squared   =    0.0728
       Total |  7.4484e+15     11042  6.7455e+11   Root MSE        =   7.9e+05

------------------------------------------------------------------------------
        rent |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        room |   88841.68   3935.845    22.57   0.000     81126.72    96556.64
        sqrm |   735.5316   141.3077     5.21   0.000     458.5433     1012.52
       _cons |   277297.8   18998.01    14.60   0.000     240058.3    314537.4
------------------------------------------------------------------------------
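The columns of the coefficient table are linked by simple arithmetic: each t is the coefficient divided by its standard error, and each 95% interval is the coefficient plus or minus the critical value times the standard error. The sketch below recomputes them from the reported coefficients, using the two-sided critical value 1.960179 quoted in these slides for df = 11040; the dictionary layout is my own.

```python
# (coefficient, standard error) pairs copied from the regression output.
coefs = {
    "room": (88841.68, 3935.845),
    "sqrm": (735.5316, 141.3077),
    "_cons": (277297.8, 18998.01),
}
t_crit = 1.960179  # two-sided 5% critical value, df = 11040 (from the slides)

results = {}
for name, (b, se) in coefs.items():
    t_stat = b / se                             # t = coef / std. err.
    ci = (b - t_crit * se, b + t_crit * se)     # 95% confidence interval
    results[name] = (t_stat, ci)
    print(f"{name:>6}: t = {t_stat:6.2f}, CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```

Small differences in the last digit can appear because the printed coefficients and standard errors are themselves rounded.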

Two sided t-test (ex: t-stat app)

State the hypothesis:

$H_0: \beta_{room} = 0$ (number of rooms has no impact on rent)
$H_A: \beta_{room} \neq 0$ (number of rooms has an impact on rent)

Choose significance level: $\alpha = 0.05$
Observe t-stat: t-stat = 22.57

Two sided t-test (ex: t-stat app)

Compute t-crit: t-crit ($\alpha$ = 0.05, two-sided, df = n-k-1 = 11043-2-1 = 11040) = 1.960179
Rejection criteria: Reject $H_0$ if |t-stat| > |t-crit|
Conclusion: Since our |t-stat| > |t-crit|, or 22.57 > 1.960179, we reject $H_0$.

Therefore we have sufficient evidence that the number of rooms has an impact on rent

Two sided t-test (ex: p-value app)

State the hypothesis:

$H_0: \beta_{room} = 0$ (number of rooms has no impact on rent)
$H_A: \beta_{room} \neq 0$ (number of rooms has an impact on rent)

Choose significance level: $\alpha = 0.05$
Compute (observe) p-value: p-value = 0.0000…

Two sided t-test (ex: p-value app)

Rejection criteria: Reject $H_0$ if p-value < $\alpha$
Conclusion: Since our p-value = 0.0000… is less than $\alpha$ = 0.05, we reject $H_0$.

Therefore we have sufficient evidence that the number of rooms has an impact on rent

One sided t-test (ex: t-stat app)

As the number of rooms increases, it is sensible to think that the rent also increases (probably based on theory)

We can (should) use a 1 tail test

– We must compute a new t-critical, because the default output of Stata / Gretl / EViews / Excel is for a 2-sided test

One sided t-test (ex: t-stat app)

State the hypothesis:

$H_0: \beta_{room} = 0$ (number of rooms has no impact on rent)
$H_A: \beta_{room} > 0$ (number of rooms has a positive impact on rent)

Choose significance level: $\alpha = 0.05$ → for computing the new t-crit → about 1.645
Observe t-stat: t-stat = 22.57

One sided t-test (ex: t-stat app)

Compute t-crit for 1 sided

t-crit ($2\alpha$, df = n-k-1 = 11043-2-1 = 11040) = 1.645 (positive side), that is, the two-sided lookup evaluated at $2\alpha = 0.10$

Rejection criteria: Reject $H_0$ if t-stat > t-crit
Conclusion: Since our t-stat > t-crit, or 22.57 > 1.645, we reject $H_0$.

Therefore we have sufficient evidence that the number of rooms has a positive impact on rent
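A short sketch of why the one-sided critical value is smaller than the two-sided one, and why looking up a two-sided table at $2\alpha$ gives the one-sided value at $\alpha$. It assumes the normal approximation to the t distribution, which is reasonable for df = 11040; the variable names are hypothetical.

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()

crit_two_sided = z.inv_cdf(1 - alpha / 2)       # alpha split over both tails
crit_one_sided = z.inv_cdf(1 - alpha)           # all of alpha in the right tail
crit_via_2alpha = z.inv_cdf(1 - (2 * alpha) / 2)  # two-sided lookup at 2*alpha

t_stat = 22.57  # t-stat of room from the regression output
reject_one_sided = t_stat > crit_one_sided      # right-tail test: no absolute value

print(f"{crit_two_sided:.4f} {crit_one_sided:.4f} {reject_one_sided}")
```

For a right-tail alternative the comparison uses the signed t-stat, so a large negative t-stat would not lead to rejection.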

One sided t-test (ex: p-value app)

As the number of rooms increases, it is sensible to think that the rent also increases
We can (should) use a 1 tail test

– We must divide the p-value by 2 (valid here because the estimated coefficient is positive, as $H_A$ states)

$H_0: \beta_{room} = 0$ (number of rooms has no impact on rent)
$H_A: \beta_{room} > 0$ (number of rooms has a positive impact on rent)

Choose significance level: $\alpha = 0.05$
Compute (observe) p-value

Because we are doing a 1 tail test, the p-value given by econometric software must be divided by 2; hence the calculated p-value = 0.0000…

Example: p-value method – t

Rejection criteria:

Reject $H_0$ if (p-value / 2) < $\alpha$
Conclusion: Since our one-sided p-value (0.0000…) is less than $\alpha$ = 0.05, we reject $H_0$.

Therefore we have sufficient evidence that the number of rooms has a positive impact on rent
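The divide-by-2 rule can be sketched as a small helper. Halving the software's two-sided p-value is only correct when the sign of the t-stat agrees with the alternative; otherwise the one-sided p-value is one minus half the two-sided value. The function name and the example numbers are hypothetical.

```python
def one_sided_p_value(p_two_sided, t_stat, alternative="greater"):
    """Convert a two-sided p-value into a one-sided p-value.

    alternative="greater" tests H_A: beta > 0; alternative="less" tests H_A: beta < 0.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    sign_agrees = t_stat > 0 if alternative == "greater" else t_stat < 0
    return p_two_sided / 2 if sign_agrees else 1 - p_two_sided / 2

p_agree = one_sided_p_value(0.04, t_stat=2.1)      # estimate in the direction of H_A
p_against = one_sided_p_value(0.04, t_stat=-2.1)   # estimate against H_A
print(p_agree, p_against)
```

In the rent example the coefficient on room is positive and the two-sided p-value is essentially zero, so the halved p-value is also essentially zero.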

Testing Other Hypotheses About $\beta_j$

• Consider a simple model relating the annual number of crimes on college campuses (crime) to student enrollment (enroll):

$$\log(crime) = \beta_0 + \beta_1 \log(enroll) + u$$

This is a constant elasticity model, where $\beta_1$ is the elasticity of crime with respect to enrollment

Testing Other Hypotheses About $\beta_j$

It is not much use to test $H_0: \beta_1 = 0$, as we expect the total number of crimes to increase as the size of the campus increases

A more interesting hypothesis to test would be that the elasticity of crime with respect to enrollment is one

$H_0: \beta_1 = 1$

– This means that a 1% increase in enrollment leads to, on average, a 1% increase in crime

Testing Other Hypotheses About $\beta_j$

A noteworthy alternative is $H_1: \beta_1 > 1$, which implies that a 1% increase in enrollment increases campus crime by more than 1%

• If $\beta_1 > 1$, then, in a relative sense (not just an absolute sense), crime is more of a problem on larger campuses.

Testing Other Hypotheses About $\beta_j$

• The estimated elasticity of crime with respect to enroll, 1.27, is in the direction of the alternative $\beta_1 > 1$.
But is there enough evidence to conclude that $\beta_1 > 1$?

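Testing Other Hypotheses About $\beta_j$

• If the null is stated as $H_0: \beta_j = a_j$, where $a_j$ is our hypothesized value of $\beta_j$, then the appropriate t statistic is

$$t = \frac{\hat{\beta}_j - a_j}{\operatorname{se}(\hat{\beta}_j)}$$

• The usual t statistic is obtained when $a_j = 0$.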

Testing Other Hypotheses About $\beta_j$

• The correct t statistic is $t = (1.27 - 1)/\operatorname{se}(\hat{\beta}_1)$
• The one-sided 5% critical value for a t distribution with $n - 2$ df is about 1.66
So we clearly reject $\beta_1 = 1$ in favor of $\beta_1 > 1$ at the 5% level
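To make the difference between the usual and the correct t statistic concrete, the sketch below computes $t = (\hat{\beta}_j - a_j)/\operatorname{se}(\hat{\beta}_j)$ for the campus crime example. The estimate 1.27 and the critical value 1.66 come from the slides; the standard error of 0.11 is an assumption on my part (the slides do not show it) and should be replaced with the value from your own output.

```python
def t_stat_against(beta_hat, se, a=0.0):
    """t statistic for H0: beta_j = a; a = 0 gives the usual t statistic."""
    return (beta_hat - a) / se

beta_hat = 1.27   # estimated elasticity, from the slides
se = 0.11         # ASSUMED standard error, not given in the slides
t_crit = 1.66     # one-sided 5% critical value quoted in the slides

t_usual = t_stat_against(beta_hat, se)           # tests beta_1 = 0
t_correct = t_stat_against(beta_hat, se, a=1.0)  # tests beta_1 = 1
reject = t_correct > t_crit                      # right-tail test of H1: beta_1 > 1

print(f"usual t = {t_usual:.2f}, correct t = {t_correct:.2f}, reject H0: {reject}")
```

The usual t statistic printed by software answers a different question (is the elasticity zero?), so it cannot be used for the null that the elasticity equals one.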

F-test (F-stat approach)

$H_0: \beta_1 = \beta_2 = 0$: all coefficients are zero (or: all independent variables do not affect the dependent variable; or: room and sqrm do not affect rent)
$H_A$: at least one $\beta_i$ is NOT zero (or: at least one independent variable affects rent)

F-stat = 434.33
F-crit ($\alpha$ = 0.05, k = 2, n-k-1 = 11040) ≈ 3.00
Because F-stat > F-crit, reject $H_0$
Conclusion: we have sufficient evidence that at least one of our independent variables is useful in explaining house rent
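The overall F statistic can be recomputed from the sums of squares in the regression output, with 2 and 11040 degrees of freedom. The sketch below also uses the closed form of the F distribution that holds when the numerator has exactly 2 degrees of freedom, $P(F > x) = (1 + 2x/d_2)^{-d_2/2}$, to get the critical value and p-value without extra libraries. The function names are hypothetical, and the closed form does not apply to other numerator degrees of freedom.

```python
# Sums of squares and degrees of freedom from the regression output.
ss_model, ss_resid = 5.4331e14, 6.9051e15
k, df_resid = 2, 11040

f_stat = (ss_model / k) / (ss_resid / df_resid)  # Model MS / Residual MS

def f_sf_2(x, d2):
    """P(F > x) for an F(2, d2) distribution (valid only for 2 numerator df)."""
    return (1 + 2 * x / d2) ** (-d2 / 2)

def f_crit_2(alpha, d2):
    """Upper-alpha critical value of F(2, d2), the inverse of f_sf_2."""
    return (d2 / 2) * (alpha ** (-2 / d2) - 1)

f_crit = f_crit_2(0.05, df_resid)
p_value = f_sf_2(f_stat, df_resid)
print(f"F = {f_stat:.2f}, F-crit = {f_crit:.3f}, reject H0: {f_stat > f_crit}")
```

The recomputed F differs from the reported 434.33 only in the second decimal place, because the sums of squares in the output are rounded.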

F-test (p-value approach)

$H_0: \beta_1 = \beta_2 = 0$: all coefficients are zero
$H_A$: at least one $\beta_i$ is NOT zero

Using the p-value approach, we can see that our p-value for the F-test is 0.000…, which is less than our (default) $\alpha$ = 0.05
Hence, reject $H_0$
Conclusion: we have sufficient evidence that at least one of our independent variables is useful in explaining house rent

Joint / Multiple hypothesis test

We often test hypotheses involving more than one of the population parameters.

– test a single hypothesis involving more than one of the $\beta_j$.
– test multiple hypotheses (multiple linear restrictions – the F-test)

Testing Multiple Linear Restrictions: The F-Test

We begin with the leading case of testing whether a set of independent variables has no partial effect on a dependent variable

– we want to test whether a group of variables has no effect on the dependent variable.
– the null hypothesis is that a set of variables has no effect on y, once another set of variables has been controlled for.

Testing Multiple Linear Restrictions: The F-Test

• Consider the following model that explains major league baseball players’ salaries:

$$\log(salary) = \beta_0 + \beta_1\, years + \beta_2\, gamesyr + \beta_3\, bavg + \beta_4\, hrunsyr + \beta_5\, rbisyr + u \quad (4.28)$$

salary is the 1993 total salary, years is years in the league, gamesyr is average games played per year, bavg is career batting average (for example, bavg = 250), hrunsyr is home runs per year, and rbisyr is runs batted in per year.

Testing Multiple Linear Restrictions: The F-Test

Suppose we want to test the null hypothesis that, once years in the league and games per year have been controlled for, the statistics measuring performance (bavg, hrunsyr, and rbisyr) have no effect on salary.
Essentially, the null hypothesis states that productivity as measured by baseball statistics has no effect on salary.

Testing Multiple Linear Restrictions: The F-Test

In terms of the parameters of the model, the null hypothesis is stated as

$$H_0: \beta_3 = 0,\ \beta_4 = 0,\ \beta_5 = 0 \quad (4.29)$$

The null (4.29) constitutes three exclusion restrictions: if (4.29) is true, then bavg, hrunsyr, and rbisyr have no effect on log(salary) after years and gamesyr have been controlled for, and therefore should be excluded from the model.

Testing Multiple Linear Restrictions: The F-Test

• What should be the alternative to (4.29)? If what we have in mind is that “performance statistics matter, even after controlling for years in the league and games per year,” then the appropriate alternative is simply

$$H_1: H_0 \text{ is not true} \quad (4.30)$$

The alternative (4.30) holds if at least one of $\beta_3$, $\beta_4$, or $\beta_5$ is different from zero. (Any or all could be different from zero.)

Testing Multiple Linear Restrictions: The F-Test

The steps to be done:

1. Conduct a regression for the unrestricted model (in the example above, the model with all performance variables included)

• Note the SSR and $R^2$

2. Conduct a regression for the restricted model (in the example above, the model with none of the performance variables included)

• Note the SSR and $R^2$

3. Compute the F statistic:

$$F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)}$$

where $q$ is the numerator degrees of freedom $= df_r - df_{ur}$ and $n-k-1$ is the denominator degrees of freedom $= df_{ur}$

Example: MLB1
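The F statistic compares how much the sum of squared residuals rises when the restrictions are imposed, relative to the unrestricted fit: the numerator is $(SSR_r - SSR_{ur})/q$ and the denominator is $SSR_{ur}/(n-k-1)$. The sketch below is a minimal helper for that calculation. The input numbers are illustrative values that I recall from the textbook treatment of the MLB1 example; they do not appear in these slides and should be treated as assumptions to be checked against your own regressions.

```python
def f_stat_ssr(ssr_r, ssr_ur, q, df_ur):
    """F statistic from restricted and unrestricted sums of squared residuals.

    q     : number of restrictions (numerator degrees of freedom)
    df_ur : n - k - 1 in the unrestricted model (denominator degrees of freedom)
    """
    if ssr_r < ssr_ur:
        raise ValueError("restricted SSR cannot be smaller than unrestricted SSR")
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

# ASSUMED illustrative inputs (not shown in the slides): n = 353, k = 5, q = 3.
ssr_ur, ssr_r = 183.186, 198.311
f_value = f_stat_ssr(ssr_r, ssr_ur, q=3, df_ur=353 - 5 - 1)
print(f"F = {f_value:.2f}")
```

The restricted SSR can never be smaller than the unrestricted one, so the F statistic is always non-negative; a negative value signals that the two SSRs were swapped.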

Testing Multiple Linear Restrictions: The F -Test

The outcome of the joint test may seem surprising in light of the insignificant t statistics for the three variables.
What is happening is that the two variables hrunsyr and rbisyr are highly correlated, and this multicollinearity makes it difficult to uncover the partial effect of each variable; this is reflected in the individual t statistics.

The R-Squared form of the F Statistic

• It is often more convenient to have a form of the F statistic that can be computed using the R-squareds from the restricted and unrestricted models.

One reason for this is that the R-squared is always between zero and one, whereas the SSRs can be very large depending on the unit of measurement of y, making the calculation based on the SSRs tedious.

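The R-Squared form of the F Statistic

• Using the fact that $SSR_r = SST(1 - R_r^2)$ and $SSR_{ur} = SST(1 - R_{ur}^2)$, we can substitute into the F-stat formula above and get the R-squared form of the F statistic:

$$F = \frac{(R_{ur}^2 - R_r^2)/q}{(1 - R_{ur}^2)/(n-k-1)}$$

Stata command for the joint test:

test (bavg=0) (hrunsyr=0) (rbisyr=0)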
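A quick numerical check that the R-squared form and the SSR form of the F statistic agree when both models share the same dependent variable (and hence the same SST). The function names and all numbers below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def f_stat_ssr(ssr_r, ssr_ur, q, df_ur):
    """SSR form: ((SSR_r - SSR_ur) / q) / (SSR_ur / df_ur)."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

def f_stat_r2(r2_ur, r2_r, q, df_ur):
    """R-squared form: ((R2_ur - R2_r) / q) / ((1 - R2_ur) / df_ur)."""
    return ((r2_ur - r2_r) / q) / ((1 - r2_ur) / df_ur)

# Hypothetical numbers: same SST for both models.
sst, ssr_r, ssr_ur = 100.0, 50.0, 40.0
q, df_ur = 3, 96

r2_r = 1 - ssr_r / sst     # R-squared of the restricted model
r2_ur = 1 - ssr_ur / sst   # R-squared of the unrestricted model

f_from_ssr = f_stat_ssr(ssr_r, ssr_ur, q, df_ur)
f_from_r2 = f_stat_r2(r2_ur, r2_r, q, df_ur)
print(f_from_ssr, f_from_r2)
```

SST cancels from the numerator and denominator, which is why the two forms coincide; the R-squared form should not be used when the restricted and unrestricted models have different dependent variables.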