Sampling Variance of the OLS estimator
Introduction to Econometrics
Ekki Syamsulhakim, Undergraduate Program, Department of Economics, Universitas Padjadjaran
Sampling Variance of the OLS estimator
• We know that, when the standard OLS assumptions hold, $\hat{\beta}_j$ is not biased
• The variance of $\hat{\beta}_j$ can be computed using the formula:
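The formula itself did not survive on this slide. As a hedged reminder, the standard textbook result under homoskedasticity is sketched below; the symbols $SST_j$ and $R_j^2$ are assumptions about the notation the slide used.

```latex
\operatorname{Var}(\hat{\beta}_j) = \frac{\sigma^2}{SST_j\,(1 - R_j^2)},
\qquad
SST_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2
```

Here $\sigma^2$ is the error variance and $R_j^2$ is the R-squared from regressing $x_j$ on all the other independent variables, so the variance grows as $x_j$ becomes more collinear with the other regressors.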
A MUST READ
Estimator Variance, Perfect
Collinearity, and Multicollinearity
Estimator of $\sigma^2$
Standard Error of $\hat{\beta}_j$
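The formulas for these two slides were lost. A sketch of the usual textbook definitions follows, assuming $k$ independent variables and the notation $SSR$, $SST_j$ and $R_j^2$ (these names are assumptions, not taken from the slide).

```latex
\hat{\sigma}^2 = \frac{SSR}{n - k - 1} = \frac{\sum_{i=1}^{n} \hat{u}_i^2}{n - k - 1},
\qquad
\operatorname{se}(\hat{\beta}_j) = \frac{\hat{\sigma}}{\sqrt{SST_j\,(1 - R_j^2)}}
```

In Stata output, $\hat{\sigma}$ is the number reported as Root MSE.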
Inference
- We assume that the unobserved error is normally distributed in the population
- – Called the “Normality Assumption”
Testing Hypotheses About a Single Population Parameter: The t-Test
Hypothesis Testing
- t-test or (later) F-test (individual coefficient vs overall model tests)
- two sided vs one sided test
- – Your hypothesis
- – check the theory
- – our research question
- 2 methods
- – t-stat method
- – p-value method
Hypothesis Testing
- The long steps:
- – State the null and alternative hypothesis
- – Choose the level of significance
- – For t-test method: observe t-statistics and compute t-critical
- – For p-value method: compute p-value
- – State the decision rule
- – State the conclusion
Regression
reg rent room sqrm if rent<4000000 & sqrm<3000 & room<30

      Source |       SS       df       MS              Number of obs =   11043
-------------+------------------------------           F(  2, 11040) =  434.33
       Model | 5.4331e+14     2  2.7166e+14           Prob > F      =  0.0000
    Residual | 6.9051e+15 11040  6.2546e+11           R-squared     =  0.0729
-------------+------------------------------           Adj R-squared =  0.0728
       Total | 7.4484e+15 11042  6.7455e+11           Root MSE      = 7.9e+05

------------------------------------------------------------------------------
        rent |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        room |   88841.68   3935.845    22.57   0.000     81126.72    96556.64
        sqrm |   735.5316   141.3077     5.21   0.000     458.5433     1012.52
       _cons |   277297.8   18998.01    14.60   0.000     240058.3    314537.4
------------------------------------------------------------------------------
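To see where the numbers in this output come from, the following minimal Python sketch recomputes the t statistics, R-squared, F statistic and Root MSE from the reported sums of squares, coefficients and standard errors. The variable names are illustrative, and because the inputs are the rounded values printed by Stata, the results match the output only up to rounding.

```python
import math

# Values copied from the regression output above (rounded as printed).
n, k = 11043, 2
ss_model, ss_resid, ss_total = 5.4331e14, 6.9051e15, 7.4484e15
coefs = {  # name: (coefficient, standard error)
    "room": (88841.68, 3935.845),
    "sqrm": (735.5316, 141.3077),
    "_cons": (277297.8, 18998.01),
}

df_resid = n - k - 1                               # 11040
t_stats = {name: b / se for name, (b, se) in coefs.items()}
r_squared = ss_model / ss_total                    # explained share of variation
f_stat = (ss_model / k) / (ss_resid / df_resid)    # MS(Model) / MS(Residual)
root_mse = math.sqrt(ss_resid / df_resid)          # estimate of sigma

for name, t in t_stats.items():
    print(f"{name:>6}: t = {t:.2f}")
print(f"R-squared = {r_squared:.4f}")
print(f"F({k}, {df_resid}) = {f_stat:.2f}")
print(f"Root MSE = {root_mse:.3g}")
```

Each t statistic is simply the coefficient divided by its standard error, which is the quantity used in the tests that follow.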
Two-sided t-test (ex: t-stat approach)
- State the hypotheses:
  $H_0$: $\beta_{room} = 0$ (number of rooms has no impact on rent)
  $H_A$: $\beta_{room} \neq 0$ (number of rooms has an impact on rent)
- Choose the significance level: $\alpha = 0.05$
- Observe the t-stat (the t column of the regression output): t-stat = 22.57
- Compute t-crit: t-crit($\alpha$ = 0.05, df = n-k-1 = 11043-2-1 = 11040) = 1.960179
- Rejection criterion: reject $H_0$ if |t-stat| > |t-crit|
- Conclusion: since |t-stat| > |t-crit| (22.57 > 1.960179), we reject $H_0$. Therefore we have sufficient evidence that the number of rooms has an impact on rent.
Two-sided t-test (ex: p-value approach)
- State the hypotheses:
  $H_0$: $\beta_{room} = 0$ (number of rooms has no impact on rent)
  $H_A$: $\beta_{room} \neq 0$ (number of rooms has an impact on rent)
- Choose the significance level: $\alpha = 0.05$
- Compute (observe) the p-value: p-value = 0.000 (the P>|t| column of the regression output)
- Rejection criterion: reject $H_0$ if p-value < $\alpha$
- Conclusion: since our p-value = 0.0000… is less than $\alpha$ = 0.05, we reject $H_0$. Therefore we have sufficient evidence that the number of rooms has an impact on rent.
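The two approaches can be sketched in a few lines of Python. This is an illustration under a stated assumption: the standard normal distribution is used in place of the t distribution, which is reasonable here only because the degrees of freedom (11040) are very large, so the critical value differs slightly from the 1.960179 quoted above. The function name `two_sided_test` is hypothetical.

```python
from statistics import NormalDist

def two_sided_test(t_stat, alpha=0.05):
    """Return (critical value, p-value, reject?) for H0: beta = 0 vs HA: beta != 0.

    Assumption: the standard normal stands in for the t distribution,
    which is acceptable only when the degrees of freedom are large.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)           # alpha is split across both tails
    p_value = 2 * (1 - z.cdf(abs(t_stat)))    # area in both tails beyond |t|
    return crit, p_value, abs(t_stat) > crit

# t-stat for room, taken from the regression output.
crit, p_value, reject = two_sided_test(22.57)
print(f"critical value = {crit:.4f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Both methods always give the same decision: |t-stat| exceeds the critical value exactly when the p-value is below the significance level.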
One-sided t-test (ex: t-stat approach)
- As the number of rooms increases, it is sensible to think that the rent also increases (probably based on theory)
- We can (should) use a 1-tail test
- – We must compute a new t-critical, because the output of STATA / GRETL / EVIEWS / EXCEL refers to the 2-sided test
- State the hypotheses:
  $H_0$: $\beta_{room} = 0$ (number of rooms has no impact on rent)
  $H_A$: $\beta_{room} > 0$ (number of rooms has a positive impact on rent)
- Choose the significance level: $\alpha = 0.05$ (the new one-sided t-crit is about 1.64)
- Observe the t-stat: t-stat = 22.57
- Compute t-crit for the 1-sided test: t-crit(2$\alpha$ = 0.10, df = n-k-1 = 11043-2-1 = 11040) = 1.645 (positive side)
- Rejection criterion: reject $H_0$ if t-stat > t-crit
- Conclusion: since t-stat > t-crit (22.57 > 1.645), we reject $H_0$. Therefore we have sufficient evidence that the number of rooms has a positive impact on rent.
One-sided t-test (ex: p-value approach)
- As the number of rooms increases, it is sensible to think that the rent also increases
- We can (should) use a 1-tail test
- – We must divide the p-value by 2
- State the hypotheses:
  $H_0$: $\beta_{room} = 0$ (number of rooms has no impact on rent)
  $H_A$: $\beta_{room} > 0$ (number of rooms has a positive impact on rent)
- Choose the significance level: $\alpha = 0.05$
- Compute (observe) the p-value: because we are doing a 1-tail test, the p-value given by econometric software must be divided by 2 (this works here because the estimated coefficient is positive, in the direction of $H_A$); hence the calculated p-value = 0.0000…
- Rejection criterion: reject $H_0$ if p-value < $\alpha$
- Conclusion: since our p-value is less than $\alpha$ = 0.05, we reject $H_0$. Therefore we have sufficient evidence that the number of rooms has a positive impact on rent.
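A small Python sketch of the one-sided version, again assuming the standard normal as a large-sample stand-in for the t distribution (the function name `one_sided_upper_test` is hypothetical). It also shows why the choice of tail matters: a t-stat of 1.8 is significant at 5% in the one-sided test but not in the two-sided one.

```python
from statistics import NormalDist

def one_sided_upper_test(t_stat, alpha=0.05):
    """Test H0: beta = 0 against HA: beta > 0.

    Assumption: normal approximation to the t distribution (large df).
    Returns (critical value, one-sided p, two-sided p, reject?).
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha)               # all of alpha sits in the right tail
    p_one = 1 - z.cdf(t_stat)                 # right-tail area only
    p_two = 2 * (1 - z.cdf(abs(t_stat)))      # what software usually reports
    return crit, p_one, p_two, t_stat > crit

crit, p_one, p_two, reject = one_sided_upper_test(22.57)
print(f"one-sided critical value = {crit:.3f}, reject H0: {reject}")

# A borderline case: the one-sided p-value is half the two-sided one.
_, p_one_b, p_two_b, reject_b = one_sided_upper_test(1.8)
print(f"t = 1.8: one-sided p = {p_one_b:.4f}, two-sided p = {p_two_b:.4f}")
```

Note that halving the software p-value is only appropriate when the estimate has the sign stated in the alternative; for a negative t-stat the right-tail p-value would be above 0.5.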
Testing Other Hypotheses About $\beta_j$
- Consider a simple model relating the annual number of crimes on college campuses (crime) to student enrollment (enroll):
  $\log(crime) = \beta_0 + \beta_1 \log(enroll) + u$
- This is a constant elasticity model, where $\beta_1$ is the elasticity of crime with respect to enrollment
- It is not much use to test $H_0$: $\beta_1 = 0$, as we expect the total number of crimes to increase as the size of the campus increases
- A more interesting hypothesis to test would be that the elasticity of crime with respect to enrollment is one
  $H_0$: $\beta_1 = 1$
- – This means that a 1% increase in enrollment leads to, on average, a 1% increase in crime
- A noteworthy alternative is $H_1$: $\beta_1 > 1$, which implies that a 1% increase in enrollment increases campus crime by more than 1%
- If $\beta_1 > 1$, then, in a relative sense, not just an absolute sense, crime is more of a problem on larger campuses
- The estimated elasticity of crime with respect to enroll, 1.27, is in the direction of the alternative $\beta_1 > 1$
- But is there enough evidence to conclude that $\beta_1 > 1$?
- If the null is stated as $H_0$: $\beta_j = a_j$, where $a_j$ is our hypothesized value of $\beta_j$, then the appropriate t statistic is
  $t = (\hat{\beta}_j - a_j) / se(\hat{\beta}_j)$
- The usual t statistic is obtained when $a_j = 0$
- The correct t statistic here is $t = (1.27 - 1) / se(\hat{\beta}_1)$
- The one-sided 5% critical value for a t distribution with $n - 2$ df is about 1.66
- So we clearly reject $\beta_1 = 1$ in favor of $\beta_1 > 1$ at the 5% level
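As a worked illustration of testing against a hypothesized value other than zero, here is a minimal Python sketch. The estimate 1.27 and the critical value 1.66 come from the slides; the standard error of 0.11 is an assumption taken from the textbook version of this example and does not appear on the slide, so check it against your own output. The function name `t_stat_against` is hypothetical.

```python
def t_stat_against(beta_hat, hypothesized, se):
    """t statistic for H0: beta = hypothesized (not necessarily zero)."""
    return (beta_hat - hypothesized) / se

beta_hat = 1.27      # estimated elasticity, from the slide
se = 0.11            # ASSUMPTION: standard error as reported in the textbook example
t_crit = 1.66        # one-sided 5% critical value, from the slide

t_elasticity_one = t_stat_against(beta_hat, 1.0, se)   # H0: beta_1 = 1
t_usual = t_stat_against(beta_hat, 0.0, se)            # H0: beta_1 = 0 (what software prints)

print(f"t for H0: beta_1 = 1 -> {t_elasticity_one:.2f}")
print(f"t for H0: beta_1 = 0 -> {t_usual:.2f}")
print("reject H0: beta_1 = 1 in favor of beta_1 > 1:", t_elasticity_one > t_crit)
```

The t statistic printed by regression software always tests against zero, so a test of $\beta_1 = 1$ has to be computed by hand (or with a post-estimation test command).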
F-test (F-stat approach)
- $H_0$: $\beta_1 = \beta_2 = 0$: all coefficients are zero (or: none of the independent variables affects the dependent variable; or: room and sqrm do not affect rent)
- $H_A$: at least one of the $\beta_i$ is NOT zero (or: at least one independent variable has an effect)
- F-stat = 434.33
- F-crit ($\alpha$ = 0.05, k = 2, n-k-1 = 11040) ≈ 3.00
- Because F-stat > F-crit, reject $H_0$
- Conclusion: we have sufficient evidence that at least one of our independent variables is useful in explaining house rent
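For this particular regression the F critical value and p-value can be checked without statistical tables, because an F distribution with exactly 2 numerator degrees of freedom has a closed-form tail probability, $P(F > f) = (1 + 2f/d)^{-d/2}$ with $d$ denominator degrees of freedom. The Python sketch below relies on that special case only; it does not work for other numerator degrees of freedom, and the function names are hypothetical.

```python
def f_tail_prob_2(f_value, df_denom):
    """P(F > f_value) for an F distribution with 2 numerator df (closed form)."""
    return (1 + 2 * f_value / df_denom) ** (-df_denom / 2)

def f_crit_2(alpha, df_denom):
    """Critical value of F(2, df_denom): inverts the tail probability above."""
    return (df_denom / 2) * (alpha ** (-2 / df_denom) - 1)

n, k = 11043, 2            # from the regression output
df_denom = n - k - 1       # 11040
f_stat = 434.33            # from the regression output

f_crit = f_crit_2(0.05, df_denom)
p_value = f_tail_prob_2(f_stat, df_denom)

print(f"F-crit(0.05; 2, {df_denom}) = {f_crit:.4f}")
print(f"p-value = {p_value:.3g}")
print("reject H0:", f_stat > f_crit)
```

With k = 2 regressors the overall F test has 2 numerator degrees of freedom, which is why the shortcut applies here; in general a statistical library or table is needed.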
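F-test (p-value approach)
- $H_0$: $\beta_1 = \beta_2 = 0$: all coefficients are zero
- $H_A$: at least one of the $\beta_i$ is NOT zero
- Using the p-value approach, we can see that our p-value for the F-test is 0.000…, which is less than our (default) $\alpha$ = 0.05
- Hence, reject $H_0$
- Conclusion: we have sufficient evidence that at least one of our independent variables is useful in explaining house rent
Joint / Multiple hypothesis test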
- We often test hypotheses involving more than one of the population parameters.
- – test a single hypothesis involving more than one of the $\beta_j$.
- – test multiple hypotheses (multiple linear restrictions – the F-test)
Testing Multiple Linear Restrictions: The F-Test
- We begin with the leading case of testing whether a set of independent variables has no partial effect on a dependent variable
- – we want to test whether a group of variables has no effect on the dependent variable.
- – the null hypothesis is that a set of variables has no effect on y, once another set of variables has been controlled for.
- Consider the following model that explains major league baseball players’ salaries:
  $\log(salary) = \beta_0 + \beta_1 years + \beta_2 gamesyr + \beta_3 bavg + \beta_4 hrunsyr + \beta_5 rbisyr + u$   (4.28)
- salary is the 1993 total salary, years is years in the league, gamesyr is average games played per year, bavg is career batting average (for example, bavg = 250), hrunsyr is home runs per year, and rbisyr is runs batted in per year.
- Suppose we want to test the null hypothesis that, once years in the league and games per year have been controlled for, the statistics measuring performance (bavg, hrunsyr, and rbisyr) have no effect on salary.
- Essentially, the null hypothesis states that productivity as measured by baseball statistics has no effect on salary.
- In terms of the parameters of the model, the null hypothesis is stated as
  $H_0$: $\beta_3 = 0, \beta_4 = 0, \beta_5 = 0$   (4.29)
- The null (4.29) constitutes three exclusion restrictions: if (4.29) is true, then bavg, hrunsyr, and rbisyr have no effect on log(salary) after years and gamesyr have been controlled for, and therefore should be excluded from the model.
- What should be the alternative to (4.29)? If what we have in mind is that “performance statistics matter, even after controlling for years in the league and games per year,” then the appropriate alternative is simply
  $H_1$: $H_0$ is not true   (4.30)
- The alternative (4.30) holds if at least one of $\beta_3$, $\beta_4$, or $\beta_5$ is different from zero. (Any or all could be different from zero.)
- The steps to be done:
1. Conduct a regression for the unrestricted model (in the example above, the model with all performance variables included)
- • Note the SSR and $R^2$
2. Conduct a regression for the restricted model (in the example above, the model with none of the performance variables included)
- • Note the SSR and $R^2$
3. Compute the F statistic
   $F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)}$
   where $q$ is the numerator degrees of freedom $= df_r - df_{ur}$ (the number of restrictions) and $n - k - 1 = df_{ur}$ is the denominator degrees of freedom
Example: MLB1
- The outcome of the joint test may seem surprising in light of the insignificant t statistics for the three variables.
- What is happening is that the two variables hrunsyr and rbisyr are highly correlated, and this multicollinearity makes it difficult to uncover the partial effect of each variable; this is reflected in the individual t statistics.
The R-Squared Form of the F Statistic
- It is often more convenient to have a form of the F statistic that can be computed using the R-squareds from the restricted and unrestricted models.
- One reason for this is that the R-squared is always between zero and one, whereas the SSRs can be very large depending on the unit of measurement of y, making the calculation based on the SSRs tedious.
- Using the fact that $SSR_r = SST(1 - R_r^2)$ and $SSR_{ur} = SST(1 - R_{ur}^2)$, we can substitute into the F-stat formula above and get the R-squared form of the F statistic:
  $F = \frac{(R_{ur}^2 - R_r^2)/q}{(1 - R_{ur}^2)/(n-k-1)}$
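The two forms of the F statistic can be compared numerically with a short Python sketch. The SSR and R-squared values below are assumptions: they are the figures reported in the textbook version of the baseball salary example (n = 353, five regressors in the unrestricted model, three restrictions) and do not appear on these slides, so verify them against your own regression output. The function names are hypothetical.

```python
def f_stat_ssr(ssr_r, ssr_ur, q, df_ur):
    """F statistic from restricted and unrestricted sums of squared residuals."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

def f_stat_r2(r2_r, r2_ur, q, df_ur):
    """R-squared form of the same F statistic."""
    return ((r2_ur - r2_r) / q) / ((1 - r2_ur) / df_ur)

# ASSUMED inputs (textbook values for the MLB1 example, not shown on the slides).
n, k, q = 353, 5, 3
df_ur = n - k - 1                      # 347 denominator degrees of freedom
ssr_ur, r2_ur = 183.186, 0.6278        # unrestricted model
ssr_r, r2_r = 198.311, 0.5971          # restricted model (performance variables dropped)

f_from_ssr = f_stat_ssr(ssr_r, ssr_ur, q, df_ur)
f_from_r2 = f_stat_r2(r2_r, r2_ur, q, df_ur)

print(f"F from SSRs       = {f_from_ssr:.2f}")
print(f"F from R-squareds = {f_from_r2:.2f}")
```

The two results differ only through rounding of the inputs, since both formulas are algebraically identical once $SSR = SST(1 - R^2)$ is substituted. The Stata `test` command that follows performs the same joint test directly after the unrestricted regression.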
test (bavg=0) (hrunsyr=0) (rbisyr=0)