Estimating Model Parameters

12.2 Estimating Model Parameters

  We will assume in this and the next several sections that the variables x and y are related according to the simple linear regression model. The values of $\beta_0$, $\beta_1$, and $\sigma^2$ will almost never be known to an investigator. Instead, sample data consisting of n observed pairs $(x_1, y_1), \ldots, (x_n, y_n)$ will be available, from which the model parameters and the true regression line itself can be estimated. These observations are assumed to have been obtained independently of one another. That is, $y_i$ is the observed value of $Y_i$, where $Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$ and the n deviations $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ are independent rv's. Independence of $Y_1, Y_2, \ldots, Y_n$ follows from independence of the $\epsilon_i$'s.
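  To make the sampling assumption concrete, here is a minimal Python sketch that generates n pairs from $Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$ with independent deviations. The function name `simulate_pairs`, the parameter values, and the choice of normally distributed deviations are assumptions made for illustration only; they do not come from any data set in the text.

```python
import random


def simulate_pairs(xs, beta0, beta1, sigma, seed=0):
    """Return pairs (x_i, y_i) with y_i = beta0 + beta1 * x_i + e_i.

    Assumption: the deviations e_i are independent draws from a normal
    distribution with mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)  # private generator so results are repeatable
    return [(x, beta0 + beta1 * x + rng.gauss(0.0, sigma)) for x in xs]


# Hypothetical "true" parameters; in practice these are unknown.
pairs = simulate_pairs([1.0, 2.0, 3.0, 4.0, 5.0], beta0=2.0, beta1=0.5, sigma=1.0)
for x, y in pairs:
    print(x, round(y, 3))
```

  Because each deviation is drawn separately, the simulated $y_i$ values scatter independently about the true line, which is the situation the model describes.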

  According to the model, the observed points will be distributed about the true regression line in a random manner. Figure 12.6 shows a typical plot of observed pairs along with two candidates for the estimated regression line. Intuitively, the line $y = a_0 + a_1 x$ is not a reasonable estimate of the true line $y = \beta_0 + \beta_1 x$ because, if $y = a_0 + a_1 x$ were the true line, the observed points would almost surely have been closer to this line. The line $y = b_0 + b_1 x$ is a more plausible estimate because the observed points are scattered rather closely about this line.

  Figure 12.6 Two different estimates of the true regression line: $y = b_0 + b_1 x$ and $y = a_0 + a_1 x$

  Figure 12.6 and the foregoing discussion suggest that our estimate of $y = \beta_0 + \beta_1 x$ should be a line that provides in some sense a best fit to the observed data points. This is what motivates the principle of least squares, which can be traced back to the German mathematician Gauss (1777–1855). According to this principle, a line provides a good fit to the data if the vertical distances (deviations) from the observed points to the line are small (see Figure 12.7). The measure of the goodness of fit is the sum of the squares of these deviations. The best-fit line is then the one having the smallest possible sum of squared deviations.


  Figure 12.7 Deviations of observed data from line $y = b_0 + b_1 x$ (vertical axis: time to failure, hr; horizontal axis: applied stress, kg/mm$^2$)

  Principle of Least Squares

  The vertical deviation of the point $(x_i, y_i)$ from the line $y = b_0 + b_1 x$ is

  \[ \text{height of point} - \text{height of line} = y_i - (b_0 + b_1 x_i) \]

  The sum of squared vertical deviations from the points $(x_1, y_1), \ldots, (x_n, y_n)$ to the line is then

  \[ f(b_0, b_1) = \sum_{i=1}^{n} [y_i - (b_0 + b_1 x_i)]^2 \]

  The point estimates of $\beta_0$ and $\beta_1$, denoted by $\hat{\beta}_0$ and $\hat{\beta}_1$ and called the least squares estimates, are those values that minimize $f(b_0, b_1)$. That is, $\hat{\beta}_0$ and $\hat{\beta}_1$ are such that $f(\hat{\beta}_0, \hat{\beta}_1) \le f(b_0, b_1)$ for any $b_0$ and $b_1$. The estimated regression line or least squares line is then the line whose equation is $y = \hat{\beta}_0 + \hat{\beta}_1 x$.
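  The criterion is simple to evaluate directly. The sketch below computes the sum of squared vertical deviations for two candidate lines; the function name `sum_squared_deviations`, the four data pairs, and both candidate lines are made up for illustration and are not taken from the text.

```python
def sum_squared_deviations(pairs, b0, b1):
    """Sum over all points of [y_i - (b0 + b1 * x_i)]^2."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in pairs)


# Hypothetical observations (x_i, y_i), chosen only for illustration.
pairs = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 4.0)]

# Two candidate lines: one passing near the points, one far from them.
close = sum_squared_deviations(pairs, b0=1.5, b1=0.8)
far = sum_squared_deviations(pairs, b0=6.0, b1=-0.5)

print(close, far)
print(close < far)  # the line with the smaller sum fits better
```

  Under the principle of least squares, the candidate with the smaller value is the better fit, and the least squares line is the one for which no other choice of intercept and slope gives a smaller value.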

  The minimizing values of $b_0$ and $b_1$ are found by taking partial derivatives of $f(b_0, b_1)$ with respect to both $b_0$ and $b_1$, equating them both to zero [analogously to $f'(b) = 0$ in univariate calculus], and solving the equations

  \[ \frac{\partial f(b_0, b_1)}{\partial b_0} = \sum 2 (y_i - b_0 - b_1 x_i)(-1) = 0 \]

  \[ \frac{\partial f(b_0, b_1)}{\partial b_1} = \sum 2 (y_i - b_0 - b_1 x_i)(-x_i) = 0 \]

  Cancellation of the $-2$ factor and rearrangement gives the following system of equations, called the normal equations:

  \[ n b_0 + \left( \sum x_i \right) b_1 = \sum y_i \]

  \[ \left( \sum x_i \right) b_0 + \left( \sum x_i^2 \right) b_1 = \sum x_i y_i \]

  These equations are linear in the two unknowns $b_0$ and $b_1$. Provided that not all $x_i$'s are identical, the least squares estimates are the unique solution to this system.
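  Since the normal equations form a 2 x 2 linear system, they can be solved mechanically. The following sketch uses Cramer's rule and then checks that both partial derivatives of the sum of squared deviations vanish at the solution. The function names and the four data pairs are hypothetical, chosen for illustration, and Cramer's rule is just one convenient way to solve the system; it is not presented here as the text's own method.

```python
def solve_normal_equations(pairs):
    """Solve
         n*b0      + (sum x)*b1   = sum y
         (sum x)*b0 + (sum x^2)*b1 = sum x*y
    for (b0, b1) using Cramer's rule.
    """
    n = len(pairs)
    sum_x = sum(x for x, _ in pairs)
    sum_y = sum(y for _, y in pairs)
    sum_xx = sum(x * x for x, _ in pairs)
    sum_xy = sum(x * y for x, y in pairs)

    # Determinant of the coefficient matrix; it is zero when all x_i coincide.
    # Simplification: an exact comparison with zero is used here, which is
    # adequate for small illustrative inputs but not a robust numerical test.
    det = n * sum_xx - sum_x * sum_x
    if det == 0:
        raise ValueError("all x values are identical; no unique solution")

    b0 = (sum_y * sum_xx - sum_x * sum_xy) / det
    b1 = (n * sum_xy - sum_x * sum_y) / det
    return b0, b1


def partial_derivatives(pairs, b0, b1):
    """Partial derivatives of the sum of squared deviations at (b0, b1)."""
    d_b0 = sum(2 * (y - b0 - b1 * x) * (-1) for x, y in pairs)
    d_b1 = sum(2 * (y - b0 - b1 * x) * (-x) for x, y in pairs)
    return d_b0, d_b1


# Hypothetical observations (x_i, y_i), chosen only for illustration.
pairs = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 4.0)]
b0_hat, b1_hat = solve_normal_equations(pairs)
d_b0, d_b1 = partial_derivatives(pairs, b0_hat, b1_hat)

print(b0_hat, b1_hat)
print(d_b0, d_b1)  # both should be zero up to floating-point rounding
```

  For these made-up pairs the sums are n = 4, $\sum x_i = 10$, $\sum y_i = 14$, $\sum x_i^2 = 30$, and $\sum x_i y_i = 39$, so the system has intercept 1.5 and slope 0.8 as its solution. The `ValueError` branch corresponds to the case excluded above, in which all $x_i$'s are identical.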


  PROPOSITION

  The least squares estimate of the slope coefficient $\beta_1$ of the true regression line is
