
CHAPTER 12 Simple Linear Regression and Correlation

Example 12.6 Japan’s high population density has resulted in a multitude of resource-usage problems. One especially serious difficulty concerns waste removal. The article “Innovative Sludge Handling Through Pelletization Thickening” (Water Research, 1999: 3245–3252) reported the development of a new compression machine for processing sewage sludge. An important part of the investigation involved relating the moisture content of compressed pellets (y, in %) to the machine’s filtration rate (x, in kg-DS/m/hr). The following data was read from a graph in the article:

  Relevant summary quantities (summary statistics) are $\sum x_i = 2817.9$, $\sum y_i = 1574.8$, $\sum x_i^2 = 415{,}949.85$, $\sum x_i y_i = 222{,}657.88$, and $\sum y_i^2 = 124{,}039.58$, from which $\bar{x} = 140.895$, $\bar{y} = 78.74$, $S_{xx} = 18{,}921.8295$, and $S_{xy} = 776.434$. Thus

  $$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{776.434}{18{,}921.8295} = .04103377 \approx .041$$

  $$\hat{\beta}_0 = 78.74 - (.04103377)(140.895) = 72.958547 \approx 72.96$$

  from which the equation of the least squares line is $y = 72.96 + .041x$. For numerical accuracy, the fitted values are calculated from $\hat{y}_i = 72.958547 + .04103377x_i$:

  $$\hat{y}_1 = 72.958547 + .04103377(125.3) \approx 78.100, \quad y_1 - \hat{y}_1 \approx -.200, \text{ etc.}$$

  Nine of the 20 residuals are negative, so the corresponding nine points in a scatter plot of the data lie below the estimated regression line. All predicted values (fits) and residuals appear in the accompanying table.

  [Table columns: Obs, Filtrate, Moistcon, Fit, Residual]

  ■
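The arithmetic of Example 12.6 can be checked with a short script. This is a minimal sketch, assuming only the summary quantities quoted in the example and $n = 20$; the function name `least_squares_from_sums` is hypothetical and introduced here purely for illustration.

```python
def least_squares_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (slope, intercept) of the least squares line from summary sums."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    # S_xx and S_xy via the usual shortcut formulas
    s_xx = sum_x2 - sum_x ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Summary quantities from Example 12.6 (n = 20 observations)
b1, b0 = least_squares_from_sums(20, 2817.9, 1574.8, 415949.85, 222657.88)

# Fitted value and residual for the first observation (x_1 = 125.3, fit of about 78.100)
fit_1 = b0 + b1 * 125.3

print(f"slope     = {b1:.8f}")
print(f"intercept = {b0:.6f}")
print(f"fit_1     = {fit_1:.3f}")
```

Keeping the unrounded slope and intercept in `b1` and `b0` mirrors the example's advice to compute fitted values from the full-precision coefficients rather than from 72.96 and .041.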

  12.2 Estimating Model Parameters

  In much the same way that the deviations from the mean in a one-sample situation were combined to obtain the estimate $s^2 = \sum (x_i - \bar{x})^2/(n - 1)$, the estimate of $\sigma^2$ in regression analysis is based on squaring and summing the residuals. We will continue to use the symbol $s^2$ for this estimated variance, so don’t confuse it with our previous $s^2$.

  DEFINITION

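  The error sum of squares (equivalently, residual sum of squares), denoted by SSE, is

  $$\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum [y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2$$

  and the estimate of $\sigma^2$ is

  $$\hat{\sigma}^2 = s^2 = \frac{\text{SSE}}{n - 2} = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2}$$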

  The divisor $n - 2$ in $s^2$ is the number of degrees of freedom (df) associated with SSE and the estimate $s^2$. This is because to obtain $s^2$, the two parameters $\beta_0$ and $\beta_1$ must first be estimated, which results in a loss of 2 df (just as $\mu$ had to be estimated in one-sample problems, resulting in an estimated variance based on $n - 1$ df). Replacing each $y_i$ in the formula for $s^2$ by the rv $Y_i$ gives the estimator $S^2$. It can be shown that $S^2$ is an unbiased estimator for $\sigma^2$ (though the estimator $S$ is not unbiased for $\sigma$). An interpretation of $s$ here is similar to what we suggested earlier for the sample standard deviation: very roughly, it is the size of a typical vertical deviation within the sample from the estimated regression line.
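The definition can be traced step by step in code. The sketch below is illustrative only: the function name `error_variance` and the four-point data set are hypothetical (they are not the sludge data), chosen so the residuals are easy to verify by hand.

```python
def error_variance(x, y):
    """Fit the least squares line and return (SSE, s^2) with divisor n - 2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx            # estimated slope
    b0 = y_bar - b1 * x_bar     # estimated intercept
    # Residuals y_i - (b0 + b1 * x_i), squared and summed to give SSE
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(r ** 2 for r in residuals)
    # Two parameters were estimated, so SSE carries n - 2 degrees of freedom
    return sse, sse / (n - 2)


# Hypothetical data, for illustration only
x_demo = [1.0, 2.0, 3.0, 4.0]
y_demo = [2.0, 4.0, 5.0, 8.0]
sse_demo, s2_demo = error_variance(x_demo, y_demo)
print(sse_demo, s2_demo)
```

For this toy data the fitted line is $y = 1.9x$, the residuals are $.1, .2, -.7, .4$, and so SSE is $.70$ on $4 - 2 = 2$ df.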

Example 12.7 The residuals for the filtration rate–moisture content data were calculated previously. The corresponding error sum of squares is

  $$\text{SSE} = (-.200)^2 + (-.188)^2 + \cdots + (1.099)^2 = 7.968$$

  The estimate of $\sigma^2$ is then $\hat{\sigma}^2 = s^2 = 7.968/(20 - 2) = .4427$, and the estimated standard deviation is $\hat{\sigma} = s = \sqrt{.4427} = .665$. Roughly speaking, .665 is the magnitude of a typical deviation from the estimated regression line; some points are closer to the line than this and others are further away.
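As a quick check of this arithmetic, the following sketch takes the SSE value 7.968 and $n = 20$ from the example as given and recomputes the two estimates; the variable names are illustrative.

```python
import math

sse = 7.968   # error sum of squares reported in the example
n = 20        # number of observations

s2 = sse / (n - 2)   # estimated variance, n - 2 degrees of freedom
s = math.sqrt(s2)    # estimated standard deviation

print(f"s^2 = {s2:.4f}")
print(f"s   = {s:.3f}")
```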

  ■

  Computation of SSE from the defining formula involves much tedious arithmetic, because both the predicted values and residuals must first be calculated. Use of the following computational formula does not require these quantities.

  $$\text{SSE} = \sum y_i^2 - \hat{\beta}_0 \sum y_i - \hat{\beta}_1 \sum x_i y_i$$

  This expression results from substituting $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ into $\sum (y_i - \hat{y}_i)^2$, squaring the summand, carrying through the sum to the resulting three terms, and simplifying. This computational formula is especially sensitive to the effects of rounding in $\hat{\beta}_0$ and $\hat{\beta}_1$, so carrying as many digits as possible in intermediate computations will protect against round-off error.
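The sensitivity to rounding can be seen directly with the summary quantities from Example 12.6. This is a minimal sketch; the function name `sse_computational` is hypothetical, and the "rounded" coefficients are simply the two-decimal and three-decimal values 72.96 and .041 quoted for the fitted line.

```python
def sse_computational(sum_y2, sum_y, sum_xy, b0, b1):
    """SSE from the computational formula: sum(y^2) - b0*sum(y) - b1*sum(xy)."""
    return sum_y2 - b0 * sum_y - b1 * sum_xy


# Summary quantities from Example 12.6
n = 20
sum_x, sum_y = 2817.9, 1574.8
sum_x2, sum_xy, sum_y2 = 415949.85, 222657.88, 124039.58

# Full-precision least squares estimates
s_xx = sum_x2 - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n
b1 = s_xy / s_xx
b0 = sum_y / n - b1 * sum_x / n

sse_full = sse_computational(sum_y2, sum_y, sum_xy, b0, b1)
sse_rounded = sse_computational(sum_y2, sum_y, sum_xy, 72.96, 0.041)

print(f"SSE with full-precision coefficients: {sse_full:.3f}")
print(f"SSE with rounded coefficients:        {sse_rounded:.3f}")
```

With the full-precision coefficients the formula reproduces the SSE of Example 12.7 to three decimals, while the rounded coefficients move the result by more than a full unit, because the formula subtracts large, nearly equal quantities.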

