Other issues in Multiple Regression

13.5 Other Issues in Multiple Regression

In this section, we touch upon a number of issues that may arise when a multiple regression analysis is carried out. Consult the chapter references for a more extensive treatment of any particular topic.

Transformations

Sometimes, theoretical considerations suggest a nonlinear relation between a dependent variable and two or more independent variables, whereas on other occasions diagnostic plots indicate that some type of nonlinear function should be used. Frequently, a transformation will linearize the model.

Example 13.18

Natural single crystal diamond has been widely used in ultraprecision machining. However, its application to the cutting of ferrous metals has been problematic due to significant tool wear. The article “Investigation on Frictional Wear of Single Crystal Diamond Against Ferrous Metals” (Intl. J. of Refractory Metals and Hard Materials, 2013: 174–179) presented the accompanying data on $x_1$ = mechanical force (N), $x_2$ = sliding velocity (m/s), $x_3$ = carbon content (%), and $y$ = graphitized degree, a measure of diamond wear.

The investigators proposed and fit the multiplicative power regression model $Y = \alpha x_1^{\beta_1} x_2^{\beta_2} x_3^{\beta_3} \varepsilon$. Taking the natural logarithm of both sides of this equation gives

$$\ln(Y) = \ln(\alpha) + \beta_1 \ln(x_1) + \beta_2 \ln(x_2) + \beta_3 \ln(x_3) + \ln(\varepsilon) \tag{13.21}$$

which is our general additive multiple regression equation with the dependent variable being the natural log of graphitized degree and predictors $\ln(x_1)$, $\ln(x_2)$, and $\ln(x_3)$.

Presuming that $\varepsilon$ in the original model equation has a lognormal distribution, the random error $\ln(\varepsilon)$ in our transformed model will be normally distributed. The plausibility of this assumption can be checked with a normal probability plot of the standardized residuals resulting from fitting the transformed model.
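As a rough illustration of why the log transformation helps, the following sketch simulates data from a power model with lognormal error and then recovers the parameters by ordinary least squares on the logged variables. The parameter values, sample size, and predictor ranges are invented for the demonstration (they are not the article's data), and `fit_ols` is a hypothetical helper written here rather than a library routine.

```python
import math
import random

def fit_ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        piv = A[col][col]
        A[col] = [v / piv for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]

# Hypothetical "true" parameters for the simulation (assumptions, not estimates).
alpha = 0.08
betas = (0.37, 0.59, -0.02)
sigma = 0.01                      # standard deviation of ln(error)

rng = random.Random(0)            # fixed seed so the run is repeatable
rows, log_y = [], []
for _ in range(40):
    x = [math.exp(rng.uniform(-0.7, 0.7)) for _ in betas]
    eps = rng.lognormvariate(0.0, sigma)          # lognormal multiplicative error
    y = alpha * math.prod(xi ** b for xi, b in zip(x, betas)) * eps
    rows.append([1.0] + [math.log(xi) for xi in x])   # intercept, ln(x1..x3)
    log_y.append(math.log(y))

est = fit_ols(rows, log_y)        # [ln(alpha)-hat, beta1-hat, beta2-hat, beta3-hat]
alpha_hat = math.exp(est[0])      # back-transform the intercept
print("estimated ln(alpha), betas:", [round(v, 4) for v in est])
print("estimated alpha:", round(alpha_hat, 4))
```

Because the error enters multiplicatively, it becomes an additive normal error after taking logs, which is what makes the ordinary linear fit appropriate here.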

Table 13.4 shows Minitab output from fitting (13.21). The $R^2$ value is quite impressive: about 98% of the observed variation in $\ln(y)$ can be attributed to the model relationship, and adjusted $R^2$ is only slightly smaller than $R^2$ itself. Furthermore, the P-value for the model utility F test is .000 (the area under the $F_{3,5}$ curve to the right of 81.16), implying a useful relationship between $\ln(y)$ and at least one of the three predictors. The point estimates of $\beta_1$, $\beta_2$, and $\beta_3$ are .36557, .59366, and −.02074, respectively. The point estimate of $\ln(\alpha)$ is −2.53727, so the point estimate of $\alpha$ itself is $e^{-2.53727} = .079082$. The estimated original regression function is then $.079\,x_1^{.366} x_2^{.594} x_3^{-.021}$; this appears in the cited article.
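The arithmetic behind these summary quantities can be checked with a short sketch. It back-transforms the intercept and recomputes adjusted $R^2$ and the model utility F ratio from the rounded $R^2$. The sample size n = 9 is an inference from the error degrees of freedom of $F_{3,5}$, not a figure stated in the text, and the formulas are the standard ones relating $R^2$, adjusted $R^2$, and F.

```python
import math

ln_alpha_hat = -2.53727          # point estimate of ln(alpha) quoted in the text
alpha_hat = math.exp(ln_alpha_hat)

r2 = 0.980                       # R-Sq from Table 13.4, rounded to three decimals
k = 3                            # number of predictors
n = 9                            # assumption: error df = n - (k + 1) = 5 implies n = 9

# Adjusted R^2 penalizes for the number of predictors.
adj_r2 = 1 - (n - 1) / (n - (k + 1)) * (1 - r2)

# Model utility F ratio expressed through R^2.
f_ratio = (r2 / k) / ((1 - r2) / (n - (k + 1)))

print(f"alpha_hat = {alpha_hat:.6f}")
print(f"adjusted R^2 = {adj_r2:.3f}")
print(f"F from rounded R^2 = {f_ratio:.2f}")
```

The F ratio computed this way comes out near 81.7 rather than the reported 81.16; the gap is what one would expect from using $R^2$ rounded to .980 instead of its unrounded value.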


Table 13.4 Minitab output for the transformed regression in Example 13.18

The regression equation is
ln(y) = −2.54 + 0.366 ln(x1) + 0.594 ln(x2) − 0.0207 ln(x3)

S = 0.0372066   R-Sq = 98.0%   R-Sq(adj) = 96.8%

A point prediction of the value of graphitized degree when force = 20, velocity = 1, and carbon content = .25 requires that we first obtain a point prediction of $\ln(Y)$ by substituting $\ln(20)$, $\ln(1)$, and $\ln(.25)$ into the estimated regression equation in Table 13.4. The result is $\ln(\hat{y}) = -1.4134$, which appears in the last line of Minitab output. Then $\hat{y} = e^{-1.4134} = .243$. Similarly, the output gives a 95% PI for $\ln(Y)$, so a PI for $Y$ itself is $(e^{-1.5150}, e^{-1.3118}) = (.220, .269)$.
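The prediction steps can be reproduced with a few lines of Python. The coefficients and the prediction-interval endpoints for $\ln(Y)$ are the values quoted in the text; the function name `predict_ln_y` and the dictionary keys are illustrative choices, not part of any software output.

```python
import math

# Point estimates quoted in the text for the transformed model (13.21).
coef = {"const": -2.53727, "ln_x1": 0.36557, "ln_x2": 0.59366, "ln_x3": -0.02074}

def predict_ln_y(force, velocity, carbon):
    """Point prediction of ln(Y) from the fitted log-log equation."""
    return (coef["const"]
            + coef["ln_x1"] * math.log(force)
            + coef["ln_x2"] * math.log(velocity)
            + coef["ln_x3"] * math.log(carbon))

ln_y_hat = predict_ln_y(force=20, velocity=1, carbon=0.25)
y_hat = math.exp(ln_y_hat)                 # back-transform to the original scale

pi_ln_y = (-1.5150, -1.3118)               # 95% PI for ln(Y), as quoted in the text
pi_y = tuple(math.exp(v) for v in pi_ln_y) # exponentiate both endpoints

print(f"ln(y_hat) = {ln_y_hat:.4f}, y_hat = {y_hat:.3f}")
print(f"95% PI for Y: ({pi_y[0]:.3f}, {pi_y[1]:.3f})")
```

Exponentiating the endpoints is legitimate because the exponential function is increasing, so the interval for $\ln(Y)$ maps directly onto an interval for $Y$ with the same prediction level.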

The normal probability plot of Figure 13.20 exhibits a substantial linear pattern, validating the normality assumption for $\ln(\varepsilon)$. And the plot of standardized residuals versus predicted values [of $\ln(y)$] does not show any pattern other than pure randomness, indicating no violation of model assumptions. However, looking back at Table 13.4, the P-value for testing $H_0\colon \beta_3 = 0$ is .246. Thus it appears that as long as $\ln(x_1)$ and $\ln(x_2)$ remain in the model, there is no useful information about the response variable contained in the natural log of carbon content. Deleting that predictor and refitting gives $R^2 = .973$ and a model utility F ratio of 107.87. The estimates of $\beta_1$ and $\beta_2$ are almost identical to those for the three-predictor model. Also, the multiple exponential regression model $Y = \alpha e^{\beta_1 x_1 + \beta_2 x_2}\,\varepsilon$ [for which $\ln(Y)$ is regressed against $x_1$ and $x_2$ rather than against $\ln(x_1)$ and $\ln(x_2)$] fits the data about as well as does the power model. None of this was mentioned in the cited article.

Figure 13.20 Standardized residual plot and normal probability plot for Example 13.18

The logistic regression model was introduced in Section 13.2 to relate a dichotomous variable $y$ to a single predictor. This model can be extended in an obvious way to incorporate more than one predictor. The probability of success $p$ is now a function of the predictors $x_1, x_2, \dots, x_k$:

$$p(x_1, \dots, x_k) = \frac{e^{\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k}}{1 + e^{\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k}}$$

Simple algebra yields an expression for the odds:

$$\frac{p(x_1, \dots, x_k)}{1 - p(x_1, \dots, x_k)} = e^{\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k}$$

The interpretation of $\beta_i$ ($i = 1, \dots, k$) is analogous to the interpretation for $\beta_1$ given in the logit function containing only a single predictor $x$. That is, the following argument shows that the odds change by the multiplicative factor $e^{\beta_i}$ when $x_i$ increases by 1 unit and all other predictors remain fixed.

$$\begin{aligned}
\frac{p(x_1, \dots, x_i + 1, \dots, x_k)}{1 - p(x_1, \dots, x_i + 1, \dots, x_k)} &= e^{\beta_0 + \beta_1 x_1 + \dots + \beta_i (x_i + 1) + \dots + \beta_k x_k} \\
&= e^{\beta_0 + \beta_1 x_1 + \dots + \beta_i x_i + \dots + \beta_k x_k + \beta_i} \\
&= e^{\beta_i} \cdot \frac{p(x_1, \dots, x_k)}{1 - p(x_1, \dots, x_k)}
\end{aligned}$$
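The multiplicative effect on the odds can also be confirmed numerically. The sketch below evaluates the logistic function at a point and again after raising one predictor by 1 unit, then compares the ratio of the two odds with the exponentiated coefficient. The coefficient vector and predictor values are hypothetical numbers chosen only for the check.

```python
import math

def success_probability(betas, x):
    """Logistic p(x_1, ..., x_k), with betas = [b0, b1, ..., bk]."""
    eta = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
    return math.exp(eta) / (1 + math.exp(eta))

def odds(p):
    """Odds corresponding to a success probability p."""
    return p / (1 - p)

betas = [-1.0, 0.5, -0.25, 0.8]   # hypothetical b0, b1, b2, b3
x = [2.0, 1.0, 0.0]               # hypothetical predictor values
i = 1                             # 0-based position in x, i.e. the second predictor

x_plus = list(x)
x_plus[i] += 1                    # raise that predictor by 1 unit, others fixed

ratio = odds(success_probability(betas, x_plus)) / odds(success_probability(betas, x))
factor = math.exp(betas[i + 1])   # e^{beta_i} for the predictor that changed

print(f"odds ratio = {ratio:.6f}, exp(beta) = {factor:.6f}")
```

The two printed numbers agree up to floating-point rounding regardless of the starting point `x`, which is the content of the derivation: the factor depends only on the coefficient of the predictor that changed.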

Again, statistical software must be used to estimate parameters, calculate relevant standard deviations, and provide other inferential information.

Example 13.19

Data were obtained from 189 women who gave birth during a particular period at the Bayside Medical Center in Springfield, MA, in order to identify factors associated with low birth weight. The accompanying Minitab output resulted from a logistic regression in which the dependent variable indicated whether (1) or not (0) a child had low birth weight (< 2500 g), and predictors were weight of the mother at her last menstrual period, age of the mother, and an indicator variable for whether (1) or not (0) the mother had smoked during pregnancy.

Logistic Regression Table (columns: Predictor, Coef, SE Coef, Odds Ratio, 95% CI Lower, 95% CI Upper)

It appears that age is not an important predictor of low birth weight (LBW), provided that the two other predictors are retained. The other two predictors do appear to be informative. The point estimate of the odds ratio associated with smoking status is 1.92 [ratio of the odds of LBW for a smoker to the odds for a nonsmoker, where odds = $P(Y = 1)/P(Y = 0)$]; at the 95% confidence level, the odds of a low-birth-weight child could be as much as 3.7 times as high for a smoker as for a nonsmoker.
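The link between a fitted coefficient and its odds ratio can be sketched as follows. The helper `odds_ratio_with_ci` is hypothetical, and it assumes the usual large-sample (Wald) construction in which an interval for the coefficient is exponentiated endpoint by endpoint; the coefficient 0.5 and standard error 0.2 passed to it are placeholders, not values from the output. The only number taken from the text is the odds ratio 1.92, whose logarithm is the coefficient it implies.

```python
import math

def odds_ratio_with_ci(coef, se, z=1.96):
    """Odds ratio e^coef with an approximate 95% interval, assuming a Wald
    interval coef +/- z*se on the log-odds scale (an assumption of this sketch)."""
    return math.exp(coef), (math.exp(coef - z * se), math.exp(coef + z * se))

# Placeholder inputs, for illustration only.
or_hat, (lower, upper) = odds_ratio_with_ci(coef=0.5, se=0.2)
print(f"odds ratio = {or_hat:.3f}, interval = ({lower:.3f}, {upper:.3f})")

# Coefficient implied by the smoking odds ratio reported in the text.
implied_coef = math.log(1.92)
print(f"implied smoking coefficient = {implied_coef:.3f}")
```

An interval built this way is symmetric around the coefficient on the log scale but not around the odds ratio itself, so an upper limit well above the point estimate is typical.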

Please see one of the chapter references for more information on logistic regression, including methods for assessing model effectiveness and adequacy.
