  Example 13.17 Figure 13.16 shows a normal probability plot of the standardized residuals for the adsorption data and fitted model given in Example 13.15. The straightness of the plot casts little doubt on the assumption that the random deviation $\epsilon$ is normally distributed.

  Figure 13.16 A normal probability plot of the standardized residuals for the data and model of Example 13.15 (vertical axis: standardized residual; horizontal axis: z percentile)

  Figure 13.17 shows the other suggested plots for the adsorption data. Given that there are only 13 observations in the data set, there is not much evidence of a pattern in any of the first three plots other than randomness. The point at the bottom of each of these three plots corresponds to the observation with the large residual. We will say more about such observations subsequently. For the moment, there is no compelling reason for remedial action.

  CHAPTER 13 Nonlinear and Multiple Regression

  Figure 13.17 Diagnostic plots for the adsorption data: (a) standardized residual versus $x_1$; (b) standardized residual versus $x_2$; (c) standardized residual versus $\hat{y}$; (d) $\hat{y}$ versus $y$

  ■
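  The pairing behind a normal probability plot such as Figure 13.16 can be sketched in a few lines. The sketch below assumes the standardized residuals have already been computed; the 13 values in `example` are hypothetical and are not the adsorption data. It pairs the ordered residuals with the $100(i - .5)/n$th standard normal percentiles and reports the correlation of the pairs as a rough numerical check on straightness. The function names are illustrative only.

```python
from statistics import NormalDist


def normal_plot_points(std_resid):
    """Pair the i-th smallest residual with the 100(i - .5)/n-th z percentile."""
    n = len(std_resid)
    ordered = sorted(std_resid)
    z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z, ordered))


def correlation(pairs):
    """Sample correlation coefficient of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical standardized residuals (13 values, one of them large and negative).
example = [0.41, -0.62, 1.05, -2.21, 0.18, 0.77, -0.35,
           1.32, -0.09, 0.55, -0.88, 0.23, -0.47]

points = normal_plot_points(example)
r = correlation(points)
print(points[0], round(r, 3))
```

  A correlation close to 1 corresponds to a nearly straight plot. A single large residual, like the one noted in the text, shows up as the leftmost point falling well below the line through the others.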

  EXERCISES Section 13.4 (36–54)

  36. Cardiorespiratory fitness is widely recognized as a major component of overall physical well-being. Direct measurement of maximal oxygen uptake (VO$_2$max) is the single best measure of such fitness, but direct measurement is time-consuming and expensive. It is therefore desirable to have a prediction equation for VO$_2$max in terms of easily obtained quantities. Consider the variables

  $y$ = VO$_2$max (L/min)

  $x_1$ = weight (kg)

  $x_2$ = age (yr)

  $x_3$ = time necessary to walk 1 mile (min)

  $x_4$ = heart rate at the end of the walk (beats/min)

  Here is one possible model, for male students, consistent with the information given in the article “Validation of the Rockport Fitness Walking Test in College Males and Females” (Research Quarterly for Exercise and Sport, 1994: 152–158):

  $Y = 5.0 + .01x_1 - .05x_2 - .13x_3 - .01x_4 + \epsilon \qquad \sigma = .4$

  a. Interpret $\beta_1$ and $\beta_3$.

  b. What is the expected value of VO$_2$max when weight is 76 kg, age is 20 yr, walk time is 12 min, and heart rate is 140 b/m?

  c. What is the probability that VO$_2$max will be between 1.00 and 2.60 for a single observation made when the values of the predictors are as stated in part (b)?

  37. A trucking company considered a multiple regression model for relating the dependent variable $y$ = total daily travel time for one of its drivers (hours) to the predictors $x_1$ = distance traveled (miles) and $x_2$ = the number of deliveries made. Suppose that the model equation is

  $Y = -.800 + .060x_1 + .900x_2 + \epsilon$

  a. What is the mean value of travel time when distance traveled is 50 miles and three deliveries are made?

  b. How would you interpret $\beta_1 = .060$, the coefficient of the predictor $x_1$? What is the interpretation of $\beta_2 = .900$?

  c. If $\sigma = .5$ hour, what is the probability that travel time will be at most 6 hours when three deliveries are made and the distance traveled is 50 miles?
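  The calculation pattern behind Exercise 37(a) and (c) can be sketched as follows. The sketch assumes the model equation reads $Y = -.800 + .060x_1 + .900x_2 + \epsilon$ with $\epsilon$ normally distributed and $\sigma = .5$; the function name `mean_travel_time` is illustrative and does not come from the text.

```python
from statistics import NormalDist


def mean_travel_time(distance, deliveries):
    """Mean of Y under the assumed model Y = -.800 + .060*x1 + .900*x2 + error."""
    return -0.800 + 0.060 * distance + 0.900 * deliveries


# Part (a): mean travel time for 50 miles and three deliveries.
mu = mean_travel_time(50, 3)

# Part (c): Y is normal with mean mu and standard deviation sigma,
# so P(Y <= 6) is the normal cdf evaluated at 6.
sigma = 0.5
p_at_most_6 = NormalDist(mu, sigma).cdf(6.0)

print(round(mu, 3), round(p_at_most_6, 4))
```

  The same two steps (evaluate the regression function, then standardize) apply to Exercise 36(b) and (c), with the probability of an interval obtained as a difference of two cdf values.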

  13.4 Multiple Regression Analysis

  38. Let $y$ = wear life of a bearing, $x_1$ = oil viscosity, and $x_2$ = load. Suppose that the multiple regression model relating life to viscosity and load is

  $Y = 125.0 + 7.75x_1 + .0950x_2 - .0090x_1x_2 + \epsilon$

  a. What is the mean value of life when viscosity is 40 and load is 1100?

  b. When viscosity is 30, what is the change in mean life associated with an increase of 1 in load? When viscosity is 40, what is the change in mean life associated with an increase of 1 in load?

  39. Let $y$ = sales at a fast-food outlet (1000s of dollars), $x_1$ = number of competing outlets within a 1-mile radius, $x_2$ = population within a 1-mile radius (1000s of people), and $x_3$ be an indicator variable that equals 1 if the outlet has a drive-up window and 0 otherwise. Suppose that the true regression model is

  $Y = 10.00 - 1.2x_1 + 6.8x_2 + 15.3x_3 + \epsilon$

  a. What is the mean value of sales when the number of competing outlets is 2, there are 8000 people within a 1-mile radius, and the outlet has a drive-up window?

  b. What is the mean value of sales for an outlet without a drive-up window that has three competing outlets and 5000 people within a 1-mile radius?

  c. Interpret $\beta_3$.

  40. The article “Readability of Liquid Crystal Displays: A Response Surface” (Human Factors, 1983: 185–190) used a multiple regression model with four independent variables to study accuracy in reading liquid crystal displays. The variables were

  $y$ = error percentage for subjects reading a four-digit liquid crystal display

  $x_1$ = level of backlight (ranging from 0 to 122 cd/m$^2$)

  $x_2$ = character subtense (ranging from $.025^\circ$ to $1.34^\circ$)

  $x_3$ = viewing angle (ranging from $0^\circ$ to $60^\circ$)

  $x_4$ = level of ambient light (ranging from 20 to 1500 lux)

  The model fit to data was $Y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3 + \beta_4x_4 + \epsilon$. The resulting estimated coefficients were $\hat{\beta}_0 = 1.52$, $\hat{\beta}_1 = .02$, $\hat{\beta}_2 = -1.40$, $\hat{\beta}_3 = .02$, and $\hat{\beta}_4 =$ [value missing].

  a. Calculate an estimate of expected error percentage when [values of $x_1, \dots, x_4$ missing].

  b. Estimate the mean error percentage associated with a backlight level of 20, character subtense of .5, viewing angle of 10, and ambient light level of 30.

  c. What is the estimated expected change in error percentage when the level of ambient light is increased by 1 unit while all other variables are fixed at the values given in part (a)? Answer for a 100-unit increase in ambient light level.

  d. Explain why the answers in part (c) do not depend on the fixed values of $x_1$, $x_2$, and $x_3$. Under what conditions would there be such a dependence?

  e. The estimated model was based on $n = 30$ observations, with SST = 39.2 and SSE = 20.0. Calculate and interpret the coefficient of multiple determination, and then carry out the model utility test using $\alpha = .05$.

  41. The ability of ecologists to identify regions of greatest species richness could have an impact on the preservation of genetic diversity, a major objective of the World Conservation Strategy. The article “Prediction of Rarities from Habitat Variables: Coastal Plain Plants on Nova Scotian Lakeshores” (Ecology, 1992: 1852–1859) used a sample of $n = 37$ lakes to obtain the estimated regression equation

  $y = 3.89 + .033x_1 + .024x_2 + .023x_3 - .0080x_4 - .13x_5 - .72x_6$

  where $y$ = species richness, $x_1$ = watershed area, $x_2$ = shore width, $x_3$ = poor drainage (%), $x_4$ = water color (total color units), $x_5$ = sand (%), and $x_6$ = alkalinity. The coefficient of multiple determination was reported as $R^2 = .83$. Carry out a test of model utility.

  42. An investigation of a die-casting process resulted in the accompanying data on $x_1$ = furnace temperature, $x_2$ = die close time, and $y$ = temperature difference on the die surface (“A Multiple-Objective Decision-Making Approach for Assessing Simultaneous Improvement in Die Life and Casting Quality in a Die Casting Process,” Quality Engineering, 1994: 371–383).

  [Data table for Exercise 42; only fragments survive]

  $x_1$

  $x_2$   6  7  6  7  6

  Minitab output from fitting the multiple regression model with predictors $x_1$ and $x_2$ is given here.

  The regression equation is

  tempdiff = -200 + 0.210 furntemp + 3.00 clostime

  Stdev   t-ratio   p

  s = 1.058   R-sq = 99.1%   R-sq(adj) = 98.8%

  Analysis of Variance

  Regression

  Error

  Total

  a. Carry out the model utility test.

  b. Calculate and interpret a 95% confidence interval for $\beta_2$, the population regression coefficient of $x_2$.


  c. When $x_1 = 1300$ and $x_2 = 7$, the estimated standard deviation of $\hat{Y}$ is $s_{\hat{Y}} = .353$. Calculate a 95% confidence interval for true average temperature difference when furnace temperature is 1300 and die close time is 7.

  d. Calculate a 95% prediction interval for the temperature difference resulting from a single experimental run with a furnace temperature of 1300 and a die close time of 7.

  43. An experiment carried out to study the effect of the mole contents of cobalt ($x_1$) and the calcination temperature ($x_2$) on the surface area of an iron-cobalt hydroxide catalyst ($y$) resulted in the accompanying data (“Structural Changes and Surface Properties of Co$_x$Fe$_{3-x}$O$_4$ Spinels,” J. of Chemical Tech. and Biotech., 1994: 161–170). A request to the SAS package to fit $\beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3$, where $x_3 = x_1x_2$ (an interaction predictor) yielded the output below.

  [Data table for Exercise 43; only fragments survive]

  $x_1$   .6  1.0

  $y$

  $x_1$   2.6

  $y$   19.6  17.8  9.1  53.1  52.0  43.4  42.4

  $x_1$   2.6  2.8

  $y$   31.6  40.9  37.9  27.5  27.3  19.0

  SAS output for Exercise 43

  Dependent Variable: SURFAREA

  Analysis of Variance

  Source   DF   Sum of Squares   Mean Square   F Value   Prob > F

  C Total

  Root MSE   Dep Mean   C.V. 34.07314   Adj R-sq

  Parameter Estimates

  Variable   DF   Parameter Estimate   Standard Error   T for H0: Parameter = 0   Prob > |T|

  INTERCEP

  COBCON

  TEMP

  CONTEMP

  a. Predict the value of surface area when cobalt content is 2.6 and temperature is 250, and calculate the value of the corresponding residual.

  b. Since $\hat{\beta}_1 = -46.0$, is it legitimate to conclude that if cobalt content increases by 1 unit while the values of the other predictors remain fixed, surface area can be expected to decrease by roughly 46 units? Explain your reasoning.

  c. Does there appear to be a useful linear relationship between $y$ and the predictors?

  d. Given that mole contents and calcination temperature remain in the model, does the interaction predictor $x_3$ provide useful information about $y$? State and test the appropriate hypotheses using a significance level of .01.

  e. The estimated standard deviation of $\hat{Y}$ when mole contents is 2.0 and calcination temperature is 500 is $s_{\hat{Y}} = 4.69$. Calculate a 95% confidence interval for the mean value of surface area under these circumstances.

  44. The accompanying Minitab regression output is based on data that appeared in the article “Application of Design of Experiments for Modeling Surface Roughness in Ultrasonic Vibration Turning” (J. of Engr. Manuf., 2009: 641–652). The response variable is surface roughness ($\mu$m), and the independent variables are vibration amplitude ($\mu$m), depth of cut (mm), feed rate (mm/rev), and cutting speed (m/min), respectively.

  a. How many observations were there in the data set?

  b. Interpret the coefficient of multiple determination.

  c. Carry out a test of hypotheses to decide if the model specifies a useful relationship between the response variable and at least one of the predictors.

  d. Interpret the number 18.2602 that appears in the Coef column.


  e. At significance level .10, can any single one of the predictors be eliminated from the model provided that all of the other predictors are retained?

  f. The estimated SD of $\hat{Y}$ when the values of the four predictors are 10, .5, .25, and 50, respectively, is .1178. Calculate both a CI for true average roughness and a PI for the roughness of a single specimen, and compare these two intervals.

  The regression equation is

  Ra = -0.972 - 0.0312 a + 0.557 d + 18.3 f + 0.00282 v

  Predictor   Coef   SE Coef   T   P

  R-Sq = 88.6%   R-Sq(adj) = 88.0%

  Residual Error   76   51.36   0.68

  45. The article “Analysis of the Modeling Methodologies for Predicting the Strength of Air-Jet Spun Yarns” (Textile Res. J., 1997: 39–44) reported on a study carried out to relate yarn tenacity ($y$, in g/tex) to yarn count ($x_1$, in tex), percentage polyester ($x_2$), first nozzle pressure ($x_3$, in kg/cm$^2$), and second nozzle pressure ($x_4$, in kg/cm$^2$). The estimate of the constant term in the corresponding multiple regression equation was 6.121. The estimated coefficients for the four predictors were .082, .113, .256, and .219, respectively, and the coefficient of multiple determination was .946.

  a. Assuming that the sample size was $n = 25$, state and test the appropriate hypotheses to decide whether the fitted model specifies a useful linear relationship between the dependent variable and at least one of the four model predictors.

  b. Again using $n = 25$, calculate the value of adjusted $R^2$.

  c. Calculate a 99% confidence interval for true mean yarn tenacity when yarn count is 16.5, yarn contains 50% polyester, first nozzle pressure is 3, and second nozzle pressure is 5 if the estimated standard deviation of predicted tenacity under these circumstances is .350.

  46. A regression analysis carried out to relate $y$ = repair time for a water filtration system (hr) to $x_1$ = elapsed time since the previous service (months) and $x_2$ = type of repair (1 if electrical and 0 if mechanical) yielded the following model based on $n = 12$ observations: $y = .950 + .400x_1 + 1.250x_2$. In addition, SST = 12.72, SSE = 2.09, and $s_{\hat{\beta}_2} = .312$.

  a. Does there appear to be a useful linear relationship between repair time and the two model predictors? Carry out a test of the appropriate hypotheses using a significance level of .05.

  b. Given that elapsed time since the last service remains in the model, does type of repair provide useful information about repair time? State and test the appropriate hypotheses using a significance level of .01.

  c. Calculate and interpret a 95% CI for $\beta_2$.

  d. The estimated standard deviation of a prediction for repair time when elapsed time is 6 months and the repair is electrical is .192. Predict repair time under these circumstances by calculating a 99% prediction interval. Does the interval suggest that the estimated model will give an accurate prediction? Why or why not?

  47. Efficient design of certain types of municipal waste incinerators requires that information about energy content of the waste be available. The authors of the article “Modeling the Energy Content of Municipal Solid Waste Using Multiple Regression Analysis” (J. of the Air and Waste Mgmnt. Assoc., 1996: 650–656) kindly provided us with the accompanying data on $y$ = energy content (kcal/kg), the three physical composition variables $x_1$ = % plastics by weight, $x_2$ = % paper by weight, and $x_3$ = % garbage by weight, and the proximate analysis variable $x_4$ = % moisture by weight for waste specimens obtained from a certain region.

  [Data table for Exercise 47; only the rows below survive]

  Obs   Plastics   Paper   Garbage   Water   Energy Content

  8    23.97   19.39   44.11   43.82   1656

  25   17.74   23.61   37.36   49.92   1205

  26   20.54   26.58   35.40   53.58   1221

  27   18.25   13.77   51.32   51.38   1138

  Using Minitab to fit a multiple regression model with the four aforementioned variables as predictors of energy content resulted in the following output:


  The regression equation is

  enercont = 2245 + 28.9 plastics + 7.64 paper + 4.30 garbage [remaining term missing]

  s = 31.48   R-Sq = 96.4%   R-Sq(adj) = 95.8%

  Analysis of Variance

  Total

  a. Interpret the values of the estimated regression coefficients $\hat{\beta}_1$ and $\hat{\beta}_4$.

  b. State and test the appropriate hypotheses to decide whether the model fit to the data specifies a useful linear relationship between energy content and at least one of the four predictors.

  c. Given that plastics, paper, and water remain in the model, does garbage provide useful information about energy content? State and test the appropriate hypotheses using a significance level of .05.

  d. Use the fact that $s_{\hat{Y}} = 7.46$ when $x_1 = 20$, $x_2 = 25$, $x_3 = 40$, and $x_4 = 45$ to calculate a 95% confidence interval for true average energy content under these circumstances. Does the resulting interval suggest that mean energy content has been precisely estimated?

  e. Use the information given in part (d) to predict energy content for a waste sample having the specified characteristics, in a way that conveys information about precision and reliability.

  48. An experiment to investigate the effects of a new technique for degumming of silk yarn was described in the article “Some Studies in Degumming of Silk with Organic Acids” (J. Society of Dyers and Colourists, 1992: 79–86). One response variable of interest was $y$ = weight loss (%). The experimenters made observations on weight loss for various values of three independent variables: $x_1$ = temperature ($^\circ$C) = 90, 100, 110; $x_2$ = time of treatment (min) = 30, 75, 120; $x_3$ = tartaric acid concentration (g/L) = 0, 8, 16. In the regression analyses, the three values of each variable were coded as $-1$, 0, and 1, respectively, giving the accompanying data (the value $y_8 = 19.3$ was reported, but our value $y_8 = 20.3$ results in regression output identical to that appearing in the article).

  [Data table for Exercise 48; only fragments survive]

  Obs   1  2  3  4  5  6  7  8

  $x_1$

  $x_2$   1  0

  $x_3$   0  1

  $y$   18.3  22.2  23.0  3.3  19.3  20.3

  A multiple regression model with $k = 9$ predictors ($x_1$, $x_2$, $x_3$, $x_4 = x_1^2$, $x_5 = x_2^2$, $x_6 = x_3^2$, $x_7 = x_1x_2$, $x_8 = x_1x_3$, and $x_9 = x_2x_3$) was fit to the data, resulting in $\hat{\beta}_0 = 21.967$, $\hat{\beta}_1 = 2.8125$, $\hat{\beta}_2 = 1.2750$, $\hat{\beta}_3 = 3.4375$, $\hat{\beta}_4 = -2.208$, $\hat{\beta}_5 = 1.867$, $\hat{\beta}_6 = -4.208$, $\hat{\beta}_7 = -.975$, $\hat{\beta}_8 = -3.750$, $\hat{\beta}_9 = -2.325$, SSE = 23.379, and $R^2 = .938$.

  a. Does this model specify a useful relationship? State and test the appropriate hypotheses using a significance level of .01.

  b. The estimated standard deviation of $\hat{\mu}_Y$ when $x_1 = \cdots = x_9 = 0$ (i.e., when temperature = 100, time = 75, and concentration = 8) is 1.248. Calculate a 95% CI for expected weight loss when temperature, time, and concentration have the specified values.

  c. Calculate a 95% PI for a single weight-loss value to be observed when temperature, time, and concentration have values 100, 75, and 8, respectively.

  d. Fitting the model with only $x_1$, $x_2$, and $x_3$ as predictors gave $R^2 = .456$ and SSE = 203.82. Does at least one of the second-order predictors provide additional useful information? State and test the appropriate hypotheses.

  49. The article “The Influence of Temperature and Sunshine on the Alpha-Acid Contents of Hops” (Agric. Meteor., 1974: 375–382) reports the following data on yield ($y$), mean temperature over the period between date of coming into hops and date of picking ($x_1$), and mean percentage of sunshine during the same period ($x_2$) for the Fuggle variety of hop:

  [Data table for Exercise 49; the $y$ rows and one $x_2$ value are missing]

  $x_1$   16.7  17.4  18.4  16.8  18.9  17.1

  $x_2$   30  42  47  43  41

  $y$

  $x_1$   17.3  18.2  21.3  21.2  20.7  18.5

  $x_2$   48  44  43  50  56  60

  $y$

  Here is partial Minitab output from fitting the first-order model $Y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \epsilon$ used in the article:

  t-ratio   P

  s = 24.45   R-sq = 76.8%   R-sq(adj) = 71.6%

  a. What is $\hat{\mu}_{Y \cdot 18.9, 43}$, and what is the corresponding residual?

  b. Test $H_0\colon \beta_1 = \beta_2 = 0$ versus $H_a$: either $\beta_1$ or $\beta_2 \neq 0$ at level .05.


  c. The estimated standard deviation of $\hat{\beta}_0 + \hat{\beta}_1x_1 + \hat{\beta}_2x_2$ when $x_1 = 18.9$ and $x_2 = 43$ is 8.20. Use this to obtain a 95% CI for $\mu_{Y \cdot 18.9, 43}$.

  d. Use the information in part (c) to obtain a 95% PI for yield in a future experiment when $x_1 = 18.9$ and $x_2 = 43$.

  e. Minitab reported that a 95% PI for yield when $x_1 = 18$ and $x_2 = 45$ is (35.94, 151.63). What is a 90% PI in this situation?

  f. Given that $x_2$ is in the model, would you retain $x_1$?

  g. When the model $Y = \beta_0 + \beta_2x_2 + \epsilon$ is fit, the resulting value of $R^2$ is .721. Verify that the $F$ statistic for testing $H_0\colon Y = \beta_0 + \beta_2x_2 + \epsilon$ versus the alternative hypothesis $H_a\colon Y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \epsilon$ satisfies $t^2 = f$, where $t$ is the value of the $t$ statistic from part (f).

  50. a. When the model $Y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_1^2 + \beta_4x_2^2 + \beta_5x_1x_2 + \epsilon$ is fit to the hops data of Exercise 49, the estimate of $\beta_5$ is $\hat{\beta}_5 = .557$ with estimated standard deviation $s_{\hat{\beta}_5} = .94$. Test $H_0\colon \beta_5 = 0$ versus $H_a\colon \beta_5 \neq 0$.

  b. Each $t$ ratio $\hat{\beta}_i / s_{\hat{\beta}_i}$ ($i = 1, 2, 3, 4, 5$) for the model of part (a) is less than 2 in absolute value, yet $R^2 = .861$ for this model. Would it be correct to drop each term from the model because of its small $t$-ratio? Explain.

  c. Using $R^2 = .861$ for the model of part (a), test $H_0\colon \beta_3 = \beta_4 = \beta_5 = 0$ (which says that all second-order terms can be deleted).

  51. The article “The Undrained Strength of Some Thawed Permafrost Soils” (Canadian Geotechnical J., 1979: 420–427) contains the following data on undrained shear strength of sandy soil ($y$, in kPa), depth ($x_1$, in m), and water content ($x_2$, in %).

  [Data table for Exercise 51 missing]

  The predicted values and residuals were computed by fitting a full quadratic model, which resulted in the estimated regression function

  $y = -151.36 - 16.22x_1 + 13.48x_2 + .094x_1^2 - .253x_2^2 + .492x_1x_2$

  a. Do plots of $e^*$ versus $x_1$, $e^*$ versus $x_2$, and $e^*$ versus $\hat{y}$ suggest that the full quadratic model should be modified? Explain your answer.

  b. The value of $R^2$ for the full quadratic model is .759. Test at level .05 the null hypothesis stating that there is no linear relationship between the dependent variable and any of the five predictors.

  c. It can be shown that $V(Y) = \sigma^2 = V(\hat{Y}) + V(Y - \hat{Y})$. The estimate of $\sigma$ is $\hat{\sigma} = s = 6.99$ (from the full quadratic model). First obtain the estimated standard deviation of $Y - \hat{Y}$, and then estimate the standard deviation of $\hat{Y}$ (i.e., $\hat{\beta}_0 + \hat{\beta}_1x_1 + \hat{\beta}_2x_2 + \hat{\beta}_3x_1^2 + \hat{\beta}_4x_2^2 + \hat{\beta}_5x_1x_2$) when $x_1 = 8.0$ and $x_2 = 33.1$. Finally, compute a 95% CI for mean strength. [Hint: What is $(y - \hat{y})/e^*$?]

  d. Fitting the first-order model with regression function $\mu_{Y \cdot x_1, x_2} = \beta_0 + \beta_1x_1 + \beta_2x_2$ results in SSE = 894.95. Test at level .05 the null hypothesis that states that all quadratic terms can be deleted from the model.

  52. Utilization of sucrose as a carbon source for the production of chemicals is uneconomical. Beet molasses is a readily available and low-priced substitute. The article “Optimization of the Production of $\beta$-Carotene from Molasses by Blakeslea Trispora” (J. of Chem. Tech. and Biotech., 2002: 933–943) carried out a multiple regression analysis to relate the dependent variable $y$ = amount of $\beta$-carotene (g/dm$^3$) to the three predictors amount of linoleic acid, amount of kerosene, and amount of antioxidant (all g/dm$^3$).

  [Data table for Exercise 52; only the column headers Antiox and Betacaro survive]

  a. Fitting the complete second-order model in the three predictors resulted in $R^2 = .987$ and adjusted $R^2 = .974$, whereas fitting the first-order model gave $R^2 = .016$. What would you conclude about the two models?


  b. For $x_1 = x_2 = 30$, $x_3 = 10$, a statistical software package reported that $\hat{y} = .66573$, $s_{\hat{Y}} = .01785$, based on the complete second-order model. Predict the amount of $\beta$-carotene that would result from a single experimental run with the designated values of the independent variables, and do so in a way that conveys information about precision and reliability.

  53. Snowpacks contain a wide spectrum of pollutants that may represent environmental hazards. The article “Atmospheric PAH Deposition: Deposition Velocities and Washout Ratios” (J. of Environmental Engineering, 2002: 186–195) focused on the deposition of polyaromatic hydrocarbons. The authors proposed a multiple regression model for relating deposition over a specified time period ($y$, in $\mu$g/m$^2$) to two rather complicated predictors $x_1$ ($\mu$g-sec/m$^3$) and $x_2$ ($\mu$g/m$^2$), defined in terms of PAH air concentrations for various species, total time, and total amount of precipitation. Here is data on the species fluoranthene and corresponding Minitab output:

  [Data table for Exercise 53; only one column of values survives]

  22.65  28.68  32.66  27.69  14.18  20.64  20.60  15.08  18.05  99.71  58.97  44.25

  The regression equation is

  flth   33.5   0.00205 x1   29836 x2   [signs and operators missing]

  Predictor   Coef   SE Coef

  R-Sq = 92.3%   R-Sq(adj) = 91.2%

  Analysis of Variance

  Residual Error   14   27454

  Total

  Formulate questions and perform appropriate analyses to draw conclusions.

  54. The use of high-strength steels (HSS) rather than aluminum and magnesium alloys in automotive body structures reduces vehicle weight. However, HSS use is still problematic because of difficulties with limited formability, increased springback, difficulties in joining, and reduced die life. The article “Experimental Investigation of Springback Variation in Forming of High Strength Steels” (J. of Manuf. Sci. and Engr., 2008: 1–9) included data on $y$ = springback from the wall opening angle and $x_1$ = blank holder pressure (BHP). Three different material suppliers and three different lubrication regimens (no lubrication, lubricant 1, and lubricant 2) were also utilized.

  a. What predictors would you use in a model to incorporate supplier and lubrication information in addition to BHP?

  b. The accompanying Minitab output resulted from fitting the model of (a) (the article’s authors also used Minitab; amusingly, they employed a significance level of .06 in various tests of hypotheses). Does there appear to be a useful relationship between the response variable and at least one of the predictors? Carry out a formal test of hypotheses.

  c. When BHP is 1000, material is from supplier 1, and no lubrication is used, $s_{\hat{Y}} = .524$. Calculate a 95% PI for the springback that would result from making an additional observation under these conditions.

  d. From the output, it appears that lubrication regimen may not be providing useful information. A regression with the corresponding predictors removed resulted in SSE = 48.426. What is the coefficient of multiple determination for this model, and what would you conclude about the importance of the lubrication regimen?

  e. A model with predictors for BHP, supplier, and lubrication regimen, as well as predictors for interactions between BHP and both supplier and lubrication regimen, resulted in SSE = 28.216 and $R^2 = .849$. Does this model appear to improve on the model with just BHP and predictors for supplier?

  [Minitab output for Exercise 54; only fragments survive]

  Predictor   Coef   SE Coef

  Suppl_1   2.87   0.007

  S = 1.18413   R-Sq = 77.5%   R-Sq(adj) = 73.8%

  Analysis of Variance

  Residual Error   30   42.065

  Total
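  Several exercises in this set (for example 40(e), 41, 45(a), and 54(b) and (d)) rest on the same two identities: $R^2 = 1 - \mathrm{SSE}/\mathrm{SST}$ and the model utility statistic $F = \dfrac{R^2/k}{(1 - R^2)/(n - k - 1)}$. The sketch below applies them to the surviving Exercise 54 output. It assumes $k = 5$ predictors for the model of part (a) (BHP plus two supplier indicators and two lubrication indicators), so that the 30 error df imply $n = 36$; both numbers are inferences, not values stated in the text. SST is recovered from the rounded R-Sq of 77.5%, so the results are approximate.

```python
def model_utility_f(r2, n, k):
    """F statistic for H0: all k slope coefficients are 0, computed from R^2."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def r2_from_sse(sse, sst):
    """Coefficient of multiple determination, R^2 = 1 - SSE/SST."""
    return 1 - sse / sst


# Values read from the Exercise 54 output; k and n are assumptions (see lead-in).
k_full, n = 5, 36
r2_full, sse_full = 0.775, 42.065

# Part (b): model utility statistic, to be compared with an F critical
# value based on k and n - k - 1 degrees of freedom.
f_full = model_utility_f(r2_full, n, k_full)

# Part (d): recover SST from the full model, then get R^2 for the
# reduced model whose SSE is 48.426.
sst = sse_full / (1 - r2_full)
r2_reduced = r2_from_sse(48.426, sst)

print(round(f_full, 2), round(sst, 2), round(r2_reduced, 3))
```

  The sketch deliberately stops short of a P-value, since that requires F distribution tables or a statistics library. Comparing `r2_reduced` with the full-model value shows how little explanatory power the lubrication predictors add.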