$\sqrt{(.6)(.4)/25} = .098$. Alternatively, since the largest value of $pq$ is attained when $p = q = .5$, an upper bound on the standard error is $\sqrt{1/(4n)} = .10$.
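The two figures above can be checked numerically. The following is a minimal sketch, assuming $\hat{p} = .6$ and $n = 25$ as the arithmetic in the text implies; the function names are illustrative and not part of the text.

```python
import math


def proportion_standard_error(p_hat: float, n: int) -> float:
    """Estimated standard error of a sample proportion: sqrt(p_hat * q_hat / n)."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n)


def standard_error_upper_bound(n: int) -> float:
    """Upper bound on the standard error, attained when p = q = .5."""
    return math.sqrt(1.0 / (4.0 * n))


# Values assumed from the arithmetic shown in the text: p_hat = .6, n = 25.
se = proportion_standard_error(0.6, 25)
bound = standard_error_upper_bound(25)
print(round(se, 3), round(bound, 2))
```

The bound depends only on $n$, so it can be computed before any data are collected.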

  ■ When the point estimator $\hat{\theta}$ has approximately a normal distribution, which will often be the case when n is large, then we can be reasonably confident that the true value of $\theta$ lies within approximately 2 standard errors (standard deviations) of $\hat{\theta}$. Thus if a sample of n = 36 component lifetimes gives $\hat{\mu} = \bar{x} = 28.50$ and s = 3.60, then $s/\sqrt{n} = .60$, so "within 2 estimated standard errors of $\hat{\mu}$" translates to the interval 28.50 ± (2)(.60) = (27.30, 29.70).

  If $\hat{\theta}$ is not necessarily approximately normal but is unbiased, then it can be shown that the estimate will deviate from $\theta$ by as much as 4 standard errors at most 6% of the time. We would then expect the true value to lie within 4 standard errors of $\hat{\theta}$ (and this is a very conservative statement, since it applies to any unbiased $\hat{\theta}$). Summarizing, the standard error tells us roughly within what distance of $\hat{\theta}$ we can expect the true value of $\theta$ to lie.
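The interval arithmetic above can be reproduced with a short sketch. The function names are illustrative. The 4-standard-error statement is consistent with Chebyshev's inequality, which bounds the probability of a deviation of at least k standard deviations by $1/k^2$; treating that inequality as the basis of the claim is an assumption on our part.

```python
import math


def interval_within_k_standard_errors(estimate: float, s: float, n: int, k: float = 2.0):
    """Return (low, high) = estimate -/+ k * s / sqrt(n)."""
    se = s / math.sqrt(n)
    return estimate - k * se, estimate + k * se


def chebyshev_tail_bound(k: float) -> float:
    """Chebyshev bound on P(|estimator - mean| >= k standard deviations)."""
    return 1.0 / k ** 2


# Component-lifetime figures from the text: n = 36, x-bar = 28.50, s = 3.60.
low, high = interval_within_k_standard_errors(28.50, 3.60, 36)
print(round(low, 2), round(high, 2))
print(chebyshev_tail_bound(4))  # 1/16, i.e. roughly 6%
```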

  The form of the estimator $\hat{\theta}$ may be sufficiently complicated so that standard statistical theory cannot be applied to obtain an expression for $\sigma_{\hat{\theta}}$. This is true, for example, in the case $\theta = \sigma$, $\hat{\theta} = S$; the standard deviation of the statistic S, $\sigma_S$, cannot in general be determined. In recent years, a new computer-intensive method called the bootstrap has been introduced to address this problem. Suppose that the population pdf is $f(x; \theta)$, a member of a particular parametric family, and that data $x_1, x_2, \ldots, x_n$ gives $\hat{\theta} = 21.7$. We now use the computer to obtain "bootstrap samples" from the pdf f(x; 21.7), and for each sample we calculate a "bootstrap estimate" $\hat{\theta}^*$:

  CHAPTER 6 Point Estimation

  First bootstrap sample: $x_1^*, x_2^*, \ldots, x_n^*$; estimate = $\hat{\theta}_1^*$

  Second bootstrap sample: $x_1^*, x_2^*, \ldots, x_n^*$; estimate = $\hat{\theta}_2^*$

  $\vdots$

  Bth bootstrap sample: $x_1^*, x_2^*, \ldots, x_n^*$; estimate = $\hat{\theta}_B^*$

  B = 100 or 200 is often used. Now let $\bar{\theta}^* = \sum \hat{\theta}_i^* / B$, the sample mean of the bootstrap estimates. The bootstrap estimate of $\hat{\theta}$'s standard error is now just the sample standard deviation of the $\hat{\theta}_i^*$'s:

  $$S_{\hat{\theta}} = \sqrt{\frac{1}{B-1} \sum_{i=1}^{B} \left(\hat{\theta}_i^* - \bar{\theta}^*\right)^2}$$

  (In the bootstrap literature, B is often used in place of B − 1; for typical values of B, there is usually little difference between the resulting estimates.)
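The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not name the parametric family, so the usage below assumes, hypothetically, an exponential family with mean $\theta$, a sample size of n = 25, and the sample mean as the estimator. All function and variable names are ours.

```python
import random
import statistics
from typing import Callable, Sequence


def bootstrap_standard_error(
    draw_sample: Callable[[random.Random], Sequence[float]],
    estimator: Callable[[Sequence[float]], float],
    B: int = 200,
    seed: int = 0,
) -> float:
    """Sample standard deviation (divisor B - 1) of B bootstrap estimates."""
    rng = random.Random(seed)
    estimates = [estimator(draw_sample(rng)) for _ in range(B)]
    return statistics.stdev(estimates)


# Hypothetical setup: exponential family with mean theta, theta_hat = 21.7, n = 25.
THETA_HAT = 21.7
N = 25


def draw_from_fitted_pdf(rng: random.Random) -> list:
    # Each bootstrap sample is drawn from f(x; theta_hat), not from the original data.
    return [rng.expovariate(1.0 / THETA_HAT) for _ in range(N)]


se_boot = bootstrap_standard_error(draw_from_fitted_pdf, statistics.fmean, B=200)
print(se_boot)
```

`statistics.stdev` uses the divisor B − 1; replacing it with `statistics.pstdev` gives the divisor-B version mentioned in the parenthetical remark.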

  Example 6.11

  A theoretical model suggests that X, the time to breakdown of an insulating fluid between electrodes at a particular voltage, has $f(x; \lambda) = \lambda e^{-\lambda x}$, an exponential distribution. A random sample of n = 10 breakdown times (min) gives the following data:

  Since $E(X) = 1/\lambda$, $E(\bar{X}) = 1/\lambda$, so a reasonable estimate of $\lambda$ is $\hat{\lambda} = 1/\bar{x} = 1/55.087 = .018153$. We then used a statistical computer package to obtain B = 100 bootstrap samples, each of size 10, from f(x; .018153). The first such sample was 41.00, 109.70, 16.78, 6.31, 6.76, 5.62, 60.96, 78.81, 192.25, 27.61, from which $\sum x_i^* = 545.8$ and $\hat{\lambda}_1^* = 1/54.58 = .01832$. The average of the 100 bootstrap estimates is $\bar{\lambda}^* = .02153$, and the sample standard deviation of these 100 estimates is $s_{\hat{\lambda}} = .0091$, the bootstrap estimate of $\hat{\lambda}$'s standard error. A histogram of the 100 $\hat{\lambda}_i^*$'s was somewhat positively skewed, suggesting that the sampling distribution of $\hat{\lambda}$ also has this property.

  ■
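The computation in Example 6.11 can be imitated with Python's standard library in place of the unnamed statistical package. This sketch uses only the quantities quoted in the example ($\bar{x} = 55.087$, n = 10, B = 100, and the first bootstrap sample). The random draws will differ from the book's, so the resulting average and standard deviation will be of similar size to .02153 and .0091 but will not match them. The function and variable names are ours.

```python
import random
import statistics

LAMBDA_HAT = 1.0 / 55.087  # estimate of lambda from the original sample
N = 10                     # size of each bootstrap sample
B = 100                    # number of bootstrap samples

# First bootstrap sample as listed in the example.
first_sample = [41.00, 109.70, 16.78, 6.31, 6.76, 5.62, 60.96, 78.81, 192.25, 27.61]
lambda_star_1 = 1.0 / statistics.fmean(first_sample)


def bootstrap_lambda_estimates(lam: float, n: int, B: int, seed: int = 1) -> list:
    """Draw B samples of size n from the exponential pdf with rate lam;
    return the estimate 1 / (sample mean) for each sample."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(B):
        sample = [rng.expovariate(lam) for _ in range(n)]
        estimates.append(1.0 / statistics.fmean(sample))
    return estimates


estimates = bootstrap_lambda_estimates(LAMBDA_HAT, N, B)
mean_star = statistics.fmean(estimates)  # counterpart of lambda-bar-star
se_star = statistics.stdev(estimates)    # bootstrap estimate of the standard error
print(round(lambda_star_1, 5), mean_star, se_star)
```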

  Sometimes an investigator wishes to estimate a population characteristic without assuming that the population distribution belongs to a particular parametric family. An instance of this occurred in Example 6.7, where a 10% trimmed mean was proposed for estimating a symmetric population distribution's center $\theta$. The data of Example 6.2 gave $\hat{\theta} = \bar{x}_{tr(10)} = 27.838$, but now there is no assumed $f(x; \theta)$, so how can we obtain a bootstrap sample? The answer is to regard the sample itself as constituting the population (the n = 20 observations in Example 6.2) and take B different samples, each of size n, with replacement from this population. The book by Bradley Efron and Robert Tibshirani or the one by John Rice listed in the chapter bibliography provides more information.
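Resampling with replacement from the observed sample might be sketched as follows. The data of Example 6.2 are not reproduced in this section, so the 20 values below are hypothetical placeholders and the printed result has no connection to the 27.838 quoted above. The function names are ours, and the trimmed mean here simply drops the integer part of 10% of n observations from each end.

```python
import random
import statistics
from typing import Callable, Sequence


def trimmed_mean(values: Sequence[float], trim_fraction: float = 0.10) -> float:
    """Mean after deleting int(trim_fraction * n) observations from each end."""
    ordered = sorted(values)
    k = int(trim_fraction * len(ordered))
    return statistics.fmean(ordered[k:len(ordered) - k])


def nonparametric_bootstrap_se(
    data: Sequence[float],
    estimator: Callable[[Sequence[float]], float],
    B: int = 200,
    seed: int = 0,
) -> float:
    """Treat the sample as the population: resample n values with replacement
    B times and return the sample standard deviation of the B estimates."""
    rng = random.Random(seed)
    n = len(data)
    estimates = [estimator(rng.choices(data, k=n)) for _ in range(B)]
    return statistics.stdev(estimates)


# Hypothetical data (NOT the observations of Example 6.2).
hypothetical_sample = [24.5, 25.6, 26.3, 26.7, 27.0, 27.2, 27.3, 27.5, 27.6, 27.7,
                       27.9, 28.0, 28.1, 28.3, 28.5, 28.9, 29.1, 29.6, 30.9, 31.6]

se_trimmed = nonparametric_bootstrap_se(hypothetical_sample, trimmed_mean)
print(trimmed_mean(hypothetical_sample), se_trimmed)
```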

  EXERCISES Section 6.1 (1–19)

  1. The accompanying data on flexural strength (MPa) for concrete beams of a certain type was introduced in Example 1.2.

  5.9 7.2 7.3 6.3 8.1 6.8 7.0
  7.6 6.8 6.5 7.0 6.3 7.9 9.0
  8.2 8.7 7.8 9.7 7.4 7.7 9.7

  a. Calculate a point estimate of the mean value of strength for the conceptual population of all beams manufactured in this fashion, and state which estimator you used. [Hint: $\sum x_i = 219.8$.]

  b. Calculate a point estimate of the strength value that separates the weakest 50% of all such beams from the strongest 50%, and state which estimator you used.

  6.1 Some General Concepts of Point Estimation

  c. Calculate and interpret a point estimate of the population standard deviation $\sigma$. Which estimator did you use? [Hint: $\sum x_i^2 = \ldots$]

  d. Calculate a point estimate of the proportion of all such beams whose flexural strength exceeds 10 MPa. [Hint: Think of an observation as a "success" if it exceeds 10.]

  e. Calculate a point estimate of the population coefficient of variation $\sigma/\mu$, and state which estimator you used.

  2. A sample of 20 students who had recently taken elementary statistics yielded the following information on the brand of calculator owned (T = Texas Instruments, H = Hewlett Packard, C = Casio, S = Sharp):

  T H T C T S C H

  a. Estimate the true proportion of all such students who own a Texas Instruments calculator.

  b. Of the 10 students who owned a TI calculator, 4 had graphing calculators. Estimate the proportion of students who do not own a TI graphing calculator.

  3. Consider the following sample of observations on coating thickness for low-viscosity paint ("Achieving a Target Value for a Manufacturing Process: A Case Study," J. of Quality Technology, 1992: 22–26):

  Assume that the distribution of coating thickness is normal (a normal probability plot strongly supports this assumption).

  a. Calculate a point estimate of the mean value of coating thickness, and state which estimator you used.

  b. Calculate a point estimate of the median of the coating thickness distribution, and state which estimator you used.

  c. Calculate a point estimate of the value that separates the largest 10% of all values in the thickness distribution from the remaining 90%, and state which estimator you used. [Hint: Express what you are trying to estimate in terms of $\mu$ and $\sigma$.]

  d. Estimate P(X < 1.5), i.e., the proportion of all thickness values less than 1.5. [Hint: If you knew the values of $\mu$ and $\sigma$, you could calculate this probability. These values are not available, but they can be estimated.]

  e. What is the estimated standard error of the estimator that you used in part (b)?

  4. The article from which the data in Exercise 1 was extracted also gave the accompanying strength observations for cylinders:

  6.1 5.8 7.8 7.1 7.2 9.2 6.6 8.3 7.0 8.3

  Prior to obtaining data, denote the beam strengths by $X_1, \ldots, X_m$ and the cylinder strengths by $Y_1, \ldots, Y_n$. Suppose that the $X_i$'s constitute a random sample from a distribution with mean $\mu_1$ and standard deviation $\sigma_1$ and that the $Y_i$'s form a random sample (independent of the $X_i$'s) from another distribution with mean $\mu_2$ and standard deviation $\sigma_2$.

  a. Use rules of expected value to show that $\bar{X} - \bar{Y}$ is an unbiased estimator of $\mu_1 - \mu_2$. Calculate the estimate for the given data.

  b. Use rules of variance from Chapter 5 to obtain an expression for the variance and standard deviation (standard error) of the estimator in part (a), and then compute the estimated standard error.

  c. Calculate a point estimate of the ratio $\sigma_1/\sigma_2$ of the two standard deviations.

  d. Suppose a single beam and a single cylinder are randomly selected. Calculate a point estimate of the variance of the difference X − Y between beam strength and cylinder strength.

  5. As an example of a situation in which several different statistics could reasonably be used to calculate a point estimate, consider a population of N invoices. Associated with each invoice is its "book value," the recorded amount of that invoice. Let T denote the total book value, a known amount. Some of these book values are erroneous. An audit will be carried out by randomly selecting n invoices and determining the audited (correct) value for each one. Suppose that the sample gives the following results (in dollars).

  Invoice
  Book value
  Audited value

  Let

  $\bar{Y}$ = sample mean book value
  $\bar{X}$ = sample mean audited value
  $\bar{D}$ = sample mean error

  Propose three different statistics for estimating the total audited (i.e., correct) value: one involving just N and $\bar{X}$, another involving T, N, and $\bar{D}$, and the last involving T and $\bar{X}/\bar{Y}$. If N = 5000 and T = 1,761,300, calculate the three corresponding point estimates. (The article "Statistical Models and Analysis in Auditing," Statistical Science, 1989: 2–33 discusses properties of these estimators.)

  6. Consider the accompanying observations on stream flow (1000s of acre-feet) recorded at a station in Colorado for the period April 1–August 31 over a 31-year span (from an article in the 1974 volume of Water Resources Research).

  127.96 210.07 203.24 108.91 178.21


  An appropriate probability plot supports the use of the lognormal distribution (see Section 4.5) as a reasonable model for stream flow.

  a. Estimate the parameters of the distribution. [Hint: Remember that X has a lognormal distribution with parameters $\mu$ and $\sigma^2$ if ln(X) is normally distributed with mean $\mu$ and variance $\sigma^2$.]

  b. Use the estimates of part (a) to calculate an estimate of the expected value of stream flow. [Hint: What is E(X)?]

  7. a. A random sample of 10 houses in a particular area, each of which is heated with natural gas, is selected and the amount of gas (therms) used during the month of January is determined for each house. The resulting observations are 103, 156, 118, 89, 125, 147, 122, 109, 138, 99. Let $\mu$ denote the average gas usage during January by all houses in this area. Compute a point estimate of $\mu$.

  b. Suppose there are 10,000 houses in this area that use natural gas for heating. Let $\tau$ denote the total amount of gas used by all of these houses during January. Estimate $\tau$ using the data of part (a). What estimator did you use in computing your estimate?

  c. Use the data in part (a) to estimate p, the proportion of all houses that used at least 100 therms.

  d. Give a point estimate of the population median usage (the middle value in the population of all houses) based on the sample of part (a). What estimator did you use?

  8. In a random sample of 80 components of a certain type, 12 are found to be defective.

  a. Give a point estimate of the proportion of all such components that are not defective.

  b. A system is to be constructed by randomly selecting two of these components and connecting them in series, as shown here.

  The series connection implies that the system will function if and only if neither component is defective (i.e., both components work properly). Estimate the proportion of all such systems that work properly. [Hint: If p denotes the probability that a component works properly, how can P(system works) be expressed in terms of p?]

  9. Each of 150 newly manufactured items is examined and the number of scratches per item is recorded (the items are supposed to be free of scratches), yielding the following data:

  Number of scratches per item:  0   1   2   3   4   5   6   7
  Observed frequency:           18  37  42  30  13   7   2   1

  Let X = the number of scratches on a randomly chosen item, and assume that X has a Poisson distribution with parameter $\mu$.

  a. Find an unbiased estimator of $\mu$ and compute the estimate for the data. [Hint: $E(X) = \mu$ for X Poisson, so $E(\bar{X}) = ?$]

  b. What is the standard deviation (standard error) of your estimator? Compute the estimated standard error. [Hint: $\sigma_X^2 = \mu$ for X Poisson.]

  10. Using a long rod that has length $\mu$, you are going to lay out a square plot in which the length of each side is $\mu$. Thus the area of the plot will be $\mu^2$. However, you do not know the value of $\mu$, so you decide to make n independent measurements $X_1, X_2, \ldots, X_n$ of the length. Assume that each $X_i$ has mean $\mu$ (unbiased measurements) and variance $\sigma^2$.

  a. Show that $\bar{X}^2$ is not an unbiased estimator for $\mu^2$. [Hint: For any rv Y, $E(Y^2) = V(Y) + [E(Y)]^2$. Apply this with $Y = \bar{X}$.]

  b. For what value of k is the estimator $\bar{X}^2 - kS^2$ unbiased for $\mu^2$? [Hint: Compute $E(\bar{X}^2 - kS^2)$.]

  11. Of $n_1$ randomly selected male smokers, $X_1$ smoked filter cigarettes, whereas of $n_2$ randomly selected female smokers, $X_2$ smoked filter cigarettes. Let $p_1$ and $p_2$ denote the probabilities that a randomly selected male and female, respectively, smoke filter cigarettes.

  a. Show that $(X_1/n_1) - (X_2/n_2)$ is an unbiased estimator for $p_1 - p_2$. [Hint: $E(X_i) = n_i p_i$ for i = 1, 2.]

  b. What is the standard error of the estimator in part (a)?

  c. How would you use the observed values $x_1$ and $x_2$ to estimate the standard error of your estimator?

  d. If $n_1 = n_2 = 200$, $x_1 = 127$, and $x_2 = 176$, use the estimator of part (a) to obtain an estimate of $p_1 - p_2$.

  e. Use the result of part (c) and the data of part (d) to estimate the standard error of the estimator.

  12. Suppose a certain type of fertilizer has an expected yield per acre of $\mu_1$ with variance $\sigma^2$, whereas the expected yield for a second type of fertilizer is $\mu_2$ with the same variance $\sigma^2$. Let $S_1^2$ and $S_2^2$ denote the sample variances of yields based on sample sizes $n_1$ and $n_2$, respectively, of the two fertilizers. Show that the pooled (combined) estimator

  $$\hat{\sigma}^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

  is an unbiased estimator of $\sigma^2$.

  13. Consider a random sample $X_1, \ldots, X_n$ from the pdf

  $$f(x; \theta) = .5(1 + \theta x) \qquad -1 \le x \le 1$$

  where $-1 \le \theta \le 1$ (this distribution arises in particle physics). Show that $\hat{\theta} = 3\bar{X}$ is an unbiased estimator of $\theta$. [Hint: First determine $\mu = E(X) = E(\bar{X})$.]

  14. A sample of n captured Pandemonium jet fighters results in serial numbers $x_1, x_2, x_3, \ldots, x_n$. The CIA knows that the aircraft were numbered consecutively at the factory starting with $\alpha$ and ending with $\beta$, so that the total number of planes manufactured is $\beta - \alpha + 1$ (e.g., if $\alpha = 17$ and $\beta = 29$, then 29 − 17 + 1 = 13 planes having serial numbers 17, 18, 19, . . . , 28, 29 were manufactured). However, the CIA does not know the values of $\alpha$ or $\beta$. A CIA statistician suggests using the

  6.2 Methods of Point Estimation

  estimator $\max(X_i) - \min(X_i) + 1$ to estimate the total number of planes manufactured.

  a. If n = 5, $x_1 = 237$, $x_2 = 375$, $x_3 = 202$, $x_4 = 525$, and $x_5 = 418$, what is the corresponding estimate?

  b. Under what conditions on the sample will the value of the estimate be exactly equal to the true total number of planes? Will the estimate ever be larger than the true total? Do you think the estimator is unbiased for estimating $\beta - \alpha + 1$? Explain in one or two sentences.

  15. Let $X_1, X_2, \ldots, X_n$ represent a random sample from a Rayleigh distribution with pdf

  $$f(x; \theta) = \frac{x}{\theta} e^{-x^2/(2\theta)} \qquad x > 0$$

  a. It can be shown that $E(X^2) = 2\theta$. Use this fact to construct an unbiased estimator of $\theta$ based on $\sum X_i^2$ (and use rules of expected value to show that it is unbiased).

  b. Estimate $\theta$ from the following n = 10 observations on vibratory stress of a turbine blade under specified conditions:

  16.88 10.23 4.59 6.66 13.68
  14.23 19.87 9.40 6.51 10.95

  16. Suppose the true average growth $\mu$ of one type of plant during a 1-year period is identical to that of a second type, but the variance of growth for the first type is $\sigma^2$, whereas for the second type the variance is $4\sigma^2$. Let $X_1, \ldots, X_m$ be m independent growth observations on the first type [so $E(X_i) = \mu$, $V(X_i) = \sigma^2$], and let $Y_1, \ldots, Y_n$ be n independent growth observations on the second type [$E(Y_i) = \mu$, $V(Y_i) = 4\sigma^2$].

  a. Show that for any $\delta$ between 0 and 1, the estimator $\hat{\mu} = \delta\bar{X} + (1 - \delta)\bar{Y}$ is unbiased for $\mu$.

  b. For fixed m and n, compute $V(\hat{\mu})$, and then find the value of $\delta$ that minimizes $V(\hat{\mu})$. [Hint: Differentiate $V(\hat{\mu})$ with respect to $\delta$.]

  17. In Chapter 3, we defined a negative binomial rv as the number of failures that occur before the rth success in a sequence of independent and identical success/failure trials. The probability mass function (pmf) of X is

  $$nb(x; r, p) = \binom{x + r - 1}{x} p^r (1 - p)^x \qquad x = 0, 1, 2, \ldots$$

  a. Suppose that $r \ge 2$. Show that

  $$\hat{p} = (r - 1)/(X + r - 1)$$

  is an unbiased estimator for p. [Hint: Write out $E(\hat{p})$ and cancel $x + r - 1$ inside the sum.]

  b. A reporter wishing to interview five individuals who support a certain candidate begins asking people whether (S) or not (F) they support the candidate. If the sequence of responses is SFFSFFFSSS, estimate p = the true proportion who support the candidate.

  18. Let $X_1, X_2, \ldots, X_n$ be a random sample from a pdf f(x) that is symmetric about $\mu$, so that $\tilde{X}$ is an unbiased estimator of $\mu$. If n is large, it can be shown that $V(\tilde{X}) \approx 1/(4n[f(\mu)]^2)$.

  a. Compare $V(\tilde{X})$ to $V(\bar{X})$ when the underlying distribution is normal.

  b. When the underlying pdf is Cauchy (see Example 6.7), $V(\bar{X}) = \infty$, so $\bar{X}$ is a terrible estimator. What is $V(\tilde{X})$ in this case when n is large?

  19. An investigator wishes to estimate the proportion of students at a certain university who have violated the honor code. Having obtained a random sample of n students, she realizes that asking each, "Have you violated the honor code?" will probably result in some untruthful responses. Consider the following scheme, called a randomized response technique. The investigator makes up a deck of 100 cards, of which 50 are of type I and 50 are of type II.

  Type I: Have you violated the honor code (yes or no)?
  Type II: Is the last digit of your telephone number a 0, 1, or 2 (yes or no)?

  Each student in the random sample is asked to mix the deck, draw a card, and answer the resulting question truthfully. Because of the irrelevant question on type II cards, a yes response no longer stigmatizes the respondent, so we assume that responses are truthful. Let p denote the proportion of honor-code violators (i.e., the probability of a randomly selected student being a violator), and let $\lambda$ = P(yes response). Then $\lambda$ and p are related by $\lambda = .5p + (.5)(.3)$.

  a. Let Y denote the number of yes responses, so Y ~ Bin(n, $\lambda$). Thus Y/n is an unbiased estimator of $\lambda$. Derive an estimator for p based on Y. If n = 80 and y = 20, what is your estimate? [Hint: Solve $\lambda = .5p + .15$ for p and then substitute Y/n for $\lambda$.]

  b. Use the fact that $E(Y/n) = \lambda$ to show that your estimator $\hat{p}$ is unbiased.

  c. If there were 70 type I and 30 type II cards, what would be your estimator for p?
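  The relation between P(yes response) and p stated in Exercise 19 can be checked by simulation. The sketch below only illustrates the forward relation and leaves the estimator itself to the exercise. The value p = .2 is a hypothetical choice, and the function name and parameters are ours. It takes the probability .3 for a last digit of 0, 1, or 2 from the exercise's own formula and assumes, as the exercise does, that respondents answer truthfully.

```python
import random


def simulate_yes_fraction(p: float, n: int, seed: int = 0,
                          type1_share: float = 0.5,
                          phone_yes_prob: float = 0.3) -> float:
    """Simulate n respondents under the randomized response scheme and
    return the observed fraction of yes responses."""
    rng = random.Random(seed)
    yes_count = 0
    for _ in range(n):
        is_violator = rng.random() < p
        if rng.random() < type1_share:
            # Type I card: the honor-code question, answered truthfully.
            answer = is_violator
        else:
            # Type II card: is the last phone digit a 0, 1, or 2?
            answer = rng.random() < phone_yes_prob
        yes_count += int(answer)
    return yes_count / n


P_HYPOTHETICAL = 0.2  # hypothetical proportion of violators
theoretical_yes_prob = 0.5 * P_HYPOTHETICAL + 0.5 * 0.3
observed_yes_fraction = simulate_yes_fraction(P_HYPOTHETICAL, n=200_000)
print(theoretical_yes_prob, observed_yes_fraction)
```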
