$\sqrt{(.6)(.4)/25} = .098$. Alternatively, since the largest value of $pq$ is attained when $p = q = .5$, an upper bound on the standard error is $\sqrt{1/(4n)} = .10$.

■

When the point estimator $\hat{\theta}$ has approximately a normal distribution, which will often be the case when $n$ is large, then we can be reasonably confident that the true value of $\theta$ lies within approximately 2 standard errors (standard deviations) of $\hat{\theta}$. Thus if a sample of $n = 36$ component lifetimes gives $\hat{\mu} = \bar{x} = 28.50$ and $s = 3.60$, then $s/\sqrt{n} = .60$, so within 2 estimated standard errors, $\hat{\mu}$ translates to the interval $28.50 \pm (2)(.60) = (27.30, 29.70)$.

If $\hat{\theta}$ is not necessarily approximately normal but is unbiased, then it can be shown that the estimate will deviate from $\theta$ by as much as 4 standard errors at most 6% of the time. We would then expect the true value to lie within 4 standard errors of $\hat{\theta}$ (and this is a very conservative statement, since it applies to any unbiased $\hat{\theta}$). Summarizing, the standard error tells us roughly within what distance of $\hat{\theta}$ we can expect the true value of $\theta$ to lie.
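The arithmetic in the preceding paragraphs can be checked with a few lines of Python. This is only a sketch: the function names are illustrative choices, not notation from the text, and the 4-standard-error statement is read here as Chebyshev's inequality, $P(|\hat{\theta} - \theta| \ge k\sigma_{\hat{\theta}}) \le 1/k^2$, which is an assumption about the result being cited.

```python
import math

def se_mean(s, n):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)

def se_proportion(p_hat, n):
    """Estimated standard error of a sample proportion: sqrt(p*q/n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

def se_proportion_bound(n):
    """Upper bound on the proportion's standard error, since p*q <= 1/4."""
    return math.sqrt(1 / (4 * n))

def interval(estimate, se, k):
    """Interval of half-width k standard errors around the estimate."""
    return estimate - k * se, estimate + k * se

def chebyshev_bound(k):
    """Chebyshev bound on the chance of missing by k or more standard errors."""
    return 1 / k ** 2

# Component-lifetime numbers from the text: n = 36, x-bar = 28.50, s = 3.60.
se = se_mean(3.60, 36)
low, high = interval(28.50, se, 2)
print(round(se, 2), round(low, 2), round(high, 2))

# Proportion numbers from the text: p-hat = .6, n = 25.
print(round(se_proportion(0.6, 25), 3), round(se_proportion_bound(25), 2))

# Bound for k = 4 standard errors (1/16, i.e. about 6%).
print(chebyshev_bound(4))
```

The bound $1/16 = .0625$ is where the "at most 6% of the time" figure comes from under that reading.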

The form of the estimator $\hat{\theta}$ may be sufficiently complicated so that standard statistical theory cannot be applied to obtain an expression for $\sigma_{\hat{\theta}}$. This is true, for example, in the case $\theta = \sigma$, $\hat{\theta} = S$; the standard deviation of the statistic $S$, $\sigma_S$, cannot in general be determined. In recent years, a new computer-intensive method called the bootstrap has been introduced to address this problem. Suppose that the population pdf is $f(x; \theta)$, a member of a particular parametric family, and that data $x_1, x_2, \ldots, x_n$ gives $\hat{\theta} = 21.7$. We now use the computer to obtain "bootstrap samples" from the pdf $f(x; 21.7)$, and for each sample we calculate a "bootstrap estimate" $\hat{\theta}^*$:

CHAPTER 6 Point Estimation

First bootstrap sample: $x_1^*, x_2^*, \ldots, x_n^*$; estimate $= \hat{\theta}_1^*$
Second bootstrap sample: $x_1^*, x_2^*, \ldots, x_n^*$; estimate $= \hat{\theta}_2^*$
$\vdots$
$B$th bootstrap sample: $x_1^*, x_2^*, \ldots, x_n^*$; estimate $= \hat{\theta}_B^*$

$B = 100$ or $200$ is often used. Now let $\bar{\theta}^* = \sum \hat{\theta}_i^* / B$, the sample mean of the bootstrap estimates. The bootstrap estimate of $\hat{\theta}$'s standard error is now just the sample standard deviation of the $\hat{\theta}_i^*$'s:

$$S_{\hat{\theta}} = \sqrt{\frac{1}{B - 1} \sum_{i=1}^{B} \left(\hat{\theta}_i^* - \bar{\theta}^*\right)^2}$$

(In the bootstrap literature, $B$ is often used in place of $B - 1$; for typical values of $B$, there is usually little difference between the resulting estimates.)

Example 6.11

A theoretical model suggests that $X$, the time to breakdown of an insulating fluid between electrodes at a particular voltage, has $f(x; \lambda) = \lambda e^{-\lambda x}$, an exponential distribution. A random sample of $n = 10$ breakdown times (min) gives the following data:

Since $E(X) = 1/\lambda$, $E(\bar{X}) = 1/\lambda$, so a reasonable estimate of $\lambda$ is $\hat{\lambda} = 1/\bar{x} = 1/55.087 = .018153$. We then used a statistical computer package to obtain $B = 100$ bootstrap samples, each of size 10, from $f(x; .018153)$. The first such sample was 41.00, 109.70, 16.78, 6.31, 6.76, 5.62, 60.96, 78.81, 192.25, 27.61, from which $\sum x_i^* = 545.8$ and $\hat{\lambda}_1^* = 1/54.58 = .01832$. The average of the 100 bootstrap estimates is $\bar{\lambda}^* = .02153$, and the sample standard deviation of these 100 estimates is $s_{\hat{\lambda}} = .0091$, the bootstrap estimate of $\hat{\lambda}$'s standard error. A histogram of the 100 $\hat{\lambda}_i^*$'s was somewhat positively skewed, suggesting that the sampling distribution of $\hat{\lambda}$ also has this property.

■
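The parametric bootstrap of Example 6.11 can be sketched in Python using only the standard library. This is a minimal illustration, not the statistical package used in the example: the function name, the seed, and the use of `random.expovariate` are assumptions made here, so the numbers it produces will differ somewhat from the .02153 and .0091 reported above.

```python
import random
import statistics

def parametric_bootstrap(lam_hat, n, B=100, seed=0):
    """Bootstrap estimates of lambda for an exponential model.

    Draws B samples of size n from f(x; lam_hat), computes 1 / x-bar for
    each, and returns (mean of the estimates, sample SD of the estimates).
    The sample SD (divisor B - 1) is the bootstrap standard error.
    """
    rng = random.Random(seed)  # seed is an arbitrary choice for repeatability
    estimates = []
    for _ in range(B):
        sample = [rng.expovariate(lam_hat) for _ in range(n)]
        estimates.append(1.0 / statistics.mean(sample))
    return statistics.mean(estimates), statistics.stdev(estimates)

# lambda-hat = 1 / 55.087 from the example; n = 10 observations, B = 100.
lam_hat = 1 / 55.087
boot_mean, boot_se = parametric_bootstrap(lam_hat, n=10, B=100)
print(boot_mean, boot_se)
```

Rerunning with a different seed, or with a larger $B$, gives a sense of how much the bootstrap standard error itself varies from run to run.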

Sometimes an investigator wishes to estimate a population characteristic without assuming that the population distribution belongs to a particular parametric family. An instance of this occurred in Example 6.7, where a 10% trimmed mean was proposed for estimating a symmetric population distribution's center $\theta$. The data of Example 6.2 gave $\hat{\theta} = \bar{x}_{tr(10)} = 27.838$, but now there is no assumed $f(x; \theta)$, so how can we obtain a bootstrap sample? The answer is to regard the sample itself as constituting the population (the $n = 20$ observations in Example 6.2) and take $B$ different samples, each of size $n$, with replacement from this population. The book by Bradley Efron and Robert Tibshirani or the one by John Rice listed in the chapter bibliography provides more information.
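Resampling from the sample itself can be sketched as follows. The data of Example 6.2 are not reproduced in this section, so the sketch borrows the ten gas-usage observations from Exercise 7 purely as a stand-in; the function names, the choice $B = 200$, and the seed are likewise illustrative assumptions.

```python
import random
import statistics

def trimmed_mean(xs, frac=0.10):
    """Mean after dropping the smallest and largest frac of the observations."""
    xs = sorted(xs)
    k = int(frac * len(xs))  # number trimmed from each end
    core = xs[k:len(xs) - k] if k > 0 else xs
    return statistics.mean(core)

def bootstrap_se(data, stat, B=200, seed=0):
    """Nonparametric bootstrap standard error of stat.

    Each bootstrap sample has the same size as data and is drawn from it
    with replacement; the result is the sample SD of the B estimates.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = [stat(rng.choices(data, k=n)) for _ in range(B)]
    return statistics.stdev(estimates)

# Stand-in data: January gas usage (therms) from Exercise 7.
data = [103, 156, 118, 89, 125, 147, 122, 109, 138, 99]
theta_hat = trimmed_mean(data)
se_hat = bootstrap_se(data, trimmed_mean)
print(theta_hat, se_hat)
```

Any statistic that can be computed from a list of observations can be passed as `stat`, which is the practical appeal of the method when no formula for the standard error is available.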

EXERCISES Section 6.1 (1–19)

1. The accompanying data on flexural strength (MPa) for concrete beams of a certain type was introduced in Example 1.2.

5.9 7.2 7.3 6.3 8.1 6.8 7.0
7.6 6.8 6.5 7.0 6.3 7.9 9.0
8.2 8.7 7.8 9.7 7.4 7.7 9.7

a. Calculate a point estimate of the mean value of strength for the conceptual population of all beams manufactured in this fashion, and state which estimator you used. [Hint: $\sum x_i = 219.8$.]

b. Calculate a point estimate of the strength value that separates the weakest 50% of all such beams from the strongest 50%, and state which estimator you used.

6.1 Some General Concepts of Point Estimation

c. Calculate and interpret a point estimate of the population standard deviation $\sigma$. Which estimator did you use? [Hint: $\sum x_i^2$ = …]

d. Calculate a point estimate of the proportion of all such beams whose flexural strength exceeds 10 MPa. [Hint: Think of an observation as a "success" if it exceeds 10.]

e. Calculate a point estimate of the population coefficient of variation $\sigma/\mu$, and state which estimator you used.

2. A sample of 20 students who had recently taken elementary statistics yielded the following information on the brand of calculator owned (T = Texas Instruments, H = Hewlett Packard, C = Casio, S = Sharp):

H T C T
S C H

a. Estimate the true proportion of all such students who own a Texas Instruments calculator.

b. Of the 10 students who owned a TI calculator, 4 had graphing calculators. Estimate the proportion of students who do not own a TI graphing calculator.

3. Consider the following sample of observations on coating thickness for low-viscosity paint ("Achieving a Target Value for a Manufacturing Process: A Case Study," J. of Quality Technology, 1992: 22–26):

Assume that the distribution of coating thickness is normal (a normal probability plot strongly supports this assumption).

a. Calculate a point estimate of the mean value of coating thickness, and state which estimator you used.

b. Calculate a point estimate of the median of the coating thickness distribution, and state which estimator you used.

c. Calculate a point estimate of the value that separates the largest 10% of all values in the thickness distribution from the remaining 90%, and state which estimator you used. [Hint: Express what you are trying to estimate in terms of $\mu$ and $\sigma$.]

d. Estimate $P(X < 1.5)$, i.e., the proportion of all thickness values less than 1.5. [Hint: If you knew the values of $\mu$ and $\sigma$, you could calculate this probability. These values are not available, but they can be estimated.]

e. What is the estimated standard error of the estimator that you used in part (b)?

4. The article from which the data in Exercise 1 was extracted also gave the accompanying strength observations for cylinders:

6.1 5.8 7.8 7.1 7.2 9.2 6.6 8.3 7.0 8.3

Prior to obtaining data, denote the beam strengths by $X_1, \ldots, X_m$ and the cylinder strengths by $Y_1, \ldots, Y_n$. Suppose that the $X_i$'s constitute a random sample from a distribution with mean $\mu_1$ and standard deviation $\sigma_1$ and that the $Y_i$'s form a random sample (independent of the $X_i$'s) from another distribution with mean $\mu_2$ and standard deviation $\sigma_2$.

a. Use rules of expected value to show that $\bar{X} - \bar{Y}$ is an unbiased estimator of $\mu_1 - \mu_2$. Calculate the estimate for the given data.

b. Use rules of variance from Chapter 5 to obtain an expression for the variance and standard deviation (standard error) of the estimator in part (a), and then compute the estimated standard error.

c. Calculate a point estimate of the ratio $\sigma_1/\sigma_2$ of the two standard deviations.

d. Suppose a single beam and a single cylinder are randomly selected. Calculate a point estimate of the variance of the difference $X - Y$ between beam strength and cylinder strength.

5. As an example of a situation in which several different statistics could reasonably be used to calculate a point estimate, consider a population of $N$ invoices. Associated with each invoice is its "book value," the recorded amount of that invoice. Let $T$ denote the total book value, a known amount. Some of these book values are erroneous. An audit will be carried out by randomly selecting $n$ invoices and determining the audited (correct) value for each one. Suppose that the sample gives the following results (in dollars).

[Table rows: Invoice, Book value, Audited value]

Let

$\bar{Y}$ = sample mean book value
$\bar{X}$ = sample mean audited value
$\bar{D}$ = sample mean error

Propose three different statistics for estimating the total audited (i.e., correct) value: one involving just $N$ and $\bar{X}$, another involving $T$, $N$, and $\bar{D}$, and the last involving $T$ and $\bar{X}/\bar{Y}$. If $N = 5000$ and $T = 1{,}761{,}300$, calculate the three corresponding point estimates. (The article "Statistical Models and Analysis in Auditing," Statistical Science, 1989: 2–33 discusses properties of these estimators.)

6. Consider the accompanying observations on stream flow (1000s of acre-feet) recorded at a station in Colorado for the period April 1–August 31 over a 31-year span (from an article in the 1974 volume of Water Resources Research).

127.96 210.07 203.24 108.91 178.21


An appropriate probability plot supports the use of the lognormal distribution (see Section 4.5) as a reasonable model for stream flow.

a. Estimate the parameters of the distribution. [Hint: Remember that $X$ has a lognormal distribution with parameters $\mu$ and $\sigma^2$ if $\ln(X)$ is normally distributed with mean $\mu$ and variance $\sigma^2$.]

b. Use the estimates of part (a) to calculate an estimate of the expected value of stream flow. [Hint: What is $E(X)$?]

7. a. A random sample of 10 houses in a particular area, each of which is heated with natural gas, is selected and the amount of gas (therms) used during the month of January is determined for each house. The resulting observations are 103, 156, 118, 89, 125, 147, 122, 109, 138, 99. Let $\mu$ denote the average gas usage during January by all houses in this area. Compute a point estimate of $\mu$.

b. Suppose there are 10,000 houses in this area that use natural gas for heating. Let $\tau$ denote the total amount of gas used by all of these houses during January. Estimate $\tau$ using the data of part (a). What estimator did you use in computing your estimate?

c. Use the data in part (a) to estimate $p$, the proportion of all houses that used at least 100 therms.

d. Give a point estimate of the population median usage (the middle value in the population of all houses) based on the sample of part (a). What estimator did you use?

8. In a random sample of 80 components of a certain type, 12 are found to be defective.

a. Give a point estimate of the proportion of all such components that are not defective.

b. A system is to be constructed by randomly selecting two of these components and connecting them in series, as shown here.

The series connection implies that the system will function if and only if neither component is defective (i.e., both components work properly). Estimate the proportion of all such systems that work properly. [Hint: If $p$ denotes the probability that a component works properly, how can $P$(system works) be expressed in terms of $p$?]

9. Each of 150 newly manufactured items is examined and the number of scratches per item is recorded (the items are supposed to be free of scratches), yielding the following data:

Number of scratches per item    0   1   2   3   4   5   6   7
Observed frequency             18  37  42  30  13   7   2   1

Let $X$ = the number of scratches on a randomly chosen item, and assume that $X$ has a Poisson distribution with parameter $\mu$.

a. Find an unbiased estimator of $\mu$ and compute the estimate for the data. [Hint: $E(X) = \mu$ for $X$ Poisson, so $E(\bar{X}) = ?$]

b. What is the standard deviation (standard error) of your estimator? Compute the estimated standard error. [Hint: $\sigma_X^2 = \mu$ for $X$ Poisson.]

10. Using a long rod that has length $\mu$, you are going to lay out a square plot in which the length of each side is $\mu$. Thus the area of the plot will be $\mu^2$. However, you do not know the value of $\mu$, so you decide to make $n$ independent measurements $X_1, X_2, \ldots, X_n$ of the length. Assume that each $X_i$ has mean $\mu$ (unbiased measurements) and variance $\sigma^2$.

a. Show that $\bar{X}^2$ is not an unbiased estimator for $\mu^2$. [Hint: For any rv $Y$, $E(Y^2) = V(Y) + [E(Y)]^2$. Apply this with $Y = \bar{X}$.]

b. For what value of $k$ is the estimator $\bar{X}^2 - kS^2$ unbiased for $\mu^2$? [Hint: Compute $E(\bar{X}^2 - kS^2)$.]

11. Of $n_1$ randomly selected male smokers, $X_1$ smoked filter cigarettes, whereas of $n_2$ randomly selected female smokers, $X_2$ smoked filter cigarettes. Let $p_1$ and $p_2$ denote the probabilities that a randomly selected male and female, respectively, smoke filter cigarettes.

a. Show that $(X_1/n_1) - (X_2/n_2)$ is an unbiased estimator for $p_1 - p_2$. [Hint: $E(X_i) = n_i p_i$ for $i = 1, 2$.]

b. What is the standard error of the estimator in part (a)?

c. How would you use the observed values $x_1$ and $x_2$ to estimate the standard error of your estimator?

d. If $n_1 = n_2 = 200$, $x_1 = 127$, and $x_2 = 176$, use the estimator of part (a) to obtain an estimate of $p_1 - p_2$.

e. Use the result of part (c) and the data of part (d) to estimate the standard error of the estimator.

12. Suppose a certain type of fertilizer has an expected yield per acre of $\mu_1$ with variance $\sigma^2$, whereas the expected yield for a second type of fertilizer is $\mu_2$ with the same variance $\sigma^2$. Let $S_1^2$ and $S_2^2$ denote the sample variances of yields based on sample sizes $n_1$ and $n_2$, respectively, of the two fertilizers. Show that the pooled (combined) estimator

$$\hat{\sigma}^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

is an unbiased estimator of $\sigma^2$.

13. Consider a random sample $X_1, \ldots, X_n$ from the pdf

$$f(x; \theta) = .5(1 + \theta x) \qquad -1 \le x \le 1$$

where $-1 \le \theta \le 1$ (this distribution arises in particle physics). Show that $\hat{\theta} = 3\bar{X}$ is an unbiased estimator of $\theta$. [Hint: First determine $\mu = E(X) = E(\bar{X})$.]

14. A sample of $n$ captured Pandemonium jet fighters results in serial numbers $x_1, x_2, x_3, \ldots, x_n$. The CIA knows that the aircraft were numbered consecutively at the factory starting with $\alpha$ and ending with $\beta$, so that the total number of planes manufactured is $\beta - \alpha + 1$ (e.g., if $\alpha = 17$ and $\beta = 29$, then $29 - 17 + 1 = 13$ planes having serial numbers 17, 18, 19, . . . , 28, 29 were manufactured). However, the CIA does not know the values of $\alpha$ or $\beta$. A CIA statistician suggests using the

6.2 Methods of Point Estimation

estimator $\max(X_i) - \min(X_i) + 1$ to estimate the total number of planes manufactured.

a. If $n = 5$, $x_1 = 237$, $x_2 = 375$, $x_3 = 202$, $x_4 = 525$, and $x_5 = 418$, what is the corresponding estimate?

b. Under what conditions on the sample will the value of the estimate be exactly equal to the true total number of planes? Will the estimate ever be larger than the true total? Do you think the estimator is unbiased for estimating $\beta - \alpha + 1$? Explain in one or two sentences.

15. Let $X_1, X_2, \ldots, X_n$ represent a random sample from a Rayleigh distribution with pdf

$$f(x; \theta) = \frac{x}{\theta} e^{-x^2/(2\theta)} \qquad x > 0$$

a. It can be shown that $E(X^2) = 2\theta$. Use this fact to construct an unbiased estimator of $\theta$ based on $\sum X_i^2$ (and use rules of expected value to show that it is unbiased).

b. Estimate $\theta$ from the following $n = 10$ observations on vibratory stress of a turbine blade under specified conditions:

16.88 10.23 4.59 6.66 13.68
14.23 19.87 9.40 6.51 10.95

16. Suppose the true average growth $\mu$ of one type of plant during a 1-year period is identical to that of a second type, but the variance of growth for the first type is $\sigma^2$, whereas for the second type the variance is $4\sigma^2$. Let $X_1, \ldots, X_m$ be $m$ independent growth observations on the first type [so $E(X_i) = \mu$, $V(X_i) = \sigma^2$], and let $Y_1, \ldots, Y_n$ be $n$ independent growth observations on the second type [$E(Y_i) = \mu$, $V(Y_i) = 4\sigma^2$].

a. Show that for any $\delta$ between 0 and 1, the estimator $\hat{\mu} = \delta\bar{X} + (1 - \delta)\bar{Y}$ is unbiased for $\mu$.

b. For fixed $m$ and $n$, compute $V(\hat{\mu})$, and then find the value of $\delta$ that minimizes $V(\hat{\mu})$. [Hint: Differentiate $V(\hat{\mu})$ with respect to $\delta$.]

17. In Chapter 3, we defined a negative binomial rv as the number of failures that occur before the $r$th success in a sequence of independent and identical success/failure trials. The probability mass function (pmf) of $X$ is

$$nb(x; r, p) = \binom{x + r - 1}{x} p^r (1 - p)^x \qquad x = 0, 1, 2, \ldots$$

a. Suppose that $r \ge 2$. Show that

$$\hat{p} = (r - 1)/(X + r - 1)$$

is an unbiased estimator for $p$. [Hint: Write out $E(\hat{p})$ and cancel $x + r - 1$ inside the sum.]

b. A reporter wishing to interview five individuals who support a certain candidate begins asking people whether (S) or not (F) they support the candidate. If the sequence of responses is SFFSFFFSSS, estimate $p$ = the true proportion who support the candidate.

18. Let $X_1, X_2, \ldots, X_n$ be a random sample from a pdf $f(x)$ that is symmetric about $\mu$, so that $\tilde{X}$ is an unbiased estimator of $\mu$. If $n$ is large, it can be shown that $V(\tilde{X}) \approx 1/(4n[f(\mu)]^2)$.

a. Compare $V(\tilde{X})$ to $V(\bar{X})$ when the underlying distribution is normal.

b. When the underlying pdf is Cauchy (see Example 6.7), $V(\bar{X}) = \infty$, so $\bar{X}$ is a terrible estimator. What is $V(\tilde{X})$ in this case when $n$ is large?

19. An investigator wishes to estimate the proportion of students at a certain university who have violated the honor code. Having obtained a random sample of $n$ students, she realizes that asking each, "Have you violated the honor code?" will probably result in some untruthful responses. Consider the following scheme, called a randomized response technique. The investigator makes up a deck of 100 cards, of which 50 are of type I and 50 are of type II.

Type I: Have you violated the honor code (yes or no)?
Type II: Is the last digit of your telephone number a 0, 1, or 2 (yes or no)?

Each student in the random sample is asked to mix the deck, draw a card, and answer the resulting question truthfully. Because of the irrelevant question on type II cards, a yes response no longer stigmatizes the respondent, so we assume that responses are truthful. Let $p$ denote the proportion of honor-code violators (i.e., the probability of a randomly selected student being a violator), and let $\lambda = P(\text{yes response})$. Then $\lambda$ and $p$ are related by $\lambda = .5p + (.5)(.3)$.

a. Let $Y$ denote the number of yes responses, so $Y \sim \text{Bin}(n, \lambda)$. Thus $Y/n$ is an unbiased estimator of $\lambda$. Derive an estimator for $p$ based on $Y$. If $n = 80$ and $y = 20$, what is your estimate? [Hint: Solve $\lambda = .5p + .15$ for $p$ and then substitute $Y/n$ for $\lambda$.]

b. Use the fact that $E(Y/n) = \lambda$ to show that your estimator $\hat{p}$ is unbiased.

c. If there were 70 type I and 30 type II cards, what would be your estimator for $p$?
