Chapter07 - Sampling and Sampling Distributions
Statistics for Managers Using Microsoft® Excel 5th Edition
Learning Objectives
In this chapter, you will learn:
To distinguish between different survey sampling methods
The concept of the sampling distribution
To compute probabilities related to the sample mean and the sample proportion
The importance of the Central Limit Theorem
Why Sample?
than selecting every item in the population (census). Selecting a sample is less costly than
Selecting a sample is less time-consuming
selecting every item in the population.
An analysis of a sample is less cumbersome
and more practical than an analysis of the entire population.
Types of Samples Quota Samples Non-Probability Samples Judgment Chunk Probability Samples Simple Random Systematic Stratified Cluster Convenience
Types of Samples In a nonprobability sample , items included
are chosen without regard to their probability of occurrence.
based only on the fact that they are easy, inexpensive, or convenient to sample. In a judgment sample, you get the opinions of
In convenience sampling, items are selected
pre-selected experts in the subject matter. Types of Samples In a probability sample, items in the sample
are chosen on the basis of known probabilities.
Probability Samples Simple Systematic Stratified Cluster Random
Simple Random Sampling
Every individual or item from the frame has an
equal chance of being selected Selection may be with replacement (selected
individual is returned to frame for possible reselection) or without replacement (selected individual isn’t returned to the frame).
Samples obtained from table of random numbers
or computer random number generators.
Systematic Sampling
Decide on sample size: n
Divide frame of N individuals into groups of k individuals: k=N/n
Randomly select one individual from the 1st group
Select every kth individual thereafter For example, suppose you were sampling n = 9 individuals from a population of N = 72. So, the population would be divided into k = 72/9 = 8 groups. Randomly select a member from group 1, say individual 3. Then, select every 8 th individual thereafter (i.e. 3, 11, 19, 27, 35, 43, 51, 59, 67)
Stratified Sampling
(called strata) according to some common characteristic. A simple random sample is selected from each
Divide population into two or more subgroups
subgroup, with sample sizes proportional to strata sizes. Samples from subgroups are combined into one.
This is a common technique when sampling
population of voters, stratifying across racial or socio-economic lines.
Cluster Sampling
Population is divided into several “clusters,” each
representative of the population. A simple random sample of clusters is selected.
All items in the selected clusters can be used, or items
can be chosen from a cluster using another probability sampling technique. A common application of cluster sampling involves
election exit polls, where certain election districts are selected and sampled.
Comparing Sampling Methods
Simple random sample and Systematic sample Simple to use
May not be a good representation of the population’s
underlying characteristics Stratified sample
Ensures representation of individuals across the entire population
More cost effective
Cluster sample
Less efficient (need larger sample to acquire the same
level of precision)
Evaluating Survey Worthiness
What is the purpose of the survey?
Were data collected using a non-probability
sample or a probability sample? Coverage error – appropriate frame?
Nonresponse error – follow up
Measurement error – good questions elicit good
responses Sampling error – always exists
Types of Survey Errors
Coverage error or selection bias
Exists if some groups are excluded from the frame
and have no chance of being selected Non response error or bias
People who do not respond may be different from
those who do respond Sampling error
Chance (luck of the draw) variation from sample to
sample.
Measurement error
Due to weaknesses in question design, respondent
error, and interviewer’s impact on the respondent
Sampling Distributions
possible values of a statistic for a given size sample selected from a population.
A sampling distribution is a distribution of all of the
For example, suppose you sample 50 students from your college regarding their mean GPA. If you obtained many different samples of 50, you will compute a different mean for each sample. We are
interested in the distribution of all potential mean GPA
we might calculate for any given sample of 50 students.Sampling Distributions Sample Mean Example Suppose your population (simplified) was
four people at your institution. Population size N=4
Random variable, X, is age of individuals
Values of X: 18, 20, 22, 24 (years) Sampling Distributions Sample Mean Example Summary Measures for the Population Distribution:
21
4
24
22
20
X μ i
2.236 N μ) (X σ 2 i
.3 .2 .1 18 20 22 24 A B C D P(x) x
Uniform Distribution
Sampling Distributions Sample Mean Example
23
21
22
22
20
21
22
24
19
21
22
23
24
16 Sample Means 16 possible samples
(sampling with replacement)
20
1 st Obs.
2 nd Observation
2 nd Observation
18
20
22
24 18 18,18 18,20 18,22 18,24 20 20,18 29,20 20,22 20,24 22 22,18 22,20 22,22 22,24 24 24,18 24,20 24,22 24,24
Now consider all possible samples of size n=2
1 st Obs.
18
21
20
22
24
18
18
19
20
20 Sampling Distributions Sample Mean Example Sampling Distribution of All Sample Means
1 st Obs
20
Distribution
(no longer uniform) Sample Means
X
18 19 20 21 22 23 24 .1 .2 .3
16 Sample Means
24
23
22
21
24
23
22
21
22
2 nd Observation
22
21
20
19
20
21
20
19
18
18
24
22
20
18
_
Sampling Distributions Sample Mean Example
21
1.58
Summary Measures of this Sampling Distribution:
2 2 2 2 X i X
16
21) - (24 21) - (19 21) - (18 N ) μ
16
X
X μ i
19
21
24
X ( σ
Sampling Distributions Sample Mean Example Population N = 4
1.58 σ 21 μ
X
X 2.236 σ 21 μ
Sample Means Distribution n = 2
18 20 22 24 A B C D .1 .2 .3
P(X) X 18 19 20 21 22 23 24
.1 .2 .3
P(X)
X _
_ Sampling Distributions Standard Error Different samples of the same size from the same population
will yield different sample means. A measure of the variability in the mean from sample to
sample is given by the Standard Error of the Mean: σ σ
X n
sample size increases.
Note that the standard error of the mean decreases as the
Sampling Distributions
Standard Error: Normal Pop.
If a population is normal with mean μ and standard
deviation σ, the sampling distribution of the mean is also normally distributed with
σ σ
and
μ μ
X X n
(This assumes that sampling is with replacement or sampling is without replacement from an infinite population)
Sampling Distributions Z Value: Normal Pop.
Z-value for the sampling distribution of the sample mean:
( X μ ) ( X μ) X Z
σ σ X n where: = sample mean
X μ = population mean
= population standard deviation
σn = sample size Sampling Distributions Properties: Normal Pop.
Normal Population Distribution
μ μ x
μ x x
(i.e. is unbiased )
Normal Sampling Distribution (has the same mean)
μ x x Sampling Distributions Properties: Normal Pop.
For sampling with replacement:
As n increases,
decreases x
σ Larger sample size Smaller sample size x
μ Sampling Distributions Non-Normal Population
The Central Limit Theorem states that as the sample
size (that is, the number of values in each sample) gets large enough, the sampling distribution of the mean is approximately normally distributed. This is true regardless of the shape of the distribution of the individual values in the population. Measures of the sampling distribution:
σ σ μ μ
x
x n Sampling Distributions Non-Normal Population
Population Distribution
μ x
Sampling Distribution (becomes normal as n increases)
Larger Smaller sample sample size size x
μ x
Sampling Distributions Non-Normal Population
For most distributions, n > 30 will give a
sampling distribution that is nearly normal For fairly symmetric distributions, n > 15 will
give a sampling distribution that is nearly normal For normal population distributions, the
sampling distribution of the mean is always normally distributed
Sampling Distributions Example
Suppose a population has mean μ = 8 and standard
deviation σ = 3. Suppose a random sample of size n = 36 is selected. What is the probability that the sample mean is between
7.75 and 8.25? Even if the population is not normally distributed, the
central limit theorem can be used (n > 30).
So, the distribution of the sample mean is approximately normal with
σ
3 μ
8
x
σ x0.5 n
36
Sampling Distributions Example
5 .
36
3 8 - 8.25 5 .
36
3 8 -
7.75
Z Z First, compute Z values for both 7.75 and 8.25.
0.3830 0.5) Z P(-0.5 8.25) μ P(7.75 X
Now, use the cumulative normal table to compute the correct probability. Sampling Distributions Example Population Distribution
= 2(.5000-.3085) = 2(.1915) = 0.3830
μ 8 Sample
X Sampling Standardized Normal
Distribution Distribution 7.75 8.25 -0.5 0.5
μ Z x z
μ
8 X Sampling Distributions The Proportion The proportion of the population having some
characteristic is denoted π.
Sample proportion ( p ) provides an estimate of π: X number of items in the sample having the characteri stic of interest p
n sample size
p has a binomial distribution
0 ≤ p ≤ 1
(assuming sampling with replacement from a finite population or without replacement from an infinite population)
Sampling Distributions The Proportion
Standard error for the proportion: n ) (1
σ p n
) (1
σ Z p
p p
Z value for the proportion:
Sampling Distributions The Proportion: Example
If the true proportion of voters who support Proposition A is π = .4, what is the probability that a sample of size 200 yields a sample proportion between .40 and .45?
In other words, if π = .4 and n = 200, what is P(.40 ≤ p ≤ .45) ?
Sampling Distributions The Proportion: Example (1 ) .4(1 .4)
σ
.03464 p Find : σ p n 200
Convert to .40 .40 .45 .40
P(.40 .45) P Z
p standardized
.03464 .03464 normal:
P(0 Z 1.44) Sampling Distributions The Proportion: Example Use cumulative normal table: P(0 ≤ Z ≤ 1.44) = P(Z ≤ 1.44) – 0.5 = .4251
Standardized Sampling Distribution
Normal Distribution Standardize .4251
.40 .45
1.44
p Z Chapter Summary
Described different types of samples
Examined survey worthiness and types of survey errors
Introduced sampling distributions
Described the sampling distribution of the mean
For normal populations
Using the Central Limit Theorem
Described the sampling distribution of a proportion
Calculated probabilities using sampling distributions In this chapter, we have