Chapter 5: Sampling Theory
To proceed we shall recall the following definitions.
Let X_1, X_2, ..., X_k be k random variables all defined on the same probability space (Ω, A, P[·]). The joint cumulative distribution function of X_1, X_2, ..., X_k, denoted by F_{X_1,X_2,...,X_k}(·, ·, ..., ·), is defined as

F_{X_1,X_2,...,X_k}(x_1, x_2, ..., x_k) = P[X_1 ≤ x_1; X_2 ≤ x_2; ...; X_k ≤ x_k]

for all (x_1, x_2, ..., x_k).
Let X_1, X_2, ..., X_k be k discrete random variables; then the joint discrete density function of these, denoted by f_{X_1,X_2,...,X_k}(·, ·, ..., ·), is defined to be

f_{X_1,X_2,...,X_k}(x_1, x_2, ..., x_k) = P[X_1 = x_1; X_2 = x_2; ...; X_k = x_k]

for (x_1, x_2, ..., x_k) a value of (X_1, X_2, ..., X_k), and is 0 otherwise.
Let X_1, X_2, ..., X_k be k continuous random variables; then the joint continuous density function of these, denoted by f_{X_1,X_2,...,X_k}(·, ·, ..., ·), is defined to be a function such that

F_{X_1,X_2,...,X_k}(x_1, x_2, ..., x_k) = ∫_{−∞}^{x_k} ... ∫_{−∞}^{x_1} f_{X_1,X_2,...,X_k}(u_1, u_2, ..., u_k) du_1 ... du_k
for all (x_1, x_2, ..., x_k).

The totality of elements which are under discussion, and about which information is desired, will be called the target population. The statistical problem is to find out something about a certain target population. It is generally impossible or impractical to examine the entire population, but one may examine a part of it (a sample from it) and, on the basis of this limited investigation, make inferences regarding the entire target population.
The problem immediately arises as to how the sample of the population should
be selected. Of practical importance is the case of a simple random sample, usually called a random sample, which can be defined as follows:
Let the random variables X_1, X_2, ..., X_n have a joint density f_{X_1,X_2,...,X_n}(x_1, x_2, ..., x_n) that factors as follows:

f_{X_1,X_2,...,X_n}(x_1, x_2, ..., x_n) = f(x_1) f(x_2) ... f(x_n),

where f(·) is the common density of each X_i. Then X_1, X_2, ..., X_n is defined to be a random sample of size n from a population with density f(·).

Note that the identical-distribution requirement can be weakened: each X_j could be drawn from a different population, reflecting heterogeneous individuals. Also, in time series we might want to allow dependence, i.e., X_j and X_k dependent. When we are dealing with a finite population, sampling without replacement causes some heterogeneity, since if X_1 = x_1, then the distribution of X_2 is affected.
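As a quick illustration of the factorization that defines a random sample, the following minimal Python sketch (standard library only; the function name `draw_random_sample` is chosen here for illustration, not taken from the text) draws an i.i.d. sample and checks the product form f(x_1)f(x_2) of the joint pmf for two independent fair-coin flips.

```python
import random

def draw_random_sample(n, seed=0):
    """Draw an i.i.d. sample of size n from a common density f(.)
    (here: standard normal), illustrating a simple random sample."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# For discrete variables the factorization can be checked directly:
# two independent fair-coin flips.
f = {0: 0.5, 1: 0.5}                      # common marginal pmf
joint = {(a, b): f[a] * f[b] for a in f for b in f}
assert abs(sum(joint.values()) - 1.0) < 1e-12  # a valid joint pmf
assert joint[(1, 0)] == f[1] * f[0]            # joint factors into marginals
```

Sampling without replacement from a finite population would break this factorization, which is exactly the heterogeneity noted above.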
5.1 Sample Statistics
A sample statistic is a function of observable random variables, which is itself an observable random variable and does not contain any unknown parameters; i.e., a sample statistic is any quantity we can write as a measurable function T(X_1, ..., X_n).

For example, let X_1, X_2, ..., X_n be a random sample from the density f(·). Then the r-th sample moment, denoted by M_r, is defined as:

M_r = (1/n) Σ_{i=1}^n X_i^r.
In particular, if r = 1, we get the sample mean, usually denoted by X̄ or X̄_n; that is:

X̄_n = (1/n) Σ_{i=1}^n X_i.

Also the r-th sample central moment (about X̄_n), denoted by M'_r, is defined as

M'_r = (1/n) Σ_{i=1}^n (X_i − X̄_n)^r.

In particular, if r = 2, we get the sample variance

S_n² = (1/n) Σ_{i=1}^n (X_i − X̄_n)²,

and the sample standard deviation S_n, or maybe another sample statistic for the variance,

s*² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄_n)².

We can also get the sample median,

M = median{X_1, ..., X_n} = X_{(r+1)} if n = 2r + 1, and (1/2)(X_{(r)} + X_{(r+1)}) if n = 2r,

where X_{(1)} ≤ ... ≤ X_{(n)} are the order statistics, and the empirical cumulative distribution function

F_n(x) = (1/n) Σ_{i=1}^n 1(X_i ≤ x).
These are analogues of the corresponding population characteristics, and will be shown to be close to them when n is large. We study the properties of these statistics in two ways: (1) exact properties; (2) asymptotic properties.
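The sample statistics above are all directly computable from data; a minimal Python sketch (standard library only; variable names chosen here for illustration):

```python
import statistics

def ecdf(sample, x):
    """Empirical CDF: fraction of observations not exceeding x."""
    return sum(1 for xi in sample if xi <= x) / len(sample)

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(sample)
xbar = sum(sample) / n                                     # sample mean
s2 = sum((xi - xbar) ** 2 for xi in sample) / n            # S_n^2 (1/n version)
s2_star = sum((xi - xbar) ** 2 for xi in sample) / (n - 1) # s*^2 (1/(n-1) version)
med = statistics.median(sample)  # averages the two middle order statistics

assert xbar == 5.0               # sum is 40, n = 8
assert s2 == 4.0                 # sum of squared deviations is 32
assert abs(s2_star - 32 / 7) < 1e-12
assert med == 4.5                # n = 2r: (X_(4) + X_(5)) / 2 = (4 + 5) / 2
assert ecdf(sample, 4.0) == 0.5  # 4 of 8 observations are <= 4
```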
5.2 Means and Variances

We can prove the following theorems:
Theorem 10 Let X_1, X_2, ..., X_n be a random sample from the density f(·). Then the expected value of the r-th sample moment is equal to the r-th population moment; i.e., the r-th sample moment is an unbiased estimator of the r-th population moment. (Proof omitted.)
Theorem 11 Let X_1, X_2, ..., X_n be a random sample from a density f(·), and let X̄_n = (1/n) Σ X_i be the sample mean. Then

E[X̄_n] = μ and var[X̄_n] = σ²/n,

where μ and σ² are the mean and variance of f(·), respectively. Notice that this is true for any distribution f(·), provided that σ² is not infinite.

Proof E[X̄_n] = E[(1/n) Σ X_i] = (1/n) Σ E[X_i] = (1/n) Σ μ = (1/n) nμ = μ. Also

var[X̄_n] = var[(1/n) Σ X_i] = (1/n²) Σ var[X_i] = (1/n²) nσ² = σ²/n,

where the second equality uses independence.
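Theorem 11 is easy to check by Monte Carlo; the sketch below assumes a Uniform(0,1) population (so μ = 1/2 and σ² = 1/12) and estimates the mean and variance of X̄_n over many replications (standard library only).

```python
import random
import statistics

def mean_of_sample(rng, n):
    """One draw of the sample mean of n Uniform(0,1) variables."""
    return sum(rng.uniform(0, 1) for _ in range(n)) / n

rng = random.Random(123)
n, reps = 25, 20000
means = [mean_of_sample(rng, n) for _ in range(reps)]

# Uniform(0,1): mu = 1/2, sigma^2 = 1/12, so var[Xbar_n] should be 1/(12 n).
emp_mean = statistics.fmean(means)
emp_var = statistics.pvariance(means)
assert abs(emp_mean - 0.5) < 0.005
assert abs(emp_var - 1 / (12 * n)) < 0.0005
```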
Theorem 12 Let X_1, X_2, ..., X_n be a random sample from a density f(·), and let s*² be defined as above. Then

var[s*²] = (1/n) [ μ_4 − ((n−3)/(n−1)) σ⁴ ],

where σ² and μ_4 are the variance and the 4th central moment of f(·), respectively. Notice that this is true for any distribution f(·), provided that μ_4 is not infinite.
Proof We shall first prove the following identity, which will be used later:

Σ_{i=1}^n (X_i − μ)² = Σ_{i=1}^n (X_i − X̄_n)² + n(X̄_n − μ)².

Indeed,

Σ (X_i − μ)² = Σ [(X_i − X̄_n) + (X̄_n − μ)]²
= Σ [(X_i − X̄_n)² + 2(X_i − X̄_n)(X̄_n − μ) + (X̄_n − μ)²]
= Σ (X_i − X̄_n)² + 2(X̄_n − μ) Σ (X_i − X̄_n) + n(X̄_n − μ)²
= Σ (X_i − X̄_n)² + n(X̄_n − μ)²,

since Σ (X_i − X̄_n) = 0.
Using the above identity we obtain:

E[Σ (X_i − X̄_n)²] = E[Σ (X_i − μ)²] − n E[(X̄_n − μ)²] = nσ² − n var[X̄_n] = nσ² − σ² = (n−1)σ²,

so that E[s*²] = σ². The derivation of the variance of s*² is omitted.
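The algebraic identity used in this proof holds for every data set and every fixed constant in the role of μ, so it can be verified numerically; a small sketch (standard library only, values chosen arbitrarily):

```python
data = [1.2, -0.7, 3.4, 0.0, 2.1]
mu = 1.0                       # any fixed constant works in the identity
n = len(data)
xbar = sum(data) / n           # here xbar = 6.0 / 5 = 1.2

# sum (x - mu)^2  vs  sum (x - xbar)^2 + n (xbar - mu)^2
lhs = sum((x - mu) ** 2 for x in data)
rhs = sum((x - xbar) ** 2 for x in data) + n * (xbar - mu) ** 2
assert abs(lhs - rhs) < 1e-9   # the cross term vanishes identically
```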
Theorem 13 Let X_1, ..., X_n be a random sample from a population with mean μ, variance σ², skewness κ_3, and kurtosis κ_4. Then, in particular,

(3) E(F_n(x)) = F(x) and Var(F_n(x)) = F(x)(1 − F(x))/n;

(4) the characteristic function of X̄_n is ϕ_{X̄_n}(t) = [ϕ_X(t/n)]^n.

Proof of (3):

E(F_n(x)) = E[(1/n) Σ 1(X_i ≤ x)] = (1/n) Σ E[1(X_i ≤ x)] = F(x).

Also,

Var(F_n(x)) = E[F_n(x) − F(x)]² = E[(1/n) Σ_i {1(X_i ≤ x) − F(x)}]²
= (1/n²) Σ_i E{1(X_i ≤ x) − F(x)}² + (1/n²) Σ_{i≠j} E[{1(X_i ≤ x) − F(x)}{1(X_j ≤ x) − F(x)}]
= (1/n) E{1(X ≤ x) − F(x)}² = (1/n) F(x)[1 − F(x)],

since the cross terms vanish by independence.
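The mean and variance of the empirical CDF can likewise be checked by simulation; the sketch below samples from a standard normal, for which F(0) = 1/2, so E[F_n(0)] should be 1/2 and Var(F_n(0)) should be (1/2)(1/2)/n (standard library only).

```python
import random
import statistics

def Fn(sample, x):
    """Empirical CDF of the sample evaluated at x."""
    return sum(1 for xi in sample if xi <= x) / len(sample)

rng = random.Random(7)
n, reps, x = 10, 40000, 0.0
# Standard normal: F(0) = 1/2, so E[Fn(0)] = 1/2, Var[Fn(0)] = 0.25 / n.
vals = [Fn([rng.gauss(0, 1) for _ in range(n)], x) for _ in range(reps)]

assert abs(statistics.fmean(vals) - 0.5) < 0.01
assert abs(statistics.pvariance(vals) - 0.25 / n) < 0.001
```

Note that n·F_n(x) is Binomial(n, F(x)), which is exactly where the F(x)(1 − F(x))/n variance comes from.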
5.3 Sampling from the Normal Distribution

Theorem 14 Let X̄_n denote the sample mean of a random sample of size n from a normal distribution with mean μ and variance σ². Then

(1) X̄_n ∼ N(μ, σ²/n);
(2) X̄_n and s*² are independent;
(3) (n−1)s*²/σ² ∼ χ²_{n−1};
(4) (X̄_n − μ)/(s*/√n) ∼ t_{n−1}.
Proof

(1) From the theorem above we have that ϕ_{X̄_n}(t) = [ϕ_X(t/n)]^n. Now ϕ_X(t) = exp(iμt − (1/2)σ²t²), so

ϕ_{X̄_n}(t) = [exp(iμ(t/n) − (1/2)σ²(t/n)²)]^n = exp(iμt − (1/2)(σ²/n)t²),

which is the cf of a normal distribution with mean μ and variance σ²/n.

(2) We sketch the case n = 2. If X_1 ∼ N(0,1) and X_2 ∼ N(0,1) independently, then X̄ = (X_1 + X_2)/2 and S² = (X_1 − X_2)²/4. Define Z_1 = X_1 + X_2 and Z_2 = X_1 − X_2. Then Z_1 and Z_2 are uncorrelated, and by normality independent; since X̄ is a function of Z_1 alone and S² a function of Z_2 alone, they are independent.
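Part (3) of Theorem 14 can be checked by simulation: (n−1)s*²/σ² should have the mean n−1 and variance 2(n−1) of a χ²_{n−1} variable. A sketch (standard library only; μ, σ, n, and the seed are chosen arbitrarily):

```python
import random
import statistics

rng = random.Random(42)
n, reps, mu, sigma = 8, 30000, 2.0, 3.0

draws = []
for _ in range(reps):
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    s2_star = sum((x - xbar) ** 2 for x in xs) / (n - 1)  # unbiased s*^2
    draws.append((n - 1) * s2_star / sigma ** 2)

# chi-square with n-1 df has mean n-1 and variance 2(n-1)
assert abs(statistics.fmean(draws) - (n - 1)) < 0.15
assert abs(statistics.pvariance(draws) - 2 * (n - 1)) < 1.0
```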
5.3.1 The Gamma Function

The gamma function is defined as:

Γ(t) = ∫_0^∞ x^{t−1} e^{−x} dx for t > 0.

Notice that Γ(t+1) = tΓ(t), as integration by parts gives

Γ(t+1) = ∫_0^∞ x^t e^{−x} dx = [−x^t e^{−x}]_0^∞ + t ∫_0^∞ x^{t−1} e^{−x} dx = tΓ(t),

and if t is an integer then Γ(t+1) = t!. Also, if t is again an integer, then

Γ(t + 1/2) = (1·3·5···(2t−1) / 2^t) √π.

Finally, Γ(1/2) = √π.
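These properties can be confirmed directly with Python's standard-library `math.gamma`:

```python
import math

# Recurrence: Gamma(t+1) = t * Gamma(t)
for t in (0.5, 1.7, 3.0):
    assert abs(math.gamma(t + 1) - t * math.gamma(t)) < 1e-12

# Integer case: Gamma(t+1) = t!
assert math.gamma(5) == math.factorial(4)

# Gamma(1/2) = sqrt(pi), and Gamma(t + 1/2) = (1*3*...*(2t-1)) / 2^t * sqrt(pi)
assert abs(math.gamma(0.5) - math.sqrt(math.pi)) < 1e-12
t = 3
odd_prod = 1 * 3 * 5   # 1*3*...*(2t-1) for t = 3
assert abs(math.gamma(t + 0.5) - odd_prod / 2 ** t * math.sqrt(math.pi)) < 1e-12
```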
Recall that if X is a random variable with density

f_X(x) = (1 / (Γ(k/2) 2^{k/2})) x^{k/2−1} e^{−x/2} for 0 < x < ∞,

where Γ(·) is the gamma function, then X is defined to have a chi-square distribution with k degrees of freedom. Notice that if X is distributed as above, then:

E[X] = k and var[X] = 2k.

We can prove the following theorem.

Theorem 15 If the random variables X_i, i = 1, 2, ..., k, are normally and independently distributed with means μ_i and variances σ_i², then

Σ_{i=1}^k ((X_i − μ_i)/σ_i)²

has a chi-square distribution with k degrees of freedom. Proof omitted.

Furthermore,

Theorem 16 If the random variables X_i, i = 1, 2, ..., n, are normally and independently distributed with mean μ and variance σ², and s*² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄_n)², then

(n−1)s*²/σ² ∼ χ²_{n−1},

where χ²_{n−1} is the chi-square distribution with n−1 degrees of freedom. Proof omitted.

5.3.2 The F Distribution

If X is a random variable with density

f_X(x) = (Γ[(m+n)/2] / (Γ(m/2)Γ(n/2))) (m/n)^{m/2} x^{m/2−1} / [1 + (m/n)x]^{(m+n)/2} for 0 < x < ∞,

where Γ(·) is the gamma function, then X is defined to have an F distribution with m and n degrees of freedom. Notice that if X is distributed as above, then:

E[X] = n/(n−2) and var[X] = 2n²(m + n − 2) / (m(n−2)²(n−4)).

Theorem 17 If the random variables U and V are independently distributed as chi-square with m and n degrees of freedom, respectively, i.e. U ∼ χ²_m and V ∼ χ²_n independently, then

(U/m)/(V/n) ∼ F_{m,n},

where F_{m,n} is the F distribution with m, n degrees of freedom. Proof omitted.

5.3.3 The Student-t Distribution

If X is a random variable with density

f_X(x) = (Γ[(k+1)/2] / (Γ(k/2)√(kπ))) · 1/[1 + x²/k]^{(k+1)/2} for −∞ < x < ∞,

where Γ(·) is the gamma function, then X is defined to have a t distribution with k degrees of freedom. Notice that if X is distributed as above, then:

E[X] = 0 and var[X] = k/(k−2).

Theorem 18 If the random variables Z and V are independently distributed as standard normal and chi-square with k degrees of freedom, respectively, i.e. Z ∼ N(0,1) and V ∼ χ²_k independently, then

Z/√(V/k) ∼ t_k,

where t_k is the t distribution with k degrees of freedom. Proof omitted.
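Theorems 17 and 18 give constructions of F and t variables out of chi-squares and normals; simulating those constructions and comparing against the stated moments is a useful sanity check. A sketch (standard library only; degrees of freedom, replication count, and tolerances chosen here for illustration):

```python
import random
import statistics

rng = random.Random(1)

def chi2(df):
    """One chi-square draw as a sum of df squared standard normals."""
    return sum(rng.gauss(0, 1) ** 2 for _ in range(df))

m, n, reps = 5, 12, 20000

# Theorem 17: (U/m)/(V/n) with U ~ chi2_m, V ~ chi2_n is F(m, n);
# its mean is n/(n-2).
f_draws = [(chi2(m) / m) / (chi2(n) / n) for _ in range(reps)]
assert abs(statistics.fmean(f_draws) - n / (n - 2)) < 0.05

# Theorem 18: Z / sqrt(V/k) with Z ~ N(0,1), V ~ chi2_k is t_k;
# its mean is 0 and its variance is k/(k-2).
k = 10
t_draws = [rng.gauss(0, 1) / (chi2(k) / k) ** 0.5 for _ in range(reps)]
assert abs(statistics.fmean(t_draws)) < 0.05
assert abs(statistics.pvariance(t_draws) - k / (k - 2)) < 0.1
```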