SAMPLING THEORY

Chapter 5 SAMPLING THEORY

  To proceed we shall recall the following definitions.

  Let X 1 ,X 2 , ..., X k be k random variables all defined on the same probability

  space (Ω, A, P [.]). The joint cumulative distribution function of X 1 ,X 2 , ..., X k ,

  denoted by F X 1 ,X 2 ,...X n ( •, •, ..., •), is defined as

  F X 1 ,X 2 ,...X k (x 1 ,x 2 , ..., x k ) = P [X 1 ≤x 1 ;X 2 ≤x 2 ; ...; X k ≤x k ]

  for all (x 1 ,x 2 , ..., x k ) .

  Let X 1 ,X 2 , ..., X k be k discrete random variables, then the joint discrete

  density function of these, denoted by f X 1 ,X 2 ,...X k ( •, •, ..., •), is defined to be

  f X 1 ,X 2 ,...X k (x 1 ,x 2 , ..., x k ) = P [X 1 =x 1 ;X 2 =x 2 ; ...; X k =x k ]

  for (x 1 ,x 2 , ..., x k ) , a value of (X 1 ,X 2 , ..., X k ) and is 0 otherwise.

  Let X 1 ,X 2 , ..., X k be k continuous random variables, then the joint contin-

  uous density function of these, denoted by f X 1 ,X 2 ,...X k ( •, •, ..., •), is defined to be

  a function such that

  Z x k Z x 1

  F X 1 ,X 2 ,...X k (x 1 ,x 2 , ..., x k )=

  ... f X 1 ,X 2 ,...X k (u 1 ,u 2 , ..., u k )du 1 ..du k

  −∞

  −∞

  for all (x 1 ,x 2 , ..., x k ) . The totality of elements which are under discussion and about which infor- mation is desired will be called the target population. The statistical problem is

  Sampling Theory

  to find out something about a certain target population. It is generally impossible or impractical to examine the entire population, but one may examine a part of it (a sample from it) and, on the basis of this limited investigation, make inferences regarding the entire target population.

  The problem immediately arises as to how the sample of the population should

  be selected. Of practical importance is the case of a simple random sample, usually called a random sample, which can be defined as follows:

  Let the random variables X 1 ,X 2 , ..., X n have a joint density f X 1 ,X 2 ,...X n (x 1 ,x 2 , ..., x n )

  that factors as follows:

  f X 1 ,X 2 ,...X n (x 1 ,x 2 , ..., x n ) = f (x 1 )f (x 2 )...f (x n )

  where f (.) is the common density of each X i . Then X 1 ,X 2 , ..., X n is defined to be a

  random sample of size n from a population with density f (.). Note that identical distribution can be weakened - could have different population for each j - reflecting heterogeneous individuals. Also, in time series we might want to allow dependence, i.e., X j and X k are dependent. When we are dealing with finite population, sampling

  without replacement causes some heterogeneity since if X 1 =x 1 , then the distribution

  of X 2 must be affected.

  5.1 Sample Statistics

  A sample statistic is a function of observable random variables, which is itself an observable random variable, which does not contain any unknown parameters, i.e. a

  sample statistic is any quantity we can write as a measurable function, T (X 1 , ..., X n ) .

  For example, let X 1 ,X 2 , ..., X k be a random sample from the density f (.). Then the

  r th

  sample moment , denoted by M

  r , is defined as:

  1 X r

  M r =

  X i .

  n i=1

  Means and Variances

  In particular, if r = 1, we get the sample mean, which is usually denoted by X orX n ; that is:

  Also the r th sample central moment (about X n ), denoted by M r , is defined

  In particular, if r = 2, we get the sample variance, and the sample standard deviation

  or maybe another sample statistic for the variance,

  We can also get the sample Median,

  M = median {X 1 , ..., X n }=

  ⎩ 1 £

  ¤

  2 X (r) +X (r+1)

  if n = 2r

  the empirical cumulative distribution function

  These are analogous of corresponding population characteristics and will be shown to be similar to them when n is large. We calculate the properties of these variables: (1) Exact properties; (2) Asymptotic.

  5.2 Means and Variances We can prove the following theorems:

  Sampling Theory

  Theorem 10 Let X 1 ,X 2 , ..., X k

  be a random sample from the density f (.). The

  expected value of the r th sample moment is equal to the r population moment, i.e.

  th

  the r th sample moment is an unbiased estimator of the r population moment (Proof omitted).

  th

  Theorem 11 Let X 1 ,X 2 , ..., X n be a random sample from a density f (.), and let

  1 P

  X n = n X i be the sample mean. Then

  var[X n ]= σ

  n

  where μ and σ 2 are the mean and variance of f (.), respectively. Notice that this is

  true for any distribution f (.), provided that is not infinite.

  n ] = E[ n X i ]= n E[X i ]= n μ= n nμ = μ . Also

  1 var[X 1

  n ] = var[ n X i ]= n 2 var[X i ]=

  Theorem 12 2 Let X

  1 ,X 2 , ..., X n be a random sample from a density f (.), and let s ∗

  defined as above. Then

  var[s ∗ ]=

  where σ 2 and μ 4 are the variance and the 4 th central moment of f (.), respectively.

  Notice that this is true for any distribution f (.), provided that μ 4 is not infinite.

  Proof We shall prove first the following identity, which will be used latter:

  P n

  2 P n ¡

  ¢ 2 ¡

  ¢ 2

  (X i − μ) =

  X i −X n +n X n −μ

  i=1

  P i=1

  2 P¡

  ¢ 2 P £¡

  ¢ ¡ ¢¤ 2

  (X i − μ) =

  X i −X n +X n −μ =

  X i −X n + X n −μ =

  P h¡

  ¢ 2 ¡

  ¢¡

  ¢ ¡

  ¢ 2 i

  X i −X n

  X i −X n X n −μ + X n −μ =

  P¡

  ¢ 2 ¡

  ¢P¡

  ¢

  ¡

  ¢ 2

  X i −X n

  X n −μ

  X i −X n +n X n −μ =

  Sampling from the Normal Distribution

  Using the above identity we obtain: ∙

  σ − nvar X n =

  The derivation of the variance of S 2

  n is omitted.

  Theorem 13 Let X 1 , ..., X n be a random sample from a population with mean μ,

  variance σ 2 , skewness κ

  3 , and kurtosis κ 4 . Then,

  (3) E (F n (x)) = F (x), and V ar (F n (x)) = F (x) (1 − F (x)) n (4) Characteristic Function of X, ϕ n

  X (t) = [ϕ X (tn)] .

  E (F n (x)) = E n

  1 (X i ≤ x) = E (1 (X i ≤ x)) = F (x) . Also V ar (F n (x)) =

  E [F n (x) − F (x)] =E n

  =E n 2 {1 [X i ≤ x] − F (x)} + n 2 {1 [X i ≤ x] − F (x)} {1 [X j ≤ x] − F (x)}

  = n E {1 [X ≤ x] − F (x)} =

  i

  n {E {1 [X i ≤ x] − F (x) }} = n F (x) [1 − F (x)]

  5.3 Sampling from the Normal Distribution Theorem 14 Let denote X n the sample mean of a random sample of size n from a

  2 normal distribution with mean μ and variance σ 2 . Then (1) X v N(μ, σ n). (2) X and s 2 are independent.

  (3) 2 (n−1)s ∗

  2 σ 2 vχ n−1

  (4) X−μ s ∗ √ n vt n−1 .

  Proof

  Sampling Theory

  (1) From a Theorem above we have that

  ϕ n

  X (t) = [ϕ X (tn)] .

  1 2 ¡ t ¢ 2 ´i Now ϕ X (tn) = exp iμt − 2 σ t . Hence ϕ X (t) = exp iμ n − 2 σ n

  t , which is the cf of a normal distribution with mean μ and

  2 n

  σ variance 2

  n .

  X 1 (2) For n = 2 we have that if X +X 2

  1 v N(0, 1) and X 2 v N(0, 1) then X =

  2 (X 1 = X −X 2 ) and s 2 . Define Z

  1 = 1 +X 2 4 X and Z 2 = 1 −X 2 . Then Z 1 and Z 2 are

  uncorrelated and by normality independent.

  5.3.1 The Gamma Function The gamma function is defined as:

  Z ∞

  Γ(t) = x t−1 e −x dx

  f or t>0

  Notice that Γ(t + 1) = tΓ(t),as

  0 +t x de −x = tΓ(t)

  and if t is an integer then Γ(t + 1) = t!. Also if t is again an integer then Γ(t + 1

  1∗3∗5∗...(2t−1) √

  1 √

  π . Finally Γ( 2 )= π .

  2 t

  Recall that if X is a random variable with density

  x 2 −1 e − 2 f or 0

  Γ(k2) 2

  where Γ(.) is the gamma function, then X is defined to have a chi-square distrib- ution with k degrees of freedom .

  Notice that X is distributed as above then:

  E[X] = k

  and

  var[X] = 2k

  We can prove the following theorem

  Sampling from the Normal Distribution

  Theorem 15 If the random variables X i , i = 1, 2, .., k are normally and indepen- dently distributed with means μ 2

  i and variances σ i then

  has a chi-square distribution with k degrees of freedom. Proof omitted.

  Furthermore,

  Theorem 16 If the random variables X i , i = 1, 2, .., k are normally and indepen-

  dently distributed with mean μ and variance σ 1 2 , and let S 2 P

  where χ 2 n−1 is the chi-square distribution with n −1 degrees of freedom. Proof omitted.

  5.3.2 The F Distribution If X is a random variable with density

  ³m m m2

  ´

  Γ[(m + n)2]

  x 2 −1

  f X (x) =

  f or 0

  Γ(m2)Γ(n2) n

  [1 + (mn)x] (m+n)2

  where Γ(.) is the gamma function, then X is defined to have a F distribution with m and n degrees of freedom .

  Notice that if X is distributed as above then:

  n 2 2n (m + n − 2)

  E[X] =

  and

  var[X] =

  −2 2 − 2) (n − 4)

  m(n

  Theorem 17 If the random variables U and V are independently distributed as chi-

  2 square with m and n degrees of freedom, respectively i.e. U 2 vχ

  m and V vχ n

  independently, then

  where F m,n is the F distribution with m, n degrees of freedom. Proof omitted.

  Sampling Theory

  5.3.3 The Student-t Distribution If X is a random variable with density

  Γ[(k + 1)2] 1 1

  f X (x) =

  √

  f or −∞

  Γ(k2)

  kπ [1 + x k]

  2 (k+1)2

  where Γ(.) is the gamma function, then X is defined to have a t distribution with k degrees of freedom .

  Notice that if X is distributed as above then:

  k

  E[X] = 0

  and

  var[X] =

  −2

  Theorem 18 If the random variables Z and V are independently distributed as stan- dard normal and chi-square with k, respectively i.e. Z

  v (N(0, 1) and V v χ 2 k inde-

  pendently, then

  where t k is the t distribution with k degrees of freedom. Proof omitted.

Dokumen yang terkait

Analisis Komparasi Internet Financial Local Government Reporting Pada Website Resmi Kabupaten dan Kota di Jawa Timur The Comparison Analysis of Internet Financial Local Government Reporting on Official Website of Regency and City in East Java

19 819 7

ANTARA IDEALISME DAN KENYATAAN: KEBIJAKAN PENDIDIKAN TIONGHOA PERANAKAN DI SURABAYA PADA MASA PENDUDUKAN JEPANG TAHUN 1942-1945 Between Idealism and Reality: Education Policy of Chinese in Surabaya in the Japanese Era at 1942-1945)

1 29 9

Docking Studies on Flavonoid Anticancer Agents with DNA Methyl Transferase Receptor

0 55 1

EVALUASI PENGELOLAAN LIMBAH PADAT MELALUI ANALISIS SWOT (Studi Pengelolaan Limbah Padat Di Kabupaten Jember) An Evaluation on Management of Solid Waste, Based on the Results of SWOT analysis ( A Study on the Management of Solid Waste at Jember Regency)

4 28 1

Improving the Eighth Year Students' Tense Achievement and Active Participation by Giving Positive Reinforcement at SMPN 1 Silo in the 2013/2014 Academic Year

7 202 3

The Correlation between students vocabulary master and reading comprehension

16 145 49

Improping student's reading comprehension of descriptive text through textual teaching and learning (CTL)

8 140 133

The correlation between listening skill and pronunciation accuracy : a case study in the firt year of smk vocation higt school pupita bangsa ciputat school year 2005-2006

9 128 37

Pengaruh kualitas aktiva produktif dan non performing financing terhadap return on asset perbankan syariah (Studi Pada 3 Bank Umum Syariah Tahun 2011 – 2014)

6 101 0

Transmission of Greek and Arabic Veteri

0 1 22