Inference for Noisy Samples

PJSOR, Vol. 8, No. 3, pp. 381-391, July 2012
Statistics in the Twenty-First Century: Special Volume
In Honour of Distinguished Professor Dr. Mir Masoom Ali, On the Occasion of his 75th Birthday Anniversary

Ibrahim A. Ahmad
Department of Statistics, Oklahoma State University, Stillwater, Oklahoma 74078
[email protected]

N. Herawati
Department of Mathematics, University of Lampung, Bandar Lampung, Indonesia 35145

Abstract

In the current work, some well-known inference procedures, including testing and estimation, are adjusted to accommodate noisy data that lead to nonidentically distributed samples. The two main cases addressed are the Poisson and the normal distributions. Both one- and two-sample cases are treated. Other cases, including the exponential and the Pareto distributions, are briefly mentioned. In the Poisson case, the situation when the sample size is random is also considered.

Keywords: Noisy samples, Hypothesis testing and estimation, Poisson, Normal, Exponential, Pareto.

1. Introduction

Statistical inference, with its estimation and hypothesis testing routes, is usually based on the concept of "random samples". This means that the data framework is assumed to be "independent identically distributed" (iid) random variables. Although this setting is becoming less and less practical, it is still the way inference is presented to statistics students via textbooks even at the graduate level, see Rohatgi and Saleh (2001) and Casella and Berger (2002). When the iid structure is violated, there are no guarantees that standard accepted procedures will continue to hold, and often modifications result in only partial and/or weaker solutions. This need not be the case in some noisy sampling schemes that are now in use. In the current work, it is shown that for some non-iid sampling schemes in the Poisson and normal cases, standard procedures can be modified keeping the main characteristics of the estimates or test statistics intact. The sampling we discuss here includes the now popular rate sampling, cf. Thode (1997) and Ng and Tang (2005) for the Poisson distribution and Moser et al. (1989) and Sprott and Farewell (1993) for the normal distribution, among others.

Rate sampling discussed here means that we have either a Poisson or a Gaussian process and that we observe the sum of readings during adjacent time slots of lengths $C_1, \ldots, C_n$, where $C_i > 0$, $i = 1, \ldots, n$. Thus in the Poisson case the observations are non-iid random variables $X_i$ such that $E(X_i) = C_i\lambda$, $i = 1, \ldots, n$, while in the Gaussian case we have $E(X_i) = C_i\mu$ and $V(X_i) = C_i^2\sigma^2$, or simply $\sigma^2$ in the homogeneous-variance case.

Variations of rate sampling in the Poisson case have been researched in the literature, including work by Detre and White (1970), Sichel (1973), Thode (1997), Krishnamoorthy and Thomson (2004), and Ng and Tang (2005). Rate sampling in the normal case with homogeneous variance is known as the regression-through-the-origin case. The celebrated Behrens-Fisher problem is a two-sample variation, and solutions for several different forms of it were discussed by Bernard (1982), Moser et al. (1989), Sprott and Farewell (1993), and Moser and Stevens (1992).

2. Poisson Inference

Let $X_1, \ldots, X_n$ denote a random sample from Poisson distributions such that $X_i \sim P(\lambda_i)$ with $\lambda_i = C_i\lambda$, $i = 1, \ldots, n$, where $C_i > 0$ is a known constant. Our task is to do inference about $\lambda$. The usual iid case falls at $C_1 = \cdots = C_n = 1$.

The likelihood function is:
$$L(X, C, \lambda) = \prod_{i=1}^{n} \frac{(C_i\lambda)^{X_i}}{X_i!}\, e^{-C_i\lambda}. \qquad (1)$$
Clearly one obtains the maximum likelihood estimate (MLE) of $\lambda$ to be
$$\hat\lambda = \hat\lambda_{MLE} = \hat\lambda_1 = \frac{\sum_i X_i}{\sum_i C_i}.$$
This estimate is complete and sufficient for $\lambda$ as well as unbiased, so it is the uniformly minimum variance unbiased estimate (UMVUE) of $\lambda$. This estimate suggests the following class of unbiased estimates for $\lambda$:
$$\hat\lambda_r = \frac{\sum_i C_i^{r-1} X_i}{\sum_i C_i^{r}}, \quad r \ge 1 \text{ an integer}.$$
Now clearly,
$$V(\hat\lambda_r) = \frac{\lambda \sum_i C_i^{2r-1}}{\left(\sum_i C_i^{r}\right)^2}. \qquad (2)$$
But since $\hat\lambda = \hat\lambda_1$ is the UMVUE, we have
$$V(\hat\lambda_1) = \frac{\lambda}{\sum_i C_i} \le V(\hat\lambda_r) = \frac{\lambda \sum_i C_i^{2r-1}}{\left(\sum_i C_i^{r}\right)^2}. \qquad (3)$$
Hence as a by-product we have a simple proof of the inequality:
$$\left(\sum_i C_i^{r}\right)^2 \le \left(\sum_i C_i\right)\left(\sum_i C_i^{2r-1}\right), \qquad (4)$$
where $C_i > 0$, $i = 1, \ldots, n$.

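The variance formula (2) and the ordering in (3)-(4) are easy to check numerically. Below is a minimal Monte Carlo sketch in Python (NumPy); the rate constants $C_i$, the value of $\lambda$, and the replication count are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0
C = np.array([0.5, 1.0, 1.5, 2.0, 3.0])     # assumed known rate constants
reps = 200_000

X = rng.poisson(C * lam, size=(reps, C.size))   # X_i ~ P(C_i * lambda)

for r in (1, 2, 3):
    est = (C ** (r - 1) * X).sum(axis=1) / (C ** r).sum()        # lambda_hat_r
    theo = lam * (C ** (2 * r - 1)).sum() / (C ** r).sum() ** 2  # formula (2)
    print(f"r={r}: mean={est.mean():.4f}  var={est.var():.5f}  (2) gives {theo:.5f}")
# r = 1 (the MLE) attains the smallest variance, consistent with (3)-(4).
```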
Note also that in the Poisson case $E(X_i) = V(X_i)$, $i = 1, \ldots, n$. Thus we might try to find an unbiased estimate of $\lambda$ based on the sample variance. Precisely we have:

THEOREM (1): The following estimate is an unbiased estimate of $\lambda$:
$$\hat{\hat\lambda} = \frac{\sum_i C_i}{\left(\sum_i C_i\right)^2 - \sum_i C_i^2}\, \sum_i \left(X_i - C_i\hat\lambda\right)^2. \qquad (5)$$

PROOF: Note that
$$E\sum_i \left(X_i - C_i\hat\lambda\right)^2 = \sum_i E(X_i^2) - 2\sum_i C_i E(X_i\hat\lambda) + \sum_i C_i^2\, E(\hat\lambda^2). \qquad (6)$$
But
$$E(X_i^2) = C_i\lambda + C_i^2\lambda^2, \qquad E(X_i\hat\lambda) = \frac{C_i\lambda}{\sum_j C_j} + C_i\lambda^2, \quad \text{and}$$
$$E(\hat\lambda^2) = V(\hat\lambda) + \left[E(\hat\lambda)\right]^2 = \frac{\lambda}{\sum_i C_i} + \lambda^2.$$
Summing and substituting into (6) yield that
$$E\sum_i \left(X_i - C_i\hat\lambda\right)^2 = \frac{\lambda\left[\left(\sum_i C_i\right)^2 - \sum_i C_i^2\right]}{\sum_i C_i}.$$
The result now follows.

Let us briefly discuss the relation between $\hat\lambda$ and $\hat{\hat\lambda}$. Since $\hat\lambda$ is the UMVUE of $\lambda$, if $E(\hat{\hat\lambda}\,|\,\hat\lambda) = \varphi(\hat\lambda)$, then $\varphi$ must be the identity function, i.e. $E(\hat{\hat\lambda}\,|\,\hat\lambda) = \hat\lambda$. Hence
$$\mathrm{Cov}(\hat\lambda, \hat{\hat\lambda}) = E(\hat\lambda\,\hat{\hat\lambda}) - \lambda^2 = E\left[\hat\lambda\, E(\hat{\hat\lambda}\,|\,\hat\lambda)\right] - \lambda^2 = E(\hat\lambda^2) - \lambda^2 = V(\hat\lambda). \qquad (7)$$
Hence $\hat\lambda$ and $\hat{\hat\lambda}$ are positively correlated with correlation coefficient $\rho(\hat\lambda, \hat{\hat\lambda}) = \{V(\hat\lambda)/V(\hat{\hat\lambda})\}^{1/2}$. In the special case $C_1 = \cdots = C_n = 1$, $V(\hat\lambda) = \lambda/n$ and $V(\hat{\hat\lambda}) = \lambda/n + 2\lambda^2/(n-1)$, thus
$$\rho(\hat\lambda, \hat{\hat\lambda}) = \left\{\frac{1}{1 + 2\lambda n/(n-1)}\right\}^{1/2} \to \left\{\frac{1}{1 + 2\lambda}\right\}^{1/2} \text{ as } n \to \infty. \qquad (8)$$

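The unbiasedness of (5) and the covariance identity (7) can likewise be verified by simulation; a small sketch under assumed $C_i$:

```python
import numpy as np

rng = np.random.default_rng(1)
lam, n, reps = 2.0, 50, 100_000
C = rng.uniform(0.5, 2.0, n)                 # assumed known rate constants
X = rng.poisson(C * lam, size=(reps, n))     # rows are replicated samples

lam_hat = X.sum(axis=1) / C.sum()                            # UMVUE of lambda
lam_hh = C.sum() / (C.sum() ** 2 - (C ** 2).sum()) \
         * ((X - C * lam_hat[:, None]) ** 2).sum(axis=1)     # estimate (5)

print("E lambda_hat     ~", lam_hat.mean())   # both ~ 2.0
print("E lambda_hat_hat ~", lam_hh.mean())
# identity (7): Cov(lambda_hat, lambda_hat_hat) = V(lambda_hat)
print("Cov ~", np.cov(lam_hat, lam_hh)[0, 1], " V(lambda_hat) ~", lam_hat.var())
```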
We may combine $\hat\lambda$ and $\hat{\hat\lambda}$ to obtain yet another unbiased estimate of $\lambda$. Note that $E(\hat{\hat\lambda}/\hat\lambda) = E\left[E(\hat{\hat\lambda}\,|\,\hat\lambda)\big/\hat\lambda\right] = 1$. Thus
$$E\left[\frac{\sum_i C_i\, \sum_i (X_i - C_i\hat\lambda)^2}{\left[\left(\sum_i C_i\right)^2 - \sum_i C_i^2\right]\hat\lambda}\right] = 1,$$
and this simplifies to
$$E\left[\frac{\left(\sum_i C_i\right)^2 \sum_i (X_i - C_i\hat\lambda)^2}{\sum_i X_i}\right] = \left(\sum_i C_i\right)^2 - \sum_i C_i^2.$$


Hence
$$\lambda^* = \frac{\left(\sum_i C_i\right)^2 \sum_i (X_i - C_i\hat\lambda)^2}{\left[\left(\sum_i C_i\right)^2 - \sum_i C_i^2\right] \sum_i X_i}\;\hat\lambda \qquad (9)$$
is an unbiased estimate of $\lambda$.

Sometimes in sampling from a Poisson distribution we do inverse sampling, where the sample size is random. Let $N$ be an integer-valued random variable, not necessarily independent of the $X_i$'s. We then estimate $\lambda$ by
$$\hat\lambda_{r,N} = \frac{\sum_{i=1}^{N} C_i^{r-1} X_i}{\sum_{i=1}^{N} C_i^{r}}, \quad r \ge 1, \qquad (10)$$

and
$$\hat{\hat\lambda}_N = \frac{\sum_{i=1}^{N} C_i}{\left(\sum_{i=1}^{N} C_i\right)^2 - \sum_{i=1}^{N} C_i^2}\, \sum_{i=1}^{N} \left(X_i - C_i\hat\lambda_{1,N}\right)^2. \qquad (11)$$

Clearly, $E(\hat\lambda_{r,N}) = E\left[E(\hat\lambda_{r,N}\,|\,N)\right] = \lambda$, and
$$V(\hat\lambda_{r,N}) = E\,V(\hat\lambda_{r,N}\,|\,N) + V\,E(\hat\lambda_{r,N}\,|\,N) = \lambda\, E\left[\frac{\sum_{i=1}^{N} C_i^{2r-1}}{\left(\sum_{i=1}^{N} C_i^{r}\right)^2}\right]. \qquad (12)$$
In the usual case when $C_1 = \cdots = C_N = 1$, we get that
$$V(\hat\lambda_{r,N}) = \lambda\, E\{1/N\}.$$

On the other hand,
$$V(\hat{\hat\lambda}_N) = E\,V(\hat{\hat\lambda}_N\,|\,N) + V\,E(\hat{\hat\lambda}_N\,|\,N) = \lambda\, E\!\left(\frac{1}{N}\right) + 2\lambda^2\, E\!\left(\frac{1}{N-1}\right). \qquad (13)$$
Thus the correlation coefficient between $\hat\lambda_N$ and $\hat{\hat\lambda}_N$ is
$$\rho\!\left(\hat\lambda_N, \hat{\hat\lambda}_N\right) = \left\{\frac{1}{1 + 2\lambda K}\right\}^{1/2},$$
provided that $E\!\left(\frac{1}{N-1}\right) \Big/ E\!\left(\frac{1}{N}\right) = K < \infty$. Finally, we can easily deduce the following unbiased estimate:
$$\lambda_N^* = \frac{1}{N-1}\sum_{i=1}^{N}\left(X_i - \bar X_N\right)^2, \qquad \bar X_N = \frac{1}{N}\sum_{i=1}^{N} X_i. \qquad (14)$$

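A sketch of the inverse-sampling estimate (10) in the $C_i = 1$ case, with $N$ drawn independently of the counts (an assumption made here, together with the law of $N$, so that $V(\hat\lambda_{r,N}) = \lambda E(1/N)$ is easy to display):

```python
import numpy as np

rng = np.random.default_rng(2)
lam, reps = 2.0, 50_000
Ns = rng.integers(10, 30, reps)                  # random sample sizes N (assumed law)
est = np.array([rng.poisson(lam, N).sum() / N for N in Ns])  # lambda_hat_{1,N}, C_i = 1

print("E lambda_hat_N ~", est.mean())            # ~ lambda, since E E(.|N) = lambda
print("V ~", est.var(), " vs lambda*E(1/N) =", lam * (1 / Ns).mean())
```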
To test $H_0: \lambda = \lambda_0$ vs. $H_1: \lambda \ne \lambda_0$ or $\lambda > \lambda_0$ or $\lambda < \lambda_0$, set any of the following test statistics, which are all asymptotically normal:
$$Z_1 = \sqrt{\sum_i C_i}\;\frac{\hat\lambda - \lambda_0}{\sqrt{\hat\lambda}} \quad \text{or} \quad Z_2 = \sqrt{\sum_i C_i}\;\frac{\hat{\hat\lambda} - \lambda_0}{\sqrt{\hat{\hat\lambda}}}. \qquad (15)$$

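In code, the one-sample test based on $Z_1$ of (15) might look as follows (a sketch; the helper name and the example data are our own, and SciPy is used only for normal tail probabilities):

```python
import numpy as np
from scipy.stats import norm

def poisson_rate_test(X, C, lam0, alternative="two-sided"):
    """Asymptotic Z-test of H0: lambda = lam0, using Z1 of (15)."""
    X, C = np.asarray(X, float), np.asarray(C, float)
    lam_hat = X.sum() / C.sum()
    Z1 = np.sqrt(C.sum()) * (lam_hat - lam0) / np.sqrt(lam_hat)
    if alternative == "two-sided":
        p = 2 * norm.sf(abs(Z1))
    elif alternative == "greater":
        p = norm.sf(Z1)
    else:
        p = norm.cdf(Z1)
    return lam_hat, Z1, p

# Example: counts over exposure windows C_i, testing lambda0 = 1
print(poisson_rate_test([3, 5, 2, 8], C=[2.0, 3.0, 1.5, 4.0], lam0=1.0))
```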
Next, let us consider the two-sample case: let $X_i \sim P(C_i\lambda_1)$, $i = 1, \ldots, m$, and $Y_j \sim P(D_j\lambda_2)$, $j = 1, \ldots, n$. Assume the $X$'s and the $Y$'s are independent. We want to test $H_0: \lambda_1 = R\lambda_2$, where $R$ is a known constant. Again, $\hat\lambda_1 = \sum_i X_i\big/\sum_i C_i$ and $\hat\lambda_2 = \sum_j Y_j\big/\sum_j D_j$. Further, under $H_0$ we have the likelihood function (with $\lambda = \lambda_2$) given by:
$$L(X, Y, C, D, R, \lambda) = \frac{\prod_i C_i^{X_i} \prod_j D_j^{Y_j}}{\prod_i X_i!\, \prod_j Y_j!}\; e^{-\lambda\left(R\sum_i C_i + \sum_j D_j\right)}\; \lambda^{\sum_i X_i + \sum_j Y_j}\, R^{\sum_i X_i}. \qquad (16)$$

Hence (16) leads to the null estimate of $\lambda$:
$$\hat\lambda = \frac{\sum_i X_i + \sum_j Y_j}{R\sum_i C_i + \sum_j D_j}. \qquad (17)$$
Set the test statistic:

$$Z_3 = \frac{\hat\lambda_1 - R\hat\lambda_2}{\sqrt{R\hat\lambda\left(\dfrac{1}{\sum_i C_i} + \dfrac{R}{\sum_j D_j}\right)}}. \qquad (18)$$
Reject $H_0$ when $|Z_3| > Z_{\alpha/2}$ or $Z_3 > Z_\alpha$ according to the alternative. If we would like an unbiased null variance estimate for $\lambda$, we use the following:

THEOREM (2): Under $H_0$, the following is an unbiased estimate for the common parameter $\lambda$:
$$\hat{\hat\lambda} = \frac{R\sum_i C_i + \sum_j D_j}{\left(R\sum_i C_i + \sum_j D_j\right)^2 - \left(R^2\sum_i C_i^2 + \sum_j D_j^2\right)}\left[\sum_i \left(X_i - RC_i\hat\lambda\right)^2 + \sum_j \left(Y_j - D_j\hat\lambda\right)^2\right], \qquad (19)$$
where $\hat\lambda$ is as given in (17).

PROOF: Let
$$\sum_i E\left(X_i - RC_i\hat\lambda\right)^2 + \sum_j E\left(Y_j - D_j\hat\lambda\right)^2 = \sum_i E(X_i^2) + \sum_j E(Y_j^2) + \left(R^2\sum_i C_i^2 + \sum_j D_j^2\right)E(\hat\lambda^2) - 2R\sum_i C_i E(X_i\hat\lambda) - 2\sum_j D_j E(Y_j\hat\lambda). \qquad (20)$$

The result follows by noticing that:
$$E(X_i^2) = RC_i\lambda + R^2C_i^2\lambda^2, \qquad E(Y_j^2) = D_j\lambda + D_j^2\lambda^2, \qquad (21)$$
$$E(X_i\hat\lambda) = \frac{RC_i\lambda}{R\sum_i C_i + \sum_j D_j} + RC_i\lambda^2, \qquad (22)$$
$$E(Y_j\hat\lambda) = \frac{D_j\lambda}{R\sum_i C_i + \sum_j D_j} + D_j\lambda^2, \qquad (23)$$
and
$$E(\hat\lambda^2) = V(\hat\lambda) + \left(E(\hat\lambda)\right)^2 = \frac{\lambda}{R\sum_i C_i + \sum_j D_j} + \lambda^2. \qquad (24)$$
The result follows from (21) to (24).

A test statistic based on $\hat{\hat\lambda}$ is:
$$Z_4 = \frac{\hat\lambda_1 - R\hat\lambda_2}{\sqrt{R\hat{\hat\lambda}\left(\dfrac{1}{\sum_i C_i} + \dfrac{R}{\sum_j D_j}\right)}}. \qquad (25)$$
Reject $H_0$ if $|Z_4| > Z_{\alpha/2}$ or $Z_4 > Z_\alpha$ according to the alternative.

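A corresponding sketch of the two-sample procedure, with the null estimate (17) plugged into $Z_3$ of (18); $Z_4$ of (25) would only swap in the variance-based estimate (19). Function name and example data are illustrative:

```python
import numpy as np
from scipy.stats import norm

def two_sample_rate_test(X, C, Y, D, R=1.0):
    """Asymptotic test of H0: lambda1 = R*lambda2 via Z3 of (18)."""
    X, C, Y, D = map(lambda a: np.asarray(a, float), (X, C, Y, D))
    lam1, lam2 = X.sum() / C.sum(), Y.sum() / D.sum()
    lam0 = (X.sum() + Y.sum()) / (R * C.sum() + D.sum())   # null estimate (17)
    se = np.sqrt(R * lam0 * (1 / C.sum() + R / D.sum()))
    Z3 = (lam1 - R * lam2) / se
    return Z3, 2 * norm.sf(abs(Z3))                        # two-sided p-value

print(two_sample_rate_test([4, 7, 3], [2., 3., 1.],
                           [10, 2, 5, 6], [3., 1., 2., 2.], R=1.0))
```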
  3. Normal Inference

Let $X_1, X_2, \ldots, X_n$ denote a random sample from normal distributions such that $X_i \sim N(C_i\mu, \sigma^2)$, $i = 1, \ldots, n$. Then the likelihood function is given by
$$L(X, C, \mu, \sigma^2) = (2\pi)^{-n/2} (\sigma^2)^{-n/2} \exp\left[-\frac{1}{2\sigma^2}\sum_i (X_i - C_i\mu)^2\right]. \qquad (26)$$
Hence the MLEs of $\mu$ and $\sigma^2$ are, respectively:
$$\hat\mu = \frac{\sum_i C_i X_i}{\sum_i C_i^2} \quad \text{and} \quad \hat\sigma^2 = \frac{1}{n}\sum_i (X_i - C_i\hat\mu)^2. \qquad (27)$$
Clearly, $\hat\mu$ is sufficient, complete and unbiased for $\mu$, so it is the UMVUE, while $E(\hat\sigma^2) = (n-1)\sigma^2/n$; thus the UMVUE for $\sigma^2$ is $S_C^2 = \frac{1}{n-1}\sum_i (X_i - C_i\hat\mu)^2$.
Furthermore, we can verify that $\hat\mu$ and $S_C^2$ are independent, $\hat\mu$ is normal with mean $\mu$ and variance $\sigma^2\big/\sum_i C_i^2$, and $(n-1)S_C^2/\sigma^2$ is $\chi^2(n-1)$, as expected, since
$$\sum_i (X_i - C_i\mu)^2 = \sum_i (X_i - C_i\hat\mu)^2 + \left(\sum_i C_i^2\right)(\hat\mu - \mu)^2.$$
To test $H_0: \mu = \mu_0$, we use the modified $t(n-1)$ statistic $T = \sqrt{\sum_i C_i^2}\;(\hat\mu - \mu_0)/S_C$.

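The one-sample normal inference of (26)-(27) and the modified $t$ statistic can be sketched as follows (again, function name and data are illustrative; SciPy supplies the $t$ distribution):

```python
import numpy as np
from scipy.stats import t

def normal_rate_t_test(X, C, mu0=0.0):
    """Exact t(n-1) test of H0: mu = mu0 for X_i ~ N(C_i*mu, sigma^2)."""
    X, C = np.asarray(X, float), np.asarray(C, float)
    n = X.size
    mu_hat = (C * X).sum() / (C ** 2).sum()          # UMVUE of mu, (27)
    S2 = ((X - C * mu_hat) ** 2).sum() / (n - 1)     # UMVUE of sigma^2
    T = np.sqrt((C ** 2).sum()) * (mu_hat - mu0) / np.sqrt(S2)
    return mu_hat, T, 2 * t.sf(abs(T), df=n - 1)

rng = np.random.default_rng(3)
C = np.array([1.0, 2.0, 0.5, 1.5, 2.5])              # assumed constants
print(normal_rate_t_test(C * 1.0 + rng.normal(0, 0.3, 5), C, mu0=1.0))
```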
Let us turn our attention to a general setting of the two-sample case: let $X_i \sim N(C_i\mu_1, \sigma_1^2)$, $i = 1, \ldots, m$, and $Y_j \sim N(D_j\mu_2, \sigma_2^2)$, $j = 1, \ldots, n$, where $C_1, \ldots, C_m$ and $D_1, \ldots, D_n$ are known constants. We want to test $H_0: \delta\mu_1 = \gamma\mu_2$ for given values $\delta$ and $\gamma$. Note that the above setting includes most known cases: (i) $\delta = \gamma = 1$ is testing $H_0: \mu_1 = \mu_2$; (ii) $\gamma$ given and $\delta = 1$ is testing $H_0: \mu_1 = \gamma\mu_2$; and (iii) $\delta$ and $\gamma$ given is testing $H_0: \mu_1/\mu_2 = \gamma/\delta$. As shown above, the UMVUEs for $\mu_1$ and $\mu_2$ are, respectively, $\hat\mu_1 = \sum_i C_i X_i\big/\sum_i C_i^2$ and $\hat\mu_2 = \sum_j D_j Y_j\big/\sum_j D_j^2$. Now we need to set an unbiased null variance estimate which is independent of $\hat\mu_1$ and $\hat\mu_2$. This is done as follows:

THEOREM (3): Under $H_0: \delta\mu_1 = \gamma\mu_2 = \theta$ and with $\sigma_1^2 = \sigma_2^2 = \sigma^2$, the MLE of $\theta$ and an unbiased estimate of $\sigma^2$ that is independent of $\hat\mu_1$ and $\hat\mu_2$ are, respectively:
$$\hat\theta = \frac{\delta^{-1}\sum_i C_i X_i + \gamma^{-1}\sum_j D_j Y_j}{\delta^{-2}\sum_i C_i^2 + \gamma^{-2}\sum_j D_j^2}, \qquad (28)$$
and
$$\hat\sigma^2 = \frac{1}{m+n-2}\left[\sum_i \left(X_i - C_i\hat\mu_1\right)^2 + \sum_j \left(Y_j - D_j\hat\mu_2\right)^2\right]. \qquad (29)$$

PROOF: Under $H_0$, the likelihood function of the two samples is given by:
$$L(X, Y, C, D, \delta, \gamma; \theta, \sigma^2) = (2\pi)^{-\frac{m+n}{2}} (\sigma^2)^{-\frac{m+n}{2}} \exp\left\{-\frac{1}{2\sigma^2}\left[\sum_i \left(X_i - \frac{C_i\theta}{\delta}\right)^2 + \sum_j \left(Y_j - \frac{D_j\theta}{\gamma}\right)^2\right]\right\}. \qquad (30)$$
Taking the log of $L$ and differentiating with respect to $\theta$ results in (28). For the variance estimate, the unbiasedness follows by evaluating:
$$E\left[\sum_i \left(X_i - C_i\hat\mu_1\right)^2 + \sum_j \left(Y_j - D_j\hat\mu_2\right)^2\right] = \sum_i V(X_i) + \sum_j V(Y_j) - V(\hat\mu_1)\sum_i C_i^2 - V(\hat\mu_2)\sum_j D_j^2 = (m+n-2)\,\sigma^2. \qquad (31)$$

But $V(\hat\mu_1) = \sigma^2\big/\sum_i C_i^2$ and $V(\hat\mu_2) = \sigma^2\big/\sum_j D_j^2$. The result now follows.

Hence, set the test statistic
$$T = \frac{\delta\hat\mu_1 - \gamma\hat\mu_2}{\hat\sigma\sqrt{\dfrac{\delta^2}{\sum_i C_i^2} + \dfrac{\gamma^2}{\sum_j D_j^2}}} \sim t(m+n-2). \qquad (32)$$

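A sketch of the test (32); here $\delta$ and $\gamma$ are the known constants of $H_0$, and the data are simulated under assumed $C_i$ and $D_j$:

```python
import numpy as np
from scipy.stats import t

def two_sample_ratio_t_test(X, C, Y, D, delta=1.0, gamma=1.0):
    """t(m+n-2) test of H0: delta*mu1 = gamma*mu2, assuming sigma1 = sigma2."""
    X, C, Y, D = map(lambda a: np.asarray(a, float), (X, C, Y, D))
    m, n = X.size, Y.size
    mu1 = (C * X).sum() / (C ** 2).sum()
    mu2 = (D * Y).sum() / (D ** 2).sum()
    s2 = (((X - C * mu1) ** 2).sum()
          + ((Y - D * mu2) ** 2).sum()) / (m + n - 2)        # estimate (29)
    T = (delta * mu1 - gamma * mu2) / np.sqrt(
        s2 * (delta ** 2 / (C ** 2).sum() + gamma ** 2 / (D ** 2).sum()))  # (32)
    return T, 2 * t.sf(abs(T), df=m + n - 2)

rng = np.random.default_rng(4)
C, D = rng.uniform(0.5, 2, 8), rng.uniform(0.5, 2, 10)
print(two_sample_ratio_t_test(C * 2 + rng.normal(0, .4, 8), C,
                              D * 2 + rng.normal(0, .4, 10), D))
```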
4. Other Examples and Final Remarks

(A) Other examples:

The MLEs and the hypothesis tests presented earlier for non-iid samples from the Poisson and the normal distributions can be extended to various other cases. We illustrate this by two further examples.

  xC 1 i

  Example (1): Let

  X ~ exp( C , ) with p.d.f. f ( x ) exp[ ( )] , where i i i   x  { max C } and  , i 1, 2,..., . The likelihood function is: i

  1  in n 1 Xi

  L ( X , C ; , ) exp[ (

  1 )(

  X C )] , min ( )  . (33)    i 1 i i i

  C i ˆ  min

  X C   ˆ ˆ

  Hence the MLE of is and of is  (

  1 / n )  XC .But clearly,    i ii i

    i 1 i n  n

  1 ˆ E   ˆ   and E  .

    C n i

    

This leads to the UMVUEs of $\mu$ and $\sigma$ given by:
$$\hat\mu_{UB} = \min_{1\le i\le n}\left(\frac{X_i}{C_i}\right) - \frac{\hat\sigma_{UB}}{\sum_i C_i} \qquad (34)$$
and
$$\hat\sigma_{UB} = \frac{1}{n-1}\sum_i (X_i - C_i\hat\mu). \qquad (35)$$
When $C_1 = C_2 = \cdots = C_n = 1$, we get the known unbiased estimate of $\mu$,
$$\hat\mu_{UB} = X_{(1)} - \frac{\bar X - X_{(1)}}{n-1}.$$

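A quick Monte Carlo check of the unbiasedness claimed in (34)-(35), under assumed parameter values and $C_i$:

```python
import numpy as np

rng = np.random.default_rng(5)
mu, sigma, n, reps = 1.5, 2.0, 20, 100_000
C = rng.uniform(0.5, 2.0, n)                         # assumed constants
X = C * mu + rng.exponential(sigma, size=(reps, n))  # X_i ~ exp(C_i*mu, sigma)

mu_mle = (X / C).min(axis=1)                              # min_i X_i / C_i
sig_ub = (X - C * mu_mle[:, None]).sum(axis=1) / (n - 1)  # (35)
mu_ub = mu_mle - sig_ub / C.sum()                         # (34)
print("E mu_UB ~", mu_ub.mean(), "  E sigma_UB ~", sig_ub.mean())  # ~ (1.5, 2.0)
```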
Example (2): Let $X_i \sim f(x_i) = \frac{C_i}{\sigma^{C_i}}\, x_i^{C_i - 1}$, $0 \le x_i \le \sigma$, $i = 1, 2, \ldots, n$. Thus the likelihood function is
$$L(X, C; \sigma) = \left(\prod_i C_i\, X_i^{C_i - 1}\right) \sigma^{-\sum_i C_i}, \quad \sigma \ge \max_{1\le i\le n}(X_i). \qquad (36)$$
Hence the MLE of $\sigma$ is
$$\hat\sigma = \max_{1\le i\le n}(X_i). \qquad (37)$$
To find the UMVUE of $\sigma$ we need to adjust for bias. Note that
$$E(\hat\sigma) = \int_0^\infty P(\hat\sigma > u)\, du = \sigma - \int_0^\sigma \prod_i F_i(x)\, dx = \frac{\sum_i C_i}{1 + \sum_i C_i}\,\sigma. \qquad (38)$$
Hence $\hat\sigma_{UB} = \dfrac{1 + \sum_i C_i}{\sum_i C_i}\, \max_{1\le i\le n}(X_i)$ is the UMVUE of $\sigma$.

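And an analogous check of the bias adjustment (38), simulating from $F_i(x) = (x/\sigma)^{C_i}$ by inversion (parameter values assumed):

```python
import numpy as np

rng = np.random.default_rng(6)
sigma, n, reps = 3.0, 15, 100_000
C = rng.uniform(0.5, 2.0, n)
X = sigma * rng.random((reps, n)) ** (1 / C)   # F_i(x) = (x/sigma)^{C_i} inverted

s_mle = X.max(axis=1)                          # MLE (37)
s_ub = (1 + C.sum()) / C.sum() * s_mle         # bias-adjusted, per (38)
print("E sigma_hat ~", s_mle.mean(), " theory:", sigma * C.sum() / (1 + C.sum()))
print("E sigma_UB  ~", s_ub.mean())            # ~ sigma
```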
(B) Some Final Remarks:

(i) In the normal two-sample case, there is a major outstanding problem known as the "Behrens-Fisher" problem: when $\sigma_1^2 \ne \sigma_2^2$ there is no estimate of $V = \sigma_1^2/m + \sigma_2^2/n$ that can lead to a t-statistic, cf. Rohatgi and Saleh (2001). There are many suggestions of partial solutions. In this context, we offered one when we can assume $\sigma_1^2 = \sigma_2^2$, see Section 3. Intermediate, approximate or partial solutions have been offered in the literature. For example, when one variance is known, say $\sigma_1^2$, or in the general case of $\sigma_1^2, \sigma_2^2$ both unknown, Satterthwaite (1941, 1946) proposed a method that provides an approximate $t(\nu)$ with
$$\nu = \left(\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}\right)^2 \Big/ \left[\frac{(\sigma_1^2/m)^2}{m-1} + \frac{(\sigma_2^2/n)^2}{n-1}\right].$$
Hence $\nu$ could be estimated by replacing $\sigma_1^2$ and $\sigma_2^2$ by $S_1^2$ and $S_2^2$, respectively; see also Ames and Webster (1991) and Moser and Stevens (1992). The case when $\sigma_1^2$ is given was discussed by Maity and Sherman (2006). There the value of $\nu^*$ is obtained via matching $V(\sigma_1^2/m + S_2^2/n) = V(S_2^2/n) = 2\sigma_2^4/[n^2(n-1)]$ to $2(\sigma_1^2/m + \sigma_2^2/n)^2/\nu^*$. This gives
$$\nu^* = \left(\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}\right)^2 \Big/ \left\{\frac{\sigma_2^4}{n^2(n-1)}\right\}.$$
Then one replaces $\sigma_2^2$ with its estimate $S_2^2$. The above methods are based on matching the variance of the proposed (standardized) estimate of $\sigma_1^2/m + \sigma_2^2/n$ to the variance of a Chi-square $\chi^2(\nu)$, solving for the induced parameter $\nu$, and then estimating it. Needless to say, matching variances is not enough to guarantee matching of distributions. In subsequent work we will provide a new method based on matching the distributions.

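For reference, the estimated Satterthwaite degrees of freedom, with $S_1^2$, $S_2^2$ replacing $\sigma_1^2$, $\sigma_2^2$, is a one-liner (sketch; the example inputs are arbitrary):

```python
def welch_df(s1_sq, m, s2_sq, n):
    """Satterthwaite/Welch approximate degrees of freedom."""
    v1, v2 = s1_sq / m, s2_sq / n
    return (v1 + v2) ** 2 / (v1 ** 2 / (m - 1) + v2 ** 2 / (n - 1))

print(welch_df(4.0, 12, 9.0, 8))   # unequal variances, unequal sample sizes
```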
(ii) Note that the essence of the Behrens-Fisher problem for testing $H_0: \delta\mu_1 = \gamma\mu_2$, with $\delta, \gamma$ known constants, is that the natural estimate of
$$V(\delta\bar X - \gamma\bar Y) = \frac{\delta^2\sigma_1^2}{m} + \frac{\gamma^2\sigma_2^2}{n},$$
once standardized, is a weighted sum of the $\chi^2(m-1)$ and $\chi^2(n-1)$ Chi-squares with unknown weights
$$K_1(m, n) = \frac{\delta^2\sigma_1^2}{m(m-1)\left[\dfrac{\delta^2\sigma_1^2}{m} + \dfrac{\gamma^2\sigma_2^2}{n}\right]} \quad \text{and} \quad K_2(m, n) = \frac{\gamma^2\sigma_2^2}{n(n-1)\left[\dfrac{\delta^2\sigma_1^2}{m} + \dfrac{\gamma^2\sigma_2^2}{n}\right]}.$$
When we replace $K_i(m, n)$, $i = 1, 2$, by their estimates based on $S_1^2$ and $S_2^2$, we lose the independence and the problem becomes intractable. We shall provide in the near future some plausible solutions based on what the weighted sum of Chi-squares looks like.

(iii) There are situations where, even for the model we discussed above, namely with parameters known to be proportionate, there may not be an explicit solution. However, in some of these cases approximate solutions may be possible. Let us illustrate this via the example of nonidentical Bernoulli trials: let $X_1, \ldots, X_n$ be $n$ independent Bernoulli rv's such that $P(X_i = 1) = C_i p$, $i = 1, \ldots, n$. As before, assume $C_i > 0$, $i = 1, \ldots, n$, to be known (with $C_i p \le 1$). Thus the likelihood function is:
$$L(X, C, p) = \prod_i (C_i p)^{X_i} \left(1 - C_i p\right)^{1 - X_i}. \qquad (39)$$

Taking the log of $L(X, C, p)$ and differentiating with respect to $p$, we get
$$\frac{\sum_i X_i}{p} - \sum_i \frac{C_i(1 - X_i)}{1 - C_i p} = 0. \qquad (40)$$
It is not difficult to show that the MLE of $p$ exists and is unique, but it is not explicit. If we need an explicit solution, we can use the approximation $\frac{C_i}{1 - C_i p} \approx C_i$, $i = 1, \ldots, n$, and thus solve
$$\frac{\sum_i X_i}{p} - \sum_i C_i(1 - X_i) = 0,$$
giving the approximate estimate $\hat p = \dfrac{\sum_i X_i}{\sum_i C_i(1 - X_i)}$. Using the well-known formula $E\left(\frac{X}{Y}\right) \approx \frac{E(X)}{E(Y)}\left\{1 - \frac{\mathrm{Cov}(X, Y)}{E(X)E(Y)} + \frac{V(Y)}{[E(Y)]^2}\right\}$, we get that
$$E(\hat p) \approx p\left\{1 + \frac{p\sum_i C_i^2}{\sum_i C_i}\right\}. \qquad (41)$$
Thus if $\sum_i C_i^2\big/\sum_i C_i \to K$, then $E(\hat p) \approx p\{1 + pK\}$. Hence $\hat p$ is asymptotically unbiased if $K = 0$, while it is positively biased if $K > 0$, in which case the estimate
$$\hat{\hat p} = \hat p\left\{1 - \frac{\hat p K}{1 + \hat p K}\right\} \qquad (42)$$

is asymptotically unbiased.

Finally, using the well-known formula
$$V\left(\frac{X}{Y}\right) \approx \left[\frac{E(X)}{E(Y)}\right]^2 \left\{\frac{V(X)}{[E(X)]^2} + \frac{V(Y)}{[E(Y)]^2} - \frac{2\,\mathrm{Cov}(X, Y)}{E(X)E(Y)}\right\}, \qquad (43)$$
we get that, with $X = \sum_i X_i$ and $Y = \sum_i C_i(1 - X_i)$,
$$V(\hat p) \approx \left[\frac{p\sum_i C_i}{\sum_i C_i - p\sum_i C_i^2}\right]^2 \left\{\frac{p\sum_i C_i(1 - C_i p)}{\left[p\sum_i C_i\right]^2} + \frac{p\sum_i C_i^3(1 - C_i p)}{\left[\sum_i C_i - p\sum_i C_i^2\right]^2} + \frac{2\sum_i C_i^2(1 - C_i p)}{\sum_i C_i\left[\sum_i C_i - p\sum_i C_i^2\right]}\right\}. \qquad (44)$$

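The likelihood equation (40) is one-dimensional and monotone in $p$, so the exact MLE is easy to obtain numerically; the sketch below compares it with the explicit approximation $\hat p$ and the corrected $\hat{\hat p}$ of (42) (simulated data with assumed $C_i$ and $p$):

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(7)
p, n = 0.15, 500
C = rng.uniform(0.2, 1.0, n)                  # assumed, with C_i * p < 1
X = rng.random(n) < C * p                     # X_i ~ Bernoulli(C_i * p)

def score(q):                                 # left-hand side of (40)
    return X.sum() / q - (C * (1 - X) / (1 - C * q)).sum()

upper = 1 / C[~X].max() - 1e-9                # score -> -inf at this boundary
p_mle = brentq(score, 1e-9, upper)            # exact (numerical) MLE
p_hat = X.sum() / (C * (1 - X)).sum()         # explicit approximation
K = (C ** 2).sum() / C.sum()
p_corr = p_hat * (1 - p_hat * K / (1 + p_hat * K))   # bias-corrected (42)
print(p_mle, p_hat, p_corr)
```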
References

1. Ames, M. H., and Webster, J. T. (1991). "On Estimating Approximate Degrees of Freedom," The American Statistician, 45, 45-50.
2. Bernard, G. A. (1982). "A New Approach to the Behrens-Fisher Problem," Utilitas Mathematica, 21B, 261-277.
3. Casella, G., and Berger, R. L. (2002). Statistical Inference, 2nd Ed., Duxbury, Pacific Grove, CA.
4. Chamberlin, S. R., and Sprott, D. A. (1989). "The Estimation of a Location Parameter when the Scale Parameter is Confined to a Finite Range: The Notion of a Generalized Ancillary Statistic," Biometrika, 76, 609-612.
5. Detre, K., and White, C. (1970). "The Comparison of Two Poisson Distributed Observations," Biometrics, 26, 851-854.
6. Krishnamoorthy, K., and Thomson, J. (2004). "A More Powerful Test for Comparing Two Poisson Means," Journal of Statistical Planning and Inference, 119, 23-35.
7. Maity, A., and Sherman, M. (2006). "The Two-Sample T Test with One Variance Unknown," The American Statistician, 60, 163-166.
8. Moser, B. K., and Stevens, G. R. (1992). "Homogeneity of Variance in the Two-Sample Means Test," The American Statistician, 46, 19-21.
9. Moser, B. K., Stevens, G. R., and Watts, C. L. (1989). "The Two-Sample t Test Versus Satterthwaite's Approximate F Test," Communications in Statistics - Theory and Methods, 18, 3963-3975.
10. Ng, H. K. T., and Tang, M. L. (2005). "Testing the Equality of Two Poisson Means Using the Rate Ratio," Statistics in Medicine, 24, 955-965.
11. Rohatgi, V. K., and Saleh, A. K. Md. E. (2001). An Introduction to Probability and Statistics, 2nd Ed., Wiley, New York.
12. Satterthwaite, F. E. (1941). "Synthesis of Variance," Psychometrika, 6, 309-316.
13. Satterthwaite, F. E. (1946). "An Approximate Distribution of Estimates of Variance Components," Biometrics Bulletin, 2, 110-114.
14. Sichel, H. S. (1973). "On a Significance Test for Two Poisson Variables," Applied Statistics, 22, 50-58.
15. Sprott, D. A., and Farewell, V. T. (1993). "The Difference Between Two Normal Means," The American Statistician, 47, 126-128.
16. Thode, H. C. (1997). "Power and Sample Size Requirements for Tests of Differences Between Two Poisson Rates," The Statistician, 46, 227-230.