Joint Distribution

Example : a box contains 12 balls: 3 red, 4 white, and 5 of a third colour. Randomly draw 3 balls (without replacement). Let

X = no. of red balls drawn, X ∈ {0,1,2,3}
Y = no. of white balls drawn, Y ∈ {0,1,2,3}

Single-variable probabilities:

Pr( X = 2 ) = 3C2 × 9C1 / 12C3 = 27/220 = 0.1227
Pr( X = 3 ) = 3C3 × 9C0 / 12C3 = 1/220 = 0.004545

Joint probabilities:

Pr( X = 0, Y = 0 ) = 5C3 / 12C3 = 10/220 = 0.04545
Pr( X = 1, Y = 2 ) = 3C1 × 4C2 / 12C3 = 18/220 = 0.08182
Pr( X = 2, Y = 0 ) = 3C2 × 5C1 / 12C3 = 15/220 = 0.06818
Marginal distribution of X:

X       0        1        2        3
Prob.   0.3818   0.4909   0.1227   0.004545

Marginal distribution of Y:

Y       0        1        2        3
Prob.   0.2545   0.5091   0.2182   0.01818

Joint Probability Mass Function
p(x, y) = Pr( X = x, Y = y ),  x ∈ {0,1,2,3}, y ∈ {0,1,2,3}

X \ Y   0          1         2         3
0       0.04545    0.1818    0.1364    0.01818
1       0.1364     0.2727    0.08182   0
2       0.06818    0.05455   0         0
3       0.004545   0         0         0
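As a quick check, the joint pmf table can be reproduced in a few lines of Python (a sketch; the helper name `p` is mine):

```python
# Sketch: joint pmf p(x, y) = 3Cx * 4Cy * 5C(3-x-y) / 12C3 for the
# drawing-balls example (3 red, 4 white, 5 other; draw 3 without replacement).
from math import comb

def p(x, y):
    # zero probability when more than 3 balls would be drawn
    if x + y > 3:
        return 0.0
    return comb(3, x) * comb(4, y) * comb(5, 3 - x - y) / comb(12, 3)

joint = {(x, y): p(x, y) for x in range(4) for y in range(4)}
```

Summing the dictionary confirms the probabilities add to 1.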

Marginal Probability Mass Function
From the joint pmf,

pX(x) = Pr( X = x ) = Σ_y p(x, y)
pY(y) = Pr( Y = y ) = Σ_x p(x, y)

Example : for the drawing-balls example,

p(x, y) = 3Cx × 4Cy × 5C(3−x−y) / 12C3 ,  0 ≤ x + y ≤ 3

Joint pmf with marginal totals:

X \ Y   0          1         2         3         Total
0       0.04545    0.1818    0.1364    0.01818   0.3818
1       0.1364     0.2727    0.08182   0         0.4909
2       0.06818    0.05455   0         0         0.1227
3       0.004545   0         0         0         0.004545
Total   0.2545     0.5091    0.2182    0.01818   1

Row totals give Pr( X = x ); column totals give Pr( Y = y ). In closed form,

pX(x) = 3Cx × 9C(3−x) / 12C3 ,  x = 0,1,2,3
pY(y) = 4Cy × 8C(3−y) / 12C3 ,  y = 0,1,2,3
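The closed-form marginals can be checked against the row and column sums of the joint pmf (a sketch; variable names are mine):

```python
# Sketch: closed-form marginals vs. row/column sums of the joint pmf.
from math import comb

def p(x, y):
    if x + y > 3:
        return 0.0
    return comb(3, x) * comb(4, y) * comb(5, 3 - x - y) / comb(12, 3)

pX = [comb(3, x) * comb(9, 3 - x) / comb(12, 3) for x in range(4)]
pY = [comb(4, y) * comb(8, 3 - y) / comb(12, 3) for y in range(4)]
row_sums = [sum(p(x, y) for y in range(4)) for x in range(4)]
col_sums = [sum(p(x, y) for x in range(4)) for y in range(4)]
```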

Conditional pmf
p(y | x) = Pr( Y = y | X = x ) = p(x, y) / pX(x)

Conditional distribution of Y given X = x (each row sums to 1):

X \ Y   0        1        2        3        Total
0       0.1190   0.4762   0.3571   0.0476   1
1       0.2778   0.5556   0.1667   0        1
2       0.5556   0.4444   0        0        1
3       1        0        0        0        1

Since, e.g., p(0, 0) = 0.04545 ≠ 0.3818 × 0.2545 = pX(0)pY(0), X and Y are related (not independent).
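The conditional rows can be generated directly from the joint pmf (a sketch; function names are mine):

```python
# Sketch: conditional pmf p(y|x) = p(x, y) / pX(x) for the drawing-balls example.
from math import comb

def p(x, y):
    if x + y > 3:
        return 0.0
    return comb(3, x) * comb(4, y) * comb(5, 3 - x - y) / comb(12, 3)

def p_y_given_x(y, x):
    px = sum(p(x, v) for v in range(4))   # marginal pX(x) by summing over y
    return p(x, y) / px

cond_row0 = [p_y_given_x(y, 0) for y in range(4)]
```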

Independence
X and Y are independent if p(x, y) = pX(x) pY(y) for all x, y.

Example :

X \ Y   10     20     40     80     pX(x)
20      0.04   0.08   0.08   0.05   0.25
40      0.12   0.24   0.24   0.15   0.75
pY(y)   0.16   0.32   0.32   0.20   1

Every cell equals the product of its marginals, e.g. p(40, 80) = 0.15 = 0.20 × 0.75, so X and Y are independent !
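The cell-by-cell check is easy to automate (a sketch; the table is entered by hand and variable names are mine):

```python
# Sketch: verify p(x, y) = pX(x) * pY(y) for every cell of the example table.
from math import isclose

joint = {
    20: {10: 0.04, 20: 0.08, 40: 0.08, 80: 0.05},
    40: {10: 0.12, 20: 0.24, 40: 0.24, 80: 0.15},
}
pX = {x: sum(row.values()) for x, row in joint.items()}
pY = {y: sum(joint[x][y] for x in joint) for y in (10, 20, 40, 80)}
independent = all(
    isclose(joint[x][y], pX[x] * pY[y], abs_tol=1e-9)
    for x in joint for y in joint[x]
)
```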

Mathematical Expectation
Expectation is linear: for constants a and b,

E(aX + bY) = aE(X) + bE(Y),  e.g.  E(3X + 4Y) = 3E(X) + 4E(Y)

Linearity extends to any functions of X and Y, e.g.

E(7 log X − 2Y² + 5XY) = 7E(log X) − 2E(Y²) + 5E(XY)

Example : (X, Y) = height and weight of a random man. Joint distribution:

X \ Y   75     80     85     90     pX(x)
1.7     0.1    0.08   0.06   0.03   0.27
1.8     0.09   0.2    0.15   0.05   0.49
1.9     0.02   0.05   0.07   0.1    0.24
pY(y)   0.21   0.33   0.28   0.18   1

E(X) = 1.797,  E(Y) = 82.15

In general, E(XY) ≠ E(X)E(Y).
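The expectations and the linearity identity can be checked by direct summation over the joint table (a sketch; variable names are mine):

```python
# Sketch: E(X), E(Y) from the height/weight table, and a direct check that
# E(3X + 4Y) = 3E(X) + 4E(Y).
xs = [1.7, 1.8, 1.9]
ys = [75, 80, 85, 90]
p = [
    [0.10, 0.08, 0.06, 0.03],
    [0.09, 0.20, 0.15, 0.05],
    [0.02, 0.05, 0.07, 0.10],
]
EX = sum(xs[i] * p[i][j] for i in range(3) for j in range(4))
EY = sum(ys[j] * p[i][j] for i in range(3) for j in range(4))
E3X4Y = sum((3 * xs[i] + 4 * ys[j]) * p[i][j]
            for i in range(3) for j in range(4))
```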

Covariance: motivation. For the height/weight example, µx = 1.797 and µy = 82.15. Outcomes with X and Y both above (or both below) their means give (X − µx)(Y − µy) > 0; outcomes with one above and one below give (X − µx)(Y − µy) < 0. If large X tends to occur with large Y, the positive products dominate on average.

If X and Y are independent, then E(XY) = E(X)E(Y); more generally, expectations of products of separate functions of X and Y factor, e.g.

E(X²Y) = E(X²)E(Y)
E(X/Y) = E(X)E(1/Y)
Covariance
σxy = Cov(X, Y) = E[(X − µx)(Y − µy)] = E(XY) − µx µy

Example : (X, Y) = height and weight of a random man, with the joint distribution above. E(X) = 1.797, E(Y) = 82.15, and

E(XY) = (1.7)(75)(0.1) + (1.7)(80)(0.08) +  + (1.9)(90)(0.1) = 147.745
Cov(X, Y) = 147.745 − (1.797)(82.15) = 0.12145

Since σxy > 0, height and weight are +ve correlated.
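The covariance follows from the same table by one more summation (a sketch; variable names are mine):

```python
# Sketch: E(XY) and Cov(X, Y) = E(XY) - E(X)E(Y) for the height/weight table.
xs = [1.7, 1.8, 1.9]
ys = [75, 80, 85, 90]
p = [
    [0.10, 0.08, 0.06, 0.03],
    [0.09, 0.20, 0.15, 0.05],
    [0.02, 0.05, 0.07, 0.10],
]
EX = sum(xs[i] * p[i][j] for i in range(3) for j in range(4))
EY = sum(ys[j] * p[i][j] for i in range(3) for j in range(4))
EXY = sum(xs[i] * ys[j] * p[i][j] for i in range(3) for j in range(4))
cov = EXY - EX * EY
```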

Covariance and independence
σxy > 0 : positively correlated; σxy < 0 : negatively correlated; σxy = 0 : uncorrelated.

If X and Y are independent, then E(XY) = E(X)E(Y), so σxy = 0. Does σxy = 0 imply independence? ⇐ ?

Example :

No. of dates (X) \ Income ($1000) (Y)   10    20    30    Marginal
0                                       1/3   0     1/3   2/3
1                                       0     0     0     0
2                                       0     1/3   0     1/3
Marginal                                1/3   1/3   1/3   1

Example (height/weight table): linear combinations pass through expectation,

E(36X + 0.8Y) = (36 × 1.7 + 0.8 × 75)(0.1) + (36 × 1.7 + 0.8 × 80)(0.08) +  + (36 × 1.9 + 0.8 × 90)(0.1)
             = 130.412 = 36 E(X) + 0.8 E(Y)

but nonlinear combinations do not, e.g.

E(Y / X²) = (75 / 1.7²)(0.1) + (80 / 1.7²)(0.08) +  + (90 / 1.9²)(0.1) = 25.5191 ≠ E(Y) / E(X²) = 25.40

Example (dates/income table):

E(XY) = (1/3)(0 × 10) + (1/3)(0 × 30) + (1/3)(2 × 20) = 40/3
E(X) = (2/3) × 0 + (1/3) × 2 = 2/3,  E(Y) = (1/3)(10 + 20 + 30) = 20
Cov(X, Y) = 40/3 − (2/3)(20) = 0  ⇒  X and Y are uncorrelated.

But Pr(X = 0) Pr(Y = 10) = 2/9 ≠ 1/3 = Pr(X = 0, Y = 10)  ⇒  X and Y are not independent!

Independence implies zero covariance, but zero covariance does not imply independence.
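The dates/income example can be verified in a few lines (a sketch; variable names are mine):

```python
# Sketch: the dates/income example has Cov(X, Y) = 0 yet X and Y are dependent.
joint = {(0, 10): 1 / 3, (0, 30): 1 / 3, (2, 20): 1 / 3}
EX = sum(x * pr for (x, y), pr in joint.items())
EY = sum(y * pr for (x, y), pr in joint.items())
EXY = sum(x * y * pr for (x, y), pr in joint.items())
cov = EXY - EX * EY

p_x0 = sum(pr for (x, y), pr in joint.items() if x == 0)    # Pr(X = 0) = 2/3
p_y10 = sum(pr for (x, y), pr in joint.items() if y == 10)  # Pr(Y = 10) = 1/3
p_both = joint.get((0, 10), 0.0)                            # Pr(X = 0, Y = 10) = 1/3
```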

Correlation Coefficient
The magnitude of σxy depends on the scales of X and Y:

Cov(aX + b, cY + d) = ac Cov(X, Y)

Standardize by the standard deviations of X and Y:

ρ = Corr(X, Y) = Cov(X, Y) / √( Var(X) Var(Y) )

Example : (X, Y) = height and weight of a random man.

E(X²) = 1.7²(0.27) + 1.8²(0.49) + 1.9²(0.24) = 3.2343, so Var(X) = 3.2343 − 1.797² = 0.005091
E(Y²) = 75²(0.21) + 80²(0.33) + 85²(0.28) + 90²(0.18) = 6774.25, so Var(Y) = 6774.25 − 82.15² = 25.6275

ρ = 0.12145 / √(0.005091 × 25.6275) = 0.3362  (slightly +ve correlated)

Properties:
−1 ≤ ρ ≤ 1
Corr(aX + b, cY + d) = sign(ac) Corr(X, Y)
X and Y independent ⇒ ρ = 0, but ρ = 0 does not imply independence.
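The standardization is a one-liner once the moments are computed (a sketch; variable names are mine):

```python
# Sketch: rho = Cov(X, Y) / sqrt(Var(X) Var(Y)) for the height/weight table.
from math import sqrt

xs = [1.7, 1.8, 1.9]
ys = [75, 80, 85, 90]
p = [
    [0.10, 0.08, 0.06, 0.03],
    [0.09, 0.20, 0.15, 0.05],
    [0.02, 0.05, 0.07, 0.10],
]
EX = sum(xs[i] * p[i][j] for i in range(3) for j in range(4))
EY = sum(ys[j] * p[i][j] for i in range(3) for j in range(4))
EXY = sum(xs[i] * ys[j] * p[i][j] for i in range(3) for j in range(4))
VarX = sum(xs[i] ** 2 * p[i][j] for i in range(3) for j in range(4)) - EX ** 2
VarY = sum(ys[j] ** 2 * p[i][j] for i in range(3) for j in range(4)) - EY ** 2
rho = (EXY - EX * EY) / sqrt(VarX * VarY)
```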

Linear Combination of Random Variables
E(aX + bY) = aE(X) + bE(Y)
Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)

Example : X = husband's income, Y = wife's income, with E(X) = 20, Var(X) = 60, E(Y) = 16, Var(Y) = 70, Cov(X, Y) = 49. Total income : S = X + Y.

E(S) = E(X) + E(Y) = 20 + 16 = 36
Var(S) = Var(X) + Var(Y) + 2 Cov(X, Y) = 60 + 70 + 2 × 49 = 228,  σS = √228 = 15.1

Example : investment. Let S = return on stocks and B = return on bonds, with joint distribution given on the slide. [Table: S takes values −10%, 0%, 10%, 20% with marginal probabilities 0.1, 0.2, 0.4, 0.3; B takes values 6%, 8%, 10% with marginal probabilities 0.2, 0.6, 0.2.]

E(S) = 9%,  Var(S) = 89 %²
E(B) = 8%,  Var(B) = 1.6 %²
E(SB) = (−10)(6)(0) +  + (20)(10)(0) = 64, so Cov(S, B) = 64 − (9)(8) = −8 %²

Portfolio: invest a fraction p in stocks, R = pS + (1 − p)B.

E(R) = pE(S) + (1 − p)E(B) = 9p + 8(1 − p)
Var(R) = p² Var(S) + (1 − p)² Var(B) + 2p(1 − p) Cov(S, B)
       = 89p² + 1.6(1 − p)² − 16p(1 − p) = 106.6p² − 19.2p + 1.6

The variance is minimized at p = 19.2/213.2 = 0.09, giving

E(R) = 8.09,  σR = 0.8576
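The minimum-variance mix can be checked numerically from the summary statistics (a sketch; variable names are mine):

```python
# Sketch: minimum-variance portfolio R = p*S + (1-p)*B using the summary
# statistics E(S) = 9, Var(S) = 89, E(B) = 8, Var(B) = 1.6, Cov(S, B) = -8.
from math import sqrt

ES, VS = 9.0, 89.0
EB, VB = 8.0, 1.6
cov = -8.0

def var_R(p):
    # Var(pS + (1-p)B) = p^2 Var(S) + (1-p)^2 Var(B) + 2p(1-p) Cov(S, B)
    return p * p * VS + (1 - p) ** 2 * VB + 2 * p * (1 - p) * cov

# Var(R) = 106.6 p^2 - 19.2 p + 1.6 is minimized where the derivative vanishes:
p_star = 19.2 / (2 * 106.6)
ER = ES * p_star + EB * (1 - p_star)
sigma_R = sqrt(var_R(p_star))
```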

Population and Sample

Population: characterized by parameters, e.g. µ and σ²  (population parameters)
Sample: summarized by statistics, e.g. X̄ and S²  (sample statistics)

Example : a population with µ = 6 and σ² = 8.


Inference
Statistical inference uses sample statistics to learn about population parameters.

Sample Mean and Sample Variance (with replacement)
Example : population taking values 2, 4, 6, 8, 10, each with probability 1/5 (µ = 6, σ² = 8). Draw a sample {X1, X2} with replacement, and compute

X̄ = (X1 + X2)/2,  S² = [(X1 − X̄)² + (X2 − X̄)²]/(2 − 1) = (X1 − X2)²/2

{2, 2}: X̄ = 2, S² = 0, Prob = 1/25
{2, 4}: X̄ = 3, S² = 2, Prob = 2/25
{2, 6}: X̄ = 4, S² = 8, Prob = 2/25
……………….

Very Simple Random Sample (VSRS)
Sampling distributions for the VSRS {X1, X2} above:

X̄      2      3      4      5      6     7      8      9      10
Prob   0.04   0.08   0.12   0.16   0.2   0.16   0.12   0.08   0.04

S²     0     2      8      18     32
Prob   0.2   0.32   0.24   0.16   0.08
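Both sampling distributions can be obtained by brute-force enumeration of the 25 ordered samples (a sketch; variable names are mine):

```python
# Sketch: enumerate all ordered samples {X1, X2} drawn with replacement from
# the population {2, 4, 6, 8, 10} (each value probability 1/5) and tabulate
# the sampling distributions of X-bar and S^2.
from itertools import product
from collections import defaultdict

pop = [2, 4, 6, 8, 10]
xbar_dist = defaultdict(float)
s2_dist = defaultdict(float)
for x1, x2 in product(pop, repeat=2):
    xbar = (x1 + x2) / 2
    s2 = (x1 - xbar) ** 2 + (x2 - xbar) ** 2   # divisor n - 1 = 1
    xbar_dist[xbar] += 1 / 25
    s2_dist[s2] += 1 / 25

E_xbar = sum(v * pr for v, pr in xbar_dist.items())
E_s2 = sum(v * pr for v, pr in s2_dist.items())
```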

A (very simple) random sample {X1, X2, ..., Xn}:
Each X is drawn from the same population (distribution);
the X's are independent.

E(X̄) = µ  (X̄ is unbiased for µ)
Var(X̄) = σ²/n,  SE = √Var(X̄) = σ/√n  (standard error)

Check with the example: E(X̄) = 2 × 0.04 + 3 × 0.08 +  + 10 × 0.04 = 6 = µ, and Var(X̄) = 4 = σ²/2 ⇒ SE = 2.
E(S²) = 0 × 0.2 + 2 × 0.32 +  + 32 × 0.08 = 8 = σ²  (S² is unbiased for σ²)
Var(S²) = 86.4 ⇒ sd(S²) = 9.295

Example : final examination scores have µ = 71.8 and σ² = 195.2. For a VSRS of 16 students:

E(X̄) = µ = 71.8,  SE = σ/√n = √(195.2/16) = 3.49

Sampling Distribution
Population → VSRS {X1, X2} → statistics X̄ and S². Each statistic computed from a random sample is itself a random variable, with its own sampling distribution (the distributions of X̄ and S² tabulated earlier).

Normal Population
If the population is N(µ, σ²) and {X1, X2, ..., Xn} is a VSRS, then

X̄ ~ N(µ, σ²/n)  and  (n − 1)S²/σ² ~ χ²(n−1)

Example : a measurement X of a true value µ satisfies X ~ N(µ, 0.01), i.e. σ = 0.1.

Pr( |X − µ| > 0.05 ) = Pr( |X − µ|/0.1 > 0.05/0.1 ) = 2(1 − Φ(0.5)) = 0.617

Take 10 measurements independently (a VSRS with n = 10). Then X̄ ~ N(µ, 0.01/10) and

Pr( |X̄ − µ| > 0.05 ) = 2(1 − Φ(0.05 √10 / 0.1)) = 2(1 − Φ(1.58)) = 0.1142

Averaging reduces the chance of a large measurement error.
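These two tail probabilities can be evaluated with the standard normal cdf written via the error function (a sketch; `Phi` is my helper name):

```python
# Sketch: measurement-error probabilities via Phi(z) = (1 + erf(z/sqrt(2)))/2.
from math import erf, sqrt

def Phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

sigma = 0.1
p_single = 2 * (1 - Phi(0.05 / sigma))               # one measurement
p_mean10 = 2 * (1 - Phi(0.05 / (sigma / sqrt(10))))  # mean of 10 measurements
```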

Simple Random Sample (SRS)
{X1, X2, ..., Xn} : sampling without replacement from a population of size N, with equal probability for each possible sample.

E(X̄) = µ,  Var(X̄) = ( (N − n)/(N − 1) ) σ²/n

The factor (N − n)/(N − 1) ≤ 1 is the finite population correction factor, and

SE = √Var(X̄) = √( (N − n)/(N − 1) ) σ/√n  (standard error)

Sample Mean and Sample Variance (without replacement)
Example : population {2, 4, 6, 8, 10} with µ = 6, σ² = 8; draw {X1, X2} without replacement:

X̄ = (X1 + X2)/2,  S² = (X1 − X2)²/2

{2, 4}: X̄ = 3, S² = 2, Prob = 1/10
{2, 6}: X̄ = 4, S² = 8, Prob = 1/10
{4, 6}: X̄ = 5, S² = 2, Prob = 1/10
……………….

Sampling Distribution (SRS)

X̄      3     4     5     6     7     8     9
Prob   0.1   0.1   0.2   0.2   0.2   0.1   0.1

S²     2     8     18    32
Prob   0.4   0.3   0.2   0.1

E(X̄) = 3 × 0.1 + 4 × 0.1 +  + 9 × 0.1 = 6 = µ  (unbiased)
E(S²) = 2 × 0.4 + 8 × 0.3 +  + 32 × 0.1 = 10 ≠ σ²  (biased)
Var(X̄) = 3 ⇒ sd(X̄) = 1.732  (check: ((5 − 2)/(5 − 1)) × 8/2 = 3)
Var(S²) = 88 ⇒ sd(S²) = 9.38
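Enumerating the 10 equally likely unordered samples confirms the finite population correction (a sketch; variable names are mine):

```python
# Sketch: enumerate all size-2 samples drawn without replacement from
# {2, 4, 6, 8, 10}; verify the finite population correction for Var(X-bar).
from itertools import combinations

pop = [2, 4, 6, 8, 10]
samples = list(combinations(pop, 2))       # 10 equally likely samples
xbars = [(a + b) / 2 for a, b in samples]
s2s = [(a - b) ** 2 / 2 for a, b in samples]

E_xbar = sum(xbars) / len(samples)
Var_xbar = sum((x - E_xbar) ** 2 for x in xbars) / len(samples)
E_s2 = sum(s2s) / len(samples)

N, n, sigma2 = 5, 2, 8
fpc_var = (N - n) / (N - 1) * sigma2 / n   # finite population correction formula
```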

Central Limit Theorem
Let {X1, X2, ..., Xn} be a VSRS from an arbitrary population with mean µ and variance σ². Then

(X̄ − µ) / (σ/√n) → N(0, 1) in distribution, as n → ∞.

[Figure: as n grows from small through moderate to very large, the distribution of X̄ looks increasingly normal.]

So for large n we have the normal approximation

Pr( X̄ ≤ c ) ≈ Φ( (c − µ) / (σ/√n) )

regardless of the shape of the population.
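A small seeded simulation illustrates the CLT; the uniform(0, 1) population (µ = 0.5, σ² = 1/12) and all parameter choices here are mine, not from the slides:

```python
# Sketch: simulate standardized sample means from a uniform(0, 1) population;
# by the CLT they should behave approximately like N(0, 1) draws.
import random
from math import sqrt

random.seed(0)
n, trials = 30, 2000
mu, sigma = 0.5, sqrt(1 / 12)

z = []
for _ in range(trials):
    xbar = sum(random.random() for _ in range(n)) / n
    z.append((xbar - mu) / (sigma / sqrt(n)))

frac_below_0 = sum(v <= 0 for v in z) / trials        # near Phi(0) = 0.5
frac_below_196 = sum(v <= 1.96 for v in z) / trials   # near Phi(1.96) = 0.975
```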

Normal Approximation

Example : monthly income X has distribution

Income   4000   7000   10000   15000   20000   30000   40000   60000
Prob     0.4    0.25   0.15    0.1     0.05    0.03    0.01    0.01

µ = E(X) = (4000)(0.4) + (7000)(0.25) +  + (60000)(0.01) = 9250
σ² = Var(X) = 155150000 − 9250² = 69587500, so σ = 8342

By exact calculation, Pr( X > 10000 ) = 0.1 + 0.05 + 0.03 + 0.01 + 0.01 = 0.2.
Treating X itself as normal gives

Pr( X > 10000 ) ≈ 1 − Φ( (10000 − 9250)/8342 ) = 1 − Φ(0.09) = 0.4641

a poor approximation, since the income distribution is highly skewed. For a VSRS of size 100, however, the CLT gives X̄ ~ N(9250, 8342²/100) approximately, and

Pr( X̄ > 10000 ) ≈ 1 − Φ( (10000 − 9250)/(8342/√100) ) = 1 − Φ(0.90) = 0.1841

Normal approximate binomial
If Y ~ b(n, π) with n large, then

(Y − nπ) / √(nπ(1 − π)) ~ N(0, 1) approximately.

Example : Y ~ b(35, 0.25), so E(Y) = 35 × 0.25 = 8.75 and Var(Y) = 35 × 0.25 × 0.75 = 6.5625.

Pr( 7 ≤ Y ≤ 15 ) ≈ Φ( (15 − 8.75)/√6.5625 ) − Φ( (7 − 8.75)/√6.5625 )
               = Φ(2.440) − Φ(−0.683) = 0.9927 − 0.2473 = 0.7454

By exact calculation,

Pr( 7 ≤ Y ≤ 15 ) = Σ_{y=7}^{15} 35Cy (0.25)^y (0.75)^{35−y} = 0.8018
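The exact binomial sum and the plain normal approximation are easy to compare (a sketch; variable names are mine):

```python
# Sketch: Pr(7 <= Y <= 15) for Y ~ b(35, 0.25), exact vs. plain normal approximation.
from math import comb, erf, sqrt

def Phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, pi = 35, 0.25
mean = n * pi
sd = sqrt(n * pi * (1 - pi))

exact = sum(comb(n, y) * pi ** y * (1 - pi) ** (n - y) for y in range(7, 16))
approx = Phi((15 - mean) / sd) - Phi((7 - mean) / sd)
```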

Continuity Correction
When approximating a discrete distribution by a continuous one, adjust the interval endpoints by 0.5:

Pr( 7 ≤ b(35, 0.25) ≤ 15 ) ≈ Pr( 6.5 ≤ N(8.75, 6.5625) ≤ 15.5 )

[Figure: pmf of b(35, 0.25) with the approximating normal density overlaid; the bars for Y = 7, ..., 15 span the interval from 6.5 to 15.5.]

General rules for a discrete X approximated by a continuous distribution:

Pr( X ≥ c )  →  Pr( X ≥ c − 0.5 )
Pr( X > c )  →  Pr( X ≥ c + 0.5 )
Pr( X ≤ c )  →  Pr( X ≤ c + 0.5 )
Pr( X < c )  →  Pr( X ≤ c − 0.5 )

Example : Y ~ b(35, 0.25), E(Y) = 8.75, Var(Y) = 6.5625.

Pr( 7 ≤ Y ≤ 15 ) ≈ Φ( (15.5 − 8.75)/√6.5625 ) − Φ( (6.5 − 8.75)/√6.5625 )
               = Φ(2.635) − Φ(−0.878) = 0.9958 − 0.1900 = 0.8058

By exact calculation, Pr( 7 ≤ Y ≤ 15 ) = 0.8018, so the continuity correction improves the approximation considerably.
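Repeating the comparison with corrected endpoints shows the improvement (a sketch; variable names are mine):

```python
# Sketch: the same binomial probability with the continuity correction applied.
from math import comb, erf, sqrt

def Phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, pi = 35, 0.25
mean = n * pi
sd = sqrt(n * pi * (1 - pi))

# endpoints 7 and 15 widened to 6.5 and 15.5
approx_cc = Phi((15.5 - mean) / sd) - Phi((6.5 - mean) / sd)
exact = sum(comb(n, y) * pi ** y * (1 - pi) ** (n - y) for y in range(7, 16))
```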

Normal Approximation

Normal approximate Poisson
If Y ~ ℘(θ) with θ large, then (Y − θ)/√θ ~ N(0, 1) approximately.

Example : Y ~ ℘(30). By normal approximation with continuity correction,

Pr( 24 ≤ Y ≤ 39 ) ≈ Φ( (39.5 − 30)/√30 ) − Φ( (23.5 − 30)/√30 )
                 = Φ(1.734) − Φ(−1.187) = 0.9586 − 0.1176 = 0.841

By exact calculation,

Pr( 24 ≤ Y ≤ 39 ) = Σ_{y=24}^{39} e^{−30} 30^y / y! = 0.8391

Normal approximate Chi-Square
If Y ~ χ²(r) with r large, then (Y − r)/√(2r) ~ N(0, 1) approximately.

Example : Y ~ χ²(80).

Pr( 60.39 ≤ Y ≤ 96.58 ) ≈ Φ( (96.58 − 80)/√160 ) − Φ( (60.39 − 80)/√160 )
                       = Φ(1.311) − Φ(−1.550) = 0.9051 − 0.0606 = 0.8445

From the Chi-Square distribution table, Pr( 60.39 ≤ Y ≤ 96.58 ) = 0.9 − 0.05 = 0.85.

Note : no continuity correction is needed, as the Chi-Square distribution is continuous.
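Both approximations can be checked numerically; for the Poisson, the exact terms are computed on the log scale to avoid overflow (a sketch; variable names are mine):

```python
# Sketch: normal approximations for Poisson(30) (with continuity correction,
# since it is discrete) and for chi-square(80) (no correction, continuous).
from math import erf, exp, lgamma, log, sqrt

def Phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Poisson(30): Pr(24 <= Y <= 39)
theta = 30.0
pois_exact = sum(exp(-theta + y * log(theta) - lgamma(y + 1))
                 for y in range(24, 40))
pois_approx = Phi((39.5 - theta) / sqrt(theta)) - Phi((23.5 - theta) / sqrt(theta))

# Chi-square(80): Pr(60.39 <= Y <= 96.58)
r = 80.0
chi_approx = Phi((96.58 - r) / sqrt(2 * r)) - Phi((60.39 - r) / sqrt(2 * r))
```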