
Chapter 9 CONDITIONAL EXPECTATION OF BIVARIATE RANDOM VARIABLES

This chapter examines the conditional mean and conditional variance associated with two random variables. The conditional mean is very useful in Bayesian estimation of parameters under a squared error loss function. Further, the notion of the conditional mean paves the way for regression analysis in statistics.

9.1. Conditional Expected Values

Let X and Y be any two random variables with joint density f(x, y). Recall that the conditional probability density of X, given the event Y = y, is defined as

$$ g(x \mid y) = \frac{f(x, y)}{f_2(y)}, $$

where f_2(y) is the marginal probability density of Y. Similarly, the conditional probability density of Y, given the event X = x, is defined as

$$ h(y \mid x) = \frac{f(x, y)}{f_1(x)}, $$

where f_1(x) is the marginal probability density of X.

Definition 9.1. The conditional mean of X given Y = y is defined as μ_{X|y} = E(X | y), where

$$ E(X \mid y) = \begin{cases} \displaystyle\sum_{x} x \, g(x \mid y) & \text{if } X \text{ is discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty} x \, g(x \mid y) \, dx & \text{if } X \text{ is continuous.} \end{cases} $$

Similarly, the conditional mean of Y given X = x is defined as μ_{Y|x} = E(Y | x), where

$$ E(Y \mid x) = \begin{cases} \displaystyle\sum_{y} y \, h(y \mid x) & \text{if } Y \text{ is discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty} y \, h(y \mid x) \, dy & \text{if } Y \text{ is continuous.} \end{cases} $$

Example 9.1. Let X and Y be discrete random variables with joint probability density function

$$ f(x, y) = \begin{cases} \frac{x + y}{21} & \text{if } x = 1, 2, 3; \; y = 1, 2 \\[1ex] 0 & \text{elsewhere.} \end{cases} $$

What is the conditional mean of X given Y = y, that is, E(X|y)?

Answer: To compute the conditional mean of X given Y = y, we need the conditional density g(x|y) of X given Y = y. However, to find g(x|y), we need to know the marginal of Y, that is f_2(y). Thus, we begin with

$$ f_2(y) = \sum_{x=1}^{3} \frac{x + y}{21} = \frac{6 + 3y}{21}. $$

Therefore, the conditional density of X given Y = y is given by

$$ g(x \mid y) = \frac{f(x, y)}{f_2(y)} = \frac{x + y}{6 + 3y}, \qquad x = 1, 2, 3. $$


The conditional expected value of X given the event Y = y is therefore

$$ E(X \mid y) = \sum_{x=1}^{3} x \, g(x \mid y) = \frac{1(1+y) + 2(2+y) + 3(3+y)}{6 + 3y} = \frac{14 + 6y}{6 + 3y}. $$

Remark 9.1. Note that the conditional mean of X given Y = y depends only on y; that is, E(X|y) is a function φ(y) of y. In the above example, this function is the rational function φ(y) = (14 + 6y)/(6 + 3y).
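The closed form is easy to confirm numerically. The following Python sketch (not part of the original text) rebuilds the conditional pmf directly from the joint pmf f(x, y) = (x + y)/21 used above:

```python
# Numeric check of Example 9.1: rebuild g(x|y) from the joint pmf and
# compare the conditional mean with the closed form (14 + 6y)/(6 + 3y).
import numpy as np

xs = np.array([1, 2, 3])
for y in (1, 2):
    f_xy = (xs + y) / 21.0               # joint pmf f(x, y) at fixed y
    f2_y = f_xy.sum()                    # marginal f2(y) = (6 + 3y)/21
    g = f_xy / f2_y                      # conditional pmf g(x | y)
    cond_mean = (xs * g).sum()           # E(X | y) computed directly
    closed_form = (14 + 6 * y) / (6 + 3 * y)
    print(y, cond_mean, closed_form)     # the two values agree for each y
```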

Example 9.2. Let X and Y have the joint density function

$$ f(x, y) = \begin{cases} x + y & \text{if } 0 \le x, y \le 1 \\ 0 & \text{elsewhere.} \end{cases} $$

What is the conditional mean E(Y | X = 1/3)?

Answer: We first need the marginal density of X:

$$ f_1(x) = \int_0^1 (x + y) \, dy = \left[ xy + \frac{y^2}{2} \right]_{y=0}^{y=1} = x + \frac{1}{2}. $$


Hence the conditional density of Y given X = x is

$$ h(y \mid x) = \frac{f(x, y)}{f_1(x)} = \frac{x + y}{x + \frac{1}{2}}, \qquad 0 \le y \le 1, $$

and therefore

$$ E\left(Y \mid X = \tfrac{1}{3}\right) = \int_0^1 y \, h\left(y \mid \tfrac{1}{3}\right) dy = \frac{6}{5} \int_0^1 \left( \frac{y}{3} + y^2 \right) dy = \frac{6}{5} \left( \frac{1}{6} + \frac{1}{3} \right) = \frac{3}{5}. $$
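A quick quadrature check of this value, assuming the reconstructed density f(x, y) = x + y on the unit square:

```python
# Numeric check of Example 9.2: approximate E(Y | X = 1/3) by a midpoint
# rule over (0, 1) for the conditional density h(y | x) = (x + y)/(x + 1/2).
import numpy as np

x = 1.0 / 3.0
n = 1_000_000
y = (np.arange(n) + 0.5) / n          # midpoint grid on (0, 1)
h = (x + y) / (x + 0.5)               # conditional density h(y | x)
print((y * h).mean())                 # ~ 0.6, i.e. 3/5
```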

The mean of the random variable Y is a deterministic number. The conditional mean of Y given X = x, that is E(Y|x), is a function φ(x) of the variable x. Using this function, we form φ(X). This function φ(X) is a random variable. Thus, starting from the deterministic function E(Y|x), we have formed the random variable E(Y|X) = φ(X). An important property of conditional expectation is given by the following theorem.

Theorem 9.1. The expected value of the random variable E(Y|X) is equal to the expected value of Y; that is,

$$ E_x\left( E_{y|x}(Y \mid X) \right) = E_y(Y), $$


where E_x(X) stands for the expectation of X with respect to the distribution of X, and E_{y|x}(Y|X) stands for the expected value of Y with respect to the conditional density h(y | X).

Proof: We prove this theorem for continuous variables and leave the discrete case to the reader.

$$ \begin{aligned} E_x\left( E_{y|x}(Y \mid X) \right) &= E_x\left( \int_{-\infty}^{\infty} y \, h(y \mid X) \, dy \right) \\ &= \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} y \, h(y \mid x) \, dy \right) f_1(x) \, dx \\ &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y \, h(y \mid x) \, f_1(x) \, dy \, dx \\ &= \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} h(y \mid x) \, f_1(x) \, dx \right) y \, dy \\ &= \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} f(x, y) \, dx \right) y \, dy \\ &= \int_{-\infty}^{\infty} y \, f_2(y) \, dy = E_y(Y). \end{aligned} $$
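The theorem is easy to illustrate by simulation. The hierarchy below is a hypothetical choice made purely for convenience (it is not from the text): X ~ UNIF(0, 1) and Y | X = x ~ N(2x, 1), so that E(Y|X) = 2X exactly.

```python
# Monte Carlo illustration of Theorem 9.1: E[E(Y|X)] = E(Y).
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.uniform(0.0, 1.0, n)          # X ~ UNIF(0, 1)
y = rng.normal(2.0 * x, 1.0)          # Y | X = x ~ N(2x, 1)

print(y.mean())                       # E(Y)      ~ 1.0
print((2.0 * x).mean())               # E[E(Y|X)] ~ 1.0
```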

Example 9.3. An insect lays Y number of eggs, where Y has a Poisson distribution with parameter λ. If the probability of each egg surviving is p, then on average how many eggs will survive?

Answer: Let X denote the number of surviving eggs. Then, given that Y = y (that is, given that the insect has laid y eggs), the random variable X has a binomial distribution with parameters y and p. Thus

$$ X \mid Y \sim \mathrm{BIN}(Y, p), \qquad Y \sim \mathrm{POI}(\lambda). $$

Therefore, the expected number of survivors is given by

$$ \begin{aligned} E_x(X) &= E_y\left( E_{x|y}(X \mid Y) \right) \\ &= E_y(pY) & (\text{since } X \mid Y \sim \mathrm{BIN}(Y, p)) \\ &= p \, E_y(Y) \\ &= p\lambda & (\text{since } Y \sim \mathrm{POI}(\lambda)). \end{aligned} $$
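A simulation of this mixture, with λ = 10 and p = 0.3 chosen purely as illustrative values (the text keeps λ symbolic):

```python
# Simulation of Example 9.3: survivors of a Poisson number of eggs.
import numpy as np

rng = np.random.default_rng(1)
lam, p, n = 10.0, 0.3, 1_000_000
eggs = rng.poisson(lam, n)            # Y ~ POI(lam)
survivors = rng.binomial(eggs, p)     # X | Y ~ BIN(Y, p)
print(survivors.mean())               # ~ p * lam = 3.0
```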

  Definition 9.2. A random variable X is said to have a mixture distribution if the distribution of X depends on a quantity which also has a distribution.


  Example 9.4. A fair coin is tossed. If a head occurs, 1 die is rolled; if a tail occurs, 2 dice are rolled. Let Y be the total on the die or dice. What is the expected value of Y ?

Answer: Let X denote the outcome of tossing a coin. Then X ~ BER(p), where the probability of success is p = 1/2.

Note that the expected number of dots that show when 1 die is rolled is 126/36 = 7/2, and the expected number of dots that show when 2 dice are rolled is 252/36 = 7. Hence, by Theorem 9.1,

$$ E(Y) = E_x\left( E_{y|x}(Y \mid X) \right) = \frac{1}{2} \cdot \frac{7}{2} + \frac{1}{2} \cdot 7 = \frac{21}{4}. $$
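A direct simulation of this mixture confirms the value 21/4 = 5.25:

```python
# Simulation of Example 9.4: one die on heads, two dice on tails.
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
heads = rng.integers(0, 2, n).astype(bool)             # fair coin
one_die = rng.integers(1, 7, n)                        # total with 1 die
two_dice = rng.integers(1, 7, n) + rng.integers(1, 7, n)
total = np.where(heads, one_die, two_dice)
print(total.mean())                                    # ~ 5.25 = 21/4
```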

Theorem 9.2. Let X and Y be two random variables with means μ_X and μ_Y, and standard deviations σ_X and σ_Y, respectively. If the conditional expectation of Y given X = x is linear in x, then

$$ E(Y \mid X = x) = \mu_Y + \rho \, \frac{\sigma_Y}{\sigma_X} \, (x - \mu_X), $$

where ρ denotes the correlation coefficient of X and Y.

Proof: We assume that the random variables X and Y are continuous. If

they are discrete, the proof follows in exactly the same way, with the integrals replaced by summations. We are given that E(Y|X = x) is linear in x; that is,

$$ E(Y \mid X = x) = ax + b, $$

where a and b are two constants. Hence, from the above we get

$$ \int_{-\infty}^{\infty} y \, h(y \mid x) \, dy = ax + b, \tag{9.1} $$


which implies

$$ \int_{-\infty}^{\infty} y \, \frac{f(x, y)}{f_1(x)} \, dy = ax + b. $$

Multiplying both sides by f_1(x), we get

$$ \int_{-\infty}^{\infty} y \, f(x, y) \, dy = (ax + b) \, f_1(x). $$

Now integrating with respect to x, we get

$$ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y \, f(x, y) \, dy \, dx = \int_{-\infty}^{\infty} (ax + b) \, f_1(x) \, dx. $$

This yields

$$ \mu_Y = a \mu_X + b. \tag{9.2} $$

Now, we multiply both sides of (9.1) by x f_1(x) and then integrate the resulting expression with respect to x to get

$$ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x \, y \, f(x, y) \, dy \, dx = \int_{-\infty}^{\infty} \left( ax^2 + bx \right) f_1(x) \, dx. $$

From this we get

$$ E(XY) = a \, E\left(X^2\right) + b \, \mu_X. \tag{9.3} $$

Solving (9.2) and (9.3) for the unknowns a and b, we get

$$ a = \frac{E(XY) - \mu_X \mu_Y}{E\left(X^2\right) - \mu_X^2} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X^2} = \rho \, \frac{\sigma_Y}{\sigma_X}. $$

Similarly, we get

$$ b = \mu_Y - \rho \, \frac{\sigma_Y}{\sigma_X} \, \mu_X. $$

Substituting these values of a and b into E(Y|X = x) = ax + b, we obtain the asserted result, and the proof of the theorem is now complete.
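For a bivariate normal pair the regression of Y on X is exactly linear, so Theorem 9.2 can be checked by simulation; the parameter values below are illustrative choices, not from the text:

```python
# Monte Carlo check of Theorem 9.2 on a bivariate normal pair.
import numpy as np

rng = np.random.default_rng(3)
mu_x, mu_y, s_x, s_y, rho = 1.0, 2.0, 1.5, 0.5, 0.6
n = 1_000_000
x = rng.normal(mu_x, s_x, n)
# Draw Y from its exact conditional distribution given X.
cond_mean = mu_y + rho * (s_y / s_x) * (x - mu_x)
cond_sd = s_y * np.sqrt(1.0 - rho ** 2)
y = rng.normal(cond_mean, cond_sd)

slope, intercept = np.polyfit(x, y, 1)              # least-squares fit
print(slope, rho * s_y / s_x)                       # both ~ 0.2
print(intercept, mu_y - rho * (s_y / s_x) * mu_x)   # both ~ 1.8
```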

Example 9.5. Suppose X and Y are random variables with E(Y|X = x) = x + 3 and E(X|Y = y) = (1/4)y + 5. What is the correlation coefficient of X and Y?


Answer: From Theorem 9.2, we get

$$ \mu_Y + \rho \, \frac{\sigma_Y}{\sigma_X} \, (x - \mu_X) = x + 3. $$

Therefore, equating the coefficients of the x terms, we get

$$ \rho \, \frac{\sigma_Y}{\sigma_X} = 1. \tag{9.4} $$

Similarly, since E(X|Y = y) is linear in y, we have

$$ \rho \, \frac{\sigma_X}{\sigma_Y} = \frac{1}{4}. \tag{9.5} $$

Multiplying (9.4) by (9.5), we get

$$ \rho^2 = \frac{1}{4}. $$

Solving this, we get ρ = ±1/2. Since ρ σ_Y/σ_X = 1 and σ_Y/σ_X > 0, we conclude that ρ = 1/2.

9.2. Conditional Variance

The variance of the conditional density h(y|x) is called the conditional variance of Y given that X = x. This conditional variance is defined as follows:

Definition 9.3. Let X and Y be two random variables with joint density f(x, y), and let h(y|x) be the conditional density of Y given X = x. The conditional variance of Y given X = x, denoted by Var(Y|x), is defined as

$$ \mathrm{Var}(Y \mid x) = E\left(Y^2 \mid x\right) - \left( E(Y \mid x) \right)^2, $$

where E(Y|x) denotes the conditional mean of Y given X = x.


Example 9.6. Let X and Y be continuous random variables with joint probability density function

$$ f(x, y) = \begin{cases} e^{-y} & \text{if } 0 < x < y < \infty \\ 0 & \text{elsewhere.} \end{cases} $$

What is the conditional variance of Y given the knowledge that X = x?

Answer: The marginal density f_1(x) is given by

$$ f_1(x) = \int_x^{\infty} e^{-y} \, dy = e^{-x}, \qquad 0 < x < \infty. $$

Thus, the conditional density of Y given X = x is

$$ h(y \mid x) = \frac{f(x, y)}{f_1(x)} = \frac{e^{-y}}{e^{-x}} = e^{-(y - x)}, \qquad x < y < \infty. $$

Thus, given X = x, the random variable Y has an exponential distribution with parameter θ = 1 and location parameter x. The conditional mean of Y given X = x is

$$ \begin{aligned} E(Y \mid x) &= \int_x^{\infty} y \, e^{-(y - x)} \, dy \\ &= \int_0^{\infty} (z + x) \, e^{-z} \, dz & (\text{where } z = y - x) \\ &= x \int_0^{\infty} e^{-z} \, dz + \int_0^{\infty} z \, e^{-z} \, dz \\ &= x \, \Gamma(1) + \Gamma(2) = x + 1. \end{aligned} $$


Similarly, we compute the second moment of the distribution h(y|x):

$$ E\left(Y^2 \mid x\right) = \int_x^{\infty} y^2 \, e^{-(y - x)} \, dy = \int_0^{\infty} (z + x)^2 e^{-z} \, dz = x^2 + 2x + 2 \qquad (\text{where } z = y - x). $$

Therefore

$$ \mathrm{Var}(Y \mid x) = E\left(Y^2 \mid x\right) - \left( E(Y \mid x) \right)^2 = x^2 + 2x + 2 - (x + 1)^2 = 1. $$

Remark 9.2. The unconditional variance of Y is 2. This can be seen as follows. Since the marginal of Y is given by

$$ f_2(y) = \int_0^y e^{-y} \, dx = y \, e^{-y}, $$

the expected value of Y is E(Y) = ∫₀^∞ y · y e^{−y} dy = Γ(3) = 2 and E(Y²) = ∫₀^∞ y² · y e^{−y} dy = Γ(4) = 6. Thus, the variance of Y is Var(Y) = 6 − 4 = 2. However, given the knowledge X = x, the variance of Y is 1. Thus, in a way, the prior knowledge reduces the variability (the variance) of a random variable.
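This variance reduction is visible in a simulation of the same density. Under f(x, y) = e^{−y} on 0 < x < y, the marginal of Y is a Gamma(2) distribution and the conditional of X given Y = y is uniform on (0, y), which gives an easy sampling scheme:

```python
# Simulation of Example 9.6: Var(Y) = 2, but Var(Y | X = x) = 1.
import numpy as np

rng = np.random.default_rng(4)
n = 2_000_000
y = rng.gamma(shape=2.0, scale=1.0, size=n)   # marginal f2(y) = y e^{-y}
x = rng.uniform(0.0, y)                       # X | Y = y ~ UNIF(0, y)

print(y.var())                                # ~ 2 (unconditional)
sel = np.abs(x - 1.0) < 0.01                  # condition on X near x = 1
print(y[sel].var())                           # ~ 1 (conditional)
print(y[sel].mean())                          # ~ x + 1 = 2
```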

  Next, we simply state the following theorem concerning the conditional variance without proof.


Theorem 9.3. Let X and Y be two random variables with means μ_X and μ_Y, and standard deviations σ_X and σ_Y, respectively. If the conditional expectation of Y given X = x is linear in x, then

$$ E_x\left( \mathrm{Var}(Y \mid X) \right) = \left( 1 - \rho^2 \right) \mathrm{Var}(Y), $$

where ρ denotes the correlation coefficient of X and Y.

Example 9.7. Let E(Y|X = x) = 2x and Var(Y|X = x) = 4x², and let X have a uniform distribution on the interval from 0 to 1. What is the variance of Y?

Answer: Since E(Y|X = x) is a linear function of x, Theorem 9.3 gives

$$ E_x\left( \mathrm{Var}(Y \mid X) \right) = \sigma_Y^2 \left( 1 - \rho^2 \right). $$

We are given that E(Y|X = x) = 2x. Hence, equating the coefficients of the x terms in Theorem 9.2, we get

$$ \rho \, \frac{\sigma_Y}{\sigma_X} = 2, \qquad \text{which is} \qquad \rho^2 \sigma_Y^2 = 4 \sigma_X^2. $$

Further, we are given that Var(Y|X = x) = 4x². Since X ~ UNIF(0, 1), the density of X is f(x) = 1 on the interval (0, 1). Therefore,

$$ E_x\left( \mathrm{Var}(Y \mid X) \right) = \int_0^1 \mathrm{Var}(Y \mid X = x) \, f(x) \, dx = \int_0^1 4x^2 \, dx = 4 \left[ \frac{x^3}{3} \right]_0^1 = \frac{4}{3}. $$


By Theorem 9.3,

$$ \frac{4}{3} = E_x\left( \mathrm{Var}(Y \mid X) \right) = \left( 1 - \rho^2 \right) \sigma_Y^2 = \sigma_Y^2 - \rho^2 \sigma_Y^2 = \sigma_Y^2 - 4 \sigma_X^2. $$

Since X ~ UNIF(0, 1), the variance of X is given by σ_X² = 1/12. Therefore, the variance of Y is given by

$$ \sigma_Y^2 = \frac{4}{3} + 4 \sigma_X^2 = \frac{4}{3} + \frac{4}{12} = \frac{5}{3}. $$
Example 9.8. Let E(X|Y = y) = 3y and Var(X|Y = y) = 2, and let Y have density function

$$ f(y) = \begin{cases} e^{-y} & \text{if } 0 < y < \infty \\ 0 & \text{otherwise.} \end{cases} $$

What is the variance of X?

Answer: By Theorem 9.3 (with the roles of X and Y interchanged), we get

$$ E_y\left( \mathrm{Var}(X \mid Y) \right) = \left( 1 - \rho^2 \right) \mathrm{Var}(X) = \sigma_X^2 - \rho^2 \sigma_X^2. \tag{9.6} $$

Since E(X|Y = y) = 3y is linear in y, equating coefficients in Theorem 9.2 gives ρ σ_X/σ_Y = 3, that is,

$$ \rho^2 \sigma_X^2 = 9 \sigma_Y^2. \tag{9.7} $$

Since Var(X|Y = y) = 2 for every y, we get E_y(Var(X|Y)) = 2, and thus from (9.6) and (9.7),

$$ 2 = \sigma_X^2 - 9 \sigma_Y^2, \qquad \text{which is} \qquad \sigma_X^2 = 9 \sigma_Y^2 + 2. $$


Now, we compute the variance of Y. For this, we need E(Y) and E(Y²). Since Y ~ EXP(1),

$$ E(Y) = \int_0^{\infty} y \, e^{-y} \, dy = 1 \qquad \text{and} \qquad E\left(Y^2\right) = \int_0^{\infty} y^2 \, e^{-y} \, dy = 2, $$

so that

$$ \mathrm{Var}(Y) = E\left(Y^2\right) - [E(Y)]^2 = 2 - 1 = 1. $$

Hence, the variance of X can be calculated as

$$ \sigma_X^2 = 9 \sigma_Y^2 + 2 = 9(1) + 2 = 11. $$
Remark 9.3. Notice that, in Example 9.8, we calculated the variance of Y directly from the form of f(y). It is easy to note that f(y) has the form of an exponential density with parameter θ = 1, and the variance of an exponential distribution is the square of its parameter. This immediately gives σ_Y² = 1.
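A simulation check, using the hypothetical construction X = 3Y + √2·Z with Z ~ N(0, 1) independent of Y, which matches the stated conditional mean and variance:

```python
# Simulation of Example 9.8: Var(X) = 9 Var(Y) + 2 = 11.
import numpy as np

rng = np.random.default_rng(6)
n = 2_000_000
y = rng.exponential(1.0, n)                           # Y ~ EXP(1)
x = 3.0 * y + np.sqrt(2.0) * rng.normal(0.0, 1.0, n)  # E(X|y)=3y, Var(X|y)=2
print(x.var())                                        # ~ 11
```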

9.3. Regression Curve and Scedastic Curve

One of the major goals in most statistical studies is to establish relationships between two or more random variables. For example, a company would like to know the relationship between the potential sales of a new product and its price. Historically, regression analysis originated in the works of Sir Francis Galton (1822-1911), but much of the theory of regression analysis was developed later by Sir Ronald Fisher (1890-1962).


Definition 9.4. Let X and Y be two random variables with joint probability density function f(x, y), and let h(y|x) be the conditional density of Y given X = x. Then the conditional mean

$$ E(Y \mid X = x) = \int_{-\infty}^{\infty} y \, h(y \mid x) \, dy $$

is called the regression function of Y on X. The graph of this regression function of Y on X is known as the regression curve of Y on X.

Example 9.9. Let X and Y be two random variables with joint density

$$ f(x, y) = \begin{cases} x \, e^{-x(1 + y)} & \text{if } 0 < x, y < \infty \\ 0 & \text{elsewhere.} \end{cases} $$

What is the regression function of Y on X?

Answer: The marginal density f_1(x) of X is

$$ f_1(x) = \int_0^{\infty} x \, e^{-x(1 + y)} \, dy = x \, e^{-x} \int_0^{\infty} e^{-xy} \, dy = e^{-x}. $$

The conditional density of Y given X = x is

$$ h(y \mid x) = \frac{f(x, y)}{f_1(x)} = \frac{x \, e^{-x(1 + y)}}{e^{-x}} = x \, e^{-xy}, \qquad 0 < y < \infty. $$


The conditional mean of Y given that X = x is

$$ E(Y \mid X = x) = \int_0^{\infty} y \, x \, e^{-xy} \, dy = \frac{1}{x} \int_0^{\infty} z \, e^{-z} \, dz = \frac{1}{x} \qquad (\text{where } z = xy). $$

Thus, the regression function (or equation) of Y on X is given by

$$ E(Y \mid X = x) = \frac{1}{x}, \qquad 0 < x < \infty. $$
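The regression curve 1/x can be checked by simulation, since under this density X ~ EXP(1) and, given X = x, Y is exponential with rate x (mean 1/x):

```python
# Simulation of Example 9.9: the regression function is E(Y | x) = 1/x.
import numpy as np

rng = np.random.default_rng(7)
n = 2_000_000
x = rng.exponential(1.0, n)           # marginal f1(x) = e^{-x}
y = rng.exponential(1.0 / x)          # Y | X = x has mean 1/x
sel = np.abs(x - 2.0) < 0.01          # condition on X near x = 2
print(y[sel].mean())                  # ~ 0.5 = 1/x at x = 2
```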

Definition 9.5. Let X and Y be two random variables with joint probability density function f(x, y), and let E(Y|X = x) be the regression function of Y on X. If this regression function is linear in x, then E(Y|X = x) is called a linear regression of Y on X. Otherwise, it is called a nonlinear regression of Y on X.

Example 9.10. Given the regression lines E(Y|X = x) = x + 2 and E(X|Y = y) = 1 + (1/2)y, what is the expected value of X?

Answer: Since the conditional expectation E(Y|X = x) is linear in x, we get

$$ \mu_Y + \rho \, \frac{\sigma_Y}{\sigma_X} \, (x - \mu_X) = x + 2. $$

Hence, equating the coefficients of the x terms and the constant terms, we get

$$ \rho \, \frac{\sigma_Y}{\sigma_X} = 1 \tag{9.8} $$

and

$$ \mu_Y - \rho \, \frac{\sigma_Y}{\sigma_X} \, \mu_X = 2, \tag{9.9} $$


respectively. Now, using (9.8) in (9.9), we get

$$ \mu_Y - \mu_X = 2. \tag{9.10} $$

Similarly, since E(X|Y = y) is linear in y, we get

$$ \rho \, \frac{\sigma_X}{\sigma_Y} = \frac{1}{2} \tag{9.11} $$

and

$$ \mu_X - \rho \, \frac{\sigma_X}{\sigma_Y} \, \mu_Y = 1. \tag{9.12} $$

Using (9.11) in (9.12) and simplifying, we get

$$ 2\mu_X - \mu_Y = 2. \tag{9.13} $$

Now adding (9.13) to (9.10), we see that μ_X = 4.
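The last step is just a 2×2 linear system, which can be verified mechanically:

```python
# Sanity check of Example 9.10: solve mu_Y - mu_X = 2 and 2 mu_X - mu_Y = 2.
import numpy as np

A = np.array([[-1.0,  1.0],           # -mu_X + mu_Y = 2
              [ 2.0, -1.0]])          # 2 mu_X - mu_Y = 2
b = np.array([2.0, 2.0])
mu_x, mu_y = np.linalg.solve(A, b)
print(mu_x, mu_y)                     # mu_X = 4, mu_Y = 6
```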

Remark 9.4. In statistics, a linear regression usually means that the conditional expectation E(Y|x) is linear in the parameters, but not necessarily in x. Therefore, E(Y|x) = α + θx² is a linear model, whereas E(Y|x) = α x^θ is not a linear regression model.

Definition 9.6. Let X and Y be two random variables with joint probability density function f(x, y), and let h(y|x) be the conditional density of Y given X = x. Then the conditional variance

$$ \mathrm{Var}(Y \mid X = x) = \int_{-\infty}^{\infty} y^2 \, h(y \mid x) \, dy - \left( \int_{-\infty}^{\infty} y \, h(y \mid x) \, dy \right)^2 $$

is called the scedastic function of Y on X. The graph of this scedastic function of Y on X is known as the scedastic curve of Y on X.

  Scedastic curves and regression curves are used for constructing families of bivariate probability density functions with specified marginals.


  9.4. Review Exercises

1. Given the regression lines E(Y|X = x) = x + 2 and E(X|Y = y) = 1 + (1/2)y, what is the expected value of Y?

2. If the joint density of X and Y is

−1 < x < 1; x < 2

where k is a constant, what is E(Y|X = x)?

3. Suppose the joint density of X and Y is defined by

$$ f(x, y) = \begin{cases} 10 \, x y^2 & \text{if } 0 < x < y < 1 \\ 0 & \text{elsewhere.} \end{cases} $$

What is E(X² | Y = y)?

4. Let X and Y have joint density function

$$ f(x, y) = \begin{cases} 2 e^{-2(x + y)} & \text{if } 0 < \dots \\ 0 & \text{elsewhere.} \end{cases} $$

What is the expected value of Y, given X = x, for x > 0?

5. Let X and Y have joint density function

What is the regression curve of Y on X, that is, E(Y | X = x)?

6. Suppose X and Y are random variables with means μ_X and μ_Y, respectively, and E(Y|X = x) = (1/3)x + 10 and E(X|Y = y) = (3/4)y + 2. What are the values of μ_X and μ_Y?

  7. Let X and Y have joint density

  What is the conditional expectation of X given Y = y ?


  8. Let X and Y have joint density

  What is the conditional expectation of Y given X = x ?

  9. Let X and Y have joint density

  What is the conditional expectation of X given Y = y ?

  10. Let X and Y have joint density

  What is the conditional expectation of Y given X = x ?

11. Let E(Y|X = x) = 2 + 5x and Var(Y|X = x) = 3, and let X have the density function

What are the mean and the variance of the random variable Y?

12. Let E(Y|X = x) = 2x and Var(Y|X = x) = 4x² + 3, and let X have the density function

  What is the variance of Y ?

  13. Let X and Y have joint density

  What is the conditional variance of Y given X = x ?


  14. Let X and Y have joint density

  What is the conditional variance of Y given X = x ?

  15. Let X and Y have joint density

What is the marginal density of Y? What is the conditional variance of X given Y = 3/2?

  16. Let X and Y have joint density

  What is the conditional variance of Y given X = 0.5 ?

17. Let the random variable W denote the number of students who take business calculus each semester at the University of Louisville. If the random variable W has a Poisson distribution with parameter equal to 300 and the probability of each student passing the course is 3/5, then on average how many students will pass business calculus?

18. If the conditional density of Y given X = x is given by

$$ f(y \mid x) = \begin{cases} \binom{5}{y} x^y (1 - x)^{5 - y} & \text{if } y = 0, 1, 2, \dots, 5 \\ 0 & \text{otherwise,} \end{cases} $$

and the marginal density of X is

$$ f_1(x) = \begin{cases} 4x^3 & \text{if } 0 < x < 1 \\ 0 & \text{otherwise,} \end{cases} $$

then what is the conditional expectation of Y given the event X = x?

19. If the joint density of the random variables X and Y is

$$ f(x, y) = \begin{cases} \frac{2 + (2x - 1)(2y - 1)}{2} & \text{if } 0 < x, y < 1 \\ 0 & \text{otherwise,} \end{cases} $$


  then what is the regression function of Y on X?

20. If the joint density of the random variables X and Y is

$$ f(x, y) = \begin{cases} e^{-\min\{x, y\}} \left[ 1 - e^{-(x + y)} \right] & \text{if } 0 < x, y < \infty \\ 0 & \text{otherwise,} \end{cases} $$

then what is the conditional expectation of Y given X = x?
