
Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20

Medicare Health Plan Choices of the Elderly: A
Choice-With-Screening Model
Qian Li & Pravin K. Trivedi
To cite this article: Qian Li & Pravin K. Trivedi (2012) Medicare Health Plan Choices of the
Elderly: A Choice-With-Screening Model, Journal of Business & Economic Statistics, 30:1, 81-93,
DOI: 10.1198/jbes.2011.0819
To link to this article: http://dx.doi.org/10.1198/jbes.2011.0819


Published online: 06 Apr 2012.




Supplementary materials for this article are available online. Please go to www.tandfonline.com/r/JBES

Medicare Health Plan Choices of the Elderly:
A Choice-With-Screening Model
Qian Li
United Biosource Corporation, Lexington, MA 02451 (qianli.econ@hotmail.com)

Pravin K. Trivedi
Department of Economics, Indiana University, Wylie Hall, Bloomington, IN 47405 (trivedi@indiana.edu)

With the expansion of Medicare, increasing attention has been paid to the behavior of elderly persons
in choosing health insurance. This article investigates how the elderly use plan attributes to screen their
Medicare health plans to simplify a complicated choice situation. The proposed model extends the conventional random utility models by considering a screening stage. Bayesian estimation is implemented,
and the results based on Medicare data show that the elderly are likely to screen according to premium,
prescription drug coverage, and vision coverage. These attributes have nonlinear effects on plan choice
that cannot be captured by conventional models. This article has supplementary material online.
KEY WORDS: Limited decision-making ability; Markov chain Monte Carlo; Two-stage discrete choice
model.

© 2012 American Statistical Association
Journal of Business & Economic Statistics
January 2012, Vol. 30, No. 1
DOI: 10.1198/jbes.2011.08191

1. INTRODUCTION

Since the inception of the Medicare program in 1965, Medicare reforms have encouraged the participation of various types
of private plans to provide elderly persons with more avenues
to obtain benefits beyond the basic fee-for-service (FFS) Medicare and its supplements. Debates about the efficiency of these
reforms have been ongoing for a long time. The main critiques
point to the complexity related to the vast array of health plans
available. But advocates believe that the large selection helps
people find the plan that best fits their needs. The recent debates
on Medicare Part D, which established a voluntary prescription
drug benefit and became effective in 2006, also have similar
features (Duggan, Healy, and Morton 2008).
We cannot evaluate the pros and cons of these policies without fully understanding how the elderly choose health plans.
Within the framework of the standard random utility models
(RUMs), agents compare and evaluate all options and then
choose the best; thus they benefit from expansion of the choice
set. Many previous studies have analyzed Medicare health plan
choices in the RUM framework (e.g., Dowd, Feldman, and
Coulam 2003; Atherly, Dowd, and Feldman 2004). But this
framework has limitations when plans are complex and many
plans are available. Studies have found that many Medicare
beneficiaries are confused and find it difficult to compare features of available plans (Atherly 2001). Elderly persons also are
likely to make suboptimal choices. For example, enrollment in
Medicare health plans has been lower than expected, despite
the variety and generosity of the program (Frank 2004). In addition, a large proportion of the elderly selected the Medicare
Part D plans that did not minimize their expected out-of-pocket
drug expenses (McFadden 2006), and most felt that too many
alternative plans had been offered (Heiss 2006).
The suboptimality of plan choice can be explained using
concepts from the behavioral economics literature (McFadden
2005, 2006; Mullainathan and Thaler 2001; Loewenstein 1999).
The complexity of the choice situation comes with a cost, often
referred to as information overload (e.g., Eppler and Mengis
2003). Decision rules may be implemented by agents to restrict the choice set and thereby reduce information overload
(Tversky 1972; Tversky and Kahneman 1979; Johnson et al.
1993). The Centers for Medicare and Medicaid Services (CMS)
uses attribute screening to help the elderly choose health plans.
The Medicare website (https://www.medicare.gov/find-a-plan/questions/home.aspx, as accessed in 2008) provides an online
“Health Plan Finder” tool that elderly persons can use to search
for and compare health plans serving their residential area. The
tool has two parts. For all available plans, the first part provides
information on premiums, network restrictions, and whether the
plan covers prescription drugs, vision services, dental services,
and physical exam. The second part describes detailed plan benefits for the plans selected from the first part.
In analyses of health plan choice, a number of studies have
considered psychological factors as part of the plan evaluation in the standard RUM framework (e.g., Frank 2004; Buchmueller 1997; Strombom 2002; Frank and Lamirand 2007). In
this article, we analyze health plan choice as a process consistent with the use of simple decision rules to reduce the choice
set.
Studies in marketing science have analyzed consumers’
choice in a two-stage decision process, in which consumers first
screen all of the available alternatives and then choose among
the alternatives that passed the screening. Roberts and Lattin
(1997) and Gilbride and Allenby (2004) have reviewed this literature. The reduced choice set in the second stage of the decision process is usually unobservable to researchers. Following
Gilbride and Allenby (2004), the probability of choosing alternative j can be modeled as
$\mathrm{pr}(j) = \sum_{C} \mathrm{pr}(j \mid C) \times \mathrm{pr}(C)$, where
C denotes the choice set in the second stage of the decision process. The two-stage model can relax some restrictive properties
of the standard RUMs. In the standard RUMs, offering more
options cannot increase the choice probability for the existing
alternatives (Rieskamp, Busemeyer, and Mellers 2006); however, in the two-stage model, more choice options may induce
the agents to screen, and then an existing alternative may have a
higher chance to be chosen, conditional on passing the screen.
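As a rough illustration of the two-stage probability pr(j) = Σ_C pr(j|C) × pr(C), the sketch below enumerates every possible second-stage choice set for a handful of alternatives. It is not the estimator used in this article: the function name `choice_probabilities`, the independent pass probabilities, and the logit shares standing in for pr(j|C) are all assumptions made for the example.

```python
import itertools
import math

def choice_probabilities(utilities, pass_probs):
    """Return pr(j) = sum over C of pr(j | C) * pr(C) for each alternative j.

    utilities  : fixed utility of each alternative (illustrative numbers)
    pass_probs : probability that each alternative survives screening;
                 1.0 marks a default option that is never screened out
    """
    n = len(utilities)
    probs = [0.0] * n
    # Enumerate every subset C of alternatives that could pass the screen.
    for mask in itertools.product([0, 1], repeat=n):
        # pr(C): alternatives pass or fail independently (an assumption).
        p_set = 1.0
        for keep, p in zip(mask, pass_probs):
            p_set *= p if keep else (1.0 - p)
        if p_set == 0.0 or not any(mask):
            continue  # impossible set, or empty set (excluded by a default option)
        # pr(j | C): logit shares within C, a stand-in for the second-stage model.
        weights = [math.exp(u) if keep else 0.0 for u, keep in zip(utilities, mask)]
        total = sum(weights)
        for j in range(n):
            probs[j] += p_set * weights[j] / total
    return probs

# Alternative 0 plays the role of a default plan that always passes.
probs = choice_probabilities(utilities=[0.0, 0.5, 1.0], pass_probs=[1.0, 0.6, 0.3])
print(probs)
```

The enumeration grows as 2^n, which is one way to see why evaluating all second-stage choice sets becomes impractical when many plans are offered.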
Screening rules considered in the literature fall into two
broad classes, compensatory rules and noncompensatory rules.
A compensatory rule applies when an alternative is included in
the second stage if its utility contribution exceeds the cognitive
cost (Hauser and Wernerfelt 1990). Search models that analyze
plan choice based on a comparison of costs and benefits of an
additional search are essentially compensatory (McCarthy and
Tchernis 2009). Noncompensatory rules rely on alternative attributes. An alternative passes the screen if all attributes (conjunctive rule) or at least one attribute (disjunctive rule) are satisfactory. Noncompensatory rules are more likely to be used by
agents because they require less effort to screen than compensatory rules, which is especially relevant in the choice of health
insurance, where many alternatives are offered and/or the alternatives have many attributes (Bettman, Luce, and Payne 1998;
Schram and Sonnemans 2011). The conjunctive rule has shown
the best empirical performance among several screening rules
(Gilbride and Allenby 2004).
The choice-with-screening model proposed in this study is
a two-stage discrete choice model with a conjunctive screening
rule, which is consistent with how the elderly choose their plans
through the “Health Plan Finder” tool. Specifically, it assumes
that the elderly screen health plans according to attributes, and
that a plan can pass the screening if and only if all of its attributes can pass the corresponding cutoffs, where the cutoff
values reflect the individual’s demand for the attributes. Conditional on the demographics, individual demand for each attribute is random and independent. After screening, the plan
choice in the second stage of the decision process is analyzed
using the “standard” random coefficient multinomial probit
(RCMNP) model, where the individual’s preference for plan
attributes also depends on the demographic data. The screening behavior cannot be observed, and no data are available to
indicate how the elderly value the attributes in the screening
stage. Moreover, because of the large dimension of plan offers,
estimating the probabilities of all possible second-stage choice
sets is impossible. These features make the Bayesian approach
computationally more attractive than maximum likelihood estimation.
The choice-with-screening model embodies additional nonlinearity relative to RCMNP, and thus comes with additional
parametric and functional form assumptions. This could potentially improve the fit to the data, but also requires that the
data contain sufficient variation to identify the model. Despite
the conceptual and practical difficulties in identification, the
choice-with-screening model provides a useful sensitivity analysis for the conclusions of the standard models. By using an
RCMNP model, a flexible version of RUMs, as a benchmark,
we set a high hurdle for the choice-with-screening model. Both
models are estimated by Bayesian methods with similar priors,
so the functional form is the main potential source of differences in identification between the two.
Our proposed screening model is an extension of the model
of Gilbride and Allenby (2004). First, we link the probability that a plan attribute will be used as a screen to the demand
for the attribute. This provides an economic explanation and
a flexible specification of screening. For example, it can be
assumed that when elderly people screen plans, they consider
each attribute independently, conditional on observable factors.
Moreover, we allow the demand for attributes and the resulting
screening to be determined jointly across attributes by some unobservable factors. To illustrate the concept of the choice-with-screening model, we assume independent demand for attributes.
We also incorporate demographic data to explain heterogeneity
in choice behaviors, and analyze revealed choices of Medicare
health plans rather than basing the analysis solely on product attributes from hypothetical situations, as done by Gilbride and
Allenby (2004).
To the best of our knowledge, this is the first empirical study
of screening behavior in health insurance choice, a setting in
which the existence of a plethora of plans with a multiplicity
of attributes makes the concept of information overload quite
plausible and makes the screening model a more plausible description than standard models (Schwartz 2004). Our screening
model for the choice of Medicare health plans is relevant to
modeling choices in other Medicare markets, such as Medicare
Part D plans.
2. MEDICARE HEALTH PLANS
The basic Medicare FFS program has two main parts. Part A
covers hospitalization and skilled nursing care. Most people are
automatically enrolled in Part A without a premium requirement when they turn 65. Disabled persons and those with end-stage renal disease (ESRD) may also be eligible. Part B, which
covers physician services and most outpatient care, is voluntary and requires a monthly premium ($54 in 2002). More than
90% of the Part A enrollees also buy Part B. Therefore, the basic Medicare FFS can be considered the default plan for the
elderly. More than 90% of Medicare beneficiaries obtain additional coverage from other resources, which can be categorized
as either plans supplementing the basic Medicare FFS (such as Medigap plans, employer-sponsored retiree plans, and Medicaid) or
Medicare health plans, which are replacements for the basic
Medicare FFS and its supplements.
Medicare health plans are provided by private insurers who
contract with the CMS annually. In exchange for a payment
from the CMS, the insurers agree to offer benefits to Medicare beneficiaries in a specified service area. The plan benefits
should at least be equal to those offered by the basic Medicare
FFS. The plans may charge a premium for extra benefits, but
only community rating is allowed. In the open enrollment period, any elderly person without ESRD can choose a Medicare
health plan serving the residential area to cover the next year’s
health care expenses. The enrollees also must enroll in Part B
and pay the Part B premium in addition to the premium charged
by the Medicare health plan.
3. MODEL SPECIFICATION
The decision process in the choice-with-screening model is
presented in Figure 1. Each individual follows a two-stage decision process to choose one health plan from the basic Medicare FFS and Medicare health plans. In the first stage, he or she
screens Medicare health plans according to some plan attributes
and follows a conjunctive screening rule, in which a plan can
pass the screening if and only if all of the attributes can pass
the screening. An attribute can pass the screening if its level
can satisfy the demand for it. After screening, the beneficiary
obtains a reduced choice set, from which he or she thoroughly
evaluates each plan and selects the one with the highest utility.
The plan utility depends on plan attributes and individual preference. Both the demand for the attribute in the first stage and
the preference for attribute in the second stage can be driven by
demographics. To control for interest in the status quo, elderly
persons are assumed not to screen the basic Medicare FFS. If all
Medicare health plans are screened out, then the basic Medicare
FFS will be the only plan in the second-stage choice set and will
be chosen for sure.

Figure 1. Two-stage decision process of elderly person. The solid
boxes represent observable data; the dashed boxes, unobservable information.

3.1 Choice-With-Screening Model
The choice-with-screening model is specified based on the
two-stage decision process described earlier. Suppose that
health plan choices are independent across years, given observable factors. Then, for beneficiary h in year t, the two-stage
decision process can be expressed as

$$
y_{htj} = 1 \quad \text{if} \quad \underbrace{z_{htj} = \arg\max z_{htk} \ \text{for all } k \ \text{s.t.} \ \underbrace{I(\tilde{x}_{htk}, \tilde{\gamma}_h) = 1}_{\text{the 1st stage}}}_{\text{the 2nd stage}}. \tag{3.1}
$$

The first stage of the decision process is the screening
stage, where the indicator function $I(\tilde{x}_{htk}, \tilde{\gamma}_h) = 1$ if plan k
for individual h in year t passes the screening and enters the
second-stage choice set. The variable $\tilde{x}_{htk}$ contains the attribute
information considered in screening, and the parameter $\tilde{\gamma}_h$ represents individual demand for attributes. The relationship between $\tilde{x}_{htk}$ and $\tilde{\gamma}_h$ defines the screening rule, which we discuss
in Section 3.2.
In the second stage of the decision process, variable $z_{htk}$ is
the latent utility of plan k for individual h in year t. The choice
indicator $y_{htj} = 1$ is observed if the jth plan passes the screening and gives the highest utility in the second-stage choice set,
which consists of all plans that passed the screening. This is analyzed using the RCMNP model with the following plan utility:

$$
z_{htj} = x'_{htj} \beta_h + \varepsilon_{htj}, \qquad \varepsilon_{htj} \sim \text{iid } N(0, 1), \tag{3.2}
$$

where $x_{htj}$ is a vector of plan attributes and $\beta_h$ is the individual preference parameter. The error term $\varepsilon_{htj}$ is independent
across individuals (h), years (t), and plans (j). Following Rossi,
McCulloch, and Allenby (1996), the heterogeneity of individual preference is imposed hierarchically by a linear regression
model,

$$
\beta'_h = D_h \times \Delta + e_h, \qquad e_h \sim \text{iid } N(0, V_\beta), \tag{3.3}
$$

where $D_h$ is a row vector containing a constant term and individual demographic information, $\Delta$ is the coefficient, $e_h$ is the
normal error term independent across individuals, and $V_\beta$ is the
variance–covariance matrix of the regression.
The RCMNP imposes a simple error term in the latent utility equation, but can analyze both the heterogeneity and the
posterior uncertainty of individual preference. The model also
can capture complicated correlation patterns between the alternatives. To see this, combine Equations (3.2) and (3.3):
$z_{htj} = x'_{htj} (D_h \times \Delta)' + x'_{htj} e'_h + \varepsilon_{htj} = x'_{htj} (D_h \times \Delta)' + u_{htj}$. Note that
there is no restriction imposed on $V_\beta$; then $\mathrm{cov}(u_{htj}, u_{htk}) = x_{htj} V_\beta x'_{htk}$
is generally not equal to 0, and $\mathrm{var}(u_{htj}) = 1 + x_{htj} V_\beta x'_{htj}$
reflects a heteroscedastic error term. Allenby and
Rossi (1999) examined this issue in more detail.
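The correlation and heteroscedasticity induced by the random coefficients can be checked numerically. The sketch below evaluates the quadratic forms in cov(u_htj, u_htk) and var(u_htj) for two plans. The helper `quad_form`, the attribute vectors, and the matrix `v_beta` are hypothetical values chosen only for illustration; they are not estimates from this article.

```python
def quad_form(x_j, v_beta, x_k):
    """Return the quadratic form of x_j and x_k in V_beta (plain lists)."""
    return sum(x_j[a] * v_beta[a][b] * x_k[b]
               for a in range(len(x_j)) for b in range(len(x_k)))

# Hypothetical 2 x 2 covariance matrix of the preference errors e_h.
v_beta = [[0.5, 0.1],
          [0.1, 0.2]]
# Hypothetical attribute vectors for two plans (two attributes each).
x_plan1 = [0.5, 2.0]
x_plan2 = [1.0, 0.0]

cov_12 = quad_form(x_plan1, v_beta, x_plan2)        # cov(u_1, u_2), nonzero in general
var_1 = 1.0 + quad_form(x_plan1, v_beta, x_plan1)   # var(u_1); the 1 is var(eps)
var_2 = 1.0 + quad_form(x_plan2, v_beta, x_plan2)   # var(u_2) differs from var(u_1)
print(cov_12, var_1, var_2)
```

Because the variances depend on each plan's own attributes, two plans generally carry different error variances even though the idiosyncratic error in Equation (3.2) has unit variance.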
3.2 Modeling the Screening Stage
Elderly persons are assumed to apply a conjunctive screening
rule, in which a plan cannot enter the second-stage choice set if
it has any attribute level that cannot satisfy the demand. In this
article, attributes analyzed in the first stage of the model are
defined as screening attributes, and they are possible screens.
Which attributes are screening attributes is an a priori assumption; however, the data determine which screening attribute has
a higher demand and thus is more likely to be used as a screen.
The conjunctive screening rule is formed by products of indicator functions, where each indicator function is a screening
on an attribute,

$$
I(\tilde{x}_{htj}, \tilde{\gamma}_h) = 1 \quad \text{iff} \quad \prod_m I(\tilde{x}_{htjm} > f_m(\tilde{\gamma}_{hm})) = 1, \tag{3.4}
$$

where x̃htjm is the mth screening attribute of plan j for individual h in year t, γ̃hm is the individual demand for this attribute,
and fm (·) is a nondecreasing mapping from the individual demand that defines a lower threshold or a cutoff value for the
mth screening attribute. The functional form of fm (·) depends
on the property of the screening attribute. The demand for the
attribute is a linear regression on demographic variables, with
error terms independent across individuals and attributes.
When the attribute has continuous values (e.g., premium),
fm (·) is assumed to be an identity function, so that the demand
level serves as the cutoff value,

$$
f_m^c(\tilde{\gamma}_{hm}^c) = \tilde{\gamma}_{hm}^c, \tag{3.5}
$$

where the superscript c means that the specification is applied
to continuous attributes. The demand is

$$
\tilde{\gamma}_{hm}^c = \tilde{D}'_h \alpha_m^c + u_{hm}^c, \qquad u_{hm}^c \sim N(0, \sigma_m^2), \tag{3.6}
$$

where $\tilde{D}_h$ contains a constant term and demographic variables,
$\alpha_m^c$ determines the impact of demographics on screening, and
$u_{hm}^c$ is the error term with variance $\sigma_m^2$.


An attribute also can take on discrete values (x̃htjm =
0, 1, 2, . . .). For example, the generosity of drug coverage can
be ranked in several levels. In this case, fm (·) needs to be a
nondecreasing step function for identification purposes. In this
study, the demand is assumed to have two levels: high demand
and low demand. For an individual with high demand, only
plans with positive attribute levels (x̃htjm = 1, 2, . . .) can enter
the second-stage choice set, but the generosity of the plan attribute has no effect on screening given positive attribute levels.
Thus the high demand can be interpreted as an aversion to a zero
attribute level. On the other hand, when the demand is low, any
attribute level (x̃htjm = 0, 1, 2, . . .) can pass the screening; that
is, the attribute is not used as a screen. Then the low demand
can be considered to represent the individual’s indifference to
the attribute in the screening stage.
For the case of high demand, any cutoff value between 0 and
1 (say 0.5) will ensure that only positive attribute levels pass
the screen. For the case of low demand, a negative cutoff value
(say −0.5) will ensure that no plan is screened out by this attribute. Then demand for the discrete screening attribute can be
specified as the following latent variable model:


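$$
f_m^d(\tilde{\gamma}_{hm}^d) =
\begin{cases}
-0.5 & \text{if } \tilde{\gamma}_{hm}^d \le 0 \\
0.5 & \text{if } \tilde{\gamma}_{hm}^d > 0,
\end{cases} \tag{3.7}
$$

$$
\tilde{\gamma}_{hm}^d = \tilde{D}'_h \alpha_m^d + u_{hm}^d, \qquad u_{hm}^d \sim N(0, 1), \tag{3.8}
$$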

where the superscript d means that the specification is applied
to discrete attributes.
In the screening, lower thresholds for attributes are defined.
Premium is a possible screen in the study of health plan choice.
But because of the budget constraint, the elderly may consider only plans that cost less than a certain price. So a negative premium should be the argument in the screening rule,
and thus the demand for negative premium (with a variance
of $\sigma^2_{-\mathrm{PREMIUM}}$) reflects the upper threshold of premium in the
screening. If all plan offers have a premium lower than the upper threshold, then the individual is not using premium to screen
plans. For a discrete screening attribute, the probability of being used as a screen is equal to $\mathrm{pr}[\tilde{\gamma}_{hm}^d > 0]$. Because demand
is stochastic, different individuals can use different screening
attributes or none of the screening attributes to screen. But the
basic Medicare FFS is assumed to always pass the screening,
which reflects people’s interest in the default plan and ensures
a nonempty second-stage choice set.
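The conjunctive rule of Equation (3.4), with the identity cutoff for continuous attributes and the step cutoff of Equation (3.7) for discrete attributes, can be sketched as follows. The function names, the attribute labels (`NEG_PREMIUM`, `DRUG`, `VISION`), the example plans, and the demand values are hypothetical and serve only to illustrate the mechanics.

```python
def cutoff_discrete(demand):
    """Step function of Eq. (3.7): high demand (> 0) screens out level 0."""
    return 0.5 if demand > 0 else -0.5

def passes_screen(plan, demand, continuous=("NEG_PREMIUM",)):
    """Conjunctive rule of Eq. (3.4): every screening attribute must exceed its cutoff."""
    for name, d in demand.items():
        cutoff = d if name in continuous else cutoff_discrete(d)
        if not plan[name] > cutoff:
            return False
    return True

def second_stage_set(plans, demand):
    """Indices of plans surviving the screen; index 0 (basic Medicare FFS) always passes."""
    return [i for i, plan in enumerate(plans) if i == 0 or passes_screen(plan, demand)]

# Hypothetical plans; premium enters negated so that a lower threshold caps the price.
plans = [
    {"NEG_PREMIUM": 0.0, "DRUG": 0, "VISION": 0},      # basic Medicare FFS
    {"NEG_PREMIUM": -30.0, "DRUG": 2, "VISION": 1},
    {"NEG_PREMIUM": -120.0, "DRUG": 2, "VISION": 2},
    {"NEG_PREMIUM": -20.0, "DRUG": 0, "VISION": 3},
]
# Hypothetical demand draws: premium cap of $80, high demand for drug coverage,
# low demand (indifference) for vision coverage.
demand = {"NEG_PREMIUM": -80.0, "DRUG": 0.7, "VISION": -0.4}
print(second_stage_set(plans, demand))  # -> [0, 1]
```

In this example the third plan fails on premium and the fourth fails on drug coverage, so only the default plan and one Medicare health plan reach the second stage.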
4. MARKOV CHAIN MONTE CARLO ESTIMATION
We adapted the Bayesian analysis of the two-stage model
proposed by Gilbride and Allenby (2004) to estimating the
choice-with-screening model. The estimation uses MCMC
methods in conjunction with a data augmentation approach
(Tanner and Wong 1987).
Suppressing the subscripts in Equations (3.1)–(3.8), the
structure of the screening model follows
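$$
\begin{aligned}
& y \mid z, I(\tilde{x}, \tilde{\gamma}), \\
& z \mid x, \beta, \\
& \beta \mid D, \Delta, V_\beta, \\
& \tilde{\gamma} \mid \tilde{D}, \alpha, \sigma^2,
\end{aligned}
$$

where the variance of the demand for screening attribute $\sigma^2$
equals 1 for discrete attributes. After imposing conjugate prior
distributions of the hyperparameters ($\Delta, V_\beta, \alpha, \sigma^2$), the posterior distribution of the hyperparameters can be simulated by
drawing iteratively from the conditional distributions,

$$
\begin{aligned}
& z \mid y, I(\tilde{x}, \tilde{\gamma}), x, \beta, && (4.1) \\
& \beta \mid z, \Delta, V_\beta, D, x, && (4.2) \\
& \Delta \mid \beta, V_\beta, D, && (4.3) \\
& V_\beta \mid \beta, \Delta, D, && (4.4) \\
& \tilde{\gamma} \mid z, \alpha, \sigma^2, y, \tilde{x}, \tilde{D}, && (4.5) \\
& \alpha \mid \tilde{\gamma}, \sigma^2, \tilde{D}, && (4.6) \\
& \sigma^2 \mid \tilde{\gamma}, \alpha, \tilde{D}. && (4.7)
\end{aligned}
$$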

The detailed estimation algorithm is provided in Appendix A,
and its performance is checked by a simulation study.
The conditional distribution of latent utility for plans in the
second-stage choice set (4.1) is a normal distribution truncated
from above by the latent utility of the chosen plan. If none of
the Medicare health plans pass the screening and only the basic
Medicare FFS is in the second-stage choice set, then the truncation does not apply. In addition, for the Medicare health plans
not in the second-stage choice set, latent utility follows a nontruncated normal distribution. It is the existence of screening
that leads to various latent utility distributions.
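The data augmentation step (4.1) can be sketched for a single individual-year as below. This is a minimal illustration, not the algorithm of Appendix A: the function names are hypothetical, the unit variance follows Equation (3.2), and the lower truncation applied to the chosen plan's utility is the usual probit data augmentation convention, assumed here because the text above describes only the truncation from above.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def draw_truncated(mean, lower=float("-inf"), upper=float("inf"), rng=random):
    """Draw from N(mean, 1) restricted to [lower, upper] by inverting the CDF."""
    p_lo = STD_NORMAL.cdf(lower - mean)
    p_hi = STD_NORMAL.cdf(upper - mean)
    u = p_lo + (p_hi - p_lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)   # keep inv_cdf away from 0 and 1
    draw = mean + STD_NORMAL.inv_cdf(u)
    return min(max(draw, lower), upper)   # guard against rounding in the tails

def update_latent_utilities(z, means, chosen, in_set, rng=random):
    """One pass of step (4.1); z is updated in place and returned.

    z      : current latent utilities, one per plan
    means  : systematic utility of each plan
    chosen : index of the observed plan choice
    in_set : True for plans in the second-stage choice set
    """
    for j in range(len(z)):
        if not in_set[j]:
            z[j] = draw_truncated(means[j], rng=rng)  # screened out: no truncation
        elif j != chosen:
            z[j] = draw_truncated(means[j], upper=z[chosen], rng=rng)
    rivals = [z[k] for k in range(len(z)) if in_set[k] and k != chosen]
    lower = max(rivals) if rivals else float("-inf")  # no rivals: no truncation
    z[chosen] = draw_truncated(means[chosen], lower=lower, rng=rng)
    return z
```

A call such as `update_latent_utilities([0.0] * 4, [0.2, 1.5, -0.3, 3.0], chosen=0, in_set=[True, True, True, False])` leaves the screened-out fourth plan free to take a utility above that of the chosen plan, which is the feature that distinguishes this step from the standard probit sampler.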
For individual h, the conditional distribution of the demand
for the mth screening attribute (4.5) has the expression

$$
p[\tilde{\gamma}_{hm} \mid \cdot] \propto \prod_t \Big\{ p\Big[ y_{ht} \,\Big|\, z_{ht}, \prod_m I(\tilde{x}_{htm} > f_m(\tilde{\gamma}_{hm})) \Big] \Big\} \times p[\tilde{\gamma}_{hm} \mid \alpha_m, \sigma_m^2],
$$

which is a normal distribution truncated on the support $\prod_t \{ p[y_{ht} \mid z_{ht}, \prod_m I(\tilde{x}_{htm} > f_m(\tilde{\gamma}_{hm}))] \} = 1$. It is not analytically tractable
and thus is simulated using the accept–reject (A–R) sampling
method, in which efficiency is easier to control compared with the Metropolis–Hastings (M–H) method (Chib and
Greenberg 1995). The indicator function $p[y_{ht} \mid z_{ht}, \prod_m I(\tilde{x}_{htm} > f_m(\tilde{\gamma}_{hm}))] = 1$
if $\tilde{\gamma}_{hm}$ leads to a second-stage choice set, where
the chosen plan is included and gives the highest latent utility.
Otherwise, the value of $\tilde{\gamma}_{hm}$ is not consistent with the observed
data or the augmented data. The inconsistency arises when random draws from $p[\tilde{\gamma}_{hm} \mid \alpha_m, \sigma_m^2]$ lead to a second-stage choice
set that does not include the chosen plan, or a second-stage
choice set that includes some plans whose utilities are drawn
from the nontruncated normal distribution in Equation (4.1)
and are higher than the utility of the chosen plan. In the model,
the chosen plan must have all of its attributes pass the screening, but its utility might not be the upper threshold for some
plan utilities in the simulation. This feature can detect impermissible values for γ̃hm for all of the years’ choices and ensures
convergence of the Bayesian estimation.
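The accept–reject step for the demand parameter can be sketched for a simplified case with one continuous screening attribute (negative premium) and one year of data. The function names, the prior values, and the example utilities are hypothetical; the actual sampler would apply the consistency check to all screening attributes and all years of an individual's choices.

```python
import random

def consistent(cutoff, neg_premiums, utilities, chosen):
    """Indicator in the conditional posterior: True if the cutoff yields a choice set
    that contains the chosen plan and in which the chosen plan has the highest
    latent utility. Plan 0 (basic Medicare FFS) always passes the screen."""
    in_set = [j == 0 or x > cutoff for j, x in enumerate(neg_premiums)]
    if not in_set[chosen]:
        return False
    return all(utilities[chosen] >= utilities[j]
               for j in range(len(utilities)) if in_set[j])

def draw_demand(prior_mean, prior_sd, neg_premiums, utilities, chosen,
                rng=random, max_tries=10_000):
    """Accept-reject: propose from N(prior_mean, prior_sd^2) and keep the first
    proposal that is consistent with the observed choice and latent utilities."""
    for _ in range(max_tries):
        proposal = rng.gauss(prior_mean, prior_sd)
        if consistent(proposal, neg_premiums, utilities, chosen):
            return proposal
    raise RuntimeError("no admissible draw found; check the inputs")

# Hypothetical data: plan 2 has a higher latent utility than the chosen plan 1,
# so an admissible cutoff must screen plan 2 out while keeping plan 1 in.
neg_premiums = [0.0, -30.0, -120.0]
utilities = [0.1, 0.8, 1.4]
gamma = draw_demand(-80.0, 40.0, neg_premiums, utilities, chosen=1,
                    rng=random.Random(1))
print(gamma)  # lies in [-120, -30)
```

Here the rejected proposals are exactly the two inconsistent cases described above: cutoffs that exclude the chosen plan, and cutoffs that admit a plan whose utility exceeds that of the chosen plan.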

5. DATA


Two datasets provided by the CMS are matched to construct the sample analyzed in this study. The Medicare Current
Beneficiary Survey (MCBS), a nationally representative sample of the Medicare beneficiaries, contains demographic and
plan choice information. The Medicare Health Plan Compare
(MHPC) dataset provides information on the Medicare health
plans offered to the elderly. A detailed explanation of the data
preparation is available on request.


5.1 Medicare Current Beneficiary Survey
The sample is selected from the 2002 Access to Care file in the
MCBS, which includes 15,142 individuals who completed
a personal interview. To focus on the Medicare health plan
choices, observations are dropped if individuals were not identified as “aged, no ESRD” by the CMS (n = 2543) or were
enrolled in any of the following plan types: Medicaid and other
public plans (n = 3459), employer-sponsored plans (n = 4926),
or Medigap plans (n = 4462). After deleting observations in
the foregoing categories, the sample contains 2866 observations of individuals who either had only the basic Medicare
FFS (n = 1282) or had a Medicare health plan (n = 1584).
Furthermore, observations are dropped if the individuals changed their
plan or residential area (n = 717). Some 52% of the remaining
sample did not purchase any Medicare health plan. These beneficiaries were asked if they had heard of Medicare health plans
and if they knew of the availability of such plans. A beneficiary
who did not know of the existence of Medicare health plans in
his or her choice set was excluded from the analysis. This led to
a sample of 1466 observations with 1026 Medicare health plan
enrollees.
5.2 Medicare Health Plan Compare
The 2002 MHPC provided plan attribute information for 567
plans (451 basic plans plus 116 optional packages) under 169
contracts signed between CMS and private insurers. The plan
information includes the premium and the cost-sharing arrangements for a total of 38 health care categories. To control the
number of parameters in the model, only seven attributes are
considered in the study, including monthly premium, prescription drug coverage, non-network doctor/hospital service coverage, vision coverage, dental coverage, routine physical exam
coverage, and doctor/specialist visit copayment. The health service coverage is aggregated into several levels to measure plan
generosity. The definition and measurement of the plan attributes are presented in Table 1.
Other variables are also used to describe the health plans.
A variable “ADD” is created to indicate plan generosity captured by attributes other than the seven basic attributes to control for the omitted plan attributes. Assuming that the Medicare
health plans are actuarially competitive, then among Medicare
health plans inferior to the other competing plans in all of the
seven basic attributes, the variable ADD is set to 1 for the plan
with the least inferior basic attributes and to 2 for the plan with
the second least inferior attributes, and so forth. The variable
is 0 for plans not inferior in basic attributes. Another variable
is the 2002 Adjusted Area Per Capita Cost (AAPCC) obtained
from the State County file released by the CMS. AAPCC is
the county-specific reimbursement rate for the Medicare health
plans. A higher AAPCC is associated with a more generous
plan in the area (Kaiser Family Foundation 2004). The number
of plans under the contract, an index of insurers’ market power,
is also used to describe the plans.

Table 1. Variables used to describe Medicare health plans

Variable name  Definition                                     Measurement

Plan attributes variables
PREMIUM        Monthly premium in addition to the             Dollar value
               Part B premium
COPAYDV        Average primary doctor and specialist          Dollar value
               visit co-payment
DRUG           Aggregated drug coverage                       0 = No drug coverage
                                                              1 = Only cover some generic drug expenses
                                                              2 = Cover some generic drug expenses and some brand drug expenses
NON_NET        Non-network doctor/hospital service coverage   0 = Not cover non-network doctor/hospital service
                                                              1 = Cover non-network doctor/hospital service
VISION         Aggregated vision coverage                     0 = Not cover non-Medicare-covered eye exam and eye wear
                                                              1 = Cover non-Medicare-covered eye exam and eye wear with no free service
                                                              2 = Cover non-Medicare-covered eye exam and eye wear with one free service
                                                              3 = Cover non-Medicare-covered eye exam and eye wear with all services free
DEN            Aggregated dental coverage                     0 = No dental coverage
                                                              1 = Some dental coverage but no free service
                                                              2 = 1 free service
                                                              3 = 2 free services
                                                              4 = 3 free services
                                                              5 = 4 free services
PX             Aggregated routine physical exam coverage      0 = No routine physical exam coverage
                                                              1 = Routine physical exam coverage with co-payment >$0
                                                              2 = Routine physical exam coverage with co-payment =$0

Other variables
ADD            Generosity in other plan attributes            Integer with higher value for higher generosity
AAPCC          Medicare capitation payment to the risk plans  Dollar value
NPAK           Number of packages under the contract          Integer


5.3 Matching the Medicare Current Beneficiary Survey
to Medicare Health Plan Compare
Data from MCBS are matched to MHPC by county and zip
code. Among the 1466 elderly persons selected from MCBS,
433 were excluded because plan information on their residential area is not available in MHPC. For each Medicare health
plan enrollee, MCBS reports under which contract the plan was
issued. This contract was matched to the plan contracts serving
the enrollee’s residential area provided by MHPC. A successful
match was achieved for all but 66 of the remaining 723 Medicare health plan enrollees, resulting in the final sample of 967
elderly persons with 657 Medicare health plan enrollees.
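The sample counts reported in Sections 5.1 and 5.3 can be reconciled with a few lines of arithmetic. The variable names below are hypothetical, and the values marked as derived (2149, 1033, and 310) are implied by the reported counts rather than stated in the text.

```python
# Counts reported in Section 5.1.
ffs_only, mhp_enrollees = 1282, 1584
after_plan_type_filter = ffs_only + mhp_enrollees           # 2866, as reported
after_mover_filter = after_plan_type_filter - 717           # 2149 (derived)

# Counts reported at the end of Section 5.1 and in Section 5.3.
aware_sample = 1466
with_plan_information = aware_sample - 433                  # 1033 (derived)
final_sample = with_plan_information - 66                   # 967, as reported
final_enrollees = 723 - 66                                  # 657, as reported
final_ffs_only = final_sample - final_enrollees             # 310 (derived)

print(after_plan_type_filter, final_sample, final_enrollees, final_ffs_only)
```

The reported totals are internally consistent: removing the 433 persons without plan information and the 66 unmatched enrollees from the 1466 eligible persons gives the final sample of 967.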

MCBS only identifies the contract serving the Medicare
health plan enrollee, and does not specify the plan chosen by
the enrollee. Because the insurer is allowed to provide multiple plans under each contract, strategies are needed to impute
the specific plan choice for the Medicare health plan enrollees.
Previous work (e.g., Atherly, Dowd, and Feldman 2004; Brand
2005) has assumed that plan enrollees choose the cheapest plan
under the chosen contract. The strategy used in this study first
tries to match the beneficiary’s description of a plan provided
by MCBS to the plan information in MHPC, and if the plan
choice is still not specified, then assumes that the plan with the
most generous basic attributes is the choice.
The size of the final Medicare health plan choice set for the
967 elderly persons ranged from 1 to 60 (mean 9.86 ± 9.11).
Stacking all of the choice sets, a total of 9539 Medicare health
plans were offered to the sample. The descriptive statistics of
the plans offered and the plans chosen by the sample are presented in Table 2. The table also lists the variable values for the
basic Medicare FFS. Note that the variable PREMIUM is the
premium of the Medicare health plan, which is additional to the
Part B premium.
The beneficiary’s age, sex, income, education level, and
self-rated health status are expected to affect the plan choice.
Atherly, Dowd, and Feldman (2004) showed that the number of
chronic diseases affects an individual’s desire for drug coverage. They also showed that elderly persons with diabetes systematically choose Medicare health plans with vision coverage, given that diabetes is a leading cause of blindness in the United
States. Therefore, the model considers the number of chronic
diseases (including bone problems, heart problems, and cancer)
and an indicator for diabetes. Demographic data for the sample
are presented in Table 3.

Table 2. Descriptive statistics of plan variables

                 Medicare health plans offered to the sample      Medicare health plans chosen by the sample       Basic Medicare FFS
Variable name    Min      Max      Mean     Std.     Median       Min      Max      Mean     Std.     Median       Value
PREMIUM          0        189      50.65    43.16    50           0        189      36.10    39.51    25           0
COPAYDV          0        55       15.37    6.33     15           0        37.5     15.16    5.85     15           25
ADD              0        6        0.45     0.79     0            0        5        0.41     0.73     0            0
AAPCC            423.59   834.20   623.25   100.64   569.50       500.37   834.20   596.65   77.55    557.62       0
NPAK             1        53       8.57     9.25     6            1        53       5        5.91     3            0

                                Offered to the sample     Chosen by the sample      Basic Medicare FFS
Variable name    Measurement    Percent     Median        Percent     Median        Value
DRUG             0              28.87       1             23.44       1             0
                 1              28.78                     30.90
                 2              42.35                     45.66
NON_NET          0              87.21       0             92.24       0             1
                 1              12.79                     7.76
VISION           0              20.92       2             19.94       2             0
                 1              11.74                     12.63
                 2              62.14                     60.58
                 3              5.19                      6.85
DEN              0              68.02       0             82.04       0             0
                 1              9.71                      5.18
                 2              15.17                     5.63
                 3              0.46                      2.13
                 4              5.18                      2.89
                 5              1.47                      2.13
PX               0              6.09        1             0.61        1             0
                 1              70.79                     82.95
                 2              23.12                     16.44
N                               9539                      657

Table 3. Descriptive statistics of sample demographics

Variable name    Min    Max    Mean     Std.    Median
AGE              65     98     75.64    6.65    75

Variable name                 Measurement                                  Percentage    Median
GENDER                        1 = Male                                     45.81         2
                              2 = Female                                   54.19
INCOME                        1 = $5000 or less                            2.59          4
(income of the sample         2 = $5001–$10,000                            9.00
person and the spouse)        3 = $10,001–$15,000                          21.61
                              4 = $15,001–$20,000                          17.17
                              5 = $20,001–$25,000                          12.72
                              6 = $25,001–$30,000                          10.86
                              7 = $30,001–$35,000                          7.86
                              8 = $35,001–$40,000                          4.96
                              9 = $40,001–$45,000                          2.07
                              10 = $45,001–$50,000                         3.21
                              11 = $50,001 or more                         7.96
EDUCATION                     1 = No schooling                             0.62          4
                              2 = Nursery to 8th grade                     12.93
                              3 = 9th to 12th grade, but no diploma        18.41
                              4 = High school graduate                     31.95
                              5 = Vocational, technical, business, etc.    6.31
                              6 = Some college, but no degree              16.03
                              7 = Associate’s degree                       3.41
                              8 = Bachelor’s degree                        5.89
                              9 = Post graduate degree                     4.45
HEALTH                        1 = Excellent                                17.89         3
(self-rated health status)    2 = Very good                                30.20
                              3 = Good                                     32.57
                              4 = Fair                                     13.96
                              5 = Poor                                     5.38
NCHRONIC                      0                                            10.75         2
(# of chronic diseases)       1                                            30.92
                              2                                            40.85
                              3                                            17.48
DIABETES                      0 = No problem                               79.63         0
                              1 = Some problem                             20.37

N = 967
5.4 Specification of Variables
Here we apply the choice-with-screening model described by
Equations (3.1)–(3.8) to the dataset constructed earlier. Vector x
in Equation (3.2) contains all of the variables in Table 1 to describe plans in the second-stage choice set. The screening attributes x̃ in Equation (3.4) are assumed to be negative monthly
premium, prescription drug coverage, non-network service coverage, vision coverage, dental coverage, and routine physical
exam coverage. These are the plan features listed in the first part
of the Health Plan Finder on the Medicare website (as accessed
in 2008). Only the negative monthly premium is a continuous
screening attribute.

Vector D in Equation (3.3) contains the demographic variables in Table 3 to explain the individual preference in the second stage of the decision process. Demeaned demographic values are used, so that the coefficient of the constant term in D
is equal to the expected individual preference evaluated at the
average demographic values in the sample. This can be shown
by rewriting Equation (3.3) as

\[
\begin{aligned}
\beta_h &= [1, d_h^{*}] \times \begin{bmatrix} \Delta_1^{*} \\ \Delta_2^{*} \end{bmatrix} + e_h \\
        &= (\Delta_1^{*} + \bar{d}_h^{*} \times \Delta_2^{*}) + (d_h^{*} - \bar{d}_h^{*}) \times \Delta_2^{*} + e_h \\
        &\equiv \Delta_1 + d_h \times \Delta_2 + e_h ,
\end{aligned}
\]

where $d_h^{*}$ are the original demographic variables, $\bar{d}_h^{*}$ are the
sample means of $d_h^{*}$, and $d_h$ are the demeaned values. The vector $\tilde{D}_h$ is assumed to contain only a constant term at this point,
because a larger sample size is needed to identify more parameters in this vector. Then the coefficients in Equations (3.6) and
(3.8) represent the mean demand for screening attributes. With
this specification, the probability for a discrete attribute being
used as a screen is $\Pr[\tilde{\gamma}^{d}_{hm} > 0] = \Phi(\alpha^{d}_{m})$, where $\Phi(\cdot)$ is the
standard normal cumulative distribution function.

Table 4. Coefficient estimates for the first stage in the choice-with-screening model

Posterior mean (std.) of α

                           Negative      Prescription   Non-network   Vision      Dental      Routine exam
                           monthly       drug           coverage      coverage    coverage    coverage
                           premium       coverage
Constant                   −92.59        −0.47          −7.59         −1.04       −13.74      1.39
                           (−2.98)       (−0.11)        (−1.55)       (−0.21)     (−1.58)     (−0.29)
Prob. of used as screen    50% of        0.32           0.00          0.15        0.00        0.91
                           the sample    (0.04)         (0.00)        (0.05)      (0.00)      (0.05)

NOTE: Bold type indicates that more than 90% of the posterior draws have the same sign as the posterior mean.
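The demeaning argument in Section 5.4 can be checked numerically. The sketch below uses synthetic demographics and arbitrary coefficient values (none of them from the paper) to confirm that the original and demeaned parameterizations give the same expected preference, and that the demeaned intercept equals the expected preference at the sample-average demographics.

```python
import numpy as np

rng = np.random.default_rng(0)
n, q, p = 200, 3, 4                       # persons, demographics, plan attributes (illustrative)
d_star = rng.normal(size=(n, q)) + 5.0    # original (non-demeaned) demographics
delta1_star = rng.normal(size=(1, p))     # intercept row, original parameterization
delta2_star = rng.normal(size=(q, p))     # slopes on the demographics

d_bar = d_star.mean(axis=0, keepdims=True)
d = d_star - d_bar                        # demeaned demographics

# Original parameterization: [1, d*_h] x [Delta1*; Delta2*]
mean_pref_original = delta1_star + d_star @ delta2_star

# Demeaned parameterization: Delta1 + d_h x Delta2
delta1 = delta1_star + d_bar @ delta2_star
delta2 = delta2_star
mean_pref_demeaned = delta1 + d @ delta2

same_preferences = np.allclose(mean_pref_original, mean_pref_demeaned)
intercept_is_average = np.allclose(delta1, mean_pref_original.mean(axis=0))
print(same_preferences, intercept_is_average)
```

Only the intercept changes under demeaning; the slopes on the demographic variables are the same in both parameterizations.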

6. ESTIMATION RESULTS
A total of 50,000 draws are generated according to the
MCMC algorithm in Appendix A. The first 40,000 iterations
are discarded, and every twentieth draw is collected to estimate
the parameters. Tables 4 and 5 report data on the posterior distribution of the hyperparameters α and Δ, respectively. To save
space, the tables do not report estimation results for $V_\beta$ and
$\sigma^2_{-\mathrm{PREMIUM}}$. The posterior distributions are described by posterior mean and standard deviation. The tables also indicate the
consistency between the signs of the posterior draws and the
sign of the posterior mean. If more than 90% of the posterior
draws have the same sign as the posterior mean, then we can
say with some confidence that the posterior mean has a nonzero
value. The plots of the posterior draws show that the estimation
converges, albeit somewhat slowly.
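The burn-in, thinning, and sign-consistency summary described above can be illustrated on a stand-in chain. The draws below are synthetic, and starting the thinning at the first post-burn-in draw is an assumption; the paper's sampler is the one in Appendix A.

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-in chain: 50,000 draws of one scalar parameter (synthetic, not the paper's output).
chain = rng.normal(loc=-0.3, scale=0.1, size=50_000)

burn_in, thin = 40_000, 20
kept = chain[burn_in::thin]               # drop the first 40,000, keep every 20th thereafter

post_mean = kept.mean()
post_std = kept.std(ddof=1)

# Share of retained draws whose sign agrees with the sign of the posterior mean.
same_sign_share = np.mean(np.sign(kept) == np.sign(post_mean))
flagged = same_sign_share > 0.90          # criterion used for bold type in the tables

print(len(kept), flagged)
```

With 50,000 iterations, a burn-in of 40,000, and a thinning interval of 20, 500 draws are retained for each parameter.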

6.1 Coefficient Estimates for the First Stage
of the Model
The posterior distribution of α, the mean demand for screening attributes, is given in the first row of Table 4, and the corresponding probability for each attribute to be used as a screen is
presented in the second row. Note that the result applies only to
the Medicare health plans, because the elderly are assumed to
not screen the basic Medicare FFS.
The mean demand for the continuous screening attribute (i.e.,
the negative monthly premium) reflects the expected cutoff
value. The posterior mean shows that on average, the elderly
only consider Medicare health plans with a monthly premium
below $93. According to this estimate, the premium screening
will apply for 50% of the elderly persons in the sample.
For the aggregated health service coverage, the mean demand needs to be converted through the step function (3.7)
to reflect the screening behavior. The screening probability,
$\Pr[\tilde{\gamma}^{d}_{hm} > 0] = \Phi(\alpha^{d}_{m})$, is evaluated at each posterior draw of
$\alpha^{d}_{m}$. It shows that the elderly tend to use the drug coverage and
vision coverage as screens. Drug coverage screening is twice
as likely to be used as vision coverage screening. However,
the screening probability is near zero for the coverage of non-network service and dental service. Among the discrete screening attributes, the routine physical exam coverage has the highest mean demand; this is not surprising, given that 94% of the
Medicare health plans cover some routine exam expenses, as
shown in Table 2.

Table 5. Coefficient estimates for the second stage in the choice-with-screening model

Posterior mean (std.) of Δ

             PREMIUM   DRUG      NON_NET   VISION    DEN       PX        COPAYDV   ADD       NPAK      AAPCC
Constant     −0.02     −0.01     −13.85    0.03      −5.53     −5.47     −0.30     0.18      0.81      0.02
             (−0.02)   (−0.37)   (−1.38)   (−0.71)   (−0.92)   (−0.63)   (−0.05)   (−0.22)   (−0.16)   (0.00)
AGE          −0.01     0.04      −0.17     −0.29     −0.17     −0.30     0.00      0.07      0.01      0.00
             (0.00)    (−0.07)   (−0.19)   (−0.12)   (−0.11)   (−0.08)   (−0.01)   (−0.04)   (−0.03)   (0.00)
GENDER       −0.02     0.73      −10.88    1.88      −0.30     0.90      0.38      −0.75     −0.14     0.001
             (−0.03)   (−0.60)   (−2.50)   (−1.35)   (−1.46)   (−1.08)   (−0.13)   (−0.53)   (−0.30)   (−0.01)
INCOME       0.01      −0.34     −0.28     −0.43     −0.31     −0.21     −0.05     0.27      0.01      0.00
             (−0.01)   (−0.14)   (−0.64)   (−0.26)   (−0.27)   (−0.25)   (−0.03)   (−0.11)   (−0.05)   (0.00)
EDUCATION    0.00      0.47      −0.07     0.19      0.72      −0.47     −0.03     −0.16     −0.07     0.00
             (−0.01)   (−0.17)   (−0.61)   (−0.35)   (−0.49)   (−0.44)   (−0.03)   (−0.17)   (−0.08)   (0.00)
HEALTH       −0.02     −0.14     1.65      −1.23     −0.62     −0.24     0.09      −0.65     0.09      0.00
             (−0.01)   (−0.33)   (−1.12)   (−0.78)   (−0.92)   (−0.85)   (−0.06)   (−0.28)   (−0.13)   (0.00)
NCHRONIC     0.07      0.14      −0.51     1.26      1.68      0.15      −0.20     0.41      −0.03     −0.01
             (−0.02)   (−0.48)   (−1.71)   (−1.01)   (−0.85)   (−0.73)   (−0.07)   (−0.28)   (−0.13)   (0.00)
DIABETES     −0.01     −0.57     2.88      2.59      1.85      5.92      0.34      0.62      −0.58     −0.01
             (−0.04)   (−1.06)   (−2.36)   (−1.50)   (−1.68)   (−1.25)   (−0.14)   (−0.75)   (−0.27)   (−0.01)

NOTE: Bold type indicates that more than 90% of the posterior draws have the same sign as the posterior mean.
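The conversion from mean demand to screening probability can be illustrated with the posterior means of α reported in Table 4. In the sketch below, evaluating Φ at the posterior mean is only an approximation to averaging Φ over posterior draws. The per-draw evaluation is mimicked with synthetic normal draws, which is an assumption made here: the actual posterior draws are not available, and the standard deviation is taken as the magnitude of the value printed in the table.

```python
import math
import random


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF, Phi(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Posterior means of alpha for the discrete screening attributes (Table 4, first row).
alpha_mean = {"drug": -0.47, "non_network": -7.59, "vision": -1.04,
              "dental": -13.74, "routine_exam": 1.39}
prob_at_mean = {name: std_normal_cdf(a) for name, a in alpha_mean.items()}


def screening_prob_summary(mean: float, std: float, n_draws: int = 500, seed: int = 0):
    """Mean and std. of Phi(alpha) over synthetic draws alpha ~ N(mean, std^2)."""
    rng = random.Random(seed)
    probs = [std_normal_cdf(rng.gauss(mean, std)) for _ in range(n_draws)]
    m = sum(probs) / n_draws
    s = math.sqrt(sum((p - m) ** 2 for p in probs) / (n_draws - 1))
    return m, s


drug_mean, drug_std = screening_prob_summary(-0.47, 0.11)
```

Under these assumptions the drug-coverage probability comes out close to the reported 0.32, and the probabilities for non-network and dental coverage are numerically zero.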


6.2 Coefficient Estimates for the Second Stage
of the Model
Estimates of the coefficient Δ are reported in Table 5. The table shows how demographic data affect an individual’s preference for plan attributes after screening. The second-stage choice
set includes the basic Medicare FFS along with Medicare health
plans that passed the screening.
The first row of Table 5 presents the estimate corresponding to the constant term in Dh , which represents the average
expected preference in the sample. It can be seen that after
screening, on average the elderly prefer a plan with network
restriction (NON_NET = 0), which is a traditional feature of
managed care plans. Surprisingly, in the second stage of the decision process, on average the elderly prefer a plan with less
generous dental coverage or less generous physical exam coverage. As expected, a lower copayment for the doctor/specialist
visit is preferred. Also, if an insurer were allowed to sell more
Medicare health plans or to obtain more AAPCC compensation,
then its plan would have a greater chance of being chosen.
The significant negative average tastes for non-network coverage and dental coverage can explain why the demand for them
is very low in the screening stage. On the other hand, once
the plans survive the screening, the elderly on average will not
care much about premium, drug coverage, and vision coverage.
This demonstrates that a plan attribute used to screen Medicare
health plans will play only a small role in the second stage of
the decision process. This finding is consistent with the study of
Gilbride and Allenby (2004). The average taste for the routine
physical exam coverage is negative and large, even though demand for this coverage is very high in the screening, indicating
how the lack of plans without this coverage affects results in the
two-stage parametric functional form.
The impact of demographic heterogeneity on individual preference, relative to the average preference, is described by the
second through the last rows of Table 5. Elderly persons with higher
educational levels have a greater preference for drug coverage. The reason for this, as explained in Atherly, Dowd, and
Feldman (2004), is that some Medicare health plans have drug
coverage underpriced relative to the actuarial value, and more-educated people can understand this better. Elderly persons
with poorer self-rated health (i.e., higher HEALTH level) or
with diabetes favor the non-network coverage more than the
overall Medicare population, consistent with favorable selection into the managed care plans. As expected, elderly persons
with diabetes have a strong desire for the vision coverage. They
also have less aversion for the routine physical exam coverage
than the overall Medicare population. For the doctor/specialist
copayment, elderly persons with more chronic diseases are
more sensitive, but those with poor self-rated health or diabetes
are less sensitive. The benefits in other plan attributes, as measured by the variable ADD, are more attractive to elderly persons who are older and richer and have better self-rated health
or more chronic diseases.


6.3 Comparison With the Conventional Random
Coefficient Multinomial Probit Model
We compared the choice-with-screening model with the conventional RCMNP model. The conventional RCMNP model
does not consider the screening rules (Walker and Ben-Akiva
2002); that is, it assumes that the alternatives will always pass
the screening. Thus this model can be considered as a screening
model with the restriction that no attributes are used to screen,
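and it can be expressed by Equations (3.1)–(3.3), with the restriction that I(·) = 1 always holds. Specifically, for beneficiary
h, in year t, the choice of health plan j can be expressed as

\[
\begin{aligned}
y_{htj} &= 1 \quad \text{if } z_{htj} > z_{htk} \text{ for all } k \neq j, \\
z_{htj} &= x'_{htj} \beta_h + \varepsilon_{htj}, \qquad \varepsilon_{htj} \sim \text{iid } N(0, 1), \\
\beta'_h &= D_h \times \Delta + e_h, \qquad e_h \sim \text{iid } N(0, V_\beta),
\end{aligned}
\]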

where the variables and parameters have the same interpretation as in the screening model. Estimation follows steps 1–4 in
the algorithm presented in Appendix A, with I(·) set to unity.
Again, with diffuse priors, 50,000 posterior draws are generated, and every twentieth draw after the 40,000th iteration is
used in the estimation. Based on the final sample, the results
of estimation of the coefficient Δ in the conventional RCMNP
model are presented in Table 6.
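The data-generating process of the conventional RCMNP model can be sketched by simulating one choice occasion. The dimensions and the values of Δ and V_β below are hypothetical placeholders, not estimates from Table 6, and the sketch draws the random coefficient once for a single beneficiary-year rather than reproducing the panel structure.

```python
import numpy as np

rng = np.random.default_rng(2)
n_attr, n_demo, n_plans = 3, 2, 5                 # illustrative sizes
Delta = rng.normal(size=(1 + n_demo, n_attr))     # hypothetical hyperparameter values
V_beta = 0.25 * np.eye(n_attr)                    # hypothetical covariance of e_h


def simulate_choice(D_h: np.ndarray, X: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """One choice occasion: D_h has shape (1 + n_demo,), X has shape (n_plans, n_attr)."""
    e_h = rng.multivariate_normal(np.zeros(X.shape[1]), V_beta)
    beta_h = D_h @ Delta + e_h                    # beta_h' = D_h x Delta + e_h
    z = X @ beta_h + rng.standard_normal(X.shape[0])   # z_htj = x_htj' beta_h + eps_htj
    y = np.zeros(X.shape[0], dtype=int)
    y[np.argmax(z)] = 1                           # y_htj = 1 for the largest latent utility
    return y


D_h = np.concatenate(([1.0], rng.normal(size=n_demo)))  # constant plus demeaned demographics
X = rng.normal(size=(n_plans, n_attr))                  # plan attributes in the choice set
y = simulate_choice(D_h, X, rng)
```

Because no screening indicator enters the latent utility, every plan in the choice set competes in the final comparison, which is the restriction I(·) = 1 described above.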
Comparing the first row in Table 5 with the first row in Table 6 shows that the average preferences in the two models are
quite different for the attributes likely to be used as a screen. In
the RCMNP model, the average preferences for premium, drug
coverage, and vision coverage become statistically significant
and have a much larger magnitude. The average preferences for
non-network coverage and dental coverage, which are not likely
to be used in screening, are similar in the two models.
The estimation results show that the coefficient estimates
in the RCMNP model reflect a combination of the effects in
the two stages of the screening model. The estimates in the
RCMNP model mask the discontinuity and nonlinearity in the
individual preference for the plan attributes. When predicting
how plan attributes and demographics will affect plan choice,
the RCMNP model will give different results from the screening model. We discuss this in detail in the next section.
We also compare the two models in terms of goodness of fit
to the data. The log marginal density of the data, calculated using the harmonic mean method of Newton and Raftery (1994),
is equal to −210.66 in the screening model and −354.99 in
the RCMNP model. This provides evidence that the screening
model performs better than the RCMNP model, even though it
is less parsimonious. To investigate the predictive power of the
models and to reveal the discrepancies between the models and
the data, we also conduct a posterior predictive check on the realized residuals, as suggested by Gelman et al. (2000). The plot
of realized residuals under the screening model is very similar
to that under the RCMNP model. Both plots show that the models do not have obvious discrepancies with the data, but that the
precision of estimation could be improved. This check does not
indicate a clear superiority of either specification. Details on the
posterior predictive check are provided in Appendix B.
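The harmonic mean estimator of the log marginal density can be sketched as follows. The function assumes that the log-likelihood of the data has been evaluated at each retained posterior draw; the function name and its input are illustrative, and the computation is done on the log scale because the raw harmonic mean is numerically unstable.

```python
import numpy as np


def log_marginal_harmonic_mean(loglik_draws) -> float:
    """log of the harmonic mean of the likelihood over posterior draws.

    Computes log G - logsumexp(-loglik), which equals
    -log( (1/G) * sum_g exp(-loglik_g) ).
    """
    ll = np.asarray(loglik_draws, dtype=float)
    neg = -ll
    m = neg.max()                                  # stabilize the exponentials
    log_sum = m + np.log(np.exp(neg - m).sum())
    return float(np.log(ll.size) - log_sum)


# Sanity check on a degenerate case: identical log-likelihoods return that value.
check = log_marginal_harmonic_mean([-210.0, -210.0, -210.0, -210.0])

# Difference of the two reported log marginal densities (screening minus RCMNP).
log_bayes_factor = -210.66 - (-354.99)
print(round(log_bayes_factor, 2))
```

The difference between the two reported values is 144.33 on the log scale in favor of the screening model.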


Table 6. Coefficient estimates for the conventional RCMNP model

Posterior mean (std.) of Δ

             PREMIUM   DRUG      NON_NET   VISION    DEN       PX        COPAYDV   ADD       NPAK      AAPCC
Constant     −0.09     2.38      −14.40    5.63      −6.39     −1.48     −0.23     0.25      1.04      −0.02
             (−0.01)   (−0.38)   (−1.35)   (−0.78)   (−0.71)   (−0.94)   (−0.06)   (−0.17)   (−0.19)   (0.00)
AGE          0.00      0.01      −0.45     −0.20     0.16      0.08      0.02      0.01      −0.06     0.00
             (0.00)    (−0.05)   (−0.22)   (−0.12)   (−0.11)   (−0.11)   (−0.01)   (−0.03)   (−0.03)   (0.00)
GENDER       −0.02     −0.68     −8.81     0.01      0.77      2.84      0.34      −0.66     −0.83     0.02
             (−0.02)   (−0.75)   (−2.89)   (−1.58)   (−1.10)   (−2.14)   (−0.11)   (−0.38)   (−0.37)   (−0.01)
INCOME       0.00      −0.06     −2.03     −0.09     0.10      0.92      0.02      0.12      −0.08     0.00
             (0.00)    (−0.17)   (−0.74)   (−0.32)   (−0.29)   (−0.34)   (−0.02)   (−0.10)   (−0.09)   (0.00)
EDUCATION    0.00      0.06      2.00      0.26      0.58      −0.85     −0.02     −0.04     −0.11     0.00
             (0.00)    (−0.17)   (−0.68)   (−0.42)   (−0.36)   (−0.73)   (−0.03)   (−0.10)   (−0.10)   (0.00)
HEALTH       0.00      0.42      2.29      −0.28     −1.62     −0.88     0.02      −0.09     0.17      0.00
             (−0.01)   (−0.27)   (−1.41)   (−0.68)   (−0.71)   (−0.83)   (−0.06)   (−0.18)   (−0.18)   (0.00)
NCHRONIC     0.02      0.22      1.16      1.25      1.32      −2.36     −0.16     0.41      0.24      0.00
             (−0.01)   (−0.31)   (−1.51)   (−1.09)   (−0.70)   (−1.03)   (−0.07)   (−0.29)   (−0.21)   (0.00)
DIABETES     −0.01     −0.82     −4.63     1.07      4.30      7.23      0.39      −0.15     −0.79     0.00
             (−0.02)   (−0.84)   (−2.85)   (−1.58)   (−1.27)   (−1.85)   (−0.12)   (−0.40)   (−0.45)   (−0.01)

NOTE: Bold type indicates that more than 90% of the posterior draws have the same sign as the posterior mean.

6.4 Effects on Choice Probability
In this section we analyze how plan attributes and demographics affect the plan choice. The discussion focuses