Manajemen | Fakultas Ekonomi Universitas Maritim Raja Ali Haji 073500106000000639

Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20

Analytical Bias Reduction for Small Samples in the
U.S. Consumer Price Index
Ralph Bradley
To cite this article: Ralph Bradley (2007) Analytical Bias Reduction for Small Samples in the
U.S. Consumer Price Index, Journal of Business & Economic Statistics, 25:3, 337-346, DOI:
10.1198/073500106000000639
To link to this article: http://dx.doi.org/10.1198/073500106000000639

Published online: 01 Jan 2012.

Submit your article to this journal

Article views: 35

View related articles

Full Terms & Conditions of access and use can be found at

http://www.tandfonline.com/action/journalInformation?journalCode=ubes20
Download by: [Universitas Maritim Raja Ali Haji]

Date: 12 January 2016, At: 23:07

Analytical Bias Reduction for Small Samples in
the U.S. Consumer Price Index
Ralph B RADLEY
U.S. Bureau of Labor Statistics, Washington, DC 20212 (bradley.ralph@bls.gov )
The U.S. Bureau of Labor Statistics (BLS) faces sample size constraints when computing its Consumer
Price Index (CPI–U). The samples are not sufficiently large for the index to equal a true “fixed basket”
price index. This study adjusts for this small-sample bias by estimating the second order of a stochastic
expansion of the index. Unlike increasing sample size, this adjustment is inexpensive because it uses the
same data used to compute the CPI–U. From the beginning of 1999 to the end of 2003, we estimate
that 63% of the difference between the BLS superlative index (C–CPI–U) and the CPI–U is the result of
finite-sample bias, and the other 37% is commodity substitution bias.
KEY WORDS: Jevons price index; Laspeyres price index; Stochastic expansions.

Downloaded by [Universitas Maritim Raja Ali Haji] at 23:07 12 January 2016


1. INTRODUCTION
Properly measuring inflation is an essential ingredient in
refereeing U.S. economic policy debates, setting optimal contracts, guiding monetary policy, and determining the financing
needs of all levels of government. Although the U.S. Consumer
Price Index (CPI–U) is perhaps the most widely used national
inflation measure, it has many shortcomings. This study looks
at one source of bias in the CPI–U. To compute the CPI–U, the
Bureau of Labor Statistics (BLS) takes many small samples,
and these small samples induce an upward bias in the CPI–U.
The CPI–U is a two-stage index. In the second stage, the
“all-items” CPI–U is computed as a weighted average of 8,018
subindexes called price relatives, or relatives. A relative is a
price index for a group of commodities or services (called
items) within a metropolitan area (called area). There are 38
geographic areas, called primary sampling units (PSUs), and
211 items that generate 38 × 211 = 8,018 cells. The relatives
are computed in the first stage. Examples of the so-called item–
areas are apples in Boston, rental housing in Seattle, and so on.
BLS is limited in the number of prices that it can collect for
each item–area, and sample sizes as small as 1 can be used to

compute a relative. The relative is a nonlinear moment of the
collected prices, and whereas the relative asymptotically converges to its true value, BLS samples are not large enough to
achieve this consistency. Therefore, the CPI–U is a weighted
average of relatives that suffer from finite-sample bias.
This study proposes a low-cost method to reduce this finitesample bias by using the same data used to compute the CPI–U.
Although this method improves the accuracy of the CPI–U, it is
not as good a solution as the more costly alternative of increasing the item–area’s sample size. Increasing sample size not only
will reduce bias, but also will eliminate the variance effects of
sampling error. This low-cost bias-adjustment method is based
on a second-order approximation and does not purge sampling
error effects on the variance.
Finite-sample bias in the relative has been widely reviewed.
Before 1998, all CPI–U relatives were Laspeyres-type indexes
that were ratios of averages. Early studies used methods described by Cochran (1963), who he reviewed the finite-sample
bias from ratio estimation. Studies using these methods to derive finite-sample bias for price indexes include those of Kish,
Namboordi, and Pillai (1962) and McClelland and Reinsdorf
(1997). After 1998, most of the CPI–U relatives changed from a

Laspeyres form to a geometric mean or Jevons form. Unlike the
Laspeyres-type relatives, the finite-sample bias of the geometric mean relative is systematically positive because of Jensen’s

inequality.
This study commenced after the BLS switch to the geometric
mean relative, unlike the earlier ones mentioned here. Because
finite-sample bias from a geometric mean relative is no longer a
problem of ratio estimation, here we do not use the ratio method
of the earlier studies; instead, we adjust for finite-sample bias
by using the second-order term in stochastic expansion of the
relative around its true parameter value to adjust the price relative. Our methods closely resemble those outlined by Hahn and
Newey (2004) and Rilstone, Srivastava, and Ullah (1996). The
basic intuition behind this analytical bias adjustment is that we
are using a second moment of prices and the curvature of the
index function as additional estimation information. Stochastic
expansion theory is the starting point for bootstrap theory that
adjusts for bias and improves confidence interval estimation
with Edgeworth expansions (see Hall 1992). This idea is often
used in other econometric problems, such as weak instrumental
variables (Hahn and Hausman 2002) or correcting for bias in
nonlinear panel data models (Hahn and Newey 2004). Analytical bias adjustment is not the only method of finite-sample bias
correction. In earlier work (Bradley 2001), a simulated bootstrap correction was done only for food and home fuel items.
However, doing a simulated bootstrap correction on the entire

set of samples requires 8,018 bootstraps per month. Even in
today’s environment of high-speed chips, this is still computationally intractable.
BLS uses three different methods to compute relatives. The
most widely used method is the geometric mean. In this study,
we estimate that 96% of the finite-sample bias adjustments in
the CPI–U comes from the relatives computed by a geometric mean. The other methods used to compute relatives are the
Laspeyres and the sixth root of a Laspeyres for certain housing items. Although housing has the largest expenditure weight
in the CPI–U, it contributes very little to the finite-sample bias
of the overall CPI, because the housing samples are relatively
large, and there is, surprisingly, much less variance in rents than

337

In the Public Domain
Journal of Business & Economic Statistics
July 2007, Vol. 25, No. 3
DOI 10.1198/073500106000000639

Downloaded by [Universitas Maritim Raja Ali Haji] at 23:07 12 January 2016


338

in the price of other goods and services. Food and apparel are
the two groups that contribute the most to the finite-sample bias
of the CPI–U.
There are several sources of bias in the CPI; finite-sample
bias is just one of them. Lebow and Rudd (2003) gave the most
up-to-date and comprehensive review of the various sources of
bias or measurement error. It is important that finite-sample bias
in the CPI–U not be confused with commodity substitution bias
occurring because the Laspeyres-type form of the upper-level
CPI–U does not allow for commodity substitution across items
when there is a change in the ratio of price relatives. Lebow
and Rudd (2003) and Shapiro and Wilcox (1997) measured
commodity substitution bias by taking the difference between
the CPI–U and a superlative price index. Earlier work (Bradley
2001) and the present study show that this is not the correct
measure of commodity substitution bias, because finite-sample
bias makes the CPI–U a biased estimator for a Laspeyres or
“fixed-basket” price index. In earlier work (Bradley 2001) and

here, proof is given that some superlative indexes, such as the
Törnqvist index, do not suffer from finite-sample bias in the
same way as a Laspeyres index does. Thus the difference between the CPI–U and a superlative index estimates the sum
of two sources of bias: commodity substitution bias and finitesample bias. Thus the overall bias estimates generated in these
previous studies are not affected.
Significant substitution opportunities cannot be expected
among all of the 8,018 item–areas. For instance, suppose that
the price of bananas in Philadelphia increases; Philadelphia
residents will not substitute any item sold in another city for
bananas in Philadelphia. At best, there are a few other items
within food in Philadelphia that they can substitute in response
to the price increase in bananas. If consumers do not substitute for items outside their geographic reach, this means that if
the price of any 1 of the 211 items increases, then at most they
have substitution opportunities with only 0. Therefore, for finite samples, 
RGit .
RGit is a more precise estimator for exp(µGit ) than 
The first term on the right side of (14) shows that adjusting the
relative by 
BGit will not purge the variance in the relative that
comes from sampling error.

For the Laspeyres relative, we get the second-order expansion

=

1

E(pijt−1 )



ni

j=1



pijt /ni − E(pijt )


 n

i
E(pijt ) 
pijt−1 /ni − E(pijt−1 )

E(pijt−1 )2
j=1


+

i
i
[ nj=1
pijt /ni − E(pijt )][ nj=1
pijt−1 /ni − E(pijt−1 )]
E(pijt−1 )2

i
pijt−1 /ni − E(pijt−1 )]2
E(pijt )[ nj=1

E(pijt−1

)3

 −3/2 
+ O p ni
.

2 /n
E(pijt )σit−1
i

E(pijt−1 )3



 −3/2 
σit,t−1 /ni
. (15)
+ O p ni

E(pijt−1 )2

2 , and σ
The sample estimates for E(pijt ), E(pijt−1 ), σit−1
it,t−1 are


pit =

(13)

and the bias-adjusted geometric mean relative is 
RGit = 
RGit −

BGit . We can now state the following result, the proof of which
is given in the Appendix.


RLit − µLit

=

and


σit,t−1 =

ni



pit−1 =

pijt /ni ,

j=1

ni

j=1 (pijt

ni


pijt−1 /ni ,

j=1

−
pit )(pijt−1 −
pit−1 )
ni − 1


σi,t−1 =

ni

j=1 (pijt−1

−
pit−1 )2

ni − 1

,

.

The Laspeyres bias correction is approximated as

BLit =

2 /n

pit
σit−1
i


p3it−1




σit,t−1 /ni
.

p2it

(16)

The bias-adjusted Laspeyres is then 
RLit = 
RLit − 
BLit . This
leads to the following result.

Proposition 2. If (a) pijt is iid across j and (b) E(p4ijt ) < ∞,
2 ).
2 such that n3/2 (
BLit −BLit ) → N(0, γLit
then there is a finite γLit
i

We can go through the same second-order expansion for the
housing relative to get




pit 1/6

BHit = (1/2)

pit−6



2 /n
σit−6
σit2 /ni
σit,t−6 /ni
2 
5 
7 
i
. (17)


×
36 
36 
pit−6
pit
36 
p2it−6
p2it
2.4 Effects on the All-Items Index

The month-to-month “all-items” CPI–U based on the BLS
samples is







Rt =
RHit wHit−1 , (18)
RLit wLit−1 +
RGit wGit−1 +
i∈G





i∈L



i∈H

where i∈G , i∈L , and i∈H denote the sum of geometric
mean, Laspeyres, and housing relatives and wGit−1 , wGit−1 , and
wHit−1 are corresponding expenditure weights for period t − 1.
Define the “true population” CPI–U as



RHit wHit−1 , (19)
RLit wLit−1 +
Rt =
RGit wGit−1 +
i∈G

i∈L

i∈H

where RMit is the asymptotic relative of method M = {G, L, H}.
If all of the relatives were geomean (i.e., L = H = ∅), then we
would get the unambiguous result that the “all-items” CPI–U
has positive bias [E(
Rt − Rt ) > 0]. Let N be the total number
of item–area cells; if all of the relatives were computed by the
geometric mean and the variance of the sample means for each
relative were >0, then we would get plimN→∞ 
Rt − Rt > 0. But

Bradley: Analytical Bias Reduction for Small Samples

341

because of the relatives are not geometric means, we cannot get
this unambiguous result. The bias-adjusted “all-items” index is



Rt =
(
RLit − 
BLit )wLit−1
(
RGit − 
BGit )wGit−1 +
i∈L

i∈G

+


(
RHit − 
BHit )wHit−1 .

(20)

i∈H

Downloaded by [Universitas Maritim Raja Ali Haji] at 23:07 12 January 2016

In the previous section, we showed that whereas both 
RMit and



RMit = RMit − BMit (M = {G, L, H}) have the same asymptotic
properties, for finite samples, 
RMit is a more precise estimator
for RMit .
It might be concluded that we need to make the same biasadjustment “plug in” for a superlative index such as a Törnqvist.
But this is not correct. To show this, we start with a simplified
Törnqvist functional form,
Tt =

N

(Rit )wit ,

(21)

i=1

where Rit is the true price relative between periods t and t − 1
for item–area i and wit is its expenditure share weight, which
is a simple average of expenditure shares in period t and t − 1.
Suppose that Tt is estimated by

Tt =

N

(
Rit )wit ,

(22)

i=1

and for simplicity assume that all 
Rit {i = 1, . . . , N} are estimated by geometric mean indexes with fixed sample sizes,
ni {i = 1, . . . , N}, that are sufficiently small so that var(
Rit ) > 0
for all i. Then E[ln(
Rit ) − ln(Rit )] = 0, even though E[
Rit −
Rit ] > 0. 
Tt can be rewritten as
 N



Tt = exp
wit ln(
Rit ) .
i=1

−1/2
Because wit = Op (N −1 ) and ln(
Rit ) = Op (ni
), and, given
E[ln(
Rit )] = ln(Rit ), we get that for a fixed ni {i = 1, . . . , N},

plim

N


N→∞ i=1

wit ln(
Rit ) = plim

N


N→∞ i=1

wit ln(Rit ) = θ.

(23)

N


Letting 
θ= N
i=1 wit ×
i=1 wit ln(Rit ) and θ = plimN→∞
ln(Rit ), we use a well-known result to conclude that
plimN→∞ 
θ = θ implies that
plim 
Tt − Tt = 0,

(24)

N→∞

even though for the Laspeyres index, plimN→∞ (
Rt − Rt ) > 0
when all ni ’s are fixed and all of the relatives are geomean.
BLS publishes a “Törnqvist-type” index called the C–CPI–U.
For the CPI–U and the C–CPI–U, N is 8,018. This is an adequate size for assuming that the asymptotic properties of a
Törnqvist are satisfied.
The C–CPI–U is a Törnqvist-type index, but some of the relatives are not geometric means. Nonetheless, in the empirical
section of this article (Sec. 4), we show that 96% of the finitesample bias in the “all-items” CPI–U can be attributed to the
geometric mean indexes. Thus plugging bias adjustments into
the C–CPI–U will most likely induce a negative bias.

2.5 Additive Decomposition of the Indexes
and Bias Adjustments
The “all-items” CPI–U is the upper chained index,





It =
RGit wGit−1 +
RLit wLit−1
i∈G

i∈L

+


i∈H




RHit wHit−1 
It−1 .

(25)

After computing the bias adjustment, we calculate the biasadjusted CPI–U index as



It =
(
RGit − 
BGit )wGit−1 +
(
RLit − 
BLit )wLit−1
i∈G

i∈L






I t−1 .
+
(RHit − BHit )wHit−1 

(26)

i∈H

For both 
It and 
I t , we set the index at t = December 1998 equal
to 1. The final period, denoted as T, is December 2003. Note
that (26) is chained over time, and we have not accounted for
any of the time series properties of the prices.
From here on, we drop the G, L, and H subscripts so that the
relative and the bias correction are now 
Rit and 
Bit . There are
eight major groups in the “all items” CPI–U. Each group is a
set of similar items; for example, the food group includes the
banana, meat, cereal, and diary items. These eight BLS groups,
with the designated letter in parentheses denoting the set of all
items within the group, are as follows:

Food and beverages (F), Housing (H), Apparel (A),
Transportation (T), Medical care (M), Recreation (R),
Education and
 communication (E), Other goods and
services (G) .

We let G = {F, H, A, T, M, R, E, G} denote the set of groups.
Because different groups have different price distributions and
sample sizes, it is useful to disaggregate both the indexes and
the bias adjustments to determine how each group contributes
to these statistics. Because the indexes in (25) and (26) are
chained, additive decomposition of the “all-items” CPI–U bias
adjustment into the eight groups is not straightforward. To find
the percent contribution of each group to the average monthly
indexes and bias adjustment over the T periods, we need the
following proposition, which proven in the Appendix.

Proposition 3. Monthly average growth in the CPI–U and
the bias-adjusted CPI–U is (
IT )1/T − 1 and (
I T )1/T − 1. The
following identity holds:

1/T
1/T

I T −
IT


T

1/T
1/T


Rt m(
Rt ,
IT )
Rt ,
IT )
Rt m(
− T
=
T
 1/T
 1/T
t=1 m(Rt , I T )
t=1 m(Rt , I T )
t=1




i wit−1 Bit
×
,

Rt − 
Rt

Rit wit−1 ),
where m(a, b) = (ln(a) − ln(b))/(a − b), 
Rt = ( i 


Rit − 
Bit )wit−1 ).
and Rt = ( i (

342

Journal of Business & Economic Statistics, July 2007

As a corollary, we can state the following.
1/T
to 
IT

1/T
−
IT

Corollary 1. The contribution
by a group J ∈
G is


T

1/T
1/T


Rt ,
IT )
Rt ,
IT )
Rt m(
Rt m(
− T
T
 1/T
 1/T
t=1 m(Rt , I T )
t=1 m(Rt , I T )
t=1




i∈J wit−1 Bit
.
×

Rt
Rt − 
This allows us to additively decompose the total bias adjust1/T
1/T
ment, 
I T −
I T , into the contribution made by each group.

Downloaded by [Universitas Maritim Raja Ali Haji] at 23:07 12 January 2016

3. MONTE CARLO SIMULATION

To verify the properties established in Section 2, we conducted a Monte Carlo experiment over a 12-month period. The
parameters of this Monte Carlo experiment are calibrated to
the published CPI–U. Historically, the average sample sizes for
the geometric mean, Laspeyres, and housing relatives are 10,
12, and 40; thus in each of the 12 time periods, we draw repeated samples of size 10 to compute repeated geomean relatives, of size 12 to compute repeated Laspeyres relatives, and of
size 40 to compute repeated housing relatives. Historically, the
geomean, Laspeyres, and housing relatives have averaged 71%,
11%, and 18% of the expenditure weight in the “all-items”
CPI–U. We use these upper-level weights in this experiment.
The average variance of the month-to-month change in the log
of prices is approximately .0025, and the annual CPI–U is close
to 3%. In the Monte Carlo experiment, we sample the difference in log prices from an iid N(.002463233, .0025), so that
the mean compounded 12-month price growth is 3%. The true
population relative for each period is then exp(.002463233).
We repeat this process 2,000 times; this repetition is indexed
as r. Let 
RM,t,r be the relative computed from the Mth method
(M = geomean, Laspeyres, housing), and let 
BM,t,r be its corresponding bias adjustment computed from the sample moments.
Table 1 reports the average of 
RM,t,r − exp(.002463233),

BM,t,r , and 
RM,t,r − exp(.002463233) in percentage terms
(
RM,t,r = 
RM,t,r − 
BM,t,r ). Even though each of the methods
faces the same data-generating process, the geomean has the
largest bias. This cannot be explained entirely by sample size,
because the sample size of the Laspeyres has only two more observations, but the average bias is 35% less than the geomean.
For the housing index, there is almost no bias. Therefore, the
difference in functional form between the geomean and the
Laspeyres plays a role in finite-sample bias. This provides evidence indicating that when BLS changed from a Laspeyres relative to a geomean, finite-sample bias became a greater problem.

The geomean bias adjustment reduces the bias by 83%. The
Laspeyres slightly overadjusts, and the housing correction is ineffective. Because there was so little bias in the housing relative,
we decided to reinvestigate the housing adjustment by conducting another simulation in which the sample size was 10 and
the variance of the log of prices was increased to .64. We show
the results in the row titled “Alternative housing.” Here the bias
correction does perform better, but not as well as the geomean
adjustment.
To investigate the impact of sampling error, Table 2 shows
the performance of the estimated standard deviations of the log
price for the geomeans and the price for both Laspeyres and
housing. Because the difference in log prices is drawn from
N(.002463233, .0025), the √
true standard deviation of the difference of log of prices is .0025 = .05, and for prices it is
{exp(.00492t + .0050t) − exp(.00492t + .0025t)}1/2 . Table 2 reports the results for the differences between the simulated standard deviations and the true standard deviations. As expected,
the “Mean difference” column shows that the sample estimate
is unbiased; however, the last three columns show that sampling
error exists. In other words, on average we get the population
standard deviation; however, for a particular sample there will
be some error. For the geomean, >95% of the draws produce
standard deviations in the interval (.045, .055), where the true
standard deviation is .05, and although this does not allow for
perfect bias adjustment, there is noticeable improvement. This
provides evidence indicating that the bias-adjusted relative is a
more precise estimator of the true relative than the unadjusted
relative.
Finally, for each of the 2,000 repetitions, we compute an
“all-items” yearly index by an expenditure weighted sum where
the geomeans relative is given a 71% weight, the Laspeyres is
given an 11% weight, and the housing is given a 18% weight.
We do the same for the bias-adjusted relative. The average difference between the unadjusted yearly index and the true index is .34%, whereas the average difference between the biasadjusted index and the true index is .09%. (These results closely
resemble the results obtained using the actual BLS data base, as
described in the next section.)
This experiment provides evidence that bias adjustment
based on sample estimates of second-order approximations
does not completely remove the bias. Sampling error remains a
problem; however, this bias adjustment does greatly lower the
bias.
4. RESULTS FOR THE CPI–U
We computed both the CPI–U (25) and the bias-adjusted
CPI–U (26). Table 3 presents the annual and cumulative re-

Table 1. Simulation results: Finite-sample bias, the adjustment and bias for adjusted relative
Relative type index (weight)
Geomeans (71%)
Laspeyres (11%)
Housing (18%)
Alternative housing

Sample size

Average bias of initial relative

Average adjustment

Average bias of adjusted relative

10
12
40
10

.0300%
.0195%
.0008%
.2300%

.0250%
.0209%
.0002%
.1730%

.0050%
−.0014%
.0006%
.0570%

NOTE: These are the averages of 2,000 replications of 12-month draws. Using the notation of the text, the column labeled “Average bias of initial relative” is the mean of 
RM,t,r −RM,t,r ,
where RM,t,r = exp(.002463), for all M, t, r. The column labeled “Average adjustment” is the average of 
BM,t,r , and the last column reports 
RM,t,r − RM,t,r .

Bradley: Analytical Bias Reduction for Small Samples

343

Table 2. Simulated difference between the sample standard deviations and the true standard deviations
Relative type

Mean difference

Standard deviation of difference

Minimum difference

Maximum difference

Geomenans
Laspeyres
Housing

−6.74E−05
2.22E−06
−8.06E−07

.0023601
.0011421
.000601

−.00494566
−.00250337
−.00177075

.020122428
.008251568
.003574577

Downloaded by [Universitas Maritim Raja Ali Haji] at 23:07 12 January 2016

NOTE: In this simulation, the difference in log prices is drawn from a N(.002463233t, .0025), where t is the month number. Because the accuracy of the bias adjustment depends on
the accuracy of the variance estimator, the table examines the distribution of the difference of the sample estimates of the variance and the true variance.

sults over the 60-month period from December 1998 to December 2003. It also gives the results for BLS’s recently published
“Törnqvist-type” index, the final C–CPI–U. The “(CPI–U −
C–CPI–U)” column gives the difference between the published
CPI–U and the “Törnqvist-type” index. This difference fluctuates positively with the underlying inflation rate, ranging from
.28% to .79%. The next columns decompose this difference
between the difference of the CPI–U and the bias-adjusted index and the difference between the bias-adjusted index and the
C–CPI–U. The first difference can be attributed to finite-sample
bias; the second difference may be attributed to commodity
substitution bias. Over the 5-year period, these estimates attribute on average 62.5% of the difference between the published CPI–U and the C–CPI–U to finite-sample bias and the
rest to commodity substitution bias. Because these estimates
are based on second-order asymptotics, they do not completely
eliminate bias. As mentioned earlier, we ignore the time series
properties of the data when using the adjustment.
Table 3 gives the additive decompositions based on the
method in Section 2.5. Table 4 lists the group contribution to
the CPI–U, the bias-adjusted CPI, and the bias adjustment. Note
that the group contribution to each of the indexes is the same;
this occurs because there generally is very little difference between the unadjusted and adjusted relatives. The round-off error hides the slight difference. Although food and apparel contribute only 20.5% to the indexes, they contribute 65.8% to
the bias. This is because both food and apparel have unusually
volatile prices, due in part to the frequency of sales. The bias
for the geomean indexes is a positive function of the variance of
the growth in the log of prices. Over the 5 years that this study
was conducted, the average sample estimates of the variance of
the growth log of prices for apparel, food, and transportation
were .0028, .0016, and .00029. Table 5 lists the contribution

of the different methods. (Note that the housing group index is
different from the housing relative method. The housing group
includes home heating fuel, cleaning supplies, and various services; these items are not computed with the housing method.)
From the result of the Monte Carlo simulation, it is not surprising to find that 96% of the bias comes from the relatives based
on the geomean method. Combining this result with the theoretical results in Section 2.4 provides evidence indicating that
“plugging in” bias-adjusted relatives into the C–CPI–U would
induce additional bias. Table 6 breaks down the “all-items” index into core and noncore parts. The core includes all items except non-alcohol–related food and beverages and energy items,
such as motor and heating fuel. The purpose for the core index is to remove items that are highly volatile over time but not
necessarily volatile within an item–area. However, the volatility
here is volatility over time rather than within an item area. The
variances of the sample means for log prices of energy items
are smaller because their sample sizes are larger. The average
sample size for an energy item–area is 28, almost three times
the sample size of a typical geomean index. Although energy
prices fluctuate widely over time, there is relatively little price
variation within time.
There is a large difference between a group’s contribution to
the index and its contribution to the bias. This is because the
bias adjustment varies widely by group. Looking at Table 7,
the bias adjustment for apparel is the largest, whereas its share
of the index is small. On the other hand, the bias adjustment
for housing is smaller than average, whereas its share of the
index is large. The food group is the largest contributor to the
bias adjustment. Although its average bias adjustment is less
than that of apparel, it has a higher expenditure share. Both the
apparel and food group relatives are computed with geomean

Table 3. Summary statistics for the CPI–U, the bias-adjusted CPI–U, and the CPI–C

Period from
Dec–Dec
1998–1999
1999–2000
2000–2001
2001–2002
2002–2003
Entire 5-year
period

(C–CPI–U)

(CPI–U) −
(CPI–C)

(CPI–U) −
(bias adjusted
CPI–U)

(Bias adjusted
CPI–U) −
CPI–C

2.335%
3.090%
1.300%
2.107%
1.585%

2.139%
2.600%
1.267%
2.021%
1.509%

.524%
.790%
.280%
.379%
.360%

.329%
.300%
.248%
.294%
.285%

.195%
.490%
.033%
.086%
.075%

62.7%
38.0%
88.3%
77.4%
79.0%

37.3%
62.0%
11.7%
22.6%
21.0%

10.851%

9.902%

2.531%

1.582%

.949%

62.504%

37.496%

CPI–U

Bias-adjusted
CPI–U

2.663%
3.390%
1.548%
2.400%
1.870%
12.433%

% Difference
from finitesample bias

% Difference
from commodity
substitution bias

NOTE: The column labeled “CPI–U” lists the 12-month growth in the published CPI–U. The column labeled “Bias-adjusted CPI–U” lists the 12-month change of the bias-adjusted
CPI–U where the bias adjustment for the lower level geomeans relative is given in (13), the Laspeyres adjustment is given by (16), and the housing adjustment is given by (17). The
column labeled “CPI–C” lists the 12-month growth in the BLS’s published chained CPI–C. The next three columns just list the differences in the first three. The last two columns lists
the percent of the difference between the CPI–U and CPI–C comes from finite-sample bias and from substitution bias.

344

Journal of Business & Economic Statistics, July 2007

Table 4. Contributions to index growth and bias adjustment
Group
Group
Group
contribution
contributioncontribution to
CPI–U
adjusted CPI–U bias adjustment

Group name
Apparel
Education
Food and beverage
Housing
Medical
Recreation
Transportation
Other

4.5%
5.5%
16.0%
40.4%
5.8%
6.0%
17.3%
4.6%

4.5%
5.5%
16.0%
40.4%
5.8%
6.0%
17.3%
4.6%

25.6%
3.1%
40.2%
15.7%
1.2%
6.2%
4.4%
3.5%

Downloaded by [Universitas Maritim Raja Ali Haji] at 23:07 12 January 2016

NOTE: The table lists the percent contribution to the cumulative 5-year growth rates for
the CPI–U, the bias adjusted CPI–U, and the difference between the two. The decomposition method is described in Proposition 4.

indexes. Table 8 gives a yearly breakdown by core and noncore items. This shows the high “over-time” volatility of energy, which contrasts with the relatively small “within-time”
variability exhibited in Table 6.
5.

CONCLUSIONS

The currently published CPI–U is an upward-biased estimate
of a “fixed-basket” price index; therefore, the difference between the CPI–U and a superlative index cannot be attributed
entirely to commodity substitution bias. The CPI–U’s finitesample bias can be reduced using the same data used to initially
generate the index. Therefore, it is less costly.
On a year-to-year basis, it is not possible to predict the reduction in the CPI–U if it is adjusted for finite-sample bias, but in
the 5 years of this study, the CPI–U is reduced by .27% on average. However, the bias adjustments are unpredictable, because
they are based on the variance of prices within a cell, and these
variances change unpredictably over time; they are also derived
from a second-order approximation and thus do not completely
eliminate bias, and their effectiveness in bias reduction depends
on the time series properties of the collected prices.
Analytical bias reduction is not the only method that can be
used to adjust for finite-sample bias, but it requires less computation than bootstrapping and is less expensive than expanding
sample sizes. However, if analytical bias reduction were implemented, then the final relatives that are used in the CPI–U would
differ from those used in the C–CPI–U.
ACKNOWLEDGMENTS
The author thanks the referees, John Greenlees, Tim Erickson, Ronald Johnson, and Elliot Williams, for their feedback,
Table 5. Method type contributions to index growth and
bias adjustment
Relative Type contribution Type contribution- Type contribution
type
CPI–U
adjusted CPI–U
to bias adjustment
G
L
H
NOTE:

71.1%
17.7%
11.3%
See the note for Table 4.

71.1%
17.7%
11.3%

96.5%
2.5%
1.0%

Table 6. Core type contributions to index growth and bias adjustment
Relative Type contribution Type contribution- Type contribution
type
CPI–U
adjusted CPI–U
to bias adjustment
Core
Energy
Food*
NOTE:

77.8%
7.1%
15.3%

77.6%
7.1%
15.3%

59.7%
.8%
39.8%

See the note for Table 4.

Table 7. Annual growth in group indexes
Annual growth
CPI–U

Bias adjusted

Difference

1999
Apparel
Education
Food and beverage
Other
Housing
Medical
Recreation
Transportation

−.47%
1.58%
2.00%
5.08%
2.17%
3.66%
.79%
5.39%

−2.08%
1.03%
1.24%
4.83%
2.08%
3.57%
.34%
5.30%

1.61%
.55%
.76%
.25%
.09%
.10%
.45%
.09%

2000
Apparel
Education
Food and beverage
Other
Housing
Medical
Recreation
Transportation

−1.70%
1.32%
2.73%
4.17%
4.28%
4.18%
1.67%
4.08%

−3.23%
1.24%
1.90%
3.90%
4.19%
4.14%
1.42%
3.99%

1.53%
.08%
.83%
.27%
.09%
.05%
.25%
.09%

−4.66%
3.13%
2.08%
4.29%
2.83%
4.67%
1.32%
−3.88%

1.40%
.06%
.70%
.23%
.08%
.05%
.25%
.07%

2002
Apparel
Education
Food and beverage
Other
Housing
Medical
Recreation
Transportation

−3.26%
3.20%
2.77%
4.52%
2.91%
4.73%
1.57%
−3.82%
−1.76%
2.13%
1.51%
3.30%
2.35%
5.04%
1.11%
3.82%

−3.57%
1.99%
.80%
3.10%
2.24%
5.00%
.75%
3.77%

1.80%
.14%
.72%
.20%
.11%
.05%
.35%
.05%

2003
Apparel
Education
Food and beverage
Other
Housing
Medical
Recreation
Transportation

−2.05%
1.55%
3.50%
1.48%
2.23%
3.69%
1.09%
.35%

−3.87%
1.47%
2.76%
1.31%
2.10%
3.62%
.87%
.28%

1.82%
.09%
.74%
.17%
.12%
.07%
.22%
.07%

Group

2001
Apparel
Education
Food and beverage
Other
Housing
Medical
Recreation
Transportation

NOTE: This table lists the December-to-December growth in the eight subindexes in the
CPI–U. The column labeled “Bias adjusted” lists the growth of the bias adjusted CPI–U. It
is important to note that Table 2 lists the contribution to the CPI–U whereas this table lists
the subindexes.

Bradley: Analytical Bias Reduction for Small Samples

345

Table 8. Annual growth in core and noncore indexes

Downloaded by [Universitas Maritim Raja Ali Haji] at 23:07 12 January 2016

Type

Annual growth CPI–U

Bias adjusted

Difference

1999
Core
Energy
Food

1.936%
13.400%
1.971%

1.677%
13.372%
1.188%

.260%
.028%
.783%

2000
Core
Energy
Food

2.543%
14.215%
2.741%

2.339%
14.173%
1.884%

.204%
.043%
.857%

2001
Core
Energy
Food

2.754%
−13.055%
2.828%

2.572%
−13.093%
2.110%

.181%
.038%
.718%

2002
Core
Energy
Food

1.915%
10.720%
1.497%

1.691%
10.692%
.759%

.224%
.028%
.739%

2 /n | is O (n−3/2 ):
We need only show that |
BGit − exp(µGit )σGit
i
p i
2

/ni
BGit − exp(µGit )σGit

2
2
= (1/2) exp(
µGit )
σGit
/ni − (1/2) exp(µGit )σGit
/ni
n

i


2
1
ln(pijt /pijt−1 ) − 
µGit exp(
µGit )
=
2ni (ni − 1)

j=1

2003
Core
Energy
Food

1.110%
6.914%
3.580%

.894%
6.881%
2.811%

.216%
.033%
.769%

2
− (1/2) exp(µGit )σGit
/ni
 ni
µGit )2
1
j=1 (ln(pijt /pijt−1 ) − 
exp(
µGit )
=
2ni
ni − 1

2
− exp(µGit )σGit

=

−1/2

We need to show that A(ni ) = Op (ni
). We do this by taking
2 ,
a first-order expansion of A(ni ) around µGit and σGit
2
2
σGit
− σGit
)
A(ni ) = exp(µGit )(
2
+ σGit
exp(µGit )(
µGit − µGit ) + Op (n−1 ),

NOTE: This table lists the December-to-December growth rates in the core and noncore
CPI–U and the bias-adjusted CPI–U.

and Uri Kogan for his careful proofing and review of the manuscript. The views expressed here are solely those of the author
and do not reflect either the policies or the procedures of the
U.S. Bureau of Labor Statistics.
APPENDIX: PROOFS

From (10), the following holds:

Proof of Proposition 4


RGit − exp(µGit )

n
i

= exp(µGit )
ln(pijt /pijt−1 )/ni − µGit
j=1



− (1/2) exp(µGit )

ni

j=1

ln(pijt /pijt−1 )/ni − µGit

2

Passing through the expectations operator, we get
BGit = E[
RGit − exp(µGit )]
=

E(An ) = O(n−1 ). Using the condition (b) that |E(ln(pijt /
2 − σ 2 ) = O (n−1/2 ), and (
σGit
µGit − µGit ) =
pijt−1 )4 | < ∞, (
p i
Git
−1/2
1/2
Op (ni
). This establishes that |A(ni )| = Op (ni ). Because
2
3/2
var(
σGit ) and var(
µGit ) are bounded, then var(n (
BGit −BGit ))
is bounded and because pGit is iid, we can use the Levy–
2
Lindberg central limit theorem to conclude that there is a γGit
d
2 ).
such that n3/2 (
BGit − BGit ))) → N(0, γGit
The proof of Propositions 2 and 3 follow the same process
and thus are omitted here.

Proof of Proposition 1

 −3/2 
.
= O p ni

1
A(ni ).
2ni

2
(1/2) exp(µGit )σGit
/ni

Reinsdorf, 
Diewert, and Ehemann (2002) showed that the
index PG =  ni=1 (pit /pit−1 )wi has the additive
decomposin
n
tion PG =
i=1 wi pit /m(pit , PG pit−1 )/( i=1 wi pit−1 /m(pit ,
PG pit−1 )). We can use this to additively decompose both
1/T
1/T

I
into the contribution of each month. Because
I T and 
T T

1/T
1/T


I T = t=1 (Rt )1/T and 
I T = Tt=1 (
Rt )1/T , we can additively
decompose the months as
T
T

 1/T
 1/T

1/T
t=1 (1/T)Rt /m(Rt , IT )
t=1 Rt /m(Rt , IT )

IT = n
=

n
 1/T
 1/T
i=1 (1/T)/m(Rt , IT )
i=1 1/m(Rt , IT )
and

 −3/2 
.
+ O p ni

−3/2
), where 
BGit =
We wish to show that 
BGit − BGit = Op (ni
3/2 
2
(1/2)
σGit exp(
µGit )/ni , and thus showing that ni (BGit −
d

2 ). By the triangle inequality,
BGit ) → N(0, γGit

  −3/2 
2
.
|
BGit − BGit | ≤ |
BGit − (1/2) exp(µGit )σGit
/ni | + O ni

T
T 
1/T
 1/T
(1/T)
Rt /m(
Rt ,
IT )
1/T
t=1 Rt /m(Rt , IT )

I t = t=1
=
.
n
n
 1/T
 1/T
i=1 (1/T)/m(Rt , IT )
i=1 1/m(Rt , IT )

The within-month contribution of each item in the unadjusted
Rit . Because 
Rit and wit−1
Bit =
and adjusted index is just wit−1


Rit − Rit , we get the desired result.
[Received October 2004. Revised July 2006.]

346

Journal of Business & Economic Statistics, July 2007

REFERENCES

Downloaded by [Universitas Maritim Raja Ali Haji] at 23:07 12 January 2016

Amemiya, T. (1985), Advanced Econometrics, Cambridge, MA: Harvard University Pres.
Biggeri, L., and Giommi, A. (1987), “On the Accuracy Precision of the CPI,
Methods and Applications to Evaluate the Influence of Sampling Households,” in Proceedings of the 46th Session of the ISI, Tokyo, IP-12.1,
pp. 137–154.
Boskin, M. J., Dulberger, E. R., Gordon, R. J., Griliches, Z., and Jorgensen, D.
(1996), “Toward a More Accurate Measure of the Cost of Living: Final Report to the Senate Finance Committee for the Advisory Commission to Study
the Consumer Price Index,” U.S. Senate Finance Committee.
Bradley, R. (2001), “Finite-Sample Effects in the Estimation of Substitution Bias in the Consumer Price Index,” Journal of Official Statistics, 17,
369–390.
Cochran, W. G. (1963), Sampling Techniques (2nd ed.), New York: Wiley.
Hahn, J., and Hausman, J. (2002), “A New Specification Test for the Validity of
Instrumental Variables,” Econometrica, 70, 163–189.
Hahn, J., and Newey, W. (2004), “Jacknife and Analytical Bias Reduction for
Nonlinear Panel Models,” Econometrica, 72, 1295–1320.

Hall, P. (1992), The Bootstrap and Edgeworth Expansions, New York: SpringerVerlag.
Hansen, L. (1982), “Large-Sample Properties of Generalized Method-ofMoments Estimators,” Econometrica, 50, 1029–1054.
Kish, L., Namboordi, N. K., and Pillai, R. K. (1962), “The Ratio Bias in Surveys,” Journal of the American Statistical Association, 57, 92–115.
Lebow, D., and Rudd, J. (2003), “Measurement Error in the Consumer Price
Index: Where Do We Stand?” Journal of Economic Literature, 41, 159–201.
McClelland, R., and Reinsdorf, M. (1997), “Small-Sample Bi