07350015%2E2012%2E707590

Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20

Inference for Income Distributions Using Grouped
Data
Gholamreza Hajargasht , William E. Griffiths , Joseph Brice , D.S. Prasada Rao
& Duangkamon Chotikapanich
To cite this article: Gholamreza Hajargasht , William E. Griffiths , Joseph Brice , D.S.
Prasada Rao & Duangkamon Chotikapanich (2012) Inference for Income Distributions
Using Grouped Data, Journal of Business & Economic Statistics, 30:4, 563-575, DOI:
10.1080/07350015.2012.707590
To link to this article: http://dx.doi.org/10.1080/07350015.2012.707590

View supplementary material

Accepted author version posted online: 05
Jul 2012.

Submit your article to this journal


Article views: 473

View related articles

Full Terms & Conditions of access and use can be found at
http://www.tandfonline.com/action/journalInformation?journalCode=ubes20
Download by: [Universitas Maritim Raja Ali Haji]

Date: 11 January 2016, At: 22:51

Supplementary materials for this article are available online. Please go to http://tandfonline.com/r/JBES

Inference for Income Distributions Using
Grouped Data
Gholamreza HAJARGASHT
Department of Economics, University of Melbourne, Melbourne, VIC 3010, Australia ([email protected])

William E. GRIFFITHS
Department of Economics, University of Melbourne, Melbourne, VIC 3010, Australia ([email protected])


Joseph BRICE
School of Economics, University of Queensland, St Lucia, QLD 4072, Australia ([email protected])

Downloaded by [Universitas Maritim Raja Ali Haji] at 22:51 11 January 2016

D.S. Prasada RAO
School of Economics, University of Queensland, St Lucia, QLD 4072, Australia ([email protected])

Duangkamon CHOTIKAPANICH
Department of Econometrics and Business Statistics, Monash University, Caulfield East, VIC 3145, Australia
([email protected])
We develop a general approach to estimation and inference for income distributions using grouped or
aggregate data that are typically available in the form of population shares and class mean incomes, with
unknown group bounds. We derive generic moment conditions and an optimal weight matrix that can be
used for generalized method-of-moments (GMM) estimation of any parametric income distribution. Our
derivation of the weight matrix and its inverse allows us to express the seemingly complex GMM objective
function in a relatively simple form that facilitates estimation. We show that our proposed approach,
which incorporates information on class means as well as population proportions, is more efficient than
maximum likelihood estimation of the multinomial distribution, which uses only population proportions.
In contrast to the earlier work of Chotikapanich, Griffiths, and Rao, and Chotikapanich, Griffiths, Rao, and

Valencia, which did not specify a formal GMM framework, did not provide methodology for obtaining
standard errors, and restricted the analysis to the beta-2 distribution, we provide standard errors for
estimated parameters and relevant functions of them, such as inequality and poverty measures, and we
provide methodology for all distributions. A test statistic for testing the adequacy of a distribution is
proposed. Using eight countries/regions for the year 2005, we show how the methodology can be applied
to estimate the parameters of the generalized beta distribution of the second kind (GB2), and its special-case
distributions, the beta-2, Singh–Maddala, Dagum, generalized gamma, and lognormal distributions. We
test the adequacy of each distribution and compare predicted and actual income shares, where the number
of groups used for prediction can differ from the number used in estimation. Estimates and standard errors
for inequality and poverty measures are provided. Supplementary materials for this article are available
online.
KEY WORDS: GMM; Generalized beta distribution; Inequality and poverty.

1.

INTRODUCTION

The estimation of income distributions has played an important role in the measurement of inequality and poverty and,
more generally, in welfare comparisons over time and space.
Access to what is a vast literature on the modeling of income

distributions, the characteristics of different specifications, and
various estimation methods, is conveniently achieved through a
volume by Kleiber and Kotz (2003), the collection of articles in
Chotikapanich (2008), and articles by Bandourian, McDonald,
and Turley (2003) and McDonald and Xu (1995).
For carrying out large-scale investigations that involve many
countries, different time periods, and the estimation of regional
and global income distributions (see, e.g., Bourguignon and
Morrisson (2002), Milanovic (2002), and Chotikapanich
et al. (2012)), compilation of the necessary country-specific
income distribution data is a major research problem. The

data are generally drawn from household expenditure and
income surveys that are conducted once in 5 years in most
countries. Because compilation of data and data dissemination
from these surveys are resource intensive, much of the raw
data are not readily available for researchers. More regularly
disseminated data take the form of summary statistics that
include mean income, measures of inequality such as the
Gini coefficient, and grouped data in the form of income and

population shares; see, for example, the World Bank (http://
go.worldbank.org/6F2DBUXBE0) and the World Institute for
Development Economics Research (WIDER) (http://www.

563

© 2012 American Statistical Association
Journal of Business & Economic Statistics
October 2012, Vol. 30, No. 4
DOI: 10.1080/07350015.2012.707590

Downloaded by [Universitas Maritim Raja Ali Haji] at 22:51 11 January 2016

564

wider.unu.edu/research/Database/en_GB/wiid/). The focus of
this article is the provision of econometric methodology for fitting income distributions to limited country-level data of this
form. Specifically, our objective is to develop a framework for
generalized method-of-moments (GMM) estimation of income
distributions, when the available data are in the form of population shares and group mean incomes, with unknown group

limits. When data are available as population and income shares,
group mean incomes for a country can be computed from readily available data on the country’s mean per capita income.
To achieve comparability across countries and over time, mean
incomes that have been adjusted for purchasing power parity
are available from the World Bank and the Penn World Tables
(http://pwt.econ.upenn.edu/php_site/pwt_index.php).
A limited solution to this problem was offered by
Chotikapanich, Griffiths, and Rao (2007) (hereafter CGR), who
suggested a method-of-moments estimator for the beta-2 distribution, implemented through nonlinear least squares. They
applied it to a sample of eight countries in two time periods
and illustrated how the estimated distributions can be combined
to derive a regional distribution, find Lorenz curves, and measure inequality. In a more extensive study, Chotikapanich et al.
(2012) used the same technique to estimate the global income
distribution as a mixture of beta-2 distributions for 91 countries
in 1993 and 2000. Chotikapanich, Griffiths, and Rao (2007) and
Chotikapanich et al. (2012) described the main features of their
approach and showed that it works well, but their method was
deficient in several respects. Their estimator was not an optimal GMM estimator and was not set up within a formal GMM
framework. They used an arbitrarily specified weight matrix,
and because of the lack of an asymptotic covariance matrix for

the estimator, they did not provide any standard errors. Also,
their methodology was limited to fitting the beta-2 distribution.
This article represents a major step forward from this previous
work. We establish moment conditions in a general form and
construct an optimal weight matrix, leading to a GMM estimator that is asymptotically efficient. Formally incorporating the
optimal weight matrix into GMM estimation makes it possible
to find the asymptotic covariance matrix of the estimator, from
which we can find standard errors for the estimated parameters
and for functions of the parameters used regularly in the area of
income distributions, such as inequality and poverty measures.
Also, our framework can be used to fit any income distribution,
not just the beta-2 distribution considered by CGR. Because
the GMM estimator uses information on both population shares
and class means, we are able to show that the covariance matrix for the GMM estimator is less than that for the maximum
likelihood (ML) estimator applied to a multinomial likelihood
function that uses only the population shares. By deriving both
the weight matrix and its inverse, we are able to show how
the seemingly complex GMM objective function can be written in a relatively simple form that is potentially amenable for
use in standard software. To check the adequacy of an income
distribution, we propose a test based on the standard J-statistic

used for testing the validity of excess moment conditions. In the
empirical work reported in this article, we focus on the generalized beta distribution of the second kind (GB2 distribution) and
its popular special cases—the beta-2, Singh–Maddala, Dagum,
generalized gamma, and lognormal distributions. We show how

Journal of Business & Economic Statistics, October 2012

goodness of fit can be assessed using a comparison of predicted
and observed income shares, where the number of groups used
for prediction can differ from that used for estimation. We also
illustrate how estimates of the parameters and their covariance
matrix can be used to plot income distributions, along with
their confidence bounds, and to compute inequality and poverty
measures, along with their standard errors.
Our work differs from related work by Wu and Perloff (2005,
2007), who also considered estimation of income distributions
from grouped data. Wu and Perloff approximated the underlying
income distribution with a maximum entropy density, estimated
using a two-step GMM estimator with a simulated weight matrix. In this article, we show how the optimal weight matrix can
be expressed in terms of the moments and moment distribution functions of any income distribution; then, in our empirical

work, we estimate the GB2 as a general and flexible class of income distributions. Knowing the parametric specification means
we are able to specify the optimal weight matrix as a function
of the unknown parameters and obtain optimal GMM estimates
in one step. Other distinguishing features of our work are our
emphasis on estimation of class limits, and the asymptotic covariance matrix that can be used to find standard errors for all
estimated parameters (including the class limits) and functions
of those parameters.
The article is organized as follows. In Section 2, the GMM
methodology is described in general for estimating the parameters of any income distribution and the moment conditions are
also specified. For the derivation of the inverse weight matrix,
we refer the readers to the Appendix (provided at the end of
the article). In Section 3, we summarize the expressions that
are needed for GMM estimation of the GB2, beta-2, Singh–
Maddala, Dagum, generalized gamma, and lognormal distributions. These expressions include the moments, distribution
functions, and first-moment distribution functions. The derivatives for computing the GMM asymptotic covariance matrix
can be found in Appendix B. Expressions for inequality and
poverty measures are also provided. Section 4 contains a description of the data used to illustrate the theoretical framework.
We selected eight countries/areas for the year 2005: China rural,
China urban, India rural, India urban, Pakistan, Russia, Poland,
and Brazil. The results presented in Section 5 include parameter

estimates and their standard errors, plots of income densities
and their confidence bounds, goodness-of-fit assessment, and
inequality and poverty measures. Concluding remarks are provided in Section 6.
2.

GMM ESTIMATION

Consider a sample of T observations (y1 , y2 , . . . , yT ),
randomly drawn from a parametric income distribution
f (y; φ), and grouped into N income classes (z0 , z1 ), (z1 , z2 ),
. . . , (zN−1 , zN ), with z0 = 0 and zN = ∞, and with ci being
the proportion of observations in the ith group. In addition to
c′ = (c1 , c2 , . . . , cN ), data generally available from the grouping
are the class mean incomes (ȳ1 , ȳ2 , . . . , ȳN ) or the income shares
(s1 , s2 , . . . , sN ). If the sample mean ȳ is available, class mean
incomes can be obtained from income shares using ȳi = si ȳ/ci .
Given available data on the ȳi and the ci , but not the zi , our
problem is to estimate φ, along with the unknown class limits

Hajargasht et al.: Inference for Income Distributions Using Grouped Data


z1 , z2 , . . . , zN−1 . To tackle this problem using GMM estimation,
we create a set of sample moments
H(θ) =

T
1 
h (yt , θ)
T t=1

(1)

such that E [H(θ)] = 0, where θ = (z1 , z2 , . . . , zN−1 , φ′ )′ . The
GMM estimator θ̂ is defined as

Downloaded by [Universitas Maritim Raja Ali Haji] at 22:51 11 January 2016

θ̂ = arg minθ H(θ)′ W H(θ),

(2)

where W is a weight matrix. In what follows, we first consider
specification of the sample moments H(θ) and then derivation of the weight matrix. We assume that the class limits zi
are unknown parameters to be estimated because grouped data
available from the World Bank and the WIDER do not include
these limits. If the class limits are available, H(θ) and W do not
change, and (2) is minimized with respect to φ instead of θ.
2.1 Specification of the Moments
To set up the moments corresponding to the sample proportions ci , we note that the corresponding population proportions,
which we denote by ki (θ), are given by
 zi
 ∞
ki (θ) =
gi (y)f (y; φ)dy
f (y; φ)dy =
0

zi−1

= E[gi (y)]

i = 1, 2, . . . , N,

(3)

where gi (y) is an indicator function such that

1 if zi−1 < y ≤ zi
.
gi (y) =
0 otherwise

= E[ygi (y)]

i = 1, 2, . . . , N.

(5)

Thus, we specify the remaining elements in H(θ) as
T
1 
yt gi (yt )−E [ygi (y)] = ỹi −µi (θ)
T t=1

i = 1, 2, . . . , N.
(6)

These moments are different from those used by CGR. They
used ȳi − µi (θ)/ki (θ) instead of ỹi − µi (θ). The problem with
ȳi − µi (θ)/ki (θ) is that



T −1 Tt=1 yt gi (yt )
µi (θ)
=
E(ȳi ) = E
,
T
−1
ki (θ)
T
t=1 gi (yt )

although it is true that plim (ȳi ) = µi (θ)/ki (θ). Also, derivation
of the corresponding weight matrix for ỹi − µi (θ) is straightforward, whereas the weight matrix for ȳi − µi (θ)/ki (θ) is more
difficult to derive.
Collecting all the terms in (4) and (6), we can write


g1 (yt ) − k1 (θ)


..


.




⎢ gN−1 (yt ) − kN−1 (θ) ⎥


(7)
h (yt , θ) = ⎢

⎢ yt g1 (yt ) − µ1 (θ) ⎥




..


.



c1 − k1 (θ)


..




.




T
⎢ cN−1 − kN−1 (θ) ⎥
1 

H(θ) =
h (yt ; θ) = ⎢

⎢ ỹ1 − µ1 (θ)
T t=1




..




.


ỹN − µN (θ)


Ti
= ci ,
gi (yt ) =
T
t=1

where Ti is the number of observations in group i. Thus, for the
proportion of observations in each group, we can specify the
first N − 1 elements in H(θ) as
T
1 
gi (yt ) − E [gi (y)] = ci − ki (θ)
T t=1

(4)

The moment for i = N is omitted because the result

N
N
i=1 ki (θ) =
i=1 ci = 1 makes one of the N conditions redundant.
To obtain moments that use the information in the class
mean incomes ȳi , it turns out to be convenient to consider ỹi
defined as
ỹi = ci ȳi =

0

zi−1

and

T


i = 1, 2, . . . , N − 1.

Its corresponding population quantity is
 zi
 ∞
ygi (y)f (y; φ) dy
yf (y; φ)dy =
µi (θ) =

yt gN (yt ) − µN (θ)

The sample moment corresponding to E [gi (y)] = ki (θ) is
1
T

565

T
T
1 
Ti 1 
yt gi (yt ).
yt gi (yt ) =
T Ti t=1
T t=1


N
Since N
i=1 ỹi =
i=1 ci ȳi = ȳ, we interpret ỹi as that part of
sample mean income ȳ that comes from the ith income class.

=



c−N − k−N
ỹ − µ



,

(8)

where c, k, ỹ, and µ are N-dimensional vectors containing
the elements ci , ki , ỹi , and µi , respectively, and the subscript
“−N” denotes a corresponding (N − 1)-dimensional vector,
with the last element excluded. Also, we drop the argument θ
from these vectors to ease notation. If K is the dimension of
φ (the number of unknown parameters in the income density),
then there are 2N − 1 moment conditions and N + K − 1
unknown parameters.
For computational purposes, it is typically more convenient
to express ki (θ)
and µi (θ) in terms of distribution functions.

If µ = E(y) = 0 yf (y; φ)dy is the mean of y, F (y; φ) is its

566

Journal of Business & Economic Statistics, October 2012

Table 1. Moments, distribution functions, and Gini coefficients for income distributions
Density function

Downloaded by [Universitas Maritim Raja Ali Haji] at 22:51 11 January 2016

GB2

ay ap−1
=

p+q
bap B (p, q) 1 + (y/b)a

f (y; φ)

Singh–Maddala
(p = 1)

f (y; φ) =

Dagum (q = 1)

f (y; φ) =

Lognormal

µ(k)

f (y; φ)

Beta-2 (a = 1)

Generalized
gamma

Moments

y p−1
= p
b B (p, q) (1 + (y/b))p+q
ba

aqy a−1
1+q

1 + (y/b)a



1 + (y/b)a

p+1

a
f (y; φ) = ap
y ap−1
a 
 β Ŵ(p)
y
× exp −
β
1
f (y; φ) = √
 y 2π σ 2 
(ln y − µ)
× exp −
2σ 2

1
µ

Fk (y; φ)
= Bu (p + k/a, q − k/a),
with


u = (y/b)a / 1 + (y/b)a

Integral evaluated
numerically

Fk (y; φ)
= Bu (p + k, q − k),
with
u = (y/b)/[1 + (y/b)]

G=

Fk (y; φ)
= Bu (1 + k/a, q − k/a)
with


u = (y/b)a / 1 + (y/b)a

G=1
Ŵ (q) Ŵ (2q − 1/a)

Ŵ (q − 1/a) Ŵ (2q)

Fk (y; φ)
= Gu (p + k/a, β a )

Integral evaluated
numerically

µ(k)

F (y; φ)
bk Ŵ (1 + k/a , q − k/a)
Ŵ (q)

µ(k)
bk Ŵ (p + k/a , 1 − k/a)
=
Ŵ (p)
β k Ŵ (p + k/a)
Ŵ(p)



k2 σ 2
µ(k) = exp kµ +
2

F (y; φ)

 y −a −p
= 1+
b

F (y; φ)
=

y

tf (t; φ)dt


 y a −q
=1− 1+
b

F (y; φ) = Gu (p, β a ),
with u = y a

distribution function, and
F1 (y; φ) =

F (y; φ) = Bu (p, q),
with


u = (y/b)a / 1 + (y/b)a
F (y; φ) = Bu (p, q),
with
u = (y/b)/[1 + (y/b)]

µ(k) =



Moment distribution functions

µ = bp/(q − 1)
µ(2) = bp(p + 1)/(q − 1)(q − 2)

=

apy ap−1
bap

bk B (p + k/a , q − k/a)
=
B (p, q)

Distribution function

(9)

0

is its first moment distribution function, then from (3), (5), and
(9),
ki (θ) = F (zi ; φ) − F (zi−1 ; φ),

(10)

µi (θ) = µ (F1 (zi ; φ) − F1 (zi−1 ; φ)) .

(11)

In Section 3, we describe Table 1, which contains explicit
expressions for µ, F (zi ; φ), and F1 (zi ; φ) for the GB2 distribution, and its special cases—the beta-2, Singh–Maddala,
Dagum, generalized gamma, and lognormal distributions. Inserting these expressions into the moments in (7) and (8), and
including expressions for the elements of the weighting matrix
that we consider in the next section, makes the GMM estimator
operational.
2.2 Derivation of Weight Matrix
The simplest weighting matrix is that where W = I. However,
since the last N moments involving the class means are of a much
higher order of magnitude than the first N − 1, which involve
proportions, setting W = I gives an undesirably large relative
weight to the last N terms. These terms will tend to dominate
the estimation procedure, and as noted by CGR, this can lead to
perverse outcomes where ẑi−1 > ẑi for some i. To ensure both
sets of moments were on a similar scale, CGR used a diagonal
−2
weighting matrix, with diagonal elements (c1−2 , c2−2 , . . . , cN−1
,
−2
−2
−2
ȳ1 , ȳ2 , . . . , ȳN ). The estimator from this weight matrix minimizes the sum of squares of percentage errors, but the weight
matrix is not optimal. Its diagonal elements are not equal to
the inverses of the variances of the moments, and it ignores

ln(y) − µ
σ



Gini coefficient

2B (2p, 2q − 1)
pB 2 (p, q)

Fk (y; φ)
= Bu (p + k/a, 1 − k/a),
with


u = (y/b)a / 1 + (y/b)a

G=
Ŵ (p) Ŵ (2p + 1/a)
Ŵ (p + 1/a) Ŵ (2p)
−1

Fk (y; φ)


G = 2

=

ln(y) − µ − kσ
σ

2



σ

2



−1

correlations between moments. Deriving an optimal weight matrix is crucial for deriving an asymptotically efficient estimator
and facilitates derivation of standard errors for the parameter
estimates.
The optimal weight matrix is given by
−1

T
1 

,
(12)
h (yt , θ) h (yt , θ)
W = plim
T t=1
where W−1 is traditionally estimated from
Ŵ−1 =

T
1 
h(yt , θ̂)h(yt , θ̂)′ ,
T t=1

(13)

with θ̂ being a first-step estimator obtained by minimizing
H(θ)′ WH(θ) for a prespecified W. The estimator Ŵ depends
on both the sample data and estimates of the parameters θ̂. It
turns out not to be 
suitable for our problem because it contains
terms of the form Tt=1 yt2 gi (yt ), which are not available from
the grouped data. However, instead of using (13), we can take
the probability limit in (12) and obtain a result that depends only
on the unknown parameters, not on the sample data. From the
Appendix, we find
W−1 = plim


T
1 
h(yt , θ)h(yt , θ)′
T t=1

D(k−N )


⎢
D(µ−N )
=⎢

0′ N−1




k−N
µ





[ D(µ−N )

k′ −N

(2)

D(µ )


µ′ ,

0N−1 ]





(14)

Hajargasht et al.: Inference for Income Distributions Using Grouped Data

where 0N−1 is an (N − 1)-dimensional vector of zeros, D(x)
denotes a diagonal matrix with elements of a vector x on the
(2)
(2) ′
diagonal, and µ(2) = (µ(2)
1 , µ2 , . . . , µN ) , with
 zi
µ(2)
(θ)
=
y 2 f (y; φ)dy
i
zi−1

Downloaded by [Universitas Maritim Raja Ali Haji] at 22:51 11 January 2016

=




0

2

y gi (y)f (y; φ) dy = E[y gi (y)]. (15)

−D(µ−N /v−N ) (µN /vN ) ιN−1

⎢ + µ(2) /v ι ι′

N N−1 N−1
W=⎢
 N

⎣ −D µ−N /v−N
D (k−N /v−N )
(µN /vN ) ι′ N−1

0

0
kN /vN



⎥,



2
where vi = ki µ(2)
i − µi and v = (v1 , v2 , . . . , vN ) . Then, using
direct matrix multiplication, we can show that the GMM objective function simplifies to

′ 

c−N − k−N
c−N − k−N

W
GMM = H WH =
ỹ − µ
ỹ − µ

=

N


w1i (ci − ki )2 +

−2

N


i=1

i=1

z
where F2 (zi ; φ) = 0 i y 2 f (y; φ)dy/µ(2) is the second moment
distribution function and µ(2) = E(y 2 ) is the second moment for
y. Expressions for F2 (zi ; φ) and µ(2) , for a number of distributions, are provided in Table 1 and described further in Section 3.
2.3

Empirical Implementation of GMM Estimation

2

For minimizing the GMM objective function in (2), we require W, not W−1 . For given values (or estimates) of θ, W
can be readily found by numerically inverting W−1 . However,
an analytical expression for W is useful because it allows us
to simplify the GMM objective function, gain insights into the
minimization process, and improve computational efficiency. In
the supplementary Appendix A, we show that

⎡  (2)

D µ−N /v−N

567

N

i=1

We consider three GMM estimators. The first, θ̂CGR , is
the CGR estimator modified to suit our specification of the
moments. It minimizes (16), but with weights w1i = 1/ci2 ,
w2i = 1/ỹi2 , and w3i = 0. We use WCGR
(2N) to denote the corresponding weight matrix. In this case, the GMM estimator
minimizes the sum of squares of the percentage errors for the
moments. The second estimator is a two-step estimator that
uses θ̂CGR to compute an estimate of the optimal weight matrix,
W(2N) (θ̂CGR ). In our empirical work, we used an iterative version of this estimator, updating W(2N) after each iteration, and
iterating until convergence, with convergence being achieved
within 10–20 iterations. The third estimator is a one-step estimator obtained by minimizing the complete objective function
with respect to θ, where both the moments and W(2N) are functions of θ. Hansen, Heaton, and Yaron (1996) referred to the
one-step estimator as the “continuous-updating estimator.” Although the iterative and continuously updating estimators have
the same asymptotic distribution, they are typically different in
finite samples (Hall 2005, p. 103).
The three estimators are given by
θ̂CGR = arg minθ H(2N) (θ)′ WCGR
(2N) H(2N) (θ),
θ̂2−STEP = arg minθ H(2N) (θ)′ W(2N) (θ̂CGR )H(2N) (θ),

w2i (ỹi − µi )2

w3i (ci − ki )(ỹi − µi ),

θ̂1−STEP = arg minθ H(2N) (θ)′ W(2N) (θ)H(2N) (θ).
(16)
2.4

where w1i = µ(2)
i /vi , w2i = ki /vi , and w3i = µi /vi . The advantages of expressing the GMM objective function in this way
are its potential suitability for use in standard software, and that
it can be conveniently written as a function of 2N rather than
2N − 1 moments, with a weight matrix that consists of four
diagonal blocks. Specifically, we can write
GMM = H′ (2N) W(2N) H(2N)
′ 



c−k
D(w1 ) −D(w3 )
c−k
, (17)
=
ỹ − µ
ỹ − µ
−D(w3 ) D(w2 )
where w1 , w2 , and w3 are N-dimensional vectors containing the
elements w1i , w2i , and w3i , respectively; the subscript “(2N)”
has been used to describe this alternate specification, where it
is no longer necessary to drop a redundant moment. In this formulation, there are no cross-product terms involving moments
from different groups, but there is a cross-product term between
the proportion and mean moments for the ith group.
The extra quantities needed to compute the weights (that
were not also needed to compute the moments) are the µ(2)
i . It
is convenient to write
(2)
µ(2)
i = µ [F2 (zi ; φ) − F2 (zi−1 ; φ)] ,

(18)

Asymptotic Covariance Matrix and Relative
Efficiency

The asymptotic covariance matrix of the one- and two-step
estimators is (see, e.g., Cameron and Trivedi 2005, p. 176)
var(θ̂) = T −1 (G(θ)′ W(θ) G(θ))−1 , where G = ∂H/∂θ′ . It is
straightforward, although tedious, to show that



 



∂H ′
∂H(2N)
∂H(2N) ′
∂H
=
.
W
W
(2N)
∂θ′
∂θ′
∂θ′
∂θ′
Thus, the covariance matrix can also be written in terms of the
complete set of 2N moments and corresponding weight matrix
as
var(θ̂) =

1
(G(2N) (θ)′ W(2N) (θ) G(2N) (θ))−1 ,
T

(19)

where G(2N) = ∂H(2N) /∂θ′ . An estimate of var(θ̂) is obtained
by replacing θ by θ̂. In our empirical work, we successfully
used both analytical and numerical derivatives to calculate the
elements of the G(2N) matrix. Expressions for obtaining some
analytical derivatives for a variety of distributions are given in
the supplementary Appendix B. Also, some Monte Carlo experiments were conducted to assess the accuracy of standard errors
computed from (19). The Monte Carlo average of the variance

568

Journal of Business & Economic Statistics, October 2012

Downloaded by [Universitas Maritim Raja Ali Haji] at 22:51 11 January 2016

estimates from (19) was close to both the true asymptotic variance from (19) and the Monte Carlo variance of the estimates θ̂.
One common way of estimating income distributions is via
ML estimation using a multinomial likelihood function, applied
only to the sample proportions; see, for example, McDonald
(1984) and Bandourian, McDonald, and Turley (2003). To show
that this estimator is less efficient than the GMM estimator that
use information on both sample proportions and class mean
incomes, we rewrite (19) as



D(µ(2) /v) −D (µ/v)
∂k′ ∂µ′
1
var(θ̂) =
T
∂θ ∂θ
−D (µ/v)
D (k/v)

−1
∂k/∂θ′
,
(20)
×
∂µ/∂θ′
where we use D (x/s) to denote a diagonal matrix with diagonal (x1 /s1 , x2 /s2 , . . . , xN /sN ), for any two vectors x and s.
Expanding (20) yields

∂k′
1 ∂k′
∂k
∂µ
var(θ̂) =
D(µ(2) /v) ′ − 2
D (µ/v) ′
T ∂θ
∂θ
∂θ
∂θ
−1

∂µ
∂µ
D (k/v)
+
.
(21)
∂θ
∂θ
Using the definition of v and some algebraic manipulation,
this expression can be written as


 ′
∂k
∂k′
1 ∂k′
∂µ
D (1/k) ′ +

D (µ/k)
var(θ̂) =
T ∂θ
∂θ
∂θ
∂θ
−1

∂k
∂µ
(µ/k)
. (22)

D
× D (k/v)
∂θ′
∂θ′
By setting up the multinomial likelihood function and deriving the information matrix, it can be shown that the covariance
matrix for the ML estimator based on only proportions is (see,
e.g., Cox 1984)


∂k −1
1 ∂k′
D (1/k) ′
.
(23)
var(θ̂ML ) =
T ∂θ
∂θ
Since the second term in (22) is positive definite, it follows
that var(θ̂ML ) − var(θ̂) is positive definite, and hence, the GMM
estimator that uses both sample proportions and class mean
incomes is more efficient than the ML estimator that uses only
sample proportions.

Under the null hypothesis that the moment conditions are
correct plim H(θ̂) = 0 and
d

APPLICATION TO MAJOR INCOME
DISTRIBUTIONS

A large number of probability density functions (pdfs) have
been suggested in the literature for modeling income distributions; see, for example, McDonald and Ransom (1979),
McDonald (1984), McDonald and Xu (1995), Bandourian,
McDonald, and Turley (2003), and Kleiber and Kotz (2003). To
illustrate the methodology, we have chosen the four-parameter
generalized beta distribution of the second kind (GB2) introduced by McDonald (1984), and some of its popular special case
distributions—the beta2, Dagum, Singh–Maddala, generalized
gamma, and lognormal distributions. The GB2 distribution has
analytical properties that make it well suited to the analysis of
income distributions (Parker 1999), and as we will see, it provides a very good fit to the observed data [see also, Bordley,
McDonald, and Mantrala (1996) and Bandourian, McDonald,
and Turley (2003)].
3.1 Moments and Moment Distribution Functions
To specify the moment conditions and weight matrix for each
of the distributions, we need their first and second moments, their
cumulative distributions functions (cdfs), and their first and second moment distribution functions, all expressed as computable
functions of the parameters. These required quantities are summarized in Table 1. They can be found in various places: two
convenient sources are McDonald (1984) and Kleiber and Kotz
(2003).
The beta-2, Singh–Maddala, and Dagum distributions are
readily obtained from the more general GB2 distribution by
setting a = 1, p = 1, and q = 1, respectively. The generalized gamma distribution can be obtained as a special case of
the GB2 distribution by setting b = βq 1/a and taking the limit
as q → ∞; the lognormal distribution can be obtained as a
special case of the generalized gamma distribution by setting
β a = σ 2 a 2 , p = (aµ + 1)/β a , and taking the limit as a → 0
(McDonald 1984). In Table 1, B(·,
·) is the beta function, Ŵ(·) is
u
the gamma function, Bu (p, q) = 0 t p−1 (1 − t)q−1 dt/B(p, q)
is the cdf for a standard beta
random variable defined on
u/b
the (0, 1) interval, Gu (p, b) = 0 t p−1 e−t dt/Ŵ(p) is the cdf
for a standard gamma random variable with parameters p
and b, and φ(·) is the cdf for a standard normal random
variable.
3.2 Inequality and Poverty Measures

2.5 Testing the Validity of an Income Distribution

2
.
J = T H(θ̂)′ W(θ̂) H(θ̂) −→ χN−K

3.

(24)

In traditional GMM estimation, this test statistic is used to
assess whether excess moment conditions are valid. In our case,
we assume a particular form of parametric income distribution
and use its properties to construct both the moment conditions
and the weight matrix. If the assumed distribution is invalid, then
plim H(θ̂) = 0 and plimW(θ̂) = W(θ). Thus, the J statistic can
be used to test the validity of the assumed income distribution.

Income distributions are often estimated to assess inequality
and poverty. To illustrate, we consider two inequality measures,
the Gini and the Theil coefficient, and two poverty measures,
the headcount ratio and the Foster–Greer–Thorbecke measure
with an inequality aversion parameter of 2 (FGT(2)) (Foster,
Greer, and Thorbecke 1984). In our empirical work, these
quantities were computed for the GB2, beta-2, Singh–Maddala,
and Dagum distributions.
The Gini coefficient is given by
 ∞
2
y F (y; φ) f (y; φ) dy.
(25)
G = −1 +
µ 0

Hajargasht et al.: Inference for Income Distributions Using Grouped Data

Expressions for G in terms of the parameters are given in
Table 1 for all distributions except the GB2 and generalized
gamma. Lengthy expressions written in terms of hypergeometric functions are available for these latter two distributions
(McDonald 1984; McDonald and Ransom 2008). However, using MATLAB, we found that numerical evaluation of the integral
in (25) was more efficient and reliable than computing values
for the hypergeometric functions.
The Theil inequality coefficient is given by
 ∞    
y
y
ln
f (y; φ) dy,
(26)
T =
µ
µ
0
and for the GB2 distribution, it can be written as (McDonald
and Ransom 2008)
1
[ψ (p + 1/a) − ψ (q − 1/a)] + ln (b/µ) , (27)
a
where ψ(t) = d log Ŵ(t)/dt is the digamma function. Values
of T for the beta-2, Singh–Maddala, and Dagum distributions
are obtained from (27) by setting a = 1, p = 1, and q = 1,
respectively.
For a given poverty line x, the headcount ratio is the proportion of population with incomes less than x, and so, for
all distributions, it is simply given by F (x; φ) . The measure
FGTx (2) considers not just the proportion of poor but also how
far the poor are below the poverty line. It is defined as

 x 
x−y 2
FGTx (2) =
f (y)dy
x
0

Downloaded by [Universitas Maritim Raja Ali Haji] at 22:51 11 January 2016

T =


µ(2)
F1 (x; φ) + 2 F2 (x; φ). (28)
x
x
The quantities necessary for computing (28) are given in
Table 1.
= F (x; φ) −

569

other regions into 12 groups. Having 12 groups for all regions
has the advantage of uniformity for estimation, and it provides
an opportunity for checking the ability of the estimated model
to predict income shares for groups other than those used for
estimation, a procedure that we consider in Section 5. The population proportions in each region were not identical, but in most
cases, they were approximately 0.05 for the first and last two
groups, and 0.1 for the remaining groups.
Also available from the website is each region’s mean income
ȳ, found from the same surveys used to compute the ci and
si , and then converted using a 2005 purchasing-power-parity
exchange rate. To use the methodology described in Sections 2
and 3, we need the data on ỹi , defined as ỹi = ci ȳi = si ȳ. For
computing standard errors and the J statistic, we also need the
sample sizes T for each of the surveys. Unfortunately, although
the website provides data on the population size of each region,
it does not have comprehensive data on the sample sizes T for
each of the surveys. For the results reported in Tables 2 and
3, we use T = 10,000. This is a conservative value since most
surveys have sample sizes that are much larger. If standard errors
or J-statistics for other sample sizes are of interest, they can be
obtained from our results by multiplying by the appropriate
scaling factor.
5.

Our presentation and discussion of the results begins in Section 5.1, where we consider the estimated income distributions
for the eight regions. Goodness of fit is assessed in Section 5.2,
using J-statistics and a comparison of predicted and observed
income shares. Levels of inequality and poverty obtained from
the different distributions are reported in Section 5.3.
5.1

4.

DESCRIPTION OF DATA AND SOURCES

To illustrate the methodology described in Sections 2 and
3, we use income distribution data (provided in grouped
form) from the PovcalNet website (http://iresearch.worldbank.
org/PovcalNet/index.htm?0,3) developed by the World Bank
Poverty Research Group. This database is set up for the purpose of poverty assessment for individual countries, regions,
and the world over. The data are available for developing countries for a number of years, ranging from 1981 to 2005. The
version of the data we use was updated in August 2008 to incorporate 2005 purchasing power parity estimated by the World
Bank International Comparison Program. To use a reasonably
diverse cross-section of countries to test the performance of the
estimator, we chose as examples Brazil, China, India, Pakistan,
Russia, and Poland for the year 2005. Separate data are available
for rural and urban regions in India and China, making a total of
eight different datasets. We will refer to each dataset as coming
from a region, where a region can be a country, or rural or urban
China or India.
For most of the chosen regions, population shares ci and the
corresponding income shares si were available for 20 groups.
Exceptions were India rural and India urban, which each had
12 groups, and China rural, which had 17 groups. In line with
India rural and India urban, we aggregated the data from the

EMPIRICAL ANALYSIS

Country-Specific Income Distributions

Table 2 contains the estimated class limits and parameters
of the GB2 distribution obtained using the GMM estimation
procedure outlined in Sections 2 and 3. For each region, we
report three different sets of estimates—those from the CGR,
two-stage, and one-stage estimations. Standard errors for both
the two-stage and the one-stage estimates are also reported.
There are no dramatic differences between the estimates from
the three different estimations. The two-stage and one-stage estimates are almost identical, particularly for the class limits, and
the CGR estimates are only slightly different. Similarly, the standard errors obtained from two-stage estimation are very close
to those obtained from one-stage estimation. The magnitudes of
the standard errors are relatively small, particularly those for the
class limits. In all cases, the hypothesis H0 : q = 1 is rejected
at a 5% significance level, suggesting that a Dagum distribution
is not adequate. A similar test for H0 : a = 1 suggests that the
beta-2 distribution would be adequate for India urban, Russia,
and Poland, while India urban is the only region where the
Singh–Maddala distribution (H0 : p = 1) is not rejected. More
light is shed on this issue when we examine goodness of fit.
To save space, estimates and standard errors for distributions
other than the GB2 are not reported; they are available from
the authors upon request. Estimates of the zi and their standard
errors were similar for all distributions. Standard errors for the

570

Journal of Business & Economic Statistics, October 2012

Table 2. Estimated coefficients from GB2 distributions

Downloaded by [Universitas Maritim Raja Ali Haji] at 22:51 11 January 2016

China rural

China urban

CGR

Two-step

SE

One-step

SE

CGR

Two-step

SE

One-step

SE

z1
z2
z3
z4
z5
z6
z7
z8
z9
z10
z11
b
p
q
a

22.334
27.972
33.607
36.411
41.893
47.344
55.631
69.582
83.573
112.010
140.940
32.887
3.774
1.761
1.666

22.437
27.882
33.476
36.263
41.828
47.403
55.743
69.710
83.600
111.320
139.820
25.098
6.094
2.209
1.410

0.108
0.079
0.061
0.060
0.072
0.082
0.110
0.153
0.213
0.387
0.756
4.888
1.954
0.407
0.176

22.438
27.882
33.476
36.263
41.828
47.403
55.743
69.710
83.600
111.320
139.820
25.377
5.970
2.180
1.422

0.108
0.079
0.061
0.060
0.072
0.082
0.110
0.153
0.213
0.387
0.757
4.808
1.884
0.398
0.175

52.402
65.688
85.274
102.100
118.740
136.760
157.800
184.610
223.940
298.540
386.530
122.260
1.371
1.217
2.424

52.557
65.016
84.522
101.420
118.600
137.190
158.460
185.380
223.490
293.870
381.070
107.220
2.437
1.810
1.817

0.262
0.209
0.202
0.197
0.210
0.239
0.292
0.393
0.614
1.180
2.657
4.981
0.409
0.259
0.168

52.562
65.014
84.522
101.420
118.600
137.190
158.460
185.380
223.480
293.880
380.990
107.340
2.366
1.755
1.849

0.262
0.210
0.202
0.197
0.210
0.239
0.292
0.393
0.614
1.181
2.657
4.849
0.389
0.247
0.168

z1
z2
z3
z4
z5
z6
z7
z8
z9
z10
z11
b
p
q
a

20.748
24.284
28.858
32.827
36.611
40.400
45.117
51.502
61.535
80.837
105.730
32.369
1.086
0.547
4.705

20.897
24.183
28.550
32.603
36.623
40.613
45.490
51.836
61.518
79.464
103.040
28.454
2.046
0.814
3.404

India rural
0.075
0.054
0.049
0.046
0.048
0.053
0.066
0.091
0.149
0.297
0.708
1.094
0.308
0.083
0.262

20.899
24.183
28.550
32.603
36.623
40.613
45.490
51.836
61.518
79.463
103.050
28.576
1.982
0.789
3.470

0.075
0.054
0.049
0.046
0.048
0.053
0.066
0.092
0.149
0.297
0.709
1.061
0.294
0.080
0.265

19.668
23.479
29.153
34.692
40.188
46.578
54.746
64.993
81.310
112.840
153.210
9.019
15.470
2.253
1.252

19.871
23.503
28.857
34.404
40.028
46.811
55.319
65.511
81.477
111.860
150.740
2.259
60.168
2.906
1.038

India urban
0.079
0.063
0.065
0.066
0.073
0.088
0.111
0.151
0.249
0.499
1.157
5.589
135.207
0.800
0.204

19.870
23.503
28.857
34.404
40.028
46.811
55.319
65.511
81.477
111.860
150.740
2.265
60.169
2.882
1.042

0.078
0.063
0.064
0.066
0.073
0.088
0.111
0.151
0.249
0.499
1.157
5.587
135.360
0.788
0.204

z1
z2
z3
z4
z5
z6
z7
z8
z9
z10
z11
b
p
q
a

26.890
31.053
36.996
42.141
47.307
52.992
59.760
68.605
81.967
108.540
141.770
33.786
2.647
0.908
2.979

26.788
31.151
37.204
42.335
47.315
52.796
59.415
68.580
82.670
109.080
141.720
39.114
1.562
0.691
3.722

Pakistan
0.100
0.073
0.065
0.061
0.063
0.072
0.091
0.129
0.216
0.440
1.028
1.140
0.206
0.067
0.277

26.788
31.151
37.204
42.335
47.315
52.796
59.415
68.580
82.671
109.080
141.730
39.140
1.555
0.688
3.732

0.100
0.073
0.065
0.061
0.063
0.072
0.091
0.129
0.216
0.440
1.028
1.135
0.204
0.066
0.277

81.818
103.310
137.030
168.190
200.690
237.150
280.880
337.590
420.190
573.990
749.470
70.013
17.791
8.294
0.643

81.272
103.990
137.790
168.520
200.560
237.050
281.450
337.910
419.610
577.570
756.180
174.940
5.418
4.036
1.025

Russia
0.475
0.388
0.381
0.379
0.410
0.476
0.590
0.797
1.257
2.442
5.208
22.264
1.717
1.074
0.161

81.276
103.990
137.790
168.520
200.560
237.050
281.450
337.910
419.610
577.570
756.190
174.940
5.293
3.930
1.039

0.475
0.388
0.381
0.379
0.410
0.476
0.590
0.797
1.258
2.444
5.213
21.735
1.642
1.022
0.160

z1
z2
z3
z4
z5
z6
z7
z8
z9
z10
z11
b
p
q
a

93.757
115.550
148.320
177.540
207.340
240.250
279.230
329.470
402.520
539.020
696.790
170.600
3.588
2.260
1.519

94.504
115.460
146.970
175.990
206.650
240.580
281.510
333.120
403.370
531.820
686.110
133.470
5.990
3.081
1.227

Poland
0.459
0.367
0.356
0.353
0.380
0.437
0.540
0.720
1.098
2.044
4.400
23.178
1.951
0.695
0.176

94.503
115.460
146.970
175.990
206.650
240.580
281.510
333.120
403.370
531.820
686.110
133.470
5.930
3.039
1.235

0.459
0.367
0.356
0.353
0.380
0.437
0.540
0.720
1.098
2.044
4.400
22.926
1.914
0.678
0.176

30.844
45.493
70.465
95.646
124.040
157.780
201.330
261.620
358.960
566.870
854.590
123.720
2.754
2.220
1.001

30.667
46.067
70.652
96.128
125.040
161.170
202.490
256.870
355.000
577.410
892.510
157.830
1.539
1.525
1.311

Brazil
0.298
0.273
0.299
0.323
0.376
0.460
0.584
0.847
1.561
3.804
10.975
8.564
0.240
0.260
0.134

30.671
46.066
70.652
96.128
125.040
161.170
202.490
256.870
355.000
577.450
892.450
158.240
1.533
1.523
1.313

0.297
0.273
0.299
0.323
0.377
0.460
0.584
0.847
1.560
3.804
10.972
8.589
0.239
0.259
0.134

Downloaded by [Universitas Maritim Raja Ali Haji] at 22:51 11 January 2016

Hajargasht et al.: Inference for Income Distributions Using Grouped Data

estimated parameters of the beta-2, Singh–Maddala, and Dagum
distributions (which have parameters in common with the GB2)
were much smaller than those for the GB2, reflecting the drop
from four to three parameters.
In all cases, we computed standard errors using both numerical derivatives and analytical derivatives, and where both sets
were computable, they produced identical results. There were a
few cases where the computation of analytical derivatives failed
[beta-2 estimates for Pakistan and India (rural and urban)] because the hypergeometric function in MATLAB broke down.
These failures corresponded to solutions where p was very large
and b was very small; estimation was unstable, with different
starting values leading to different local minima. Numerical
standard errors could still be found, however. This problem did
not arise with the GB2 and other distributions.
In Figures 1 and 2 in the supplementary Appendix C, we
present a number of graphs. Figure 1 contains graphs of the
estimated GB2 pdfs for China rural and China urban, India
rural and India urban, Brazil, and Poland, along with 95% confidence bounds for these distributions. To find the confidence
bounds, standard errors were computed for the estimated pdfs
at a number of income levels using the covariance matrix of the
parameter estimates, the delta rule, and numerical derivatives.
The narrowness of the confidence bounds suggests that we are
accurately estimating the pdfs, despite relatively large confidence intervals for some of the parameters. A comparison of the
urban and rural pdfs for India and China shows clearly the larger
incomes of the urban populations. In India, it is interesting that
the rural and urban modes are similar, but the urban pdf has
a much fatter tail. Using a population-weighted mixture of the
rural and urban components, in Figure 2, we have graphed the
pdf and cdf for all of China, alongside those for the rural and
urban subpopulations. The modal incomes for rural China and
all of China are similar; including urban China has its biggest
effect on the upper tail of the distribution for all of China. If
more countries are considered, similar mixtures can be obtained
for larger regions, such as continents or the whole world; see,
for example, Chotikapanich et al. (2012).
A potential estimation problem for all distributions other than
the generalized gamma and lognormal is the nonexistence of the
second moment. For the existence of the kth moment, the GB2
distribution requires aq > k. This condition is the same for the
moments of the Singh–Maddala distribution, but becomes q > k
for the beta-2 distribution and a > k for the Dagum distribution. Since the optimal weighting matrix requires the existence
of second-order moments, if the CGR estimates violate one
of these inequalities, we cannot proceed with two-step estima-

571

tion of the offending distribution. Also, our experience suggests
that one-step estimation breaks down. (CGR estimation is still
feasible.) We encountered this problem with Brazil, a country
with relatively high inequality, for estimations with the GB2,
Singh–Maddala, and Dagum distributions, but not the beta-2
distribution. In the results reported in Table 2, we overcame
the problem by minimizing the objective function subject to
the constraint aq > 2. This solution may not be entirely satisfactory. The underlying income distribution may indeed not
have second moments, the standard errors for the boundary solutions that result may not be valid, and inequality appeared
to be underestimated relative to values reported by the World
Bank. The ability of the GB2 distribution and its special cases
to accommodate observed sample moments, particularly skewness and kurtosis, has been explored extensively by McDonald,
Sorenson, and Turley (2012).
We also found that the generalized gamma distribution can be
difficult to estimate. Sometimes, estimation would break down,
particularly when there was a tendency for the estimate for a
to become small. We suspect that small values of a were making calculation of Ŵ (p + 1/a) troublesome. We tried different
parameterizations and different starting values, and in all cases,
managed to get convergence. However, we are not confident that
all our solutions correspond to global minima.
5.2

Goodness-of-Fit Analysis

In this section, we assess the adequacy of the various distributions for modeling the observed population and income shares.
Two criteria are used: (a) the J test to test whether the moment conditions are valid for each of the distributions, and (b) a
comparison of observed and predicted income shares.
Table 3 presents the p-values for the J-statistics calculated for
all distributions considered and for all example regions, obtained
using the two-stage estimates, and assuming T = 10,000. Under the null hypothesis that the moment conditions are correct,
the J-statistic has a χ 2 distribution, with degrees of freedom
equal to the number of excess moment conditions. In the case
of the GB2 distribution, we have 23 moment conditions and
15 parameters, giving degrees of freedom of 8. For the threeparameter distributions, the degrees of freedom are 9, and for the
lognormal, they are 10. At a 5% significance level, the moment
conditions for the GB2 distribution are not rejected for five out
of the eight regions. All the other distributions are rejected for
all other countries with the exception of the beta-2 distribution
for China rural and Russia. These test outcomes are not as pleasing as one might hope. It would have been more satisfactory if

Table 3. p-values from J-statistics

China rural
China urban
India rural
India urban
Pakistan
Russia
Poland
Brazil

GB2

B2

Singh–Maddala

Dagum

Generalized gamma

Lognormal

0.8463
0.0297
0.0944
0.0110
0.1404
0.4973
0.0555
0.0000

0.3142
0.0000
0.0000
0.0141
0.0000
0.5933
0.0424
0.0000

0.0000
0.0000
0.0000
0.0000
0.0042
0.0000
0.0000
0.0000

0.0001
0.0000
0.0121
0.0000
0.0041
0.0000
0.0000
0.0000

0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000

0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000

572

Journal of Business & Economic Statistics, October 2012

Table 4. Observed and estimated percentage shares of income based on GB2
China rural

Downloaded by [Universitas Maritim Raja Ali Haji] at 22:51 11 January 2016

Estimated
0.013
0.052
0.171
1.007
2.105
3.334
2.060
4.604
5.055
7.985
12.726
10.788
8.873
6.902
5.683
4.314
24.328

China urban

1.694
2.192
2.483
2.7