Advances in Water Resources 23 (2000) 571–578

Basin level statistical properties of topographic index for North
America
Praveen Kumar a,*, Kristine L. Verdin b, Susan K. Greenlee b
a Environmental Hydrology and Hydraulic Engineering, Department of Civil and Environmental Engineering, University of Illinois, Urbana, IL 61801, USA
b Earth Resources Observation Systems (EROS) Data Center, Sioux Falls, SD 57198, USA
Received 17 July 1999; accepted 14 October 1999

Abstract
For land–atmosphere interaction studies, several Topmodel-based land-surface schemes have been proposed. For the
implementation of such models at continental (and global) scales, statistical properties of the topographic indices are
derived using GTOPO30 (30-arc-second; 1 km resolution) DEM data for North America. River basins and drainage networks
extracted using this dataset are overlaid on computed topographic indices for the continent and statistics are extracted
for each basin. A total of 5020 basins are used to cover the entire continent, with an average basin size of 3640 km².
Typically, the first three statistical moments of the distribution of the topographic indices for each basin are required
for modeling. Departures of these statistical moments from those obtained using high resolution data have important
implications for the prediction of soil-moisture states in the hydrologic models and consequently for the dynamics of the
land–atmosphere interaction. It is found that a simple relationship between the statistics obtained at the 1 km and 90 m
resolutions can be developed. The mean, standard deviation, skewness, L-scale and L-skewness all show approximately linear
relationships between the two resolutions, making it possible to use the moment estimates from the GTOPO30 data for
hydrologic studies by applying a simple linear downscaling scheme. This significantly increases the utility of the
GTOPO30 datasets for hydrologic modeling studies. © 2000 Elsevier Science Ltd. All rights reserved.

1. Introduction
The treatment of land-surface heterogeneity in land–atmosphere coupled model studies has emerged as a
pressing research issue during the last several years. The concept of topographic index (also called wetness
index), originally proposed in the Topmodel [1], for characterizing the distribution of moisture states in a
basin has gained considerable attention, and successful models based on this concept have been developed for
land–atmosphere interaction studies [3,4,11,15]. For this purpose basins, rather than the rectangular grids
typically used to conform to the atmospheric models, are more appropriate units for modeling the terrestrial
hydrologic processes, as they are better suited to capture the heterogeneity arising from topographic controls
over surface and sub-surface flow. Regions of flow convergence and low vertical soil-moisture deficit are
identified by large values of the topographic index, and low values correspond to uphill areas of flow
divergence and/or high vertical soil-moisture deficit. In Topmodel a hydrologic similarity assumption is
invoked. This states that all locations in a basin with the same topographic index will have the same
hydrologic response. Models that utilize the probability distribution of the topographic index over the basin
to capture this behavior for land–atmosphere interaction studies have been tested and validated for individual
basins or over limited areas [4,15], but have not been implemented in GCMs, among other reasons, for lack of
basin level topographic characteristics with continental scale coverage.
Recently the United States Geological Survey (USGS) has developed a digital elevation model (DEM) at
30-arc-second resolution with global coverage [5]. This enables the efficient estimation of derivative
information such as slope, aspect, hydrologic flow paths, flow accumulation and basin boundaries [17]. The
basins are represented hierarchically at five levels of subdivision, with the average basin size ranging from
2,209,207 km² at Level 1 to 3640 km² at Level 5. These developments pave the way for the implementation of
basin level hydrologic models with continental and global coverage.

* Corresponding author.
E-mail address: [email protected] (P. Kumar).

0309-1708/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0309-1708(99)00049-4


P. Kumar et al. / Advances in Water Resources 23 (2000) 571–578

The objective of this paper is to describe the characteristics of the topographic indices at the basin level
for North America, extracted from the 30-arc-second DEM. Modeling studies using the data are not discussed.
Usually, the first three statistical moments of the distribution of the topographic indices for each basin are
required for modeling. These moments enable the parameter estimation of the probability distribution.
Typically a 3-parameter gamma (or Pearson-III) distribution is used [14]. The properties of the fitted
distribution are sensitive to the DEM resolution and this impacts the performance of the hydrologic model
[19]. In order to address this problem in the use of the 30-arc-second DEM, we adopt the following procedure:
(i) estimate the topographic index using the single flow algorithm from the 30-arc-second DEM data; (ii)
overlay basin boundaries and estimate statistical moments; (iii) analyze topographic indices obtained from
high resolution 3-arc-second data for selected regions representing a range of topographic features to
quantify the impact of resolution on the estimates of the statistical moments; (iv) establish a functional
relationship (downscaling function) to convert statistical moments from the 1 km data to equivalent estimates
at the higher resolution. Once the downscaling function is identified, we can convert the estimates from the
30-arc-second data to estimates that would be representative of the topographic index if it were estimated
from high resolution data. This method is useful because using high resolution data directly for the
estimation of the distributional properties of the topographic index is very difficult due to the enormous
volume of the data, especially if global applications are considered.
In a similar study, Wolock and McCabe [20] studied the differences in topographic attributes obtained at the
two different resolutions to identify whether the differences were due to terrain-discretization effects or
smoothing. Our work is complementary in that it is geared toward studying the properties at the basin scale
for GCM applications. Analysis regions for the higher resolution data are chosen to coincide with those of
Wolock and McCabe [20] so that the two results can be used together for further studies.

2. Data description
The US Geological Survey's EROS (Earth Resources Observation Systems) Data Center in Sioux Falls, SD has
developed a global digital elevation model called GTOPO30. The resolution of this dataset is 30 arc-seconds
($8\tfrac{1}{3} \times 10^{-3}$ degrees). The vertical resolution is 1 m, and the elevation values for the
globe range from −407 to 8752 m above mean sea level. Additional details about the dataset are available
in [5].
A subset of this dataset corresponding to the North American continent is used for this study. The dataset
was first projected from geographic coordinates to the Lambert Azimuthal Equal Area coordinate system at a
resolution of 1 km. As a result each cell, regardless of latitude, represents the same ground dimensions
(length and area) as every other cell. Consequently, derivative estimates such as drainage areas, slope, etc.
are easier to compute, consistent, and reliable. The extraction of these hydrographic features from the
30-arc-second dataset is based on the drainage analysis algorithm of Jenson and Domingue [9]. This algorithm
first identifies and fills spurious sinks (or pits). Effort is made to preserve natural sinks such as lakes in
the landscape. Then, for each cell, the direction of steepest descent from among its eight neighbors is
computed. This information is then used to compute the flow accumulation for each cell. A 1000 km² threshold
is then applied to the flow accumulation values to obtain a drainage network in raster format, which is then
vectorized [17]. The drainage network is then used to identify basins and sub-basins. In order to represent
the basins hierarchically, a system developed by Pfafstetter [12] is used which utilizes an efficient coding
scheme [17] (the reader is encouraged to refer to the web publication [16]). Basins at five levels of
subdivision are developed. For this study, the Level 5 description was chosen, with a mean basin size of
3640 km². Fig. 1 illustrates a typical layout of basin patterns at Level 5 for a region in the north-eastern
United States. It is found that at this level the basins provide a sufficient subgrid resolution for GCM
applications [3,10,11].
In order to assess the impact of the resolution on the estimates of topographic indices, a higher resolution
dataset (3-arc-second) available from USGS is also used for selected regions (see Fig. 2). These datasets are
available in blocks of 1° × 1° latitude–longitude coverage for the entire United States. The ground dimensions
are 92.6 m along the latitude and $92.6 \times \cos(\pi\theta/180)$ m along the longitude, where $\theta$ is
the latitude in degrees. Consequently the ground resolution along the longitude ranges from roughly 80 m in
the southern US to 60 m near the Canadian border. This dataset is also referred to as the 90 m resolution data
in this paper. Thirty-five 1° × 1° latitude–longitude grids, as shown in Fig. 2, were selected for comparative
analysis at the high and low resolutions.
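
The 3-arc-second cell dimensions quoted in this section can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and the two sample latitudes (30° and 49°, taken here as rough stand-ins for the southern US and the Canadian border) are assumptions, not values from the paper.

```python
import math

CELL_NS_M = 92.6  # north-south extent of a 3-arc-second cell, as quoted in the text


def cell_size_m(latitude_deg):
    """Return (north-south, east-west) ground dimensions of a 3-arc-second cell in metres."""
    east_west = CELL_NS_M * math.cos(math.pi * latitude_deg / 180.0)
    return CELL_NS_M, east_west


# Illustrative latitudes only (assumed, not from the paper).
for lat in (30.0, 49.0):
    ns, ew = cell_size_m(lat)
    print(f"latitude {lat:4.1f} deg: {ns:.1f} m x {ew:.1f} m")
```

The east-west dimension comes out close to 80 m at 30° and close to 61 m at 49°, in line with the range stated in the text.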

3. Analysis
Fig. 1. Level 5 subdivision of a region of the eastern United States overlaid on the GTOPO30 DEM. Black and
white lines represent the streams and the basin boundaries, respectively.

The topographic index at any location in the watershed is defined as $\ln(a/\tan\beta)$, where $a$ is the
upstream contributing area, from the watershed divide, per unit contour length, and $\tan\beta$ is the local
slope. It was estimated by applying the single flow algorithm [18] for the entire continent using the
30-arc-second dataset, and for the selected 1° × 1° latitude–longitude regions (Fig. 2) using the 3-arc-second
dataset. At these resolutions the multiple flow algorithm [13] does not provide a better estimate and
consequently was not used. Appropriate measures are taken to account for the distortions in the estimates of
contributing area and slopes due to the curvilinear latitude–longitude coordinates of the 3-arc-second data.
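
As a rough illustration of the definition above, the following sketch computes the index ln(a/tan β) on a small square grid using a generic eight-neighbor single-flow (steepest descent) routing. It is not the code used in this study. The function name, the assumption that the DEM is already free of spurious pits, and the choice to return NaN for cells with no downslope neighbor are all assumptions made for the example.

```python
import numpy as np


def topographic_index(dem, cell_size):
    """ln(a / tan(beta)) with single-flow routing on a pit-free, square-cell DEM (sketch)."""
    dem = np.asarray(dem, dtype=float)
    nrow, ncol = dem.shape
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    receiver = -np.ones((nrow, ncol, 2), dtype=int)  # (-1, -1): no downslope neighbor
    tan_beta = np.zeros((nrow, ncol))
    for r in range(nrow):
        for c in range(ncol):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrow and 0 <= cc < ncol:
                    dist = cell_size * np.hypot(dr, dc)
                    slope = (dem[r, c] - dem[rr, cc]) / dist
                    if slope > tan_beta[r, c]:  # keep the steepest descent
                        tan_beta[r, c] = slope
                        receiver[r, c] = (rr, cc)
    # Visit cells from highest to lowest so every donor is handled before its receiver.
    area = np.full((nrow, ncol), float(cell_size) ** 2)
    for flat in np.argsort(dem, axis=None)[::-1]:
        r, c = divmod(int(flat), ncol)
        rr, cc = receiver[r, c]
        if rr >= 0:
            area[rr, cc] += area[r, c]
    a = area / cell_size  # contributing area per unit contour length
    safe_slope = np.where(tan_beta > 0, tan_beta, np.nan)  # undefined where slope is zero
    return np.log(a / safe_slope), area, tan_beta


# Hypothetical 3 x 4 plane dropping 10 m per 1000 m cell toward the east.
dem = np.tile(100.0 - 10.0 * np.arange(4), (3, 1))
index, area, tan_beta = topographic_index(dem, 1000.0)
print(np.round(index, 2))
```

On this tilted plane each row drains eastward, so the contributing area and the index grow downslope, and the last column (which has no lower neighbor inside the grid) is left undefined.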
Fig. 2. 1 km DEM data for the North American basin. Overlaid 1° × 1° latitude–longitude boxes indicate regions
where 90 m data were used to extract the finer resolution topographic indices.

Spatial statistical moments of topographic indices over each of the 5020 basins were computed. Figs. 3 and 4
show the spatial distribution of these statistical moments. We observe that the mean is generally larger in
flat areas and the standard deviation is generally larger in mountainous regions. This is intuitive since in
flat regions we tend to have larger contributing areas and smaller slopes, giving rise to a larger topographic
index. The variability of the topographic relief and local slopes in mountainous regions gives rise to more
spread in the distribution function and consequently to larger values of the standard deviation. Fig. 5 shows
the spatial distribution of the skew. As expected, the skew is generally larger in areas where there are large
flow accumulations. For a very small fraction of the continental area (2.2%) the skew was found to be
negative. The basins corresponding to these areas were typically very small. Consequently very few values were
used in computing the skew and therefore the estimates have large estimation errors. We believe that the
presence of the negative values is a consequence of the limitation of statistical estimation due to small
sample size rather than due to the estimation of topographic index from a low resolution dataset.
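
The per-basin statistics described above can be sketched as a grouped computation over a raster of index values and a matching raster of basin labels. The function name, the use of population (biased) moment estimators, and the toy arrays are assumptions for illustration; the paper does not state which estimators were used.

```python
import numpy as np


def basin_moments(index, basin_id):
    """Mean, standard deviation and skewness of the index for each basin label (sketch)."""
    index = np.asarray(index, dtype=float).ravel()
    basin_id = np.asarray(basin_id).ravel()
    valid = np.isfinite(index)  # ignore cells where the index is undefined
    stats = {}
    for b in np.unique(basin_id[valid]):
        x = index[valid & (basin_id == b)]
        dev = x - x.mean()
        m2 = np.mean(dev ** 2)  # second central moment
        m3 = np.mean(dev ** 3)  # third central moment
        skew = m3 / m2 ** 1.5 if m2 > 0 else np.nan
        stats[int(b)] = (x.mean(), np.sqrt(m2), skew)
    return stats


# Hypothetical values: two tiny "basins" with four valid cells each.
index = np.array([1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 9.0, np.nan])
basin = np.array([1, 1, 1, 1, 2, 2, 2, 2, 2])
for b, (mean, std, skew) in basin_moments(index, basin).items():
    print(f"basin {b}: mean={mean:.3f} std={std:.3f} skew={skew:.3f}")
```

With so few cells per basin a single large value dominates the skew estimate, which is the small-sample sensitivity the text refers to.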
For each of the thirty-five 1° × 1° latitude–longitude regions selected for high resolution analysis, the
spatial statistical moments were computed from estimates obtained from both the 3-arc-second and 30-arc-second
DEM data. Fig. 6 shows the plots of the three moments and skewness obtained at the two resolutions. The
regression lines (solid lines) are obtained using the least trimmed squares robust regression, which is based
on a genetic algorithm [2]. By excluding outliers, this algorithm provides better linear fits than the usual
least-squares regression (plotted as dotted lines). On the downside, however, no coefficient of correlation
can be estimated. As seen from the figures, there are clear linear relationships for the first three moments.
However, there are noticeably large deviations from linearity in the skewness plots (no systematic dependence
of these deviations on topographic features could be established). This can have significant influence on the
estimate of the probability distribution and subsequently on the predicted hydrologic response. For example,
the parameter $\alpha$ of the gamma distribution (see Eq. (A.1)), often used to model the probability
distribution of the topographic index [14], is completely determined by the skewness (see Eq. (A.10)). Thus
error in this parameter quickly propagates to the model response.
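
To make the idea of a least trimmed squares fit concrete, the sketch below uses a deliberately simple stand-in for the genetic algorithm of [2]: it tries every line through a pair of points, scores each by the sum of its smallest squared residuals, and refits ordinary least squares on the retained points. The function name, the 75% retention fraction and the synthetic data are assumptions for illustration and do not reproduce the fits in Fig. 6.

```python
import itertools

import numpy as np


def lts_line(x, y, keep_fraction=0.75):
    """Approximate least-trimmed-squares fit y = slope * x + intercept (pairwise search)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    h = max(2, int(np.ceil(keep_fraction * n)))  # number of residuals kept in the objective
    best_cost, best_idx = np.inf, None
    for i, j in itertools.combinations(range(n), 2):
        if x[i] == x[j]:
            continue  # vertical line, skip
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        sq_res = (y - (slope * x + intercept)) ** 2
        idx = np.argsort(sq_res)[:h]  # the h best-fitting points for this candidate
        cost = sq_res[idx].sum()
        if cost < best_cost:
            best_cost, best_idx = cost, idx
    # Final least-squares refit on the retained subset only.
    slope, intercept = np.polyfit(x[best_idx], y[best_idx], 1)
    return slope, intercept


# Synthetic example: 35 points on a line, 5 of them shifted to act as outliers.
x = np.linspace(4.0, 12.0, 35)
y = 0.8 * x + 1.5
y[[3, 10, 17, 24, 31]] += 6.0
print("LTS:", lts_line(x, y))
print("OLS:", tuple(np.polyfit(x, y, 1)))
```

The trimmed fit recovers the underlying line while the ordinary least-squares intercept is pulled toward the outliers. The exhaustive pair search is practical only for small samples such as the thirty-five regions considered here.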
Fig. 3. Spatial distribution of basin mean of the topographic index over North America.

Fig. 4. Spatial distribution of basin standard deviation of the topographic index over North America.

Fig. 5. Spatial distribution of basin skew of the topographic index over North America. Negative skew values
for approximately 2.2% of the total continental area are displayed as 0 (dark blue).

Fig. 6. Downscaling functions for obtaining 90 m equivalent of usual statistical moments of topographic index
from estimates using the 1 km DEM data for the North American continent. Solid lines represent linear fit using
the least trimmed squares robust regression and the dotted lines represent the usual least squares regression
fit. The equations correspond to the solid lines.

The estimates of second and higher order statistical moments suffer from the limitation that a few very large
values can distort the estimate, particularly if the sample size is small. The parameter estimates for a
probability distribution obtained from these moments can be severely affected. In order to overcome this
limitation, L-moments based on probability weighted moments [6–8] can be used. A probability weighted moment
of order $r$, for a random variable $X$ with cumulative distribution function $F(x)$, is given as
$\alpha_r = E\{X[1 - F(x)]^r\}$. The L-moments of the first four orders are given as

$$\lambda_1 = \alpha_0, \qquad \lambda_2 = \alpha_0 - 2\alpha_1,$$

$$\lambda_3 = \alpha_0 - 6\alpha_1 + 6\alpha_2, \qquad \lambda_4 = \alpha_0 - 12\alpha_1 + 30\alpha_2 - 20\alpha_3.$$

Notice that $\lambda_1$ is the usual mean. The moment $\lambda_2$ is called the L-scale as it measures the
spread of the distribution. The L-moment ratio of order $r$ is defined as

$$\tau_r = \lambda_r/\lambda_2, \qquad r = 3, 4, \ldots$$

The quantities $\tau_3$ and $\tau_4$ are called L-skewness and L-kurtosis, respectively. Note that all moments
are estimated using linear combinations of $X$, thereby overcoming the problem associated with raising a few
large values to higher powers. Parameters for several distributions including gamma, normal, log-normal, etc.
can be estimated using the L-moments [8].
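
The L-moment definitions above translate directly into a short computation. The sketch below assumes the standard unbiased sample estimator of the probability weighted moments from ordered data (the paper does not say which estimator was used), and the function name is hypothetical.

```python
from math import comb

import numpy as np


def sample_l_moments(values):
    """Return (mean, L-scale, L-skewness, L-kurtosis) from sample probability weighted moments."""
    x = np.sort(np.asarray(values, dtype=float))  # ascending order statistics
    n = x.size
    if n < 4:
        raise ValueError("at least four values are needed for the first four L-moments")
    a = []
    for r in range(4):
        # Weight of the j-th smallest value: C(n - j, r) / C(n - 1, r), j = 1..n.
        w = np.array([comb(n - j, r) for j in range(1, n + 1)], dtype=float) / comb(n - 1, r)
        a.append(np.mean(w * x))
    l1 = a[0]
    l2 = a[0] - 2 * a[1]
    l3 = a[0] - 6 * a[1] + 6 * a[2]
    l4 = a[0] - 12 * a[1] + 30 * a[2] - 20 * a[3]
    return l1, l2, l3 / l2, l4 / l2


# A symmetric toy sample: the L-skewness should vanish.
print(sample_l_moments([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Because every term is linear in the ordered data, a single extreme value shifts these estimates far less than it shifts the conventional skewness, which cubes each deviation.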
Fig. 7 (top) shows the plots of L-moments at the two resolutions. Since the first L-moment is the same as the
usual first moment, i.e., the mean, only the second ($\lambda_2$: L-scale) and the third ($\tau_3$:
L-skewness) moments are tested for linearity at the two resolutions. We see that L-skewness shows a
significantly improved linear relationship as compared to the usual skewness estimates. No noticeable
improvement in the estimate of L-scale as compared to the standard deviation is gained. The linear
approximation for the L-kurtosis between the two resolutions is very weak and hence is not shown. These
results suggest that: (i) a simple linear relationship exists between spatial moments of topographic indices
obtained at the two different resolutions (1 km and 90 m); and (ii) L-moments provide a better regression
equation to downscale the moment estimates from the low resolution of 1 km to the higher resolution of 90 m.
A sense for the appropriate distribution to describe the topographic indices can be obtained from the
L-moment ratio diagram, which plots L-skewness and L-kurtosis against each other [8]. Fig. 7 (bottom) shows
this diagram obtained from the estimates at the two resolutions. Comparing these with the theoretical values
of the gamma distribution (solid line), we see that in the majority of the cases it provides a better
approximation at the higher resolution. The theoretical curves for other commonly used 3-parameter
distributions such as log-normal lie above the gamma distribution curve and are not plotted. In Appendix A we
summarize, for completeness, the parameter estimation equations for the gamma distribution using the
L-moments. Additional details can be found in [8].

4. Discussion and conclusions
The topographic index obtained using the GTOPO30 DEM data captures the general spatial distribution over the
North American continent. The existence of simple linear downscaling functions to obtain the first three
statistical moments (usual and L-moments) from the 30-arc-second DEM data with global coverage to a resolution
that is a factor of 10 higher significantly increases its utility for hydrologic applications. The analysis
also suggests that the use of L-moments for downscaling will provide better estimates than the usual moments.
For hydrologic applications, this is particularly significant since the models utilizing the data are quite
sensitive to the tail of the probability distribution. The strong linear relationship of L-skewness suggests
that information about the tail of the distribution is simply scaled but not lost. The L-moment ratio diagram
suggests that the gamma distribution provides a reasonable approximation to the probability distribution
function at the higher resolution of 90 m. Its parameters can be estimated from the first three L-moments.

Fig. 7. (Top) Same as Fig. 6 but for L-scale and L-skewness. (Bottom) L-moment ratio diagram at the two
resolutions. Dots represent the observed values. The theoretical values for the 3-parameter gamma distribution
are plotted as the solid line.

Several issues can be raised about the use of these results for Topmodel application, where typically DEM
data at 30 m or higher resolution are recommended. However, the empirically observed linear relationships
suggest that in the absence of high resolution data, coarse resolution data as used in this paper, along with
the linear downscaling scheme, provide a viable means for the application of Topmodel concepts over
significantly larger areas. Obviously, care has to be taken that other model assumptions are valid in the
region of interest. In addition, sensitivity studies should be performed to assess the magnitude of the model
response error associated with the approximations resulting from the linear downscaling scheme.
The theoretical reason for the empirically observed linear relationship is not clear, but we speculate that it
is a result of similarity in topographic features at the different resolutions. Similar conclusions were also
reached by Wolock and McCabe [20], who also present a more detailed account of the effects of discretization
and smoothing on the estimates of the slopes and contributing area. Whether fractal or multifractal
characteristics of topographic features can give rise to such a behavior is an intriguing hypothesis and will
be pursued in a separate study.

Acknowledgements
This research was partially funded by NASA grants NAG5-3661, NAGW-5247, and NAG5-7170, and NSF grant
EAR 97-06121. Thanks are also due to Dave Wolock for providing the code to estimate the topographic index from
the GTOPO30 dataset using ARC/INFO and to Margie Caisley for performing much of the data analysis work.

Appendix A. Estimation of gamma distribution using
L-moments
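Parameter estimation for the 3-parameter gamma distribution using the L-moments is described here for
completeness (see [8] for details). The probability density function of the gamma distribution is given as

$$f(x) = \frac{(x - \xi)^{\alpha - 1}\, e^{-(x - \xi)/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \tag{A.1}$$

where $\alpha$, $\xi$, and $\beta$ are the parameters of the distribution. The L-moments are given as

$$\lambda_1 = \xi + \alpha\beta, \tag{A.2}$$

$$\lambda_2 = \pi^{-1/2}\,\beta\,\Gamma\!\left(\alpha + \tfrac{1}{2}\right)\Big/\Gamma(\alpha), \tag{A.3}$$

$$\tau_3 = 6 I_{1/3}(\alpha, 2\alpha) - 3. \tag{A.4}$$

Here $I_x(p, q)$ is the incomplete beta function ratio

$$I_x(p, q) = \frac{\Gamma(p + q)}{\Gamma(p)\,\Gamma(q)} \int_0^x t^{p-1} (1 - t)^{q-1}\, dt. \tag{A.5}$$

If $0 < |\hat{\tau}_3| < 1/3$, let $z = 3\pi\hat{\tau}_3^2$, where $\hat{\tau}_3$ is the estimated value of
$\tau_3$. Then estimate $\alpha$ as

$$\hat{\alpha} \approx \frac{1 + 0.2906z}{z + 0.1882z^2 + 0.0442z^3}. \tag{A.6}$$

If $1/3 \le |\hat{\tau}_3| < 1$, let $z = 1 - |\hat{\tau}_3|$. Then

$$\hat{\alpha} \approx \frac{0.36067z - 0.59567z^2 + 0.25361z^3}{1 - 2.78861z + 2.56096z^2 - 0.77045z^3}. \tag{A.7}$$

Given $\hat{\alpha}$, estimate

$$\text{mean:} \quad \hat{\mu} = \hat{\lambda}_1, \tag{A.8}$$

$$\text{std. dev.:} \quad \hat{\sigma} = \hat{\lambda}_2\, \pi^{1/2}\, \hat{\alpha}^{1/2}\, \Gamma(\hat{\alpha})\Big/\Gamma\!\left(\hat{\alpha} + \tfrac{1}{2}\right), \tag{A.9}$$

$$\text{skewness:} \quad \hat{\gamma} = 2\hat{\alpha}^{-1/2}\, \mathrm{sign}(\hat{\tau}_3), \tag{A.10}$$

where $\hat{\lambda}_1$ and $\hat{\lambda}_2$ are the estimated values of the L-moments $\lambda_1$ and
$\lambda_2$, respectively. Then the estimates of parameters $\xi$ and $\beta$ are given by the equations

$$\hat{\xi} = \hat{\mu} - 2\hat{\sigma}/\hat{\gamma}, \tag{A.11}$$

$$\hat{\beta} = \tfrac{1}{2}\,\hat{\sigma}\,|\hat{\gamma}|. \tag{A.12}$$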
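
The estimation steps of Eqs. (A.6)–(A.12) can be sketched as a single function. The function name is hypothetical. The second numerator coefficient in Eq. (A.7) is taken as −0.59567, which is the sign for which the two approximations agree at an L-skewness of 1/3 (shape parameter equal to 1). The gamma-function ratio is evaluated through `math.lgamma` to avoid overflow for large shape values.

```python
import math


def gamma_from_l_moments(l1, l2, t3):
    """Return (alpha, xi, beta) of the 3-parameter gamma distribution from l1, l2 and tau3."""
    t = abs(t3)
    if not 0.0 < t < 1.0:
        raise ValueError("L-skewness must satisfy 0 < |tau3| < 1")
    if t < 1.0 / 3.0:  # Eq. (A.6)
        z = 3.0 * math.pi * t * t
        alpha = (1.0 + 0.2906 * z) / (z + 0.1882 * z**2 + 0.0442 * z**3)
    else:  # Eq. (A.7)
        z = 1.0 - t
        alpha = (0.36067 * z - 0.59567 * z**2 + 0.25361 * z**3) / (
            1.0 - 2.78861 * z + 2.56096 * z**2 - 0.77045 * z**3
        )
    mu = l1  # Eq. (A.8)
    gamma_ratio = math.exp(math.lgamma(alpha) - math.lgamma(alpha + 0.5))
    sigma = l2 * math.sqrt(math.pi * alpha) * gamma_ratio  # Eq. (A.9)
    skew = 2.0 / math.sqrt(alpha) * math.copysign(1.0, t3)  # Eq. (A.10)
    xi = mu - 2.0 * sigma / skew  # Eq. (A.11)
    beta = 0.5 * sigma * abs(skew)  # Eq. (A.12)
    return alpha, xi, beta


# Check against a unit exponential distribution (l1 = 1, l2 = 1/2, tau3 = 1/3),
# for which the parameters should be close to alpha = 1, xi = 0, beta = 1.
print(gamma_from_l_moments(1.0, 0.5, 1.0 / 3.0))
```

The exponential case is a convenient check because it is the gamma distribution with shape 1 and its L-moments are known in closed form.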

References
[1] Beven KJ, Kirkby MJ. A physically based variable contributing area model of basin hydrology. Hydrol Sci
Bull 1979;24(1):43–69.
[2] Burns PJ. A genetic algorithm for robust regression estimation. Statsci Technical Note 1992.
[3] Ducharne A, Koster RD, Suarez MJ, Kumar P. A catchment-based land-surface model for GCMs and the
framework for its evaluation. Phys Chem Earth 1999;24(7):769–73.
[4] Famiglietti JS, Wood EF. Multiscale modeling of spatially variable water and energy balance processes.
Water Resour Res 1994;30(11):3061–78.
[5] Gesch DB, Verdin KL, Greenlee SK. New land surface digital elevation model covers the Earth. EOS,
Transactions, American Geophysical Union, February 9 1999;80(6):69–70.
[6] Greenwood JA, Landwehr JM, Matalas NC, Wallis JR. Probability weighted moments: definition and relation
to parameters of several distributions expressible in inverse form. Water Resour Res 1979;15:1049–54.
[7] Hosking JRM. L-moments: analysis and estimation of distributions using linear combinations of order
statistics. J Royal Stat Soc Ser B 1990;52:105–24.
[8] Hosking JRM, Wallis JR. Regional frequency analysis: an approach based on L-moments. Cambridge: Cambridge
University Press, 1997. p. 224.
[9] Jenson S, Domingue J. Extracting topographic structure from digital elevation data for geographic
information system analysis. Photogrammetric Eng Remote Sensing 1988;54:1593–600.
[10] Koster R, Suarez MJ, Kumar P, Ducharne A. A catchment-based land surface model for GCMs, presented at
American Geophysical Union Fall Meeting, December 1997 [Abstract published in EOS, Transactions, American
Geophysical Union 1997;78(46):F259].
[11] Chen J, Kumar P. Study of hydrologic response over North America using a basin scale model, presented at
American Geophysical Union Spring Meeting, May 1999 [Abstract published in EOS, Transactions, American
Geophysical Union 1999;80(17):S159].
[12] Pfafstetter O. Classification of hydrographic basins: coding methodology, unpublished manuscript, DNOS,
18 August 1989, Rio de Janeiro [translated by Verdin JP. US Bureau of Reclamation, Brasilia, Brazil,
5 September 1991].
[13] Quinn P, Beven K, Chevallier P, Planchon O. The prediction of hillslope flow paths for distributed
hydrologic modeling using digital terrain models. Hydrol Process 1991;5:59–79.
[14] Sivapalan M, Beven KJ, Wood EF. On hydrologic similarity, 2, a scaled model of storm runoff production.
Water Resour Res 1987;23(7):1289–99.
[15] Stieglitz M, Rind D, Famiglietti J, Rosenzweig C. An efficient approach to modeling the topographic
control of surface hydrology for regional and global climate modeling. J Climate 1997;10:118–37.
[16] Verdin KL. A system for topologically coding global drainage basins and stream networks,
http://edcwww.cr.usgs.gov/landdaac/gtopo30/hydro/P311.html, 1997.
[17] Verdin KL, Verdin JP. A topological system for delineation and codification of the Earth's river basins.
J Hydrol 1999;218:1–12.
[18] Wolock DM. Simulating the variable-source-area concept of streamflow generation with the watershed model
TOPMODEL. Water-Resources Investigations Report 93-4124, US Geological Survey, 1993.
[19] Wolock DM, Price CV. Effects of digital elevation model map scale and data resolution on a
topography-based watershed model. Water Resour Res 1994;30(11):3041–52.
[20] Wolock DM, McCabe GJ Jr. Differences in topographic characteristics computed from 100- and 1000-meter
resolution digital elevation model. Hydrol Process 2000, to appear.