Advances in Water Resources 23 (1999) 253–260

Does the river run wild? Assessing chaos in hydrological systems
Gregory B. Pasternack

1

Department of Land, Air and Water Resources, University of California, 211 Veihmeyer Hall, Davis, CA 95616-8628, USA
Received 7 April 1998; received in revised form 3 November 1998; accepted 3 February 1999

Abstract
The standing debate over whether hydrological systems are deterministic or stochastic has been taken to a new level by controversial applications of chaos mathematics. This paper reviews the procedure, constraints, and past usage of a popular chaos time
series analysis method, correlation integral analysis, in hydrology and adds a new analysis of daily streamflow from a pristine
watershed. Significant problems with the use of correlation integral analysis (CIA) were found to include a continued reliance on the
original algorithm even though it was corrected subsequently and failure to consider the physics underlying mathematical results.
The new analysis of daily streamflow reported here found no attractor with D ≤ 5. Phase randomization of the Fourier Transform of
streamflow was used to provide a better stochastic surrogate than an Autoregressive Moving Average (ARMA) model or Gaussian
noise for distinguishing between chaotic and stochastic dynamics. © 1999 Elsevier Science Ltd. All rights reserved.
Keywords: Chaos; Time series analysis; Streamflow analysis; Non-linear dynamics

1. Introduction

Chaos mathematics has been increasingly perceived
as the de facto tool for studying dynamical systems that
deterministic and stochastic models have had limited
success in predicting. The first step in applying chaos
mathematics is to determine whether a particular
hydrologic system is in fact chaotic. This assessment can
be accomplished by investigating the limits of predictability and error propagation in current operational
forecasting models or by searching for indicators of
chaotic dynamics in recorded time series of dynamical
system variables.
The chaotic nature of instantaneous weather has been
firmly established by numerical experiments with global
circulation models (GCMs). Studies using the most sophisticated GCMs demonstrate that forecasts have a
sensitive dependence on their initial conditions [17]. As a
result, even if future GCMs perfectly simulate the atmosphere, the predictability of weather variables would
approach zero for forecasts beyond two weeks [16,26].
Detailed analyses of simple atmospheric models have
been used to study the underlying characteristics of
chaotic behavior [15,18,21].


1

Tel.: +1 530 754 9243; fax: +1 530 752 5262; e-mail: gpast@ucdavis.edu.

Because sophisticated dynamical models are not
available for many systems of interest, chaos-based time
series analysis offers an alternative means for identifying
chaotic behavior in cases where high quality, long term
hydrologic records are available. The primary tool used
to look for chaos in time series has been correlation
integral analysis (CIA). This fractal scaling method was
introduced by Grassberger and Procaccia [10], but was
precipitously adopted before important qualifications
(e.g. [9]) were publicized. Subsequent attempts at CIA
relied on the accuracy of the original algorithm without
due consideration of relevant constraints or the physics
underlying mathematical results. Consequently, the
wave of initial analyses in some fields has been followed
by a wave of corrections and counterclaims. Unfortunately, use of CIA in hydrology has followed this path,
as exemplified by the ongoing debate over the nature of
rainfall in Boston [8,22,23,27,33].

2. Correlation integral analysis
The correlation dimension (Dc) is a measure of the
dimension (D) of an attractor governing the trajectories
of solutions of a dynamical system in phase space. If D is
non-integer then the attractor is called a `strange
attractor' because it has a complex structure that is self-similar at all scales. Strange attractors are examples of
fractals [21]. Because the precise definitions of `fractal',
`chaotic', and `strange attractor' are still debated, it is
not possible to conclude that a strange attractor is
necessarily chaotic or that chaos implies fractal geometry [8]. Nevertheless, obtaining a non-integer, finite Dc
for a time series when Dc for a corresponding control
stochastic surrogate (to be discussed) is unbounded
demonstrates fractal scaling and suggests chaos.
The procedure for computing Dc (see Appendix A)
derives from Grassberger and Procaccia [10] but with
important additional constraints from the subsequent
literature. By this method a time series of a single variable is transformed into a time series of a set of variables
by lagging the data m times and assigning the ith lagged
series to the ith dimension (Fig. 1). The distances between
one of the points (the reference) and others in the reconstructed m-dimensional phase space are measured
(Fig. 2) and compared to the radius, r, of a sphere centered on the reference point. The correlation integral for
a given radius, C(r), is the fraction of distances that are
less than the radius, after averaging over many different
reference points from the set. When ln[C(r)] versus ln[r]
is plotted for a given embedding dimension m, the range
of ln[r] where the slope of the curve is constant is the
scaling region where fractal geometry is indicated. In
this region C(r) increases as a power of r, with the
scaling exponent being the correlation dimension, Dc. If
the time series is chaotic, then for increasing embedding
dimension the computed Dc must become independent
of m, i.e. `saturate'. The finite value of Dc where saturation occurs is the final estimate of D for the attractor,
with Dc ≤ D.
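The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of Appendix A or of [10]: it assumes Euclidean distances, uses every point as a reference, and fits the slope over all supplied radii rather than over a separately identified scaling region. The function names and the `min_sep` parameter (which skips pairs of points that are close in time) are hypothetical.

```python
import numpy as np


def delay_embed(x, m, tau):
    """Embed a 1-D series in m dimensions using a lag of tau samples."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau  # number of embedded points that remain
    if n <= 0:
        raise ValueError("series too short for this m and tau")
    # Column i holds the series shifted by i * tau samples.
    return np.column_stack([x[i * tau: i * tau + n] for i in range(m)])


def correlation_integral(points, radii, min_sep=1):
    """Fraction of point pairs closer than each radius.

    Pairs whose time indices differ by less than min_sep are skipped.
    """
    n = len(points)
    i, j = np.triu_indices(n, k=min_sep)  # all pairs with j - i >= min_sep
    dists = np.linalg.norm(points[i] - points[j], axis=1)
    return np.array([np.mean(dists < r) for r in radii])


def scaling_slope(radii, c):
    """Least-squares slope of ln C(r) versus ln r where C(r) > 0."""
    radii = np.asarray(radii, dtype=float)
    c = np.asarray(c, dtype=float)
    ok = c > 0
    slope, _ = np.polyfit(np.log(radii[ok]), np.log(c[ok]), 1)
    return float(slope)
```

To look for saturation, one would repeat `scaling_slope(radii, correlation_integral(delay_embed(x, m, tau), radii))` for increasing m and check whether the slope stops growing. Memory use grows with the square of the number of embedded points, so long records would need a chunked or subsampled variant.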
Only a small fraction of reported studies that use CIA
to characterize hydrologic systems follow the procedure
with all of the necessary precautions. Grassberger
and Procaccia [10] state that the embedding time lag (τ)
may be chosen arbitrarily, but in all applications the
amount of data is limited, so τ must be long enough for
data points to be independent to yield a meaningful
attractor reconstruction [30,33]. Where this constraint
has been accounted for, suitable lag times have been
selected on the basis of the autocorrelation function
[5,12,27,33], the mutual information function [7], and
the more general redundancy criteria for multidimensional systems [6]. Studies that have sought to assess the
effect of τ on Dc (e.g. [14]) have used very small ranges of
values (e.g. 4–16 days), which are insignificant compared
to typical decorrelation times in hydrologic systems that
may exceed 50–200 days [31]. Furthermore, many authors have neglected the `proximal points' constraint
that the distances between points that are closer in time
than τ should be excluded from the computation of C(r)
[5,9,31]. The geometric explanation for this constraint is
that points close in time fall on a low-dimensional surface that overcontributes to the correlation integral.
Rather than measuring the low dimensional structure
along a trajectory, as analysis of proximal points does,
appropriate application of CIA seeks to measure the
fractal geometry of the distances to other `loops'
(Fig. 2). The overcontribution of proximal points artificially flattens the slope of C(r) and thus depresses the
computed Dc.

Fig. 1. Transformation of (a) measured data into (b) lagged 2-D set.

Fig. 2. Illustration of distance measurements made in correlation integral analysis.

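Of the lag-selection criteria mentioned above, the autocorrelation function is the simplest to illustrate. The sketch below picks the first lag at which the sample autocorrelation falls below a threshold. The default threshold of 0.5 and the helper names are assumptions made for illustration; the cited studies may use other cut-offs or estimators.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x)
    # Lagged products normalized by the lag-0 sum of squares.
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / var for k in range(max_lag + 1)]
    )


def first_lag_below(acf, threshold=0.5):
    """Smallest lag whose autocorrelation is below threshold, or None."""
    below = np.nonzero(np.asarray(acf) < threshold)[0]
    return int(below[0]) if below.size else None
```

A return value of None signals that the series stays correlated over the whole range examined, in which case `max_lag` would need to be increased before a lag could be chosen.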
A final important consideration for assessing the reliability of the CIA-computed dimension is the size of
the embedded time series, n. In geometric terms, the
series must be long enough for the points along one
`edge' of the attractor to reasonably represent the
hypersurface. Wilcox et al. [31] and Tsonis et al. [30]
summarize the literature on the number of points
needed. Criteria such as 10^A or 10^(2+0.4m) data points,
where A is the greatest integer lower than Dc and m the
embedding dimension, mean that few hydrologic records can be assessed for greater than 5-D attractors
since as many as 10,000 points require a 27 yr daily
record. Also, different variables of a given system may
require different numbers of data points to obtain Dc
depending on how each is coupled to the rest of the
system and whether each exhibits thresholds in its behavior [12,18,32].
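The arithmetic behind the two data-size criteria quoted above can be checked with a short calculation. The function names are hypothetical, and the year length of 365.25 days is an assumption used only to convert a number of daily values into a record length.

```python
import math


def min_points_from_dimension(dc):
    """10**A, with A the greatest integer strictly lower than dc."""
    a = math.ceil(dc) - 1
    return 10 ** a


def min_points_from_embedding(m):
    """10**(2 + 0.4 * m) for embedding dimension m."""
    return 10 ** (2 + 0.4 * m)


def record_years(n_points, samples_per_year=365.25):
    """Length in years of a record of n_points regularly sampled values."""
    return n_points / samples_per_year
```

Both criteria give 10,000 points at the 5-D limit (Dc up to 5, or m = 5), and `record_years(10000)` is a little over 27 years of daily data, which matches the figure in the text.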
Even if sufficient points appear to be available, the
number will be substantially diminished by embedding.
For example, a data set of 3316 points was reduced to 1956
after embedding to m = 10 with a delay corresponding to
the time for the autocorrelation function to reach 0.5
[27]. This is a problem because the interval over which
scaling exists diminishes as m increases up to a critical
embedding dimension, beyond which no scaling region
can be accurately defined. If the critical embedding dimension for a data set is less than Dc for that data, then
Dc cannot be correctly determined [3,33]. The consequence of ignoring these constraints is a significant underestimate of the true dimension of a system.
A useful tool for assessing the dimensional limit of a
given data set has been used in some studies but not
widely discussed. The approach is to conduct a parallel
control CIA experiment using a stochastic surrogate
derived from the original data. When CIA is applied to the stochastic surrogate embedded to dimension m, the slope of
C(r) for low r will be found to closely approach m given
enough data points [22]. In other words, a sufficient
quantity of random data will fill all available dimensions
in space. In some cases stochastic systems with an infinitely large number of degrees of freedom have been
found to have finite values of Dc, particularly if the data
set is too small [8]. However, the slope of C(r) for the
control serves as a baseline for comparing real data of
the same size to assess whether the underlying system is
stochastic or chaotic. If the slope of C(r) tends toward
independence from m for the real data faster than that
for the stochastic surrogate, then the hydrologic system
does not have infinite degrees of freedom. On the other
hand, if slopes for the real data are dependent on m and
thus do not become constant before the scaling region
vanishes as m increases, then the correlation dimension
of the attractor cannot be characterized. Also, if the
slope of C(r) for increasing m for the control shows so
much deviation from m that the real data cannot be
distinguished, then Dc again cannot be computed. These
concepts will be illustrated in an example below.
Studies of hydrologic records that have included
stochastic controls have used either arbitrary random
numbers [3,27] or sets generated using Autoregressive
Moving Average (ARMA) models (e.g. [14]). While the
ARMA model does preserve a fraction of the power
spectral density of the original data, the most appropriate and objective baseline would be a stochastic time
series with the same power spectrum (and thus autocorrelation) as the original series. Such a stochastic
surrogate can be generated by calculating the Fourier
Transform of the time series, randomly re-assigning
phases between 0 and 2π to the transform, and then
returning the data to the time domain using the inverse
transform algorithm [19]. No CIA studies of hydrologic
records have used this approach.
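The three steps just described (forward transform, random phases, inverse transform) might be sketched as follows with NumPy's real FFT. This is an illustration under stated assumptions rather than the procedure of [19]: the zero-frequency term, and for even-length series the Nyquist term, are left unchanged so that the output is real and keeps the original mean. The function name and the `rng` parameter are hypothetical.

```python
import numpy as np


def phase_randomized_surrogate(x, rng=None):
    """Return a real series with the amplitude spectrum of x and random phases."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x)
    # Draw one phase per non-negative frequency, uniform on [0, 2*pi).
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    randomized = np.abs(spectrum) * np.exp(1j * phases)
    # Assumption: keep the zero-frequency term so the mean is preserved.
    randomized[0] = spectrum[0]
    # Assumption: for even n, keep the Nyquist term so it stays real.
    if n % 2 == 0:
        randomized[-1] = spectrum[-1]
    return np.fft.irfft(randomized, n=n)
```

Because only the phases change, the surrogate has the same power spectrum, and therefore the same autocorrelation, as the input. It can then be passed through the same embedding and correlation integral steps as the real data to serve as the stochastic control.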


3. Problems encountered with CIA
Table 1 lists CIA studies from hydrology (broadly
defined) and relevant parameters. For comparison, it
begins with three well known chaotic systems whose
correlation dimensions were computed and compared to
known fractal dimensions [10]. Early CIAs examined
short oxygen isotope records with additional interpolated values that were highly correlated [4,20]. Grassberger [9] re-analyzed those data and found problems
with applying CIA in those instances. Fraedrich [4,5]
looked for a winter weather attractor by concatenating
November–February data. This novel approach addressed the mathematical needs of CIA. However, in
terms of the physical phenomenon, transition periods
for the weather trajectory to settle into or out of a winter
attractor exist. It is difficult to know when transition
periods occurred and for how long, so doing CIA for the
Nov-Feb concatenated record may not be indicative of
the nature of this type of attractor.
Analyses of streamflow have been conducted for
daily, monthly, and discharge derivative values. Savard
[25] analyzed the 27009 data point Merced River discharge derivative (Q(t+1) − Q(t)) and found no low dimensional attractor. Wilcox et al. [31] presented a
thorough CIA of a standardized (periodicity
removed), log-transformed runoff record and also found
no low dimensional attractor. In their analysis, they
showed that using a short time lag results in a significant
underestimate of Dc. Jayawardena and Lai [14] investigated rivers in Hong Kong, but used very short (2–3
days) time lags. Their reported Dc values of 0.45 violate reality; no fewer than 3 degrees of freedom can
generate chaos, and chaotic attractors cannot have
D