Distinction between deterministic chaos and stochastic noise

state vector by using the nearest neighbor of the present state. We now locate nearby M-dimensional points in the phase space and choose a minimal neighborhood of K closest neighbors such that the predictee (the point from which the prediction is made) is contained within the smallest simplex. To enclose a point in an M-dimensional space, we require a simplex with a minimum of M + 1 points. Then, to obtain a prediction, we project the domain of the chosen nearest neighbors T_p (prediction step) steps forward and compute F_p to get the predicted value.
Since it becomes increasingly difficult to define an enclosing simplex for higher dimensional embedding spaces, we have extended the above idea to the nearest neighbors in a Euclidean sense. A minimum of M + 1 nearest neighbors is chosen based on the Euclidean distance between the neighbor and the predictee. Then, we project the domain of the chosen neighbors T_p steps forward and estimate the predicted value. We have explored several estimation kernels, including arithmetic average, weighted average and weighted regression, to estimate the predicted value. The arithmetic average was found to provide comparable prediction accuracy while requiring no tuning parameters; hence, in this study we use the arithmetic average of the projected neighboring points as the predicted value.
There are only two parameters to be chosen for this phase-space prediction model: the embedding dimension M and the number of nearest neighbors K. In general, M_min ≥ 2D + 1, where D is the attractor dimension. An estimate of the attractor dimension may be obtained from the correlation dimension [4,5]. Prediction results are sensitive to the choice of M [10]. We will look at the prediction accuracy (the correlation between predicted and observed values) as a function of embedding dimension to choose an optimum value of M for our prediction algorithm.
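As a minimal sketch of the Euclidean nearest-neighbor scheme described above, the following Python fragment forecasts a value T_p steps beyond the end of a series by averaging the projected neighbors. The function names (`delay_vector`, `nn_forecast`), the unit time lag, and the use of the last observed state as the predictee are illustrative assumptions, not details taken from the study.

```python
import heapq
import math


def delay_vector(series, t, m, lag=1):
    """State at time t: (x[t], x[t-lag], ..., x[t-(m-1)*lag])."""
    return tuple(series[t - j * lag] for j in range(m))


def nn_forecast(series, m, k, t_p, lag=1):
    """Forecast the value t_p steps after the last observation in `series`.

    m   : embedding dimension (M in the text)
    k   : number of nearest neighbors (K in the text), at least m + 1
    t_p : prediction step (T_p in the text)
    lag : time lag between coordinates (assumed to be 1 here)
    """
    if k < m + 1:
        raise ValueError("use at least m + 1 neighbors")
    last = len(series) - 1
    predictee = delay_vector(series, last, m, lag)
    first = (m - 1) * lag
    # Candidate neighbors: past states whose t_p-step future is already observed.
    candidates = range(first, last - t_p + 1)
    nearest = heapq.nsmallest(
        k,
        candidates,
        key=lambda t: math.dist(delay_vector(series, t, m, lag), predictee),
    )
    # Arithmetic average of the neighbors projected t_p steps forward.
    return sum(series[t + t_p] for t in nearest) / len(nearest)
```

For a noise-free periodic series such as `[math.sin(0.1 * i) for i in range(501)]`, calling `nn_forecast(series, m=2, k=3, t_p=1)` should return a value close to `math.sin(50.1)`, because earlier cycles supply neighbors at almost the same phase.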
Since enclosing a point in an M-dimensional space requires a simplex with a minimum of M + 1 points, one has K_min ≥ M + 1. Use of the phase space to develop a forecasting model may appear similar to an autoregressive model: a prediction is estimated based on time-lagged vectors. However, the crucial difference is that understanding phase-space geometry frames forecasting as recognizing and then representing underlying dynamical structures. For example, two neighboring points in a phase space may not be close to each other within the context of a time sequence. The traditional autoregressive (AR) model relies on time-lagged signals that are neighbors in a temporal sense, whereas a neighbor in a phase space is close in a dynamic sense. In addition, once the number of lags exceeds the minimum embedding dimension, the geometry of the underlying dynamics will not change. A global linear model, such as the AR model, must represent these dynamics with a single hyperplane, with no fundamental insight into the underlying geometric structure. Unlike traditional AR models, the proposed methodology also promises to make a tentative distinction between stochastic noise and low-dimensional chaos. A characteristic feature of chaotic dynamics is that the prediction accuracy decays exponentially as the prediction time increases. On the other hand, for a noisy system the prediction accuracy does not decay sharply with prediction lead time [9,10].
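The difference between temporal and dynamic neighbors can be illustrated with a short sketch. It uses the Henon map, the chaotic example of the next subsection, with the classic parameters a = 1.4 and b = 0.3; these parameter values, the embedding dimension of 2, and the helper names are assumptions made here for illustration only.

```python
import heapq
import math


def henon_series(n, a=1.4, b=0.3, discard=100):
    """Scalar Henon map x[t+1] = 1 - a*x[t]**2 + b*x[t-1], transient dropped."""
    prev, cur, out = 0.0, 0.0, []
    for i in range(n + discard):
        prev, cur = cur, 1.0 - a * cur * cur + b * prev
        if i >= discard:
            out.append(cur)
    return out


def phase_space_neighbors(series, t, m, k):
    """Time indices of the k delay vectors closest to the one at time t."""
    def state(s):
        return tuple(series[s - j] for j in range(m))

    target = state(t)
    others = (s for s in range(m - 1, len(series)) if s != t)
    return heapq.nsmallest(k, others, key=lambda s: math.dist(state(s), target))


series = henon_series(2000)
t = len(series) - 1
neighbors = phase_space_neighbors(series, t, m=2, k=3)
time_gaps = [t - s for s in neighbors]
# The temporal neighbors of t are t-1 and t-2 (gaps 1 and 2); the geometric
# neighbors typically lie many time steps away.
print("time gaps to nearest phase-space neighbors:", time_gaps)
```

The printed gaps show how far back in time the geometrically closest states occurred, which is information a model built only on the most recent lags does not use.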

2.3 Distinction between deterministic chaos and stochastic noise

Below, we show how a phase-space-based forecasting model works by applying it to a known chaotic time series generated from the well-studied chaotic Henon map. Additionally, as an example of noisy dynamics, we study uncorrelated additive noise superimposed on a sine wave. Such uncorrelated noise can be thought of as measurement error superimposed on a hypothetical streamflow signal with a pronounced seasonal cycle. We have used a total of 5000 points for each time series; the first 4000 points are used as a training set while the other 1000 points are used to make predictions and estimate prediction accuracy as a function of prediction lead time. Fig. 4 shows the prediction accuracy for the chosen chaotic and noisy time series. Here, the prediction accuracy is defined as the correlation between the observed and predicted values of a particular time series. The dotted line shows that the correlation does not decline for additive noise (here white noise is superimposed on a periodic signal) as one tries to forecast further into the future. In contrast, the solid line for a time series generated from a chaotic Henon map shows the declining signature characteristic of a chaotic sequence.
Fig. 3. Schematic representation of the nearest-neighbor method for phase-space-based prediction. The present state X_t and its unknown future value X_{t+T} are denoted by open circles, while the black dots inside the circle represent the neighborhood of X_t in the phase space. By finding a suitable function (linear or nonlinear) that approximates how neighbors move, a prediction of the current state is made [6,19].
Fig. 4. Prediction accuracy, defined as the correlation between the observed and predicted values of a particular time series, as a function of prediction lead time. The dotted line represents a sine wave with additive noise, while the solid line depicts the chaotic Henon map.
466 Q. Liu et al.
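The experiment described above can be sketched as follows. This is only an illustration under stated assumptions: the Henon parameters (a = 1.4, b = 0.3), the sine period and noise level, the embedding dimension m = 3, the neighbor count k = 4, and all function names are choices made here and are not specified in the text, so the resulting correlations need not match the values reported for Fig. 4.

```python
import heapq
import math
import random


def henon_series(n, a=1.4, b=0.3, discard=100):
    """Scalar Henon map x[t+1] = 1 - a*x[t]**2 + b*x[t-1], transient dropped."""
    prev, cur, out = 0.0, 0.0, []
    for i in range(n + discard):
        prev, cur = cur, 1.0 - a * cur * cur + b * prev
        if i >= discard:
            out.append(cur)
    return out


def noisy_sine(n, period=20.0, noise_sd=0.1, seed=0):
    """Sine wave with uncorrelated Gaussian noise added (assumed parameters)."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * t / period) + rng.gauss(0.0, noise_sd)
            for t in range(n)]


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def accuracy_vs_lead(series, n_train, leads, m=3, k=4):
    """Correlation between observed and predicted test values per lead time."""
    max_lead = max(leads)

    def state(t):
        return tuple(series[t - j] for j in range(m))

    # Training states whose futures up to max_lead lie inside the training set.
    library = [(state(t), t) for t in range(m - 1, n_train - max_lead)]
    predicted = {lead: [] for lead in leads}
    observed = {lead: [] for lead in leads}
    for t in range(n_train + m - 1, len(series) - max_lead):
        query = state(t)
        nearest = heapq.nsmallest(
            k, library, key=lambda item: math.dist(item[0], query))
        for lead in leads:
            # Arithmetic average of the neighbors projected `lead` steps ahead.
            predicted[lead].append(sum(series[s + lead] for _, s in nearest) / k)
            observed[lead].append(series[t + lead])
    return {lead: correlation(observed[lead], predicted[lead]) for lead in leads}
```

Calling `accuracy_vs_lead(henon_series(5000), n_train=4000, leads=range(1, 11))` and the same with `noisy_sine(5000)` mirrors the 4000/1000 split used in the text. The neighbors are searched once per test point and reused for every lead time, which keeps the pure-Python version tolerably fast.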
For a detailed discussion of the Henon map, including its stability and phase-space characteristics, we refer to Ref. 19. The correlation coefficient of the Henon map prediction drops abruptly from 0.95 for T_p = 1 to 0.16 for T_p = 3. Such a sharp drop in the prediction accuracy is a characteristic signature of a chaotic signal. If the signal contains a periodicity shorter than the maximum prediction lead time, then the effects of that periodicity will show up in the prediction accuracy. To avoid such an influence, usually a difference time series is used [9,10]. On the other hand, the correlation coefficient for the noisy time series does not show such an exponential loss of information with prediction lead time. In the following section, we will explore the utility of this diagnostic tool to characterize the nature of daily streamflow.

3 PHASE-SPACE-BASED MODEL FOR STREAMFLOW PREDICTION

3.1 Analysis of daily streamflow from the southwestern United States