Reconstruction of the phase space from data by time delay embedding

rather than with actual model variables, it is customary to call this state space a phase space. We will use the time-delay embedding defined in Section 2 to reconstruct the phase space from the observed streamflow signals. In principle, the phase space contains the knowledge about the internal dynamics of the system and thus can be used as a predictive tool. The basic idea is that, since the embedding map preserves the underlying dynamic structure, the future can be predicted from the behavior of the past. As shown by Takens [2], the phase space retains essential properties of the original state space, including the dimensionality of the underlying system. If one can reconstruct the deterministic rules underlying the data in a phase space, then one can attempt to predict future states from the history of the data embedded in that phase space. Several recent studies have successfully used phase-space-based models for chaotic signal characterization [3–5], prediction [6], noise reduction [7] and lake level prediction [8]. In addition, as we will see, a phase-space-based model can also be used to make short-term predictions and to provide a tentative distinction between low-dimensional determinism and noise [9,10].

Hydrologists have long maintained that large basins are smoother in their streamflow response behavior than small basins. This assumption has not really been substantiated from a quantitative, data-based point of view, although arguments based on the smoothing effect of larger storage constitute a reasonable basis for its acceptance. This smoothing effect is frequently translated into the assertion that, because of their inherently larger degree of linearity, the response of large basins (e.g. runoff) is easier to predict than that of smaller basins.
It is commonly argued that as time and spatial averaging increase, rainfall–streamflow relationships may become more linear and hence streamflow becomes more predictable. However, even if this is true, it is not clear how much the predictability of streamflow will increase in terms of accuracy and prediction lead time. Recent studies have shown the possible presence of chaos in streamflow [11,12]. If the underlying streamflow signal is chaotic, it is quite possible that its inherent predictability will be limited irrespective of the basin area.

In this study, we will describe an alternative model for streamflow prediction. This model will be used to investigate the characteristic signatures of streamflow signals (e.g. low-order determinism vs stochastic noise) at the daily scale. For example, do streamflow dynamics change from nonlinear to linear with increasing basin area? What are the implications of the nature of streamflow for its predictability? We will use recent developments in nonlinear modeling, phase-space reconstruction from a time series and related diagnostic tools to address these issues.

2 STREAMFLOW MODELING: A DYNAMICAL SYSTEM PERSPECTIVE

Due to the dramatic expansion of digital data acquisition and processing, it is now possible to develop predictive models for streamflow dynamics from a ‘theory-poor’ and ‘data-rich’ perspective. By theory-poor we mean that our approach does not require explicit formulation of governing partial differential equations. The idea of data-intensive modeling is by no means new; an autoregressive model [13,14] is a good example. What is new is the emergence of a set of concepts and tools, such as phase-space reconstruction and neural networks, that combine broad approximation abilities with few specific assumptions [15]. We will take this data-rich and theory-poor perspective to construct a predictive model directly from streamflow time series.
Building this type of dynamical model from a time series involves two steps: (i) reconstruction of the phase space from data by time delay embedding; and (ii) development of a methodology for phase-space prediction.

2.1 Reconstruction of the phase space from data by time delay embedding

Let $X(t)$ be the time series of a dynamical variable from a potentially complex natural system (e.g. a streamflow signal). As the $M$ variables $\{X_k(t)\}$ describing the system satisfy a set of first-order differential equations, successive differentiation in time reduces the problem to a single highly nonlinear differential equation of $M$th order for one of these variables. Thus, instead of $X_k(t)$, $k = 0, 1, \ldots, M-1$, we may use $X(t)$, the variable of the time series data, and its $M-1$ successive derivatives $X^{(k)}(t)$, $k = 1, \ldots, M-1$, as the $M$ variables of the problem spanning the phase space of the system [1]. Therefore, in principle, a one-dimensional time series contains sufficient information to construct a multidimensional phase space for studying the system dynamics.

A simple procedure, suggested originally by Ruelle [16], avoids the problem of calculating $X^{(k)}(t)$ from a time series of $X(t)$ and uses multiple time delays as a surrogate for successive derivatives. A point $\mathbf{X}(t)$ in an $M$-dimensional phase space is then defined as

$$\mathbf{X}(t) = [\,X(t),\ X(t+\tau),\ X(t+2\tau),\ \ldots,\ X(t+(M-1)\tau)\,]$$

where $\tau$ is the time delay. To construct a well-behaved phase space by time delay, a careful choice of $\tau$ is critical. A popular choice for this characteristic time scale is based on the autocorrelation function of the original time series. Here, the time delay $\tau$ is chosen such that the autocorrelation drops to $1/e$ [4]. As shown by Takens [17], the phase space retains essential properties of the original state space, including the dimensionality. In addition, as we will see, a phase space can be used to make short-term predictions and to make a practical distinction between low-dimensional determinism and noise [9,10].

As an example, Fig. 1a shows the first 100 points from the so-called Henon map. For the chosen parameter, this is a chaotic map. This time series is, in many ways, indistinguishable from a white noise sequence [Fig. 1b].
A phase-space plot of the Henon map and a white noise sequence in Fig. 2, on the other hand, reveals remarkable structure in the chaotic Henon map while the white noise sequence fills up the entire plane with no apparent structure.
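
The embedding procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this study: the function names (`henon_series`, `autocorrelation`, `delay_from_autocorrelation`, `delay_embed`) are hypothetical, the Henon parameters a = 1.4 and b = 0.3 are the commonly quoted chaotic values and are assumed here because the text does not state which values were used, and the delay is taken as the first integer lag at which the sample autocorrelation falls to 1/e or below.

```python
import math


def henon_series(n, a=1.4, b=0.3, x0=0.0, y0=0.0, discard=100):
    """Return n successive x-values of the Henon map.

    a, b, the initial point and the number of discarded transient
    iterations are assumptions made for illustration only.
    """
    x, y = x0, y0
    out = []
    for i in range(n + discard):
        # Henon map: x' = 1 - a*x^2 + y, y' = b*x (both use the old x)
        x, y = 1.0 - a * x * x + y, b * x
        if i >= discard:
            out.append(x)
    return out


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


def delay_from_autocorrelation(series, max_lag=100):
    """Smallest lag at which the autocorrelation drops to 1/e or below.

    Returns max_lag if the threshold is never reached within max_lag.
    """
    threshold = 1.0 / math.e
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) <= threshold:
            return lag
    return max_lag


def delay_embed(series, dim, tau):
    """Build delay vectors [x(t), x(t+tau), ..., x(t+(dim-1)*tau)]."""
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[t + k * tau] for k in range(dim)]
            for t in range(n_vectors)]
```

As a usage example, `delay_embed(henon_series(1000), dim=2, tau=1)` yields the pairs that would be plotted against each other in a two-dimensional phase-space plot; applying the same call to a white noise sequence of equal length gives the unstructured comparison case.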

2.2 Develop a methodology for phase-space prediction