5 Deterministic Chaos Fundamentals
5.2 Sensitive Dependence and Butterfly Effect
5.3 Average Mutual Information (AMI)
The ancient word “chaos” originally denoted a complete lack of form or
systematic arrangement, but today it is often used to imply the absence of
some kind of order that ought to be present. Moreover, chaos is regarded as a
universal phenomenon that is observed in many fields. Terms such as
non-linearity, complexity, and randomness are often used more or less
synonymously with chaos in one or several of its senses. Chaos could be
compared to the manner in which many disorganized systems can spontaneously
acquire organization, just as a shapeless liquid mass can, upon cooling,
solidify into an exquisite crystal. Mathematicians have defined chaos as
stochastic behavior occurring in a deterministic system. Chaos theory is
therefore the popular label for a body of theory about certain mathematical
models, and their applications, that describe deterministic systems so
sensitive to measurement that their output appears random.
Classic systems that vary deterministically as time progresses, such as
mathematical models, are known as dynamic systems. At least in the case of the
models, the state of the system may be specified by the numerical values of
one or more variables. A deterministic sequence is one in which only one thing
can happen next because it is governed by precise laws. On the other hand, a
random sequence of events is one in which anything that can ever happen can
happen next. Usually it is also understood that the probability that a given
event will happen next is the same as the probability that a like event will
happen at any later time. Hence, a random system is a system in which the
progression from earlier to later states is not completely determined by any
law; equivalently, it is a system that is not deterministic.
In general, suppose a real-world phenomenon whose state at a particular time
can be characterized by the values of the n variables $x_1, x_2, \ldots, x_n$
(so the $x_i$ might represent the angular position and velocity of a swinging
pendulum, or might indicate the relative concentrations of certain chemicals
in a mixture, or the velocity and temperature gradients in a convecting
fluid). If we choose the right quantities to represent these state variables,
then we may be able to specify the dynamics of the phenomenon, that is, the
way the phenomenon evolves over time, by giving the rate of change of each
variable as some function of the $x_i$. In other words, we may be able to
describe the system dynamics by means of a set of n linked differential
equations in the canonical form

$$\frac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_n), \qquad i = 1, 2, \ldots, n$$
Equation 5-1
If the $f_i$ in the set of Equations 5-1 satisfy some relatively mild
constraints, then there will be a unique solution to the equations for a given
set of initial conditions. In other words, a particular setting of the
parameters and the initial values of the $x_i$ at time $t_0$ will fix a unique
set of values for the $x_i$, at least for some interval of time around $t_0$
(and perhaps for all times). In the general case, we will not be able to write
down an explicit solution: we won’t be able to specify the value of each $x_i$
in terms of polynomial or trigonometrical functions of time. So to use the
equations we will have to resort to numerical integration by computer. But the
point of principle remains: the set of equations determines a unique evolution
of the state variables over some period of time. Hence, the equations describe
a mathematical model that is deterministic in a straightforward sense.
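As a minimal sketch of what numerical integration of a system in the form of
Equations 5-1 can look like, the following Python code advances a state vector
with the classical fourth-order Runge-Kutta scheme. The frictionless pendulum
used as the right-hand side, the constant `G_OVER_L`, the step size and all
function names are illustrative assumptions, not part of the original text.

```python
import math

G_OVER_L = 9.81  # assumed ratio g/l of the illustrative pendulum (1/s^2)

def pendulum(state):
    """Right-hand side f(x) for the state x = (angle, angular velocity)."""
    theta, omega = state
    return [omega, -G_OVER_L * math.sin(theta)]

def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for dx/dt = f(x)."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def integrate(f, initial_state, dt, n_steps):
    """Return the list of states visited, starting from initial_state."""
    trajectory = [list(initial_state)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(f, trajectory[-1], dt))
    return trajectory

def energy(state):
    """Energy per unit mass and length squared; conserved by the exact flow."""
    theta, omega = state
    return 0.5 * omega ** 2 - G_OVER_L * math.cos(theta)

# Two runs from the same initial conditions trace out the same trajectory.
run_a = integrate(pendulum, [0.5, 0.0], dt=0.01, n_steps=1000)
run_b = integrate(pendulum, [0.5, 0.0], dt=0.01, n_steps=1000)
print(run_a[-1], run_a == run_b)
```

Repeating the run with identical initial conditions reproduces the trajectory
exactly, which is the deterministic property described above; the `energy`
helper is only a convenient check that the numerical solution stays close to
the true one.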
Generally, it is helpful to look at things geometrically. So imagine the
values of the n state variables $x_1, x_2, \ldots, x_n$ as giving the
coordinates of a point in an abstract n-dimensional space, a so-called state
space or phase space. A point x in this phase space with the coordinates
$(x_1, x_2, \ldots, x_n)$ will then represent a particular instantaneous state
of our dynamic system. And, given a point representing the state at some
initial time $t_0$, the dynamic equations (with fixed parameters) will, in the
deterministic case, fix a unique trajectory or path traced out in phase space
by the point $x(t)$ representing the state at later times $t$ (see Figure
5.1).
If we are to apply a mathematical model to predict the evolution of some
real-world dynamic phenomenon, then we must start by fixing the initial
conditions to feed into the model. But we can only know the actual initial
conditions with some margin of error. If we input a small error in
representing the initial real-world state, then the dynamic equations will
output a correspondingly erroneous prediction about where the system ends up
at a later time t (and the predictive error may very well grow over time).
To put it geometrically, we
can only pin down the point representing the initial state of the dynamic
system to within some small fuzzy-boundaried “ball” of phase space and our
dynamic equations will then map that fuzzy initial region of phase space onto a
possibly much more spread-out region that will only contain the point
representing the later state at t as shown in Figure 5.2.
Figure 5.2 A small ‘ball’ of initial states spread out by the dynamics [35]
The hypothetical multidimensional space in which such a diagram would have to
be drawn, and in which a chaotic system is described by means of various
indices, is known as phase space. In other words, phase space is a
hypothetical space having as many dimensions as the number of variables needed
to specify a state of a given dynamic system. Each point represents a
particular state of a dynamic system. The coordinates of a point in phase
space (distances in mutually perpendicular directions from some reference
point, called the origin) are numerically equal to the values that the
variables assume when the state occurs. Although such diagrams are
conceptually useful, they cannot always be drawn, because a system may have
more variables, and hence more phase-space dimensions, than a drawing can
show.
In the phase space of chaotic dynamic systems, two orbits slightly separated
from each other will diverge exponentially with time. The rate at which two
infinitesimally separated orbits move away from or approach each other is
measured by the Lyapunov exponent, which is calculated as the long-time
average of the logarithm of the amplification (or reduction) rate of the
difference between the two orbits. Since the number of independent directions
in which the two orbits can deviate is equal to the dimension of the phase
space, the number of Lyapunov exponents is the same as the number of degrees
of freedom. Chaos is often characterized by a system having at least one
positive Lyapunov exponent. In other words, when an initial value is changed
only slightly, a later state becomes very different, and the system is said to
have a sensitive dependence on initial conditions. This instability of orbits
is what the Lyapunov exponent measures.
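The definition above, a long-time average of the logarithm of the local
amplification rate, can be sketched for a one-dimensional map. The logistic
map $x_{k+1} = r x_k (1 - x_k)$ is not discussed in the text; it is assumed
here only because its local amplification rate $|f'(x)| = |r(1 - 2x)|$ is
known in closed form. The parameter values, iteration counts and function
names are likewise illustrative choices.

```python
import math

def logistic(x, r):
    """One iteration of the logistic map (an assumed example system)."""
    return r * x * (1.0 - x)

def lyapunov_exponent(r, x0=0.2, n_transient=1000, n_iter=20000):
    """Average of log|f'(x)| along one orbit, after discarding a transient."""
    x = x0
    for _ in range(n_transient):
        x = logistic(x, r)
    total = 0.0
    for _ in range(n_iter):
        slope = abs(r * (1.0 - 2.0 * x))       # local amplification rate
        total += math.log(max(slope, 1e-300))  # guard against log(0)
        x = logistic(x, r)
    return total / n_iter

lam_periodic = lyapunov_exponent(3.2)  # orbit settles onto a period-2 cycle
lam_chaotic = lyapunov_exponent(3.9)   # orbit is irregular
print(lam_periodic, lam_chaotic)
```

A negative value means nearby orbits approach each other, and a positive value
means they separate exponentially, which is the signature of chaos described
above.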
One mark of chaos is sensitive dependence on initial conditions: a chaotic
system starting from two very similar initial states can develop in radically
divergent ways. An immediate consequence of sensitive dependence in any system
is the impossibility of making perfect predictions, or even mediocre
predictions, sufficiently far into the future. This assertion presupposes that
we cannot make measurements that are completely free of uncertainty. Since
Edward Lorenz presented his paper “Predictability: Does the Flap of a
Butterfly’s Wings in Brazil Set Off a Tornado in Texas?”, such sensitive
dependence on initial conditions has often been referred to as “the Butterfly
Effect,” because very small changes in initial conditions can become greatly
amplified by later events in ways that prevent useful prediction [35].
“A small blue butterfly, let’s suppose, sits on a cherry tree in a remote
province of China. As is the way of butterflies, while it sits it occasionally
opens and closes its wings. It could have opened its wings twice just now; in
fact it moved them only once. And – because the weather system exhibits
sensitive dependence – the minuscule difference in the resulting eddies of air
around the butterfly eventually makes the difference between whether, two
months later, a hurricane sweeps across southern England or harmlessly dies
out over the Atlantic. Or so the story goes.” [35]
Chaos is a type of unpredictable motion generated by deterministic equations
(differential equations or difference equations). For purposes of
experimentation, Lorenz created a new nonlinear system of three differential
equations (Equations 5-2, 5-3 and 5-4) to simulate an extremely simple model
of convection in the atmosphere.
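$$\frac{dx}{dt} = \sigma (y - x)$$
Equation 5-2

$$\frac{dy}{dt} = r x - y - x z$$
Equation 5-3

$$\frac{dz}{dt} = x y - b z$$
Equation 5-4

where $\sigma$, $r$ and $b$ are constant parameters.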
Even though these equations do not have an explicit analytic solution, a
simple numerical program can integrate Lorenz’s system of equations. When
Lorenz performed the numerical integration, he found that, for almost any
initial state, the model soon settles down with the values of x, y and z
confined between definite limits. Within those limits, though, the values vary
in highly complex ways.
Figures 5.3, 5.4 and 5.5 show sample runs of the system for an arbitrary set
of initial conditions, x=y=z=t=0.0001. Taken together, the three variables
locate a point in three-dimensional space, and following that point over time
produces the phase space diagram. The idea behind the phase space plot is to
convey what the system is like by capturing its output over a long period of
time in a single graph.
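A sample run of this kind can be sketched as follows. The equations are coded
in the standard form of the Lorenz system, and the parameter values
$\sigma = 10$, $r = 28$, $b = 8/3$, the step size of 0.01 and the
Runge-Kutta integrator are assumptions, since the text does not state which
values or which numerical method were used. Only the initial state
x=y=z=0.0001 and the run length of thirty-two time units are taken from the
text.

```python
SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0  # assumed classic parameter values

def lorenz(state):
    """Right-hand side of the Lorenz system in its standard form."""
    x, y, z = state
    return (SIGMA * (y - x), R * x - y - x * z, x * y - B * z)

def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for dx/dt = f(x)."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(initial_state, dt=0.01, n_steps=3200):
    """Return the sequence of (x, y, z) points visited by the system."""
    trajectory = [tuple(initial_state)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(lorenz, trajectory[-1], dt))
    return trajectory

trajectory = simulate((0.0001, 0.0001, 0.0001))

# Report the limits within which each variable stays during the run.
for name, index in (("x", 0), ("y", 1), ("z", 2)):
    values = [point[index] for point in trajectory]
    print(name, min(values), max(values))
```

Plotting each column of `trajectory` against time gives the individual time
series, while plotting the three columns against each other gives the phase
space diagram.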
Figure 5.6 shows the phase space plot of the system. The variables x, y, and z
act together over a period of thirty-two seconds in this simulation. The image
is known as the “Lorenz Attractor” and is one of the earliest examples of
chaos ever recorded. It has also been referred to as “Lorenz’s Butterfly.” The
Lorenz attractor always has the familiar butterfly shape: no matter how random
each variable may appear to be on its own, the combination of the three always
produces the same picture.
Figure 5.7 shows another run
of the Lorenz attractor program for slightly different initial conditions.
Figure 5.7 Lorenz Attractor X vs. Time for slightly different initial conditions [42]
Note in Figure 5.8, where this run is plotted against the earlier x variable
results, how the outputs stay nearly the same for a good portion of time at
the beginning, but then diverge into completely different patterns.
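The same behavior can be reproduced with a short experiment that integrates
the Lorenz system twice and records the distance between the two runs. This is
a sketch under stated assumptions: the parameter values, the Runge-Kutta
integrator, the starting point and the perturbation of 1e-8 in x are
illustrative choices, because the text does not give the size of the "slightly
different" initial conditions.

```python
import math

SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0  # assumed classic parameter values
DT = 0.01                            # assumed integration step

def lorenz(state):
    x, y, z = state
    return (SIGMA * (y - x), R * x - y - x * z, x * y - B * z)

def rk4_step(f, state, dt):
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(initial_state, n_steps):
    trajectory = [tuple(initial_state)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(lorenz, trajectory[-1], DT))
    return trajectory

# Discard a transient so that both runs start close to the attractor.
start = simulate((1.0, 1.0, 1.0), 1000)[-1]
perturbed = (start[0] + 1e-8, start[1], start[2])

run_1 = simulate(start, 4000)
run_2 = simulate(perturbed, 4000)

# Euclidean distance between the two runs at each time step.
separation = [math.dist(p, q) for p, q in zip(run_1, run_2)]
print(separation[0], separation[10], max(separation[3000:]))
```

The separation stays tiny at first and later becomes comparable to the size of
the attractor itself, at which point the two x(t) curves no longer resemble
each other.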
Figure 5.9 again shows the Lorenz attractor, this time with the same slight
variation in initial conditions as in Figures 5.7 and 5.8. It keeps the same
butterfly shape, despite the utter lack of correlation between the two runs
seen in Figure 5.8.
Figure 5.9 The Lorenz Attractor with slightly different initial conditions [42]
Mutual information is a general measure, based on information theory, of the
extent to which the values in a time series can be predicted by earlier
values. Unlike the autocorrelation function, it is not limited to linear
dependence. The correlation function estimates how strongly two random
processes are related to each other. The true cross-correlation sequence is
defined by

$$R_{xy}(m) = E\{x_{n+m}\, y_n^{*}\}$$
Equation 5-5

where $x_n$ and $y_n$ are stationary random processes, $-\infty < n < \infty$,
the asterisk denotes complex conjugation, and $E\{\cdot\}$ is the expected
value operator. Autocorrelation is handled as a special case of correlation,
and it is useful in obtaining a partial description of a time series for
forecasting. The autocorrelation function of $x_n$ is defined as

$$R_{xx}(m) = \frac{1}{N}\sum_{n=1}^{N} x_{n+m}\, x_n^{*}$$
Equation 5-6

where the average is taken over N samples and m is the autocorrelation time in
samples.
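For a real-valued time series, the sample autocorrelation of Equation 5-6 can
be sketched as an average of lagged products. The function name, the choice of
N as the number of available pairs, and the sinusoidal test series are
illustrative assumptions.

```python
import math

def autocorrelation(x, m):
    """Average of x[n] * x[n + m] over the N = len(x) - m available pairs."""
    n_pairs = len(x) - m
    if n_pairs <= 0:
        raise ValueError("lag m must be smaller than the series length")
    return sum(x[n] * x[n + m] for n in range(n_pairs)) / n_pairs

# A sinusoid with a period of 20 samples (an assumed test signal).
period = 20
series = [math.sin(2.0 * math.pi * n / period) for n in range(1000)]

r_zero = autocorrelation(series, 0)               # lag 0
r_quarter = autocorrelation(series, period // 4)  # quarter period
r_half = autocorrelation(series, period // 2)     # half period
print(r_zero, r_quarter, r_half)
```

For this signal the autocorrelation is largest at lag zero, close to zero at a
quarter of the period, and negative at half the period. The negative value is
the behavior contrasted later with mutual information, which is never
negative.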
In order to explain AMI, consider an experiment A with possible outcomes
$a_1, a_2, a_3, \ldots, a_n$. If the respective probabilities are
$p_1, p_2, p_3, \ldots, p_n$, the uncertainty of the outcome can be assessed.
If a system is deterministic there is no point in performing an experiment,
because all $p_i$ are zero except one. On the other hand, if all $p_i$ are
equal, the uncertainty of the outcome is at its maximum and the information
gained by carrying out the experiment is also maximal. Consequently, the
information obtained by a measurement of the outcome of a finite scheme A can
be expressed through the corresponding entropy H(A):

$$H(A) = -\sum_{i=1}^{n} p_i \log_2 p_i$$
Equation 5-7
H(A) is the entropy, a measure of randomness: the more random a variable is,
the more entropy it has, as presented in Figures 6.2 and 6.3.
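The two limiting cases described above, a deterministic outcome and
equiprobable outcomes, can be checked with a few lines of Python. This is a
minimal sketch that assumes entropy is measured in bits (base-2 logarithm);
the function name and the example distributions are illustrative.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits; outcomes with p = 0 contribute nothing."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0.0)

h_deterministic = entropy([1.0, 0.0, 0.0, 0.0])  # only one outcome possible
h_uniform = entropy([0.25, 0.25, 0.25, 0.25])    # all outcomes equiprobable
h_skewed = entropy([0.7, 0.1, 0.1, 0.1])         # somewhere in between
print(h_deterministic, h_uniform, h_skewed)
```

The deterministic scheme has zero entropy, the uniform scheme reaches the
maximum of $\log_2 4 = 2$ bits, and any other distribution over four outcomes
falls between the two.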
In order to determine higher order relationships, it is necessary to introduce
higher order measures. For example, if measurements are collected from two
schemes $A = \{a_i\}$ and $B = \{b_j\}$, the mutual information I(A,B) is the
measure of how much can be said about the one given the other:

$$I(A,B) = H(A) + H(B) - H(A,B)$$
Equation 5-8

$$H(A,B) = -\sum_{i}\sum_{j} P_{AB}(a_i, b_j) \log_2 P_{AB}(a_i, b_j)$$
Equation 5-9

Here H(A,B) refers to the information obtained considering A and B together,
which can be written as

$$H(A,B) = H(B) + H(A|B)$$
Equation 5-10
In which $H(A|B)$ denotes the conditional entropy, that is, the entropy of A
given B. If A and B are independent, the terms $H(A|B)$ and H(A) become equal,
reducing H(A,B) to H(A)+H(B) and finally implying that the mutual information
between A and B amounts to zero, I(A,B)=0. It should also be noted that
$I(A,B) \ge 0$, so that there are no negative values as in the case of the
autocorrelation function. Defined in another way, average mutual information
is the reduction in uncertainty for one variable due to knowing about another.
Therefore, mutual information is

$$I(A,B) = H(A) - H(A|B)$$
Equation 5-11
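These relations can be verified numerically on small joint probability tables.
The sketch below assumes entropies in bits and a table `joint[i][j]` holding
the joint probability of outcome $a_i$ of A and outcome $b_j$ of B; the
function names and the three example tables are illustrative.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits; zero-probability entries are skipped."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0.0)

def joint_entropy(joint):
    """H(A,B) from a table joint[i][j] = P(a_i, b_j)."""
    return entropy([p for row in joint for p in row])

def marginals(joint):
    """Return the marginal distributions of A (rows) and B (columns)."""
    return [sum(row) for row in joint], [sum(col) for col in zip(*joint)]

def mutual_information(joint):
    """I(A,B) = H(A) + H(B) - H(A,B)."""
    p_a, p_b = marginals(joint)
    return entropy(p_a) + entropy(p_b) - joint_entropy(joint)

def conditional_entropy(joint):
    """H(A|B) = H(A,B) - H(B)."""
    _, p_b = marginals(joint)
    return joint_entropy(joint) - entropy(p_b)

independent = [[0.25, 0.25], [0.25, 0.25]]  # A tells nothing about B
identical = [[0.5, 0.0], [0.0, 0.5]]        # A determines B completely
partial = [[0.4, 0.1], [0.1, 0.4]]          # A and B usually agree

i_independent = mutual_information(independent)
i_identical = mutual_information(identical)
i_partial = mutual_information(partial)
print(i_independent, i_identical, i_partial)
```

The independent table gives zero mutual information, the identical table gives
the full entropy of A (one bit), and the partially dependent table falls in
between. In each case the result also equals H(A) minus
`conditional_entropy(joint)`.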
Figure 5.12 presents graphically the average mutual information I(A,B) between
the random variables A and B.
Figure 5.12 Average Mutual Information, I(A,B)
Within the context of
nonlinear deterministic systems and chaos theory, AMI is used to determine the
time delay for the phase space reconstruction.
For the analysis of a signal $s(n)$, $s(n)$ is considered to be the
measurement of the signal at time n and $s(n+T)$ is the measurement of the
signal a time $T$ later. Then the first minimum of $I(T)$ (AMI) is selected as
the time delay to use in making vectors out of the observed one-dimensional
data $s(n)$. So we take as the set of measurements A the values of the
observable $s(n)$, and for the B measurements, the values of $s(n+T)$. Then,
the AMI between these two measurements, that is, the amount in bits learned by
measurements of $s(n)$ through measurements of $s(n+T)$, is

$$I(T) = \sum_{s(n),\, s(n+T)} P\big(s(n), s(n+T)\big) \log_2\!\left[\frac{P\big(s(n), s(n+T)\big)}{P\big(s(n)\big)\, P\big(s(n+T)\big)}\right]$$
Equation 5-12
By general arguments, $I(T) \ge 0$, and the average mutual information is
directly related to the Kolmogorov–Sinai entropy [1]. When $T$ becomes large,
the chaotic behavior of the signal makes the measurements $s(n)$ and $s(n+T)$
become independent in a practical sense, and $I(T)$ will tend to zero. The
function $I(T)$ is therefore used as a kind of nonlinear autocorrelation
function to determine when the values of $s(n)$ and $s(n+T)$ are independent
enough of each other to be useful as coordinates in a time delay vector, but
not so independent as to have no connection with each other at all [1].
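The procedure can be sketched as follows: estimate the probabilities in the
AMI sum from histograms of the signal, evaluate the AMI over a range of
delays, and take the first local minimum as the time delay. The equal-width
binning with 16 bins, the quasi-periodic test signal and all function names
are assumptions made for illustration; a measured signal and a different
probability estimator could be substituted.

```python
import math
from collections import Counter

def average_mutual_information(s, T, n_bins=16):
    """Histogram estimate, in bits, of the AMI between s(n) and s(n+T)."""
    lo, hi = min(s), max(s)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for a flat signal

    def bin_index(value):
        return min(int((value - lo) / width), n_bins - 1)

    pairs = [(bin_index(s[n]), bin_index(s[n + T])) for n in range(len(s) - T)]
    total = len(pairs)
    joint = Counter(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    # Each term is P(a,b) * log2( P(a,b) / (P(a) * P(b)) ).
    return sum((c / total) * math.log2(c * total / (count_a[a] * count_b[b]))
               for (a, b), c in joint.items())

def first_minimum(values):
    """Index of the first local minimum of a sequence, or None if absent."""
    for T in range(1, len(values) - 1):
        if values[T] < values[T - 1] and values[T] <= values[T + 1]:
            return T
    return None

def delay_vectors(s, T, dimension):
    """Time delay vectors (s(n), s(n+T), ..., s(n+(dimension-1)T))."""
    span = (dimension - 1) * T
    return [tuple(s[n + k * T] for k in range(dimension))
            for n in range(len(s) - span)]

# Assumed test signal: a sum of two sinusoids with unrelated frequencies.
signal = [math.sin(0.1 * n) + 0.5 * math.sin(0.23 * n) for n in range(5000)]
ami_curve = [average_mutual_information(signal, T) for T in range(60)]
delay = first_minimum(ami_curve)
print(delay)
```

If `delay` is not `None`, `delay_vectors(signal, delay, dimension)` builds the
reconstructed phase space points; the embedding `dimension` has to be chosen
separately.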