
Physica A 283 (2000) 568–574
www.elsevier.com/locate/physa

Evolutionary freezing in a competitive population
N.F. Johnson (a, *), D.J.T. Leonard (a), P.M. Hui (b), T.S. Lo (b)

(a) Clarendon Laboratory, Department of Physics, University of Oxford, Parks Road, Oxford, OX1 3PU, UK
(b) Department of Physics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong

Received 30 March 2000

Abstract
We show that evolution in a population of adaptive agents, repeatedly competing for a limited
resource, can come to an abrupt halt. This transition from evolutionary to non-evolutionary
behavior arises as the global resource level is changed, and is reminiscent of a phase transition
to a frozen state. Its origin lies in the inductive decision-making of the agents, the limited global
information that they possess and the dynamic feedback inherent in the system. © 2000 Elsevier
Science B.V. All rights reserved.
PACS: 89.60.+x; 05.40.+j; 64.60.Lx; 87.10.+e
Keywords: Self-organized phenomena; Complex adaptive systems; Minority game

An evolving population in which individual members (‘agents’) adapt their behavior according to past experience, while repeatedly competing for some limited global
resource, is of great interest to the social, economic and biological sciences [1–8].
Arthur [5] has introduced a simple example, called the bar-attendance problem, which
typifies the complexity often encountered in these different disciplines [3]: potential
customers of a particular bar with a limited seating capacity have to repeatedly decide
whether or not to attend on a given night each week [9]. A special limiting case of
this problem, the so-called minority game [10–12], is currently attracting much attention (see e.g. Refs. [13–18]) and has been shown to have a fascinating connection with
disordered spin systems [13].
In this paper we introduce and study a simple implementation of the general bar problem which includes evolutionary learning by the attendees. We find that the evolutionary process can come to an abrupt halt as the global resource level is varied. This
transition is dynamical in origin: it results from the inductive decision-making of the
agents, the limited global information that they possess, and the dynamical feedback
present in the system’s memory.

* Corresponding author. Tel.: +44 1865 272287; fax: +44 1865 272400.
E-mail address: [email protected] (N.F. Johnson).
0378-4371/00/$ - see front matter
PII: S0378-4371(00)00230-2

Consider $N$ agents repeatedly deciding whether to attend a bar with a seating capacity
(i.e., global resource level) of $L$. Let the actual attendance at the bar at timestep $t$ be $A_t$.
If $A_t \le L$, the outcome is the signal ‘undercrowded’. In contrast, if $A_t > L$ then the outcome is the signal ‘overcrowded’. Hence, the sequence of outcomes can be represented by a binary
string of 0s (representing, say, ‘undercrowded’) and 1s (representing ‘overcrowded’).
The outcome is the only information made known to all agents: the value $L$ is not
announced, and the agents cannot communicate with each other. All agents have the
same level of capability: specifically, each agent has access to a common register or
‘memory’ containing the outcomes from the most recent occurrences of all $2^m$ possible bit-strings $\{\mu\}$ of length $m$. This register can hence be written at timestep $t$ as
a $2^m$-dimensional vector $h_t$ with binary components $h_t^{\mu} \in \{0, 1\}$ corresponding to the
outcome of the most recent occurrence of history $\mu$: we will call $h_t$ the ‘predicted
trend’ at time $t$. We assign each agent a single strategy $p$. Following a given $m$-bit
sequence $\mu$, $p$ is the probability that the agent will choose the same outcome $h_t^{\mu}$ as
that stored in the common register, i.e., she follows the predicted trend; $1 - p$ is the
probability she will choose the opposite, i.e., reject the predicted trend.
The ‘good’ decisions are attending (not attending) the bar with the outcome being
‘undercrowded’ (‘overcrowded’). The ‘bad’ decisions are attending (not attending) the
bar with the outcome being ‘overcrowded’ (‘undercrowded’). After the outcome at a
given time $t$ is announced, the agents making ‘good’ (‘bad’) decisions gain (lose) one
point. If an agent’s score falls below a value $d < 0$, then her strategy is modified,
i.e., the agent gets a new $p$ value which is chosen with equal probability from
a range of values, centered on the old $p$, with a width equal to $R$. Upon strategy
modification, the agent’s score is reset to zero. Hence $d$ sets the net number of times an
agent is willing to be wrong before modifying her strategy. Although this evolutionary
procedure provides a fairly crude ‘learning’ rule as far as machines are concerned [19],
in our experience it is not too dissimilar from the way that humans actually behave
in practice. Changing $R$ allows the way in which the agents learn to be varied. For
$R = 0$, the strategies will never change (though the memory register will). If $R = 2$, the
strategies before and after modification are uncorrelated. For small $R$, the new $p$ value
is close to the old one. Our results are insensitive to the particular choice of boundary
conditions employed. For $L = (N - 1)/2$, the model reduces to the evolutionary minority
game introduced in Ref. [20].
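The rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `simulate_bar`, the random initial register and history, the reflecting boundary conditions at $p = 0$ and $p = 1$, and the reading that an agent attends exactly when she predicts ‘undercrowded’ are all assumptions made here for concreteness.

```python
import numpy as np

def simulate_bar(N=101, L=50, m=3, d=-4, R=2.0, steps=2000, seed=0):
    """Sketch of the evolutionary bar model; returns (attendance, p)."""
    rng = np.random.default_rng(seed)
    p = rng.random(N)                    # one strategy p per agent
    score = np.zeros(N, dtype=int)
    h = rng.integers(0, 2, size=2**m)    # register: last outcome seen after each history
    mu = int(rng.integers(0, 2**m))      # current m-bit history, stored as an integer
    attendance = np.empty(steps, dtype=int)

    for t in range(steps):
        trend = h[mu]                    # predicted trend: 0 = undercrowded, 1 = overcrowded
        follow = rng.random(N) < p       # follow the trend with probability p
        prediction = np.where(follow, trend, 1 - trend)
        attend = prediction == 0         # assumption: attend iff predicting 'undercrowded'
        A = int(attend.sum())
        outcome = 0 if A <= L else 1
        attendance[t] = A

        # A 'good' decision is one whose prediction matched the announced outcome.
        score += np.where(prediction == outcome, 1, -1)

        # Agents whose score fell below d draw a new p within width R of the old one.
        losers = score < d
        n_losers = int(losers.sum())
        if n_losers:
            new_p = p[losers] + R * (rng.random(n_losers) - 0.5)
            new_p = np.where(new_p < 0, -new_p, new_p)      # reflect at 0 (assumed)
            new_p = np.where(new_p > 1, 2 - new_p, new_p)   # reflect at 1 (assumed)
            p[losers] = new_p
            score[losers] = 0

        # Store the outcome for this history, then slide the history window.
        h[mu] = outcome
        mu = ((mu << 1) | outcome) & (2**m - 1)

    return attendance, p
```

Calling `simulate_bar(N=1001, L=750, steps=10_000)` would correspond roughly to one run at a resource level of 75% with the parameter values quoted in the text; the single reflection used here is only adequate for $R \le 2$.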

Fig. 1 shows the numerical results for the mean and standard deviation of the bar
attendance as a function of the global resource level (i.e., seating capacity) defined as
the percentage $\ell = (100 \times L/N)\%$. Here $N = 1001$, $m = 3$, $d = -4$ and $R = 2$, although
the same general features arise for other parameter values. Averages are taken over
$10^4$ timesteps within a given run, and then over 10 separate runs with random initial
conditions: this average is denoted by $\langle \cdots \rangle$. Here we will focus on the mean attendance
$\langle A \rangle$; the standard deviation of the attendance $\Delta A$ given by $[\langle A^2 \rangle - \langle A \rangle^2]^{1/2}$; the average
predicted trend given by $\langle h^{\mu} \rangle$; and the average strategy $\langle p \rangle$ given by $\frac{1}{N} \int_0^1 p P(p)\, dp$, where
$P(p)$ is the strategy distribution among the agents. $P(p)$ satisfies $\int_0^1 P(p)\, dp = N$ and
is time- and run-averaged as discussed above. From the definition of the game, games
with cutoff $L' \equiv N - L$ are related to games with cutoff $L$ in that the mean attendance
in the game with $L'$ can be found from the mean population of agents not attending
the bar in a game with $L$. Hence we focus on $L > N/2$. The resulting symmetry about
$\ell = 50\%$ is clearly shown in Fig. 1.

Fig. 1. Numerical results for the mean bar attendance $\langle A \rangle$ and standard deviation $\Delta A$ as a function of the
global resource level (i.e., seating capacity) defined as the percentage $\ell = (100 \times L/N)\%$. The parameters
(see text) are $N = 1001$, $m = 3$, $d = -4$ and $R = 2$.
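The normalization of the strategy distribution and the definition of the average strategy can be made concrete with a histogram estimate. The sketch below is an illustration only: the helper name `strategy_average` and the choice of 50 bins are assumptions, and the input is taken to be an array of the agents' $p$ values from a single snapshot rather than the time- and run-averaged distribution used in the text.

```python
import numpy as np

def strategy_average(p_values, bins=50):
    """Histogram estimate of P(p), its integral, and the average strategy <p>."""
    N = len(p_values)
    counts, edges = np.histogram(p_values, bins=bins, range=(0.0, 1.0))
    width = edges[1] - edges[0]
    P = counts / width                        # P(p), scaled so its integral is N
    centers = 0.5 * (edges[:-1] + edges[1:])
    norm = np.sum(P * width)                  # integral of P(p) dp, equals N
    p_mean = np.sum(centers * P * width) / N  # (1/N) * integral of p P(p) dp
    return P, norm, p_mean
```

Because each agent is assigned to a bin center, the estimate of the average strategy differs from the plain arithmetic mean of the $p$ values by at most half a bin width.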
At $\ell = 50\%$, the mean attendance $\langle A \rangle = N/2$ while $\Delta A$ is smaller than for the
‘random’ game in which agents independently decide by tossing a coin ($\Delta A_{\rm random} \approx \sqrt{N}/2 = 15.8$).
As $L$ and hence $\ell$ increase ($\ell > 50\%$), the mean attendance $\langle A \rangle$ initially
shows a small plateau-like structure, then increases steadily while always lying below
the value of $L$. This increase in $\langle A \rangle$ occurs despite the fact that the value $L$ (and hence
$\ell$) is unknown to the agents. The standard deviation $\Delta A$ increases rapidly despite the
fact that the number of available seats in the bar is actually increasing. It seems that by
increasing the level of available resources, the system actually becomes less efficient in
accessing this resource. Most surprising, however, is the abrupt transition in both the
mean attendance and standard deviation which occurs around $\ell_c = 75\%$. (The precise
value of $\ell_c$ depends on the strategy modification range $R$.) For $\ell > \ell_c$, both the mean
attendance and standard deviation become constant, regardless of the increase in the
seating capacity $L$. In this region, the bar is practically always undercrowded since
$(\langle A \rangle + \Delta A) \lesssim L$, thereby offering a significant ‘arbitrage’ opportunity for the selling of
‘guaranteed’ seating in advance.
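The coin-toss benchmark quoted above follows from the standard deviation of a binomial distribution: with $N$ independent agents each attending with probability 1/2, the attendance has standard deviation $\sqrt{N \cdot \tfrac{1}{2} \cdot \tfrac{1}{2}} = \sqrt{N}/2$. A short numerical check, given here purely as an illustration (the sample size and seed are arbitrary choices), is:

```python
import math
import numpy as np

N = 1001
analytic = math.sqrt(N) / 2           # standard deviation of Binomial(N, 1/2)

rng = np.random.default_rng(0)
samples = rng.binomial(N, 0.5, size=100_000)   # attendance in the 'random' game
empirical = samples.std()

print(round(analytic, 1))             # 15.8
print(empirical)                      # close to the analytic value
```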
Fig. 2 shows the corresponding variation in the average predicted trend $\langle h^{\mu} \rangle$ and the
average strategy $\langle p \rangle$, as a function of the global resource level $\ell$. Remarkably, $\langle h^{\mu} \rangle$ has
a step-like structure with abrupt changes at $\ell_c$ and $\ell = 50\%$. These abrupt transitions
are a direct result of the dynamical feedback and memory in the system. A random $h_t$
generator as provided by an exogenous source – e.g. sun-spot activity for prediction of
financial markets – will not contain these discrete steps and hence does not reproduce
the corresponding abrupt transitions shown in Figs. 1 and 2. For example, a coin-toss
would yield $\langle h^{\mu} \rangle = 1/2$ for all $\ell$; however, the endogenously-produced predicted trend in
the present model only takes this value at exactly $\ell = 50\%$.

Fig. 2. Numerical results for the average predicted trend $\langle h^{\mu} \rangle$ and the average strategy $\langle p \rangle$ as a function of
the global resource level $\ell$.

The strategy distribution $P(p)$ is symmetric at $\ell = 50\%$, with peaks around $p = 0$
and $p = 1$ [18,20]. As $\ell$ increases, $P(p)$ becomes asymmetric. The peak around $p = 1$
becomes larger while that around $p = 0$ becomes smaller: the agents hence tend to
follow the predicted trend $h_t^{\mu}$ more often than not. However, an abrupt change occurs
at $\ell = \ell_c$. Fig. 3 (upper) shows $P(p)$ just either side of $\ell_c$, at $\ell = \ell_c \pm 2\%$. For
$\ell = \ell_c - 2\%$ (solid line) the asymmetric distribution is clear but $P(p)$ remains non-zero
for all $p$. For $\ell = \ell_c + 2\%$ (dashed line), $P(p) = 0$ for $p < 1/2$ but has an irregular
shape for $p > 1/2$. The final form of $P(p)$ depends on the initial strategy distributions at
time $t = 0$ and the strategy modification range $R$. For $\ell_c < \ell \le 100\%$, the distribution
$P(p)$ and hence $\langle p \rangle$ remain essentially unchanged.
Fig. 4 shows the average lifespan per agent defined as $\lambda = (1/N) \int_0^1 P(p) \lambda(p)\, dp$,
where $\lambda(p)$ is the average number of turns between strategy modifications experienced
by an agent at $p$. As $\ell \to \ell_c$, this lifetime $\lambda$ becomes effectively infinite (i.e., the duration of the simulation) and remains at this value for all $\ell > \ell_c$. Above $\ell_c$ all agents
have a strategy value $p > 1/2$: the agents do not modify their strategies and hence keep
this same $p$ value indefinitely. There is hence no evolution in this system for $\ell > \ell_c$,
despite the fact that the bar has increased seating capacity which could be exploited
(Fig. 1). In essence, the agents are winning enough points to avoid strategy modification:
although the population as a whole is not acting optimally, the system chooses not
to employ any additional evolution. The frozen, non-optimal steady state has a huge
degeneracy. These findings suggest to us the following informal analogy with a magnetic system, with $0 \le p < 1/2$ representing spin-down and $1/2 < p \le 1$ representing spin-up.
The point $\ell = 50\%$ corresponds to an antiferromagnet at zero field: since $P(p)$ is symmetric, there is no net magnetization. As the external ‘field’ $(\ell - 50)\%$ increases, there
is some weak paramagnetism yielding a small value of the magnetization because
$P(p)$ is now biased towards $p = 1$ (Fig. 3 upper graph, solid line). At the critical
field $(\ell_c - 50)\%$, there is a transition to a ferromagnetic state (i.e., pure spin-up with
$p > 1/2$) as a result of the change in $P(p)$ shown in Fig. 3.

Fig. 3. Numerical results for the strategy distribution $P(p)$ (upper graph) and $\lambda(p)$ (lower graph – see text
for definition) near the transition point $\ell = \ell_c$. Solid line: $\ell = \ell_c - 2\%$. Dashed line: $\ell = \ell_c + 2\%$.

Fig. 4. Numerical results for the average lifespan per agent $\lambda$ as a function of the global resource level $\ell$.

A more formal connection between the behaviors of the ‘macroscopic’ variables in
Fig. 1 and the ‘microscopic’ variables in Fig. 2 can be made analytically. In particular,
we have shown, using a mean-field approximation which averages over the dynamics,
that the mean attendance is given by

$$\langle A \rangle = N \left[ \langle h^{\mu} \rangle + \langle p \rangle - 2 \langle h^{\mu} \rangle \langle p \rangle \right] \tag{1}$$

and the standard deviation is given by

$$\Delta A = 2N \left[ \langle h^{\mu} \rangle \left( 1 - \langle h^{\mu} \rangle \right) \right]^{1/2} \left| \langle p \rangle - \tfrac{1}{2} \right| . \tag{2}$$

Here $\langle h^{\mu} \rangle$ and $\langle p \rangle$ are implicitly functions of $\ell$. At $\ell = 50\%$, $\langle h^{\mu} \rangle = 1/2$ and $\langle p \rangle = 1/2$
(Fig. 2), hence $\langle A \rangle = N/2$ in agreement with Fig. 1. For $50\% < \ell < \ell_c$, $\langle h^{\mu} \rangle \approx 1/4$: the
ratio of the slopes of $\langle A \rangle$ and $N \langle p \rangle$ in Figs. 1 and 2 is 0.50, which agrees exactly with
the analytic result obtained by differentiating Eq. (1) with respect to $\langle p \rangle$. Similarly,
the ratio of the slopes of $\Delta A$ and $N \langle p \rangle$ in Figs. 1 and 2 is 0.87, which agrees exactly
with the analytic result obtained by differentiating Eq. (2) with respect to $\langle p \rangle$. For
$\ell > \ell_c$, $\langle h^{\mu} \rangle = 0$, hence Eqs. (1) and (2) yield $\langle A \rangle \approx N \langle p \rangle$ and $\Delta A \approx 0$, which is
again in good agreement with the numerical results of Figs. 1 and 2. Note that the
result $\langle h^{\mu} \rangle = 0$ implies that the bar is always predicted to be undercrowded: clearly this
is consistent with the outcome since the mean $\langle A \rangle$ is well below $L$ and the standard
deviation is very small. Hence we can see why the undercrowded state provides an
attractor for the game dynamics above the freezing transition at $\ell_c$. We note in passing
that the evolutionary freezing which arises for our generalized bar model (Figs. 1–4)
is qualitatively different from the freezing described by Challet and Marsili for the
(non-evolutionary) minority game [13].
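The slope ratios quoted above can be checked directly from Eqs. (1) and (2). The short sketch below is an illustration only: the function names are hypothetical, and the values $p = 0.6$ and $p = 0.7$ are arbitrary points on the $p > 1/2$ branch. Differentiating Eq. (1) with respect to $N \langle p \rangle$ gives $1 - 2 \langle h^{\mu} \rangle$, and Eq. (2) gives $2 [\langle h^{\mu} \rangle (1 - \langle h^{\mu} \rangle)]^{1/2}$ for $\langle p \rangle > 1/2$; at $\langle h^{\mu} \rangle = 1/4$ these are 0.50 and $\sqrt{3}/2 \approx 0.87$.

```python
import math

def mean_attendance(h, p, N):
    """Eq. (1): mean-field mean attendance for average trend h and strategy p."""
    return N * (h + p - 2 * h * p)

def std_attendance(h, p, N):
    """Eq. (2): mean-field standard deviation of the attendance."""
    return 2 * N * math.sqrt(h * (1 - h)) * abs(p - 0.5)

N = 1001
h = 0.25               # average predicted trend for 50% < l < l_c
p1, p2 = 0.6, 0.7      # two illustrative average strategies with p > 1/2

# Slopes with respect to N<p>, from finite differences (both equations are linear here).
slope_mean = (mean_attendance(h, p2, N) - mean_attendance(h, p1, N)) / (N * (p2 - p1))
slope_std = (std_attendance(h, p2, N) - std_attendance(h, p1, N)) / (N * (p2 - p1))

print(round(slope_mean, 2), round(slope_std, 2))   # 0.5 0.87

# Frozen regime: h = 0 gives <A> = N<p> and zero standard deviation.
frozen_mean = mean_attendance(0.0, p2, N)
frozen_std = std_attendance(0.0, p2, N)
```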
In summary, we have reported transitions involving an evolving population of adaptive agents who repeatedly compete for a limited, but adjustable, global resource.
Abrupt changes arise for both microscopic and macroscopic variables of the system,
as the level of available resources is varied. The present results depend crucially for
their existence on the dynamical feedback and non-local time correlations present in
the system.
References
[1] P. Bak, How Nature Works, Oxford University Press, Oxford, 1997.
[2] S. Kauffman, The Origins of Order: Self-organization and Selection in Evolution, Oxford University
Press, New York, 1993.
[3] J.L. Casti, Would-be Worlds, Wiley, New York, 1997.
[4] J.D. Farmer, Santa Fe Institute preprint 98-12-117.
[5] W.B. Arthur, Am. Econ. Assoc. Papers Proc. 84 (1994) 406.
[6] B. Huberman, R. Lukose, Science 277 (1997) 535.
[7] K. Sigmund, Games of Life: Explorations in Ecology, Evolution and Behaviour, Penguin, London, 1992.
[8] L.A.N. Amaral, S.V. Buldyrev, S. Havlin, M.A. Salinger, H.E. Stanley, Phys. Rev. Lett. 80 (1998)
1385.
[9] N.F. Johnson, S. Jarvis, R. Jonson, P. Cheung, Y.R. Kwong, P.M. Hui, Physica A 258 (1998) 230.
[10] D. Challet, Y.C. Zhang, Physica A 246 (1997) 407.
[11] D. Challet, Y.C. Zhang, Physica A 256 (1998) 514.
[12] Y.C. Zhang, Europhys. News 29 (1998) 51.
[13] D. Challet, M. Marsili, Phys. Rev. E 60 (1999) R6271.
[14] D. Challet, M. Marsili, R. Zecchina, Phys. Rev. Lett. 84 (2000) 1824.
[15] R. Savit, R. Manuca, R. Riolo, Phys. Rev. Lett. 82 (1999) 2203.
[16] N.F. Johnson, M. Hart, P.M. Hui, Physica A 269 (1999) 1.
[17] A. Cavagna, J.P. Garrahan, I. Giardina, D. Sherrington, Phys. Rev. Lett. 83 (1999) 4429.
[18] R. D’hulst, G.J. Rodgers, Physica A 270 (1999) 514.
[19] R.S. Sutton, A.G. Barto, Reinforcement Learning, MIT Press, Cambridge, 1998.
[20] N.F. Johnson, P.M. Hui, R. Jonson, T.S. Lo, Phys. Rev. Lett. 82 (1999) 3360.