Chapter 7 Random Number Generation

Random numbers play a key role in discrete-event simulation. We have used uniformly distributed random numbers in many programming assignments before studying simulation. In simulation, we also need random numbers that follow distributions other than the uniform one.

7.1 Properties of Random Numbers

A sequence of random numbers $R_1, R_2, R_3, \ldots$ must have two important properties:

1. Uniformity, i.e. all values are equally probable everywhere in the range.

2. Independence, i.e. the current value of a random variable has no relation to the previous values.

Each random number $R_i$ is an independent sample drawn from a continuous uniform distribution between zero and one.

• Expectation

$$E(R) = \int_0^1 x \, dx = \frac{1}{2}$$

• Variance

$$V(R) = \int_0^1 x^2 \, dx - [E(R)]^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$$

Some consequences of the uniformity and independence properties:

1. If the interval (0, 1) is divided into n sub-intervals of equal length, the expected number of observations in each interval is N/n, where N is the total number of observations. Note that N has to be sufficiently large to show this trend.

2. The probability of observing a value in a particular interval is independent of the previous values drawn.
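The moments and the N/n property above can be checked empirically. The following is a minimal sketch, assuming Python's built-in `random` module as the source of uniform numbers; the function name `uniform_summary`, the seed, and the sample size are illustrative choices, not part of the original notes.

```python
import random

def uniform_summary(n_obs, n_bins, seed=12345):
    """Return sample mean, sample variance and per-interval counts."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    sample = [rng.random() for _ in range(n_obs)]
    mean = sum(sample) / n_obs
    var = sum((x - mean) ** 2 for x in sample) / n_obs
    counts = [0] * n_bins
    for x in sample:
        # index of the equal-length sub-interval of (0, 1) that contains x
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    return mean, var, counts

mean, var, counts = uniform_summary(n_obs=100_000, n_bins=10)
print(mean, var)    # should be close to 1/2 and 1/12
print(counts)       # each count should be close to N/n = 10000
```

With a smaller `n_obs` the counts scatter more widely around N/n, which is the reason N has to be sufficiently large.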

7.2 Generating Pseudo-Random Numbers

• In computer simulation, we often do not want truly random numbers, because we would like to control the random numbers so that an experiment can be repeated.

• In general, a systematic method (an algorithm) is used to generate the pseudo-random numbers used in simulation.

• We generate uniformly distributed random numbers first; then we use these to generate random numbers with other distributions.

• Some desired properties of pseudo-random number generators:

  – The routine should be fast.

  – The routine should be portable across hardware platforms and programming languages.

  – The routine should have a sufficiently long cycle.

    * The cycle length is the length of the random number sequence before previous numbers begin to repeat themselves in an earlier order.

    * A special case of cycling is degeneration, where the same random numbers appear repeatedly.

    * Because we use an algorithm to generate random numbers, cycling cannot be avoided. But long cycles (e.g. a few million or a few billion numbers) serve the purpose of general simulations.

  – The random numbers should be replicable.

  – Most importantly, the generated random numbers should closely approximate the ideal statistical properties of uniformity and independence.

7.3 Techniques for Generating Random Numbers

Many different methods of generating pseudo-random numbers are available. This text introduces two of them, one in great detail.

7.3.1 Linear Congruential Method

The linear congruential method produces a sequence of integers $X_1, X_2, X_3, \ldots$ between zero and $m - 1$ according to the following recursive relationship:

$$X_{i+1} = (a X_i + c) \bmod m, \qquad i = 0, 1, 2, \ldots$$

• The initial value $X_0$ is called the seed;

• a is called the constant multiplier;

• c is the increment;

• m is the modulus.

The selection of a, c, m and $X_0$ drastically affects the statistical properties, such as mean and variance, and the cycle length. When $c \neq 0$, the form is called the mixed congruential method; when $c = 0$, the form is known as the multiplicative congruential method. See Example 8.1 on page 292.
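The recurrence can be sketched directly in code. This is a minimal illustration rather than a production generator: the function names `lcg` and `lcg_uniform` and the parameter values (seed 27, a = 17, c = 43, m = 100) are assumptions chosen so the arithmetic is easy to follow by hand, and the scaling $R_i = X_i / m$ is the usual way to map the integers to [0, 1).

```python
def lcg(seed, a, c, m):
    """Yield X_1, X_2, ... from X_{i+1} = (a*X_i + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

def lcg_uniform(seed, a, c, m, count):
    """Return `count` values R_i = X_i / m in [0, 1)."""
    gen = lcg(seed, a, c, m)
    return [next(gen) / m for _ in range(count)]

# Illustrative parameters (assumed for this sketch, far too small for real use).
xs = lcg(seed=27, a=17, c=43, m=100)
first_three = [next(xs) for _ in range(3)]
print(first_three)                         # [2, 77, 52]
print(lcg_uniform(27, 17, 43, 100, 3))     # [0.02, 0.77, 0.52]
```

Setting `c=0` in the same function gives the multiplicative congruential method.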

Issues to consider:

• The numbers generated in the example (taking $R_i = X_i / m$) can only assume values from the set $I = \{0, 1/m, 2/m, \ldots, (m-1)/m\}$. If m is very large, this is of less concern.

• To achieve maximum density for a given range, a proper choice of a, c, m and $X_0$ is very important. The maximal period can be achieved by some proven selections of these values.

  – For m a power of 2, i.e. $m = 2^b$, and $c \neq 0$, the longest possible period is $P = m = 2^b$, achieved when c is relatively prime to m and $a = 1 + 4k$, where k is an integer.

  – For m a power of 2, i.e. $m = 2^b$, and $c = 0$, the longest possible period is $P = m/4 = 2^{b-2}$, achieved when $X_0$ is odd and the multiplier a is given by $a = 3 + 8k$ or $a = 5 + 8k$, where k is an integer.

  – For m a prime number and $c = 0$, the longest possible period is $P = m - 1$, achieved when a has the property that the smallest integer k such that $a^k - 1$ is divisible by m is $k = m - 1$. For example, if we choose $m = 7$ and $a = 3$, the condition is satisfied. Here k has to be 6.

    * when $k = 6$, $a^k - 1 = 728$, which is divisible by m
    * when $k = 5$, $a^k - 1 = 242$, which is not divisible by m
    * when $k = 4$, $a^k - 1 = 80$, which is not divisible by m
    * when $k = 3$, $a^k - 1 = 26$, which is not divisible by m
    * when $k = 2$ and $k = 1$, $a^k - 1 = 8$ and $2$, neither of which is divisible by m

Of course, the longest possible period here is 6, which is of no practical use. But the example shows how the conditions can be checked.
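The period conditions can also be checked by brute force for small moduli. The sketch below is illustrative only: `lcg_period` and `smallest_k` are hypothetical helper names, and the parameter sets other than m = 7, a = 3 are assumptions picked to satisfy the three rules stated above.

```python
def lcg_period(seed, a, c, m):
    """Length of the cycle that the sequence starting at `seed` falls into."""
    seen = {}
    x, step = seed, 0
    while x not in seen:
        seen[x] = step                 # remember when each value first appeared
        x = (a * x + c) % m
        step += 1
    return step - seen[x]

def smallest_k(a, m):
    """Smallest k >= 1 with a**k - 1 divisible by m (requires gcd(a, m) == 1)."""
    k, power = 1, a % m
    while power != 1:
        power = (power * a) % m
        k += 1
    return k

print(smallest_k(3, 7))            # 6, i.e. k = m - 1
print(lcg_period(1, 3, 0, 7))      # 6  (m prime, c = 0)
print(lcg_period(0, 5, 3, 16))     # 16 (m = 2^4, c odd, a = 1 + 4*1)
print(lcg_period(1, 13, 0, 64))    # 16 (m = 2^6, c = 0, a = 5 + 8*1, odd seed)
```

Brute force like this is only practical for small m; for realistic moduli the conditions themselves are what one checks.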

See Examples 8.2, 8.3 and 8.4 on pages 294 and 295.

7.3.2 Combined Linear Congruential Method

Combining two or more multiplicative congruential generators may increase the length of the period and yield better statistical properties. See Example 8.5 on page 297.
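One way to combine two multiplicative generators is to subtract their outputs modulo $m_1 - 1$. The sketch below assumes the constants commonly quoted for a combined generator attributed to L'Ecuyer (moduli 2147483563 and 2147483399, multipliers 40014 and 40692); these notes do not say whether Example 8.5 uses them, so treat both the constants and the function name `combined_lcg` as assumptions.

```python
M1, A1 = 2147483563, 40014   # first multiplicative generator (assumed constants)
M2, A2 = 2147483399, 40692   # second multiplicative generator (assumed constants)

def combined_lcg(seed1, seed2, count):
    """Return `count` numbers in (0, 1) from two combined multiplicative LCGs.

    seed1 should lie in [1, M1 - 1] and seed2 in [1, M2 - 1].
    """
    x1, x2 = seed1, seed2
    out = []
    for _ in range(count):
        x1 = (A1 * x1) % M1
        x2 = (A2 * x2) % M2
        x = (x1 - x2) % (M1 - 1)       # Python's % always returns a value >= 0
        # Map to (0, 1); the value x == 0 is replaced so that 0 is never returned.
        out.append(x / M1 if x > 0 else (M1 - 1) / M1)
    return out

print(combined_lcg(12345, 67890, 5))
```

The two component streams have different periods, which is why the combined stream can repeat much later than either component alone.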

7.4 Tests for Random Numbers

When a random number generator is devised, one needs to test its properties. The two properties we are most concerned with are uniformity and independence.

A list of tests will be discussed. The first one tests for uniformity, and the second through fifth test independence.

1. Frequency test

2. Runs test

3. Autocorrelation test

4. Gap test

5. Poker test

The algorithms for testing a random number generator are based on statistical theory, i.e. hypothesis testing. The basic ideas are the following, using the test of uniformity as an example.

We have two hypotheses. One says the numbers produced by the random number generator are indeed uniformly distributed. We call this $H_0$, known in statistics as the null hypothesis. The other says the numbers are not uniformly distributed. We call this $H_1$, known in statistics as the alternative hypothesis. We are interested in the outcome of testing $H_0$: we either reject it or fail to reject it. To see why we do not say "accept $H_0$", ask what accepting $H_0$ would mean: that the distribution is truly uniform. But this is impossible to state without an exhaustive test of a real random number generator over an infinite number of cases. So we can only say "fail to reject $H_0$", which means that no evidence of non-uniformity has been detected on the basis of the test. This can be summed up by the saying "so far so good".

On the other hand, if we have found evidence that the random number generator is not uniform, we can simply say "reject $H_0$". It is always possible that $H_0$ is true but we reject it because a sample landed in the $H_1$ region. This is known as a Type I error. Similarly, if $H_0$ is false but we do not reject it, this also results in an error, known as a Type II error. With this information, how do we state the result of a test? (How to perform the tests is the subject of the next few sections.)

• A level of statistical significance $\alpha$ has to be given. The level is the probability of rejecting $H_0$ while $H_0$ is true (thus, a Type I error):

$$\alpha = P(\text{reject } H_0 \mid H_0 \text{ true})$$

• We want this probability to be as small as possible. Typical values are 0.01 (one percent) or 0.05 (five percent).

• For a fixed sample size, decreasing the probability of a Type I error will increase the probability of a Type II error. We should try to strike a balance.

For a given set of random numbers produced by a random number generator, the more tests are applied, the more reliable the conclusion will be.

7.4.1 Frequency Test

• The frequency test is a test of uniformity.

• Two different methods are available: the Kolmogorov-Smirnov test and the chi-square test. Both tests measure the agreement between the distribution of a sample of generated random numbers and the theoretical uniform distribution.

• Both tests are based on the null hypothesis of no significant difference between the sample distribution and the theoretical distribution.

Kolmogorov-Smirnov test

This test compares the cdf of the uniform distribution, $F(x)$, to the empirical cdf, $S_N(x)$, of the sample of N observations.

• $F(x) = x, \quad 0 \le x \le 1$

• $S_N(x) = \dfrac{\text{number of } R_1, R_2, \ldots, R_N \text{ that are} \le x}{N}$

• As N becomes larger, $S_N(x)$ should become closer to $F(x)$.

• The Kolmogorov-Smirnov test is based on the statistic

$$D = \max |F(x) - S_N(x)|$$

that is, the largest absolute value of the differences.

• Here D is a random variable; its sampling distribution is tabulated in Table A.8.

• If the calculated D value is greater than the value listed in the table, the hypothesis (no disagreement between the sample and the theoretical distribution) should be rejected; otherwise, we do not have enough information to reject it.

• The following steps are taken to perform the test.

1. Rank the data from smallest to largest:

$$R_{(1)} \le R_{(2)} \le \cdots \le R_{(N)}$$

2. Compute

$$D^+ = \max_{1 \le i \le N} \left\{ \frac{i}{N} - R_{(i)} \right\}, \qquad D^- = \max_{1 \le i \le N} \left\{ R_{(i)} - \frac{i-1}{N} \right\}$$

3. Compute

$$D = \max(D^+, D^-)$$

4. Determine the critical value, $D_\alpha$, from Table A.8 for the specified significance level $\alpha$ and the given sample size N.

5. If the sample statistic D is greater than the critical value $D_\alpha$, the null hypothesis that the sample data are from a uniform distribution is rejected; if $D \le D_\alpha$, then there is no evidence to reject it.
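The five steps can be sketched as follows. The function names and the five data values are illustrative assumptions, and the critical value 0.565 is assumed here for N = 5 at the 0.05 level; in practice it should be read from Table A.8.

```python
def ks_uniform(sample):
    """Return (D_plus, D_minus, D) for a sample tested against U(0, 1)."""
    ranked = sorted(sample)                      # step 1: rank the data
    n = len(ranked)
    # step 2: enumerate() is 0-based, so rank i corresponds to index i - 1
    d_plus = max((i + 1) / n - r for i, r in enumerate(ranked))
    d_minus = max(r - i / n for i, r in enumerate(ranked))
    return d_plus, d_minus, max(d_plus, d_minus)  # step 3

def ks_reject(sample, critical_value):
    """True if H0 (uniformity) is rejected at the given critical value."""
    return ks_uniform(sample)[2] > critical_value  # steps 4 and 5

data = [0.44, 0.81, 0.14, 0.05, 0.93]
d_plus, d_minus, d = ks_uniform(data)
print(round(d_plus, 2), round(d_minus, 2), round(d, 2))   # 0.26 0.21 0.26
print(ks_reject(data, critical_value=0.565))              # False
```

Since D = 0.26 does not exceed the assumed critical value, the hypothesis of uniformity is not rejected for this small sample.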

• See Example 8.6 on page 300.

Chi-square test

The chi-square test looks at the issue from the same angle but uses a different method. Instead of measuring the difference at each point between the sample and the true distribution, the chi-square test checks the deviation of the observed counts from their expected values:

$$\chi_0^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

where n is the number of classes (e.g. intervals), $O_i$ is the number of samples observed in the i-th class, and $E_i$ is the expected number of samples in the i-th class. If the sample size is N, then for a uniform distribution with equally spaced classes,

$$E_i = \frac{N}{n}$$

See Example 8.7 on page 302.
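A minimal sketch of the chi-square statistic for equal-width classes follows. The function name and the twelve data values are illustrative assumptions; the sample is far too small for a real test (expected counts of at least 5 per class are usually recommended) and only serves to make the arithmetic visible.

```python
def chi_square_uniform(sample, n_classes):
    """Return (chi2, observed counts) for equal-width classes on [0, 1)."""
    observed = [0] * n_classes
    for r in sample:
        observed[min(int(r * n_classes), n_classes - 1)] += 1
    expected = len(sample) / n_classes           # E_i = N / n for every class
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    return chi2, observed

sample = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95,
          0.02, 0.12]
chi2, observed = chi_square_uniform(sample, n_classes=4)
print(observed)   # [4, 3, 2, 3]
print(chi2)       # (1 + 0 + 1 + 0) / 3, about 0.667
```

The statistic is then compared with the chi-square critical value for n - 1 degrees of freedom at the chosen significance level.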

7.4.2 Runs Test

Runs up and down

The runs test examines the arrangement of numbers in a sequence to test the hypothesis of independence.

See the tables on page 303. With a closer look, the numbers in the first ...

A run is defined as a succession of similar events preceded and followed by a different event.

E.g. in a sequence of coin tosses we may have H T T H H T T T H T. The first toss is preceded, and the last toss is followed, by a "no event". This sequence has six runs: the first has length one, the second and third length two, the fourth length three, and the fifth and sixth length one.

A few features of runs:

  – Two characteristics matter: the number of runs and the length of each run.

  – An up run is a sequence of numbers each of which is succeeded by a larger number; a down run is a sequence of numbers each of which is succeeded by a smaller number.

If a sequence of numbers has too few runs, it is unlikely to be a truly random sequence. E.g. the sequence 0.08, 0.18, 0.23, 0.36, 0.42, 0.55, 0.63, 0.72, 0.89, 0.91 has one run, an up run. It is not likely to be a random sequence.

If a sequence of numbers has too many runs, it is also unlikely to be a truly random sequence. E.g. the sequence 0.08, 0.93, 0.15, 0.96, 0.26, 0.84, 0.28, 0.79, 0.36, 0.57 has nine runs, five up and four down. It is not likely to be a random sequence.

If a is the total number of runs in a truly random sequence of N numbers, the mean and variance of a are given by

$$\mu_a = \frac{2N - 1}{3}, \qquad \sigma_a^2 = \frac{16N - 29}{90}$$

For $N > 20$, the distribution of a is reasonably approximated by a normal distribution, $N(\mu_a, \sigma_a^2)$. It is converted to a standardized normal variable by

$$Z_0 = \frac{a - \mu_a}{\sigma_a}$$

that is,

$$Z_0 = \frac{a - [(2N - 1)/3]}{\sqrt{(16N - 29)/90}}$$

Failure to reject the hypothesis of independence occurs when $-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$, where $\alpha$ is the level of significance. See Figure 8.3 on page 305. See also Example 8.8.
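Counting runs up and down and forming $Z_0$ can be sketched as below. The function name is hypothetical, ties between neighbours are treated as "down" (an assumption, since the notes do not discuss ties), and the two ten-number sequences are the too-few and too-many examples quoted earlier in this section. With N = 10 the normal approximation is below the N > 20 threshold, so the $Z_0$ values are only indicative.

```python
import math

def runs_up_down(seq):
    """Return (a, z0): the number of runs up and down and the test statistic."""
    # True where the next number is larger (an 'up' step), False otherwise.
    steps = [nxt > cur for cur, nxt in zip(seq, seq[1:])]
    # A new run starts every time the direction changes.
    runs = 1 + sum(1 for s, t in zip(steps, steps[1:]) if s != t)
    n = len(seq)
    mu = (2 * n - 1) / 3
    sigma = math.sqrt((16 * n - 29) / 90)
    return runs, (runs - mu) / sigma

too_few = [0.08, 0.18, 0.23, 0.36, 0.42, 0.55, 0.63, 0.72, 0.89, 0.91]
too_many = [0.08, 0.93, 0.15, 0.96, 0.26, 0.84, 0.28, 0.79, 0.36, 0.57]
print(runs_up_down(too_few))    # 1 run, strongly negative z0
print(runs_up_down(too_many))   # 9 runs, positive z0
```

Both statistics fall outside the interval (-1.96, 1.96) used at the 0.05 level, matching the informal judgement that neither sequence looks random.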

Runs above and below the mean

The previous test for runs up and runs down is important, but it is not adequate to assure that the sequence is random. Consider the sequence of numbers at the top of page 306, which passes the runs up and down test but displays the phenomenon that the first 20 numbers are above the mean, while the last 20 are below the mean.

Let $n_1$ and $n_2$ be the number of individual observations above and below the mean, with $N = n_1 + n_2$, and let b be the total number of runs. For given $n_1$ and $n_2$, the mean and variance of b can be expressed as

$$\mu_b = \frac{2 n_1 n_2}{N} + \frac{1}{2}, \qquad \sigma_b^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - N)}{N^2 (N - 1)}$$

For either $n_1$ or $n_2$ greater than 20, b is approximately normally distributed, so that

$$Z_0 = \frac{b - \mu_b}{\sigma_b}$$

Failure to reject the hypothesis of independence occurs when $-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$, where $\alpha$ is the level of significance. See Example 8.9 on page 307.
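The runs above and below the mean can be counted in the same style. In this sketch the function name is hypothetical, the mean is taken as the theoretical value 0.5, and values equal to the mean are counted as "above" (both assumptions). The test data are made up to mimic the pattern described above: 20 values above the mean followed by 20 below.

```python
import math

def runs_above_below(seq, mean=0.5):
    """Return (n1, n2, b, z0) for runs above and below `mean`."""
    above = [x >= mean for x in seq]   # ties count as 'above' (assumption)
    n1 = sum(above)                    # observations above the mean
    n2 = len(seq) - n1                 # observations below the mean
    b = 1 + sum(1 for s, t in zip(above, above[1:]) if s != t)
    n = n1 + n2
    mu = 2 * n1 * n2 / n + 0.5
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return n1, n2, b, (b - mu) / math.sqrt(var)

blocky = [0.75] * 20 + [0.25] * 20     # made-up data: 20 above, then 20 below
print(runs_above_below(blocky))        # b = 2 runs, z0 far below -1.96
```

The function divides by zero if all observations fall on one side of the mean, a case in which the test is not meaningful anyway.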

Runs test: length of runs

The example in the book:

0.16, 0.27, 0.58, 0.63, 0.45, 0.21, 0.72, 0.87, 0.27, 0.15, 0.92, 0.85, ...

If the same pattern continues, with two numbers below average followed by two numbers above average, it is unlikely to be a random number sequence. But this sequence will pass the other tests.

We need to test the randomness of the lengths of runs. Let $Y_i$ be the number of runs of length i in a sequence of N numbers. E.g. if the above sequence stopped at 12 numbers (N = 12), then $Y_1 = Y_3 = Y_4 = \cdots = Y_{11} = 0$ and $Y_2 = 6$.

Obviously $Y_i$ is a random variable. For runs up and down, its expected value is given by

$$E(Y_i) = \frac{2}{(i+3)!} \left[ N (i^2 + 3i + 1) - (i^3 + 3i^2 - i - 4) \right], \quad i \le N - 2$$

$$E(Y_i) = \frac{2}{N!}, \quad i = N - 1$$

For runs above and below the mean, $Y_i$ is also a random variable, and its expected value is approximated by

$$E(Y_i) = \frac{N w_i}{E(I)}, \quad N > 20$$

where $E(I)$ is the approximate expected length of a run and $w_i$ is the approximate probability that a run has length i. $w_i$ is given by

$$w_i = \left(\frac{n_1}{N}\right)^i \left(\frac{n_2}{N}\right) + \left(\frac{n_1}{N}\right) \left(\frac{n_2}{N}\right)^i, \quad N > 20$$

$E(I)$ is given by

$$E(I) = \frac{n_1}{n_2} + \frac{n_2}{n_1}, \quad N > 20$$

The approximate expected total number of runs (of all lengths) in a sequence of length N is given by

$$E(A) = \frac{N}{E(I)}, \quad N > 20$$

(the total number of observations divided by the expected run length). The appropriate test is the chi-square test with $O_i$ being the observed number of runs of length i:

$$\chi_0^2 = \sum_{i=1}^{L} \frac{[O_i - E(Y_i)]^2}{E(Y_i)}$$

where $L = N - 1$ for runs up and down and $L = N$ for runs above and below the mean. See Example 8.10 on page 308 for length of runs up and down. See Example 8.11 on page 311 for length of runs above and below the mean.
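The expected counts that feed the chi-square statistic can be computed as sketched below. The function names are hypothetical, and the formulas are the standard textbook expressions for the expected number of runs of length i; they are assumed here to be the ones intended by these notes.

```python
import math

def expected_runs_up_down(n, i):
    """E(Y_i) for runs up and down in a sequence of n numbers (1 <= i <= n - 1)."""
    if i == n - 1:
        return 2 / math.factorial(n)
    return 2 / math.factorial(i + 3) * (
        n * (i * i + 3 * i + 1) - (i ** 3 + 3 * i * i - i - 4))

def expected_runs_above_below(n1, n2, i):
    """Approximate E(Y_i) for runs above and below the mean (intended for N > 20)."""
    n = n1 + n2
    w_i = (n1 / n) ** i * (n2 / n) + (n1 / n) * (n2 / n) ** i
    expected_length = n1 / n2 + n2 / n1          # E(I)
    return n * w_i / expected_length

# Sanity check: the expected counts over all lengths add up to the
# expected total number of runs up and down, (2N - 1) / 3.
total = sum(expected_runs_up_down(5, i) for i in range(1, 5))
print(total, (2 * 5 - 1) / 3)

print(expected_runs_above_below(20, 20, 1))      # N * w_1 / E(I) = 40 * 0.5 / 2
```

In practice, classes with small expected counts are usually merged before the chi-square statistic is formed.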

7.4.3 Autocorrelation Test

The tests for autocorrelation are concerned with the dependence between numbers in a sequence.

The list of 30 numbers on page 311 appears to have the property that every fifth number has a very large value. If this is a regular pattern, we cannot really say the sequence is random.

Thus the autocorrelation $\rho_{im}$ between the following numbers would be of interest:

$$R_i, R_{i+m}, R_{i+2m}, \ldots, R_{i+(M+1)m}$$

The value M is the largest integer such that $i + (M+1)m \le N$, where N is the total number of values in the sequence. E.g. for N = 17, i = 3, m = 4, the indices of the above sequence would be 3, 7, 11, 15 (M = 2). The reason we require M + 1 instead of M is that we need at least two numbers (the case M = 0) to test the autocorrelation.

Since a non-zero autocorrelation implies a lack of independence, the following test is appropriate:

$$H_0: \rho_{im} = 0$$

$$H_1: \rho_{im} \neq 0$$

For large values of M, the distribution of the estimator of $\rho_{im}$, denoted $\hat{\rho}_{im}$, is approximately normal if the values $R_i, R_{i+m}, R_{i+2m}, \ldots, R_{i+(M+1)m}$ are uncorrelated.

Form the test statistic

$$Z_0 = \frac{\hat{\rho}_{im}}{\sigma_{\hat{\rho}_{im}}}$$

which is distributed normally with a mean of zero and a variance of one. The actual formulas for $\hat{\rho}_{im}$ and its standard deviation are

$$\hat{\rho}_{im} = \frac{1}{M+1} \left[ \sum_{k=0}^{M} R_{i+km} R_{i+(k+1)m} \right] - 0.25$$

$$\sigma_{\hat{\rho}_{im}} = \frac{\sqrt{13M + 7}}{12(M + 1)}$$

After computing $Z_0$, do not reject the null hypothesis of independence if $-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$, where $\alpha$ is the level of significance. See Example 8.12 on page 312.
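The computation of M, $\hat{\rho}_{im}$ and $Z_0$ can be sketched as follows. The function name is hypothetical, the index i is 1-based as in the text, and the constant test sequence is made up purely to exercise the arithmetic.

```python
import math

def autocorrelation_test(r, i, m):
    """Return (M, rho_hat, sigma, z0) for lag m starting at 1-based index i."""
    n = len(r)
    big_m = (n - i) // m - 1            # largest M with i + (M + 1) * m <= N
    if big_m < 0:
        raise ValueError("need at least two numbers: i + m must not exceed N")
    # r[j - 1] is R_j because Python lists are 0-based.
    total = sum(r[i + k * m - 1] * r[i + (k + 1) * m - 1]
                for k in range(big_m + 1))
    rho_hat = total / (big_m + 1) - 0.25
    sigma = math.sqrt(13 * big_m + 7) / (12 * (big_m + 1))
    return big_m, rho_hat, sigma, rho_hat / sigma

# N = 17, i = 3, m = 4 uses indices 3, 7, 11, 15, so M = 2.
big_m, rho_hat, sigma, z0 = autocorrelation_test([0.5] * 17, i=3, m=4)
print(big_m, rho_hat, sigma, z0)
```

The constant 0.25 is the product of the two means, $E(R)^2 = (1/2)^2$, which is what the average product should approach when the numbers are independent and uniform.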

7.4.4 Gap Test

The gap test is used to determine the significance of the interval between recurrences of the same digit. A gap of length x occurs between two recurrences of a digit when x other digits lie between them.

See the example on page 313, where the digit 3 is underlined. There are a total of eighteen 3's in the list, so only 17 gaps can occur. The probability of a particular gap length can be determined by Bernoulli trials:

$$P(\text{gap of } n) = \underbrace{P(x \neq 3) P(x \neq 3) \cdots P(x \neq 3)}_{n \text{ factors}} \, P(x = 3)$$

If we are only concerned with digits between 0 and 9, then

$$P(\text{gap of } n) = 0.9^n (0.1)$$

The theoretical frequency distribution for randomly ordered digits is given by

$$P(\text{gap} \le x) = F(x) = 0.1 \sum_{n=0}^{x} (0.9)^n = 1 - 0.9^{x+1}$$

Steps involved in the test:

Step 1. Specify the cdf for the theoretical frequency distribution given by Equation (8.14), based on the selected class interval width (see Table 8.6 for an example).

Step 2. Arrange the observed sample of gaps in a cumulative distribution with these same classes.

Step 3. Find D, the maximum deviation between $F(x)$ and $S_N(x)$, as in Equation 8.3 (on page 299).

Step 4. Determine the critical value, $D_\alpha$, from Table A.8 for the specified value of $\alpha$ and the sample size N.

Step 5. If the calculated value of D is greater than the tabulated value of $D_\alpha$, the null hypothesis of independence is rejected.

• See Example 8.13 on page 314.
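Steps 1 to 3 can be sketched in a few lines. The function names, the short digit list, and the class width are illustrative assumptions; the theoretical cdf uses the geometric form $1 - 0.9^{x+1}$ that holds when each of the ten digits is equally likely.

```python
def gap_lengths(digits, target):
    """Lengths of the gaps between successive occurrences of `target`."""
    positions = [k for k, d in enumerate(digits) if d == target]
    return [q - p - 1 for p, q in zip(positions, positions[1:])]

def gap_cdf(x, p=0.1):
    """Theoretical F(x) = 1 - (1 - p)**(x + 1): probability of a gap <= x."""
    return 1 - (1 - p) ** (x + 1)

def gap_test_statistic(gaps, class_width):
    """Max |F(x) - S_N(x)| evaluated at the upper end of each class."""
    n = len(gaps)
    upper, d = class_width - 1, 0.0    # classes are 0..w-1, w..2w-1, ...
    while True:
        s_n = sum(1 for g in gaps if g <= upper) / n   # empirical cdf
        d = max(d, abs(gap_cdf(upper) - s_n))
        if upper >= max(gaps):         # every observed gap has been covered
            return d
        upper += class_width

digits = [3, 1, 2, 3, 3, 5, 6, 7, 3]   # made-up digit stream
gaps = gap_lengths(digits, target=3)
print(gaps)                             # [2, 0, 3]
print(gap_test_statistic(gaps, class_width=2))
```

The resulting D is compared with $D_\alpha$ from Table A.8 exactly as in the Kolmogorov-Smirnov frequency test, with N taken as the number of gaps.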

7.4.5 Poker Test

The poker test for independence is based on the frequency with which certain digits are repeated in a series of numbers. For example: 0.255, 0.577, 0.331, 0.414, 0.828, 0.909, 0.303, 0.001, ... In each case, a pair of like digits appears in the number.

In a three-digit number, there are only three possibilities:

1. The individual digits can all be different (case 1).

2. The individual digits can all be the same (case 2).

3. There can be one pair of like digits (case 3).

The probabilities are

P(case 1) = P(second differs from the first) × P(third differs from the first and second) = 0.9 × 0.8 = 0.72

P(case 2) = P(second is the same as the first) × P(third is the same as the first) = 0.1 × 0.1 = 0.01

P(case 3) = 1 − 0.72 − 0.01 = 0.27

• See Example 8.14 on page 316.
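The three cases and their probabilities lead to a chi-square comparison of observed and expected counts, sketched below. The function names are hypothetical, and the code assumes each number is given to exactly three decimal places. The eight numbers are the ones quoted in this section, which is far too few for a real test.

```python
from collections import Counter

# Theoretical probabilities of the three cases for three-digit numbers.
PROBS = {"all different": 0.72, "all same": 0.01, "one pair": 0.27}

def classify(number):
    """Classify a number in [0, 1) by its three decimal digits."""
    digits = f"{round(number * 1000):03d}"     # e.g. 0.001 -> "001"
    distinct = len(set(digits))
    return {3: "all different", 1: "all same", 2: "one pair"}[distinct]

def poker_chi_square(numbers):
    """Chi-square statistic comparing observed and expected case counts."""
    observed = Counter(classify(x) for x in numbers)
    n = len(numbers)
    return sum((observed[case] - n * p) ** 2 / (n * p)
               for case, p in PROBS.items())

numbers = [0.255, 0.577, 0.331, 0.414, 0.828, 0.909, 0.303, 0.001]
print([classify(x) for x in numbers])   # every number falls in "one pair"
print(poker_chi_square(numbers))        # about 21.63
```

A sequence in which every number contains a pair is far from the expected 27 percent, which is why the statistic is so large even for eight numbers.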

Chapter 8 Variable Generation
