
F.-F. Tang J. of Economic Behavior Org. 44 2001 221–232 225

on-line help function as much as they wanted. About half of the subjects never used this function, and only about a fourth of the subjects used it repeatedly. All the irrelevant keys on the computer keyboards were “sealed” by the experimenter. The master program recorded virtually everything the subjects did on their keyboards and all the information they saw on their screens. Additionally, each subject was provided with a pencil and one page of blank paper, but only about 20 percent of the subjects ever used them. All subjects were interviewed, with the help of the laboratory staff, immediately after the experimental sessions, while they were waiting for their cash payments. The conversations were typed up later by the experimenter. Subjects talked freely about how they made their choices, what they thought while playing the game, what frequency patterns, if any, they had observed, how they tried to make use of their observations, and so on. All the detailed records are available from the author upon request.

4. Experimental results

4.1. Summary statistics

Table 1 gives the relative frequencies of pure strategies for each of the five experimental sessions of each game. For game 1, subjects in the player 2 population choose among pure strategies 7, 8, and 9 with empirical frequencies close to the theoretical prediction of 1/6, 1/3, 1/2. The player 1 population, on the other hand, chooses among its three pure strategies with roughly equal likelihood. Of course, if one population sticks to the equilibrium strategy, the average payoff to the other population equals its equilibrium value regardless of the strategy that population plays. For game 2, neither population is close to the Nash equilibrium frequencies.

Table 1
Overview of experimental data

Game-session     Total rounds   Strategies 1, 2, 3         Strategies 7, 8, 9
                                relative frequencies       relative frequencies
1–1              150            [0.314, 0.334, 0.351]      [0.132, 0.304, 0.563]
1–2              100            [0.355, 0.311, 0.333]      [0.162, 0.332, 0.507]
1–3              150            [0.316, 0.292, 0.392]      [0.157, 0.279, 0.564]
1–4              100            [0.240, 0.382, 0.378]      [0.217, 0.320, 0.463]
1–5              150            [0.348, 0.306, 0.347]      [0.129, 0.318, 0.553]
Game 1 overall   650            [0.317, 0.322, 0.361]      [0.155, 0.308, 0.537]
2–1              150            [0.261, 0.292, 0.447]      [0.277, 0.319, 0.404]
2–2              150            [0.204, 0.450, 0.346]      [0.404, 0.193, 0.402]
2–3              150            [0.324, 0.268, 0.408]      [0.256, 0.307, 0.438]
2–4              150            [0.266, 0.347, 0.388]      [0.297, 0.356, 0.348]
2–5              150            [0.181, 0.360, 0.459]      [0.333, 0.260, 0.407]
Game 2 overall   750            [0.247, 0.343, 0.409]      [0.313, 0.287, 0.400]

Fig. 2 depicts the movement of subjects’ choices over time for the first two sessions of each game; these two sessions are representative. The data are organized into a sequence of 10-round groups, and each point on the graph is the aggregate frequency within one 10-round group. The notation “10-RA” means “10-round aggregate”.
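The 10-round aggregation just described can be sketched as follows. This is only an illustration under stated assumptions: the function name `aggregate_frequencies`, the input layout (one list of choices per round), and the toy choice sequence are all hypothetical and are not taken from the experimental records.

```python
from collections import Counter


def aggregate_frequencies(rounds, strategies, group_size=10):
    """Return one relative-frequency vector per consecutive block of rounds.

    rounds     -- list of rounds; each round is a list of the strategies
                  chosen by the subjects of one population (assumed layout)
    strategies -- the pure strategies to report, in order
    group_size -- 10 for 10-RA, 5 for 5-RA, 1 for 1-RA
    """
    groups = []
    for start in range(0, len(rounds), group_size):
        block = rounds[start:start + group_size]
        # Pool every choice made by every subject within the block.
        counts = Counter(choice for rnd in block for choice in rnd)
        total = sum(counts.values())
        groups.append([counts[s] / total for s in strategies])
    return groups


# Toy input (hypothetical, NOT experimental data): 20 rounds, 3 subjects each.
toy_rounds = [[1, 2, 3]] * 10 + [[1, 1, 2]] * 10

ten_ra = aggregate_frequencies(toy_rounds, strategies=(1, 2, 3), group_size=10)
five_ra = aggregate_frequencies(toy_rounds, strategies=(1, 2, 3), group_size=5)

for k, freqs in enumerate(ten_ra, start=1):
    print(f"10-RA group {k}: {[round(f, 3) for f in freqs]}")
```

Passing `group_size=5` or `group_size=1` gives the finer 5-RA and 1-RA series from the same input.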
Similar graphs for the other sessions, and 5-RA and 1-RA graphs for all sessions, are available from the author upon request. Of course, the 5-RA and 1-RA graphs show more variability than the 10-RA graphs. The left and right graphs are for the player 1 and player 2 populations, respectively.

For game 1 (Fig. 2A), the pattern is striking: the frequency curves of strategies 1, 2, and 3 twist around the 1/3 horizontal line, while the frequency curves of strategies 7, 8, and 9 separate clearly, moving around 1/6, 1/3, and 1/2, respectively. This pattern persists across all five sessions, in both the 10-round and the 5-round data aggregation. For game 2 (Fig. 2B), there is no such clear pattern.

Apparently, the Nash equilibrium prediction is not well supported by the data. What, then, do the subjects play? This is a fundamentally difficult question, debated by O’Neill (1987, 1991) and Brown and Rosenthal (1990). Every simple theory is wrong, but one might like to know how wrong each theory is. In addition to the Nash model and the random play model, McKelvey and Palfrey (1995) have developed another general framework: the quantal response equilibrium (QRE) for normal form games. QRE is basically a statistical extension of the Nash equilibrium concept, obtained by introducing certain error structures. The authors have fit the QRE model to a variety of experimental data sets and found that it outperforms the Nash and the random play models in most cases. The quantal response model has been fit to the data of this experiment; see Appendix B. In nearly all sessions, the Nash and random predictions can be rejected in favor of the QRE at the 1% level using a likelihood ratio test.

4.2. Test of Selten’s stability prediction

Although subjects do not converge to Nash equilibrium play, we can test Selten’s stability prediction in a weaker sense by asking whether subjects show less variability of behavior in game 1 than in game 2.
One approach is to compute variability across sessions. Let $(f_1, f_2, f_3)$ and $(f_7, f_8, f_9)$ be the frequencies across all sessions for the two populations in a given game, and let $(x_1, x_2, x_3)$ and $(x_7, x_8, x_9)$ be the corresponding frequencies for a single session of that game. Then

\[
s = \sqrt{(x_1 - f_1)^2 + (x_2 - f_2)^2 + (x_3 - f_3)^2 + (x_7 - f_7)^2 + (x_8 - f_8)^2 + (x_9 - f_9)^2}
\]

measures the extent to which the particular session varies from the average. There is one such measure for each of the five sessions of game 1 and for each of the five sessions of game 2. The 10 measures are presented in Table 2. It is clear from inspection that there is less variability, and hence greater stability, for game 1 than for game 2. As discussed in the footnote to Table 2, the difference is significant by a Wilcoxon test.

Table 2
Between-session distance measures (a)

Between-session distance measurement of stability

Game 1       Distance   Rank     Game 2       Distance   Rank
Session 1    0.039      2        Session 1    0.081      5
Session 2    0.062      4        Session 2    0.192      10
Session 3    0.059      3        Session 3    0.130      8
Session 4    0.139      9        Session 4    0.092      7
Session 5    0.032      1        Session 5    0.091      6

(a) Wilcoxon test: the sum of rankings for game 1, namely W = 1 + 2 + 3 + 4 + 9 = 19, will be this small with probability P(W ≤ 19) = 0.0476. Thus, the null hypothesis of no difference between sessions can be rejected at a significance level of 5% in favor of the hypothesis that game 1 is less variable.

Another approach to testing for stability of behavior is to organize the observations within a given session into 10-round groups, as in Fig. 2, and then to look at the variability of the 10-round averages relative to the overall average of the session. Let $(x, y, z)$ be the frequencies for a particular session overall, and let $(x_k, y_k, z_k)$ be the frequencies for the $k$th of the 10-round groups. Then, letting $K$ be the number of groups,

\[
s^{*} = K^{-1} \sum_{k=1}^{K} \sqrt{(x_k - x)^2 + (y_k - y)^2 + (z_k - z)^2}
\]

measures variability within the session in question. As discussed in the footnote to Table 3, the difference is significant by a Wilcoxon test. Similar calculations for five-round groupings lead to the same conclusion.

Table 3
Within-session distance measures (a)

Probability associated with the occurrence of the Wilcoxon statistic under the null hypothesis of no difference

                    For strategies 1, 2, 3    For strategies 7, 8, 9
10-RA (k = 10)      0.0079                    0.004
5-RA (k = 5)        0.0476                    0.004

(a) Wilcoxon test: in every case, the null hypothesis of no difference can be rejected at a significance level of 5% in favor of the hypothesis that game 1 is less variable.
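As a rough check on the between-session analysis, the sketch below recomputes the distance s from the rounded frequencies in Table 1 and evaluates the one-sided Wilcoxon rank-sum probability by enumerating all ways of assigning five of the ten ranks to game 1. The helper names `distance` and `rank_sum_tail` are hypothetical, the enumeration is one straightforward way to obtain an exact tail probability and is not necessarily the procedure used by the author, and distances recomputed from the rounded Table 1 entries need not agree with Table 2 in every entry.

```python
from itertools import combinations
from math import sqrt

# Table 1 relative frequencies per session: strategies 1, 2, 3 then 7, 8, 9.
# The entry printed as "0347" for session 1-5 is read here as 0.347.
GAME1 = [
    [0.314, 0.334, 0.351, 0.132, 0.304, 0.563],
    [0.355, 0.311, 0.333, 0.162, 0.332, 0.507],
    [0.316, 0.292, 0.392, 0.157, 0.279, 0.564],
    [0.240, 0.382, 0.378, 0.217, 0.320, 0.463],
    [0.348, 0.306, 0.347, 0.129, 0.318, 0.553],
]
GAME1_OVERALL = [0.317, 0.322, 0.361, 0.155, 0.308, 0.537]
GAME2 = [
    [0.261, 0.292, 0.447, 0.277, 0.319, 0.404],
    [0.204, 0.450, 0.346, 0.404, 0.193, 0.402],
    [0.324, 0.268, 0.408, 0.256, 0.307, 0.438],
    [0.266, 0.347, 0.388, 0.297, 0.356, 0.348],
    [0.181, 0.360, 0.459, 0.333, 0.260, 0.407],
]
GAME2_OVERALL = [0.247, 0.343, 0.409, 0.313, 0.287, 0.400]

# Distances as published in Table 2.
TABLE2_GAME1 = [0.039, 0.062, 0.059, 0.139, 0.032]
TABLE2_GAME2 = [0.081, 0.192, 0.130, 0.092, 0.091]


def distance(session, overall):
    """Euclidean distance s between session and overall frequencies."""
    return sqrt(sum((x - f) ** 2 for x, f in zip(session, overall)))


def rank_sum_tail(sample_a, sample_b):
    """Return (W, P(W' <= W)) for the rank sum W of sample_a.

    Assumes no tied values. The probability is exact: it counts the
    equally likely rank assignments whose sum is at most W.
    """
    pooled = sorted(sample_a + sample_b)
    ranks = {value: i + 1 for i, value in enumerate(pooled)}
    w = sum(ranks[value] for value in sample_a)
    subsets = list(combinations(range(1, len(pooled) + 1), len(sample_a)))
    p = sum(1 for subset in subsets if sum(subset) <= w) / len(subsets)
    return w, p


recomputed_game1 = [distance(s, GAME1_OVERALL) for s in GAME1]
recomputed_game2 = [distance(s, GAME2_OVERALL) for s in GAME2]

w_pub, p_pub = rank_sum_tail(TABLE2_GAME1, TABLE2_GAME2)
w_re, p_re = rank_sum_tail(recomputed_game1, recomputed_game2)

print("recomputed game 1 distances:", [round(d, 3) for d in recomputed_game1])
print("recomputed game 2 distances:", [round(d, 3) for d in recomputed_game2])
print(f"published distances:  W = {w_pub}, P(W <= {w_pub}) = {p_pub:.4f}")
print(f"recomputed distances: W = {w_re}, P(W <= {w_re}) = {p_re:.4f}")
```

With the published distances, the rank sum is W = 19 and 12 of the 252 possible rank assignments have a sum of 19 or less, giving 12/252 ≈ 0.0476, the value reported in the footnote to Table 2. The Table 3 entries 0.0079 and 0.004 correspond in the same way to 2/252 and 1/252.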

5. Conclusion