An example for data analysis with ITA and inductive ITA

M. Schrepp, Mathematical Social Sciences 38 (1999) 361–375

5. An example for data analysis with ITA and inductive ITA

As a practical application of ITA and inductive ITA we analyse a dataset from Schrepp (1995). In this investigation a group of 51 subjects tried to solve 20 letter series completion problems, which are shown in Table 3. Such letter series completion problems have a simple structure. Given a sequence of letters of the alphabet, a subject is required to find a continuation which is in accordance with the rule used to generate the sequence. The letter sequences commonly used, for example in intelligence tests, are a mixture of simple sequences in which every letter is in a specific relation to its predecessor. The number of simple sequences in a letter series is called the period of the series. In our problem set we used same letter (S), alphabetic successor (N), alphabetic predecessor (P), and double next letter (D) as alphabetic relations between the letters of the simple sequences.

Consider, for example, the letter sequence ekfjgihh. It consists of the simple sequences efgh and kjih. In efgh every letter is the alphabetic successor of its predecessor in the sequence, while in kjih every letter is the alphabetic predecessor of its predecessor in the sequence. So the period of the problem is 2, and N and P are the alphabetic relations holding between the letters of the simple subsequences. Table 4 shows the observed values $b_{ij}$ for the 20 letter series completion problems.

Table 3
Overview of the letter series completion problems used in the experiment. The third column shows the alphabetic relations holding between the subsequences of the series.

Problem number   Letter sequence   Alphabetic relations   Correct continuation
1                cdcdcdcd          SS                     cdc
2                gohpiqjrks        NN                     ltm
3                tbaxtbaxtb        SSSS                   axt
4                fiehdgcfbe        PP                     adz
5                abyabxabwab       SSP                    vab
6                adbecfdgeh        NN                     fig
7                jkorklpslmqt      NNNN                   mnr
8                ehgjilknmp        DD                     orq
9                urtustuttu        SNS                    utu
10               pvountmslr        PP                     kqj
11               eafgbhicjkdl      DND                    men
12               dhefcgdebfcd      PPPP                   aeb
13               npaoqapraqsa      NNS                    rta
14               axdcxfexhgxj      DSD                    ixl
15               ekofkngkmhk       NSP                    lik
16               jkillminnoip      DDSD                   pqi
17               fgjggkhglig       NSN                    mjg
18               wrnfvqmeupld      PPPP                   tok
19               dhcbfjedhlgf      DDDD                   jni
20               mrnorpqrrsrt      DSD                    urv

Table 4
The observed values for $b_{ij}$. Each row i lists its non-empty entries in column order (j = 1, ..., 20); the diagonal and empty cells are omitted.

i    entries
1    (none)
2    5 5 2 4 3 1 4 4 3 1 4 2 4 4 3 4 2
3    (none)
4    14 11 14 13 11 7 8 12 11 11 3 12 9 12 12 10 9 1 8
5    1 1
6    7 5 7 4 6 2 3 6 5 4 5 5 6 4 5 3 4
7    18 13 18 11 17 13 11 17 15 15 3 14 12 15 14 13 11 3 11
8    13 9 13 7 12 9 6 12 10 10 3 12 8 11 10 10 7 7
9    3 2 3 1 2 2 2 2 2 2 1 2 1 2 2 2 2 1 2
10   4 3 4 1 3 2 1 1 3 1 2 1 2 1 2 1 1
11   4 2 4 1 3 1 1 1 3 1 2 1 2 1 2 1 1
12   34 30 34 23 33 27 19 24 32 30 30 29 26 30 28 26 22 8 26
13   5 4 5 3 4 3 1 4 4 3 3 3 3 2 2 3 3
14   9 6 9 4 8 7 3 4 7 6 6 1 7 6 6 5 3 5
15   4 3 4 2 3 3 1 2 3 2 2 2 1 2 1 1 1 2
16   7 6 7 5 6 4 3 4 6 4 4 1 4 4 5 5 2 1 3
17   9 7 9 5 8 7 4 6 8 7 7 1 6 5 6 7 4 1 6
18   13 12 13 8 12 9 6 7 12 10 10 1 11 7 10 8 8 1 8
19   34 29 34 21 33 27 19 21 32 30 30 8 29 25 31 28 26 22 26
20   10 7 10 4 9 7 3 4 9 7 7 2 8 6 8 6 7 5 2

First, we describe the analysis of the data with ITA. Only the relations $\leq_0$ and $\leq_{34}$ are transitive. The CA-value for $\leq_0$ is higher than the CA-value for $\leq_{34}$. The best surmise relation according to ITA is therefore $\leq_0$. This surmise relation is shown in Fig. 1 as a Hasse diagram.

Fig. 1. The surmise relation $\leq_0$ as a Hasse diagram.

The surmise relation is organized in four layers and allows some conclusions concerning the difficulty of a letter series completion problem. Problems 1, 3, and 5 (layers 1 and 2), which contain an obvious identity relation S, are the easiest, i.e. they are implied by all other problems. Problems 2, 4, 6, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, and 20 (layer 3) cannot be compared concerning their difficulty. Problems 7, 12, and 19 (layer 4), which contain a difficult alphabetic relation and have a high period, seem to be the most complex problems. They imply some of the problems in layer 3.

This example shows that the handling of intransitivities in ITA is not satisfactory. If an intransitivity occurs, the whole relation is dropped. This is problematic especially for big item sets, since the probability that at least one intransitivity occurs in $\leq_L$ increases with the size of the item set $I$.

We now describe the analysis of the dataset with inductive ITA. Table 5 shows for each item pair the minimal level $L$ for which $i \sqsubseteq_L j$ is true. The values with $b_{ij} \neq L$ are displayed bold.

Table 5
Minimal level $L$ with $i \sqsubseteq_L j$. If $L > b_{ij}$ the value is displayed bold. Each row i lists its non-empty entries in column order (j = 1, ..., 20); the diagonal and empty cells are omitted.

i    entries
1    (none)
2    18 18 2 18 4 1 18 5 5 1 5 2 5 5 3 4 2
3    (none)
4    18 18 18 18 18 7 9 18 18 18 3 18 18 18 18 18 18 1 18
5    1 1
6    18 18 18 5 18 2 4 18 18 18 18 5 18 7 5 4 4
7    18 18 18 18 18 18 18 18 18 18 3 18 18 18 18 18 18 3 18
8    18 18 18 8 18 18 6 18 18 18 3 18 18 18 18 18 18 18
9    18 2 18 1 18 2 2 2 2 2 1 2 1 2 2 2 2 1 2
10   18 18 18 1 18 3 1 1 18 2 2 1 2 1 2 1 1
11   18 18 18 1 18 3 1 1 18 1 2 1 2 1 2 1 1
12   34 34 34 34 34 34 34 34 34 34 34 34 34 34 34 34 34 8 34
13   18 18 18 3 18 4 1 4 18 18 18 3 18 3 2 3 3
14   18 18 18 5 18 18 3 4 18 18 18 1 18 18 18 5 4 5
15   18 18 18 2 18 3 1 2 18 3 3 2 1 2 1 1 1 2
16   18 18 18 5 18 5 3 4 18 18 18 1 18 5 18 5 3 1 3
17   18 18 18 5 18 18 4 6 18 18 18 1 18 6 18 18 4 1 6
18   18 18 18 8 18 18 6 8 18 18 18 1 18 18 18 18 18 1 18
19   34 34 34 34 34 34 34 34 34 34 34 8 34 34 34 34 34 34 34
20   18 18 18 5 18 18 3 4 18 18 18 2 18 7 18 18 7 5 2
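The minimal levels in Table 5 come from adding item pairs level by level while keeping the relation transitive. The following is a minimal sketch of such a construction, not the paper's exact procedure. It assumes that a candidate pair is discarded at a level whenever it takes part in a violated transitivity triple. The function names are illustrative. The three counts for items 6, 10 and 11 are the ones quoted in this section, while the counts for the reverse directions are hypothetical placeholders. Because the toy example uses only three items, its levels need not coincide with Table 5.

```python
from itertools import product

def violations(relation, items):
    """Triples (i, j, k) with (i, j) and (j, k) in the relation but (i, k) missing."""
    return [(i, j, k) for i, j, k in product(items, repeat=3)
            if (i, j) in relation and (j, k) in relation and (i, k) not in relation]

def is_transitive(relation, items):
    return not violations(relation, items)

def threshold_relation(b, items, level):
    """Plain threshold rule: (i, j) is included iff b[i, j] <= level (or i == j)."""
    return {(i, j) for i, j in product(items, repeat=2)
            if i == j or b[i, j] <= level}

def inductive_relations(b, items, max_level):
    """Assumed inductive scheme: per level, add only pairs that keep transitivity."""
    current = threshold_relation(b, items, 0)  # starting relation at level 0
    result = {0: set(current)}
    for level in range(1, max_level + 1):
        candidates = threshold_relation(b, items, level) - current
        while True:
            # Collect every candidate pair that takes part in a violated triple.
            bad = set()
            for i, j, k in violations(current | candidates, items):
                bad |= {(i, j), (j, k)} & candidates
            if not bad:
                break
            candidates -= bad
        current |= candidates
        result[level] = set(current)
    return result

items = [6, 10, 11]
b = {
    (10, 11): 1, (11, 6): 1, (10, 6): 2,   # values quoted in the text
    (11, 10): 9, (6, 11): 9, (6, 10): 9,   # hypothetical placeholder values
}
levels = inductive_relations(b, items, max_level=2)
print(is_transitive(threshold_relation(b, items, 1), items))
print((10, 11) in levels[1], (10, 11) in levels[2])
```

In this toy setting the plain threshold relation at level 1 is intransitive, so the pair (10, 11) is held back at level 1 and enters only once (10, 6) becomes admissible as well.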
Our inductive construction process allows us to construct 12 different surmise relations $\sqsubseteq_L$ from the dataset $R$.

This example clearly shows the advantage of our inductive construction process. In ITA we have the choice between only two transitive relations, $\leq_0$ and $\leq_{34}$. Using our inductive construction process we can select the best relation out of 12 transitive relations.

As an example for the inductive construction, look at the items 6, 10 and 11. We have $b_{10,11} = 1$, $b_{11,6} = 1$ and $b_{10,6} = 2$. Thus, $10 \leq_1 11$ and $11 \leq_1 6$ but $10 \not\leq_1 6$. Therefore, $\leq_1$ is intransitive. In inductive ITA the pair (10, 11) is therefore not added in step 1 to the relation $\sqsubseteq_0$.

Table 6 shows the size of $\sqsubseteq_L$, $\mathrm{Repro}(\sqsubseteq_L, R)$, and $\mathrm{diff}(\sqsubseteq_L, R)$ for the relevant levels. If a level $L'$ is not displayed in the table, then $\sqsubseteq_{L'}$ is identical to $\sqsubseteq_L$ for the highest displayed level $L < L'$.

Table 6
Size of $\sqsubseteq_L$, $\mathrm{Repro}(\sqsubseteq_L, R)$ and $\mathrm{diff}(\sqsubseteq_L, R)$ for the relevant levels $L$

Level $L$                          0     1     2     3     4     5     6     7     8     9     18    34
$|\sqsubseteq_L|$                  88    125   156   176   189   207   212   216   221   222   344   400
$\mathrm{Repro}(\sqsubseteq_L, R)$ 1     0.98  0.97  0.96  0.95  0.94  0.94  0.93  0.92  0.92  0.88  0.83
$\mathrm{diff}(\sqsubseteq_L, R)$  4.2   3.4   2.9   2.6   2.5   2.7   2.8   2.9   3.3   3.4   14.2  59.6

The minimal diff-value is observed for $L = 4$. Therefore, $\sqsubseteq_4$ is the optimal surmise relation. Note that $\sqsubseteq_4$ is more than double the size of $\sqsubseteq_0$, while the reproducibility coefficient decreases only a little. The relation $\sqsubseteq_4$ is shown in Fig. 2 as a Hasse diagram.

The relation $\sqsubseteq_4$ allows the following observations. If the alphabetic relations between the letters of the simple sequences are identical, the difficulty increases in general with period length. See, for example, the implications 7 → 6, 12 → 8, and 19 → 4. For problems with identical period length the difficulty increases in general with the complexity of the alphabetic relations used in the problem. See, for example, 8 → 2 → 1 or 12, 19 → 7 → 3.
Here S is less complex than N, and N is less complex than P and D, while no such difference in difficulty between P and D could be observed. When both influences are mixed, nothing can be said. A problem with high period length and easy alphabetic relations can be more difficult than a problem with low period length and complex alphabetic relations (for example 7 → 10), and vice versa (for example 4 → 13). But in general the complexity of the alphabetic relations seems to be more important for problem difficulty than period length.

Fig. 2. The surmise relation $\sqsubseteq_4$ as a Hasse diagram.

These results are consistent with formal theories of problem solving in the area of letter series completion problems (for example Kotovsky and Simon, 1973, or Klahr and Wallace, 1970). These theories assume that problem solvers first search for a possible relationship between two letters of the series. If such a relationship is detected, then it is used to uncover the period of the series. Finally, the information about the period is used to uncover the whole regularity in the series. Thus, problem complexity should mainly depend upon period length and the complexity of the alphabetic relations in the series.
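The two problem features discussed here, period length and alphabetic relations, can be made concrete with a short sketch. It is an illustration under stated assumptions, not a procedure from the paper: letters are lowercase a to z with wrap-around (as in problem 4, whose continuation adz steps from a to z), the smallest period that fits is taken, and the function names `relation_of`, `analyse` and `continue_series` are hypothetical.

```python
# Alphabetic step (mod 26) between neighbouring letters of a simple sequence.
STEP_TO_RELATION = {0: "S", 1: "N", 25: "P", 2: "D"}
RELATION_TO_STEP = {name: step for step, name in STEP_TO_RELATION.items()}

def relation_of(subsequence):
    """Return S, N, P or D if one relation holds throughout, else None."""
    steps = {(ord(y) - ord(x)) % 26 for x, y in zip(subsequence, subsequence[1:])}
    if len(steps) != 1:
        return None  # no single step (or fewer than two letters)
    return STEP_TO_RELATION.get(steps.pop())

def analyse(series):
    """Return (period, relations) for the smallest fitting period, else None."""
    for period in range(1, len(series) // 2 + 1):
        relations = [relation_of(series[k::period]) for k in range(period)]
        if all(relations):
            return period, "".join(relations)
    return None

def continue_series(series, n=3):
    """Append n letters; assumes analyse(series) finds a fitting period."""
    period, relations = analyse(series)
    letters = list(series)
    for _ in range(n):
        previous = letters[-period]  # predecessor in the same simple sequence
        step = RELATION_TO_STEP[relations[len(letters) % period]]
        letters.append(chr((ord(previous) - ord("a") + step) % 26 + ord("a")))
    return "".join(letters[len(series):])

print(analyse("ekfjgihh"))            # example sequence from the text
print(continue_series("fiehdgcfbe"))  # problem 4 in Table 3
```

For the sequence ekfjgihh this splits the series into efgh and kjih, matching the period 2 and the relations N and P described in the text.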

6. Conclusions