370 M. Schrepp / Mathematical Social Sciences 38 (1999) 361–375
5. An example of data analysis with ITA and inductive ITA
As a practical application of ITA and of its inductive variant, we analyse a dataset from Schrepp (1995). In this investigation a group of 51 subjects tried to solve 20 letter series completion problems, which are shown in Table 3.
Such letter series completion problems have a simple structure. Given a sequence of letters of the alphabet, a subject is required to find a continuation which is in accordance with the rule used to generate the sequence.
The letter sequences commonly used, for example in intelligence tests, are a mixture of simple sequences in which every letter is in a specific relation to its predecessor. The number of simple sequences in a letter series is called the period of the series. In our problem set we used four alphabetic relations between the letters of the simple sequences: same letter (S), alphabetic successor (N), alphabetic predecessor (P), and double next letter (D).
Consider, for example, the letter sequence ekfjgihh. It consists of the simple sequences efgh and kjih. In efgh every letter is the alphabetic successor of its predecessor in the sequence, while in kjih every letter is the alphabetic predecessor of its predecessor in the sequence. So the period of the problem is 2, and N and P are the alphabetic relations holding between the letters of the simple subsequences.
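The decomposition described above can be illustrated with a minimal sketch. It assumes that the simple sequences are interleaved in a fixed cyclic order and that the alphabet wraps around from a to z (which the continuation adz of problem 4 in Table 3 suggests). The function names are hypothetical and not part of the original study.

```python
# Relation codes from the text, expressed as alphabet steps modulo 26.
# The wrap-around is an assumption, suggested by problem 4 (continuation "adz").
STEP_TO_RELATION = {0: "S", 1: "N", 25: "P", 2: "D"}


def split_series(series, period):
    """Split a letter series into its `period` interleaved simple sequences."""
    return [series[k::period] for k in range(period)]


def step_of(simple):
    """Return the constant alphabet step (mod 26) between neighbouring letters."""
    steps = {(ord(b) - ord(a)) % 26 for a, b in zip(simple, simple[1:])}
    if len(steps) != 1:
        raise ValueError(f"{simple!r} does not follow a single relation")
    return steps.pop()


def relation_of(simple):
    """Map a simple sequence to its relation code S, N, P or D."""
    return STEP_TO_RELATION[step_of(simple)]


def continue_series(series, period, n):
    """Extend the series by n letters, following each simple sequence's rule."""
    parts = split_series(series, period)
    steps = [step_of(p) for p in parts]
    last = [p[-1] for p in parts]
    out = []
    for pos in range(len(series), len(series) + n):
        k = pos % period  # which simple sequence this position belongs to
        last[k] = chr((ord(last[k]) - ord("a") + steps[k]) % 26 + ord("a"))
        out.append(last[k])
    return "".join(out)


if __name__ == "__main__":
    parts = split_series("ekfjgihh", 2)
    print(parts, [relation_of(p) for p in parts])  # ['efgh', 'kjih'] ['N', 'P']
    print(continue_series("fiehdgcfbe", 2, 3))     # adz (problem 4 in Table 3)
```

The sketch takes the period as given. Finding the period is the part of the task that the subjects have to solve themselves.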
Table 4 shows the observed values b_ij for the 20 letter series completion problems.
First, we describe the analysis of the data with ITA. Only the relations ≤_0 and ≤_34
Table 3
Overview of the letter series completion problems used in the experiment. The third column shows the alphabetic relations holding between the subsequences of the series

Problem   Letter          Alphabetic   Correct
number    sequence        relations    continuation
1         cdcdcdcd        SS           cdc
2         gohpiqjrks      NN           ltm
3         tbaxtbaxtb      SSSS         axt
4         fiehdgcfbe      PP           adz
5         abyabxabwab     SSP          vab
6         adbecfdgeh      NN           fig
7         jkorklpslmqt    NNNN         mnr
8         ehgjilknmp      DD           orq
9         urtustuttu      SNS          utu
10        pvountmslr      PP           kqj
11        eafgbhicjkdl    DND          men
12        dhefcgdebfcd    PPPP         aeb
13        npaoqapraqsa    NNS          rta
14        axdcxfexhgxj    DSD          ixl
15        ekofkngkmhk     NSP          lik
16        jkillminnoip    DDSD         pqi
17        fgjggkhglig     NSN          mjg
18        wrnfvqmeupld    PPPP         tok
19        dhcbfjedhlgf    DDDD         jni
20        mrnorpqrrsrt    DSD          urv
Table 4
The observed values for b_ij
i\j 1
2 3
4 5
6 7
8 9
10 11
12 13
14 15
16 17
18 19
20 1
2 5
5 2
4 3
1 4
4 3
1 4
2 4
4 3
4 2
3 4
14 11
14 13
11 7
8 12
11 11
3 12
9 12
12 10
9 1
8 5
1 1
6 7
5 7
4 6
2 3
6 5
4 5
5 6
4 5
3 4
7 18
13 18
11 17
13 11
17 15
15 3
14 12
15 14
13 11
3 11
8 13
9 13
7 12
9 6
12 10
10 3
12 8
11 10
10 7
7 9
3 2
3 1
2 2
2 2
2 2
1 2
1 2
2 2
2 1
2 10
4 3
4 1
3 2
1 1
3 1
2 1
2 1
2 1
1 11
4 2
4 1
3 1
1 1
3 1
2 1
2 1
2 1
1 12
34 30
34 23
33 27
19 24
32 30
30 29
26 30
28 26
22 8
26 13
5 4
5 3
4 3
1 4
4 3
3 3
3 2
2 3
3 14
9 6
9 4
8 7
3 4
7 6
6 1
7 6
6 5
3 5
15 4
3 4
2 3
3 1
2 3
2 2
2 1
2 1
1 1
2 16
7 6
7 5
6 4
3 4
6 4
4 1
4 4
5 5
2 1
3 17
9 7
9 5
8 7
4 6
8 7
7 1
6 5
6 7
4 1
6 18
13 12
13 8
12 9
6 7
12 10
10 1
11 7
10 8
8 1
8 19
34 29
34 21
33 27
19 21
32 30
30 8
29 25
31 28
26 22
26 20
10 7
10 4
9 7
3 4
9 7
7 2
8 6
8 6
7 5
2
are transitive. The CA-value for ≤_0 is higher than the CA-value for ≤_34. The best surmise relation according to ITA is therefore ≤_0. This surmise relation is shown in Fig. 1 as a Hasse diagram.
The surmise relation ≤_0 is organized in four layers and allows some conclusions concerning the difficulty of a letter series completion problem. Problems 1, 3, and 5 (layers 1 and 2), which contain an obvious identity relation S, are the easiest, i.e. they are implied by all other problems. Problems 2, 4, 6, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, and 20 (layer 3) cannot be compared concerning their difficulty. Problems 7, 12, and 19 (layer 4), which contain a difficult alphabetic relation and have a high period, seem to be the most complex problems. They imply some of the problems in layer 3.
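The search for transitive relations in classical ITA can be sketched as follows. The sketch assumes that b_ij counts the subjects who fail item i but solve item j, and that the relation at level L contains every pair with b_ij ≤ L. Both definitions are given outside this section and are stated here as assumptions. The response patterns are hypothetical toy data, not the data of the study, and the CA-value used to choose among the transitive relations is not computed.

```python
from itertools import product


def counterexample_counts(patterns):
    """b[i][j] = number of response patterns with item i failed and item j solved.

    This definition of b_ij is an assumption, see the lead-in.
    """
    n = len(patterns[0])
    b = [[0] * n for _ in range(n)]
    for r in patterns:
        for i, j in product(range(n), repeat=2):
            if r[i] == 0 and r[j] == 1:
                b[i][j] += 1
    return b


def relation_at_level(b, level):
    """All pairs (i, j) whose counterexample count does not exceed the level."""
    n = len(b)
    return {(i, j) for i, j in product(range(n), repeat=2) if b[i][j] <= level}


def is_transitive(rel):
    """True if (i, j) and (j, k) in rel always imply (i, k) in rel."""
    return all((i, k) in rel
               for (i, j) in rel
               for (j2, k) in rel
               if j == j2)


def transitive_levels(b):
    """Levels 0..max(b) at which the classical ITA relation is transitive."""
    max_level = max(max(row) for row in b)
    return [lvl for lvl in range(max_level + 1)
            if is_transitive(relation_at_level(b, lvl))]


if __name__ == "__main__":
    # Hypothetical responses of seven subjects to three items (1 = solved).
    patterns = [(1, 1, 1), (1, 1, 0), (1, 0, 0), (0, 0, 0),
                (0, 1, 0), (1, 0, 1), (1, 0, 1)]
    b = counterexample_counts(patterns)
    print(b)                     # [[0, 1, 0], [3, 0, 2], [2, 2, 0]]
    print(transitive_levels(b))  # [0, 1, 3]
```

In the toy data the relation at level 2 is discarded as a whole because of a single intransitivity, which is the behaviour criticised in the text.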
This example shows that the handling of intransitivities in ITA is not satisfactory. If an intransitivity occurs the whole relation is dropped. This is problematic especially for
Fig. 1. The surmise relation ≤_0 as a Hasse diagram.

Table 5
Minimal level L with i ⊑_L j. If L > b_ij the value is displayed bold
i\j 1
2 3
4 5
6 7
8 9
10 11
12 13
14 15
16 17
18 19
20 1
2 18
18 2
18 4
1 18
5 5
1 5
2 5
5 3
4 2
3 4
18 18
18 18
18 7
9 18
18 18
3 18
18 18
18 18
18 1
18
5 1
1 6
18 18
18 5
18 2
4 18
18 18
18 5
18 7
5 4
4 7
18 18
18 18
18 18
18 18
18 18
3 18
18 18
18 18
18 3
18
8 18
18 18
8 18
18 6
18 18
18 3
18 18
18 18
18 18
18
9 18
2 18
1 18
2 2
2 2
2 1
2 1
2 2
2 2
1 2
10 18
18 18
1 18
3 1
1 18
2 2
1 2
1 2
1 1
11 18
18 18
1 18
3 1
1 18
1 2
1 2
1 2
1 1
12 34
34 34
34 34
34 34
34 34
34 34
34 34
34 34
34 34
8 34
13 18
18 18
3 18
4 1
4 18
18 18
3 18
3 2
3 3
14 18
18 18
5 18
18 3
4 18
18 18
1 18
18 18
5 4
5 15
18 18
18 2
18 3
1 2
18 3
3 2
1 2
1 1
1 2
16 18
18 18
5 18
5 3
4 18
18 18
1 18
5 18
5 3
1 3
17 18
18 18
5 18
18 4
6 18
18 18
1 18
6 18
18 4
1 6
18 18
18 18
8 18
18 6
8 18
18 18
1 18
18 18
18 18
1 18
19 34
34 34
34 34
34 34
34 34
34 34
8 34
34 34
34 34
34 34
20 18
18 18
5 18
18 3
4 18
18 18
2 18
7 18
18 7
5 2
big item sets, since the probability that at least one intransitivity occurs in ≤_L increases with the size of the item set I.
We now describe the analysis of the dataset with the inductive variant of ITA. Table 5 shows for each item pair the minimal level L for which i ⊑_L j is true. The values with b_ij ≠ L are displayed bold. Our inductive construction process allows us to construct 12 different surmise relations ⊑_L from the dataset R.
This example clearly shows the advantage of our inductive construction process. In ITA we have the choice between only two transitive relations, ≤_0 and ≤_34. Using our inductive construction process we can select the best relation out of 12 transitive relations.
As an example for the inductive construction, look at the items 6, 10 and 11. We have b_{10,11} = 1, b_{11,6} = 1 and b_{10,6} = 2. Thus, 10 ≤_1 11 and 11 ≤_1 6 but 10 ≰_1 6. Therefore, ≤_1 is intransitive. In the inductive variant of ITA the pair (10, 11) is therefore not added in step 1 to the relation ⊑_0. Table 6 shows the size of ⊑_L, Repro(⊑_L, R), and diff(⊑_L, R) for the levels. If a level L′ is not displayed in the table, then ⊑_L′ is identical to ⊑_L for the highest displayed level L < L′.

Table 6
Size of ⊑_L, Repro(⊑_L, R) and diff(⊑_L, R) for the relevant levels L

Level L          0     1     2     3     4     5     6     7     8     9     18    34
|⊑_L|            88    125   156   176   189   207   212   216   221   222   344   400
Repro(⊑_L, R)    1     0.98  0.97  0.96  0.95  0.94  0.94  0.93  0.92  0.92  0.88  0.83
diff(⊑_L, R)     4.2   3.4   2.9   2.6   2.5   2.7   2.8   2.9   3.3   3.4   14.2  59.6

The minimal diff-value is observed for L = 4. Therefore, ⊑_4 is the optimal surmise relation. Note that ⊑_4 is more than double the size of ⊑_0 while the reproducibility coefficient decreases only a little. The relation ⊑_4 is shown in Fig. 2 as a Hasse diagram.
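The two computations in this passage, the intransitivity among items 6, 10 and 11 and the choice of the level with minimal diff-value, can be checked with a short sketch. It uses only the three b-values quoted in the text and the rows of Table 6. The leading level 0 in `levels` is an inference: each data row of Table 6 has twelve entries, and the minimum at L = 4 only lines up with the text if the first level is 0. The variable and function names are hypothetical.

```python
def transitivity_violations(relation):
    """Triples (i, j, k) with (i, j) and (j, k) in the relation but (i, k) missing."""
    return sorted((i, j, k)
                  for (i, j) in relation
                  for (j2, k) in relation
                  if j == j2 and (i, k) not in relation)


# The three counterexample counts quoted in the text for items 6, 10 and 11.
b_known = {(10, 11): 1, (11, 6): 1, (10, 6): 2}

# Pairs among these three that satisfy b_ij <= 1.
level_1_pairs = {pair for pair, count in b_known.items() if count <= 1}
violations = transitivity_violations(level_1_pairs)

# Rows of Table 6 (the first level, 0, is inferred, see the lead-in).
levels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 18, 34]
size = [88, 125, 156, 176, 189, 207, 212, 216, 221, 222, 344, 400]
diff = [4.2, 3.4, 2.9, 2.6, 2.5, 2.7, 2.8, 2.9, 3.3, 3.4, 14.2, 59.6]

# The optimal surmise relation is the one with the smallest diff-value.
best_level = min(zip(levels, diff), key=lambda pair: pair[1])[0]
size_ratio = size[levels.index(best_level)] / size[levels.index(0)]

if __name__ == "__main__":
    print(violations)            # [(10, 11, 6)]
    print(best_level)            # 4
    print(round(size_ratio, 2))  # 2.15
```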
The relation ⊑_4 allows the following observations. If the alphabetic relations between the letters of the simple sequences are identical, the difficulty increases in general with period length. See, for example, the implications 7 → 6, 12 → 8, and 19 → 4. For problems with identical period length the difficulty increases in general with the complexity of the alphabetic relations used in the problem. See, for example, 8 → 2 → 1 or 12, 19 → 7 → 3. Here S is less complex than N, and N is less complex than P and D, while no such difference in difficulty between P and D could be observed. When both influences are mixed, no general statement can be made. A problem with high period length and easy alphabetic relations can be more difficult than a problem with low period length and complex alphabetic relations (for example 7 → 10), and vice versa (for example 4 → 13). But in general the complexity of the alphabetic relations seems to be more important for problem difficulty than period length.

Fig. 2. The surmise relation ⊑_4 as a Hasse diagram.
These results are consistent with formal theories of problem solving in the area of letter series completion problems (for example, Kotovsky and Simon, 1973, or Klahr and Wallace, 1970). These theories assume that problem solvers first search for a possible relationship between two letters of the series. If such a relationship is detected, it is used to uncover the period of the series. Finally, the information about the period is used to uncover the whole regularity in the series. Thus, problem complexity should mainly depend upon period length and the complexity of the alphabetic relations in the series.
6. Conclusions