M. Schrepp / Mathematical Social Sciences 38 (1999) 361–375
I should form a reflexive and transitive relation on $I$. We call such a relation, in accordance with Doignon and Falmagne (1985), a surmise relation on $I$.
There are a number of data-analytic techniques which try to derive such logical implications between items from binary datasets. See, for example, Leeuwe (1974), Flament (1976), Buggenhaut and Degreef (1987), Duquenne (1987), and Theuns (1994). There are some obvious connections between these data-analytic techniques and the theory of knowledge spaces (Doignon and Falmagne, 1985). In this approach a knowledge domain is represented as a finite set $I$ of questions. The subset of questions from $I$ that a subject is capable of solving is called the knowledge state of that subject. In general, not every subset of $I$ will be a possible knowledge state of a subject. Thus, the set of possible knowledge states, which is called the knowledge structure, will be a proper subset of the power set of $I$. This can be used for an efficient adaptive diagnosis of students' knowledge (see Falmagne and Doignon, 1988).
A data-analytic technique which derives the set of valid logical implications from observed response patterns determines a knowledge structure on the item set. This knowledge structure consists of all subsets $A$ of $I$ which are consistent with the derived implications, i.e. which fulfil the condition $(j \rightarrow i \wedge j \in A) \Rightarrow i \in A$ for all $i, j \in I$ (for details see Doignon and Falmagne, 1985).
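As an illustration of this condition, the following minimal Python sketch enumerates all subsets of a small item set that are closed under a given set of implications. The function name `knowledge_structure`, the encoding of an implication $j \rightarrow i$ as a pair `(j, i)`, and the three-item example are assumptions made for illustration only; brute-force enumeration is feasible only for small item sets.

```python
from itertools import combinations

def knowledge_structure(items, implications):
    """Return all subsets A of `items` that respect the implications.

    `implications` is a set of pairs (j, i) meaning j -> i:
    whoever solves item j also solves item i.
    """
    states = []
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            A = set(subset)
            # A is a state iff (j -> i and j in A) implies i in A.
            if all(i in A for (j, i) in implications if j in A):
                states.append(frozenset(A))
    return states

# Hypothetical example: c -> b and b -> a on three items.
states = knowledge_structure(["a", "b", "c"], {("c", "b"), ("b", "a")})
print(sorted(sorted(s) for s in states))
# [[], ['a'], ['a', 'b'], ['a', 'b', 'c']]
```

Only four of the eight subsets survive, which shows how a few implications already shrink the set of admissible knowledge states.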
2. Problems in reconstructing implications from data
Let $I = \{i_1, \ldots, i_n\}$ be an item set. Assume that $R$ is a set of $m$ binary response patterns to $I$. $R$ contains for each subject who responded to the items in $I$ a response pattern $r$. Such a response pattern $r$ can be considered as a mapping $r: I \rightarrow \{0, 1\}$. Our goal is to derive all valid implications $j \rightarrow i$ from this dataset $R$. We assume that the dataset $R$ contains all relevant patterns. If many patterns occurring in the population are not contained in $R$, for example because of poor sampling, we cannot hope to derive the valid implications from this dataset.
Assume that there is an implication $j \rightarrow i$ between two items $i, j \in I$. In this case $r(j) = 1$ should imply $r(i) = 1$ for every response pattern $r \in R$.
A direct approach to derive the implications from the data is to accept $j \rightarrow i$ if there is no response pattern $r \in R$ with $r(i) = 0$ and $r(j) = 1$. We call such a response pattern $r$ a counterexample for the implication $j \rightarrow i$. Such a direct approach is clearly insufficient, since some of the response patterns may be influenced by random errors.
We have to consider two different error types. Assume that $t$ is the ‘true’ state of a subject, i.e. the response pattern of the subject under ideal conditions where no errors occur. Let $r$ be the observed response pattern of the subject. We define the conditional probabilities $\alpha$ and $\beta$ by:
$\alpha$: probability that $r(i) = 0$, given $t(i) = 1$,
$\beta$: probability that $r(i) = 1$, given $t(i) = 0$.
If, for example, $I$ is a set of problems, then $\alpha$ is the probability of a careless error and $\beta$ is the probability of a lucky guess.
A method which tries to derive the valid implications from data has to deal with the influence of random errors in the data. Therefore, an implication may be accepted even
if a small number of counterexamples for this implication are observed.
3. Item tree analysis (ITA)
We describe in the following a data-analytic method which was proposed by Bart and Krus (1973) and further developed by Leeuwe (1974). The central idea of this method is to detect an ‘optimal’ tolerance level $L_{opt}$ and to accept all implications for which no more than $L_{opt}$ counterexamples are observed.
3.1. The algorithm
We describe the algorithm of ITA with the symbols from Leeuwe (1974), where an implication $j \rightarrow i$ is written as $i \le j$. We define for items $i, j \in I$:
$$b_{ij} := |\{ r \in R \mid r(i) = 0 \wedge r(j) = 1 \}|.$$
The value $b_{ij}$ is the number of counterexamples for $j \rightarrow i$ in the dataset $R$. Every tolerance level $L = 0, \ldots, m$ defines a binary relation $\le_L$ by:
$$i \le_L j :\Leftrightarrow b_{ij} \le L.$$
According to our interpretation of $\le_L$ this relation should be transitive, i.e. $i \le_L j$ and $j \le_L k$ should imply $i \le_L k$. Leeuwe (1974) showed that $\le_0$ is transitive, but that transitivity can be violated for $L > 0$. Since a relation $\le_L$ is constructed for every tolerance level $L$ the problem arises of how to determine the most adequate tolerance level $L_{opt}$.
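The counting step and the transitivity check can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation: the function names, the representation of a response pattern as a dictionary from item to 0/1, and the five-pattern dataset are assumptions. The dataset is chosen so that $\le_0$ is transitive while $\le_1$ is not. The transitivity of $\le_0$ is plausible because every counterexample to $k \rightarrow i$ is also a counterexample to $j \rightarrow i$ or to $k \rightarrow j$, so $b_{ik} \le b_{ij} + b_{jk}$.

```python
from itertools import permutations

def counterexamples(R, items):
    """b[(i, j)] = number of patterns r in R with r[i] == 0 and r[j] == 1."""
    return {(i, j): sum(1 for r in R if r[i] == 0 and r[j] == 1)
            for i, j in permutations(items, 2)}

def relation(b, L):
    """Pairs (i, j) with i <=_L j, i.e. at most L counterexamples for j -> i."""
    return {pair for pair, count in b.items() if count <= L}

def is_transitive(rel, items):
    """Check that i <= j and j <= k imply i <= k for distinct i, j, k."""
    return all((i, k) in rel
               for i, j, k in permutations(items, 3)
               if (i, j) in rel and (j, k) in rel)

# Hypothetical dataset of five response patterns on items a, b, c.
items = ["a", "b", "c"]
R = [dict(zip(items, v)) for v in
     [(0, 1, 1), (0, 0, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0)]]

b = counterexamples(R, items)
print(b[("a", "b")], b[("b", "c")], b[("a", "c")])   # 1 1 2
print(is_transitive(relation(b, 0), items))          # True
print(is_transitive(relation(b, 1), items))          # False
```

For $L = 1$ the sketch accepts $a \le_1 b$ and $b \le_1 c$ but rejects $a \le_1 c$, which is exactly the kind of intransitivity discussed above.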
A simple measure for the fit between a binary relation $\le$ on an item set $I$ and a set $R$ of observed response patterns to the items in $I$ is the reproducibility coefficient $\mathrm{Repro}(\le, R)$. Let $W_{\le}$ be the set of all possible response patterns which are consistent with $\le$. This set is defined by:
$$W_{\le} := \{ w: I \rightarrow \{0, 1\} \mid \forall i, j \; (i \le j \wedge w(j) = 1 \rightarrow w(i) = 1) \}.$$
In the terminology of knowledge space theory (Doignon and Falmagne, 1985) the set $W_{\le}$ is called the quasi-ordinal knowledge space corresponding to the relation $\le$. Note that $W_{\le_L} \subseteq W_{\le_{L'}}$ for $L' \le L$. Thus, the higher the tolerance level $L$ is, the lower is the number of response patterns consistent with $\le_L$.
Define for $w \in W_{\le}$ and $r \in R$ the distance $d(r, w)$ by:
$$d(r, w) := |\{ i \in I \mid w(i) \ne r(i) \}|.$$
The distance between $w$ and $r$ is the number of items $i$ for which $w(i)$ and $r(i)$ have different values.
We define now the reproducibility coefficient (Guttman, 1944) by:
$$\mathrm{Repro}(\le, R) := 1 - \frac{\sum_{r \in R} \min \{ d(r, w) \mid w \in W_{\le} \}}{|R| \, |I|}.$$
$\mathrm{Repro}(\le, R)$ is the proportion of cells in the data matrix which are consistent with $\le$. The higher $\mathrm{Repro}(\le, R)$ is, the better is the fit between $\le$ and $R$. Notice that $\mathrm{Repro}(\le, R)$ is a direct generalization of the reproducibility coefficient of a Guttman scale.
$\mathrm{Repro}(\le, R)$ is a useful measure to evaluate the ability of a binary relation $\le$ to explain a set of response patterns. But $\mathrm{Repro}(\le, R)$ cannot be used to determine the best relation $\le_L$.
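Consider two tolerance levels $L, L'$ with $L' \le L$. This implies $W_{\le_L} \subseteq W_{\le_{L'}}$. Thus, $\min \{ d(r, w) \mid w \in W_{\le_{L'}} \} \le \min \{ d(r, w) \mid w \in W_{\le_L} \}$ for each $r \in R$. This directly shows $\mathrm{Repro}(\le_{L'}, R) \ge \mathrm{Repro}(\le_L, R)$, i.e. $\mathrm{Repro}(\le_L, R)$ never increases with an increasing tolerance level $L$. Note that $\mathrm{Repro}(\le_0, R) = 1$, so this criterion alone would always select $L = 0$.
Therefore, Leeuwe (1974) introduced a different method to evaluate the adequacy of a relation $\le_L$.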
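The behaviour of the reproducibility coefficient across tolerance levels can be checked numerically. The sketch below is an illustration under stated assumptions: patterns are dictionaries from item to 0/1, the consistent patterns are enumerated by brute force (exponential in the number of items), and the dataset and function names are hypothetical.

```python
from itertools import permutations, product

def relation(R, items, L):
    """Pairs (i, j) with i <=_L j: at most L patterns have r[i] == 0, r[j] == 1."""
    return {(i, j) for i, j in permutations(items, 2)
            if sum(1 for r in R if r[i] == 0 and r[j] == 1) <= L}

def consistent_patterns(items, rel):
    """All w such that (i, j) in rel and w[j] == 1 imply w[i] == 1."""
    W = []
    for values in product((0, 1), repeat=len(items)):
        w = dict(zip(items, values))
        if all(w[i] == 1 for (i, j) in rel if w[j] == 1):
            W.append(w)
    return W

def distance(r, w):
    """Number of items on which the two patterns differ."""
    return sum(1 for i in r if r[i] != w[i])

def repro(rel, R, items):
    """Reproducibility coefficient: 1 - (sum of minimal distances) / (|R| |I|)."""
    W = consistent_patterns(items, rel)  # never empty: the all-zero pattern fits
    total = sum(min(distance(r, w) for w in W) for r in R)
    return 1 - total / (len(R) * len(items))

# Hypothetical dataset of five response patterns on items a, b, c.
items = ["a", "b", "c"]
R = [dict(zip(items, v)) for v in
     [(0, 1, 1), (0, 0, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0)]]

values = [repro(relation(R, items, L), R, items) for L in range(len(R) + 1)]
print([round(v, 3) for v in values])
```

In this example the coefficient equals 1 for $L = 0$ and does not rise again for larger $L$, in line with the argument above.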
The fit between $\le_L$ and the dataset $R$ is evaluated by a comparison of the observed correlations between items with the expected correlations if it is assumed that $\le_L$ is the correct relation. The fit between the dataset $R$ and $\le_L$ is measured by the correlation agreement coefficient (Leeuwe, 1974) which is defined by:
$$CA(\le_L, R) := 1 - \frac{2}{n(n-1)} \sum_{i < j} (r_{ij} - r^*_{ij})^2,$$
where $r_{ij}$ is the Pearson correlation (phi-coefficient) between item $i$ and item $j$ and $r^*_{ij}$ is defined by:
$$r^*_{ij} = \begin{cases} 1 & \text{if } i \le_L j \wedge j \le_L i, \\ \sqrt{\dfrac{(1 - p_i)\, p_j}{(1 - p_j)\, p_i}} & \text{if } i \le_L j \wedge j \not\le_L i, \\ \sqrt{\dfrac{(1 - p_j)\, p_i}{(1 - p_i)\, p_j}} & \text{if } j \le_L i \wedge i \not\le_L j, \\ 0 & \text{otherwise.} \end{cases}$$
The value $p_i$ is the relative frequency of subjects who responded positively (1) to item $i$ and $p_j$ is the relative frequency of subjects who responded positively to item $j$.
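A minimal Python sketch of the correlation agreement coefficient follows. It assumes that a relation is given as a set of pairs `(i, j)` meaning $i \le_L j$, that patterns are dictionaries from item to 0/1, and that every item has a relative frequency strictly between 0 and 1 (otherwise the phi-coefficient is undefined). Function names and the two-item dataset are hypothetical.

```python
from itertools import combinations
from math import sqrt

def phi(R, i, j):
    """Observed phi-coefficient between items i and j in dataset R."""
    n = len(R)
    p_i = sum(r[i] for r in R) / n
    p_j = sum(r[j] for r in R) / n
    p_ij = sum(r[i] * r[j] for r in R) / n
    return (p_ij - p_i * p_j) / sqrt(p_i * (1 - p_i) * p_j * (1 - p_j))

def expected_corr(rel, p, i, j):
    """Expected correlation r*_ij if `rel` were the correct relation."""
    if (i, j) in rel and (j, i) in rel:
        return 1.0
    if (i, j) in rel:
        return sqrt((1 - p[i]) * p[j] / ((1 - p[j]) * p[i]))
    if (j, i) in rel:
        return sqrt((1 - p[j]) * p[i] / ((1 - p[i]) * p[j]))
    return 0.0

def ca(rel, R, items):
    """Correlation agreement coefficient over all unordered item pairs."""
    n = len(items)
    p = {i: sum(r[i] for r in R) / len(R) for i in items}
    squared = sum((phi(R, i, j) - expected_corr(rel, p, i, j)) ** 2
                  for i, j in combinations(items, 2))
    return 1 - 2 / (n * (n - 1)) * squared

# Hypothetical noiseless data for the single implication b -> a (a <= b).
items = ["a", "b"]
R = [{"a": 0, "b": 0}, {"a": 1, "b": 0}, {"a": 1, "b": 1}]
print(round(ca({("a", "b")}, R, items), 3))   # 1.0
print(round(ca(set(), R, items), 3))          # 0.75
```

With the correct relation the observed and expected correlations coincide (both are 0.5 here), so the coefficient is 1; with the empty relation the expected correlation is 0 and the coefficient drops.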
The higher the value of $CA(\le_L, R)$ is, the better is the fit between the relation $\le_L$ and the dataset $R$. Note that we should have $r_{ij} = r^*_{ij}$ if $\le_L$ is the correct relation and $j \rightarrow i$ is a correct implication. But for item pairs $i, j$ with $i \not\le_L j$ and $j \not\le_L i$ it is possible that $r_{ij} > r^*_{ij} = 0$, even if $\le_L$ is the correct relation and the data are noiseless. This can be illustrated by the following example.
Example. Let $I = \{a, b, c, d\}$ be a set of four items. Assume that the implications $a \rightarrow b$, $a \rightarrow c$, $a \rightarrow d$, and $c \rightarrow d$ are true. Thus, each subject should show one of the following seven response patterns:

d  c  b  a
0  0  0  0
0  0  1  0
1  0  0  0
1  0  1  0
1  1  0  0
1  1  1  0
1  1  1  1

Assume that each of these response patterns occurs with the same frequency in the population and that no errors are possible ($\alpha = \beta = 0$). Then we should have $r_{bd} = 0.09 > r^*_{bd} = 0$ and $r_{bc} = 0.167 > r^*_{bc} = 0$.
As the example shows, the expected correlation $r^*_{ij}$ can differ from the true correlation $r_{ij}$ for non-connected item pairs $i, j$, i.e. items $i, j \in I$ with $i \not\le j$ and $j \not\le i$.
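The numbers in the example can be reproduced with a short Python sketch. It enumerates all response patterns consistent with the four implications and computes the phi-coefficients under the example's assumption of equally frequent, error-free patterns. The encoding of $j \rightarrow i$ as a pair `(j, i)` and the helper name `phi` are choices made for this illustration.

```python
from itertools import product
from math import sqrt

items = ["a", "b", "c", "d"]
# (j, i) encodes the implication j -> i from the example.
implications = [("a", "b"), ("a", "c"), ("a", "d"), ("c", "d")]

# Enumerate all 0/1 patterns and keep those that violate no implication.
patterns = []
for values in product((0, 1), repeat=len(items)):
    w = dict(zip(items, values))
    if all(w[i] == 1 for j, i in implications if w[j] == 1):
        patterns.append(w)

def phi(x, y):
    """Phi-coefficient of items x and y when all patterns are equally frequent."""
    n = len(patterns)
    p_x = sum(w[x] for w in patterns) / n
    p_y = sum(w[y] for w in patterns) / n
    p_xy = sum(w[x] * w[y] for w in patterns) / n
    return (p_xy - p_x * p_y) / sqrt(p_x * (1 - p_x) * p_y * (1 - p_y))

print(len(patterns))              # 7
print(round(phi("b", "d"), 3))    # 0.091
print(round(phi("b", "c"), 3))    # 0.167
```

Both correlations are positive although neither pair is connected by an implication, so the expected value of 0 underestimates them.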
Leeuwe (1974) suggests the following procedure to detect the optimal tolerance level $L_{opt}$. First, a critical value $0 \le \delta \le 1$ for the reproducibility coefficient $\mathrm{Repro}(\le_L, R)$ is set. Second, for this critical value $\delta$ we define the set $\mathcal{L}_\delta$ by:
$$\mathcal{L}_\delta := \{ L \in \{0, \ldots, m\} \mid \mathrm{Repro}(\le_L, R) \ge \delta \wedge \le_L \text{ is transitive} \}.$$
The set $\mathcal{L}_\delta$ consists of all tolerance levels $L$ for which $\le_L$ is transitive and the reproducibility coefficient $\mathrm{Repro}(\le_L, R)$ is not lower than the critical value $\delta$. Note that $\mathcal{L}_\delta \ne \emptyset$, since $\le_0$ is always transitive and $\mathrm{Repro}(\le_0, R) = 1$. Third, the level $L \in \mathcal{L}_\delta$ for which $CA(\le_L, R)$ shows the highest value is chosen as the optimal tolerance level $L_{opt}$.
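The three-step selection can be sketched as a small Python function. It assumes that the reproducibility coefficient, the CA value and the transitivity flag have already been computed for every tolerance level; the function name `optimal_level`, the use of a non-strict comparison with the critical value, and all numbers in the example are hypothetical.

```python
def optimal_level(candidates, delta):
    """Pick the tolerance level with the highest CA among admissible levels.

    `candidates` maps each level L to a tuple (repro, ca, transitive).
    A level is admissible if its relation is transitive and its
    reproducibility coefficient is at least `delta`.
    """
    admissible = {L: ca_value
                  for L, (repro_value, ca_value, transitive) in candidates.items()
                  if transitive and repro_value >= delta}
    # Level 0 is always admissible in ITA, so `admissible` is non-empty
    # whenever `candidates` contains level 0 with its true values.
    return max(admissible, key=admissible.get)

# Made-up values for four tolerance levels, for illustration only.
candidates = {
    0: (1.00, 0.80, True),
    1: (0.97, 0.93, False),   # best CA so far, but intransitive
    2: (0.95, 0.90, True),
    3: (0.85, 0.95, True),    # highest CA, but Repro below delta
}
print(optimal_level(candidates, delta=0.9))   # 2
```

In this made-up case level 1 is discarded for a single intransitivity and level 3 for its low reproducibility, so level 2 is chosen although two other levels have higher CA values.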
This strategy to derive the optimal tolerance level $L_{opt}$ and the best surmise relation $\le_{L_{opt}}$ is problematic for two reasons.
First, a single intransitivity in $\le_L$ results in excluding the whole relation $\le_L$ from the process which searches for the best relation. If intransitivities are likely (small number of observed response patterns, high error probabilities), many of the relations $\le_L$ will contain intransitivities. The choice of the best fitting relation is in such cases based on a small number of transitive relations $\le_L$.
Second, the remaining transitive relations are compared with the correlation agreement coefficient. This coefficient is based on a comparison of $r_{ij}$ and $r^*_{ij}$. As we have shown in the example it is possible that $r_{ij} > r^*_{ij}$ for non-connected item pairs $i, j$. Thus, for relations which contain many non-connected item pairs it seems possible that the correct relation $\le_L$ will not have the best $CA(\le_L, R)$ value.
3.2. A simulation study
The ability of ITA to reconstruct the valid implications from observed data was tested in a simulation study. Four surmise relations on a set of eight items from an investigation of Held et al. (1995) were used in the study.
The four surmise relations $\sqsubseteq_1$, $\sqsubseteq_2$, $\sqsubseteq_3$ and $\sqsubseteq_4$ differ in the number of non-connected item pairs. The relation $\sqsubseteq_1$ is a linear order and therefore contains no non-connected pairs, the relation $\sqsubseteq_2$ contains three non-connected pairs, the relation $\sqsubseteq_3$ contains 11 non-connected pairs, and the relation $\sqsubseteq_4$ contains 15 non-connected pairs (footnote 1).
In each simulation the following steps were repeated $m$ times.
1. A pattern $r$ is chosen randomly from the quasi-ordinal knowledge space $W_{\sqsubseteq}$ corresponding to $\sqsubseteq$.
2. For each item $i$:
• if $r(i) = 1$, then $r(i)$ is changed with probability $\alpha$ to 0,
• if $r(i) = 0$, then $r(i)$ is changed with probability $\beta$ to 1.
The result of this process is a simulated dataset $R(\alpha, \beta, m)$. This simulated dataset was analysed with ITA. Let $\sqsubseteq_{ITA}$ be the best solution concerning ITA, i.e. the transitive relation $\le_L$ with the highest CA-value.
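The data-generating steps above can be sketched in Python as follows. This is an illustration, not the code used in the study: the text only says that a pattern is chosen randomly, so the uniform draw is an assumption, as are the function names, the `seed` parameter and the three-item chain used as an example. A relation is given as a set of pairs `(i, j)` meaning $i \sqsubseteq j$, i.e. $j \rightarrow i$.

```python
import random
from itertools import product

def consistent_patterns(items, rel):
    """All patterns w such that (i, j) in rel and w[j] == 1 imply w[i] == 1."""
    W = []
    for values in product((0, 1), repeat=len(items)):
        w = dict(zip(items, values))
        if all(w[i] == 1 for (i, j) in rel if w[j] == 1):
            W.append(w)
    return W

def simulate(items, rel, alpha, beta, m, seed=None):
    """Generate m noisy response patterns from the knowledge space of `rel`."""
    rng = random.Random(seed)
    W = consistent_patterns(items, rel)
    R = []
    for _ in range(m):
        t = rng.choice(W)            # step 1: uniform draw (an assumption)
        r = {}
        for i in items:              # step 2: independent errors per item
            if t[i] == 1:
                r[i] = 0 if rng.random() < alpha else 1   # careless error
            else:
                r[i] = 1 if rng.random() < beta else 0    # lucky guess
        R.append(r)
    return R

# Hypothetical linear order on three items: c -> b -> a.
chain = {("a", "b"), ("b", "c"), ("a", "c")}
R = simulate(["a", "b", "c"], chain, alpha=0.05, beta=0.05, m=100, seed=0)
print(len(R))   # 100
```

Setting `alpha` and `beta` to 0 yields only patterns from the knowledge space, which gives a simple sanity check for the generator.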
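The relation $\sqsubseteq_{ITA}$ is then compared to the true surmise relation $\sqsubseteq$ by:
$$D(\sqsubseteq, \sqsubseteq_{ITA}) = |\{ (i, j) \mid (i \sqsubseteq j \wedge i \not\sqsubseteq_{ITA} j) \vee (i \sqsubseteq_{ITA} j \wedge i \not\sqsubseteq j) \}|.$$
$D(\sqsubseteq, \sqsubseteq_{ITA})$ is the number of pairs $(i, j)$ for which the true dependency (i.e. $i \sqsubseteq j$ or $i \not\sqsubseteq j$) is not properly reconstructed by ITA.
Table 1 shows the simulation results for different combinations of values for $\alpha$, $\beta$ and $m$. For each combination of values 50 simulated datasets are constructed and the mean of $D(\sqsubseteq, \sqsubseteq_{ITA})$ over these 50 datasets is shown.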
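When both relations are stored as sets of ordered pairs, the misclassification count $D$ is the size of their symmetric difference. The sketch below assumes this set representation (without the reflexive pairs); the function name and the example relations are hypothetical.

```python
def misclassified(true_rel, ita_rel):
    """Number of ordered pairs (i, j) contained in exactly one of the relations."""
    return len(true_rel ^ ita_rel)

# Hypothetical example on three items.
true_rel = {("a", "b"), ("b", "c"), ("a", "c")}
ita_rel = {("a", "b"), ("c", "b")}
print(misclassified(true_rel, ita_rel))   # 3
```

Here the pairs (b, c) and (a, c) are missed and the pair (c, b) is wrongly accepted, giving three misclassified pairs.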
As the results show, the value of $D(\sqsubseteq, \sqsubseteq_{ITA})$ depends on the values for $\alpha$, $\beta$ and $m$. The higher the error probabilities are, the higher is the error rate in reconstructing the correct implications. The more simulated data patterns are available, the better is the ability of ITA to reconstruct the correct implications.

Table 1
Mean values for the number of item pairs misclassified by ITA

                  | $\sqsubseteq_1$   | $\sqsubseteq_2$   | $\sqsubseteq_3$   | $\sqsubseteq_4$
$\alpha$ $\beta$  | m=50  m=100 m=200 | m=50  m=100 m=200 | m=50  m=100 m=200 | m=50  m=100 m=200
0.03     0.03     | 2.8   0.5   0.1   | 3.7   1.9   1     | 4.4   2.4   1.1   | 3.9   4.1   2.2
0.03     0.05     | 4     1.7   0.2   | 5     2.7   1.4   | 5     3.7   2.5   | 5.3   4.6   4.1
0.03     0.07     | 6     2.5   0.7   | 5.9   3.7   2.8   | 6.5   6     4.2   | 6.4   5.9   6
0.05     0.03     | 2.9   1.4   0.2   | 4.7   2.6   1.1   | 5     3.5   1.6   | 5.1   3.7   2.5
0.05     0.05     | 4.3   2.9   0.3   | 5.5   3.9   1.8   | 5.9   4.8   3.3   | 5.8   5.3   4.9
0.05     0.07     | 6.8   4.4   0.5   | 7.3   6     3.8   | 6.9   6.8   5.9   | 7.2   6.8   6.6
0.07     0.03     | 4.6   1.9   0.4   | 5.4   4.2   1.5   | 5.6   4.3   2.3   | 5.2   3.9   3.7
0.07     0.05     | 5.8   3.1   0.6   | 7.1   5.1   3.6   | 7     6.1   4.2   | 6.4   6.1   6
0.07     0.07     | 9.1   4.8   1.5   | 9.2   7.3   5.5   | 8.4   7.7   7.3   | 7.9   7.8   8.2

1 Another possibility to measure the degree of linearity of a surmise relation $\sqsubseteq$ is the dimension $\dim(\sqsubseteq)$, i.e. the number of items in a maximal antichain. For the surmise relations used in our simulations we have $\dim(\sqsubseteq_1) = 1$, $\dim(\sqsubseteq_2) = 2$, $\dim(\sqsubseteq_3) = 3$, and $\dim(\sqsubseteq_4) = 4$.
The simulation results are quite different for the four surmise relations. Thus, the structure of the surmise relation also influences the error rate.
As the results show, the error rate is low if the error probabilities $\alpha$, $\beta$ are low and the number of non-connected pairs in the surmise relation is low (i.e. if the surmise relation is more or less linear). The error rate increases markedly with the number of non-connected pairs in the surmise relation.
For example, for $\sqsubseteq_4$ and $\alpha = \beta = 0.07$ the error rate is around 8. So on average eight of the 56 possible pairs $(i, j)$ are misclassified. The error rate does not decrease with $m$, thus the error is systematic.
4. Improvement of ITA