3.2 The three-parameter logistic model

A serious criticism of the Rasch model and the 2PLM is that they cannot accurately describe the behaviour of students in tests where some or all of the items have a forced-choice format, such as multiple choice items. If the ability is very low, both models predict a success probability very near zero, yet correct answers may still come about through some guessing strategy. If the item is a multiple choice item with four alternatives, picking an alternative at random gives a success probability of 0.25. Formally, this is handled by adding another parameter to each item, which changes the lower asymptote from zero to some positive (but unknown) constant $c_i$. The IRFs for this model are given by

\[
P(X_i = 1 \mid \theta) = c_i + (1 - c_i)\,\frac{\exp[\alpha_i(\theta - \beta_i)]}{1 + \exp[\alpha_i(\theta - \beta_i)]} \tag{11}
\]

The parameter $c_i$ is known as the guessing parameter, and the parameters $\alpha_i$ and $\beta_i$ are the discrimination and difficulty parameters, just as in the 2PLM. This model is known as the three-parameter logistic model (3PLM), although the IRF defined by (11) is not a logistic function. In Figure 8.4 the item response functions are displayed for two items, i and j, having the same discrimination and difficulty, but $c_i = 0$ and $c_j = 0.25$. The location of the difficulty parameter is indicated by the dashed line.
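The two curves in the figure can be tabulated with a few lines of Python. This is only a minimal sketch: the function name `irf_3pl` and the helper `tabulate` are hypothetical, and because the text does not state the discrimination and difficulty used in the figure, the values alpha = 1 and beta = 0 are assumptions chosen for illustration.

```python
import math


def irf_3pl(theta, alpha, beta, c):
    """Probability of a correct response under the 3PLM, as in (11)."""
    # 1 / (1 + exp(-x)) is the same quantity as exp(x) / (1 + exp(x)).
    logistic = 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))
    return c + (1.0 - c) * logistic


# Assumed values: not taken from the figure, chosen only for illustration.
ALPHA, BETA = 1.0, 0.0


def tabulate(thetas=(-4, -2, 0, 2, 4)):
    """Return (theta, P for item i, P for item j) for each ability value."""
    rows = []
    for theta in thetas:
        p_i = irf_3pl(theta, ALPHA, BETA, c=0.0)   # item i: no guessing
        p_j = irf_3pl(theta, ALPHA, BETA, c=0.25)  # item j: guessing 0.25
        rows.append((theta, p_i, p_j))
    return rows


if __name__ == "__main__":
    for theta, p_i, p_j in tabulate():
        print(f"theta={theta:+d}  item i: {p_i:.3f}  item j: {p_j:.3f}")
```

As the ability decreases, the curve for item i approaches zero while the curve for item j levels off at its guessing parameter; as the ability increases, both curves approach one.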

Notice that in this model, the difficulty parameter no longer has the elegant interpretation as the ability that grants a 50 per cent probability of a correct response. If the right-hand side of (11) is evaluated at the point $\theta = \beta_i$, the result is $(1 + c_i)/2$, which, in the figure, yields 0.5 for item i and 0.625 for item j.
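The substitution can be written out as a short worked step. It assumes the form of (11) in which the exponent is $\alpha_i(\theta - \beta_i)$, so that the exponent vanishes when the ability equals the difficulty:

```latex
P(X_i = 1 \mid \theta = \beta_i)
  = c_i + (1 - c_i)\,\frac{\exp(0)}{1 + \exp(0)}
  = c_i + \frac{1 - c_i}{2}
  = \frac{1 + c_i}{2}
```

With $c_i = 0$ this gives $1/2$, and with $c_j = 0.25$ it gives $1.25/2 = 0.625$, the two values read from the figure.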

When it comes to a choice between the Rasch model, the 2PLM or the 3PLM, the problem seems to be trivial: as the last of these is the most general, it will (by definition) fit the data at least as well as the other two. Along this line of reasoning, it has even been proposed to use the so-called four-parameter logistic model, which adds to the three item parameters of the 3PLM a further parameter that shifts the upper asymptote away from one. The rationale for this parameter is to account for careless errors: cases where the correct answer is almost certainly ‘known’ but for some reason (carelessness, for example) is not written down. However, the unbridled growth in model complexity through adding more and more parameters has its price, in at least two respects:

Figure 8.4 Two items in the 3PLM with equal discrimination and difficulty

• Commonly the parameters are estimated from a single data set, which consists of just a table filled with ones and zeros. Adding parameters to the model means adding more sources of uncertainty (about their ‘true’ values), but the amount of information available to resolve this uncertainty remains the same. The inevitable consequence is that the standard errors of the estimates increase as the number of parameters increases and, even worse, that the correlations between (some) parameter estimates tend to become very high (in absolute value). This is the case, for example, in the 3PLM, for the guessing parameter and the difficulty parameter of the same item: their estimates usually show a high negative correlation, suggesting a trade-off between guessing and difficulty.

• The second aspect bears more on the construct validity of the model or, formulated more accurately, on the inferences one can make from the model. Here is an example. Suppose the 3PLM is applied to a test of 100 items, and the estimates of the guessing parameters are all close to 0.25. If some student has answered about one quarter of the items correctly, one might be tempted to say that this student has really guessed on all items. However, there is no direct evidence of this; nobody has ‘seen’ this student guessing, and maybe the student knew the answer to 25 of the 100 items, guessed (incorrectly) on some others and had a misconception about the remaining ones, all leading to an incorrect answer. If one sticks to the simple table of ones and zeros as the only observation to be analysed, the processes that led to these answers are caught in a black box, and there is no evidence beyond the match of the model to these data to make further inferences. But, strictly speaking, the model is nothing more than a formal description of the data in statistical terms, and one should not overplay one’s hand in drawing substantive conclusions from such a description; much more convincing evidence would be obtained by interviewing the students on how they came to their answers.
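For reference, the four-parameter logistic model mentioned before these two caveats can be written in the same notation as (11). The following is a sketch only: the text does not name the extra parameter, so the symbol $d_i$ for the upper asymptote is an assumption introduced here.

```latex
P(X_i = 1 \mid \theta)
  = c_i + (d_i - c_i)\,
    \frac{\exp[\alpha_i(\theta - \beta_i)]}{1 + \exp[\alpha_i(\theta - \beta_i)]},
\qquad 0 \le c_i < d_i \le 1
```

Setting $d_i = 1$ recovers the 3PLM of (11), and setting $c_i = 0$ as well recovers the 2PLM, which is why each model in this sequence fits the data at least as well as the one it generalises.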