The rating formulation model and the binomial model

F. Cristante, E. Robusto / Mathematical Social Sciences 38 (1999) 259–274

dependence, interpreted as a sort of implication of one another's responses. In other words, if there are two persons, for instance, and one person gives a positive answer to a question, his or her answer might imply a positive answer from the other person as well. However, implication does not always mean equal answers: a positive answer from one person might also tend to imply a negative answer from the other. According to this perspective, the responses of members of well-defined subgroups belonging to a larger group are likely to be dependent, even if independence of the responses of the members of the larger group has previously been demonstrated. That is, independence of persons' responses might fail if such persons are analyzed within well-defined subgroups.

The model presented here, called the Response Dependence of Subjects Model (RDSM), belongs to the family of Rasch models (Fisher and Molenaar, 1995). It is set within the context of the properties of the Rating Formulation for ordered response categories by Andrich (1978), sharing the same algebraic form. Before proceeding to the presentation of the RDSM, some principal aspects of the Rating Formulation Model and the Binomial Model are reviewed in the next section.

2. The rating formulation model and the binomial model

2.1. The Rating Formulation Model

According to Andrich (1985), the general rating model can be derived by making the following assumptions and definitions:

• in a Likert-type questionnaire the rating mechanism is based on a continuum with $m + 1$ successive categories separated by thresholds $\tau_c$, $c = 1, 2, \ldots, m$;

• at each threshold a separate dichotomous response is given;

• the Simple Logistic Model by Rasch is applied at each threshold $\tau_c$, giving

\[
P\{Y_c = y \mid \beta, \delta, \tau_c\} = \frac{\exp\{y[-\tau_c + (\beta - \delta)]\}}{\gamma_c}, \tag{1}
\]

where $\gamma_c$ is a normalizing factor; $Y_c$ is a Bernoulli random variable at threshold $c$ that takes the value $y = 1$ if the threshold is exceeded and $y = 0$ otherwise; $\beta - \delta$ corresponds to the combination of the person parameter $\beta$ and the item parameter $\delta$; the parameters $\beta$ and $\delta$ represent the locations of the person and the item respectively on the latent trait;

• from the set of all $2^m$ possible outcomes the acceptable response patterns are restricted to the $m + 1$ patterns forming a Guttman (Guttman, 1954) structure, that is $(0, 0, \ldots, 0)$, $(1, 0, \ldots, 0)$, $(1, 1, \ldots, 0)$, $\ldots$, $(1, 1, \ldots, 1)$, where each pattern indicates the successive thresholds exceeded;

• the random variable $X$ is introduced, which takes the value 0 if the response is in the first category (no threshold exceeded), 1 for the second category (first threshold exceeded), 2 for the third category (first and second thresholds exceeded), and so on until the last category, for which it takes the value $m$.

On the basis of the above definitions, the Rating Formulation Model takes the form

\[
P\{X = x \mid \beta, \delta, \boldsymbol{\kappa}, m\} = \frac{\exp[\kappa_x + x(\beta - \delta)]}{\gamma}, \tag{2}
\]

where $\gamma$ is the normalizing factor and

\[
\kappa_x = -\sum_{c=1}^{x} \tau_c, \quad \text{with } \kappa_0 = \kappa_m = 0, \quad x = 1, 2, \ldots, m.
\]

The category coefficients $\kappa_x$ are expressed in terms of the $m$ thresholds $\tau_1, \tau_2, \ldots, \tau_m$ on the continuum. From the definition of $\kappa_x$ and $\kappa_m$, it follows that $\sum_{c=1}^{m} \tau_c = 0$, the sum being taken over all thresholds.
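
As a numerical illustration of Eq. (2), the following minimal Python sketch computes the category probabilities from a set of thresholds and checks that the same values are obtained by applying a dichotomous Rasch response at each threshold and keeping only the Guttman patterns. The function names and the parameter values are hypothetical choices made for this illustration; they do not come from the paper.

```python
import math


def category_coefficients(thresholds):
    """Return [kappa_0, ..., kappa_m], where kappa_x = -(tau_1 + ... + tau_x)."""
    kappas = [0.0]
    for tau in thresholds:
        kappas.append(kappas[-1] - tau)
    return kappas


def rating_probabilities(beta, delta, thresholds):
    """Category probabilities P{X = x} of Eq. (2) for x = 0, ..., m."""
    kappas = category_coefficients(thresholds)
    weights = [math.exp(kappa + x * (beta - delta)) for x, kappa in enumerate(kappas)]
    gamma = sum(weights)  # normalizing factor
    return [w / gamma for w in weights]


def guttman_probabilities(beta, delta, thresholds):
    """Same probabilities, built from one dichotomous Rasch response per threshold.

    Only the m + 1 Guttman patterns (the first x thresholds exceeded) are kept,
    and their probabilities are renormalized.
    """
    exceed = [1.0 / (1.0 + math.exp(-(beta - delta - tau))) for tau in thresholds]
    pattern_probs = []
    for x in range(len(thresholds) + 1):
        p = 1.0
        for c, p_exceed in enumerate(exceed):
            p *= p_exceed if c < x else 1.0 - p_exceed
        pattern_probs.append(p)
    total = sum(pattern_probs)
    return [p / total for p in pattern_probs]


# Hypothetical illustration values (not taken from the paper).
example_thresholds = [-1.0, 0.0, 1.0]  # these sum to zero, so kappa_m = 0
example_probs = rating_probabilities(beta=0.5, delta=0.0, thresholds=example_thresholds)
print([round(p, 3) for p in example_probs])
```

Both routes give the same distribution because, once the patterns are restricted to the Guttman structure, the denominators of the threshold-level probabilities cancel in the renormalization.
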
In the case that the thresholds are equidistant, $\tau_{x+1} - \tau_x = \tau_x - \tau_{x-1} = \lambda' > 0$. With this assumption, the coefficients $\kappa_x$ take the following simple structure

\[
\kappa_x = \tfrac{1}{2}\, x(m - x)\lambda' = x(m - x)\lambda,
\]

where $\lambda' = 2\lambda$, giving the model

\[
P\{X = x \mid \beta, \delta, \lambda, m\} = \frac{\exp[x(m - x)\lambda + x(\beta - \delta)]}{\gamma}. \tag{3}
\]

That $\kappa_x = x(m - x)\lambda = -\sum_{c=1}^{x} \tau_c$ can be easily demonstrated with any value of $m$. For instance, consider $m = 4$, in which case there are 5 ordered categories. Then from $\kappa_x = x(m - x)\lambda$, $\kappa_0 = 0$, $\kappa_1 = 3\lambda$, $\kappa_2 = 4\lambda$, $\kappa_3 = 3\lambda$ and $\kappa_4 = 0$. Further, since $\kappa_x = -\sum_{c=1}^{x} \tau_c = -(\tau_1 + \cdots + \tau_x)$, $\kappa_x = \kappa_{x-1} - \tau_x$, so that $\tau_x = \kappa_{x-1} - \kappa_x$. Then, for $m = 4$, $\tau_1 = -3\lambda$, $\tau_2 = -\lambda$, $\tau_3 = \lambda$, $\tau_4 = 3\lambda$, in which case the successive distances between thresholds are, as required, given by $\tau_2 - \tau_1 = \tau_3 - \tau_2 = \tau_4 - \tau_3 = 2\lambda$.

2.2. The Binomial Model

The general form of the Binomial Model is

\[
P\{X = x \mid \pi, m\} = \binom{m}{x} \pi^{x} (1 - \pi)^{m - x}, \tag{4}
\]

where 0 and $m$ are the minimum and maximum values for $X$ and $\pi$ is the parameter of the probability distribution. Defining the probability parameter $\pi$ by

\[
\pi = \frac{\exp(\beta - \delta)}{1 + \exp(\beta - \delta)} \tag{5}
\]

and entering (5) into (4) transforms the model into the form

\[
\begin{aligned}
P\{X = x \mid \beta, \delta, m\}
&= \binom{m}{x} \left[\frac{\exp(\beta - \delta)}{1 + \exp(\beta - \delta)}\right]^{x} \left[1 - \frac{\exp(\beta - \delta)}{1 + \exp(\beta - \delta)}\right]^{m - x} \\
&= \binom{m}{x} \frac{[\exp(\beta - \delta)]^{x}}{[1 + \exp(\beta - \delta)]^{m}}
= \frac{\exp\left[\ln\binom{m}{x} + x(\beta - \delta)\right]}{[1 + \exp(\beta - \delta)]^{m}}.
\end{aligned} \tag{6}
\]

Table 1
Category coefficients and threshold values for $m = 2$, $m = 3$ and $m = 4$

| $m$ | 2 | 2 | 3 | 3 | 3 | 4 | 4 | 4 | 4 |
|---|---|---|---|---|---|---|---|---|---|
| Variable $X$ | 1 | 2 | 1 | 2 | 3 | 1 | 2 | 3 | 4 |
| $\ln\binom{m}{x} = \kappa_x$ | 0.69 | 0 | 1.10 | 1.10 | 0 | 1.39 | 1.79 | 1.39 | 0 |
| $\tau_x = \kappa_{x-1} - \kappa_x$ | −0.69 | 0.69 | −1.10 | 0 | 1.10 | −1.39 | −0.41 | 0.41 | 1.39 |

In Eq. (6), $\beta - \delta$ corresponds to the combination of the person parameter $\beta$ and the item parameter $\delta$; $m$ is the maximum score of an item, which therefore has $m + 1$ response categories.
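
The relations above can be checked numerically. The following Python sketch (function names and numerical inputs are hypothetical, chosen only for illustration) recomputes the category coefficients and thresholds of Table 1 from the binomial coefficients, reproduces the equidistant thresholds of the $m = 4$ example, and compares Eq. (2) with $\kappa_x = \ln\binom{m}{x}$ against the binomial probabilities of Eqs. (4) and (5).

```python
import math


def binomial_coefficients_log(m):
    """kappa_x = ln C(m, x) for x = 0, ..., m (the binomial case)."""
    return [math.log(math.comb(m, x)) for x in range(m + 1)]


def equidistant_coefficients(m, lam):
    """kappa_x = x (m - x) lambda for x = 0, ..., m (equidistant thresholds)."""
    return [x * (m - x) * lam for x in range(m + 1)]


def thresholds_from_coefficients(kappas):
    """tau_x = kappa_{x-1} - kappa_x for x = 1, ..., m."""
    return [kappas[x - 1] - kappas[x] for x in range(1, len(kappas))]


def rating_probabilities(beta, delta, kappas):
    """Eq. (2): exp[kappa_x + x (beta - delta)] / gamma."""
    weights = [math.exp(k + x * (beta - delta)) for x, k in enumerate(kappas)]
    gamma = sum(weights)  # normalizing factor
    return [w / gamma for w in weights]


def binomial_probabilities(beta, delta, m):
    """Eqs. (4)-(5): binomial probabilities with a logistic pi."""
    pi = math.exp(beta - delta) / (1.0 + math.exp(beta - delta))
    return [math.comb(m, x) * pi**x * (1.0 - pi) ** (m - x) for x in range(m + 1)]


# Category coefficients (x = 1, ..., m) and thresholds for the values of m in Table 1.
for m in (2, 3, 4):
    kappas = binomial_coefficients_log(m)
    taus = thresholds_from_coefficients(kappas)
    print(m, [round(k, 2) for k in kappas[1:]], [round(t, 2) for t in taus])

# Equidistant case for m = 4 with a hypothetical lambda = 0.5:
# the thresholds are -3*lambda, -lambda, lambda, 3*lambda.
print(thresholds_from_coefficients(equidistant_coefficients(4, 0.5)))
```

The comparison works because the last form of Eq. (6) has the same shape as Eq. (2), with $\ln\binom{m}{x}$ in the place of $\kappa_x$ and $[1 + \exp(\beta - \delta)]^m$ as the normalizing factor.
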
A characteristic of the Binomial Model is that it is formed from dichotomous Bernoulli responses satisfying independence and having the same probability parameter $\pi$. These properties permit the model to provide a frame of reference for interpreting dependence.

2.3. The dependence parameter $\lambda$

If $\kappa_x = \ln\binom{m}{x}$, the Rating Formulation Model of Eq. (2) is equal to the Binomial Model of Eq. (6). Thus, when $\kappa_x = \ln\binom{m}{x}$ in the Rating Formulation Model, the response categories of an item can be interpreted as being equivalent to those formed from independent Bernoulli responses with the same probability parameter. On the other hand, as shown in Andrich (1985), if the estimates of the thresholds are closer together than the values that correspond to the binomial coefficients, then the responses can be considered dependent.

Table 1 shows the values of the thresholds for binomial coefficients for $m = 2$, $m = 3$ and $m = 4$. For $m = 2$ and $m = 3$ the thresholds are equidistant. Since the distance between successive thresholds in Eq. (3) is $2\lambda$, half the distance between thresholds can be considered the limiting value for dependence in these cases: 0.69 and 0.55 respectively. For $m \geq 4$ the distances between thresholds are not equal; therefore the limiting value for dependence can be defined, adopting a prudential criterion, as half the shortest distance between thresholds, that is 0.41 for $m = 4$. Now, since in the model of Eq. (3) the value of $\lambda$ is estimated, if this value is less than the limiting value, it can be inferred that dependence is present. In Table 2 the limiting values $\lambda_l$ for different $m$ are given.

Table 2
Critical values of $\lambda_l$ for different $m$

| $m$ | $\lambda_l$ |
|---|---|
| 2 | 0.69 |
| 3 | 0.55 |
| 4 | 0.41 |
| 5 | 0.35 |
| 6 | 0.29 |
| 7 | 0.25 |
| 8 | 0.22 |
| 9 | 0.20 |
| 10 | 0.18 |

As Andrich (1985, 1982) points out, it is most interesting to notice that the coefficient $x(m - x)$ that multiplies the parameter $\lambda$ in $\kappa_x$ is quadratic and symmetric in $x$; thus $\lambda$ is scored quadratically by a function of successive integers.
In this way, the parameter $\lambda$ can also be considered as an index of dispersion of the responses. Another important feature of $\lambda$ is that although it is defined initially by $\tau_{x+1} - \tau_x = 2\lambda > 0$, it is possible to interpret both a value of zero and a negative value for $\lambda$. As $\lambda$ is a coefficient of a quadratic function, it characterizes the curvature of the exponent of Eq. (3) and therefore also the curvature of the distribution of this equation. If $\lambda > 0$ the distribution is unimodal, if $\lambda = 0$ the distribution is uniform and if $\lambda < 0$ the distribution is U-shaped.

2.4. Estimation procedure

A method presented by Andrich et al. (1982) is considered here. The person parameter is conditioned out so that the item parameter $\delta$ and the dependence parameter $\lambda$ are estimated simultaneously, but independently of the unknown person parameter $\beta$. The person parameter is then estimated unconditionally, taking as known the estimates of the parameters $\delta$ and $\lambda$. The estimation procedure involves considering items in pairs and is a generalization of a method for dichotomous items (Andrich, 1988). The set of simple sufficient statistics is $s = \sum_{v=1}^{N} x$ for the item location parameter, where the summation is over $N$ persons, $v$ is any person and $x = 0, 1, \ldots, m$; $t = \sum_{v=1}^{N} x(m - x)$ for the item dependence parameter, where the summation is also over $N$ persons; and $r = \sum_{i=1}^{k} x$ for the person location parameter, where the summation is over $k$ items and $i$ is any item. The values $s$ and $t$ are the jointly sufficient statistics for the parameters $\delta$ and $\lambda$ respectively.

2.5. Goodness of fit

As far as goodness of fit is concerned for the item and person parameters $\delta$ and $\beta$, the item–person interaction procedure is taken into consideration (Andrich, 1988). Given the model of Eq. (3) and the estimates of the parameters $\delta$, $\beta$ and $\lambda$, in this procedure the probability of any outcome is predicted by inserting these values in the equation.
Tests of fit for specific persons and items are then obtained by summing and transforming standardized residuals across items and persons respectively.
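
Returning to the limiting values of Section 2.3 and to the role of the sign of $\lambda$, both can be checked numerically. The following Python sketch is only an illustration: the function names, the decision helper and the $\lambda$ values are hypothetical, and $\beta = \delta$ is assumed in the second part so that only the $\lambda$ term shapes the distribution of Eq. (3).

```python
import math


def limiting_lambda(m):
    """Half the shortest distance between the binomial thresholds (needs m >= 2)."""
    kappas = [math.log(math.comb(m, x)) for x in range(m + 1)]
    taus = [kappas[x - 1] - kappas[x] for x in range(1, m + 1)]
    gaps = [later - earlier for earlier, later in zip(taus, taus[1:])]
    return min(gaps) / 2.0


def equidistant_rating_probabilities(beta, delta, lam, m):
    """Eq. (3): exp[x (m - x) lambda + x (beta - delta)] / gamma."""
    weights = [math.exp(x * (m - x) * lam + x * (beta - delta)) for x in range(m + 1)]
    gamma = sum(weights)  # normalizing factor
    return [w / gamma for w in weights]


def shows_dependence(lambda_estimate, m):
    """Rule of Section 2.3: an estimate below the limiting value suggests dependence."""
    return lambda_estimate < limiting_lambda(m)


# Limiting values for m = 2, ..., 10, to be compared with Table 2.
for m in range(2, 11):
    print(m, round(limiting_lambda(m), 3))

# Shape of Eq. (3) for m = 4 with beta = delta and hypothetical lambda values:
# positive lambda peaks in the middle, zero is flat, negative piles up at the ends.
for lam in (0.5, 0.0, -0.5):
    probs = equidistant_rating_probabilities(0.0, 0.0, lam, 4)
    print(lam, [round(p, 3) for p in probs])
```

The computed limits agree with Table 2 to within rounding in the second decimal place.
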

3. Response dependence of subjects model (RDSM)