

3.2.3.6 Reliability

Another necessary characteristic of any good test is reliability. A test is reliable when, as an instrument, it is stable or consistent. Reliability is defined as the extent to which the results can be considered consistent or stable (Brown, 1998:98). In other words, the reliability of a test refers to the consistency with which it yields the same rank for an individual who takes the test several times. To measure reliability, the writer used the Kuder-Richardson formula 20 (KR-20):

$$ r_{11} = \left( \frac{n}{n-1} \right) \left( \frac{s^2 - \sum pq}{s^2} \right) $$

where:
$r_{11}$ = the reliability of the test;
$n$ = the number of items;
$p$ = the proportion of subjects answering the item correctly;
$q$ = the proportion of subjects answering the item incorrectly ($q = 1 - p$);
$s^2$ = the variance of the total test scores ($s$ is the standard deviation).

The criterion of the computation is:
0.00 - 0.20 is very low;
0.21 - 0.40 is low;
0.41 - 0.70 is medium;
0.71 - 1.00 is very high.
(Arikunto, 2002:208)

The reliability computation is as follows.

$$ \sum pq = pq_1 + pq_2 + pq_3 + \dots + pq_{40} = 0.222 + 0.240 + 0.222 + \dots + 0.196 = 9.1822 $$

$$ s^2 = \frac{14584 - \frac{618^2}{30}}{30} = 61.773 $$

$$ r_{11} = \left( \frac{40}{40-1} \right) \left( \frac{61.773 - 9.182}{61.773} \right) = 0.873 $$

For $\alpha$ = 5% and 30 subjects, $r_{table}$ = 0.361. Because $r_{11}$ = 0.873 is higher than $r_{table}$, the test is reliable.
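The KR-20 computation described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions, not the writer's actual procedure: the function names `kr20` and `kr20_from_summary` are hypothetical, the variance divides by the number of subjects (as the s² computation in the text does), and n = 40 is assumed from the sum running over pq 1 to pq 40.

```python
def kr20_from_summary(n_items, sum_pq, variance):
    """KR-20 from summary values: (n / (n - 1)) * ((s^2 - sum(pq)) / s^2)."""
    return (n_items / (n_items - 1)) * ((variance - sum_pq) / variance)


def kr20(responses):
    """KR-20 from raw data.

    responses: one row per subject, each row a list of 0/1 item scores.
    """
    n_subjects = len(responses)
    n_items = len(responses[0])

    # Sum of p * q over all items, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_subjects
        sum_pq += p * (1 - p)

    # Variance of the total scores, dividing by the number of subjects.
    totals = [sum(row) for row in responses]
    mean = sum(totals) / n_subjects
    variance = sum((t - mean) ** 2 for t in totals) / n_subjects

    return kr20_from_summary(n_items, sum_pq, variance)


# Figures quoted in the text: sum of squared scores 14584, sum of scores 618,
# 30 subjects, and sum(pq) = 9.1822. n_items = 40 is an assumption.
variance = (14584 - 618 ** 2 / 30) / 30
r11 = kr20_from_summary(40, 9.1822, variance)
print(round(variance, 3), round(r11, 3))
```

With the figures from the text, the sketch gives a variance of about 61.773 and a coefficient of about 0.873, which exceeds the r table value of 0.361.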

3.2.3.7 Item Difficulties

A test item is considered a good item of evaluation if it is neither too easy nor too difficult. Students will not be eager to do their best if the test is too easy. On the contrary, they will become discouraged if they cannot answer the test correctly because the items are too difficult. To determine the level of item difficulty in this research, the writer used the following formula:

$$ ID = \frac{RU + RL}{T} $$

where:
ID : the index of difficulty level;
RU : the number of students in the upper group who answered correctly;
RL : the number of students in the lower group who answered correctly;
T : the total number of students in both groups.
(Gronlund, 1982: 102-103)

The criterion of the computation, with P denoting the index of difficulty, is:
P = 0.00 - 0.30 → difficult;
P = 0.30 - 0.70 → medium;
P = 0.70 - 1.00 → easy.
(Arikunto, 2002:210)

Here is an example of the item difficulty computation for item number one. A score of 1 marks a correct answer and 0 an incorrect one.

Table 4. The Example of Item Difficulty Computation

        Upper group                  Lower group
No    Code    Score            No    Code    Score
1     S-06    1                1     S-22    1
2     S-09    1                2     S-27    1
3     S-05    1                3     S-13    1
4     S-12    1                4     S-23    0
5     S-07    1                5     S-08    0
6     S-03    1                6     S-28    1
7     S-14    0                7     S-21    1
8     S-16    1                8     S-02    1
9     S-15    1                9     S-19    1
10    S-10    0                10    S-26    1
11    S-04    1                11    S-24    0
12    S-01    1                12    S-25    0
13    S-17    1                13    S-29    0
14    S-30    1                14    S-11    0
15    S-20    0                15    S-18    0
Σ             12               Σ             8

$$ ID = \frac{12 + 8}{30} = 0.67 $$

According to the criteria, item number one is of medium difficulty.
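The item difficulty computation for item number one can be reproduced with a short Python sketch. The function names `item_difficulty` and `classify_difficulty` are hypothetical, blank scores in Table 4 are read as 0 (incorrect), and because the published ranges share the boundary values 0.30 and 0.70, assigning both boundaries to "medium" is an assumption made here.

```python
def item_difficulty(upper_scores, lower_scores):
    """ID = (RU + RL) / T for one item, given 0/1 scores per group."""
    ru = sum(upper_scores)  # correct answers in the upper group
    rl = sum(lower_scores)  # correct answers in the lower group
    t = len(upper_scores) + len(lower_scores)  # students in both groups
    return (ru + rl) / t


def classify_difficulty(index):
    """Map an index to a label; boundary handling is an assumption."""
    if index < 0.30:
        return "difficult"
    if index <= 0.70:
        return "medium"
    return "easy"


# Scores for item number one from Table 4, in table order (blank read as 0).
upper = [1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0]
lower = [1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

index = item_difficulty(upper, lower)
print(round(index, 2), classify_difficulty(index))
```

For the Table 4 data this gives RU = 12, RL = 8, and T = 30, so the index is about 0.67 and falls in the medium range.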

3.2.3.8 Item Discrimination