3.6.3 Difficulty Level
Arikunto (2013:210) explains that a good test is one that is neither too easy nor too difficult for students. After administering and scoring the try-out
test, an item analysis was carried out to evaluate the effectiveness of the items. Its aim was to check whether each item met the requirements of a good test item.
Heaton (1975:172) states that all items should be examined from the point of view of their difficulty level and their level of discrimination.
Very easy items help to build in some affective feelings of “success” among
lower-ability students and serve as warm-up items, while very difficult items can provide a challenge to the highest-ability students (Brown, 2004:59).
The index of difficulty of an item simply shows how easy or difficult the particular item proved in the test. Arikunto (2013:222) explains that a test that is too easy
does not stimulate students' learning, while a test that is too difficult can discourage students because it is beyond their reach.
In measuring the difficulty level of essay, oral, or short-answer items, I used a different formula, as given below:
(Zulaiha, 2008:34)
Where:
P = item difficulty
The criteria of difficulty level, based on Arikunto (2013:225), are as follows:
Interval             Criteria
0.00 < P ≤ 0.30      Difficult
0.30 < P ≤ 0.70      Medium
0.70 < P ≤ 1.00      Easy
Table 3.5 Criteria of Difficulty Level
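The criteria in Table 3.5 can be sketched as a small Python function. This is only an illustrative sketch: the names `difficulty_index` and `classify_difficulty` are hypothetical, and the index is assumed here to be the mean item score divided by the total maximum score, an assumption that agrees with the values reported for raters 1 and 3 (0.36 and 0.37). Treating P = 0.00 as "Difficult" is likewise an assumption.

```python
def difficulty_index(mean_score: float, max_score: float) -> float:
    """Assumed formula (not confirmed by the source): P = mean / maximum score."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return mean_score / max_score


def classify_difficulty(p: float) -> str:
    """Map a difficulty index P to the criteria of Table 3.5."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("P must lie between 0 and 1")
    if p <= 0.30:   # P = 0.00 is treated as Difficult (assumption)
        return "Difficult"
    if p <= 0.70:
        return "Medium"
    return "Easy"


# Rater 1: mean 9.06 out of a total maximum score of 25.
p_rater1 = round(difficulty_index(9.06, 25), 2)
print(p_rater1, classify_difficulty(p_rater1))  # 0.36 Medium
```

Applied to the reported index of rater 2, `classify_difficulty(0.28)` returns "Difficult".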
The results of the difficulty level for the three raters:
Table 3.6 The Difference Score between Three Raters
                  Rater 1   Rater 2   Rater 3
Max Score         12        12        13
Min Score         6         5         6
Total Max Score   25        25        25
Mean              9.06      7.972     9.139
Rater 1
The difficulty index for the item based on rater 1 (0.36) fell in the medium category.
Rater 2
The difficulty index for the item based on rater 2 (0.28) fell in the difficult category.
Rater 3
The difficulty index for the item based on rater 3 (0.37) fell in the medium category.
After the calculation, the difficulty index of the item was 0.36 for rater 1, 0.28 for rater 2, and 0.37 for rater 3. According to the criteria, the item was medium for rater 1 and rater 3
and difficult for rater 2. From the computation of item difficulty, the ratio of medium to difficult ratings was 2:1. Consequently, it
could be concluded that the item was of medium difficulty, which means that the test can be used as a good instrument of evaluation. The indices of the other items can be seen in
Appendix 11.
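The step from the three rater categories to a single conclusion can be illustrated with a short Python sketch. It assumes that the 2:1 ratio was resolved by a simple majority across raters, which the text implies but does not state as a rule; the function name `overall_category` is hypothetical.

```python
from collections import Counter


def overall_category(categories: list[str]) -> str:
    """Return the category assigned by the majority of raters (assumed rule)."""
    if not categories:
        raise ValueError("at least one rater category is required")
    ranked = Counter(categories).most_common()
    # A tie between the two most frequent categories has no majority.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        raise ValueError("no single majority category")
    return ranked[0][0]


# Categories reported for the item by rater 1, rater 2 and rater 3.
ratings = ["Medium", "Difficult", "Medium"]
print(overall_category(ratings))  # Medium
```

With an odd number of raters and only two competing categories, a tie cannot occur, so the explicit tie check matters only for other rater configurations.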
3.6.4 Discriminating Power