According to Kathleen M. Bailey, difficulty level is “an index of how easy an individual item was for the people who took it. I.F. is a number, typically printed as a
decimal, ranging from 0.0 to 1.0. It represents the proportion of people who got the item right out of all the people who took the test.”
37
The formula for the difficulty level of each item in a large group is stated below:
38
\[ FV = \frac{R}{N} \quad \text{or} \quad FV = \frac{\text{Correct U} + \text{Correct L}}{2n} \]
In which:
FV : The index of difficulty
R : The number of correct answers
N : The number of students taking the test
U : Upper half
L : Lower half
n : Number of candidates in one group
For short-answer items, Zainal Arifin gives a separate formula for calculating the difficulty level.
39
37
Kathleen M. Bailey, Learning about Language Assessment: Dilemmas, Decisions, and Directions, New York: Heinle & Heinle Publishers, 1998, p. 132.
38
J.B. Heaton, loc. cit.
39
Zainal Arifin, loc. cit.
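As an illustration only, the two forms of the facility value can be sketched in Python. The sketch assumes the formulas FV = R / N and FV = (Correct U + Correct L) / 2n; the function names and the sample counts are hypothetical and do not come from the test data of this study.

```python
def facility_value(correct, total):
    """Facility value from the whole group: FV = R / N."""
    if total <= 0:
        raise ValueError("total must be a positive number of test takers")
    return correct / total


def facility_value_from_halves(correct_upper, correct_lower, group_size):
    """Facility value from the two halves: FV = (Correct U + Correct L) / 2n."""
    if group_size <= 0:
        raise ValueError("group_size must be a positive number of candidates")
    return (correct_upper + correct_lower) / (2 * group_size)


# Hypothetical item: 40 students took the test and 26 answered correctly.
print(facility_value(26, 40))

# The same item split into halves of 20 students each:
# 16 correct answers in the upper half, 10 in the lower half.
print(facility_value_from_halves(16, 10, 20))
```

Both calls describe the same hypothetical item, so they give the same index; the second form is convenient when the answer sheets have already been sorted into upper and lower halves.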
Next, the writer tried to find out the difficulty level of all items in the English Summative Test for the first grade of SMPN 3 Tangerang Selatan in the odd semester of the 2013/2014 academic year by using the following formula:
40
\[ P = \frac{\sum B}{N} \]
In which:
P : The difficulty level of all items
B : The difficulty level of each item
∑ : Sigma (total)
N : The total number of test items
The scale of the difficulty level of all test items ranges from 0.00 to 1.00. It can be interpreted using the rank scale of difficulty level, as follows:
Table 2.1 The rank scale of Level of Difficulty
41
P               Interpretation
< 0.30          Difficult
0.30 – 0.70     Moderate
> 0.70          Easy
40
Zainal Arifin, loc. cit., p. 272.
The rank scale above shows how easy or difficult a test item is, so the teacher will know the difficulty level of each test item in the summative test.
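A minimal sketch of this step is given below. It assumes that the overall difficulty level is the mean of the item indices (the total of the item indices divided by the number of items) and that the rank scale uses the cut-offs below 0.30 for difficult, 0.30 to 0.70 for moderate, and above 0.70 for easy. The function names and the four item indices are hypothetical.

```python
def overall_difficulty(item_indices):
    """Overall difficulty: total of the item indices divided by the number of items."""
    if not item_indices:
        raise ValueError("at least one item index is required")
    return sum(item_indices) / len(item_indices)


def classify_difficulty(p):
    """Interpret a difficulty index with the assumed rank scale."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a difficulty index must lie between 0.00 and 1.00")
    if p < 0.30:
        return "Difficult"
    if p <= 0.70:
        return "Moderate"
    return "Easy"


# Hypothetical difficulty indices of four items.
item_indices = [0.25, 0.50, 0.65, 0.80]

for index in item_indices:
    print(index, classify_difficulty(index))

p = overall_difficulty(item_indices)
print(round(p, 2), classify_difficulty(p))
```

The same classification function serves both a single item and the test as a whole, because both indices lie on the same 0.00 to 1.00 scale.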
b. Discriminating Power
According to J. Stanley Ahmann and Marvin D. Glock, “the discriminating power of a test item is its ability to differentiate between pupils who have achieved well (the upper group) and those who have achieved poorly (the lower group).”
42
Normally, the upper group will be able to answer the question better than the lower group. The index of discriminating power is expressed as a proportion: if the proportion is high, the test item is considered a good item because it can discriminate between the upper and lower groups.
As stated by Kenneth D. Hopkins, items that yield a discrimination index of .30 or more are relatively good in distinguishing between knowledgeable and less knowledgeable examinees.
43
The maximum size of the index is +1.00 and the minimum size is –1.00. Any negative value means that the test item discriminates, to some degree, in the wrong direction. Hence, the discriminating power of the test item is unsatisfactory. Positive values show that the test item discriminates in the desired direction, even though it may not be completely satisfactory.
44
In short, a good test item is one that is answered correctly by the upper group and incorrectly by the lower group.
41
Suharsimi Arikunto, Dasar-dasar Evaluasi Pendidikan, Jakarta: PT. Bumi Aksara, 2006, p. 208; op. cit., p. 210.
42
J. Stanley Ahmann and Marvin D. Glock, op. cit., p. 187.
43
Kenneth D. Hopkins, op. cit., p. 257.
To analyze the index of discriminating power of an item, Wilmar Tinambunan states the formula as follows:
45
\[ D = \frac{U - L}{N} \]
In which:
D : The index of item discriminating power
U : The number of pupils in the upper group who answered the item correctly
L : The number of pupils in the lower group who answered the item correctly
N : Number of pupils in each of the groups
Then, to calculate the discriminating power of a short-answer item, the following formula can be used:
46
\[ T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sum x_1^2 + \sum x_2^2}{n(n-1)}}} \]
In which:
\bar{X}_1 : The mean of upper group
\bar{X}_2 : The mean of lower group
\sum x_1^2 : The total quadrate of individual deviation from upper group
\sum x_2^2 : The total quadrate of individual deviation from lower group
n : 27% × N (upper and lower group)
44
J. Stanley Ahmann and Marvin D. Glock, op. cit., p. 189.
45
Wilmar Tinambunan, op. cit., p. 139.
46
Zainal Arifin, op. cit., p. 278.
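The two calculations can be sketched in Python as follows. This is only an illustration under stated assumptions: the index for objective items is taken as D = (U - L) / N, and the statistic for short-answer items is assumed to have the form T = (mean of upper group - mean of lower group) divided by the square root of (sum of squared deviations of both groups / (n(n - 1))), with both groups of equal size n. The function names and the sample scores are hypothetical.

```python
import math


def discrimination_index(upper_correct, lower_correct, group_size):
    """D = (U - L) / N, where N is the number of pupils in each group."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (upper_correct - lower_correct) / group_size


def t_statistic(upper_scores, lower_scores):
    """Assumed short-answer statistic for two groups of equal size n."""
    n = len(upper_scores)
    if n != len(lower_scores) or n < 2:
        raise ValueError("both groups need the same size, with at least 2 pupils")
    mean_upper = sum(upper_scores) / n
    mean_lower = sum(lower_scores) / n
    # Totals of squared individual deviations from each group mean.
    ss_upper = sum((x - mean_upper) ** 2 for x in upper_scores)
    ss_lower = sum((x - mean_lower) ** 2 for x in lower_scores)
    denominator = math.sqrt((ss_upper + ss_lower) / (n * (n - 1)))
    if denominator == 0:
        raise ValueError("the scores show no variation, so T is undefined")
    return (mean_upper - mean_lower) / denominator


# Hypothetical objective item: 16 of 20 upper pupils and 10 of 20 lower pupils correct.
print(discrimination_index(16, 10, 20))

# Hypothetical short-answer item scored from 0 to 5, four pupils per group.
print(t_statistic([4, 5, 5, 4], [2, 3, 1, 2]))
```

In practice each group would hold 27% of the test takers, so the group lists would be built after ranking all students by total score.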
Table 2.2 The classification of the index of Discriminating Power (D)
47
Index of Discriminating Power    Classification
0.70 – 1.00                      Excellent
0.40 – 0.70                      Good
0.20 – 0.40                      Satisfactory
≤ 0.20                           Poor
Negative value of D              Very poor
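The classification can be expressed as a small lookup, sketched below. Because the ranges share their end points, the handling of boundary values is an assumption of this sketch: 0.20 is read as Poor (following the "≤ 0.20" row), while 0.40 and 0.70 are placed in the higher class. The function name is hypothetical.

```python
def classify_discrimination(d):
    """Classify an index of discriminating power; boundary handling is assumed."""
    if not -1.0 <= d <= 1.0:
        raise ValueError("the index must lie between -1.00 and +1.00")
    if d < 0:
        return "Very poor"
    if d <= 0.20:
        return "Poor"
    if d < 0.40:
        return "Satisfactory"
    if d < 0.70:
        return "Good"
    return "Excellent"


# Hypothetical indices for five items.
for d in [-0.10, 0.15, 0.30, 0.55, 0.80]:
    print(d, classify_discrimination(d))
```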
c. The Effectiveness of Distractors
According to Mozaffer Rahim Hingorjo and Farhan Jaleel, “another important technique is analysis of distractors, that provides information regarding the individual distractors and the key of a test item.”
48
It means that the teacher will learn about the students’ ability from the answers they have chosen.
47
Anas Sudijono, Pengantar Evaluasi Pendidikan, Jakarta: PT. Raja Grafindo Persada, 2006, p. 389.
48
Mozaffer Rahim Hingorjo and Farhan Jaleel, Analysis of One-Best MCQs: the Difficulty Index, Discrimination Index and Distractor Efficiency, Karachi: Journal of Pakistan Medical Association, 2012, p. 1.
A good-quality test item is one whose distractors are chosen about equally by the students who answer incorrectly. By contrast, in a poor test item the distractors are chosen unequally. In addition, a test item is also considered good quality if many students in the upper group answer it correctly and only a few students in the lower group do.
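One possible way to inspect the distractors of a multiple-choice item is to tabulate how often each option was chosen in the upper and lower groups. The sketch below is only an illustration: the function name, the option labels, and the answer lists are hypothetical.

```python
from collections import Counter


def distractor_table(upper_answers, lower_answers, key):
    """Count how often each option was chosen in the upper and lower groups."""
    upper_counts = Counter(upper_answers)
    lower_counts = Counter(lower_answers)
    options = sorted(set(upper_answers) | set(lower_answers) | {key})
    return {
        option: {
            "upper": upper_counts[option],
            "lower": lower_counts[option],
            "is_key": option == key,
        }
        for option in options
    }


# Hypothetical item with key "A" and eight students in each group.
upper = ["A", "A", "A", "B", "A", "C", "A", "D"]
lower = ["A", "B", "C", "D", "B", "C", "D", "A"]

for option, row in distractor_table(upper, lower, key="A").items():
    label = "key" if row["is_key"] else "distractor"
    print(option, label, "upper:", row["upper"], "lower:", row["lower"])
```

In this made-up item the three distractors are each chosen twice in the lower group, and the key is chosen more often by the upper group than by the lower group, which matches the description of a good item above.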
3. The Difficulty Level and the Discriminating Power
According to J. Stanley Ahmann and Marvin D. Glock, “It has been long
known that the discriminating power of an item is influenced by its difficulty.”
49
In support of this, item analysis also tells how difficult or easy the questions were (the difficulty index) and whether the questions were able to discriminate between students who performed well on the test and those who did not (the discrimination index).
50
Logically, after analyzing the level of difficulty, the teacher will know whether each test item is at an easy, moderate, or difficult level. Then, from that level of difficulty, the teacher can distinguish whether the students belong to the upper or lower group. As an example, based on Zainal Arifin’s statement, the upper group will be able to answer the question better than the lower group.
51
So, it can be concluded that difficult items can be answered only by the upper group of students and cannot be answered by the lower group of students. Therefore, after analyzing the test items for difficulty level, it is better if the teacher also analyzes the discriminating power at the same time and with the same procedures, because in analyzing the test items the teacher learns not only the level of difficulty of those items but also how well the test items discriminate between the upper- and lower-group students.
49
J. Stanley Ahmann and Marvin D. Glock, loc. cit., p. 189.
50
Mozaffer Rahim Hingorjo and Farhan Jaleel, loc. cit.
51
Zainal Arifin, op. cit., p. 133.
According to Heaton, “Facility values and the discrimination indices are usually recorded together in tabular form and calculated by similar procedures.”
52
Above all, by analyzing the test items in terms of difficulty level and discriminating power, the teacher will know the quality of the test items: whether they are at an easy, moderate, or difficult level and whether they have high or low discriminating power.
They can be calculated using the following formulas:
53
\[ FV = \frac{R}{N} \quad \text{or} \quad FV = \frac{\text{Correct U} + \text{Correct L}}{2n} \]
\[ D = \frac{\text{Correct U} - \text{Correct L}}{n} \]
In which:
FV : The index of difficulty
R : The number of correct answers
N : The number of students taking the test
D : Discriminating index
U : Upper half
L : Lower half
n : Number of candidates in one group
52
J.B. Heaton, op. cit., p. 182.
53
Ibid.
In addition, difficulty has a bearing on the discrimination value of a test: a test which is too hard or too easy will not discriminate between individuals of different levels of achievement as well as one which is more appropriate for the range of abilities in the group.
54
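The link between the two indices can be illustrated numerically. Assuming the half-group formulas FV = (Correct U + Correct L) / 2n and D = (Correct U - Correct L) / n, simple arithmetic shows that the size of D can never exceed 2 × min(FV, 1 - FV), so an item that nearly everyone answers correctly (or incorrectly) leaves little room for discrimination. The sketch below records both indices together and checks this bound for every possible pair of counts; the function names and the sample counts are hypothetical.

```python
def item_statistics(correct_upper, correct_lower, group_size):
    """Return (FV, D) for one item from the counts in the two halves."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    fv = (correct_upper + correct_lower) / (2 * group_size)
    d = (correct_upper - correct_lower) / group_size
    return fv, d


def max_discrimination(fv):
    """Largest size of D that is arithmetically possible for a given FV."""
    return 2 * min(fv, 1 - fv)


def bound_holds(group_size):
    """Check |D| <= 2 * min(FV, 1 - FV) for every pair of counts."""
    for u in range(group_size + 1):
        for l in range(group_size + 1):
            fv, d = item_statistics(u, l, group_size)
            if abs(d) > max_discrimination(fv) + 1e-12:
                return False
    return True


# Hypothetical counts of correct answers (upper half, lower half), 20 per half.
items = {1: (18, 12), 2: (20, 18), 3: (9, 3)}
for number, (u, l) in items.items():
    fv, d = item_statistics(u, l, 20)
    print(f"Item {number}: FV = {fv:.2f}, D = {d:.2f}, "
          f"largest possible D = {max_discrimination(fv):.2f}")

print(bound_holds(20))
```

An item of middle difficulty (FV = 0.50) is the only one that can reach D = 1.00, while an item with FV = 1.00 or FV = 0.00 necessarily has D = 0.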
4. The Importance of Item Analysis
Item analysis is very important because the teacher can get much information from its results. Through the analysis, the teacher will learn about the achievement of his or her teaching and learning process in the classroom, as well as about the students’ performance. In other words, item analysis has many benefits.
According to Anthony J. Nitko, there are six points on the importance of item analysis:
55
1. “Determining whether an item functions as the teacher intends.” In this book, there are five points that the teacher needs to consider in deciding whether an item is functioning: (a) whether it seems to be testing the intended instructional objective, (b) whether it is of the appropriate level of difficulty, (c) whether it is able to distinguish those who have command of the learning objectives from those who do not, (d) whether the keyed answer is correct, and (e) for response-choice items, whether the distractors are functioning.
2. “Feedback to students about their performance and as a basis for class discussion.” The teacher gets information about the performance of the students. For example, in reviewing the test, the teacher will know the students’ errors.
3. “Feedback to the teacher about pupil’s difficulties.” Through item analysis the teacher will know the students’ problems, which can be seen from the errors in their answers. The teacher is advised to consider scores on more than one item, such as group and individual tests on all the topics which have been taught.
4. “Areas for curriculum improvement.” Item analysis gives information for revising the curriculum.
5. “Revising the items.” By doing the item analysis, the teacher can revise items based on the students’ responses to each item. Revising existing items takes less time than writing new ones, and the revised items can be used for the next test.
6. “Improving item-writing skills.” The teacher should have item-writing skills for constructing test items, and one way to improve these skills is to analyze the items based on the students’ responses.
54
Victor H. Noll, Introduction to Educational Measurement, Boston: Houghton Mifflin Company, 1965, p. 180.
55
Anthony J. Nitko, op. cit., pp. 284–286.
In addition, Widdowson discusses item analysis in his book Language Testing. It usually provides two kinds of information on items:
56
1. Item facility, which helps us decide if the test items are at the right level for the target group, and
2. Item discrimination, which allows us to see if the individual items are providing information on candidates’ abilities consistent with that provided by the other items on the test.
56
H.G. Widdowson, Language Testing, Oxford: Oxford University Press, 2000, p. 60.
C. Previous Studies
There are some studies about the difficulty level and the discriminating power which have been conducted by several researchers.
First, Andrian Dwi Prayoga analyzed the difficulty level of the English summative test for the second grade of junior high school in the odd semester of 2010/2011 at SMPN 13 South Tangerang. This study was quantitative research because the researcher used numerical data which were analyzed statistically. It was also categorized as descriptive analysis because it was intended to describe the objective condition of the difficulty level of the English summative test. In this study, the researcher took only 93 students as a sample. The findings showed that moderate items had the highest percentage at 66.7%, followed by difficult items at 20% and easy items at 13.3%. Overall, the difficulty level of the test was moderate, with a difficulty index of 0.50, which means that the test had a good difficulty level.
57
Second, Hikmah Lestari conducted an item analysis of the discriminating power of the English summative test in the second year of SMPN 87 Pondok Pinang. This study was categorized as descriptive analysis because it was intended to describe the objective condition of the discriminating power by analyzing the quality of the English summative test items in discriminating students’ achievement. It was considered quantitative research because the researcher used numerical data which were analyzed statistically. The researcher took only 60 students as an ordinal sample in her study. The findings of this study were that the English summative test which was administered to the second grade of SMPN 87 Pondok Pinang had
57
Andrian Dwi Prayoga, An Analysis on the Difficulty Level of English Summative Test for Second Grade of Junior High School at Odd Semester 2010/2011 of SMPN 13 South Tangerang (A Case Study at the Second Grade of SMPN 13 South Tangerang), Research Paper at State Islamic University Syarif Hidayatullah Jakarta, Jakarta, 2011, p. i, unpublished.