3.3 Identification of the Problem
Most Junior and Senior High School teachers still do not know how to construct a good test. They write tests without paying attention to the characteristics or the quality of a good test. There are four problems related to the teacher-made English test items:
a. The validity level
b. The reliability level
c. The difficulty level
d. The discrimination power
3.4 Technique of Data Collection
In this study, the test under investigation is the mid-term test for eighth grade students of Junior High School. The data are in the form of the students' answer sheets and the test items of the mid-term test. The writer selects the eighth grade students of Junior High School to obtain the required data. Before the test is administered to the students, the writer contacts the English teacher of the selected school to ensure that the test items are not used anymore. Then, she begins to analyze the test.
3.5 Technique of Data Analysis
The data to be analyzed in this study are taken from the students' answer sheets of the mid-term test of the eighth grade of Junior High School. They are used to analyze the quality of the test items.
The purpose of this analysis is to identify the quality of each item, that is, whether it is a good, moderate, or bad item. Through item analysis, we can also find information about the weaknesses or shortcomings of the items. Here the item analysis consists of the following:
3.5.1 Analysis of Validity
Validity refers to whether or not a test measures what it is supposed to measure. In this study the writer used content validity, which means that each item should measure what it is supposed to measure. The writer compares each item with the KTSP curriculum. If the item meets the criteria of the curriculum material, it is considered a valid item; otherwise, it is considered invalid.
3.5.2 Analysis of Reliability
The formula used to estimate the reliability of the test is the Kuder-Richardson 20 (K-R20) formula. According to Brown (2005:185), the Kuder-Richardson 20 formula is the most accurate and flexible formula to calculate reliability. The formula is:
$$ \text{K-R20} = \frac{k}{k-1}\left(1 - \frac{\sum S_i^2}{S_t^2}\right) $$
where
K-R20 = Kuder-Richardson formula 20
k = number of items
Si² = item variance
St² = test score variance
(Brown, 2005:181)
The formula to calculate the variance is:
$$ S_t^2 = \frac{\sum Y^2 - \dfrac{(\sum Y)^2}{N}}{N} $$
where
St² = the variance
ΣY = the sum of the total scores
N = the number of respondents
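The two formulas above can be illustrated with a short computation. The following is only a minimal sketch, assuming that every item is scored 1 (correct) or 0 (wrong) and that the item variance is computed with the same raw-score variance formula as the test score variance. The function names and the small answer matrix are hypothetical and are not taken from the data of this study.

```python
def population_variance(values):
    """Raw-score variance: (sum(Y^2) - (sum(Y))^2 / N) / N."""
    n = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    return (total_sq - total * total / n) / n


def kr20(responses):
    """K-R20 for a matrix of 0/1 answers (rows = students, columns = items)."""
    k = len(responses[0])  # number of items
    # Variance of each item across students (Si^2).
    item_variances = [
        population_variance([row[i] for row in responses]) for i in range(k)
    ]
    # Variance of the students' total scores (St^2).
    total_scores = [sum(row) for row in responses]
    test_variance = population_variance(total_scores)
    return (k / (k - 1)) * (1 - sum(item_variances) / test_variance)


# Hypothetical answers of 5 students on 4 items (1 = correct, 0 = wrong).
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(sample), 3))
```

In this hypothetical matrix the item variances sum to 0.80 and the test score variance is 2.00, so the coefficient is (4/3)(1 - 0.40) = 0.80.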
3.5.3 Difficulty Level Analysis
A good test item is one that is neither too difficult nor too easy. An item is considered easy for a group of students if more than 75 percent of the group respond to it correctly. If between 25 percent and 75 percent of the students in a group respond correctly, the item is considered moderate. A hard item is one that fewer than 25 percent of the students answer correctly. Based on these criteria, the difficulty levels used in this study are:
1. 0.00 ≤ P < 0.25: the item is difficult
2. 0.25 ≤ P ≤ 0.75: the item is moderate
3. 0.75 < P ≤ 1.00: the item is easy
The formula is:
$$ P = \frac{R}{T} $$
where
P = difficulty level or index of difficulty
R = the number of students responding correctly to the item
T = the total number of students responding to the item
(Nitko, 1983:288)
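The difficulty index and its classification can be sketched as follows. This is an illustration only: the function names and the example counts are hypothetical, and the boundary values 0.25 and 0.75 are assumed to belong to the moderate category, following the statement that an item is moderate if between 25 percent and 75 percent of the students answer it correctly.

```python
def difficulty_index(correct, total):
    """P = R / T, the proportion of students answering the item correctly."""
    return correct / total


def classify_difficulty(p):
    """Map a difficulty index to a category (boundaries assumed moderate)."""
    if p < 0.25:
        return "difficult"
    if p <= 0.75:
        return "moderate"
    return "easy"


# Hypothetical items: (number of correct answers R, number of respondents T).
for r, t in [(18, 120), (30, 120), (96, 120)]:
    p = difficulty_index(r, t)
    print(f"R={r}, T={t}, P={p:.2f}, {classify_difficulty(p)}")
```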
3.5.4 Discrimination Power Analysis
Item discrimination tells how well the item performs in separating the better students from the poorer students. The formula for computing item discriminating power is as follows:
$$ D = \frac{R_U - R_L}{\tfrac{1}{2}T} $$
where
D = the index of discrimination power (DP)
RU = the number of students in the upper group who answer the item correctly
RL = the number of students in the lower group who answer the item correctly
½T = one half of the total number of students included in the item analysis
(Gronlund, 1982:103)
A value of 0.00 is obtained when an equal number of students in each group answer the item correctly. A value of 1.00 is the highest possible value, indicating that all students in the upper group answered the item correctly and all students in the lower group answered it incorrectly. A negative value is obtained when more students in the lower group than in the upper group answer the item correctly. Items with zero or negative DP should be removed from the test and then discarded or improved. Ebel and Frisbie (1991:232) classify the discrimination power values as follows:
Discrimination index    Item evaluation
0.40 and above          Very good item
0.30-0.39               Reasonably good but possibly subject to improvement
0.20-0.29               Marginal items, usually needing and being subject to improvement
0.19 and below          Poor items, to be rejected or improved by revision
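The discrimination index and the classification above can be sketched in a few lines. This is a hedged illustration, not the procedure actually used by the writer: the function names and the example counts are hypothetical, T is assumed to be the combined size of the upper and lower groups, and values that fall between the printed ranges (for example 0.295) are assumed to belong to the lower category.

```python
def discrimination_index(upper_correct, lower_correct, group_total):
    """D = (RU - RL) / (T / 2), where T counts the upper and lower groups."""
    return (upper_correct - lower_correct) / (group_total / 2)


def evaluate_item(d):
    """Classify D following the Ebel and Frisbie categories."""
    if d >= 0.40:
        return "Very good item"
    if d >= 0.30:
        return "Reasonably good item"
    if d >= 0.20:
        return "Marginal item"
    return "Poor item"


# Hypothetical item: 20 of the upper group and 8 of the lower group answer
# correctly, with 27 students in each group (T = 54).
d = discrimination_index(20, 8, 54)
print(f"D = {d:.2f}: {evaluate_item(d)}")
```

With equal numbers of correct answers in both groups the function returns 0.00, and with all upper group students correct and all lower group students wrong it returns 1.00, which agrees with the interpretation given above.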
In estimating the discrimination power, the writer divided the class into three groups: the upper group, the middle group, and the lower group. To divide the sample into the upper and lower groups, the writer ranked the sample from 1 to 120. From this ranking, the upper group consists of the 27 students who got the highest grades in the whole sample, and the lower group consists of the 27 students who got the lowest grades. The remaining students are categorized as the middle group. The list of upper group and lower group students can be seen in Appendix 6.
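The grouping procedure can be sketched as follows. It is a minimal illustration under stated assumptions: the function name, the student identifiers, and the scores are hypothetical, and students with equal scores are assumed to keep their original order, since the text does not state how ties are broken.

```python
def split_groups(scores, group_size):
    """Rank students by total score and return (upper, middle, lower) ids."""
    if group_size <= 0 or 2 * group_size > len(scores):
        raise ValueError("group_size must be positive and at most half the sample")
    # Highest score first; Python's sort is stable, so ties keep input order.
    ranked = sorted(scores, key=scores.get, reverse=True)
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    middle = ranked[group_size:len(ranked) - group_size]
    return upper, middle, lower


# Hypothetical sample of 120 students with made-up total scores.
scores = {f"student_{n}": n for n in range(1, 121)}
upper, middle, lower = split_groups(scores, 27)
print(len(upper), len(middle), len(lower))
```

With 120 students and 27 students in each extreme group, 66 students remain in the middle group.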
Using the criteria above, the writer analyzed the items to determine whether or not the test can be considered a good test.
CHAPTER IV RESULT OF THE STUDY