P = The class percentage
F = Total percentage score
N = Number of students
Finally, after obtaining the mean of the students' scores for each action, the writer identifies whether the students improved their understanding of narrative text from the pre-test to the post-tests in cycle 1 and cycle 2. She uses the following formula:
\[ P = \frac{y_1 - y}{y} \times 100\% \]
P = Percentage of students' improvement
y = Pre-test result
y_1 = Post-test 1 result

\[ P = \frac{y_2 - y}{y} \times 100\% \]
P = Percentage of students' improvement
y = Pre-test result
y_2 = Post-test 2 result
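As an illustration only, the improvement calculation can be sketched in Python. The sketch assumes the conventional form P = (y1 - y) / y × 100%, which matches the symbols defined above. The function name and the mean scores are hypothetical and are not data from this study.

```python
def improvement_percentage(pre_mean: float, post_mean: float) -> float:
    """Percentage change from the pre-test mean (y) to a post-test mean (y1 or y2).

    Assumes P = (post - pre) / pre * 100.
    """
    if pre_mean == 0:
        raise ValueError("pre-test mean must be non-zero")
    return (post_mean - pre_mean) / pre_mean * 100


# Hypothetical mean scores, for illustration only.
pre_test = 60.0
post_test_1 = 69.0
post_test_2 = 78.0

print(round(improvement_percentage(pre_test, post_test_1), 2))  # 15.0
print(round(improvement_percentage(pre_test, post_test_2), 2))  # 30.0
```

Both percentages are measured against the same pre-test baseline, so the second value reflects the total gain by the end of cycle 2 rather than the gain between the two post-tests.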
F. The Trustworthiness of Test
To analyze the test items, the writer examines the trustworthiness of the test in several ways, including:
1. Test Validity
Validity is a criterion for evaluating a test as a measuring instrument. It concerns how well the test represents the material that has been given to the students. Before administering the pre-test, the writer analyzes the validity and the reliability of the pre-test instrument in order to find out whether the test is valid and good enough to be used. According to Arikunto, information is valid if it is consistent with the facts, and a test is valid if it measures what it is supposed to measure.⁴²
Before administering the test, the writer used auditing, asking the advisor to review and evaluate the study to ensure the validity of the instruments. Then, after the students took the pre-test, she used the Anatest software to calculate the instruments' validity and reliability scores.
Table 3.3 The criteria of the correlation coefficient ("koefisien korelasi")⁴³
Scale          Remark
0.80 - 1.0     Very high
0.60 - 0.80    High
0.40 - 0.60    Enough
0.20 - 0.40    Low
0.0 - 0.20     Very low
After the calculation using "ANATEST", the validity value or XY
correlation of the pre-test instrument used in this study was 0.86. It means the test is valid and categorized as very high quality. The data came from 40 (forty) multiple-choice questions that had been examined beforehand, of which 23 (twenty-three) were found valid by the ANATEST software. The valid items are numbers 4, 5, 8, 9, 11, 12, 13, 14, 15, 16, 18, 20, 23, 24, 25, 26, 28, 31, 33, 35, 36, 38, and 40. Meanwhile, the reliability of the instrument was 0.92, which means the test is reliable and categorized as very high reliability. Then, the validity value of post-test 2 used in this study was 0.55. It means the test is valid and categorized as enough quality. Then, the reliability of the instrument was 0.71, which means the test is reliable and categorized as high reliability.
42 Suharsimi Arikunto, Dasar-Dasar Evaluasi Pendidikan, Jakarta: Bumi Aksara, 2010, pp. 58-59.
43 Suharsimi Arikunto, op. cit., p. 75.
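The reading of a coefficient against Table 3.3 can be sketched as a small lookup. This is only an illustration: the function name is hypothetical, and because adjacent bands in the table share their boundary values (for example 0.80), the sketch assumes a boundary value belongs to the higher band.

```python
def classify_coefficient(r: float) -> str:
    """Map a validity or reliability coefficient to its Table 3.3 remark.

    Assumption: a value on a shared boundary (0.20, 0.40, 0.60, 0.80)
    is assigned to the higher band.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("coefficient must lie between 0.0 and 1.0")
    if r >= 0.80:
        return "Very high"
    if r >= 0.60:
        return "High"
    if r >= 0.40:
        return "Enough"
    if r >= 0.20:
        return "Low"
    return "Very low"


# Coefficients reported in this section.
for value in (0.86, 0.92, 0.55, 0.71):
    print(value, classify_coefficient(value))
```

Run on the four reported coefficients, the lookup returns "Very high" for 0.86 and 0.92, "Enough" for 0.55, and "High" for 0.71, in line with the categories stated in the text.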
2. Discrimination Power
The analysis of item discrimination aims to assess how well each test item distinguishes between high-achieving and low-achieving students. Item discrimination provides a more detailed analysis than item difficulty alone, because it shows how the top scorers and the bottom scorers performed on each item. The formula is as follows:⁴⁴
\[ D = \frac{U - L}{N} \]
D : The index of discriminating power
U : The number of correct answers in the upper group
L : The number of correct answers in the lower group
N : The total number of people in the top group
The discriminating scale used is:
Table 3.4 Discriminating Scale
DP           Remark
0.6 - 1.0    Very good
0.4 - 0.6    Good
0.1 - 0.3    Enough
-1 - 0.0     Bad
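A minimal sketch of the discrimination index follows, assuming the form D = (U - L) / N implied by the symbol definitions above, with N taken as the size of the upper group. The function name and the counts are hypothetical and serve only as an example.

```python
def discrimination_index(upper_correct: int, lower_correct: int, group_size: int) -> float:
    """Discrimination index D = (U - L) / N for a single test item.

    upper_correct: correct answers in the upper group (U)
    lower_correct: correct answers in the lower group (L)
    group_size:    number of students in the upper group (N)
    """
    if group_size <= 0:
        raise ValueError("group size must be positive")
    return (upper_correct - lower_correct) / group_size


# Hypothetical item: 9 of 10 upper-group students and 4 of 10
# lower-group students answered correctly.
print(discrimination_index(9, 4, 10))  # 0.5
```

A value of 0.5 would fall in the "Good" band of Table 3.4. The sketch stops at computing D because the table leaves values between 0.0 and 0.1 and between 0.3 and 0.4 unassigned.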
44 Charles Alderson, Caroline Clapham and Dianne Wall, Language Test Construction and Evaluation, Cambridge University Press, 1995, p. 274.
3. Item Facility
Item facility (item difficulty) concerns the proportion of students who answer a question correctly out of all the students taking the test. Item difficulty indicates how easy or difficult an item is for the group of students. The formula is as follows:⁴⁵
\[ IF = \frac{N_{correct}}{N_{total}} \]
IF : Item facility
N_correct : Number of students who selected the correct answer
N_total : Total number of students taking the test
The criterion used is as follows:
Table 3.5 Criterion Scale
ID             Remark
0 - 0.14       Difficult
0.15 - 0.85    Moderate
0.86 - 1.00    Easy
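The item facility calculation and its reading against Table 3.5 can be sketched as follows. The sketch assumes IF = N_correct / N_total, as the symbol definitions indicate. The function names and the counts are hypothetical. Since the table is stated to two decimals, the sketch also assumes that values below 0.15 count as "Difficult" and values above 0.85 count as "Easy".

```python
def item_facility(n_correct: int, n_total: int) -> float:
    """Proportion of test takers who answered the item correctly."""
    if n_total <= 0:
        raise ValueError("total number of students must be positive")
    if not 0 <= n_correct <= n_total:
        raise ValueError("n_correct must lie between 0 and n_total")
    return n_correct / n_total


def classify_item(if_value: float) -> str:
    """Map an item facility value to its Table 3.5 remark.

    Assumption: cut-offs are applied as < 0.15 and > 0.85.
    """
    if if_value < 0.15:
        return "Difficult"
    if if_value <= 0.85:
        return "Moderate"
    return "Easy"


# Hypothetical class of 30 students.
for correct in (3, 18, 27):
    value = item_facility(correct, 30)
    print(correct, round(value, 2), classify_item(value))
```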
G. The Criterion of the Action Success
Classroom action research (CAR) can be called successful if it fulfills the criteria that have been determined, and it fails if it does not. The writer and the teacher discussed and determined the criteria of the action success. This study is regarded as successful if 75% of the students pass the KKM (Kriteria Ketuntasan Minimal, the minimum mastery criterion), which in the school is 75 (seventy-five). If the study meets the criteria, it is called successful; if not, improvement is needed and the study continues to the next cycle.
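The success check can be sketched as below, under two assumptions that are not stated explicitly in the text: the first 75 is read as a share of students (75%), and a student passes when the score is at least the KKM of 75. The function name and the scores are hypothetical.

```python
def action_is_successful(scores, kkm=75, required_share=0.75):
    """Return True when enough students reach the KKM.

    Assumptions: passing means score >= kkm, and the criterion is
    that at least `required_share` of the students pass.
    """
    if not scores:
        raise ValueError("at least one score is required")
    passed = sum(1 for score in scores if score >= kkm)
    return passed / len(scores) >= required_share


# Hypothetical post-test scores for a group of four students.
print(action_is_successful([80, 75, 70, 90]))  # True: 3 of 4 students pass
print(action_is_successful([80, 60, 70, 90]))  # False: 2 of 4 students pass
```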
45 James Dean Brown, Testing in Language Programs, New York: McGraw-Hill, 2005, p. 66.