which belonged to neither the experimental nor the control group. There were 32 respondents in the try-out test.
The respondents had to answer 40 questions in 45 minutes. The questions were in two formats: vocabulary matching tasks and selected-response fill-in vocabulary items. As
mentioned by Brown (2004:198), matching tasks have the advantage of offering an alternative to traditional multiple-choice or fill-in-the-blank formats and are
sometimes easier to construct than multiple-choice items, as long as the test designer has chosen the matches carefully.
The result of this try-out test served as the basis for judging the validity and reliability of the instrument used to gather the data. To check
the reliability of the test, the writer administered the try-out test once. If the results based on the students' scores were good enough, the test was considered reliable. Thus, the test
could be used as the instrument to collect the data.
3.7.2 Qualities of the Test
A test is considered good if it has three qualities: validity, reliability, and practicality. In this study, the writer focused only on
the validity and reliability of the test, since these qualities are essential to the effectiveness of any data-gathering procedure.
3.7.2.1 Validity
According to Gronlund, as cited in “Language Assessment”, validity is the extent to which inferences made from assessment results are appropriate,
meaningful, and useful in terms of the purpose of the assessment (Brown, 2004:22). In other words, validity indicates whether the instrument measures what it is intended to measure. To measure the validity, the writer used the
Product Moment formula:
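$$
r_{xy} = \frac{N \sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[ N \sum X^{2} - \left(\sum X\right)^{2} \right] \left[ N \sum Y^{2} - \left(\sum Y\right)^{2} \right]}}
$$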
where:
r_xy = the correlation of the scores
ΣX = the total of students who have the right answer
ΣY = the total of students' scores
X = the number of students who have the right answer
Y = the students' scores
N = the number of students
(Arikunto, 2009:72)
After the writer obtained the r_xy coefficient, the next step was to consult it against the r Product Moment table.
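The computation above can be illustrated with a minimal Python sketch of the raw-score Product Moment formula. The function names `product_moment` and `is_item_valid`, the sample scores, and the `critical_value` parameter are hypothetical and serve only as an illustration. They are not the data or the procedure actually used in this study. The critical value must be taken from the r Product Moment table for the relevant number of respondents.

```python
import math


def product_moment(x, y):
    """Return r_xy for paired scores x (item scores) and y (total scores)."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")

    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        # Happens when all x values or all y values are identical.
        raise ValueError("correlation is undefined for constant scores")
    return numerator / denominator


def is_item_valid(r_xy, critical_value):
    """An item is treated as valid when r_xy exceeds the table value."""
    return r_xy > critical_value


# Hypothetical example: six students, one item (1 = right, 0 = wrong).
item_scores = [1, 1, 0, 1, 0, 0]
total_scores = [8, 7, 3, 9, 4, 5]
r = product_moment(item_scores, total_scores)
```

In this sketch the comparison with the table is kept as a separate step, so that the same coefficient can be checked against the critical value for any sample size.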
3.7.2.2 Reliability
Reliability is defined as the consistency that an instrument or procedure demonstrates over a period of time. Heaton (1975:167) states that “reliability is a necessary
characteristic of any good test: for it to be valid at all, a test must be first reliable as a measuring instrument. If the test is administered to the same candidates on
different occasions, then, to the extent that it produces differing results, it is not reliable.”
To determine whether the test was reliable, the writer used the following formula:
$$
r_{11} = \left[ \frac{k}{k - 1} \right] \left[ \frac{S^{2} - \sum pq}{S^{2}} \right]
$$
where:
r_11 = the reliability of the test
k = the number of items
p = the proportion of subjects answering the item correctly
q = the proportion of subjects answering the item incorrectly (q = 1 − p)
S² = the total variance
(Arikunto, 2009:100)
The result was compared with the critical value for r product moment. When the obtained correlation coefficient is higher than the critical value for r
product moment, the item is valid at the 5% (α = 0.05) level of significance (Arikunto, 2006:184).
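The reliability formula above can be sketched in Python as follows. The function name `kr20` and the sample answer matrix are hypothetical and are given only as an illustration, not as the data of this study. The sketch assumes that each item is scored 1 (right) or 0 (wrong) and that the total variance S² is computed by dividing by the number of respondents. If the variance is instead divided by N − 1, the coefficient will differ slightly.

```python
def kr20(responses):
    """Return r_11 for a matrix of 0/1 answers (rows = students, columns = items)."""
    n = len(responses)
    if n == 0:
        raise ValueError("at least one respondent is required")
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer the same number of items")

    # Total variance S^2 of the students' total scores (assumption: divide by n).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    variance = sum((t - mean_total) ** 2 for t in totals) / n
    if variance == 0:
        raise ValueError("reliability is undefined when all total scores are equal")

    # Sum of p * q over all items, where p is the proportion of right answers.
    sum_pq = 0.0
    for item in range(k):
        p = sum(row[item] for row in responses) / n
        q = 1 - p
        sum_pq += p * q

    return (k / (k - 1)) * ((variance - sum_pq) / variance)


# Hypothetical example: four students answering three items.
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
r_11 = kr20(answers)  # 0.75 for this sample matrix
```

For the sample matrix, S² = 1.25 and Σpq = 0.625, so r_11 = (3/2) × (0.625/1.25) = 0.75. The obtained value would then be compared with the critical value in the same way as described above.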
3.7.3 Item Analyses