instrument, it should first be tried out on students in another class. The try-out is also important because “try out is a kind of pre-testing, which provides opportunities
for the test-maker to try out the test directions and to check the estimated time required to work the items of the test. If the directions are not clear to the subjects,
this should certainly be noted at the time of pre-testing, so that the instructions can be clarified in the final form” (Harris, 1969:104). Thus, the try-out made it possible
to identify whether the test instrument was valid and reliable. It also served to check the appropriateness of the scoring system applied to the
instrument.
3.6.2.1 Validity of Test
The instrument was checked for validity. According to Best (1981:153), validity is “that quality of data-gathering instrument or procedure that enables it to
determine what it was designed to determine.” In line with Best, Gronlund (1988:226), cited by Brown (2004:22), states that validity is “the extent to
which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment.” To support this idea, Heaton
(1974:153) states that “validity of the test is the extent to which it measures what it is supposed to measure and nothing else.” In other words,
validity refers to the extent to which a test measures what it is intended to measure. Thus, if a test claims to measure writing ability, it should test that
ability. In this study, the product moment formula was used to calculate validity:
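\[
r_{XY} = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left\{N\sum X^{2} - \left(\sum X\right)^{2}\right\}\left\{N\sum Y^{2} - \left(\sum Y\right)^{2}\right\}}}
\]
(Best, 1981:58)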
Where:
$r_{XY}$ = the correlation of the scores on the two halves of the test,
$N$ = number of students,
$X$ = the score of each component of writing scoring,
$Y$ = the total score,
$\sum XY$ = the sum of the products of paired X and Y scores,
$\sum X^{2}$ = the sum of the squared X scores,
$\sum Y^{2}$ = the sum of the squared Y scores.
If the obtained coefficient of correlation is higher than the critical value for r product moment, the test is valid at the 5% alpha level of significance.
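The calculation described above can be sketched in Python as follows. This is only an illustrative sketch, not the procedure actually run in this study: the function names `product_moment_r` and `is_valid` are hypothetical, and the critical value is assumed to be looked up by hand in an r product moment table for the given N and passed in as an argument.

```python
from math import sqrt


def product_moment_r(x, y):
    """Raw-score product moment correlation between paired score lists.

    x: scores on one component of the writing scoring (one per student)
    y: total scores (one per student)
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired lists with at least two scores")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when one set of scores has no variation")
    return numerator / denominator


def is_valid(r, r_critical):
    """Decision rule from the text: valid if r exceeds the table value.

    r_critical is an assumption of this sketch: it must be taken from an
    r product moment table; it is not computed here.
    """
    return r > r_critical
```

For example, `product_moment_r(component_scores, total_scores)` returns the coefficient for one scoring component, which is then passed to `is_valid` together with the table value for the number of students tried out.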
3.6.2.2 Reliability of Test
“Reliability is the quality of consistency that the instrument or procedure demonstrates over a period of time. A test is reliable to the extent that it measures
consistency, from one time to another” (Best, 1981:199). Harris (1969:14) also states that “reliability is defined as the stability of
test score.” A test is said to be reliable if it yields consistent scores when it is administered at different times. There are a number of ways of
estimating the reliability of a test. The reliability of the test in this study was measured by the following formula:
\[
r_{11} = \left(\frac{k}{k-1}\right)\left(1 - \frac{\sum \sigma_b^{2}}{\sigma_t^{2}}\right)
\]
(Arikunto, 2006:196)
Where:
$r_{11}$ = index of reliability
$k$ = number of items
$\sum \sigma_b^{2}$ = the sum of the item variances
$\sigma_t^{2}$ = total variance
Meanwhile, in order to find out the variance of each item, the formula is:
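Assuming the usual population-variance form, with $X$ taken as the students' scores on one item and $N$ as the number of students (both symbols are assumptions carried over from the validity formula), the item variance can be written as:

\[
\sigma_b^{2} = \frac{\sum X^{2} - \dfrac{\left(\sum X\right)^{2}}{N}}{N}
\]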
Then, the formula to calculate the total variance is:
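Assuming the same population-variance form, with $Y$ taken as each student's total score and $N$ as the number of students (again an assumption based on the symbols of the validity formula), the total variance can be written as:

\[
\sigma_t^{2} = \frac{\sum Y^{2} - \dfrac{\left(\sum Y\right)^{2}}{N}}{N}
\]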
Having obtained the t-value, each item is then checked against the critical value from the t-table. If the t-value is greater than the t-table value, the test is said to be
reliable.
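The reliability index $r_{11}$ can be computed along the following lines. This is a minimal sketch under stated assumptions, not the computation actually performed in this study: the function names are hypothetical, each row of `score_rows` is assumed to hold one student's scores on the k items, and the variances are assumed to be population variances (divided by N).

```python
def population_variance(values):
    """Population variance: (sum of squares - (sum)^2 / N) / N."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one score is required")
    return (sum(v * v for v in values) - sum(values) ** 2 / n) / n


def reliability_r11(score_rows):
    """r11 = (k / (k - 1)) * (1 - sum of item variances / total variance).

    score_rows: one list per student, each holding that student's k item scores.
    """
    if not score_rows:
        raise ValueError("score_rows must not be empty")
    k = len(score_rows[0])
    if k < 2 or any(len(row) != k for row in score_rows):
        raise ValueError("every student needs scores on the same k >= 2 items")
    # Variance of each item (column) across students.
    item_variances = [
        population_variance([row[i] for row in score_rows]) for i in range(k)
    ]
    # Variance of the students' total scores.
    total_variance = population_variance([sum(row) for row in score_rows])
    if total_variance == 0:
        raise ValueError("r11 is undefined when all total scores are equal")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
```

Keeping the variance in its own helper mirrors the text, where the item variance and the total variance use the same formula applied to different sets of scores.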
3.6.2.3 Difficulty Level