required to speak. In the written test, students were required to do their best so that their progress could be measured.
Therefore, the researcher could say that the tests in this research had face validity because they really measured what they were supposed to measure.
3. Reliability of the Test
Heaton (1979:155) suggests that a test must be reliable. He also says that, in order to be valid, a test must first be reliable as a measuring instrument. He classifies reliability into two categories: test-reliability and mark-reliability.
3.1. Test-reliability
There are several factors affecting the reliability of a test:
§ The extent of the sample of material selected for testing
Reliability is concerned with the size of the sample. The larger the sample, that is, the more tasks the testee has to perform, the greater the probability that the test as a whole is reliable. Hence, objective tests tend to show more reliability, as they allow a wide field to be covered. In this test, this condition was assured by the number of test items.
§ The instruction
When the test is carried out, the instructions should be clear so as to ensure that the students thoroughly understand how to perform the task. In this research, this aspect of reliability was assured by the simplicity of the tasks and instructions.
§ Personal factors such as motivation and illness
Personal factors such as motivation and illness during the test are also important because they can influence the results of the test. In order to raise the students’ motivation, the researcher told the students to do their best in the test and that the results would be used to see their abilities.
3.2. Mark Reliability
Heaton (1979:155-156) states that the scoring of the test is an important factor affecting its reliability. Consistency in scoring also determines the reliability of the test. Mark reliability therefore denotes the extent to which the same marks or grades are given if the same test is marked by two or more different examiners, or by the same examiner at different times. Objective tests can fulfill mark reliability, but subjective tests may still have difficulty fulfilling it.
The reliability of the test was obtained by adopting the Spearman-Brown split-half formula and was computed by using the Pearson product-moment correlation. Ary (1990:213) states that:
The most common procedure is to correlate the scores on the odd numbered items of the test with the scores on the even
numbered items. Then, to transform the split half correlation into an appropriate reliability estimate for the entire test, the
Spearman Brown formula is employed.
The Spearman-Brown split-half formula and the Pearson r correlation are as follows:
1. The Spearman-Brown formula:
\[ r_{xx} = \frac{2\, r_{\frac{1}{2}\frac{1}{2}}}{1 + r_{\frac{1}{2}\frac{1}{2}}} \]
where:
$r_{xx}$ = the estimated reliability of the entire test
$r_{\frac{1}{2}\frac{1}{2}}$ = the Pearson r correlation between the two halves
2. The Pearson r correlation:
\[ r = \frac{N \sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[N \sum X^{2} - \left(\sum X\right)^{2}\right]\left[N \sum Y^{2} - \left(\sum Y\right)^{2}\right]}} \]
where:
$r$ = Pearson r
$\sum X$ = sum of the scores in x distribution
$\sum Y$ = sum of the scores in y distribution
$\sum XY$ = sum of the product of paired x and y scores
$\sum X^{2}$ = sum of the squared scores in x distribution
$\sum Y^{2}$ = sum of the squared scores in y distribution
$N$ = number of subjects
The result of the computation could then be taken as the reliability coefficient.
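As an illustration, the split-half procedure can be sketched in Python. This is a minimal sketch, not the computation actually used in this research: the function names and the sample scores are hypothetical, items are assumed to be scored 1 (correct) or 0 (incorrect), and the Pearson r is computed with the standard raw-score formula.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson r between two paired score lists (raw-score formula)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least 2 scores")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when one half has no variance")
    return numerator / denominator


def spearman_brown(r_half):
    """Estimate full-test reliability from the half-test correlation."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of item scores per student, in item order."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Hypothetical try-out data: 5 students, 6 items (not data from this study).
sample_scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
r_half, reliability = split_half_reliability(sample_scores)
print(f"half-test r = {r_half:.3f}, estimated reliability = {reliability:.3f}")
```

The Spearman-Brown step always yields a value at least as large as a positive half-test correlation, because each half contains only half of the items and therefore underestimates the reliability of the full-length test.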
According to Best (1970:257), the reliability coefficient is interpreted as follows:
0.00-0.20 negligible
0.20-0.40 low and slight
0.40-0.60 moderate
0.60-0.80 substantial
0.80-1.00 high to very high
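The interpretation scale above can be expressed as a small lookup, sketched below in Python. The function name is hypothetical. Because adjacent ranges in the scale share a boundary value (for example, 0.20 appears in two ranges), this sketch assumes that a boundary value belongs to the higher category; the scale itself does not state this.

```python
def interpret_reliability(coefficient):
    """Map a reliability coefficient (0.00-1.00) to its label on the scale."""
    if not 0.0 <= coefficient <= 1.0:
        raise ValueError("coefficient must lie between 0.00 and 1.00")
    # Each pair is (upper bound, label); the upper bound itself is assumed
    # to belong to the next category.
    bands = [
        (0.20, "negligible"),
        (0.40, "low and slight"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]
    for upper_bound, label in bands:
        if coefficient < upper_bound:
            return label
    return "high to very high"


print(interpret_reliability(0.65))
```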
To determine the reliability of the test, a pilot test was administered. Heaton (1979:158) states that a pilot test is a try-out of the test on a small but representative group of testees. In this study, a try-out on the simple past tense was conducted with a group of first-grade students of SMA Binatama Sleman. After the try-out, the reliability coefficient of the test items was calculated based on the formula suggested by Lado. If the test proved reliable, the real test could then be given to another group of students.
D. Data Gathering Technique and Research Design
First, the researcher conducted the pilot test on May 7, 2007 in SMA Binatama Sleman to measure the validity and the reliability coefficient of the tests. Then, the researcher started to conduct the experiment in SMK Bopkri I Yogyakarta. The researcher conducted the pretest for IAK on May 22, 2007 and the pretest for IAP on May 25, 2007. The procedure was the same for both groups.
The pretest was divided into two forms, a written test and a spoken test. The researcher conducted the written test first. The researcher distributed the test papers and the answer sheets to the students. Before the test began, the researcher explained what the students should do in every part of the test and checked whether they understood the instructions. After the written test was finished, the test papers were collected. After that, the researcher conducted the spoken test. The researcher asked the students to come to the front of the class one by one and recorded the students’ utterances on a tape recorder. The results of the written test and the spoken test were examined in the data analysis.