Reliability Try Out for the Instrument

Ana : Sure. What is it?
Bintang : ...............................................
Ana : ...............................................
Bintang : Thanks. Good bye.
Ana : Bye.

The tests given to the students were about telephone conversations, so they were in line with the existing curriculum.

3.6.2 Reliability

Reliability is the general quality of stability in scoring, regardless of what the test measures. Brown (2004:20-22) states that several factors determine whether a test can be considered reliable: fluctuations in the students, in scoring, in test administration, and in the test itself. The students' condition plays the largest part in deciding whether a test is reliable. Students are sometimes sick, fatigued, having a bad day, or affected by other physical and psychological factors that may obstruct them from achieving their true scores. The next factor is the scoring, which concerns inter-rater and intra-rater reliability. Inter-rater reliability is at issue when two or more scorers yield inconsistent scores for the same test (Brown, 2004:21). Intra-rater unreliability is a common occurrence for classroom teachers because of unclear scoring criteria, bias toward good and bad students, or simple carelessness. The conditions under which the students take the test can also affect its reliability, so the test must be administered in a proper and comfortable place where students can perform well. The last factor is the test itself, which must be appropriate to the students' background knowledge, to the time allotment, and to the scoring criteria.

Based on observation of the tests, the tests given to the test takers were reliable, since there were definite criteria for achieving each point. The criteria can be seen in point 3.8 of this chapter, Table 3.4. Nevertheless, some disagreements occurred when the scoring criteria were discussed. This happened because I still lacked experience in scoring students' performance, and because the teachers themselves had different points of view in deciding which students met which criteria. Sometimes the scores given by the scorers differed slightly. Nevertheless, by discussing and analyzing the recordings of the performances once more, the differences could be resolved.
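
As an illustration only, the comparison of two scorers' marks described above could be sketched as follows. The student IDs, the scores, the tolerance parameter, and the function names flag_disagreements and agreement_rate are all hypothetical; they are not data or procedures taken from this study. The sketch assumes that both scorers marked the same set of students.

```python
# Hypothetical scores for illustration; not data from this study.
rater_a = {"S01": 4, "S02": 3, "S03": 5, "S04": 2}
rater_b = {"S01": 4, "S02": 4, "S03": 5, "S04": 2}


def flag_disagreements(scores_a, scores_b, tolerance=0):
    """Return the IDs of students whose two scores differ by more than tolerance.

    Assumes both dictionaries contain exactly the same student IDs.
    """
    return sorted(
        student
        for student in scores_a
        if abs(scores_a[student] - scores_b[student]) > tolerance
    )


def agreement_rate(scores_a, scores_b, tolerance=0):
    """Return the share of students on whom both scorers agree within tolerance."""
    flagged = flag_disagreements(scores_a, scores_b, tolerance)
    return 1 - len(flagged) / len(scores_a)


# Performances whose recordings would be listened to and discussed again.
to_review = flag_disagreements(rater_a, rater_b)
print("Recordings to review again:", to_review)
print("Agreement rate:", agreement_rate(rater_a, rater_b))
```

With these made-up scores, only S02 would be flagged for a second look, and the scorers agree on three of the four students. A simple agreement rate is used here because it is easy to read; it is one possible summary, not the measure used in this research.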

3.6.3 Practicality