3.9 Validity and Reliability of the Test

3.9.1 Validity

Validity is specific to a particular use (selection, placement, or evaluation of learning), and validity is expressed by degree, for example high, moderate, or low (Gronlund, 1982:126). There are three ways to look at the validity of a test: content validity, construct validity, and criterion-related validity. Harris (1969:19) says that if the analysis accords with the views of recognized authorities in the skills area and the test reflects such an analysis, it may be said to have content validity. In addition, Gay (1987:129) explains that logical validity includes content validity, so named because it is determined primarily through judgment; it comprises item validity and sampling validity. Gay also insists that content validity is determined by expert judgment: there is no formula by which it can be computed and no way to express it quantitatively. Therefore, in this study the writer relied on expert judgment to ensure that the test used had content validity. Based on Gay's theory of validity, the validity of the instrument that the writer used was content validity.

3.9.2 Reliability

According to Harris (1969:14), reliability means the stability of test scores. Test reliability is affected by a number of factors, chief among them the adequacy of the sampling of tasks. In addition, Gay (1987:135) says that reliability is the degree to which a test consistently measures whatever it measures. Reliability refers to the consistency of test results, that is, the consistency of the students' achievement (Gronlund, 1986:125): a reliable test will yield the same result each time. In this study, the writer used scorer/rater reliability. Gay (1987:141) states that scorer/rater reliability refers to situations in which reliability must be investigated, such as essay tests, short-answer tests involving more than one-word responses, rating scales, and observation instruments. In this study, the writer used the assessment criteria by Walter Bartz.
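As an illustration of how scorer/rater reliability can be estimated, two raters' scores for the same set of performances are often correlated: the closer the correlation is to 1.0, the more consistent the scoring. The sketch below computes a Pearson correlation between two hypothetical raters; the score values are invented for illustration and are not data from this study.

```python
# Sketch: estimating scorer/rater reliability as the Pearson
# correlation between two raters' scores for the same students.
# The scores below are hypothetical, for illustration only.
import math

def pearson(xs, ys):
    """Pearson product-moment correlation between two score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical speaking-test scores from two raters for the same six students
rater_1 = [12, 9, 14, 10, 13, 8]
rater_2 = [11, 9, 15, 10, 12, 8]

r = pearson(rater_1, rater_2)
print(round(r, 3))  # a value near 1.0 indicates high scorer/rater reliability
```

This is only one common way of quantifying rater consistency; percent agreement or other coefficients could serve the same purpose.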

CHAPTER IV DATA ANALYSIS AND INTERPRETATION

4.1 The Result of the Study

This study was conducted in four activities, two of which were teaching-learning activities. The first activity was the pre-test, held on March 10th, 2007. The second activity was held on March 14th, 2007, and the third on March 17th, 2007. The last activity was the post-test, conducted on March 21st, 2007.

4.2 Data Analysis

Analysis means the categorizing, ordering, manipulating, and summarizing of data to obtain answers to research questions (Kerlinger, 1988:125). The purpose of analysis is to reduce data to an intelligible and interpretable form so that the relations of the research problem can be studied. In scoring the test, the students were called out in turn and the writer tested them by giving dialogues related to the material. In giving scores, the writer followed the rating scale developed by Walter Bartz (Bartz, cited in Valette, 1983:150). It covers four items to be scored: fluency, quality of communication, amount of communication, and effort to communicate. However, in this study the writer did not give a score for amount of communication because the students did not create the dialogues; they only memorized the given dialogues.
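The scoring scheme described above can be sketched as a simple sum over the three criteria actually used. The 0-5 point range per criterion is a hypothetical assumption for illustration (Bartz's scale defines its own point values); the sketch only shows how amount of communication is excluded from the total.

```python
# Sketch: totaling a student's speaking score from the three Bartz
# criteria used in this study. Amount of communication is excluded
# because the students only memorized the given dialogues.
# The 0-5 point range per criterion is an assumed value, not Bartz's.
CRITERIA = ("fluency", "quality_of_communication", "effort_to_communicate")
MAX_POINTS = 5  # assumed maximum per criterion

def total_score(ratings):
    """Sum the ratings for the three criteria, validating each value."""
    for criterion in CRITERIA:
        value = ratings[criterion]
        if not 0 <= value <= MAX_POINTS:
            raise ValueError(f"{criterion} must be between 0 and {MAX_POINTS}")
    return sum(ratings[c] for c in CRITERIA)

# Hypothetical ratings for one student
student = {"fluency": 4, "quality_of_communication": 3, "effort_to_communicate": 5}
print(total_score(student))  # 12
```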