METHODS OF THE RESEARCH

Content validity is a non-statistical type of validity that involves the systematic examination of the test content to determine whether it covers a representative sample of the behavior domain to be measured (Anastasi & Urbina, 1997: 14). Face validity is the extent to which a test is subjectively viewed as covering the concept it purports to measure; it refers to the transparency or relevance of a test as it appears to test participants (Holden, 2010: 637). In this study, the validity of the writing test covered content and construct validity.

a. Content Validity of the Test

Furlong and Lovelace (2000: 72) state that a test has content validity when its items accurately represent the concept being measured. The writing test was developed in reference to the standard competency and basic competencies stated for the first year, second semester, of the English subject. This means that all the materials covered in the test reflect the materials given to the tenth grade students. The test was considered valid in terms of content validity since the writing test constitutes a representative sample of the language skills and structures, and the materials used were chosen based on the 2006 English Curriculum of KTSP for tenth grade students.

b. Construct Validity of the Test

Construct validity is achieved by considering whether the test measures only the ability it is supposed to measure. In this study, three aspects were assessed in students' writing, namely grammar, organization, and vocabulary, as suggested by Jacobson (2003).
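The thesis does not state how the three aspect scores were combined into a single writing score; a minimal illustrative sketch, assuming equal weighting and a 0-100 scale for each aspect (both assumptions, not taken from the thesis), could look like this:

```python
# Hypothetical sketch: combining the three assessed aspects (grammar,
# organization, vocabulary) into one writing score. Equal weighting and
# the 0-100 scale are assumptions; the actual rubric may differ.

def writing_score(grammar, organization, vocabulary):
    """Average of the three aspect scores (equal weights assumed)."""
    return (grammar + organization + vocabulary) / 3

# Hypothetical aspect scores for one student:
print(writing_score(80, 70, 75))  # -> 75.0
```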

3.6.2 Reliability of the Writing Test

The reliability of a research instrument is the degree of consistency and dependability with which the instrument measures the attribute. Wellington (2000: 200) states that reliability is also used in connection with research methods in order to estimate the degree of confidence in the data. Reliability refers to the extent to which a test or technique functions consistently and accurately by yielding the same results at different times or when used by different researchers.

In this research, inter-rater reliability was used. Inter-rater reliability is established when the results of the writing test are assessed using subjective judgment. It was applied to find out whether or not the writing scores given by the two raters were reliable. The researcher was the first rater, and a teacher of SMA Al-Kautsar was the second rater in gaining the students' scores. The second rater had 11 years of teaching experience and graduated from the S1 English Department, University of Lampung. After the raters produced their scores, the results were compared; when there was a high degree of agreement, the procedure could be considered reliable.

In order to determine the level of the instrument's reliability, a norm for categorizing the reliability coefficient was employed. The following table presents the adopted norm for categorizing the reliability coefficient. To determine the level of reliability of the scoring system, the Spearman Rank Correlation was applied to the data. The formula is:

r = 1 − (6 Σd²) / (N(N² − 1))

where:
r : coefficient of rank correlation
d : difference of rank between the first and the second rater
N : number of students
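The Spearman formula above can be sketched in code. The score lists below are hypothetical examples, not data from this study; the tie handling (average ranks) is a common convention, though the simple 1 − 6Σd²/(N(N²−1)) formula is only exact when there are no tied scores.

```python
# Illustrative sketch: Spearman Rank Correlation between two raters' writing
# scores, r = 1 - 6*sum(d^2) / (N*(N^2 - 1)). Scores below are hypothetical.

def ranks(scores):
    """Assign ranks (highest score -> rank 1), averaging ranks over ties."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    rank = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # extend j over any run of tied scores
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            rank[order[k]] = avg
        i = j + 1
    return rank

def spearman_rho(x, y):
    """Rank correlation coefficient between two score lists of equal length."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))  # sum of squared rank differences
    return 1 - (6 * d2) / (n * (n ** 2 - 1))

rater1 = [75, 80, 68, 90, 72]  # hypothetical first-rater scores
rater2 = [70, 85, 66, 88, 74]  # hypothetical second-rater scores
print(spearman_rho(rater1, rater2))  # -> 0.9
```

A coefficient near 1 indicates a high degree of agreement between the raters, which is what the procedure above treats as evidence of reliability.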