
to be done by the students and arranged according to a particular topic, theme, or unit. The module is used by the teacher as an ongoing assessment of whether or not students have achieved the goals or objectives of the lesson being discussed.

3. The fourth graders of SD Teruna Bangsa

The fourth graders of SD Teruna Bangsa in this paper are students aged eight to ten years. The class consists of thirty students from different social backgrounds. Teruna Bangsa is a private school in Yogyakarta, located at Villa Seturan Indah D10, Caturtunggal, Depok, Sleman, Yogyakarta.

4. Authenticity

Authenticity is one of the language assessment principles. Bachman and Palmer (1996, p. 23) define authenticity as “the degree of correspondence of the characteristics of a given language test task to the features of a target language task,” and then suggest an agenda for identifying those target language tasks and for transforming them into valid test items. Basically, an assessment is said to be authentic when the tasks or exercises use language that is as natural as possible, are based on real-world tasks, present contextualized items, and have meaningful topics and organization (Brown, 2003, p. 28).

5. Reliability

Reliability is one of the language assessment principles; it concerns the consistency and dependability of an assessment. This principle is affected by several factors: student-related reliability, rater (teacher) reliability, test administration reliability, and test reliability. This paper focuses on test reliability, which deals with the nature of the test itself. Test unreliability may come from inappropriate time allotment (the amount of time provided does not suit the test items) and from poorly written test items that are ambiguous or that have more than one correct answer (Brown, 2003, p. 28).

6. Validity

An assessment is said to be valid if it measures what it is intended to measure. Validity is the extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment (Gronlund, 1998, p. 226). There is no final, absolute measure of validity, but several different kinds of evidence may be used to examine the validity of the test items. Brown (2003, p. 22) proposes five types of evidence: content validity (content-related evidence), criterion-related evidence, construct-related evidence, consequential validity, and face validity. Of these, two are discussed in this paper: content validity and face validity.

Brown (2003, p. 22) defines content validity as a situation in which a test requires the test-taker to perform the behavior that is being measured. Content validity can be identified observationally if the achievement being measured can be clearly defined. In other words, content validity concerns whether the test requires students to perform the previously learned lessons and whether it represents the objectives of the unit on which the assessment is based.

The extent to which “students view the assessment as fair, relevant, and useful for improving learning” (Gronlund, 1998, p. 210) is popularly known as face validity. Mousavi (2002, p. 244) explains that face validity refers to the degree to which a test looks right and appears to measure the knowledge or abilities it claims to measure, based on the subjective judgment of the examinees who take it, the administrative personnel who decide on its use, and other psychometrically unsophisticated observers. In this paper, face validity is seen from the students’ point of view. A test is said to be face-valid if it has clear directions, a logically organized structure, an appropriate difficulty level, no ‘surprises’, and appropriate timing (Brown, 2003, p. 27).

B. Research Method