Reliability Validity Principles of Language Assessment

and also any kind of assessment to be effective and appropriate. As proposed by Brown (2003, p. 19), there are five principles of language assessment that can be used to judge whether a test or assessment is effective. This paper focuses on three of them: reliability, validity, and authenticity. These three principles are used to examine how far the Automatic English module for fourth graders in the second semester fulfills the principles of language assessment, and to identify its strengths and weaknesses, as stated in the purpose of this study.

a. Reliability

A reliable test is consistent and dependable. This means that if a teacher gives the same test to the same students, or to matched students, on two different occasions, the test should yield similar results (Brown, 2003, p. 21). Establishing reliability is a prerequisite for establishing validity. Although a valid assessment is by necessity reliable, the converse is not true: a reliable assessment is not necessarily valid. The reliability of a test may best be addressed by considering a number of factors that may contribute to its unreliability (adapted from Mousavi, 2002, p. 804). The first factor comes from the students themselves (student-related reliability), and the second comes from the rater's scoring (rater reliability). The administration of the test (test administration reliability) also contributes to unreliability. The last factor is the test itself (test reliability).

This study focuses on one of these factors, namely test reliability. Test reliability deals with the nature of the test itself, which can sometimes cause measurement errors. An inappropriate time allotment may affect the performance of students who do not perform well under a time limit. Another source of unreliability is poorly written test items that are ambiguous or have more than one correct answer.
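The idea that the same test given on two occasions "should yield similar results" is often quantified as a correlation between the two sets of scores. The following is only a minimal sketch of that idea, not a procedure taken from Brown (2003) or used in this study: the function name `pearson_r` and the score lists are hypothetical, and the Pearson correlation is assumed here as the measure of consistency.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first)
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    var_x = sum((x - mean_x) ** 2 for x in first)
    var_y = sum((y - mean_y) ** 2 for y in second)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of the same five students on two occasions.
first_sitting = [70, 85, 60, 90, 75]
second_sitting = [72, 83, 62, 91, 74]

r = pearson_r(first_sitting, second_sitting)
print(round(r, 3))
```

A coefficient close to 1 would suggest that the students were ranked in nearly the same way on both occasions. A low coefficient would point to measurement error from one of the sources listed above, although the number alone does not show which one.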

b. Validity

Everitt (2002, p. 388) defines validity as the extent to which a measuring instrument measures what was intended. In other words, a valid test is one that measures what it is intended to measure. Such a test requires students to perform tasks that were included in previous classroom lessons. Brown (2003, p. 22) explains that there is no final, absolute measure of validity, but that several different kinds of evidence may be invoked in its support. The kinds of evidence that may support the validity of a test are content validity, criterion-related validity, construct validity, consequential validity, and face validity. Of these, two are used and discussed to help the researcher analyze the Automatic English module: content validity and face validity. A test is said to be content-valid if it requires the students to perform the behavior that is being measured. For example, if a teacher is trying to assess a student's ability to speak, then a test that requires the learner to speak within some sort of authentic context is a content-valid test.

Face validity is the extent to which students view the assessment as fair, relevant, and useful for improving learning (Gronlund, 1998, p. 210). It refers to the degree to which a test looks right and appears to measure the knowledge or abilities it claims to measure, based on the subjective judgment of the examinees who take it (Mousavi, 2002, p. 244). In other words, face validity means that the students perceive the test to be valid. Brown (2003, p. 27) states that face validity will likely be high if learners encounter a well-constructed test with familiar tasks in an expected format, a test that is clearly doable within the allotted time limit, items that are clear and uncomplicated, directions that are crystal clear, tasks that relate to their course book (content validity), and a difficulty level that presents a reasonable challenge. Content validity is a very important ingredient in achieving face validity. If a test samples the actual content of what the learner has achieved or expects to achieve, then the learner is more likely to perceive the test as valid.

c. Authenticity