and also any kind of assessment to be effective and appropriate. As proposed by
Brown (2003, p. 19), there are five principles of language assessment that can be used to judge whether an assessment is effective.
This paper focuses on three of the principles proposed by Brown (2003): reliability, validity, and authenticity. These three principles are used
to examine how far the Automatic English module for fourth graders in the second semester fulfills the principles of language assessment
and to identify its strengths and weaknesses, as stated in the purpose of this study.
a. Reliability
A reliable test is consistent and dependable. This means that if a teacher gives the same test to the same students or
to matched students on two different occasions, the test should yield similar results (Brown, 2003, p. 21).
Establishing reliability is a prerequisite for establishing validity. Although a valid assessment is by necessity reliable, the converse is not true: a reliable
assessment is not necessarily valid. The issue of the reliability of a test may best be addressed by considering a number of factors that may contribute to the
unreliability of a test (adapted from Mousavi, 2002, p. 804). The first factor comes from the students (student-related reliability), and the second
comes from the rater’s scoring (rater reliability). Test administration (test administration reliability) also contributes to
the unreliability of a test. The last factor is the test itself (test reliability).
This study focuses on one of the factors that may contribute to the
unreliability of a test, namely test reliability. Test reliability deals with the nature of the test itself, which can sometimes cause measurement errors.
Inappropriate time allotment may affect the performance of students who do not perform well on a test with a time limit. Another factor contributing to
the unreliability of a test is poorly written test items that are ambiguous or have more than one correct answer.
b. Validity
Everitt (2002, p. 388) defines validity as the extent to which a measuring instrument is measuring what was intended. In other words, a
valid test is a test that measures what it is intended to measure. It requires students to perform tasks that were included in the previous classroom lessons.
Brown (2003, p. 22) explains that there is no final, absolute measure of validity, but several different kinds of evidence may be invoked in support. The
kinds of evidence that may support the validity of a test are content validity, criterion-related validity, construct validity, consequential validity, and face validity.
Two of these, content validity and face validity, are used and discussed to help the researcher analyze the Automatic English module.
A test is said to be content-valid if it requires the students to
perform the behavior that is being measured. For example, if a teacher is trying to assess a student’s ability to speak, then a test that requires the learner to speak
within some sort of authentic context is a content-valid test. Face validity is the extent to which students view the assessment as fair, relevant, and useful for
improving learning (Gronlund, 1998, p. 210). Face validity refers to the degree to
which a test looks right and appears to measure the knowledge or abilities it claims to measure, based on the subjective judgment of the examinees who take it
(Mousavi, 2002, p. 244). In other words, face validity means that the students perceive the test to be valid. Brown (2003, p. 27) states that face validity will
likely be high if learners encounter a well-constructed test with familiar tasks in an expected format, a test that is clearly doable within the allotted time limit, items that are
clear and uncomplicated, directions that are crystal clear, tasks that relate to their course book (content validity), and a difficulty level that presents a reasonable
challenge. Content validity is a very important ingredient in achieving face validity. If a test samples the actual content of what the learner has achieved or
expects to achieve, then face validity is more likely to be perceived.
c. Authenticity