[Figure 2.1: Kemp’s Instructional Design Model. Components shown: goals, topics, and general purposes; learners’ characteristics; learning objectives; subject content; pre-assessment; teaching/learning activities and resources; support services; evaluation; revision.]
Source: Kemp, 1977: 9
3. Language Testing
In teaching, there are three closely related components: objectives, teaching activities, and testing. Their relation is shown by the effect of each component on the others. Tests are part of teaching activities because, in the teaching process, teachers need to observe students’ progress in learning the materials by giving them some form of assessment (Brown, 2004: 4). Tests are a subset of assessment. According to Brown (2004: 3), a test, in simple terms, is a method of measuring a person’s ability, knowledge, or performance in a given domain. First, the method must be explicit and structured to qualify as a test.
Second, a test should measure general ability or specific competencies and objectives. Third, a test measures an individual’s ability, knowledge, or performance. It means that the test measures each student’s own ability, so the teacher needs to know who the test takers are. Finally, a test measures a given domain, which means that the test shows the students’ level in a particular area of ability or knowledge.
According to Bachman and Palmer (1997: 4), large-scale English as a Foreign Language tests usually include sections testing grammar, vocabulary, reading, and listening comprehension. These tests treat language ability as a set of components, namely grammar, vocabulary, pronunciation, and spelling. Besides evaluating students’ language ability, language tests can also be used as a tool for clarifying instructional objectives and for evaluating the relevance of the objectives, the materials, and the activities to the students following the program (Bachman and Palmer, 1997: 4).
a. Types of Language Tests
The basic distinction in language testing involves two families of tests that perform two very different functions (Brown, 2005: 1). First, a norm-referenced test (NRT) helps administrators and teachers make program-level decisions, such as proficiency and placement decisions. An NRT measures general language ability or proficiency. It provides scores that are interpreted in relation to the mean, median, standard deviation, and percentile rank (Brown, 2004: 7). Second, a criterion-referenced test (CRT) helps teachers make classroom-level decisions, such as diagnostic and achievement decisions. A CRT measures specific objectives based on language points.
Tests can be distinguished by several criteria. Based on the procedure, tests can be divided into formal and informal tests. According to Brown (2004: 5), informal tests can take a number of forms, starting with incidental, unplanned comments and responses, along with coaching and other impromptu feedback to the students. In contrast, formal tests are exercises or procedures specifically designed to tap into a storehouse of skills and knowledge (Brown, 2004: 6). They are systematic, planned sampling techniques constructed to give teachers and students an appraisal of student achievement.
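The norm-referenced interpretation mentioned earlier in this subsection (a score read against the mean, median, standard deviation, and percentile rank of the group) can be illustrated with a small sketch. The scores, the function names, and the percentile-rank convention used here (percentage of scores below, plus half of the tied scores) are assumptions chosen for illustration; they are not taken from Brown.

```python
from statistics import mean, median, pstdev


def percentile_rank(scores, score):
    """Percentage of scores below `score`, counting half of any ties.

    This is one common convention; other definitions exist.
    """
    below = sum(1 for s in scores if s < score)
    equal = sum(1 for s in scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(scores)


def norm_referenced_summary(scores):
    """Group statistics against which an individual score is interpreted."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        # Population standard deviation: the group itself is the norm group.
        "sd": pstdev(scores),
    }


# Hypothetical scores of ten test takers (illustrative data only).
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
summary = norm_referenced_summary(scores)
rank_of_90 = percentile_rank(scores, 90)
```

Under this convention, a student who scores 90 in the hypothetical group is ranked relative to the other test takers and not against a fixed mastery criterion, which is the point of contrast with a criterion-referenced test.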
Next, tests can be characterized as formative or summative. A formative test evaluates students in the process of forming their competencies and skills, with the goal of helping them to continue that growth process (Brown, 2004: 6). In contrast, a summative test measures and summarizes what a student has grasped and typically occurs at the end of a course or unit of instruction (Brown, 2004: 6).
In terms of approaches to language testing, tests are divided into discrete-point tests and integrative tests (Oller et al., 1979). A discrete-point test is constructed on the assumption that language can be broken down into its component parts and that those parts can be tested successfully (Brown, 2004: 8). In contrast, an integrative test measures language competence as a unified set of interacting abilities that cannot be tested separately (Oller et al., 1979). In accordance with the above explanation, the test in this study can be classified as a criterion-referenced, formal, summative, and discrete-point test.
b. Test Qualities
To determine whether a test is effective and useful, several test qualities need to be considered. According to Brown (2004: 19), these are practicality, reliability, validity, authenticity, and washback. Further, Bachman and Palmer (1997: 17) added qualities such as interactiveness and impact. Those qualities are explained as follows:
1) Practicality
Practicality means that the test is not expensive, stays within appropriate time constraints, is easy to administer, and is easy to score. Bachman and Palmer (1997: 36) determined practicality by considering the resources that would be required to develop a test with the balance of qualities the teachers want, and the allocation and management of the resources that are available.
2) Reliability
A reliable test is consistent and dependable. It means that the same test should yield similar results if it is given to the same students on a different occasion. However, a number of factors may contribute to the unreliability of a test, namely fluctuations in the student, in scoring, in test administration, and in the test itself (Mousavi, 2002: 804). Reliability can be considered to be a function of the consistency of scores from one set of tests and test tasks to another (Bachman and Palmer, 1997: 19). It means that it should not make any difference to a particular test taker whether he or she takes the test on a different occasion or in a different setting.
3) Validity
Validity means that the inferences made from the test results are appropriate, meaningful, and useful in terms of the purpose of the test (Gronlund, 1998: 226). It means that the test really measures what the teacher wants to measure. For example, a valid reading test measures students’ reading ability by asking the students to read and to answer comprehension questions. Moreover, the test designer has to consider content-related evidence, criterion-related evidence, construct-related evidence, consequential validity, and face validity (Brown, 2004: 26-27). Content-related evidence means that the content of the test matches the ability being tested. For example, if a teacher tests speaking, she should give a test that requires the students to speak. Criterion-related evidence concerns the extent to which the results are supported by other concurrent performance and predict the test taker’s likelihood of future success. Construct-related evidence relates to any theory, hypothesis, or model that attempts to explain the area of language ability the teacher wants to measure. Consequential validity refers to all the consequences of a test. Face validity refers to the appearance of the test, that is, whether it looks as if it measures what it is supposed to measure.
4) Authenticity
Authenticity is the degree of correspondence of the characteristics of a given language test task to the features of a target language use task (Bachman and Palmer, 1997: 23). An authentic test may present natural language, contextualized items, topics that are relevant and interesting to the learners, thematically organized items, and real-world tasks.
5) Interactiveness
Bachman and Palmer (1997: 25) defined interactiveness as the degree and type of involvement of the test taker’s individual characteristics in accomplishing a test task. Interactiveness is a quality of any task, so both target language use tasks and test tasks can vary in their interactiveness.
6) Washback or Impact
Washback or impact is the effect of the test on students. More broadly, a variety of individuals will be affected by, and thus have an interest or hold a ‘stake’ in, the use of a given test in a particular situation (Bachman and Palmer et al., 2004).
c. Designing Classroom Language Tests
To design a classroom language test, a designer needs to determine the purpose of the test, the objectives of the test, the test specifications, the selection of test tasks and the arrangement of the separate items, and the kind of scoring, grading, and feedback (Brown, 2004: 42). Defining the purpose helps the designer to choose the right kind of test. Based on purpose, tests are divided into aptitude tests, proficiency tests, placement tests, diagnostic tests, and achievement tests (Brown, 2004: 43). Each type is explained below.
1) Aptitude Test
A language aptitude test is designed to measure capacity or general ability to learn a foreign language and ultimate success in that undertaking (Brown, 2004: 43).
2) Proficiency Test
A proficiency test measures overall ability. It has traditionally consisted of standardized multiple-choice items on grammar, vocabulary, reading comprehension, and aural comprehension (Brown, 2004: 44). Such tests are almost always summative and norm-referenced. An example is the Test of English as a Foreign Language (TOEFL®).
3) Placement Test
A placement test aims to place students into a particular level or section of a language curriculum or school (Brown, 2004: 45). It usually includes a sampling of the material to be covered in the various courses in a curriculum.
4) Diagnostic Test
A diagnostic test is designed to diagnose specified aspects of a language (Brown, 2004: 46). For example, a pronunciation test might diagnose the phonological features that are difficult for the learners.
5) Achievement Test
An achievement test is related directly to classroom lessons, units, or even a total curriculum (Brown, 2004: 47). The test focuses on the objectives in question, and thus it is limited to particular materials. In this study, the designed test is categorized as an achievement test because it is related directly to the curriculum.
4. Reading Skill