3. Achievement Test
According to Arthur Hughes, achievement tests are related to language courses, “their purpose being to establish how successful individual students, groups of students, or the courses themselves have been in achieving objectives.”¹⁶
It can be concluded that an achievement test is designed to measure students’ mastery of the objectives of the learning process. According to Woolfolk, the measurement of achievement can be divided into two types: formative and summative tests.¹⁷ This means that the teacher can measure students’ achievement by using formative and summative tests.
a. Formative Test
According to Baxter, “Formative evaluation is done during a process so that the process can be changed to make it more effective.”¹⁸ This indicates that a formative test is designed to measure an individual’s comprehension during a term. Another definition comes from Arthur Hughes, who states that assessment is formative when teachers use it to check on the progress of their students, to see how far they have mastered what they should have learned, and then use this information to modify their future teaching plans.¹⁹ This indicates that a formative test can be used to measure an individual’s mastery of a few chapters.
From the statements above, the writer concludes that a formative test is a kind of test constructed by the teacher to find out how successful the teaching and learning process in the classroom has been over a given period.
¹⁶ Arthur Hughes, Testing for Language Teachers, Second Edition, New York: Cambridge University Press, 2003, p. 13.
¹⁷ Anita E. Woolfolk, Educational Psychology, 4th Ed., Englewood Cliffs: Prentice-Hall, 1990, p. 530.
¹⁸ Andy Baxter, Evaluating Your Students, London: Richmond Publishing, 1997, p. 32.
¹⁹ Hughes, op. cit., p. 16.
b. Summative Test
According to Gronlund, a “summative test typically comes at the end of a course or unit of instruction. It is designed to determine the extent to which the instructional objectives have been achieved and is used primarily for assigning course grades or for certifying pupil mastery of the intended learning outcomes.”²⁰ This implies that a summative test is used to gauge an individual’s skill after the whole teaching and learning process has finished. In other words, this kind of test is given at the end of the course and measures the extent to which the instructional objectives have been achieved.
Moreover, in The Practice of English Language Teaching, Harmer states, “Summative assessment, as the name suggests, is the kind of measurement that takes place to round things off or make a one-off measurement. Such tests include the end-of-year tests that students take or the big public exams which many students enter for.”²¹ To sum up, a summative test is constructed for the end of a period of study, after the students have finished all the material. Its purpose is to certify the students’ achievement and to judge the appropriateness of the material given, the method used, and the goals of the study.
c. The Proficiency Test
McNamara states that a proficiency test is designed to look ahead to future language use and is not related to any course or training that an individual has had.²² Meanwhile, according to Valette, a proficiency test examines an individual’s language ability in relation to specific language requirements.²³ It can be concluded that a proficiency test is a kind of test whose purpose is to measure an individual’s language mastery based on the required material.
²⁰ Norman E. Gronlund, Measurement and Evaluation in Teaching, 5th Edition, New York: Macmillan Publishing Company, 1981, pp. 11-12.
²¹ Jeremy Harmer, The Practice of English Language Teaching, 4th Ed., Essex: Pearson Education Limited, p. 379.
²² Tim McNamara, Language Testing, Hong Kong: Oxford University Press, 2000, p. 7.
D. The Criteria of a Good Test
According to Henning, there are three criteria that should be considered in making a good test: validity, reliability, and practicality.²⁴
1. Validity
Henning states, “Validity in general refers to the appropriateness of a given test or any of its component parts as a measure of what it is purported to measure.”²⁵ From this definition, the writer assumes that validity is essential because it concerns how well the test conforms to the goals of the curriculum or course of study.
2. Practicality
Practicality means that the test should be easy to administer. “Practicality is concerned with a wide range of factors of economy, convenience, and interpretability that determine whether a test is practical for widespread use.”²⁶ This means that practicality relates to the administration of the test, which involves economy, convenience, and interpretability.
Based on the statement above, it can be inferred that three aspects are important to the practicality of a test: efficiency of time, efficiency of cost, and ease of administration.
²³ Valette, op. cit., p. 6.
²⁴ Grant Henning, A Guide to Language Testing: Development, Evaluation, Research, Cambridge: Newbury House Publishers, 1987, p. 88.
²⁵ Ibid., p. 89.
²⁶ Robert L. Thorndike and Elizabeth Hagen, Measurement and Evaluation in Psychology and Education, New York: John Wiley & Sons Ltd, 1977, p. 160.
3. Reliability
The reliability of a test relates to the consistency of its results across different candidates and occasions. If the test is administered to different candidates at different times and yields the same results, the test is reliable.²⁷ From this statement, the writer concludes that a test can be called reliable if it produces the same scores when administered to different test takers at the same or different times.
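The consistency described above can be illustrated numerically. As a minimal sketch, and as an assumption not drawn from the sources cited here, the scores from two administrations of the same test can be compared with a Pearson correlation coefficient; values close to 1 suggest consistent results. The score lists and the function name pearson_r are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores of five test takers on two occasions (illustrative only).
first_sitting = [60, 72, 85, 90, 55]
second_sitting = [62, 70, 88, 91, 58]

reliability = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {reliability:.2f}")
```

A coefficient near 1 here would mean that test takers keep roughly the same rank order across the two sittings. What counts as an acceptable value depends on the purpose of the test and is not fixed by this sketch.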
E. Validity
Validity is often defined as the degree to which a test measures what it is intended to measure. A test is considered valid if its purpose corresponds to the skill to be measured. According to Gronlund, the validity of a test is the extent to which the test measures what it is supposed to measure.²⁸ This means that the test must provide a real measurement of the particular skill. Validity also refers to the appropriateness between the material given and the recommended skills that appear in the test.²⁹
To sum up, validity is fundamental in constructing a test, because without a match between the material and the skills in the test, the purpose of the test cannot be reached.
Concerning the types of validity, Hughes divides it into four: content validity, face validity, criterion-related validity, and construct validity.³⁰
²⁷ Heaton, op. cit., p. 162.
²⁸ Heaton, op. cit., p. 159.
²⁹ Norman E. Gronlund, Measurement and Evaluation in Teaching, New York: Macmillan Publishing Co., Inc., 1981, pp. 65-67.
³⁰ Arthur Hughes, op. cit., p. 26.
1. Content Validity
A test has content validity when its items are appropriate to the material and instruction given in the teaching and learning process.
According to Hopkins, content validity is the degree to which the test represents a sample of the content universe and/or the behavioral domain being measured.³¹ Moreover, Alderson et al. state that “content validity is the representativeness or sampling adequacy of the content - the substance, the matter, and the topics - of a measuring instrument.” The content here means the material in the curriculum. That is why this kind of validity is also called curricular validity.³²
Content validity is also defined as how well the test items represent the domain of tasks. In analyzing the content validity of a test, the tester should compare the test tasks with the test domain, which must be described carefully.
According to Wiersma and Jurs, there are two methods for demonstrating the content validity of a test. The first method is to list the specific objectives that the test is meant to reach, match them with each test item, and then decide whether each match is appropriate. The other method is to construct a table that classifies the items by content and taxonomic level, that is, the student outcome required by the item.³³
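The two methods can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the objectives, the items, the taxonomic level labels, and the function names are all hypothetical and are not taken from Wiersma and Jurs. The first function matches items to objectives and reports mismatches; the second counts items per content and level cell, as a table of this kind would.

```python
from collections import Counter

# Hypothetical course objectives (illustrative only).
objectives = {
    "O1": "Identify the main idea of a short text",
    "O2": "Use the simple past tense in sentences",
    "O3": "Infer word meaning from context",
}

# Hypothetical test items, each tagged with the objective it targets
# (None if it targets no listed objective) and an assumed taxonomic level.
items = [
    {"id": 1, "objective": "O1", "level": "comprehension"},
    {"id": 2, "objective": "O2", "level": "application"},
    {"id": 3, "objective": "O2", "level": "knowledge"},
    {"id": 4, "objective": None, "level": "knowledge"},
]


def match_items_to_objectives(objectives, items):
    """First method: match items with objectives and report mismatches."""
    unmatched_items = [it["id"] for it in items if it["objective"] not in objectives]
    covered = {it["objective"] for it in items if it["objective"] in objectives}
    uncovered_objectives = sorted(set(objectives) - covered)
    return unmatched_items, uncovered_objectives


def specification_table(objectives, items):
    """Second method: count items in each (objective, taxonomic level) cell."""
    return Counter(
        (it["objective"], it["level"])
        for it in items
        if it["objective"] in objectives
    )


unmatched, uncovered = match_items_to_objectives(objectives, items)
print("Items with no matching objective:", unmatched)
print("Objectives with no item:", uncovered)
for (objective, level), count in sorted(specification_table(objectives, items).items()):
    print(objective, level, count)
```

In this made-up data, an item that matches no objective or an objective that no item covers would both signal a gap in content validity. Judging whether a match is appropriate still requires the tester's own reading of each item.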
The methods mentioned above are known as the validation process. If a test lacks content validity, some problems may arise. The first problem is that the students cannot demonstrate the skills they have acquired from the material taught in class. The second problem is the students
³¹ Kenneth D. Hopkins, Educational and Psychological Measurement and Evaluation, Eighth Edition, Boston: Viacom Company, 1998, p. 77.
³² J. Charles Alderson et al., Language Test Construction and Evaluation, Cambridge: Cambridge University Press, 1995, p. 173.
³³ William Wiersma and Stephen G. Jurs, loc. cit.