THE DESIGN OF AN ACHIEVEMENT MULTIPLE-CHOICE ITEM

NARRATIVE READING TEST FOR THE TENTH GRADE PANGUDI

LUHUR SANTO YOSEF SURAKARTA SENIOR HIGH SCHOOL

  A THESIS

  Presented as Partial Fulfillment of the Requirements to Obtain the Sarjana Pendidikan Degree in English Language Education

  By Ruth Widyasari

  Student Number: 061214020

  ENGLISH LANGUAGE EDUCATION STUDY PROGRAM

  DEPARTMENT OF LANGUAGE AND ARTS EDUCATION

  FACULTY OF TEACHERS TRAINING AND EDUCATION

  SANATA DHARMA UNIVERSITY

  YOGYAKARTA

  2011

  Approved by

  Sponsor

  Caecilia Tutyandari, S.Pd., M.Pd.

  23 May 2011

  Dedication Page

  This thesis is proudly dedicated to:

  My beloved parents, Bernadheta Sri Rejeki and Kristian Budi Santoso, S.E.

  My dearest husband, Gede Manuel Kris Sudianto

  My beloved brother, Thomas Tirta Suminar

  All of my friends: Ria, Manda, Dewati, Tiwi, and all of the 2006 students of the English Language Education Study Program

  The readers of this thesis

  

STATEMENT OF WORK’S ORIGINALITY

  I honestly declare that this thesis, which I have written, does not contain the work or parts of the work of other people, except those cited in the quotations and references, as a scientific paper should.

  Yogyakarta, 7 June 2011 The Writer

  Ruth Widyasari 061214020

  

STATEMENT OF APPROVAL

FOR THE PUBLICATION OF SCIENTIFIC WORK FOR ACADEMIC PURPOSES

  I, the undersigned, a student of Sanata Dharma University:

  Name: Ruth Widyasari

  Student Number: 061214020

  For the advancement of knowledge, give to the Sanata Dharma University Library my scientific work entitled:

THE DESIGN OF AN ACHIEVEMENT MULTIPLE-CHOICE ITEM

NARRATIVE READING TEST FOR THE TENTH GRADE PANGUDI

LUHUR SANTO YOSEF SURAKARTA SENIOR HIGH SCHOOL

  I thereby grant the Sanata Dharma University Library the right to store it, convert it into other media, manage it in the form of a database, distribute it in a limited way, and publish it on the Internet or in other media for academic purposes, without needing to ask my permission or pay me royalties, as long as my name remains credited as the author. I make this statement truthfully.

  Made in Yogyakarta, on 7 June 2011

  Declared by Ruth Widyasari

  

ABSTRACT

  Widyasari, Ruth. 2011. The Design of an Achievement Multiple-Choice Item Narrative Reading Test for the Tenth Grade Pangudi Luhur Santo Yosef Surakarta Senior High School. Yogyakarta: English Language Education Study Program, Sanata Dharma University.

  This research focused on designing a multiple-choice item reading test for the tenth grade of Pangudi Luhur St. Yosef Surakarta Senior High School in the 2009/2010 academic year. Designing a good multiple-choice reading test was very challenging since multiple-choice tests are extremely difficult to design correctly. In this research, the writer designed a multiple-choice reading test as an achievement test. The writer limited the scope of reading as the skill to be tested to one genre, namely narrative. Narrative is taught in the tenth grade of senior high school. Furthermore, the writer also investigated the validity of the test by analyzing the test items and giving a questionnaire to the English teacher who taught the tenth grade of Pangudi Luhur St. Yosef Surakarta Senior High School. The aims of this research were: (1) to describe how the multiple-choice item reading test for the tenth grade students of Pangudi Luhur St. Yosef Senior High School is designed; (2) to present the multiple-choice item reading test for the tenth grade students of Pangudi Luhur St. Yosef Senior High School.

  This research was done by applying five steps of Borg and Gall's R & D (Research & Development) cycle (research and information collecting, planning, developing the preliminary form of the product, preliminary field testing, and main product revision) and combining those steps with Bachman and Palmer's test development, which consists of three stages: design, operationalization, and administration. By combining the five steps of the R & D cycle and the three stages of test development, the writer was able to describe how to design the multiple-choice item reading test and to present the multiple-choice item reading test for the tenth grade students of Pangudi Luhur St. Yosef Senior High School.

  The test consisted of five (5) narrative reading texts taken from the internet and twenty (20) questions with five (5) options each. The writer distributed the test to the tenth grade students and collected the answer sheets. From the answer sheets, the writer then investigated the construct validity of the test by calculating the Item Facility (IF) and Item Discrimination (ID). The writer also collected the results of the questionnaire from the English teacher who taught the tenth grade students to investigate content validity and face validity. After analyzing the validity of the test, the writer revised the test. Three (3) items had to be revised because of their very low item facility and item discrimination values, which meant that their construct validity was not good.

  In conclusion, the writer applied the five steps of Borg and Gall's R & D cycle and the three stages of Bachman and Palmer's test development, which helped the writer to design the test correctly. By carrying out these five steps and three stages, the writer could answer the problem formulations. Furthermore, after the revision was made, the test could be administered to tenth grade students of senior high school.

  

ABSTRAK

  Widyasari, Ruth. 2011. The Design of an Achievement Multiple-Choice Item Narrative Reading Test for the Tenth Grade Pangudi Luhur Santo Yosef Surakarta Senior High School. Yogyakarta: English Language Education Study Program, Sanata Dharma University.

  This research focused on designing a reading test in multiple-choice form for grade X (ten) of Pangudi Luhur St. Yosef Surakarta Senior High School in the 2009/2010 academic year. The writer chose the multiple-choice test for this research because it is rarely used as an achievement test. For an achievement test, the writer needed to consider the curriculum and syllabus as guidelines for constructing the test. The writer limited the scope of reading as the skill to be tested to narrative texts. Narrative texts are taught in grade X of senior high school. Furthermore, the writer also investigated the validity of the test by analyzing the multiple-choice items of the test and giving a questionnaire to the English teacher who taught grade X at Pangudi Luhur St. Yosef Surakarta Senior High School. The aims of this research were: (1) to explain how the multiple-choice reading test for grade X students of Pangudi Luhur St. Yosef was designed; (2) to present the multiple-choice reading test for grade X students of Pangudi Luhur St. Yosef Surakarta Senior High School.

  Furthermore, this research was conducted by applying five steps of Borg and Gall's R & D cycle (research and information collecting, planning, developing the preliminary form of the product, preliminary field testing, and main product revision), combined with Bachman and Palmer's test development, which consists of three stages: design, operationalization, and administration. By combining the five steps of the R & D cycle and the three stages of test development, the writer was able to explain how to design the multiple-choice reading test and to present the multiple-choice reading test for grade X students of Pangudi Luhur St. Yosef Surakarta Senior High School.

  The test consisted of five (5) narrative reading texts taken from the internet and twenty (20) questions with five (5) options each. The writer then gave the test to the grade X students and collected the answer sheets. From the answer sheets, the writer examined construct validity by calculating item facility and item discrimination. The writer also collected the questionnaire results from the English teacher who taught the grade X students to investigate content validity and face validity. After analyzing the validity of the test, the writer revised it. Three items had to be revised because of their low item facility and item discrimination values, which meant that their construct validity was not good.

  In conclusion, the writer applied the five steps of Borg and Gall's R & D cycle and the three stages of Bachman and Palmer's test development, which helped the writer design the test correctly. By carrying out the five steps of R & D and the three stages of test development, the problem formulations were answered. Furthermore, after revision, the test can be applied to grade ten of senior high school.

  ACKNOWLEDGEMENTS

  In completing this thesis, I did not work alone. Several people helped me finish it by giving me support and guidance while I wrote the thesis and did the research. On this occasion, I would like to express my gratitude to some special people.

  First of all, I thank my sponsor, Caecilia Tutyandari, S.Pd., M.Pd., who patiently gave me guidance and support in writing this thesis and doing the research, from the beginning until I could finish it. I still remember how I started writing this thesis enthusiastically and then, in the ninth semester, did nothing. Finally, I did it. I am grateful for her assistance while I was finishing my thesis.

  My special gratitude goes to the principal of Pangudi Luhur St. Yosef Surakarta Senior High School, Br. Agustinus Mujiya, S.Pd., and all of the teachers at Pangudi Luhur St. Yosef Surakarta Senior High School, who gave me the opportunity to do my research. My special thanks go to Maria Olise C. H., S.Pd., the English teacher of the tenth grade of Pangudi Luhur St. Yosef Senior High School in the 2009/2010 academic year, who always supported me in doing the research and finishing this thesis and who also became a partner with whom I could discuss everything.

  My thanks also go to the tenth grade students of Pangudi Luhur St. Yosef Surakarta Senior High School in the 2009/2010 academic year. All of the students did a great job in helping me finish the research.

  My deepest gratitude goes to my beloved parents, Bernadheta Sri Rejeki and Kristian Budi Santoso, S.E., my brother, Thomas, and my dearest husband, Manuel. These people, whom I love so much, never stopped asking me to finish this thesis as soon as possible. I really appreciate their love, support, and prayers.

  My gratitude also goes to all of my friends and all of my lecturers. Their contribution to my study is very precious to me. I have learned many things from them.

  Finally, "May He give you the desire of your heart and make all your plans succeed" (Psalm 20:4). God has given me success in completing this thesis. I hope that God also gives all of the people who helped me success in their lives.

  Ruth Widyasari

  

TABLE OF CONTENTS

  TITLE PAGE
  APPROVAL PAGES
  STATEMENT OF WORK’S ORIGINALITY
  STATEMENT OF APPROVAL FOR PUBLICATION (LEMBAR PERNYATAAN PERSETUJUAN PUBLIKASI)
  ABSTRACT
  ABSTRAK
  ACKNOWLEDGEMENTS
  TABLE OF CONTENTS
  LIST OF TABLES
  LIST OF FIGURES
  LIST OF APPENDICES

  CHAPTER I. INTRODUCTION
    A. Research Background
    B. Problem Formulations
    C. Problem Limitation
    D. Research Objectives
    E. Research Benefits
    F. Definition of Terms

  CHAPTER II. REVIEW OF RELATED LITERATURE
    A. Theoretical Description
      1. Reading
      2. Test
      3. Design and Test Development
    B. Theoretical Framework

  CHAPTER III. METHODOLOGY
    A. Research Method
    B. Research Participants
    C. Research Instruments
    D. Data Gathering Technique
    E. Data Analysis Technique
    F. Research Procedure

  CHAPTER IV. RESEARCH RESULTS AND DISCUSSION
    A. How the Achievement Multiple-Choice Item Narrative Reading Test is Designed
      1. Research and Information Collecting
      2. Planning
      3. Developing Preliminary Form of Product
      4. Preliminary Field Testing
      5. Main Product Revising
    B. What the Achievement Multiple-Choice Item Narrative Reading Test Looks Like

  CHAPTER V. CONCLUSIONS AND SUGGESTIONS
    A. Conclusions
    B. Suggestions

  REFERENCES
  APPENDICES

  LIST OF TABLES

  Table 4.1 Design Statement of the Test
  Table 4.2 Blueprint of the Test
  Table 4.3 The Results of Item Facility (IF) and Item Discrimination (ID)
  Table 4.4 The Test Revision

  LIST OF FIGURES

  Figure 1.1 Test, Achievement, and Teaching
  Figure 2.1 The Formula of Final Score
  Figure 2.2 Test Development
  Figure 3.1 The Combination of R & D and Bachman’s and Palmer’s Test Development

  LIST OF APPENDICES

  Appendix 1 The Interview
  Appendix 2 The Lesson Plan
  Appendix 3 The Syllabus
  Appendix 4 The Texts
  Appendix 5 The Answer Sheet
  Appendix 6 The Questionnaire
  Appendix 8 The Result of Counting IF and ID

   

  

CHAPTER I

INTRODUCTION

  In Chapter I, the writer elaborates the research background, problem formulations, problem limitation, research objectives, research benefits, and definition of terms.

A. Research Background

  English is one of the most important subjects in senior high school in Indonesia because it is one of the subjects tested in the National Examination. English plays an important role in the National Examination because it determines the students’ graduation. For this reason, the process of teaching and learning English is also essential. In senior high school, English is taught from the tenth grade up to the twelfth grade. The process of teaching English also involves assessment and testing to monitor the students’ mastery of English. The relationship among teaching, assessment, and tests can be seen in Figure 1.1 below:

  [Diagram: Tests nested within Assessment, nested within Teaching]

  Figure 1.1: Test, achievement, and teaching (Brown, 2004: 5)

  Figure 1.1 shows that the teaching process cannot be separated from assessment and tests. Tests are a part of assessment and teaching, while assessment is also a part of teaching.

  In this research, the writer is challenged to design a test since tests are very important in the teaching and learning process. Tests are very useful for recording the students’ achievement and as tools for the teacher to develop the teaching and learning process. Brown (2004: 4) also states that

  Tests are prepared administrative procedures that occur at identifiable times in a curriculum when learners muster all their faculties to offer peak performance, knowing that their responses are being measured and evaluated.

  In this research, the writer focuses on designing a multiple-choice item reading test. Reading, as one of the language skills in learning English, is chosen for the test because reading is an essential skill for success in all educational contexts (Brown, 2004: 185). Besides, in the National Examination, most of the questions test the reading skill.

  There are several techniques of language testing that can be applied in schools, such as short-answer tests, fill-in-the-blank tests, and multiple-choice item tests.

  In this research, the writer designs a multiple-choice item reading test as an achievement test. According to Hughes in Brown (2004: 55), multiple-choice items, which may appear to be the simplest kind of item to construct, are extremely difficult to design correctly. Therefore, designing a multiple-choice item reading test correctly will be challenging for the writer. In addition, an achievement test is related directly to classroom lessons, units, or even a total curriculum (Brown, 2004: 47). The writer designs the multiple-choice test as an achievement test because the writer wants to monitor the students’ understanding after they have been taught a certain topic.

  As stated above, the writer focuses on designing a multiple-choice reading test; therefore, the writer also considers the reading performance of the students. There are four types of reading performance: perceptive, selective, interactive, and extensive (Brown, 2004: 189). In this thesis, the writer focuses on measuring students’ performance in interactive reading.

  …interactive reading types are stretches of language of several paragraphs to one page or more in which the reader must, in a psycholinguistic sense, interact with the text. Typical genres that lend themselves to interactive reading are anecdotes, short narratives and descriptions. The focus of an interactive task is to identify relevant features (lexical, symbolic, grammatical, and discourse) within texts of moderately short length with the objective of retaining the information that is processed (Brown, 2004: 189).

  In addition, several reading genres are taught in schools, among others descriptive, narrative, recount, and report. To limit the scope of the test, the writer chose narrative. Narrative is chosen because it is a classroom lesson taught in the second semester of the tenth grade of senior high school, as stated in the syllabus. Furthermore, one interesting feature of narrative texts in particular is that they appear to induce visualization in the reader as part of the reading process: readers report ‘seeing’ scenes in their head when they read such texts (Alderson, 2000: 64).

  The test is administered to respondents from the tenth grade of Pangudi Luhur St. Yosef Surakarta Senior High School because in the tenth grade all students study the same topics. The students have not yet been divided into departments such as IA (Ilmu Alam, Natural Sciences), IS (Ilmu Sosial, Social Sciences), or Bahasa (Language). This study aims to design a multiple-choice item test and investigate the validity of the test. The writer will only investigate the validity of the test because it is the quality that provides the major justification for using test scores or numbers as a basis for making inferences or decisions. In addition, evaluating the overall usefulness of a given test is essentially subjective, since this involves value judgments on the part of the test developer (Bachman and Palmer, 1996: 19).

  This study is very challenging since it relates to the curriculum and syllabus, the participation of students at Pangudi Luhur St. Yosef Surakarta Senior High School, the English teacher of the tenth grade students at Pangudi Luhur St. Yosef Senior High School, and the test itself.

  B. Problem Formulations

  In this research, the writer proposed the following questions:

  1. How is the achievement multiple-choice item narrative reading test designed?

  2. What does the achievement multiple-choice item narrative reading test look like?

  C. Problem Limitation

  In this study, the writer focuses on designing a multiple-choice item reading test for the tenth grade students of Pangudi Luhur St. Yosef Surakarta Senior High School and on investigating the validity of the test, including face validity, content validity, and construct validity.

  D. Research Objectives

  There are two objectives in designing a multiple-choice reading test for the tenth grade students of Pangudi Luhur St. Yosef Senior High School:

  1. To describe how the achievement multiple-choice item narrative reading test for the tenth grade students of Pangudi Luhur St. Yosef Senior High School is designed;

  2. To present the achievement multiple-choice item narrative reading test for the tenth grade students of Pangudi Luhur St. Yosef Senior High School.

  E. Research Benefits

  There are three benefits of this research, as follows:

  1. This study contributes to the work of the English teacher of the tenth grade of Pangudi Luhur St. Yosef Senior High School. The teacher will have an alternative way to test the students’ reading ability in narrative by using a multiple-choice reading test, since multiple-choice tests are infrequently used as achievement tests for classroom lessons.

  2. This study benefits the writer by answering the research problems. In addition, as a teacher candidate, the writer needs to know how to make a multiple-choice item reading test and how to measure its quality.

  3. Hopefully, this study will contribute to the work of other researchers who want to conduct research related to designing multiple-choice tests and measuring the quality of multiple-choice item reading tests.

F. Definition of Terms

  In this study, there are four terms which are often used by the writer. To clarify the meaning of the terms, the writer provides a list below:

  1. Design in general is creating a new set of materials that fit the learning objectives and specific subject area of particular learners (Hutchinson and Waters, 1994: 106). Design in test development, in turn, is the process of describing the purpose(s) of the test, identifying and describing tasks in the TLU (Target Language Use: a context in which the test takers will be using the language outside the test itself) domain, describing the characteristics of the language users/test takers, defining the construct to be measured, developing a plan for evaluating the qualities of usefulness, identifying resources, and developing a plan for their allocation and management (Bachman and Palmer, 1996: 86-89).

  2. Reading is a transaction between the reader and the text in which the reader’s interpretation reflects both the meaning intended by the author and the meaning constructed by the reader (Armbruster & Osborn, 2002: 7). Reading for general comprehension is the ability to understand information in a text and interpret it appropriately (Grabe & Stoller, 2002: 9).

  3. Test is a method of measuring a person’s ability, knowledge, or performance in a given domain (Brown, 2004: 3). The domain may be overall proficiency in a language, that is, general competence in all skills of a language (Brown, 2004: 4). In this study, the given domain is reading comprehension of narrative texts, including obtaining main idea (topic), expressions/idioms/phrases in context, inference (implied detail), grammatical features, detail, excluding facts not written, supporting idea(s), and vocabulary in context.

  4. Multiple-choice tests are the tests that have a stem, which presents a stimulus, and several (usually between three and five) options or alternatives to choose from (Brown, 2004: 56).

CHAPTER II

REVIEW OF RELATED LITERATURE

  In this chapter, the writer presents the theoretical description, drawn from experts, which is useful for conducting the research, and the theoretical framework of the study. The theoretical description covers reading, tests, and design and test development. In the theoretical framework, the writer summarizes and synthesizes the theories that are useful for conducting the research and solving the research problems.

A. Theoretical Description

  The writer uses some theories as a basis in conducting this research. They are the theoretical description of reading, test, and design and test development.

  The theoretical description of reading is divided into definition of reading, types of reading and reading tasks. The theoretical description for the test is divided into the definition of the test, the purpose of the test, the format of the test, test validity, marking and scoring.

1. Reading

a. Definition of Reading

  Reading is a transaction between the reader and the text in which the reader’s interpretation reflects both the meaning intended by the author and the meaning constructed by the reader (Armbruster & Osborn, 2002: 7). Reading for general comprehension is the ability to understand information in a text and interpret it appropriately (Grabe & Stoller, 2002: 9).

  b. Types of Reading Performance

  For considering assessment procedures, several types of reading performance are typically identified, and these will serve as organizers of various assessment tasks as follows: perceptive, selective, interactive, extensive (Brown, 2004: 189).

  In designing the test, the writer focuses on interactive reading. According to Brown (2004: 189), “Included among interactive reading types are stretches of language of several paragraphs to one page or more in which the reader must, in a psycholinguistic sense, interact with the text. That is, reading is a process of negotiating meaning; the reader brings to the text a set of schemata for understanding it, and intake is the product of that interaction. Typical genres that lend themselves to interactive reading are anecdotes, short narratives and descriptions. The focus of an interactive task is to identify relevant features (lexical, symbolic, grammatical, and discourse) within texts of moderately short length with the objective of retaining the information that is processed.” The writer designs a multiple-choice item reading test that focuses on one genre, narrative, because it is in accord with the curriculum for grade X.

  c. Reading Tasks

  The reading tasks that the writer used to design the test depend on the type of reading performance. The type of reading performance on which the writer focuses is interactive reading. Interactive reading can be implemented in various tasks, in particular: cloze tasks, impromptu reading plus comprehension questions, short-answer tasks, editing (longer texts), scanning, ordering tasks, and information transfer (reading charts, maps, graphs, diagrams) (Brown, 2004: 201-210). In designing a multiple-choice item reading test that focuses on narrative, the writer designed two kinds of tasks: cloze tasks and impromptu reading plus comprehension questions.

  According to Brown (2004: 201-206), the cloze task is one of the most popular types of reading assessment task. In written language, a sentence with a word left out should have enough context that a reader can close that gap with a calculated guess. The impromptu reading plus comprehension questions technique, in turn, is undoubtedly the oldest and the most common. This technique covers the comprehension of these features: main idea (topic), expressions/idioms/phrases in context, inference (implied detail), grammatical features, detail, excluding facts not written, supporting idea(s), and vocabulary in context.

2. Test

a. Definition of Test

  A test is a method of measuring a person’s ability, knowledge, or performance in a given domain. It is an instrument (a set of techniques, procedures, or items) that requires performance on the part of the test-takers (Brown, 2004: 3).

  b. The Purpose of The Test

  According to Brown (2004: 43), the first task a teacher faces in designing a test for the students is to determine the purpose of the test. Defining the purpose of the test helps the teacher choose the right kind of test and focus on the specific objectives of the test. Based on their objectives, tests are divided into five types: language aptitude tests, language proficiency tests, placement tests, diagnostic tests, and achievement tests.

  In this study, the writer focuses on the achievement test. Brown (2004: 47) states that an achievement test is related directly to classroom lessons, units, or even a total curriculum. Achievement tests are (or should be) limited to particular material addressed in a curriculum within a particular time frame and are offered after a course has focused on the objectives in question.

  c. The Format of The Test

  The writer develops an achievement test by designing a multiple-choice test, a task type that is also known as selection items. Brown (2005: 43) states that a multiple-choice format is basically a receptive mode (students read and select, but produce nothing). According to Brown (2004: 55), multiple-choice items, which may appear to be the simplest kind of item to construct, are extremely difficult to design correctly. Here are the characteristics of multiple-choice items according to Brown (2004: 56):

  1. Multiple-choice items are all receptive, or selective, response items in that the test-taker chooses from a set of responses rather than creating a response.

  2. Every multiple choice item has a stem, which presents a stimulus, and several (usually between three and five) options or alternatives to choose from.

  3. One of those options, the key, is the correct response, while the others serve as distractors.
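  The three characteristics above can be pictured as a small data structure. The following Python sketch is only an illustration and is not part of the writer's test materials: the class name `MultipleChoiceItem`, its field names, and the sample item are all hypothetical, and the range of three to five options is treated as usual practice rather than enforced.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MultipleChoiceItem:
    """One multiple-choice item: a stem plus options, one of which is the key."""
    stem: str           # the stimulus presented to the test-taker
    options: List[str]  # usually between three and five alternatives
    key_index: int      # position of the correct response within options

    def __post_init__(self) -> None:
        # The key must be one of the listed options.
        if not 0 <= self.key_index < len(self.options):
            raise ValueError("key_index must point at one of the options")

    @property
    def key(self) -> str:
        """The correct response."""
        return self.options[self.key_index]

    @property
    def distractors(self) -> List[str]:
        """Every option other than the key."""
        return [o for i, o in enumerate(self.options) if i != self.key_index]

    def is_correct(self, chosen_index: int) -> bool:
        """The test-taker selects a response rather than creating one."""
        return chosen_index == self.key_index


# Hypothetical item, invented for illustration only.
item = MultipleChoiceItem(
    stem="What is the main idea of the story?",
    options=[
        "A farmer sells his land.",
        "A clever mouse deer escapes from a crocodile.",
        "A king looks for a new palace.",
        "Two brothers argue about a boat.",
        "A village prepares for a festival.",
    ],
    key_index=1,
)
print(item.key)
print(len(item.distractors))
```

  Storing the key as an index keeps the key and the distractors in one list, so the options can be displayed in a fixed order while scoring stays a simple comparison.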

d. Test Validity

  The writer also investigates the validity of the test. Validity is the extent to which inferences made from assessment result are appropriate, meaningful, and useful in terms of the purpose of the assessment (Brown, 2004: 22). There are three criteria of validity, which are investigated by the writer, as follow: face validity, content validity, and construct validity.

  1. Face validity refers to the test’s surface credibility or public acceptability (Alderson, Clapham, and Wall, 1995: 172). Essentially, face validity involves an intuitive judgment about the test’s content by people whose judgment is not necessarily ‘expert’ (Alderson, Clapham, and Wall, 1995: 172).

  2. Content validity is the representativeness or sampling adequacy of the content (the substance, the matter, the topics) of a measuring instrument. Content validation involves gathering the judgment of ‘experts’ (Alderson, Clapham, and Wall, 1995: 173).

  3. Construct validity refers to the extent to which we can interpret a given test score as an indicator of the abilities, or constructs, we want to measure. Construct validation is the on-going process of demonstrating that a particular interpretation of test scores is justified, and involves, essentially, building a logical case in support of a particular interpretation and providing evidence justifying that interpretation (Bachman and Palmer, 1997:21-22).

  In addition, in this study, the writer measures construct validity through item facility (or item difficulty), item discrimination (differentiation), and distractor analysis, all calculated using Microsoft Excel. Excel is an electronic spreadsheet program that can be used for storing and organizing data (French, 2011:1).

  1. Item facility (IF) is a statistic used to examine the percentage of students who correctly answer a given item (Brown, 2005:66). Because IF is the proportion of students who answered the question correctly, it can be calculated by using the COUNTIF function to count the letter corresponding to the correct answer (Elvin, 2003) and dividing that count by the number of students. The COUNTIF function counts the number of cells in a selected range that meet certain criteria (French, 2011:1). Appropriate test items will generally have IFs that range between 0.15 and 0.85 (Brown, 2004: 59).

  2. Item discrimination (ID) is a statistic that indicates the degree to which an item separates the students who performed well from those who did poorly on the test as a whole. These groups are sometimes referred to as the “high” and “low” scorers or “upper” and “lower” proficiency students (Brown, 2004:68).

  ID is usually the difference between the IF for the top third of test takers and the IF for the bottom third of test takers for each item on a test. The ID can be calculated by using the SUM function (Elvin, 2003). The SUM function adds together the contents of all the cells in the range (French, 2011:1). High discriminating power would approach a perfect 1.0 (one), and no discriminating power at all would be 0 (zero).

  3. According to Brown (2004:60), distractor efficiency is the extent to which (a) the distractors “lure” a sufficient number of test-takers, especially lower-ability ones, and (b) those responses are somewhat evenly distributed across all distractors. In addition, Elvin (2003) notes that the quality of a test item may be poor and that, to identify such potential sources of measurement error, one can calculate the ratio of students answering each question to students taking the test. This ratio can be calculated by using the SUM function.
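  The three statistics above can also be sketched outside a spreadsheet. The Python below is a minimal illustration, not the writer's Excel workbook: the function names, the nine invented responses to a single item, and the total scores are all assumptions, and the counting and summing only stand in for the COUNTIF and SUM steps described above.

```python
from collections import Counter


def item_facility(responses, answer_key):
    """Proportion of test-takers who chose the key (a COUNTIF-style count)."""
    correct = sum(1 for response in responses if response == answer_key)
    return correct / len(responses)


def item_discrimination(responses, total_scores, answer_key):
    """IF of the top third minus IF of the bottom third, ranked by total score."""
    third = len(responses) // 3
    if third == 0:
        raise ValueError("need at least three test-takers")
    # Rank each response by its test-taker's total score, highest first.
    ranked = [
        response
        for _, response in sorted(
            zip(total_scores, responses), key=lambda pair: pair[0], reverse=True
        )
    ]
    upper, lower = ranked[:third], ranked[-third:]
    return item_facility(upper, answer_key) - item_facility(lower, answer_key)


def option_ratios(responses, options="ABCD"):
    """Share of test-takers choosing each option, for distractor analysis."""
    counts = Counter(responses)
    return {option: counts[option] / len(responses) for option in options}


# Hypothetical data: nine test-takers' answers to one item whose key is "B",
# and their total scores on the whole test.
responses = ["B", "B", "B", "B", "A", "B", "B", "D", "A"]
total_scores = [38, 36, 35, 30, 29, 27, 20, 18, 15]

facility = item_facility(responses, "B")
discrimination = item_discrimination(responses, total_scores, "B")
ratios = option_ratios(responses)
```

  In this invented data set, six of nine test-takers choose the key, the top third all answer correctly while only one of the bottom third does, and option C is chosen by nobody, which is the kind of distractor that would be flagged for revision.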

e. Marking and Scoring

  In designing the test, the writer also determines how to mark and score it. There are basically two types of marking: subjective marking, which is usually used for tests of writing and speaking, and objective marking, which is used for multiple-choice, true/false, error recognition, and other item types where the candidate is required to produce a response which can be marked as either ‘correct’ or ‘incorrect’ (Alderson, Clapham, and Wall, 1995:106). Thus, in this research, the writer uses objective marking.

  If the test consists of a number of objective subtests (for example, multiple-choice), then each item may be assigned a mark of 1 (one) if correct and 0 (zero) if wrong (Alderson, Clapham, and Wall, 1995:148). In this research, the final score is the number of correct answers divided by two (2). The formula is as follows:

  \[
  \text{Final score} = \frac{\text{number of correct answers}}{2}
  \]

  Figure 2.1: The Formula of Final Score
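  As a worked illustration of this scoring rule, the sketch below marks each item 1 or 0 and divides the number of correct answers by two. The function name, the five-item answer key, and the responses are hypothetical and are not taken from the writer's test.

```python
def final_score(responses, answer_key):
    """Mark each item 1 (correct) or 0 (wrong), then divide the total by two."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer key must have the same length")
    marks = [
        1 if response == key else 0
        for response, key in zip(responses, answer_key)
    ]
    return sum(marks) / 2


# Hypothetical five-item example: four answers match the key.
answer_key = ["B", "A", "D", "C", "A"]
responses = ["B", "A", "D", "C", "C"]
score = final_score(responses, answer_key)
```

  With four correct answers, the final score in this example is 4 / 2 = 2.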

3. Design and Test Development

  Design is in general a linear process, but in some cases some activities are iterative, that is, will need to be repeated a number of times (Bachman and Palmer, 1996: 86).

  In addition, design means arranging materials into a fixed and well-organized form. Designing is the same as creating a new set of materials that fit the learning objectives and specific subject area of particular learners (Hutchinson & Waters, 1994:106). To design the multiple-choice reading test, the writer applies the steps of test development.

  Test development is the entire process of creating and using a test, beginning with its initial conceptualization and design and culminating in one or more archived tests and the results of their use (Bachman & Palmer, 1996: 85). According to Bachman and Palmer (1996: 86-91), there are three stages of test development:

  1. Stage One: Design

  The product of the design stage is a design statement, which is a document that includes the following components:

  a) a description of the purpose(s) of the test,

  b) a description of the Target Language Use (TLU) domain and task types. The TLU domain is the context in which the test takers will use the language outside of the test itself. There are two types of TLU domain: the real-life domain, in which language is used essentially for purposes of communication, and the language instruction domain, in which language is used for the purpose of teaching and learning the language. The task type in this study is the multiple-choice test,

  c) a description of the test takers for whom the test is intended,

  d) a definition of the construct(s) to be measured,

  e) a plan for evaluating the qualities of usefulness, and

  f) an inventory of required and available resources and a plan for their allocation and management.

  2. Stage Two: Operationalization

  Operationalization involves developing test task specifications for the types of test tasks to be included in the test, and a blueprint that describes how test tasks will be organized to form actual tests. Operationalization also involves developing and writing actual test tasks, writing instructions, and specifying the procedures for scoring the test. By specifying the conditions under which language use will be elicited and the method for scoring responses to these tasks, we are providing the operational definition of the construct.

  In addition, Brown (2004: 55) explains some practical steps for designing multiple-choice test items: design each item to measure a specific objective; state both stem and options as simply and directly as possible; make certain that the intended answer is clearly the only correct one; and use item indices to accept, discard, or revise items.

  3. Stage Three: Test Administration

  The test administration stage of test development involves giving the test to a group of individuals, collecting information, and analyzing this information for two purposes:

  a) assessing the usefulness of the test, and

  b) making the inferences or decisions for which the test is intended.

  The three stages of Bachman and Palmer’s test development are summarized in Figure 2.2 below:

  Test Development

  Stages / activities and their products

  1. Design

  Activities: describing, identifying, selecting, defining, developing, allocating, managing

  Product: Design statement

  - Purpose of the test
  - Description of the TLU domain and task types
  - Characteristics of test takers
  - Definition of the construct(s)
  - Plan for evaluating the qualities of usefulness
  - Inventory of available resources and plan for their allocation and management

  2. Operationalization

  Product: Blueprint

  Test structure

  - Number of parts/tasks
  - Salience of parts
  - Sequence of parts
  - Relative importance of parts/tasks

  Selecting