An analysis on a reading summative test of the tenth grade senior high schools in Yogyakarta.

PLAGIAT MERUPAKAN TINDAKAN TIDAK TERPUJI

AN ANALYSIS ON A READING SUMMATIVE TEST
OF THE TENTH GRADE SENIOR HIGH SCHOOLS
IN YOGYAKARTA
A SARJANA PENDIDIKAN THESIS
Presented as Partial Fulfillment of the Requirements
to Obtain the Sarjana Pendidikan Degree
in English Language Education

By
Prisca Tri Kristiana
Student Number: 081214061

ENGLISH LANGUAGE EDUCATION STUDY PROGRAM
DEPARTMENT OF LANGUAGE AND ARTS EDUCATION
FACULTY OF TEACHERS TRAINING AND EDUCATION
SANATA DHARMA UNIVERSITY
YOGYAKARTA
2012



“Happy are those who dream dreams and are ready to pay the price
to make them come true”
(Leon J. Suenes)

Dedicated to
My late father Petrus Sudiharjono,
My beloved mother Christina Rubinem,
My lovely sisters: Fransisca Suhartini & Lusia Suharyanti



ABSTRACT
Kristiana, Prisca Tri. 2012. An Analysis on a Reading Summative Test of the Tenth
Grade Senior High Schools in Yogyakarta. Yogyakarta: English Language
Education Study Program, Sanata Dharma University.
Reading skill is considered significant in English learning. In order to
know the students’ reading ability, the reading summative test is administered at
the end of the semester in senior high school. Regarding the criteria of a good test,
the researcher intends to learn the validity of the reading summative test of the
tenth grade senior high schools in Yogyakarta.
This research was conducted to answer one major problem, namely how valid the
reading summative test of the tenth grade senior high schools in Yogyakarta is.
This research also aimed to answer four minor problems:
1) How is the face validity of the reading summative test of the tenth
grade senior high schools in Yogyakarta? 2) How is the content validity of the
reading summative test of the tenth grade senior high schools in Yogyakarta? 3)
How is the item facility of the reading summative test of the tenth grade senior
high schools in Yogyakarta? 4) How is the item discrimination of the reading
summative test of the tenth grade senior high schools in Yogyakarta?
The document analysis method was used in analyzing the document of the
reading summative test of the tenth grade senior high schools in Yogyakarta. The
face validity was obtained by gathering the experts’ opinion using a checklist
table. The content validity was obtained by matching the reading passages to the
text genres and the test items to the objectives in the syllabus. The syllabuses and
the students’ answer sheets were obtained from SMA N 3 Yogyakarta, SMA N 6
Yogyakarta, and SMA N 11 Yogyakarta. The item facility and the item
discrimination were analyzed and categorized after calculating the item facility
and the item discrimination index.

After analyzing the data, the results revealed that the reading summative
test of the tenth grade senior high schools in Yogyakarta had validity in terms of face
validity and content validity. The test had face validity since it consisted of brief
and understandable instructions and questions, appropriate length of passages and
time allotment, and clear format and copy. The reading passages covered all of
the text genres which were stated in the competence standard and the basic
competence. Almost all of the test items covered the objectives which were stated
in the syllabus. However, the test needed improvement in terms of item facility
and item discrimination. The reading summative test was considered easy and
could not be used to discriminate among the tenth grade senior high school students of
SMA N 3 Yogyakarta, SMA N 6 Yogyakarta, and SMA N 11 Yogyakarta.
Keywords: face validity, content validity, item facility, item discrimination.


ABSTRAK
Kristiana, Prisca Tri. 2012. An Analysis on a Reading Summative Test of the Tenth
Grade Senior High Schools in Yogyakarta. Yogyakarta: Program Studi Bahasa
Inggris, Universitas Sanata Dharma.
Keterampilan membaca dianggap penting dalam pembelajaran bahasa
Inggris. Untuk mengetahui kemampuan membaca siswa, tes sumatif membaca
dilaksanakan pada akhir semester di Sekolah Menengah Atas (SMA). Berkenaan
dengan kriteria tes yang baik, peneliti bermaksud untuk mempelajari validitas tes
sumatif membaca untuk kelas sepuluh (X) Sekolah Menengah Atas (SMA) di
Yogyakarta.
Penelitian ini dilakukan untuk menjawab permasalahan utama, yaitu
bagaimana validitas tes sumatif membaca untuk kelas sepuluh (X) Sekolah
Menengah Atas (SMA) di Yogyakarta. Selain itu, penelitian ini juga bertujuan
untuk menjawab empat permasalahan tambahan, yaitu: 1) Bagaimana validitas
permukaan tes sumatif membaca untuk kelas sepuluh (X) Sekolah Menengah Atas
(SMA) di Yogyakarta?, 2) Bagaimana validitas isi tes sumatif membaca untuk
kelas sepuluh (X) Sekolah Menengah Atas (SMA) di Yogyakarta?, 3) Bagaimana
fasilitas soal tes sumatif membaca untuk kelas sepuluh (X) Sekolah Menengah
Atas (SMA) di Yogyakarta?, dan 4) Bagaimana diskriminasi soal tes sumatif
membaca untuk kelas sepuluh (X) Sekolah Menengah Atas (SMA) di
Yogyakarta?.
Metode analisis dokumen digunakan dalam menganalisa dokumen tes
sumatif membaca untuk kelas sepuluh (X) Sekolah Menengah Atas (SMA) di
Yogyakarta. Validitas permukaan diperoleh dengan mengumpulkan
pendapat-pendapat dari para ahli tes menggunakan tabel ceklist. Validitas isi diperoleh
dengan mencocokkan bacaan dengan jenis teks dan soal-soal tes dengan tujuan
pembelajaran dalam silabus. Silabus-silabus dan lembar jawab siswa diperoleh
dari SMA N 3 Yogyakarta, SMA N 6 Yogyakarta, dan SMA N 11 Yogyakarta.
Fasilitas soal dan diskriminasi soal dianalisa dan dikategorikan setelah
menghitung indeks fasilitas soal dan diskriminasi soal.
Setelah menganalisa data, hasil penelitian menunjukkan bahwa tes sumatif
membaca untuk kelas sepuluh (X) Sekolah Menengah Atas (SMA) di Yogyakarta
valid dalam bentuk validitas permukaan dan validitas isi. Ulangan tersebut
memiliki validitas permukaan karena terdiri dari instruksi dan pertanyaan yang
singkat dan dapat dipahami, panjang bacaan dan alokasi waktu yang sesuai, serta
format dan salinan yang jelas. Bacaan-bacaan dalam tes meliputi semua jenis teks
yang tercantum dalam standard kompetensi dan kompetensi dasar. Hampir semua
soal tes mencakup tujuan pembelajaran dalam silabus. Tetapi, soal tes tersebut
memerlukan perbaikan dalam hal fasilitas soal dan diskriminasi soal. Tes sumatif
membaca tersebut dianggap mudah dan tidak dapat digunakan untuk membedakan
siswa kelas sepuluh (X) SMA N 3 Yogyakarta, SMA N 6 Yogyakarta, dan SMA
N 11 Yogyakarta.
Kata kunci: validitas permukaan, validitas isi, fasilitas soal, diskriminasi soal.

ACKNOWLEDGMENTS
First of all, I would like to thank my Lord Jesus Christ for letting me be
one of the happiest people in this world. He makes me strong in my life.
I would like to thank C. Tutyandari, S.Pd., M.Pd. very much as my
advisor. She always encourages and guides me patiently. I thank V. Triprihatmini,
S.Pd., M.Hum., M.A., Henny Herawati, S.Pd., M.Hum., and Ag. Hardi Prasetyo,
S.Pd., M.A., who gave me the data of my thesis.
I would like to express my gratitude to the headmasters of SMA N 3
Yogyakarta, SMA N 6 Yogyakarta, and SMA N 11 Yogyakarta, who gave me the
data. I would like to thank Drs. Harwanto, Mr. Nguji Mulyono, S.Ag., Kusworo,
S.Pd., M.Hum., and F. Sunu Purwawarsita, S.Pd., for their help.
I would like to thank my mother for her prayer and support every day. She
gives me spirit in my life. I also thank my sisters and brothers, who always
support me from the distance.
I am so grateful to Andreo Asdifati, who always accompanies me and
makes me stronger to face various problems during my study. I thank my cousins,
Mbak Maria and Mbak Wiwin, who helped me to edit and print my thesis. I thank
my buddies in my second home (Angelin, Mbak Oda, and Checyk) for their
support. I also thank everyone who has supported me but whom I cannot mention one by
one on this page.
The Writer
Prisca Tri Kristiana


TABLE OF CONTENTS

TITLE PAGE
APPROVAL PAGE
DEDICATION PAGE
STATEMENT OF WORK’S ORIGINALITY
LEMBAR PERNYATAAN PERSETUJUAN PUBLIKASI
ABSTRACT
ABSTRAK
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF APPENDICES
CHAPTER I. INTRODUCTION
   A. Background
   B. Problem Limitation
   C. Problem Formulation
   D. Objectives
   E. Benefits
   F. Definition of terms
CHAPTER II. REVIEW OF RELATED LITERATURE
   A. Theoretical Description
      1. Language Test
      2. Summative Test
      3. Reading Test
      4. Language Test Validity
         a. Face Validity
         b. Content Validity
         c. Item Facility (IF)
         d. Item Discrimination (ID)
   B. Theoretical Framework
CHAPTER III. RESEARCH METHODOLOGY
   A. Research Method
   B. Research Setting
   C. Research Object and Participants
   D. Instruments and Data Gathering Techniques
   E. Data Analysis Techniques
   F. Research Procedures
CHAPTER IV. RESEARCH RESULTS AND FINDINGS
   A. Face Validity of the Reading Summative Test
   B. Content Validity of the Reading Summative Test
   C. Item Facility of the Reading Summative Test
   D. Item Discrimination of the Reading Summative Test
CHAPTER V. CONCLUSIONS AND RECOMMENDATIONS
   A. Conclusions
   B. Recommendations
REFERENCES
APPENDICES

LIST OF TABLES

Table
2.1. Category of the item discrimination
3.1. The face validity comment on the reading summative test
3.2. The content validity of the reading summative test
3.3. Item facility and item discrimination of the reading summative test
4.1. List of typing mistake in the test item
4.2. List of typing mistake in the text
4.3. Passages length of the reading summative test
4.4. Item facility of SMA N 3 Yogyakarta
4.5. Item facility of SMA N 6 Yogyakarta
4.6. Item facility of SMA N 11 Yogyakarta
4.7. The average of the item facility
4.8. The item discrimination of SMA N 3 Yogyakarta
4.9. The item discrimination of SMA N 6 Yogyakarta
4.10. The item discrimination of SMA N 11 Yogyakarta
4.11. The average of the item discrimination

LIST OF APPENDICES

APPENDIX 1    Surat ijin penelitian SMA N 3 Yogyakarta dan SMA N 11 Yogyakarta
APPENDIX 2    Surat ijin penelitian SMA N 6 Yogyakarta
APPENDIX 3    Data of the face validity of the reading test
APPENDIX 4    Data of the content validity of the reading summative test
APPENDIX 5    Item facility and item discrimination of the reading summative test SMA N 3 Yogyakarta
APPENDIX 6    Item facility and item discrimination of the reading summative test SMA N 6 Yogyakarta
APPENDIX 7    Item facility and item discrimination of the reading summative test SMA N 11 Yogyakarta
APPENDIX 8    Silabus Bahasa Inggris Kelas X SMA NEGERI 3 Yogyakarta Semester 1
APPENDIX 9    Silabus Bahasa Inggris Kelas X SMA NEGERI 6 Yogyakarta Semester 1
APPENDIX 10   Silabus Bahasa Inggris Kelas X SMA NEGERI 11 Yogyakarta Semester 1
APPENDIX 11   The document of the reading summative test

CHAPTER I
INTRODUCTION

This chapter contains the research background, problem limitation, problem
formulation, research objectives, research benefits, and definition of terms.

A. Background
Reading is considered a significant skill alongside the three other skills (listening,
speaking, and writing) that are learned in English learning. This is shown by the
inclusion of reading skill in the curriculum. At school, the students are expected to
comprehend various kinds of English texts. By reading, they may obtain the
provided information, learn grammar and vocabulary in context, and know the
written expressions used in the texts. In her study, Merina (2009) argued that
reading makes a significant contribution to English learning because the students
know how to use language in developing other skills.
The purpose of reading skill is not merely about grasping the information
from the text but also developing the reading strategies. According to Brown
(2004), the reading skills are categorized into microskills and macroskills.
Microskills are the subset of macroskills. Microskills for reading comprehension
consist of several objectives including retaining elements of language, recognizing
word classes and systems, and recognizing the different forms of a particular
meaning. Macroskills include recognizing the functions of the texts, inferring the
implicit meaning from context, and developing some reading strategies.


In order to know to what extent the students have acquired reading ability,
the teacher assesses their reading skill. This can be done in the form of formative and
summative tests. A formative test evaluates the students’ skills in order to form their
competencies during the learning process. Even though a reading formative test is
necessary in the learning process, the summative test is also needed. It is supposed
to measure the students’ achievement over a period of time. In addition, it
shows how well the students attain the objectives of the learning process.
Regarding the importance of reading skill, the educational institutions or
schools develop some tests that include a reading section. However, the tests
may vary in the number and type of items. Some kinds of
reading tests that are commonly used are multiple-choice, matching tasks, and
short-answer tasks. Those types are usually preceded by passages in which the
students can find the answers.
However, a question arises as to whether the test writers consider the
criteria of a good test. The criteria of a good test are practicality,
authenticity, validity, and reliability. This study is concerned with validity.
If the test is valid, it can be used to measure the students’ ability. According to
McNamara (2008), “The purpose of validation in language testing is to ensure the
defensibility and fairness of interpretations based on test performance”. Hence,
the test should be appropriate and accurate in measuring what it is expected to
measure. Does it really measure the students’ ability in reading? Is the test
appropriate to the objectives of the reading skill? Does the test provide the
appropriate level of difficulty?


Therefore, as a teacher candidate, the researcher needs to learn the validity
of the reading test, especially in senior high school. In this case, the researcher
intends to analyze the reading summative test of the tenth grade senior high schools
in Yogyakarta. By analyzing the validity of the reading test, the researcher learns
what the validity of a test should be and how to write a valid test in the
future. The analysis also gives a reference for the test writers in
composing a better test.

B. Problem Limitation
This study focuses on the validity of the English reading summative test of
the tenth grade senior high schools in Yogyakarta. The test was administered at
the end of the first semester year 2011. The test is a multiple-choice test
which was made by MKKS or Musyawarah Kerja Kepala Sekolah (The Principal
Consultative Work) in Yogyakarta. The validity will be analyzed in terms of face
validity, content validity, item facility, and item discrimination.

C. Problem Formulation
The major problem of this research is:
How is the validity of the reading summative test of the tenth grade senior
high schools in Yogyakarta?
The minor problems of this research are:
1. How is the face validity of the reading summative test of the tenth grade
senior high schools in Yogyakarta?

2. How is the content validity of the reading summative test of the tenth grade
senior high schools in Yogyakarta?
3. How is the item facility of the reading summative test of the tenth grade senior
high schools in Yogyakarta?
4. How is the item discrimination of the reading summative test of the tenth
grade senior high schools in Yogyakarta?

D. Objectives
The objectives of this research are:
To find out how the validity of the reading summative test of the tenth
grade senior high schools in Yogyakarta is.
1. To find out how the face validity of the reading summative test of the tenth
grade senior high schools in Yogyakarta is.
2. To find out how the content validity of the reading summative test of the tenth
grade senior high schools in Yogyakarta is.
3. To find out how the item facility of the reading summative test of the tenth
grade senior high schools in Yogyakarta is.
4. To find out how the item discrimination of the reading summative test of the tenth
grade senior high schools in Yogyakarta is.

E. Benefits
The results of this research are expected to be useful for:
1. The teachers who intend to obtain some information from this research related
to test validity.
2. The test writers who need the reference related to the test validity in order to
improve the test validity in the future.
3. The other English test writers who plan to design a reading summative test in
order to make a better test in the future.
4. The future researchers who are interested in this topic and intend to do related
research.

F. Definition of terms
1. Analysis
Analysis is an activity of studying or researching something in order to
obtain an understanding about something which is analyzed. As stated in UCLES
(2001), “The purpose of analysis is to examine or think about something in detail
in order to understand it or get to know it better.” The analysis in this study refers
to the process of studying the reading summative test of tenth grade senior high
schools in Yogyakarta in order to investigate the validity of the test.
2. Reading test
Reading test is a test which measures the students’ reading ability. It
requires both the students’ comprehension and strategies in reading various kinds
of texts. It represents the objectives of reading skill design in the syllabus. The
types of the texts have been determined based on the curriculum. Brown (2004)
stated that “An inability to comprehend may thus be traced to a need to enhance a
test-taker’s strategies for achieving ultimate comprehension”. The reading test in
this study refers to the reading test at the end of the first semester in senior high
schools in Yogyakarta.
3. Summative test
Summative test is a test that is administered at the end of the course or
each semester. The test should measure what the students have learned during the
period of time. By summarizing the materials into a test, the test should reveal the
students’ achievement toward the objectives of the course. According to Brown
(2004), “summative assessment aims to measure, or summarize, what a student
has grasped, and typically occurs at the end of a course or unit of instruction”.
The summative test in this study refers to the reading summative test which was
conducted at the end of the first semester year 2011 in senior high schools in
Yogyakarta.
4. Tenth grade senior high school students
The students whose test results are analyzed are in the tenth (X)
grade. At this level, the students have not been categorized into IPA (Ilmu
Pengetahuan Alam), IPS (Ilmu Pengetahuan Sosial), or Bahasa. The reading
ability of the students at this level may differ from one school to another.
Therefore, the researcher chooses the tenth grade students as the
subject of this study. The tenth grade students refer to the students from senior
high schools in Yogyakarta.
5. Senior high schools in Yogyakarta
Senior high schools in Yogyakarta in this study refer to the state senior
high schools for which MKKS makes the reading summative test.


CHAPTER II
REVIEW OF RELATED LITERATURE

This chapter contains theoretical description and theoretical framework. It
consists of theories which are used as the guidance in conducting this research.

A. Theoretical Description
1. Language Test
Bachman (2004) states that “a test is a particular type of measurement
that is designed to elicit a certain behavior from which one can make inferences
about certain characteristics of an individual”. The definition provides a general
concept of a test. A test is an instrument to measure an individual’s ability.
Specifically, Oller (1979) states that a language test is an instrument to measure
what the students have learned during the course (as cited in Wiratmo, 2009). In
his book, Bachman (1990) also states that a language test can be a device for
focusing on the achievement of language abilities. It is clear that the aim of a
language test is to obtain significant data from the learners related to their
achievement during the course.
Related to the aim of a language test, the objectives of the course should
be clear since the test is used to measure the students’ achievement. Mehrens and
Lehmann (1973) imply that a test helps the teacher to evaluate how far the
objectives have been attained. It means that the objectives are the changes which
the teacher expects from the students after the learning process.


Based on the purpose and the design of the test, the language tests can be
categorized into several types. The following is one of the test types related to the
current research.
2. Summative Test
A summative test is a kind of assessment which is administered at the end of
a course period. As Brown (2004) says, “summative assessment aims to measure,
or summarize, what a student has grasped, and typically occurs at the end of a
course or unit of instruction”. It contains a summary of the course contents. At the
end of a semester, the students are tested to show the ability which they have
achieved during the course. This kind of assessment provides information on how
well the students have attained the objectives of the course.
3. Reading Test
Reading test is a test which measures the students’ ability in applying
reading skills. The students have to comprehend the texts given and apply their
skills such as skimming and scanning in reading the texts. More specifically,
Heaton (1975) says that the reading skills also include the ability to understand
the meaning of words and word groups in the context, catch the main idea from
paragraphs, scan the specific information in the passages, and draw the
conclusions of the text.
In the first chapter, the writer has indicated the microskills and macroskills
based on Brown’s (2004) classification. Those microskills and macroskills are
used in the syllabus as the objectives of reading. More completely, the microskills
include discriminating the distinctive graphemes and orthographic patterns of

PLAGIAT MERUPAKAN TINDAKAN TIDAK TERPUJI

9

English, retaining chunks of language, and processing writing at an efficient rate
of speed. The other microskills are recognizing words and interpreting word
patterns, recognizing grammatical word classes, systems, patterns, rules, and
elliptical forms, recognizing a meaning in different grammatical forms, and
recognizing cohesive devices in written discourse.
Meanwhile, the macroskills of reading based on Brown’s (2004)
classification are also varied. The macroskills are recognizing the rhetorical forms
and their significance, recognizing the communicative functions of written texts,
inferring context by using background knowledge, inferring links and connections
between events, and detecting the main idea and the specific information. Besides,
other skills such as differentiating between literal and implied meanings, detecting
culturally specific references and interpreting them in context, and developing
the reading strategies are included in the macroskills.
The reading summative test consists of some passages and items or
questions. The reading test forms may also differ based on how they are designed.
A test may use matching tasks, true/false tasks, multiple-choice items, completion tasks, or
open-ended questions. However, Heaton (1975) states that multiple-choice can be a
useful form in testing reading since it is practical in terms of administering,
scoring, and interpreting. In the multiple-choice form, there are three levels of the
length of the passage based on Heaton’s classification. The elementary level
consists of 50-100 words, the intermediate level consists of 200-300 words, and
the advanced level consists of 400-600 words.
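
The length bands above can be applied mechanically when checking the passages of a test. The following Python sketch is illustrative only. The function names `count_words` and `passage_level` are hypothetical, words are counted by splitting on whitespace (an assumption, since the text does not say how words are counted), and a length that falls between the stated bands is reported as unclassified because the text does not assign it to a level.

```python
def count_words(passage: str) -> int:
    """Count words by splitting on whitespace (an assumed counting rule)."""
    return len(passage.split())


def passage_level(word_count: int) -> str:
    """Map a word count to the passage-length bands quoted above."""
    if 50 <= word_count <= 100:
        return "elementary"
    if 200 <= word_count <= 300:
        return "intermediate"
    if 400 <= word_count <= 600:
        return "advanced"
    # The quoted bands leave gaps (e.g. 101-199 words); do not guess a level.
    return "unclassified"


if __name__ == "__main__":
    sample = "word " * 250  # a dummy 250-word passage
    print(passage_level(count_words(sample)))
```

A passage of 150 words, for instance, is returned as unclassified rather than being forced into the nearest band.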


4. Language Test Validity
One of the characteristics of a good test, besides practicality, authenticity,
and reliability, is validity. As a measurement device, a language test should be
valid and of high quality. It means that the test should be able to measure what it
is intended to measure. According to Hughes (1989), “A test is said to be valid if it measures
accurately what it is intended to measure”. The test should be a measuring rod of the
course objectives. Therefore, the test should be appropriate to or compatible with the
skill that is measured. McNamara (2008) states that “The purpose of validation in
language testing is to ensure the defensibility and fairness of interpretations based
on test performance”. In analyzing reading test validity, there are some aspects
which can be involved. The following are the theories about the aspects in
analyzing the reading summative test.
a. Face Validity
Face validity shows whether the test appears to measure what it is intended to
measure. According to Brown (2004), “face validity refers to the degree to which a test
looks right, and appears to measure the knowledge or abilities it claims to
measure”. The test items should be clear in terms of the instructions or questions
and the pictures, doable within the time allocation given, appropriate in the
level of difficulty for the students’ ability, and well-constructed in terms of
the format and copy. Since the face validity deals with the appearance of the test,
the test takers may make a judgment whether the test has face validity or not.
Even though the face validity only provides the surface appearance of the
test, it is important to investigate. Alderson, Clapham, and Wall (1995) state that
the face validity of the test affects the students’ performance in doing the test.
Heaton (1995) also states that the students will maintain their motivation if the
test fulfills the face validity. It means that the students will do the best on the test
if the test looks valid.
Regarding the significance of the face validity of the test, it is important to
make judgment on the test appearance. In investigating the face validity, others’
comments of the test are required. Alderson et al (1995) say that the process of
face validation deals with how the people comment on the appearance of the test.
It is clear that the face validity of the test is on the surface appearance of the test
itself.
b. Content Validity
Content validity shows how the content of the test items reflects the
objectives that are to be achieved. Hughes (1989) stated that “A test is said to
have content validity if its content constitutes a representative sample of the
language skills, structures, etc. with which it is meant to be concerned”.
McNamara (2008) also agrees that a test has content validity if the test contains
content relevant to the test purposes. Besides, the content of the test should be
relevant to the text genres which are stated in the standard of competency and the
basic competence of the School-Based Curriculum.
Another way to investigate the content validity of the test is by
comparing and matching the test items with the test specification or the syllabus.
This is because the syllabus provides the objectives of the course that are to be
achieved through the test. According to Bachman and Palmer (1980), “The
selection of the tasks is representatives of the larger set of tasks of which the test
is assumed to be a sample”. It means that the items of the test should reflect the
objectives stated in the syllabus. The test has content validity if it provides
an appropriate number of samples from the syllabus.
c. Item Facility (IF)
Item facility (IF) shows the level of difficulty of a test item.
The test item should not be too difficult or too easy because it should be a device
for measuring the students’ ability. As Brown (2004) stated, “Item facility (or IF)
is the extent to which an item is easy or difficult for the proposed group of
test-takers”. Brown (2005) also states that item facility (IF) is a statistic, expressed as a
percentage, that shows the easiness of the test item. The item facility compares
the number of the students’ correct answers with the total number of responses to the test
item.
In order to find out the item facility of the test, each item of the test should
be analyzed. The item facility can be calculated by dividing the
number of the students who answer correctly by the total number of responses to the test
item. Brown (2004) states that the appropriate range of the item facility is between
.15 and .85 (≥ .15 to ≤ .85). If the item facility (IF)
approaches 0.00, the test item is too difficult. Brown argues that
very easy items (.85 or higher) can serve as warm-up items and build
the motivation of the low-ability students. Conversely, a very difficult item may be a
challenge to the high-ability students.
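
The calculation described above, IF = (number of correct responses) / (total number of responses), can be sketched in Python as follows. This is a minimal illustration rather than the procedure used in this research: the function names `item_facility` and `classify_facility`, the sample responses, and the answer key are all hypothetical. The sketch treats the range ≥ .15 to ≤ .85 as inclusive, following the stated bounds.

```python
def item_facility(responses, key):
    """Return IF for one item: correct responses divided by total responses."""
    if not responses:
        raise ValueError("at least one response is required")
    correct = sum(1 for answer in responses if answer == key)
    return correct / len(responses)


def classify_facility(if_value, low=0.15, high=0.85):
    """Label an IF value against the inclusive range quoted above."""
    if if_value < low:
        return "too difficult"
    if if_value > high:
        return "too easy"
    return "appropriate"


if __name__ == "__main__":
    # Hypothetical answers of ten students to one multiple-choice item.
    answers = ["A", "B", "A", "A", "C", "A", "A", "A", "D", "A"]
    value = item_facility(answers, key="A")
    print(value, classify_facility(value))
```

In this made-up example seven of the ten students choose the keyed option, so the item falls inside the appropriate range.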


d. Item Discrimination (ID)
According to Brown (2004), “Item discrimination (ID) is the extent to which
an item differentiates between high- and low-ability test-takers”. The item
discrimination (or ID) of a test item shows how well the item discriminates the students’ ability as a
whole. According to Brown (2005), item discrimination (ID) refers to the
statistical description that can be used to discriminate between the high-ability and
low-ability students.
In calculating item discrimination, the researcher first has to divide the students
into high-ability and low-ability groups. When the high-ability students
answer a test item correctly and the low-ability students answer it incorrectly, the
test item has good discrimination. In contrast, a test item has
poor discrimination if the high-ability students and the low-ability students perform
equally on it. In calculation, the closer the ID of a test item is to 1.00, the higher
its discrimination (Brown, 2004). As cited in Brown (2005),
Ebel (1979) has suggested the categorization of the item discrimination (ID) as
follows.
Table 2.1. Category of the item discrimination

No.   Item Discrimination (ID) Index   Category
1.    .40 and up (≥.40)                very good items
2.    .30 to .39 (≥.30 to ≤.39)        reasonably good
3.    .20 to .29 (≥.20 to ≤.29)        marginal items
4.    .19 and below (≤.19)             poor items

The categorization shows that test items are considered
appropriate if they have an item discrimination (ID) of .40 and up (≥.40). Those items can
be used to differentiate the students’ ability. The items which have an item
discrimination of .30 to .39 (≥.30 to ≤.39) are considered reasonably good
(but possibly subject to improvement). The marginal items, which have an item
discrimination of .20 to .29 (≥.20 to ≤.29), usually need improvement. The
poor items (low discrimination) have an item discrimination of .19 and below (≤.19).
Those items should be rejected or improved by revision.
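The categories suggested by Ebel (1979) can be expressed as a small lookup function. The following Python sketch is only an illustration: the function name is hypothetical, and the category limits are treated as lower thresholds, which assumes that ID indices are reported to two decimal places so that the gaps between the ranges (for example between .39 and .40) do not occur.

```python
def categorize_item_discrimination(item_id: float) -> str:
    """Map an ID index to the category labels listed in Table 2.1."""
    if item_id >= 0.40:
        return "very good items"
    if item_id >= 0.30:
        return "reasonably good"
    if item_id >= 0.20:
        return "marginal items"
    # Covers .19 and below, including negative indices.
    return "poor items"


# Hypothetical ID indices for four items.
for index in (0.45, 0.33, 0.20, 0.05):
    print(index, categorize_item_discrimination(index))
```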

B. Theoretical Framework
A language test is an assessment device for measuring the students’ ability.
The aim of a language test is to obtain meaningful data from the students related
to their achievement during the course. It means that the objectives of the course
should be clear since the test is used to measure the students’ achievement. One
type of language test which can be used to measure the students’ achievement
during the course is the summative test.
Reading is considered important alongside the other skills in language
learning. The reading summative test is administered at the end of each semester
in senior high schools. In this study, the researcher intended to find out the validity of
the reading summative test of the tenth grade senior high schools in Yogyakarta.
The reading test provides samples of the reading objectives. The objectives are in
the form of microskills and macroskills as stated in Brown (2004) and the syllabus
of the course. As a measurement instrument, the language test should have
high quality and validity in order to provide an accurate measurement. There are
four aspects in investigating the validity of the reading summative test:
face validity, content validity, item facility, and item
discrimination.
The face validity is the surface appearance of the test. Nevertheless, the face
validity has a significant effect on the students who take the test. If the test looks
valid, they are more likely to perform their best in doing the test. Even though the face validity
deals with the surface appearance of the test, the researcher needs experts’ help in
making judgments about the face validity of the reading summative test of the tenth
grade senior high schools in Yogyakarta. The experts are expected to give
comments on the test appearance in order to determine the face validity. The
comments cover the clarity of the instructions and questions, the length of the
passages, and the clarity of the pictures.
The content validity refers to the compatibility between the test items and
the objectives which are stated in the syllabus of the course. The test content
should provide a relevant sample of the reading objectives. The content
validation can be done by checking the texts or passages included in the test against
the text genres which are stated in the standard of competency and the basic
competence, and by matching each item of the test with the objectives stated in the
syllabus. The reading summative test of the tenth grade senior high schools in
Yogyakarta has content validity if it includes all text genres and provides
an appropriate number of items which match the objectives in the
syllabus.
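The matching of test items with syllabus objectives can be sketched as a simple tally. The Python example below is a hypothetical illustration, not the checklist used in this study: the function name, the objective labels, and the item-to-objective mapping are all invented for the example. It counts the items per objective, lists the objectives that no item covers, and lists the items whose objective is not in the syllabus.

```python
from collections import Counter
from typing import Dict, List, Tuple


def content_coverage(
    item_objectives: Dict[int, str], syllabus_objectives: List[str]
) -> Tuple[Dict[str, int], List[str], List[int]]:
    """Tally test items against syllabus objectives."""
    counts = Counter(item_objectives.values())
    # Number of items matched to each syllabus objective.
    items_per_objective = {obj: counts.get(obj, 0) for obj in syllabus_objectives}
    # Objectives in the syllabus that no item samples.
    uncovered = [obj for obj in syllabus_objectives if counts.get(obj, 0) == 0]
    # Items whose objective does not appear in the syllabus.
    off_syllabus = sorted(
        item for item, obj in item_objectives.items() if obj not in syllabus_objectives
    )
    return items_per_objective, uncovered, off_syllabus


# Hypothetical mapping of four items to objectives.
items = {
    1: "identify main idea",
    2: "find specific information",
    3: "identify main idea",
    4: "guess word meaning",
}
syllabus = ["identify main idea", "find specific information", "identify rhetorical steps"]
print(content_coverage(items, syllabus))
```

Deciding whether the resulting number of items per objective is "appropriate" remains a judgment for the researcher; the tally only organizes the evidence.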
The item facility (IF) in this study refers to the difficulty level of the items of the
reading summative test of the tenth grade senior high schools in Yogyakarta. It is
expressed as a statistical percentage that shows the difficulty level of each item. The
item facility (IF) provides supporting evidence in investigating the test validity.
Therefore, the test must provide an appropriate level of difficulty. The range of
the appropriate item facility of the test is between .15 and .85.
The item discrimination (ID) in this study refers to the ability of the
reading summative test items of the tenth grade senior high schools in Yogyakarta
to differentiate the high-ability and the low-ability students. The test items should be able
to separate or discriminate the students’ ability in doing the test. The test has
appropriate item discrimination if the ID index is .40 or above (≥.40).

CHAPTER III
RESEARCH METHODOLOGY

This chapter discusses the method used in this research. It consists of six
parts: research method, research setting, research object and participants,
instruments and data gathering techniques, data analysis techniques, and research
procedure.

A. Research Method
This research used document or content analysis as its method.
Document analysis may use both written and recorded data as the source of the
research. Fraenkel and Wallen (2008) defined content analysis as a technique to
study human behavior and communication indirectly. As Ary, Jacobs, Sorensen,
and Razavieh (2010) said, “Content analysis focuses on analyzing and interpreting
recorded material to learn about human behavior”. The researcher analyzed and
interpreted the data in order to comprehend the document. In this research, the
document analyzed was the reading summative test of the
tenth grade senior high schools in Yogyakarta. The test was administered at the
end of the first semester in 2011.
The researcher intended to report the validity of the reading summative
test of the tenth grade senior high schools in Yogyakarta using the document
analysis method. The document analysis enabled the researcher to categorize the
data before conducting the research. In this research, the categories of validity
included not only face validity and content validity, but also item facility and item
discrimination. Each category was analyzed using a different instrument in
order to find out the validity of the reading summative test.

B. Research Setting
The reading summative test of the tenth grade senior high schools was
administered on Tuesday, December 6, 2011. All state senior high schools in
Yogyakarta held the test at the same time. The research was conducted during the
even semester (February to June). The research included gathering the data,
analyzing them, and drawing the conclusion. The data gathering included obtaining
the experts’ opinions, the syllabuses, and the students’ answer sheets. The data
gathering was conducted at schools and on campus.

C. Research Object and Participants
The object of this research was the reading summative test of the tenth
grade senior high schools in Yogyakarta. It consisted of fifty
multiple-choice items. It was conducted at the end of the first semester in 2011. The
test was made by MKKS (Musyawarah Kerja Kepala Sekolah) Yogyakarta, or the
Principal Consultative Work of Yogyakarta. The test document was analyzed using
the document analysis method.
The researcher intended to obtain the opinions of some participants about
the face validity of the test. The participants of this research were experts in
English teaching who were familiar with language testing. They were three
lecturers of the English Language Education Study Program of Sanata Dharma University and
two senior high school English teachers in Yogyakarta. They were
given a checklist table of the face validity to assist them in commenting on the
reading summative test.

D. Instruments and Data Gathering Techniques
The focus of this research was the validity of the reading summative test of
the tenth grade senior high schools in Yogyakarta including the face validity, the
content validity, the item facility, and the item discrimination. The face validity of
the test was checked by some experts who were familiar with English teaching
and English testing. The content validity, the item facility, and the item
discrimination were analyzed by the document analysis method.
The face validity of the reading summative test concerned whether the test
looked right, that is, the surface appearance of the test. Since it was about the surface
appearance of the test, it included the clarity of the instructions, the clarity of the
questions, the length of the passages, and the clarity of the pictures. The face
validity excluded the content of the test in relation to the test specification or
objectives. The researcher gave three lecturers of the English Language Education Study
Program of Sanata Dharma University and two English teachers of senior high
schools in Yogyakarta a checklist table in order to obtain their comments about the
face validity of the test. The researcher intended to find out the face validity
of the reading summative test of the tenth grade senior high schools in Yogyakarta.
The following was the table of the face validity.

Table 3.1. The face validity comment on the reading summative test

No.   Aspect                   Criteria                                 Comment
1.    Instructions             a. Brief and understandable
                               b. Brief but not understandable
                               c. Long but understandable
                               d. Long and not understandable
2.    Questions                a. Brief and understandable
                               b. Brief but not understandable
                               c. Long but understandable
                               d. Long and not understandable
3.    Length of the passages   a. 50-100 words (elementary level)
                               b. 200-300 words (intermediate level)
                               c. 400-600 words (advanced level)
4.    Pictures                 a. Clear and relevant
                               b. Clear but irrelevant
                               c. Unclear but relevant
                               d. Unclear and irrelevant

The content validity of the reading summative test of the tenth grade
senior high schools in Yogyakarta was checked by using a checklist table.
In addition, the contents of the test were analyzed based on the standard of
competency and the basic competence of the School-Based Curriculum. The
syllabuses of the tenth grade senior high schools in Yogyakarta were also needed
to check the content validity of the reading summative test.
The syllabuses contained the reading standard competencies, basic
competencies, and the objectives. Therefore, the test items were analyzed based
on the syllabuses. The syllabuses were obtained from SMA N 3 Yogyakarta, SMA
N 6 Yogyakarta, and SMA N 11 Yogyakarta. The researcher analyzed the
syllabuses and combined their objectives into a single set of objectives as shown in the
following table. The following was the checklist table which was used to analyze
the content validity of the reading summative test.
Table 3.2. The content validity of the reading summative test

Standard competencies         Basic competencies                          Objectives   Item Number
5. Understanding the          5.1 Responding to the meaning of short
   meaning of short               functional text (e.g. announcement,
   functional text and            advertisement, invitation, etc.),
   essay in the form of           formal and informal, using various
   recount, narrative,            written language accurately,
   and procedure in the           fluently, and acceptably in the
   context of daily life          context of daily life.
   and accessing              5.2 Responding to the meaning and
   knowledge.                     rhetorical steps in essay using
                                  various written text accurately,
                                  fluently, and acceptably in the
                                  context of daily life and accessing
                                  knowledge in the text of: recount,
                                  narrative, and procedure.

The researcher intended to analyze the item facility and the item
discrimination by analyzing the students’ answer sheets and the students’
scores. Those data were obtained from three senior high schools in Yogyakarta.
The schools were SMA Negeri 3 Yogyakarta, SMA Negeri 6 Yogyakarta, and
SMA Negeri 11 Yogyakarta. Those schools were taken as representatives of the senior
high schools in Yogyakarta. The item facility (IF) of the reading summative test
of the tenth grade senior high schools in Yogyakarta was analyzed by using the
formula which was stated in Brown (2004).
\[
\mathrm{IF} = \frac{\text{\# of students answering the item correctly}}{\text{Total \# of students responding to that item}}
\]

Calculation of the item discrimination (ID) began with categorizing the
students’ scores. The students were divided into the high-ability students, the
middle-ability students, and the low-ability students. Then, the item facility (IF) of
the high-ability and of the low-ability students was calculated by using Brown’s (2004) formula. To
calculate the item discrimination (ID), the item facility for the lower category
(low-ability students) was subtracted from the item facility for the upper category
(high-ability students). The item discrimination (ID) of the reading summative test
of the tenth grade senior high schools in Yogyakarta was analyzed by using the
formula which was stated in Brown (2005).
\[
\mathrm{ID} = \mathrm{IF}_{\text{upper}} - \mathrm{IF}_{\text{lower}}
\]
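The two formulas can be illustrated with a short Python sketch. It is only a sketch under stated assumptions and is not the spreadsheet procedure used in this research: the function names and the response data are hypothetical, every student is assumed to have responded to every item (1 = correct, 0 = incorrect), and the upper and lower groups are assumed to be the top and bottom thirds of the students ranked by total score, a proportion that is not specified in the text.

```python
from typing import List


def item_facility_per_item(responses: List[List[int]]) -> List[float]:
    """IF of each item: correct answers divided by the number of respondents."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_students for i in range(n_items)]


def item_discrimination_per_item(
    responses: List[List[int]], group_fraction: float = 1 / 3
) -> List[float]:
    """ID of each item: IF of the upper group minus IF of the lower group."""
    # Rank students from the highest to the lowest total score.
    ranked = sorted(responses, key=sum, reverse=True)
    # Assumed group size: one third of the students (at least one student).
    group_size = max(1, round(len(ranked) * group_fraction))
    if_upper = item_facility_per_item(ranked[:group_size])
    if_lower = item_facility_per_item(ranked[-group_size:])
    return [upper - lower for upper, lower in zip(if_upper, if_lower)]


# Hypothetical answers of six students on four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(item_facility_per_item(responses))
print(item_discrimination_per_item(responses))
```

When several students share the total score at a group boundary, the group membership depends on their order in the data, so a real analysis would need an explicit rule for ties.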
Using the Microsoft Excel program, the data of the item facility (IF) and
the item discrimination (ID) could be combined into the same table. However, the
IF and ID tables of the three senior high schools were kept separate in order to make
them clear and understandable. The table contained student number, item number,
students’ total score, IF, IF upper, IF lower, and ID. The following was the table
which was used in calculating the item facility (IF) and the item discrimination
(ID) of the reading summative test.
Table 3.3. Item facility and discrimination of the reading summative test

St #       Items                                              Total
           1    2    3    4    5    6    7    8    9    10   etc.
1
2
3
4
5
IF
IF upper
IF lower
ID

E. Data Analysis Techniques
The main data source of this research was the reading summative test of the tenth
grade senior high schools. The researcher intended to examine the validity of the test
by analyzing the test items. The validity included the face validity, the content
validity, the item facility (IF), and the item discrimination (ID). After analyzing
the four aspects, the validity of the reading summative test was concluded. The
following were the techniques in analyzing those aspects:

1. Face Validity
The face validity showed whether the test appeared to measure what it was
intended to measure. The face validity was the surface appearance of the test. The researcher
analyzed the face validity of the test by analyzing the checklist results. The results
contained the participants’ opinions about the reading summative test. The
researcher gathered their opinions in Table 3.1 and analyzed them to find out
whether the test had face validity or not. The face validity included the clarity of
the instructions and the questions, the length of the passages, and the clarity of t