Content analysis and authenticity of the 2012 English test in the senior high school national examination.


ABSTRACT

Widyaningrum, Frisca Ayu Desi. 2014. Content validity and authenticity of the 2012 English test in the senior high school national examination. Yogyakarta: English Language Education Study Program, Sanata Dharma University.

The National Examination (UN) was the most important standardized test employed to assess Indonesian students’ competence, including English ability. Moreover, the final scores of the National Examination were intended to serve as a tool for selecting state university candidates. Given this significance, analyzing the validity and authenticity of the National Examination was important as well. Language test validity was categorized into face, content, construct, consequential, and criterion-related validity. Because of the limited time for analysis, only the content validity of the listening section and the five reading test versions of the National Examination was analyzed in this research. In addition, language test authenticity in this research referred to test tasks and test texts. This study mainly employed Brown’s theory (2004) of content validity and authenticity.

This study aimed to answer two questions, namely: 1) How valid is the content of the English test items of the 2012 National Examination for senior high schools in relation to the lesson objectives and test specifications? and 2) How authentic are the English test items of the 2012 National Examination for senior high schools in relation to the criteria of authenticity set by Brown?

The researcher employed qualitative research with document analysis. The research objects were the listening items and the five versions of the reading test items of the 2012 National Examination; the five versions were intended to prevent cheating among students. However, the five reading test versions shared similarities in both the test tasks and the test texts. The data were obtained by using checklists and were used to answer the research questions. In addition, questionnaires were distributed to four experts for data triangulation.

There were two findings of this research. First, the content of the 2012 National Examination was 98.8% valid, since almost all of the content was relevant to the test specifications. Three reading test versions failed to represent one kind of reading text, namely the explanation text. Second, the 2012 National Examination met the criteria of authenticity at a rate of 79.5%, since some listening and reading test items did not satisfy the authenticity criteria. Natural language use, the relevance of the test topics, and real-world representativeness were the problematic aspects that kept the test from meeting a higher standard of authenticity. This research was expected to be beneficial as a meaningful evaluation of the administration of the National Examination for senior high schools in Yogyakarta, as well as to be useful for English practitioners and future researchers.

Keywords: content validity, authenticity, English test items, National Examination, document analysis



ABSTRAK

Widyaningrum, Frisca Ayu Desi. 2014. An Analysis on content validity and authenticity of the 2012 English test in national examination for senior high schools. Yogyakarta: English Language Education Study Program, Sanata Dharma University.

Ujian Nasional (UN) merupakan tes standard terpenting yang diselenggarakan untuk menilai kompetensi para peserta didik di Indonesia, termasuk kemampuan berbahasa Inggris. Terlebih, nilai akhir UN dipersiapkan untuk menyeleksi calon mahasiswa perguruan tinggi negeri. Karena pentingnya UN, menganalisa validitas dan autentisitas UN juga penting. Validitas tes bahasa dikelompokkan menjadi validitas permukaan, validitas isi, validitas konstruksi, validitas sebab-akibat, dan validitas kriteria. Karena keterbatasan waktu analisis, hanya validitas isi dari soal mendengarkan dan kelima versi soal tes membaca yang dapat dianalisa dalam penelitian ini. Selain itu, autentisitas tes bahasa pada penelitian ini mengacu pada test tasks dan test texts. Penelitian ini pada dasarnya menggunakan teori Brown (2004) mengenai validitas isi dan autentisitas.

Penelitian ini bertujuan untuk menjawab dua rumusan masalah, yaitu: 1) Bagaimana naskah soal UN tahun 2012 untuk SMA memenuhi kriteria validitas isi dalam hubungannya dengan tujuan pembelajaran dan kisi-kisi soal? 2) Bagaimana naskah soal UN tahun 2012 untuk SMA memenuhi kriteria autentisitas kaitannya dengan teori Brown?

Peneliti menggunakan jenis penelitian kualitatif dengan analisis dokumen. Objek penelitian ini adalah soal tes mendengarkan dan kelima tipe naskah soal membaca di Ujian Nasional tahun 2012 yang bertujuan menghindari tindak kecurangan siswa. Namun, kelima naskah soal membaca tersebut memiliki banyak persamaan dalam hal test tasks dan test texts. Data analisis diperoleh dengan menggunakan checklists dan data tersebut digunakan untuk menjawab rumusan masalah. Selain itu, peneliti menyebarkan angket pertanyaan kepada empat ahli sebagai triangulasi data.

Ada dua temuan pada penelitian ini. Pertama, naskah soal UN tahun 2012 untuk SMA adalah 98.8% valid karena hampir seluruh butir soal sesuai dengan kisi-kisi soal. Ada tiga macam tipe tes membaca yang tidak merepresentasikan satu jenis teks yaitu explanation text. Kedua, soal UN tahun 2012 untuk SMA memenuhi kriteria autentisitas dengan persentase 79.5% karena beberapa soal mendengarkan dan membaca tidak sesuai dengan kriteria autentisitas. Kelaziman penggunaan bahasa, kesesuaian teks, dan adanya representasi kehidupan sehari-hari menjadi permasalahan autentisitas. Peneliti berharap penelitian ini berguna sebagai media evaluasi terhadap pelaksanaan UN untuk SMA di Yogyakarta serta berguna bagi praktisi pengajaran Bahasa Inggris dan peneliti lain di masa mendatang.

Kata Kunci: content validity, authenticity, English test items, National Examination, document analysis



CONTENT VALIDITY AND AUTHENTICITY OF THE 2012 ENGLISH TEST IN THE SENIOR HIGH SCHOOL

NATIONAL EXAMINATION

A SARJANA PENDIDIKAN THESIS

Presented as Partial Fulfillment of the Requirements to Obtain the Sarjana Pendidikan Degree

in English Language Education

By

Frisca Ayu Desi Widyaningrum
Student Number: 091214136

ENGLISH LANGUAGE EDUCATION STUDY PROGRAM
DEPARTMENT OF LANGUAGE AND ARTS EDUCATION
FACULTY OF TEACHERS TRAINING AND EDUCATION
SANATA DHARMA UNIVERSITY
YOGYAKARTA



Only those who dare to fail greatly

can ever achieve greatly.

Robert F. Kennedy

It is part of the job of life to figure out who you

are and what you have got.

- Happy Feet Two (2011)

Tell me and I forget.

Teach me and I remember.

Involve me and I learn.

Benjamin Franklin

I dedicate my thesis to:

Sanata Dharma University, out of my love for this campus, where I found friends, lessons, happiness,

wisdom, and life values,

and also to my family, out of my love for them.



STATEMENT OF WORK’S ORIGINALITY

I honestly declare that this thesis, which I have written, does not contain the work or parts of the work of other people, except those cited in the quotations and the references, as a scientific paper should.

Yogyakarta, 16 January 2014
The Writer

(Frisca Ayu Desi Widyaningrum)
(091214136)



STATEMENT OF CONSENT TO THE PUBLICATION OF ACADEMIC WORK FOR ACADEMIC PURPOSES

(LEMBAR PERNYATAAN PERSETUJUAN PUBLIKASI KARYA ILMIAH UNTUK KEPENTINGAN AKADEMIS)

I, the undersigned, a student of Sanata Dharma University:

Name: Frisca Ayu Desi Widyaningrum

Student Number: 091214136

For the advancement of knowledge, I grant the Sanata Dharma University Library my academic work entitled:

CONTENT VALIDITY AND AUTHENTICITY OF THE 2012 ENGLISH TEST IN THE SENIOR HIGH SCHOOL NATIONAL EXAMINATION

together with the necessary supporting materials (if any). I thereby grant the Sanata Dharma Library the right to store it, convert it to other media formats, manage it in a database, distribute it on a limited basis, and publish it on the internet or in other media for academic purposes, without needing to ask my permission or pay me royalties, as long as my name remains credited as the author.

I make this statement truthfully.

Made in Yogyakarta

On: 16 January 2014

Declared by



ACKNOWLEDGEMENTS

First of all, I would like to extend my great gratitude to my savior, Jesus Christ, for His endless love and countless blessings. He always lights my way whenever I feel hopeless, and He always helps me to stand strong.

I would like to express my gratitude to my thesis advisor, Carla Sih Prabandari, S.Pd., M.Hum, for her great patience, guidance, suggestions, and encouragement, thanks to which I was able to finish my thesis well. I would also like to thank the lecturers of the English Language Education Study Program of Sanata Dharma University, especially those who contributed to the completion of this thesis: Adesti Komalasari, S.Pd., M.A., Ag. Hardi Prasetyo, S.Pd., M.A., Drs. Barli Bram, M.Ed., Ph.D., Markus Budiraharjo, B.Ed., M.Ed., Ed.D., Sr. Margareth, FCJ, Veronica Triprihatmini, S.Pd., M.Hum., M.A., and my academic advisor, Christina Kristiyani, S.Pd., M.Pd.

I would like to express my gratitude to the English teacher at SMA Negeri 7 Yogyakarta, Dra. Dorothea Sri Ismayawati, for her help, support, and care. I would also like to thank the PBI secretariat staff, Mbak Dhanniek and Mbak Linda, who helped me manage all administrative matters. I also extend my thanks to the Sanata Dharma Library officers for their friendly and warm service.

I would like to give my special gratitude to my mother, Ibu Francisca Hartini, who is always with me with her sincere prayers. I thank her for always raising me up with her support when I am down and for patiently teaching me to be a tough woman. I would also like to give my special gratitude to my siblings, Anindya Marthasari, Bagus Rilo Pambudhi, and Annisa Dela Widhiastuti, for their cheerfulness and support.

I would personally like to thank my best friends: Efrem Justitia Suksma, Yut Liyut, Vicky, and Agustina for their support; Helen, Bruder Markus, Sekar, Ita, Niken, Pipiet, Bertha, and Denny for their assistance and encouragement. My thanks also go to my play fellows: Unggul, Erik, Momon, Dovi, Yudha, Veri, and Yulius Nico. I am also grateful to all my friends in the English Language Education Study Program of Sanata Dharma University for their cheerfulness, encouragement, and help, especially my play performance partners in A Fortiori, my classmates in class F, my classmates in TW class, and my SPD group mates (Niko, Linda, Ika, Budi, Vita, and Tunggul). Last but not least, I sincerely thank those whom I cannot mention one by one for their help and support. May God always bless them all with endless happiness!

Best Regards, Frisca Ayu Desi Widyaningrum


TABLE OF CONTENTS

TITLE PAGE
APPROVAL PAGES
DEDICATION PAGE
STATEMENT OF WORK’S ORIGINALITY
PERNYATAAN PERSETUJUAN PUBLIKASI
ABSTRACT
ABSTRAK
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF APPENDICES

CHAPTER I. INTRODUCTION
A. Research Background
B. Research Problems
C. Problem Limitation
D. Research Objectives
E. Research Benefits
F. Definition of Terms

CHAPTER II. REVIEW OF RELATED LITERATURE
A. Theoretical Description
   1. Language Testing
   2. Language Test
      a. Paper-and-Pencil Language Tests
      b. Performance Tests
   3. Test Purposes
      a. Language Aptitude Tests
      b. Proficiency Tests
      c. Placement Tests
      d. Diagnostic Tests
      e. Achievement Tests
   4. Validity
      a. Face Validity
      b. Content Validity
      c. Criterion-Related Evidence
      d. Construct Validity
      e. Consequential Validity
   5. Authenticity
B. Theoretical Framework

CHAPTER III. METHODOLOGY
A. Research Method
B. Research Objects
C. Research Instruments
D. Data Gathering Techniques
E. Data Analysis Techniques
F. Research Procedures

CHAPTER IV. RESULTS AND DISCUSSION
A. Content Validity of 2012 English Test Items in National Examination for Senior High Schools
   1. Validity of the Test Specifications
   2. Content Validity of the Listening Test Items according to Competence Standard and Basic Competence
   3. Content Validity of the Listening Test Items according to Graduate Competence Standard
   4. Content Validity of the Reading Test Items according to Competence Standard and Basic Competence
   5. Content Validity of the Reading Test Items according to Graduate Competence Standard
B. Authenticity of 2012 English Test Items in National Examination for Senior High Schools
   1. Authenticity of the Listening Test Items
   2. Authenticity of the Reading Test Items
      a. Authenticity of the Test Tasks
      b. Authenticity of the Test Texts
C. Other Findings

CHAPTER V. CONCLUSIONS AND RECOMMENDATIONS
A. Conclusions
B. Recommendations


LIST OF TABLES

Table 3.1 The Sample of Test Specification Validation Checklist
Table 3.2 The Sample of Authenticity Checklist
Table 3.3 The Sample of Authenticity of the Test Text Checklist
Table 4.1 The Percentages of Content Validity and Authenticity of the Test Items
Table 4.2 The Dissemination of the Reading Test Items according to Competence Standard and Basic Competence
Table 4.3 The Dissemination of the Reading Test Items according to Graduate Competence Standard
Table 4.4 The Dissemination of the Similar Kinds of the Reading Test Items
Table 4.5 The Percentages of the Authenticity of the Test

LIST OF APPENDICES

Appendix 1 Transcription of the Listening Section
Appendix 2 The Document of Competence Standard and Basic Competence
Appendix 3 The Document of Graduate Competence Standard
Appendix 4 Checklists


CHAPTER I

INTRODUCTION

This chapter describes the research background, research problems, problem limitation, research objectives, and research benefits. In addition, this chapter explains the definitions of the terms used in this research. Each part is described as follows.

A. Research Background

In terms of its functions, the National Examination in Indonesia is the highest standardized test employed to assess and measure Indonesian students’ competence. By passing the National Examination, Indonesian students are able to graduate from a given education level and continue their study at the next level. This is stated in Education Ministry Regulation (No. 59/2011) on National Examination, “National Examination, abbreviated as UN (Ujian Nasional), is a national standardized test which is administered nationally in order to test the students’ competency achievement on a particular subject in a group of science and technology”. The regulation includes the procedures for administering the exam and for composing the exam questions. The regulation implies that the National Examination is administered in an orderly manner and that the test items are not designed randomly but with reference to specifications and lesson objectives.


According to Education Ministry Regulation (No. 22/2006), National Examination materials are generally based on the Competence Standard and Basic Competence of each level of educational unit, which are included in the Content Standards (Standard Isi). The Competence Standard and Basic Competence comprise lesson materials from the first up to the last class level. Furthermore, the Competence Standard and Basic Competence of each level of educational unit serve as the reference for creating the Graduate Competence Standard (Standard Kompetensi Lulusan), which consists of test specifications. The policy regulates the subjects which are examined in the National Examination. In senior high schools, for example, the subjects examined depend on students’ majors. However, some subjects, such as Indonesian and English, are examined in the National Examination for all majors, such as natural science, social science, and linguistics.

The test-makers (Department of Education) need to pay attention to the test’s content validity and authenticity in order to make good test items, particularly in the National Examination. Content validity is one of the facets of validity, and it is important since it helps the test reflect the measured skills which should be performed by students. The American Psychological Association (1985) describes the validity of a test as revealing how meaningful, appropriate, and useful the test scores are (as cited in Rudner and Shafer, 2002, p.12).

Mousavi (2002) explains that if a test requires test-takers to perform the behavior that is going to be measured, it is considered a valid test (as cited in Brown, 2003, p.22). Seif (2004) explains that measuring the content of a test can be conducted by assessing a sample of the test questions. If a test does not have content validity, the examiners have no logical basis for determining whether students have achieved the learning objectives set for a particular level of education (as cited in Jandhagi and Shaterian, 2008, p.2).

In addition, the National Examination test-makers need to pay attention to the authenticity of the test as well as its content validity. Bachman and Palmer (1996) define authenticity as “the degree of correspondence of the characteristics of a given language test task to the features of a target language task” (p.23). In relation to the National Examination, authenticity involves two important parts, namely test task characteristics and test text characteristics, as in a reading test. The test tasks refer to the test instructions and the answer options of the items, while the test texts refer to the passages used in the test.

In general, Brown (2004) states that authenticity is determined by five indicators, namely the natural language which is used in the test, the contextualization of the items, the relevance of the test topics to the learners, the presence of thematic test item organization, and the representation of real-world tasks or sources (p.28). Those five indicators are important since they correlate with the degree of authenticity of a text, and they will be explained further in chapter two. Authenticity is as important as validity since it gives students a picture of how the target language is used in real situations. Students will struggle to use the language in context unless the National Examination reflects authenticity. Moreover, it is important that the materials used in the test are relevant to students’ majors. If the materials used in the language test are not relevant to the majors, it will be difficult for students to understand the content. As a result, the texts’ content is not communicated well to the students.

A current issue in Indonesian society is that some universities are going to use the final results of the National Examination to select university candidates instead of administering a selection test. Djoko Santoso, Director General of Higher Education (DGHE) at the Ministry of Education and Culture (Kemendikbud), explains in Tempo (Monday, June 4th, 2012) that Indonesian state universities agree to use National Examination scores as a requirement for selecting university candidates. He adds that this policy has two purposes. The first purpose is to decrease the expenses of the students’ parents, since the parents spend extra money to pay for a selection test if their children are going to university. The second purpose is to integrate Indonesian education, with the National Examination becoming a selection instrument for entering a higher educational level. At present, the Indonesian Government integrates school levels only from elementary schools up to senior high schools or vocational high schools. Therefore, students cannot continue their study to university unless they take a selection test.


The researcher intends to analyze the content validity and authenticity of the English items of the 2012 National Examination for senior high schools, since the National Examination is the highest standardized test employed to assess Indonesian students’ competence. Moreover, students’ National Examination scores are being prepared for possible use as a selection tool. For these reasons, the validity and authenticity of the National Examination are essential to consider.

B. Research Problems

The problems of this research are formulated as follows:

1. How valid is the content of the English test items of the 2012 National Examination for senior high schools in relation to the lesson objectives and the test specifications?

2. How authentic are the English test items of the 2012 National Examination for senior high schools in relation to the criteria of authenticity set by Brown?

C. Problem Limitation

The researcher intends to analyze the sets of English items of the 2012 National Examination for senior high schools. The National Examination administered to Indonesian senior high school students measures the competence of 12th grade students so that they can graduate from senior high school and continue their study at university. The English items of the 2012 National Examination for senior high schools consist of two sections, namely the listening section and the reading section. Both sections are required in the National Examination according to the Competence Standard and Basic Competence. Therefore, the researcher focuses the analysis on both the listening and the reading sections of the English test items of the 2012 National Examination for senior high schools.

Because of the variety of National Examination test types in Indonesia and the impracticality of studying the English test items of the 2012 National Examination for senior high schools across the whole of Indonesia, the analysis was focused on the test items administered in Daerah Istimewa Yogyakarta. The researcher took the sets of English items of the National Examination for natural science and social science students that were administered at SMA Bopkri 2 Yogyakarta. This school was chosen because its test sets are representative of the English test items administered in Yogyakarta.

The set of English items of the National Examination is differentiated into five versions (type A57, type B69, type C71, type D32, and type E45). The versions generally differ in their test questions, but the questions of all five types are based on the Graduate Competence Standard, which contains the test specifications. Furthermore, the distinction between test types is intended to prevent cheating among students. Therefore, all types of the test items are analyzed in order to make the data more valid. The discussion of this research is focused on the validity of the test content of the 2012 English National Examination and the authenticity of its test items.


According to Brown (2004), the validity of a test, particularly a language test such as an English test, is determined by several aspects, namely face validity, content-related evidence (content validity), criterion-related evidence, consequential validity, and construct validity. However, this research is focused on content-related evidence (content validity) in analyzing the validity of the English items of the 2012 National Examination. The authenticity of the test items is determined by the authenticity indicators proposed by Brown (2004). The indicators are the natural language which is used in the test, the contextualization of the items, the relevance of the test topics to the learners, the presence of thematic test item organization, and the representation of real-world tasks or sources (p.28).

D. Research Objectives

The researcher analyzes the content validity and authenticity of the English test items of the National Examination in order to obtain more information about the quality of the English items of the 2012 National Examination for senior high schools in terms of their content validity and authenticity. The subject analyzed is English. The sets of English items of the National Examination administered at SMA Bopkri 2 Yogyakarta come in five test types: type A57, type B69, type C71, type D32, and type E45. The research is carried out in order to accomplish the following objectives:

1. To obtain information on whether the listening and reading test items of the 2012 English National Examination for senior high schools meet content validity.

2. To obtain information on whether the listening and reading test items of the 2012 English National Examination for senior high schools meet authenticity.

E. Research Benefits

This study offers benefits for English teachers, the test-makers of the National Examination, and future researchers. The benefits derive from the research data and the findings revealed in the analysis. They are described in detail as follows.

1. For English Teachers and English Practitioners

This study gives an analysis of the content validity and the authenticity of the English items of the 2012 National Examination for senior high schools. Content validity and authenticity both contribute to a good test, and this research provides information about both of these principles of language assessment. Therefore, from this research, English teachers can become more aware of the English items of the National Examination in the following years.

2. For National Examination Test-designers

This study gives an analysis of the content validity and the authenticity of the English items of the 2012 National Examination for senior high schools. The data of this study can be used to evaluate the sets of English items of the National Examination that the test-makers have made in terms of content validity and authenticity. Furthermore, the findings of this research could be significant considerations for the National Examination test-designers in designing English language tests, particularly for future National Examinations.

3. For Future Researchers

This study provides meaningful data related to the content validity and authenticity of the English test items of the 2012 National Examination, which future researchers can use as references in conducting research on English language tests. Moreover, future researchers can explore this research further and reveal other findings by applying other, more technical research instruments to obtain data. As a result, more detailed information related to the content validity and authenticity of English tests could be discovered.

F. Definition of Terms

This section defines the terms used in this study, which relate to content validity, authenticity, National Examination, and senior high school. The definitions are taken from several experts and certain education regulations. The terms are defined as follows.


1. Content Validity

Gronlund (1998) says, “validity is the extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment” (as cited in Brown, 2004, p.22). It simply means that in order for a test to be valid, appropriate, and meaningful, the test should reflect the lesson objectives. Besides reflecting the lesson objectives, the test result should be appropriately connected to the purpose of the test. One of the facets of validity is content validity.

Brown (2004) adds that if a test has content validity, the items of the test represent the subject matter or behavior being measured, so that achievement or proficiency can be evaluated (p.23). In addition, the test content should reflect the target tasks which are organized in the test specifications and the lesson objectives. In this study, the researcher deals with the content of the English test items of the 2012 National Examination for senior high schools in order to analyze the validity of its content by comparing the test content with the relevant test specifications incorporated in the Graduate Competence Standard and the lesson objectives incorporated in the Competence Standard and Basic Competence.

2. Authenticity

Authenticity is a matter of appropriateness with respect to the content and construction of the test items. The authenticity of test items can be analyzed in terms of two elements, namely test tasks and test texts. In relation to the test task, Bachman and Palmer (1996) explain that authenticity is the degree of correspondence between the characteristics of a given language test task and the characteristics of a target language task (as cited in Brown, 2004, p.28). It indicates that for a test task to meet authenticity, the test task should simulate a real-world task, and its aim should not be to test a grammatical form of the language. This is also related to a statement about authentic texts from Williams (1984), who states that authentic texts are written to convey a message (as cited in Day, 2003, p.4).

It means that authentic texts aim mainly at communication and not at teaching grammar or lexis. In addition, the language used in both test tasks and test texts should be as natural as the target language. The target language used in the National Examination is American English and British English. In summary, authenticity comprises five visible and important indicators, as Brown (2004) claims. These indicators are the natural language used in the test, contextualization of the items, relevance of the test topics to the learners, presence of thematic test item organization, and representation of real-world tasks or sources (p.28).

3. National Examination

According to Education Ministry Regulation (No. 59/2011) on National Examination, “National Examination, abbreviated as UN, is a national standardized test administered nationally in order to test students’ competence achievement on a particular subject in a group of science and technology”. The National Examination is administered at each educational level, from elementary school to senior high school. The test measures and assesses the students’ competency in the particular subjects of study which are examined. The National Examination is administered annually to students in the highest grade of an educational level so that they can complete that level of education and then continue their study at a higher level. In senior high schools, the National Examination is administered to each major: natural science, social science, and linguistics. The National Examination is prepared by the Department of Education, which instructs groups of teachers (MGMP) to make the test items. The test items for each region are different in order to prevent fraudulence by the students. Accordingly, there were five types of the 2012 National Examination administered to senior high schools in Yogyakarta, and the types will be explained in chapter four.

This study deals with the five versions of the English items of the 2012 National Examination for senior high schools which were used in Yogyakarta. The test versions are A57, B69, C71, D32, and E45. The five test versions refer to the reading section, since the listening section of the 2012 National Examination is available in only one version. This study focuses on the test items which are administered to natural science and social science students.

4. Senior High Schools

There are two types of high schools in Indonesia, SMA (senior high schools) and SMK (vocational high schools), and the difference between them lies in the school objectives. SMA or senior high schools prepare students to study at universities, while SMK or vocational high schools prepare students to work in certain jobs. Indonesian Government Regulation No. 29/1990 on secondary education asserts that students of secondary schools are prepared to continue their education to a higher educational level. Senior high schools are divided into three general levels, namely the first level (grade 10), the middle level (grade 11), and the last level (grade 12). When senior high school students enter the 11th grade, they have to choose one of three majors, namely Natural Science, Social Science, and Linguistics. With regard to the Competence Standard and Basic Competence of these majors, natural science and social science have the same contents, while those of linguistics are different. Therefore, the tests for natural and social science are the same, while the test items for linguistics are different. In line with the background of this study, the researcher deals only with senior high schools. The test items with which the researcher deals are those for Natural Science and Social Science students in senior high schools.



CHAPTER 2

REVIEW OF RELATED LITERATURE

This chapter presents a review of theoretical writings and research related to the subject of the study. Furthermore, this chapter helps the researcher to answer the two research questions. The chapter contains two major parts, namely the theoretical description and the theoretical framework.

A. Theoretical Description

1. Language Testing

Testing refers to the activity of examining individuals or things in order to reveal certain information. Besides revealing information, testing is conducted to measure one’s capability, knowledge, or performance in a certain domain. It has an essential role in social life, and its function is to measure a person’s ability in general or with respect to specific objectives (McNamara, 2000). Furthermore, a test serves as a gateway at important transitional moments in education; for instance, students who are going to continue their study at a university have to pass certain tests, such as an achievement test and a university selection test. A language test, in particular, is related to one’s skill in using or performing language. Brown (2004) explains that language skills can be performed in the form of speaking, writing, reading, or listening (p.3). In their development, language tests now simulate real-life situations.


2. Language Test

A test is an examination of one’s knowledge or ability, and it commonly consists of questions to be answered or activities to be performed. In other words, Brown (2004) defines tests as instruments for measuring ability and knowledge (p.3). In practice, language tests are differentiated according to the way they are composed or designed (method) and the purpose of designing the test itself. In terms of method, language tests are differentiated into two types, namely traditional paper-and-pencil language tests and performance tests (McNamara, 2000).

In order for a language test to be effective, the language test items should meet the principles of language assessment. Two of the principles are content validity and authenticity. Brown (2004) explains that there are five criteria for evaluating a test, namely practicality, reliability, validity, authenticity, and washback (p.19).

a. Paper-and-pencil language tests

Paper-and-pencil language tests are well known as written tests. According to McNamara (2000), paper-and-pencil language tests are employed to assess separate components of language knowledge (such as grammar and vocabulary) and the receptive skills (listening and reading comprehension). The test items are commonly in a fixed response format, since such tests are easy to administer and score. McNamara (2000) explains that students do not construct the answer in fixed response format tests, for instance in the multiple choice format. Instead, they choose one of the answer options. These tests are considered ineffective for assessing the productive skills of speaking and writing.


An example of a paper-and-pencil language test is the National Examination, since its test items are in multiple choice format. Besides, the aim of administering the National Examination is to assess the students’ receptive skills. Receptive skills refer to the ability to receive information and understand it by listening or reading.

b. Performance tests

Nowadays, performance tests are well known as oral tests. Performance tests are different from paper-and-pencil language tests since they assess the act of communication (McNamara, 2000). Therefore, they are specifically used to assess speaking and writing skills.

3. Test Purposes

In terms of test purposes, language tests are differentiated into five types, namely language aptitude tests, proficiency tests, placement tests, diagnostic tests, and achievement tests. Brown (2004) explains that by defining the purpose of testing, the test-designers can focus more closely on the specific objectives of the tests (p.42). Besides, it helps the test-designers to compose the items.

a. Language Aptitude Tests

According to Brown (2004), a language aptitude test is “a test which is designed to measure capacity or general ability to learn a foreign language and ultimate success in that undertaking” (p.43). Language aptitude tests are not commonly employed, since they are used to predict one’s achievement in learning a language. Performance on such tests is assessed through processes such as mimicry, memorization, and puzzle-solving (Brown, 2004). The reasoning behind such prediction is doubtful, since factors like appropriate self-knowledge, active strategic involvement in learning, and strategies-based instruction influence somebody’s success. Examples of language aptitude tests are the Modern Language Aptitude Test (MLAT) and the Pimsleur Language Aptitude Battery (PLAB).

b. Proficiency Tests

Proficiency tests are not limited to one course, curriculum, or particular language skill. Instead, they test all skills. Brown (2004) explains that proficiency tests consist of standardized multiple-choice items on grammar, vocabulary, reading comprehension, listening comprehension, writing skill, and sometimes oral production performance (p.44). In addition, McNamara (2000) states that proficiency tests are not tied to the process of teaching, since they take future ‘real-life’ language use as the criterion. Therefore, the scores of this kind of test have a gate-keeping function, especially in education and employment. An example of a standardized proficiency test is the Test of English as a Foreign Language, well known as TOEFL.

c. Placement Tests

Brown (2004) states that placement tests are tests which are employed to place a student correctly into a certain level or course. Since they are used for placing a student into a certain level or course, the tests commonly comprise a sample of the material covered at various levels in a certain curriculum (p.45). The tests vary; they can be in the form of a written test or an oral test, depending on the nature of the program. An example of a standardized placement test is the English as a Second Language Placement Test (ESLPT) at San Francisco State University.

d. Diagnostic Tests

Brown (2004) explains that diagnostic tests are used to diagnose certain aspects of a language (p.46). In practice, the test administrators have a checklist of features in order to pinpoint difficulties. The diagnostic test results help the teachers to decide which aspects they have to focus on. Besides, the results provide information that makes the students aware of their errors.

e. Achievement Tests

According to Brown (2004), an achievement test is limited to certain materials related to a curriculum within a particular time frame. An achievement test is used to determine whether the objectives of the course have been met by the end of an instruction period (p.47). Therefore, an achievement test contributes to the teaching-learning process, since it is related to classroom lessons, units, or the curriculum. An example of an achievement test is the National Examination.

4. Validity

Bachman (1990) explains that in order for a test score to be a meaningful indicator of an individual’s ability, the test should be concerned only with the ability which is intended to be tested (p.238). Bachman (1990) adds that the validity of a test shows the quality of the test itself. When a test meets validity, the test score effectively reflects the true condition of students’ competence (p.236). Moreover, the score can serve as a meaningful indicator. However, in order to meet validity, the test should reflect the skills or behavior to be assessed. There are five types of validity used to determine whether or not a test is valid, namely face validity, content-related evidence (content validity), criterion-related evidence, consequential validity, and construct validity.

a. Face Validity

Gronlund (1998) states that a test is considered to have face validity if the students view the test as fair, relevant, and useful for improving learning (as cited in Brown, 2004, p.26). Face validity itself refers to the extent to which the test looks right and obviously appears to measure the skills which are intended to be measured. Furthermore, according to Brown (2004), the criteria of a test which has face validity are that the test is well constructed, the test has a time allotment, the items are clear and simple, the directions are clear, the tasks meet content validity, and the difficulty level presents a reasonable challenge (p.27).

b. Content Validity

According to APA (1954), content validity refers to the degree to which the content of assessment items reflects the content domain of interest (as cited in Miller, 2003, p.2). Shepard (1993) adds that content validity serves as a basis for making inferences from the result. It is “evidence-in waiting” (as cited in Miller, 2003, p.5). It means that whenever a test meets validity in its content, the items of the test represent the skills or behavior to be measured in order to evaluate achievement.


Therefore, the scores of the test can be used effectively as meaningful indicators of students’ competence. For instance, a reading test is considered valid if it measures reading skill and nothing else. Such a test is not a valid test of speaking or vocabulary because it does not test speaking or vocabulary. However, Seif (2004) claims that this does not mean all educational objectives of a particular course must be included in the test. For the sake of test practicality, the test designers should compose a number of questions which are representative of the set educational goals. Seif (2004) also claims that content validity is one of the essential considerations in composing a test (as cited in Jandhagi and Shateria, 2008, p.2). If a test does not meet validity in its content, there are two possible outcomes. First, students are not able to demonstrate the needed skills because those skills are not included in the test. Second, there may be some inappropriate questions which students are not able to answer. Therefore, the test tasks should be appropriate to the test specifications in the blueprints. This is in line with Seif (2004), who says that the content validity of a test can be evaluated by matching a sample of the test questions to the test instructions (as cited in Jandhagi and Shateria, 2008, p.2). Crocker and Algina (1986) add that this ‘matching method’ effectively ensures validity (as cited in Miller, 2003, p.12).

According to Bachman and Palmer (1996), a blueprint is a complete plan providing the characteristics needed to develop entire tests (p.90). It contains task specifications for all types of tasks which are to be included in a particular test. Blueprints are evaluation tools for checking whether or not the test items are appropriate to the test specifications stated in them. Brown (2004) states that test specifications include the general outline of the test and the test tasks (p.50). The test specifications refer to a certain curriculum, and they consist of only the general outline of the materials and skills to be tested, since the test designers should consider test practicality.

c. Criterion-Related Evidence

Brown (2004) explains that criterion-related evidence refers to the criterion of the test which is expected to be achieved. Criterion-related evidence is commonly categorized into two types, namely concurrent validity and predictive validity (p.24).

d. Construct Validity

Brown (2004) states that construct validity plays a major role in test design. Furthermore, it is a main concern in validating large-scale standardized tests of aptitude (p.25). It means that in making a test or testing a person, the test-designers or the examiners should adhere to practical procedures and principles. For example, in determining the scoring criteria of a speaking test, the examiner should consider factors such as pronunciation, accuracy, vocabulary use, and sociolinguistic appropriateness.

e. Consequential Validity

Consequential validity refers to the consequences of a test. According to Brown (2004), these consequences include such considerations as the test’s accuracy in measuring the intended criteria, its impact on the preparation of test-takers, its effect on the learners, and the intended or unintended social consequences of the test’s interpretation and use (p.26). Besides, the effects of test preparation courses and manuals fall under consequential validity.

5. Authenticity

There are many different views of authenticity, since experts define it in various ways. Scarcella and Oxford (1992) state that authenticity refers to unedited and unabridged text (as cited in Day, 2003, p.4). Widdowson (1976), in contrast, emphasizes that authenticity is not only about the quality of a text; rather, authenticity is reached when the readers understand the writer’s intention (p.264). Williams (1984) explains that authentic texts are written to convey a message (as cited in Day, 2003, p.4). It means that the purpose of authentic texts is mainly communication.

According to Richards (2001), authentic materials are important in language teaching and learning. They offer several benefits:

1. Since authentic materials are used for communication and exist in the real world, they are considered more interesting and motivating than created materials.

2. Since they are authentic, the materials are considered to contain a great deal of appropriate information about the target language.

3. Authentic materials are not composed to illustrate particular grammatical rules or discourse types. They resemble true language (pp. 252-253).

However, Richards (2001) adds that there are some criticisms of the use of authentic materials, as follows.

1. Authentic materials contain difficult language and vocabulary, which potentially distract learners and teachers.

2. Using authentic materials burdens teachers, since they have to find suitable materials for teaching. The teachers cannot easily simplify the materials because the simplified versions would be considered inauthentic (p.253).

In judging whether test items are authentic, Bachman and Palmer (1996) claim that people cannot determine that test items are authentic just by looking at them (pp.28-29). There are two kinds of authenticity in the case of the National Examination test items: task characteristics and text characteristics. Task characteristics represent the authenticity of the test instructions; therefore, the focus is on the test tasks (a set of test instructions and the provided options). In addition, text characteristics play an essential role in ensuring that test items meet authenticity. They represent the appropriateness of the passages used as the test materials.

There are several indicators for measuring the authenticity of test items, since authenticity cannot be determined just by looking at them. According to Brown (2004), in order for a test to meet the criteria of authenticity, the test items should satisfy five conditions: the test language is as natural as possible, the items are contextualized, the topics are relevant to the learners, the items have some thematic organization, and the tasks represent real-world tasks (p.28). These conditions are utilized to analyze the authenticity of the National Examination test items with regard to the test tasks.

Natural language use refers to “the language of ordinary speaking and writing” (Merriam-Webster Dictionaries, 2013), and it cannot be separated from linguistic facets such as typographical mistakes (in reading materials), lexis, morphemes, word order and grammar (syntactic matters), diction, and meaning (semantic matters). The language of the test tasks and test texts should resemble the language as it is naturally used in reality (Brown, 2004, p.28).

The second indicator is the contextualization of the test items, which means that the test items are organized in an orderly way around the same topics, for example, in a story line. The third indicator is the relevance of the test topics to the learners, which means that the materials should be appropriate to the learners’ ability. In some cases, authentic passages have a difficult level of language which may burden language learners at a lower level. This point is emphasized by Brown (2004), who explains that one of the authenticity criteria is that the topics used in the test should be relevant to the learners (p.28), and by Nuttall (1996), who says that texts with a high level of language are not suitable for improving or developing reading skills (p.177).

In order to make test items more authentic, the test designers should consider giving the items some thematic organization. This indicator is related to the second indicator, since both encourage the test designers to organize the test items in an orderly way along some story line. The last indicator is that the tasks should represent real-world tasks. It means that authentic materials are selected from real-world sources (Brown, 2004, p.28). In addition, the test tasks should be designed to elicit certain information related to the text rather than to ask for grammatical forms or lexical items. Williams (1984) explains that authentic texts are written to convey a message (as cited in Day, 2003, p.4). The statement indicates that the purpose of authentic texts is mainly communication, not the teaching of grammatical forms.

Brown (2004) adds that in listening items, authenticity emphasizes dialogues or monologues spoken by native speakers which represent conversations that happen in real life (p.28). It indicates that natural language use is important for achieving authenticity; for example, a listening test should contain hesitations, white noise, and interruptions.

B. Theoretical Framework

A language test is a systematic method to measure one’s capability, knowledge, or performance in a certain domain in relation to language use. In order to be useful, a language test should meet the criteria of a good test, for instance reliability, validity, practicality, and authenticity (Brown, 2004). Therefore, a language test should be of high quality, since it is a measurement of students’ capability. One type of language test is the English test of the National Examination. In terms of method, following the categories of McNamara (2000), the National Examination is a paper-and-pencil language test (written test). Paper-and-pencil language tests belong to receptive tests because they test receptive skills such as listening and reading. In terms of test purpose, the National Examination is categorized as an achievement test (McNamara, 2000). As an achievement test, the National Examination corresponds to the classroom lessons, units, or curriculum (Brown, 2004). The bases for composing the National Examination are the Competence Standard, the Basic Competence, and the test specifications which are incorporated in the Graduate Competence Standard. In order to be useful as an assessment tool, a language test such as the National Examination should meet the principles of language assessment. There are two criteria on which the researcher focuses, namely content validity and authenticity.

A valid listening test is a test whose content is composed based on the blueprints. If the topics are relevant to the test specifications, the listening test is valid (Brown, 2004). Likewise, a valid reading test is a test whose content is composed based on the blueprints. If the topics are relevant to the test specifications, the reading test is valid. Content validity is important to consider because it bears on the effectiveness of the test. If a language test does not meet content validity, it probably affects the students’ ability to perform the intended skill, and the students are probably not able to answer the test questions (Seif, 2004). Therefore, it is important to check the content validity of language tests. The test-designers or teachers can check it by matching the test items with the relevant test specifications and lesson objectives.
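The matching of test items against test specifications described above can be illustrated with a small sketch. It is not the checklist instrument used in this study; the function name, item numbers, and specification labels below are hypothetical and serve only to show how matched items, unmatched items, and uncovered specifications could be tallied.

```python
def match_items_to_specs(item_indicators, spec_indicators):
    """Match each test item to the test specifications.

    item_indicators: dict mapping item number -> the skill the item assesses
                     (as judged by the analyst; hypothetical labels here).
    spec_indicators: list of skills listed in the test specifications.
    Returns (matched, unmatched_items, uncovered_specs).
    """
    spec_set = set(spec_indicators)
    # Items whose assessed skill appears in the specifications
    matched = {item: skill for item, skill in item_indicators.items()
               if skill in spec_set}
    # Items that assess something outside the specifications
    unmatched_items = sorted(item for item in item_indicators
                             if item not in matched)
    # Specifications that no item assesses
    uncovered_specs = sorted(spec_set - set(matched.values()))
    return matched, unmatched_items, uncovered_specs


# Hypothetical example data, not taken from the National Examination
specs = ["identify main idea", "find detailed information", "infer word meaning"]
items = {
    1: "identify main idea",
    2: "find detailed information",
    3: "identify grammatical form",
    4: "identify main idea",
}

matched, unmatched_items, uncovered_specs = match_items_to_specs(items, specs)
share_matched = len(matched) / len(items)
print(f"Items matching the specifications: {share_matched:.0%}")
print("Items outside the specifications:", unmatched_items)
print("Specifications not covered:", uncovered_specs)
```

In this sketch, the proportion of matched items is only a descriptive figure that supports the qualitative judgment; it does not replace the item-by-item comparison with the specifications and lesson objectives.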

Authenticity is one of the important facets of language assessment, since it reflects how well the language test represents real-world tasks and true language use (Richards, 2001). In addition, the use of authentic materials benefits the learners, since authentic materials present the true language in context and help students by providing appropriate information about the target language (Richards, 2001). For the National Examination to be regarded as an authentic assessment, its test-designers should consider two important parts of authenticity, namely test task characteristics and test text characteristics (Bachman and Palmer, 1996).

Task characteristics include five aspects, namely the naturalness of the test language, the contextualization of the items in the test, the relevance of the test topics to the learners, the existence of some thematic organization of the items, and the representativeness of real-world tasks (Brown, 2004). These five aspects refer to the quality of the test tasks in reading and listening tests. The naturalness of the test language in reading test items involves linguistic aspects, namely typography, lexis, morphology, syntax, and semantics. The naturalness of the test language shows the appropriateness of the test language to the target language.

The target language of the English test in the National Examination is American English and British English. This is because American English and British English have become international languages, used as means of communication by most people throughout the world. Meanwhile, in listening tests, the naturalness of the test language refers to the existence of hesitations, white noise, and interruptions (Brown, 2004). The contextualization of the test items refers to the organization of the test items, which is related to the existence of some thematic organization of the items. Another indicator is the relevance of the test topics to the learners, which means that the materials should be appropriate to the learners. The last indicator is that the tasks should represent real-world tasks, which means that authentic materials are taken from real-world sources.


Besides the test tasks, the test text characteristics are important for achieving authenticity, and the text characteristics adapt the five indicators of test authenticity. Three indicators are used to check the authenticity of reading texts, namely the naturalness of the test language, the relevance of the test topics to the learners, and the representativeness of real-world tasks.

Authenticity is a matter of the appropriateness of the content and construction of both test tasks and test texts. It is important to consider in composing a language test because of its benefits: it presents the true language in context and provides appropriate information about the target language (Richards, 2001). Authentic materials are not used to teach grammar or language discourse; rather, they show genuine and reliable language (Richards, 2001). After learning a language using authentic materials, students are expected not to be confused in using the language. Therefore, if the English items of the listening and reading sections of the 2012 National Examination meet the criteria for both task characteristics and text characteristics, the test is considered to be a relatively highly authentic test.
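The framework above, with five indicators for test tasks and three for test texts, can be sketched as a simple checklist. The sketch below is an illustration only and not the instrument used in this study; the indicator labels are shortened, the judgments are invented example data, and no cut-off for a “relatively highly authentic” test is assumed because the framework does not state one.

```python
# Shortened labels for the indicators named in the framework (Brown, 2004)
TASK_INDICATORS = (
    "natural language",
    "contextualized items",
    "relevant topics",
    "thematic organization",
    "real-world tasks",
)
TEXT_INDICATORS = (
    "natural language",
    "relevant topics",
    "real-world tasks",
)


def indicators_met(judgments, indicators):
    """Return the indicators judged as met and their proportion.

    judgments: dict mapping indicator label -> True/False (analyst's judgment).
    indicators: the tuple of indicator labels that must all be judged.
    """
    missing = [name for name in indicators if name not in judgments]
    if missing:
        raise ValueError(f"No judgment recorded for: {missing}")
    met = [name for name in indicators if judgments[name]]
    return met, len(met) / len(indicators)


# Hypothetical judgments for one test item and its passage
task_judgments = {
    "natural language": True,
    "contextualized items": True,
    "relevant topics": True,
    "thematic organization": False,
    "real-world tasks": True,
}
text_judgments = {
    "natural language": True,
    "relevant topics": True,
    "real-world tasks": True,
}

task_met, task_share = indicators_met(task_judgments, TASK_INDICATORS)
text_met, text_share = indicators_met(text_judgments, TEXT_INDICATORS)
print(f"Task indicators met: {len(task_met)} of {len(TASK_INDICATORS)}")
print(f"Text indicators met: {len(text_met)} of {len(TEXT_INDICATORS)}")
```

Keeping task and text indicators in separate lists mirrors the distinction between task characteristics and text characteristics, so that an item can be judged on each part independently.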


CHAPTER 3

METHODOLOGY

In this chapter, the researcher discusses the research methodology. It consists of six parts: research method, research participants, research instruments, data gathering techniques, data analysis techniques, and research procedures. Each part is explained in detail as follows.

A. Research Method

This research is qualitative research, which McRoy et al. (1988) define as a kind of research that focuses on non-statistical methods and analysis of social phenomena. Qualitative research uses detailed descriptions from the perspective of the research participants as a means to examine the specific issues and problems under study. It means that through qualitative inquiry, the researcher analyzes the research participants in their natural setting without any manipulation of the data variables. According to Myers (1997), the data of qualitative research are in the form of descriptive data, not in the form of numbers (as cited in Hunt, n.d., p.2).

In this qualitative research, descriptive data drawn from certain documents are used. Where numbers appear in the discussion, they serve only to make the explanation of the research findings clearer. The research is conducted with descriptive analysis because it concerns the measurement of documents rather than prediction. This study is intended to find out the content validity and the authenticity of the English items of the National Examination year 2012 for senior high schools. The data analyzed are descriptive rather than statistical; numbers are employed only to describe the data in detail.

One of the qualitative research methods is document analysis. According to Berelson (1947), document analysis is a systematic research technique for observing evidence of concepts in instructional documents (p. 74). The documents may be of various types, such as written documents (public records, private papers, and biographies), photographs, posters, maps, artifacts, motion pictures, and sound recordings. Since this research deals with documentation rather than examination and entails in-depth analysis of a set of collected data, the researcher applies the document analysis method. The primary documents analyzed in this research are the five versions of the English test of the National Examination year 2012 for senior high schools.

In addition, Fraenkel and Wallen (2008) state that document analysis is useful for obtaining information on educational matters (p. 497). In line with this statement, this research is conducted to collect information related to the research problems stated in Chapter 1, namely how the content of the English test items of the National Examination year 2012 for senior high schools meets the criteria of a valid test and how those items meet the criteria of an authentic test.

B. Research Objects

The objects of this study are five sets of English test items of the National Examination year 2012 for senior high schools, which were intended to be administered in Yogyakarta. The researcher obtained the copy of the test from an English teacher at SMA Negeri 7 Yogyakarta who had taught at SMA Bopkri 2 Yogyakarta. The sampled English test items of the National Examination year 2012 are for the natural science and social science majors. Since both majors have the same lesson objectives, which are incorporated in the Competence Standard and Basic Competence, the test is the same for both majors.

There are five versions of the test, namely A57, B69, C71, D43, and E45. The English test of the National Examination is divided into two sections: listening and reading. The listening section, which was composed in a single version, consists of fifteen questions numbered 1 to 15, while the reading section of each test version consists of 35 questions numbered 16 to 50. Therefore, each version of the English test of the National Examination year 2012 for senior high schools contains 50 items in total.
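The item layout described above can be written out as a small sketch. This is only an illustration in Python of the counts stated in the text; the names `VERSIONS`, `LISTENING_ITEMS`, `READING_ITEMS`, `item_numbers`, and `total_items_per_version` are hypothetical and do not come from the original study.

```python
# Illustrative sketch of the test layout stated in the text.
# All identifiers are hypothetical names chosen for this example.

VERSIONS = ("A57", "B69", "C71", "D43", "E45")  # the five test versions
LISTENING_ITEMS = range(1, 16)   # items 1 to 15, one shared listening version
READING_ITEMS = range(16, 51)    # items 16 to 50 in every reading version


def item_numbers(version: str) -> dict:
    """Return the item numbers of each section for one test version."""
    if version not in VERSIONS:
        raise ValueError(f"unknown test version: {version}")
    return {
        "listening": list(LISTENING_ITEMS),
        "reading": list(READING_ITEMS),
    }


def total_items_per_version() -> int:
    """Listening items plus reading items in a single test version."""
    return len(LISTENING_ITEMS) + len(READING_ITEMS)
```

Calling `total_items_per_version()` reproduces the total of 50 items per version given in the text (15 listening items plus 35 reading items).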

C. Research Instruments

The instruments of this research are documents related to the National Examination and the researcher. The documents employed in this research are the blueprints (the Competence Standard and Basic Competence of senior high school grades 10 to 12, and the Graduate Competence Standard) and five sets of English items of the National Examination year 2012 for senior high schools, which were intended to be administered in Yogyakarta. Furthermore, the test specifications elaborated in the Competence Standard-Basic Competence and the Graduate Competence Standard are developed into checklists. The checklists are employed as the instruments of this research to obtain the intended information, and they are later used to check the validity and authenticity of the English test. The researcher is a research instrument as well, since the data are compiled, analyzed, and processed by the researcher, who constructs conclusions about what is regarded as data.

D. Data Gathering Techniques

In this research, the researcher analyzes organized documents as the main resources. The primary resources are in the form of descriptive data. The resources employed in this research are the English test items of the National Examination year 2012 for senior high schools; the blueprints, which consist of the Graduate Competence Standard year 2012 for senior high schools and the Competence Standard and Basic Competence of the English subject for senior high school grades 10 to 12; and the criteria of authentic language testing defined by Brown (2004). The researcher then elaborates the test specifications incorporated in the Competence Standard-Basic Competence document and the Graduate Competence Standard year 2012 document into checklists in order to obtain information on the content validity of the English test items of the National Examination year 2012 for senior high schools. The authenticity criteria are elaborated into checklists as well in order to obtain information on the authenticity of those test items.

The copies of the English items of the National Examination year 2012 for senior high schools, the Competence Standard-Basic Competence document, and the Graduate Competence Standard year 2012 document were obtained from an English teacher at SMA Negeri 7 Yogyakarta while the researcher was conducting teaching practice (Praktek Pengalaman Lapangan) at that school. The authenticity criteria applied to obtain data in this research were taken from Brown's (2004) principles of language assessment.

E. Data Analysis Techniques

This research focused on analyzing the English test items of the National Examination year 2012 for senior high schools in order to answer the research problems. The analysis process consisted of checking the content validity of the English test and the authenticity of its items. Alderson et al. (1995) state that one procedure for evaluating the content validity of a language test is to compare the test content with its specifications (p. 193). This is supported by Crocker and Algina (1986), who argue that the 'matching method' effectively ensures validity (as cited in Miller, 2003, p. 12).


Before the content of the English test of the National Examination year 2012 for senior high schools was compared with the blueprints, the test specifications in the blueprints were compared with each other to ensure that they were valid as an instrument for assessing the validity of the test. The test specifications incorporated in the Graduate Competence Standard year 2012 were an elaboration of the lesson objectives in the Competence Standard and Basic Competence of the English subject for senior high school grades 10 to 12. The Graduate Competence Standard contained the specific skills and topics that were covered in the National Examination. The Competence Standard and Basic Competence of the English subject for senior high school grades 10 to 12 contained the competences and lesson objectives that the students were required to achieve.

Meanwhile, Brown (2004) explains that authenticity is reflected in several criteria: the language in the test is as natural as possible, items are contextualized, topics are meaningful (relevant, interesting), some thematic organization of items is provided, and the tasks represent real-world tasks (p. 28). He adds that a listening comprehension section should contain natural language with hesitations, white noise, and interruptions. Besides natural language use, the topics should be taken from real-world sources. Next, the checklists to analyze the content validity and the authenticity of the English test were composed. There were three checklists. The first checklist was employed to check the validity of the test specifications. The second checklist was employed to check the validity of the listening and reading test items against the test specifications in the blueprints. The last checklist was used to check the authenticity of the listening and reading test items. The techniques for analyzing the data are described as follows.

1. A checklist to check the validity of the test specifications

The checklist was employed to determine whether or not the Graduate Competence Standard (Standard Kompetensi Lulusan) was consistent with the Competence Standard and Basic Competence of senior high school grades 10 to 12.

Table 3.1 The Sample of Test Specification Validation Checklist

Columns: No. | Competence Standard | Basic Competence | Graduate Competence Standard

Listening
1. Transactional and interpersonal conversations
2. Transactional and interpersonal conversations in formal and sustained situation
3. Transactional and interpersonal conversations in formal and sustained situation
4. Short functional texts and simple monologue texts

Reading
5. Short functional texts and simple essays

The checklist was divided into two parts, a listening section and a reading section, since the skills measured in the National Examination are listening and reading. Each section had four columns. The first column contained the numbers. The second column contained the lesson objectives and material topics taken from the Competence Standard (four material topics for listening skills and one material topic for reading skills). The third column contained the lesson objectives and material topics taken from the Basic Competence for the listening or reading skills (four material topics for listening skills and one material topic for reading skills). The last column was for the Graduate Competence Standard (Standard Kompetensi Lulusan). For a clearer illustration, the checklist can be seen in Appendix 4 on pages 104 to 105.

The checklists were filled in by ticking (✓) the last column, which represented the correspondence between the Graduate Competence Standard and both the Competence Standard and Basic Competence. A tick (✓) in a box meant that the criteria in the Competence Standard and Basic Competence of senior high school grade XII were stated in the Graduate Competence Standard (Standard Kompetensi Lulusan). If a box in the Graduate Competence Standard column was not ticked, the criteria in the Competence Standard and Basic Competence of senior high school grade XII were not stated in the Graduate Competence Standard. After comparing the blueprints, the researcher calculated the result, and the final result was presented as a percentage.
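The tick-and-percentage step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researcher's actual procedure: it assumes the percentage is the share of checklist rows whose Graduate Competence Standard box is ticked, and the names `ChecklistRow` and `percentage_ticked`, as well as the tick values in the sample rows, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChecklistRow:
    """One row of a validation checklist (hypothetical structure)."""
    number: int
    section: str   # "Listening" or "Reading"
    topic: str     # material topic taken from the blueprint
    ticked: bool   # True if the Graduate Competence Standard box is ticked


def percentage_ticked(rows):
    """Share of ticked rows, as a percentage of all rows (assumed formula)."""
    if not rows:
        raise ValueError("the checklist has no rows")
    ticked = sum(1 for row in rows if row.ticked)
    return 100.0 * ticked / len(rows)


# Topics follow Table 3.1; the tick values are invented for illustration only.
sample_rows = [
    ChecklistRow(1, "Listening", "Transactional and interpersonal conversations", True),
    ChecklistRow(2, "Listening", "Conversations in formal and sustained situation", True),
    ChecklistRow(3, "Listening", "Conversations in formal and sustained situation", False),
    ChecklistRow(4, "Listening", "Short functional texts and simple monologue texts", True),
    ChecklistRow(5, "Reading", "Short functional texts and simple essays", True),
]

print(f"{percentage_ticked(sample_rows):.1f}% of the rows are ticked")
```

With these invented values, four of the five rows are ticked, so the sketch reports 80.0%. The same function could be applied to the content validity and authenticity checklists if they are scored in the same way, which the text does not confirm.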

2. Checklists to check the content validity of English Items on listening and reading section of National Examination

In order to analyze the content validity of the English test items of the National Examination, the researcher utilized other checklists. There were two


Test item number | The Criteria of Authenticity
(continued) | Real-world representativeness ✓
48 (E45) | Natural language use ✓; Contextualized items ✓; Relevant topic ✓; Episodic item organization ✓; Real-world representativeness ✓
49 (E45) | Natural language use ✓; Contextualized items ✓; Relevant topic ✓; Episodic item organization ✓; Real-world representativeness ✓
50 (E45) | Natural language use ✓; Contextualized items ✓; Relevant topic ✓; Episodic item organization ✓; Real-world representativeness ✓

Test passages | The Criteria of Authenticity
A Message to Mr. Anwar | Natural language use -; Relevant topic ✓; Real-world representativeness -
A Letter from Indrawan Tan | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
Country Club | Natural language use -; Relevant topic ✓; Real-world representativeness -
A Cap Seller | Natural language use -; Relevant topic ✓; Real-world representativeness -
Tourist Boats Collided in Thailand | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
Dr. Abdulrachman Saleh | Natural language use -; Relevant topic ✓; Real-world representativeness -
Notice to Bid Purchase | Natural language use -; Relevant topic -; Real-world representativeness -
Photosynthesis | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
Raja Ampat | Natural language use -; Relevant topic ✓; Real-world representativeness -
A Laptop | Natural language use -; Relevant topic ✓; Real-world representativeness -
Smoking | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
Boarding School Education | Natural language use -; Relevant topic ✓; Real-world representativeness -
Laskar Pelangi | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
A Fantastic Holiday | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
A Super Mother | Natural language use -; Relevant topic ✓; Real-world representativeness -
A Letter from Imam Subagyo | Natural language use -; Relevant topic ✓; Real-world representativeness -
Office Suites | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
ASEAN | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
Milton Friedman | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
CALL for Proposals | Natural language use -; Relevant topic ✓; Real-world representativeness ✓
Remote Sensing | Natural language use -; Relevant topic ✓; Real-world representativeness -
Kapoposang | Natural language use -; Relevant topic ✓; Real-world representativeness -
A Natural Disaster | Natural language use -; Relevant topic ✓; Real-world representativeness -
Being on Time | Natural language use -; Relevant topic ✓; Real-world representativeness -
Solar Energy | Natural language use -; Relevant topic ✓; Real-world representativeness -
Cooking Noodles | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
A Trip to Hampshire | Natural language use -; Relevant topic ✓; Real-world representativeness -
A Letter from ICTG | Natural language use -; Relevant topic ✓; Real-world representativeness -
Galaxy Tour | Natural language use
Indonesia Government and WWF | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
Agustinus Adisucipto | Natural language use -; Relevant topic ✓; Real-world representativeness -
Announcement to All Staff | Natural language use -; Relevant topic ✓; Real-world representativeness -
An Aurora | Natural language use -; Relevant topic ✓; Real-world representativeness -
Wakatobi | Natural language use -; Relevant topic ✓; Real-world representativeness -
Toba Eruption | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
Study Groups | Natural language use -; Relevant topic ✓; Real-world representativeness -
Having a Pet | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
My Time in Valencia | Natural language use -; Relevant topic ✓; Real-world representativeness -
Kids’ Watching TV | Natural language use -; Relevant topic ✓; Real-world representativeness -
A Letter from Annie Wright | Natural language use -; Relevant topic ✓; Real-world representativeness -
Prime Plaza Hotels and Resorts | Natural language use -; Relevant topic ✓; Real-world representativeness -
Madonna Sues Manhattan co-op Board | Natural language use -; Relevant topic ✓; Real-world representativeness -
Alfred Bernhard Nobel | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
Rocks | Natural language use -; Relevant topic ✓; Real-world representativeness -
Negeri Sembilan | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
Immune System | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
/Agriculture | Natural language use ✓; Relevant topic ✓; Real-world representativeness -
Beggars | Natural language use -; Relevant topic ✓; Real-world representativeness -
Visiting Oceanorium | Natural language use -; Relevant topic ✓; Real-world representativeness -
Octopuses | Natural language use -; Relevant topic ✓