addition, Morrison (1993), as cited in Cohen, Manion, and Morrison (2000), states that:
A survey research usually generates numerical data, provides descriptive, inferential and explanatory information, uses the same instruments for all participants or samples, captures data from multiple choice, closed questions, test scores, or observation schedules, and gathers data which can be processed statistically (p. 171).
In conducting the research, the data were collected from the samples, the students of PMat 2010, by means of a test. As descriptive research should, the presentation of the collected data focused on identifying the errors according to their classifications and sources rather than on counting the exact number of errors each sample made.
B. Research Setting
In order to gain sufficient data to answer both the first and the second research questions, the researcher conducted the research on the students of PMat 2010. The research took place on Campus 3 of Sanata Dharma University, particularly in the Mathematics Education Study Program.
C. Research Sample and Population
To successfully conduct the survey research on interrogative word question constructions, the researcher used samples to represent the whole population of students of PMat 2010. The samples of the survey research were 38 students. The researcher believes that this number of students adequately represents the population, since the population had been limited beforehand.
Based on that limitation, the samples, 38 students, were selected from six classes of the English course attended by 71 students of PMat 2010. Those students had taken English lessons in their previous semesters and had been taught how to construct interrogative word questions using the 5W 1H interrogative words in the English course.
D. Research Instrument
In order to obtain sufficient data to answer both the first and the second research questions, the researcher used a test as the research instrument. The test was given to identify the errors made by the samples and to determine the sources of those errors.
1. Test
“Test is a set of stimuli presented to an individual in order to elicit responses on the basis of which a numerical score can be assigned” (Ary, Jacobs, & Razavich, 1990, p. 227). In addition, according to Cohen, Manion, and Morrison (2000), a test can be a very powerful instrument of data collection because the data gathered are numerical rather than verbal. Further, the score achieved on the test is used to measure the characteristics of the samples based on the indicator.
2. Types of Test Item
According to Cohen, Manion, and Morrison (2000), the content of a test must be fitted to the purpose of the test itself. The purpose of the test is represented by the types of test item used by the researcher. There are many types of test item that can be used as a research instrument. Considering the purposes of using a test as the research instrument, especially to answer the second research question, the researcher decided to use five types of test item, namely the transformation test, completion test, word-order test, multiple-choice test, and translation test.
In order to collect as much data as possible and to cover the characteristics of errors in constructing interrogative word questions, the researcher constructed the test items with an equal number of each type. The researcher made 50 test items, which were divided equally among the five types of test item; thus, each type contained 10 items. To make it clearer, the researcher provided the table of test item distribution as follows:
Table 3.1 Distribution of Test Items

Type of Test Item    Number of Items
Transformation       10
Completion           10
Word Order           10
Multiple Choice      10
Translation          10
3. Validity and Reliability of the Test
When designing a test as the research instrument, the researcher should consider two essential requirements of a test: validity and reliability. If the test used in the research is valid and reliable, the data obtained from the test are considered valid and reliable for answering the research questions.
a. Validity
“Validity is an important key of an effective research... However, it is impossible for research to be 100% valid” (Cohen, Manion, & Morrison, 2000, p. 105). Besides, Ary, Jacobs, and Razavich (1990) argue that validity is always specific: it is specific to the particular purpose of the instrument being used, and an instrument that is valid in one situation may not be valid in a different situation or for a different purpose. Hatch and Farhady (1982) add that the validity of a test refers to the extent to which the results of the procedure serve the uses for which they were intended; it refers to the results, not to the test itself.
1) Content Validity
Based on the theory proposed by McNamara (2008), content validity refers to the relevance of the test content to what is measured. It means that the test given to the samples of the research should be relevant to what the researcher wants to measure. Therefore, the test items should be designed according to the purpose for which the test is given. In addition, Hatch and Farhady (1982) state that the focus of content validity is on the adequacy of the sample rather than on the appearance of the test.
Considering that a test should have content validity, the test items given to the samples of the research, the students of PMat 2010, were designed to represent the content validity of a language test. Besides, the test items were also constructed based on the purposes for which the test was given. In this research, the purposes of the test were to discover the errors the samples made regarding the construction of interrogative word questions and the sources of those errors. Furthermore, the test items contained the material to be measured, related to the use of what, who, where, when, why, and how. As a result, the test items were valid enough to measure what was intended to be measured.
2) Construct Validity
Construct validity of a test is defined as the criterion used to validate large-scale standardized tests of proficiency (Brown, 2004). Further, construct validity is used to determine how well the test scores represent the learning objectives and how well the test scores predict the particular performance (Hatch & Farhady, 1982). In addition, McNamara (2008) proposes that the test construct underlies the ability being measured by the test.
In order to represent the construct validity of a good test, the test items were designed to demonstrate what would be measured. In this case, the purposes
of the test were to define the errors and the sources of errors in English interrogative word questions made by students of PMat 2010. Thus, the test items
were divided into five types namely transformation, completion, word-order, multiple-choice, and translation.
The first part of the test, the transformation test, requires the samples of the research to transform statements into appropriate interrogative word questions that question the subject, object, adverb, or quantifier. In the second part, the completion test, the samples of the research are asked to complete 10 incomplete interrogative word questions; what they should write to complete the questions are only the interrogative words. Further, the third part is the word-order test. In this part, the samples of the research are required to construct correct interrogative word questions
according to the jumbled words given in each number. The fourth part, the multiple-choice test, requires the samples of the research to pick the one best answer dealing with the use of interrogative words.
Finally, the translation test is constructed to find out how well the samples of the research understand the construction of interrogative word questions in both languages they use, English and Bahasa Indonesia. In this part, students of PMat 2010 are required to translate 5 English interrogative word questions into Bahasa Indonesia and 5 interrogative word questions in Bahasa Indonesia into English.
3) Face Validity
Mousavi (2002), as cited in Brown (2004), states that face validity refers to the degree to which a test appears to measure the proficiency or ability, in the judgment of the examinees who take it. It means that the face validity of the test represents the relevance of the test in the eyes of the examinees or test takers. As a matter of fact, face validity is not something that can be tested scientifically, yet it can be judged by whether the test takers encounter uncomplicated items, clear directions, and a suitable difficulty level.
Considering that matter, the researcher constructed 50 test items which were uncomplicated and suited to the level of English proficiency of the samples. The test items were constructed based on the preliminary observation done before the test was given. Moreover, the test items use everyday words that the samples have already mastered. Some test items were also adapted and taken from the module used in the English course, which students of PMat 2010 were taking at that time. Besides, to support high face validity, the researcher also consulted the test items with the advisor and revised them twice.
b. Reliability
“Reliability of a test is the extent to which a measuring device is consistent in measuring whatever it measures” (Ary, Jacobs, & Razavich, 1990, p. 256). In addition, Brown (2004) states that reliability can be achieved by making sure that all test takers receive the same quality of input. Besides, it should represent the degree of consistency of what is being measured. Mostly, reliability focuses on the scores achieved by the test takers relative to the equivalence of the test items given.
According to Ary, Jacobs, and Razavich (1990), the reliability of a test can be established by four methods: test-retest reliability, equivalent-forms reliability, split-half reliability, and homogeneity measures. In order to support an appropriate answer to the second research question, the researcher used the fourth method, homogeneity measures.
Homogeneity measures estimate the reliability of a test on the assumption that all of the test items are equivalent (Ary, Jacobs, & Razavich, 1990). There is no need to establish different difficulty levels for the items or to group them into two halves, as in the split-half technique. By using homogeneity measures, the researcher therefore treated all of the test items included in the test sheet as equivalent.
Furthermore, to give the best answer, particularly to the second research question, the researcher had to make sure that the test used as the research instrument was reliable. Thus, the researcher used the formula proposed by Kuder and Richardson to compute the reliability of the test used in the research.
The formula, which is determined based on the proportions of correct and incorrect responses, is known as Kuder-Richardson formula 20 (Ary, Jacobs, & Razavich, 1990, p. 277). The formula can be seen as follows:

r_xx = (K / (K − 1)) × ((S_x² − Σpq) / S_x²)

where:
K = number of items on the test
S_x² = variance of scores on the total test (the squared standard deviation)
p = proportion of correct responses on a single item
q = proportion of incorrect responses on the same item

The computation of the test reliability yielded a coefficient of 0.86 (see Appendix D).
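The KR-20 computation described above can be sketched in a few lines of code. The following Python sketch is illustrative only, not the actual computation used in the research: the function and variable names (kr20, responses) are hypothetical, and it assumes each item is scored dichotomously (1 = correct, 0 = incorrect).

```python
# Illustrative sketch of the Kuder-Richardson formula 20 (KR-20).
# 'responses' is a matrix of dichotomous item scores: one row per
# test taker, one column per test item.

def kr20(responses):
    """r_xx = (K / (K - 1)) * ((S_x^2 - sum(p*q)) / S_x^2)."""
    n_takers = len(responses)
    k = len(responses[0])                     # K: number of items on the test
    totals = [sum(row) for row in responses]  # total score of each test taker
    mean = sum(totals) / n_takers
    # S_x^2: variance of the total scores (population variance here;
    # a sample-variance convention dividing by n - 1 is also common).
    var_total = sum((t - mean) ** 2 for t in totals) / n_takers
    sum_pq = 0.0
    for item in range(k):
        p = sum(row[item] for row in responses) / n_takers  # proportion correct
        sum_pq += p * (1 - p)                               # q = 1 - p
    return (k / (k - 1)) * ((var_total - sum_pq) / var_total)
```

For instance, a 3-item test taken by four students whose total scores are 3, 2, 1, and 0 yields a coefficient of 0.75 under this formula.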
The reliability coefficient shown above indicates that the test given to the samples of the research, the students of PMat 2010, was reliable. According to Purpura (2004), test reliability ranges on a scale from zero (0) to one (1), in which zero represents low reliability and one represents high or perfect reliability. In addition, Gall, Gall, and Borg (2007) propose that a test which yields scores with a reliability coefficient higher than 0.80 is considered to have sufficient reliability, especially for research purposes.
E. Data Gathering Technique
A data gathering technique is used to gain as much information as possible to answer the research questions. In this research, the researcher gathered the data to answer both research questions by distributing a test containing 50 test items to the samples of the research, the students of PMat 2010. The data were collected by checking and analyzing the erroneous answers made by the samples of the research, as shown in the test sheets.
Formally, the test sheets were distributed in three phases. The first phase was conducted on Tuesday, October 30, 2012 in class E, consisting of 8 students. The second phase was conducted on Wednesday, October 31, 2012 in class F, consisting of 11 students. Further, the third phase was conducted on Friday, November 2, 2012 in class B, consisting of 19 students. Each test session lasted 45 minutes. During the test, the samples of the research were asked to answer all 50 test items individually. Almost all of the samples understood what to do; only a few asked for further explanation of the procedure for answering the test items.
Since the test aimed to provide the best answers to both the first and the second research questions, namely to discover the errors and to determine the sources of errors in interrogative word questions, the test takers were not allowed to use any kind of dictionary. They were also prohibited from cheating and from discussing the answers with other test takers. Observing these rules, the samples of the research did the test quietly and seriously. After they finished, they were instructed to recheck their answers for 10 minutes before submitting the test paper and leaving the classroom.
F. Data Analysis Technique