3.6 Type of the Data
The researcher used a quasi-experimental design to conduct this study. Since it was experimental research, the writer needed quantitative data, that is, data in the form of numbers. The data were obtained from the students’ scores on the vocabulary test in the pre-test and post-test of the experimental group and the control group.
3.7 Instrument for Collecting the Data
An instrument is one of the most important elements of an experiment, because the data are obtained by means of it. An instrument is a device that a researcher uses when applying a research method. In this research, a test was used as the research instrument.
A test is a method of measuring a person’s ability or knowledge in a given domain. The researcher needed a test to measure the students’ mastery of vocabulary. The test combined a vocabulary matching task format and a selected-response fill-in vocabulary format, and consisted of 40 items focusing on vocabulary.
3.7.1 Try Out
Before the test was used to collect the data, a try-out test was conducted to determine whether the test was valid and reliable. If an item was not valid or reliable, it had to be revised or replaced. The try-out test was held at SMPN 39 Semarang with eighth grade students in the academic year 2014/2015. For this try-out test, the writer took a class which belonged to neither the experimental nor the control group. There were 32 respondents in the try-out test.
The respondents had to answer 40 questions in 45 minutes in the vocabulary matching task format and the selected-response fill-in vocabulary format. As mentioned by Brown (2004:198), matching tasks have the advantage of offering an alternative to traditional multiple-choice or fill-in-the-blank formats and are sometimes easier to construct than multiple-choice items, as long as the test designer has chosen the matches carefully.
The result of this try-out test was the basis for judging the validity and reliability of the instrument used in gathering the data. To assess the reliability of the test, the writer administered the try-out test once. If the result based on the students’ scores was good enough, the test was considered reliable. Thus, the test could be used as the instrument to collect the data.
3.7.2 Qualities of the Test
A test is considered good if it has three qualities: validity, reliability, and practicality. In this study, the writer focused only on the validity and reliability of the test, since those qualities are essential to the effectiveness of any data-gathering procedure.
3.7.2.1 Validity
According to Gronlund, as cited in “Language Assessment” (Brown, 2004:22), validity is the extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment. In other words, validity is a standard or criterion that shows whether the instrument serves its intended purpose. To measure the validity, the writer used the Product Moment formula:
$$ r_{xy} = \frac{N \sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left\{N \sum X^{2} - \left(\sum X\right)^{2}\right\}\left\{N \sum Y^{2} - \left(\sum Y\right)^{2}\right\}}} $$
where:
r_xy = the correlation of the scores
ΣX = the total of students who have the right answer
ΣY = the total of students’ scores
X = the number of the students who have the right answer
Y = the students’ scores
N = the number of students
(Arikunto, 2009:72)
After the writer obtained the r_xy value, the next step was to consult the r Product Moment table.
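As a rough illustration of the Product Moment formula above, the following Python sketch computes r_xy for a single item. The function name `product_moment` and the six-student data set are hypothetical and serve only as an example; they are not the try-out data of this study.

```python
import math


def product_moment(x, y):
    """Return the product-moment correlation r_xy between two score lists.

    x: scores on one item (here 1 = correct, 0 = incorrect), one per student
    y: total test scores, one per student
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r_xy is undefined when x or y has no variance")
    return numerator / denominator


# Hypothetical example data, not taken from the study.
item_scores = [1, 0, 1, 1, 0, 1]
total_scores = [35, 20, 30, 38, 18, 28]
r_xy = product_moment(item_scores, total_scores)
print(f"r_xy = {r_xy:.3f}")
```

The resulting value would then be compared with the critical value in the r Product Moment table for the relevant number of students.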
3.7.2.2 Reliability
Reliability is defined as the consistency that an instrument or procedure demonstrates over a period of time. Heaton (1975:167) states that “reliability is a necessary characteristic of any good test: for it to be valid at all, a test must be first reliable as a measuring instrument. If the test is administered to the same candidates on different occasions, then, to the extent that it produces differing results, it is not reliable.”
To find out whether the test was reliable, the writer used the following formula:
$$ r_{11} = \left[\frac{k}{k-1}\right]\left[\frac{S^{2} - \sum pq}{S^{2}}\right] $$
where:
r_11 = the reliability of the test
k = the number of items
p = the proportion of subjects answering the item correctly
q = the proportion of subjects answering the item incorrectly (q = 1 − p)
S² = the total variance
(Arikunto, 2009:100)
The result was then compared with the critical value for r Product Moment. When the obtained coefficient is higher than the critical value for r Product Moment, the test is considered reliable at the 5% (α = 0.05) level of significance (Arikunto, 2006:184).
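The reliability formula above (commonly known as KR-20) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `kr20` is hypothetical, the response matrix is invented, and the total variance is computed as a population variance (dividing by the number of students), which is an assumption rather than something stated in this chapter.

```python
def kr20(responses):
    """Return the KR-20 reliability coefficient r_11.

    responses: one list per student, each holding 0/1 scores for the k items.
    """
    n = len(responses)
    k = len(responses[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 students and 2 items")

    # Total score per student and the variance of those totals (S^2).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    # Assumption: population variance (divide by n, not n - 1).
    variance = sum((t - mean_total) ** 2 for t in totals) / n
    if variance == 0:
        raise ValueError("r_11 is undefined when all total scores are equal")

    # Sum of p*q over items, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1 - p)

    return (k / (k - 1)) * ((variance - sum_pq) / variance)


# Hypothetical 4-student, 3-item example, not taken from the study.
example = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(f"r_11 = {kr20(example):.2f}")
```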
3.7.3 Item Analyses
As stated before, the writer used a vocabulary test in the vocabulary matching task format and the selected-response fill-in vocabulary format as the instrument to collect the data. Arikunto (2009:206) states that the function of item analysis is to find out whether the questions are good or not. Item analysis provides information about the weaknesses of the items and guidance for improving them. There are three aspects associated with item analysis, namely item difficulty, item discrimination, and patterns of the answers.
This measurement of items was conducted in order to evaluate the effectiveness of the items. In this study, however, the writer only calculated the item facility and item discrimination of the test.
3.7.3.1 Item Facility (also known as Item Difficulty)
According to Brown (2004:58), item facility is the extent to which an item is easy or difficult for the proposed group of test-takers. An item is considered easy when a great number of test-takers answer it correctly, and difficult when a great number of test-takers answer it incorrectly. In order to calculate the item facility, the following formula can be used.
$$ P = \frac{B}{JS} $$
where:
P = the index of difficulty
B = the number of students who answered the item correctly
JS = the number of the test-takers
(Arikunto, 2009:207)
According to Arikunto (2009:210), the index of difficulty can be classified as follows.
Interval            Criteria
0.00 < P ≤ 0.30     Difficult
0.31 < P ≤ 0.70     Medium
0.71 < P ≤ 1.00     Easy
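The calculation and classification of item facility can be sketched in Python as below. The function names and the example counts are hypothetical. Treating every value up to 0.30 as "Difficult" and every value up to 0.70 as "Medium" is an assumption about how the boundaries between the listed intervals are handled.

```python
def item_facility(correct_count, test_takers):
    """Return P = B / JS, the proportion of test-takers answering correctly."""
    if test_takers <= 0:
        raise ValueError("test_takers must be positive")
    if not 0 <= correct_count <= test_takers:
        raise ValueError("correct_count must be between 0 and test_takers")
    return correct_count / test_takers


def classify_difficulty(p):
    """Map an index of difficulty P to the criteria in the table above.

    Assumption: values between the listed intervals (e.g. 0.305) are
    assigned to the lower category.
    """
    if p <= 0.30:
        return "Difficult"
    if p <= 0.70:
        return "Medium"
    return "Easy"


# Hypothetical example: 24 of 32 test-takers answer an item correctly.
p = item_facility(24, 32)
print(p, classify_difficulty(p))
```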
3.7.3.2 Item Discrimination (also known as Item Differentiation)
Brown (2004:59) states that item discrimination is the extent to which an item differentiates between high- and low-ability test-takers. The following formula can be used to calculate the item discrimination (Arikunto, 2009:213).
$$ D = \frac{B_A}{J_A} - \frac{B_B}{J_B} = P_A - P_B $$
where:
J : the number of students
J_A : the number of upper group students
J_B : the number of lower group students
B_A : the number of upper group students who answer the item correctly
B_B : the number of lower group students who answer the item correctly
P_A : the proportion of upper group students who answer the item correctly
P_B : the proportion of lower group students who answer the item correctly
According to Arikunto (2009:218), the discrimination index can be classified as follows.
Interval            Criteria
0.00 < D ≤ 0.20     Poor
0.21 < D ≤ 0.40     Satisfactory
0.41 < D ≤ 0.70     Good
0.71 < D ≤ 1.00     Excellent
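The discrimination index can be sketched in Python in the same way. The function names and the counts are hypothetical. How negative values of D are labelled is an assumption, since the table above only covers values from 0.00 to 1.00.

```python
def item_discrimination(upper_correct, upper_size, lower_correct, lower_size):
    """Return D = B_A / J_A - B_B / J_B = P_A - P_B."""
    if upper_size <= 0 or lower_size <= 0:
        raise ValueError("group sizes must be positive")
    p_upper = upper_correct / upper_size
    p_lower = lower_correct / lower_size
    return p_upper - p_lower


def classify_discrimination(d):
    """Map a discrimination index D to the criteria in the table above.

    Assumption: a negative D is reported as lying outside the table.
    """
    if d < 0:
        return "Outside table (negative)"
    if d <= 0.20:
        return "Poor"
    if d <= 0.40:
        return "Satisfactory"
    if d <= 0.70:
        return "Good"
    return "Excellent"


# Hypothetical example: 14 of 16 upper group students and 6 of 16 lower
# group students answer the item correctly.
d = item_discrimination(14, 16, 6, 16)
print(d, classify_discrimination(d))
```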
3.8 Method for Collecting the Data
The following steps were carried out in chronological order to conduct this study.
1. The researcher chose the eighth grade students of SMPN 39 Semarang as the population. There were 285 students divided into 9 classes: six classes consisting of 32 students and three classes consisting of 31 students.
2. The researcher conducted the try-out test. The try-out test was held in a class which belonged to neither the experimental nor the control group.
3. The researcher analysed the result of the try-out test, that is, the validity and reliability of the test. If an item was not valid or reliable, it was revised, replaced, or deleted.
4. The researcher selected two classes as the samples by using a nonprobability sampling technique. Those two classes were randomly assigned to two groups: one was the experimental group and the other was the control group.
5. The researcher gave a pre-test to both groups and then scored the result of the pre-test.
6. The researcher gave the treatment using subtitled English songs to the experimental group, while the control group was taught using the Grammar Translation Method. The treatment was given in three meetings.
7. The researcher gave a post-test to both groups and then scored the result of the post-test.
8. The researcher calculated the means of the test results in both the pre-test and the post-test of the two groups.
9. The researcher compared the difference between the results.
10. The researcher analysed whether the difference was significant by using the t-test formula, applied to both the pre-test and the post-test scores of the two groups.
11. The researcher drew a conclusion.
3.8.1 Activities of the Experimental and Control Group
3.8.1.1 Pre-Test
The pre-test was conducted before the treatment. It was administered once to both the experimental and the control group. The pre-test was given in order to find out the students’ mastery of vocabulary before they received the treatment.
In the pre-test, the writer asked the students to answer questions focusing on vocabulary in the vocabulary matching task format and the selected-response fill-in vocabulary format. The instrument used in the pre-test was the same as in the try-out test, with some items revised, replaced, or deleted.
3.8.1.2 Treatment
After the pre-test, the treatment was given to both the experimental and the control group. The treatment for both groups was conducted in three meetings.
In the experimental group, the writer gave the treatment using subtitled English songs. Before giving the treatment, the writer conditioned the class. Then, the writer explained what the students had to do by explaining the steps of the activity.
For the control group, the writer translated a set of vocabulary items into the students’ native language, and the students had to memorize them as the treatment.
3.8.1.3 Post-Test
The post-test was conducted after the treatment. The post-test for the experimental and the control group was administered in the same way as the pre-test. In the post-test, the writer gave a set of questions focusing on vocabulary in the vocabulary matching task format and the selected-response fill-in vocabulary format. The questions in the post-test were the same as the questions in the pre-test. This was done in order to see the influence of the treatment given to the experimental and the control group.
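The comparison of group means and the t-test mentioned in the procedure of Section 3.8 can be sketched as follows. This chapter does not state which form of the t-test was used, so the sketch assumes an independent-samples t-test with pooled variance. The function name `pooled_t_test` and the scores are hypothetical and are not the results of this study.

```python
import math


def pooled_t_test(group_a, group_b):
    """Return (t, degrees_of_freedom) for two independent groups.

    Assumption: equal population variances, so a pooled variance is used.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least 2 scores")

    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b

    # Sums of squared deviations from each group mean.
    ss_a = sum((v - mean_a) ** 2 for v in group_a)
    ss_b = sum((v - mean_b) ** 2 for v in group_b)

    df = n_a + n_b - 2
    pooled_variance = (ss_a + ss_b) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        raise ValueError("t is undefined when both groups have no variance")

    return (mean_a - mean_b) / standard_error, df


# Hypothetical post-test scores, not taken from the study.
experimental = [78, 85, 80, 90, 74, 88]
control = [70, 75, 72, 80, 68, 77]
t_value, df = pooled_t_test(experimental, control)
print(f"t = {t_value:.3f}, df = {df}")
```

The obtained t value would then be compared with the critical t value for the given degrees of freedom at the chosen level of significance.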
3.9 Result of Try Out Test