engagement because it is less proven. The approach in that journal article also differs from the one used here: this research uses exploratory factor analysis.

The fourth is a journal article titled “Construct Validation of TOEFL-iBT as a Conventional Test and IELTS as a Task-based Test among Iranian EFL Test-takers’ Performance on Speaking Modules” by Elham Zahedkazemi. The aim of that research is to examine the construct validity of the speaking sections of two well-known international English proficiency tests, TOEFL-iBT and IELTS. The researcher investigates whether IELTS and TOEFL-iBT tap the same construct in the speaking proficiency of their candidates, and how similar the results of the IELTS and TOEFL speaking tests are.⁵⁷ The study showed that TOEFL-iBT and IELTS give similar results in measuring the speaking ability of test-takers, which means that the two tests are strongly correlated. Thus, that research and this research differ considerably in focus. This research focuses on the construct validity of each item on a TOEFL-like test, while Zahedkazemi’s research investigates construct validity through the correlation between TOEFL-iBT and IELTS.

⁵⁷ Elham Zahedkazemi, “Construct Validation of TOEFL-iBT as a Conventional Test and IELTS as a Task-based Test among Iranian EFL Test-takers’ Performance on Speaking Modules”, Theory and Practice in Language Studies, Vol. 5, No. 7, July 2015, pp. 1513-1519.

The fifth is a study by Ebrahim Khodadady titled “Construct Validity of C-tests: A Factorial Approach”. The C-Tests designed by Klein-Braley (1997) were changed into two other types of test, called the Spelling Test and the Decontextualised C-Test, and administered along with the disclosed Test of English as a Foreign Language (TOEFL), a semantic schema-based cloze multiple-choice item test (S-Test) and a lexical knowledge test (LKT).
The study seeks the factor structure of the C-Tests, the Decontextualised C-Test and the Spelling Test, and the factor structure obtained when the TOEFL, the LKT and the S-Test are included. Applying principal component analysis (PCA) to the participants’ responses on the three tests, i.e., the C-Tests, Spelling Test and Decontextualised C-Test, revealed two components, called language proficiency and direction specificity in that study. While including the S-Test, TOEFL and LKT in the PCA yielded the same two components, their rotation brought about the highest loadings of the included tests as well as moderate loadings of the C-Tests on the first component, validating them as measures of language proficiency. However, the C-Tests loaded highest on the second component along with the Spelling Test and Decontextualised C-Test, which confirmed their spelling and direction specificity. The difference between the C-Test study and this research is the focus: this research examines the construct validity of a TOEFL-like test using a factor analysis procedure.

The sixth is a study by Neil J. Anderson, Lyle Bachman, Kyle Perkins and Andrew Cohen entitled “An exploratory study into the construct validity of a reading comprehension test: triangulation of data sources”. Recent thinking in language testing has recognized the importance of gathering information on test-taking processes as part of construct validation. While there is a growing body of research on test-taking strategies in language testing, as well as research into the relationship between item content and item performance, no earlier research had attempted to examine the relationships among all three: test-taking strategies, item content and item performance.
That study thus serves as a methodological exploration of the use of information from both think-aloud protocols and more commonly used types of information on test content and test performance in the investigation of construct validity. It examines the construct validity of a reading comprehension test by investigating the relationship among test-taking strategies, item content and item performance.⁵⁸ Several types of data analysis were performed, providing both qualitative and quantitative data. Each data analysis task reflects part of the triangulation of data sources used to examine the construct validity of the reading comprehension test. The first task involved reviewing each of the think-aloud protocols and coding them for the use of reading and test-taking strategies. The second task involved a content analysis of each test item on both forms of the test from two perspectives: that of the test designer and that of Pearson and Johnson’s (1978) taxonomy of relationships between texts and test items. Third, the data from the think-aloud protocols were submitted to chi-squared analyses. The results of the chi-squared analyses indicate that the subjects used these strategies differently depending on the type of question being asked. The results also show that the subjects tended to use some strategies fairly consistently across the different types of items. In investigating the construct validity of the TOEFL-like test, by contrast, the present researcher will use only one data source, the test items.

⁵⁸ Neil J. Anderson, Lyle Bachman, Kyle Perkins and Andrew Cohen, “An exploratory study into the construct validity of a reading comprehension test: triangulation of data sources”, Language Testing, Vol. 8, No. 2, 1991, 41.
The seventh is a journal article entitled “Testing the Convergent and Discriminant Validity of the Decisional Balance Scale of the Transtheoretical Model Using the Multi-Trait Multi-Method Approach” by Boliang Guo, Paul Aveyard and Antony Fielding of the University of Birmingham and Stephen Sutton of the University of Cambridge. The authors extended research on the construct validity of the Decisional Balance Scale for smoking in adolescence by testing its convergent and discriminant validity. A hierarchical confirmatory factor analysis multi-trait multi-method approach (HCFA MTMM) was used with data from 2,334 UK adolescents, both smokers and non-smokers, who completed computerized and paper versions of the questionnaire on 3 occasions over 2 years. The results indicated that a 3-factor solution (Social Pros, Coping Pros and Cons) fit the data best. The HCFA MTMM model fit the data well, with correlated methods and correlated trait factors. Subsequent testing confirmed discriminant validity between the factors and convergent validity of both methods of administering the questionnaire. There was, however, clear evidence of a method effect, which may have arisen from the different response formats or may be a function of the method of presentation. Taken with other data, there is strong evidence for the construct validity of Decisional Balance for smoking in adolescence, but evidence of predictive validity is still required. The difference between that research and this research is that this research uses a factor analysis procedure to measure the construct validity of the TOEFL-like test, while that research uses MTMM to measure discriminant validity, which is one part of construct validity.

The eighth is a journal article entitled “Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks” by Roy Freedle and Irene Kostin.
That study addresses a specific construct validity issue regarding multiple-choice language-comprehension tests by focusing on TOEFL’s minitalk passages: is there evidence that examinees attend to the text passages in answering the test items? To address this problem, the authors analysed a large sample (n = 337) of minitalk items. The content and structure of the items and their associated text passages were represented by a set of predictor variables that included a wide variety of text and item characteristics identified from the experimental language-comprehension literature. Stepwise and hierarchical regression techniques showed that at least 33% of the item difficulty variance could be accounted for primarily by variables that reflected the content and structure of the whole passage and/or selected portions of the passage; item characteristics, however, accounted for very little of the variance. The pattern of these results was interpreted, with qualifications, as favoring the construct validity of TOEFL’s minitalks. The methodology also allowed a detailed comparison between TOEFL reading and listening minitalk items, and several criticisms concerning multiple-choice language-comprehension tests were addressed. The difference between that research and the present research is the method of examining construct validity. As noted above, Freedle and Kostin’s research uses regression to measure the construct validity of TOEFL’s minitalks, whereas the present research uses exploratory factor analysis to measure the construct validity of a TOEFL-like test.


This chapter presents the method used to collect and analyze the data of the study. The research method includes the research design, setting of the study, population and sample, research variable, data collection technique, research instrument and data analysis.

A. Research Design

To analyze the construct validity of the TOEFL-like test of the Intensive English Program, the researcher will conduct quantitative research. Sugiyono states that quantitative research is a scientific, empirical, objective, rational and systematic method. Its data are therefore derived in the form of numbers and statistical tables; it is called quantitative because the research data take the form of numbers.¹ To investigate the construct validity of the TOEFL-like test, the researcher will use an exploratory factor analysis approach, which examines the correlations among the questions in each section of the test.

¹ Sugiyono, Metode Penelitian Kuantitatif Kualitatif dan R&D. Bandung: CV Alfabeta, 2011, 7.
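The correlation-based idea behind exploratory factor analysis can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the responses are simulated rather than taken from the answer sheets, the six items and two latent abilities are hypothetical, and principal-component extraction from the inter-item correlation matrix stands in for whichever extraction method the actual analysis uses.

```python
import numpy as np


def extract_loadings(responses, n_factors):
    """Return the inter-item correlation matrix and unrotated loadings.

    Loadings come from a principal-component extraction: each retained
    eigenvector of the correlation matrix is scaled by the square root
    of its eigenvalue.
    """
    corr = np.corrcoef(responses, rowvar=False)   # items are columns
    eigvals, eigvecs = np.linalg.eigh(corr)       # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_factors] # largest first
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    return corr, loadings


# Hypothetical data: 183 test-takers, 6 right/wrong items,
# where items 1-3 depend on one ability and items 4-6 on another.
rng = np.random.default_rng(0)
n_students, n_items = 183, 6
abilities = rng.normal(size=(n_students, 2))
true_loadings = np.array([
    [0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
    [0.0, 0.8], [0.0, 0.7], [0.0, 0.6],
])
latent_scores = abilities @ true_loadings.T + 0.6 * rng.normal(size=(n_students, n_items))
responses = (latent_scores > 0).astype(float)     # 1 = correct, 0 = incorrect

corr, loadings = extract_loadings(responses, n_factors=2)
communalities = (loadings ** 2).sum(axis=1)       # variance explained per item

print(np.round(loadings, 2))
print(np.round(communalities, 2))
```

In this sketch, items that share a latent ability correlate with one another and therefore tend to load on the same component, which is the pattern a construct validity analysis looks for.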

B. Population and Sample

1. Population
The population of this study is all students of the intensive English program of UIN Sunan Ampel Surabaya. However, the researcher was only able to collect 336 students’ answer sheets from the intensive English program lecturers. The TOEFL-like test used in this research is from the intensive English program of academic year 2012–2013. This test was used because the researcher was not able to obtain students’ answer sheets from any other academic year.

2. Sample
To determine the sample size, the researcher uses the Slovin formula, which gives a sample of 183 students from this population:

$$n = \frac{N}{1 + N e^{2}} = \frac{336}{1 + 336 \times 0.05^{2}} = \frac{336}{1 + 0.84} = \frac{336}{1.84} \approx 183$$

where n is the sample size, N = 336 is the number of collected answer sheets, and e = 0.05 is the margin of error.
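The sample-size arithmetic above can be checked with a short Python sketch. The function name is hypothetical, and rounding up to the next whole student is an assumption; the text itself only reports the final figure of 183.

```python
import math


def slovin_sample_size(population, margin_of_error):
    """Slovin's formula n = N / (1 + N * e^2), rounded up (assumed convention)."""
    if population <= 0 or not 0 < margin_of_error < 1:
        raise ValueError("population must be positive and 0 < margin_of_error < 1")
    return math.ceil(population / (1 + population * margin_of_error ** 2))


# N = 336 collected answer sheets, e = 0.05 as in the calculation above
sample_size = slovin_sample_size(336, 0.05)
print(sample_size)  # 336 / 1.84 = 182.6..., rounded up to 183
```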

C. Research Instrument

According to Suharsimi Arikunto, a research instrument is a tool that is chosen and used to help the researcher gather data systematically and easily.² This research uses the documentation of the TOEFL-like test answer sheets as the research instrument. The degree of validity of this instrument is unknown, because it has never been investigated before. Therefore, the result of this research will show the degree of validity of both the subject and the instrument of the research.

D. Research Variable

A variable is a construct or a characteristic that can take on different values or scores.³ There are two types of research variables: independent and dependent. Independent variables are antecedent to dependent variables; X represents the independent variable and Y represents the dependent variable. An independent variable may be, for example, a study skill such as textbook reading, note-taking, memory, test preparation, concentration or time management. The dependent variable is the variable that is influenced by the independent variable. In this research, the independent variable (X) is the TOEFL-like test question sheet together with the students’ answers, and the dependent variable (Y) is construct validity.

² Suharsimi Arikunto, Manajemen Penelitian. Jakarta: Rineka Cipta, 2000, 134.
³ Donald Ary, Lucy Cheser Jacobs and Asghar Razavieh, Introduction to Research in Education. USA: Cengage Learning, 2010, 37.

E. Data Collection Technique

The data may be obtained by administering questionnaires, testing, personal observations, interviews and many other techniques of collecting quantitative and qualitative evidence.⁴ In this research, the researcher uses a documentation review of the TOEFL-like test question and answer sheets as the data collection technique. There are ten TOEFL-like test question booklets, but only one booklet could be examined, owing to the permission granted by P2B of UIN Sunan Ampel Surabaya. The question sheet consists of 140 test items divided into three sections: listening, structure and reading.

F. Data Analysis Technique

The data of this research will be examined using exploratory factor analysis. This technique is used because each question of the TOEFL-like test will be treated one by one. In the exploratory factor analysis model, the variance in each observed variable is explained in terms of the factor loadings on a number of different factors, as in the following equation:

$$z_i = b_1 F_1 + b_2 F_2 + \dots + b_n F_n + U_i$$

where $z_i$ is the normal deviate for a variable; $b$ is the factor loading on a given factor; $F$ is a factor; and

⁴ Yogesh Kumar Singh, Fundamental of Research Methodology and Statistics. New Delhi: New Age International Publisher, 2006, 212.