
4. to measure aptitude for learning.
5. to measure the extent of students’ achievement of the instructional goals.
6. to evaluate the effectiveness of the instruction.

Arifin (1985) has his own opinion. The evaluation activities have the following objectives:

1. to know how far the students have mastered the materials that have been given.
2. to know the students’ capability in learning the lesson materials.
3. to know whether the students’ level of development is appropriate to the level expected by the work program.
4. to know the degree of efficiency and effectiveness of the teaching strategy that has been used, in terms of both the methods and the techniques of teaching and learning.

Howard (1963) clarifies the need for tests. In his opinion, tests are used as tools in reaching decisions. These may be institutional decisions; that is, decisions made on behalf of an institution (school, college, corporation, etc.), and such decisions are made frequently. A test can be extremely effective in such situations because it helps the institution reach a higher percentage of good decisions, which will affect the individual himself or, perhaps, a son or daughter. All the opinions stated above explain the function of the test.

Norman (1976) lists some more important supplementary uses of the test as follows:

1. Use in reporting pupil progress to parents
The systematic use of evaluation procedures in the classroom provides the teacher with an objective and comprehensive picture of each pupil’s learning progress. It can be presented in written or oral form.

2. Use in guidance and counseling
The results are especially useful for guidance and counseling. This includes the process of assisting a pupil with educational and vocational decisions, guiding him in the selection of curricular and extracurricular activities, and helping him to solve personal and social adjustment problems. All require an objective knowledge of the pupil’s abilities, interests, attitudes, and other personal characteristics.

3. Use in school administration
In this role, the administrator is able to judge the extent to which the objectives of the school are being achieved, to identify strengths and weaknesses in the curriculum, and to appraise special programs in the school.

4. Use in school research
Carefully controlled studies of such things as the comparative effectiveness of different curricula, different teaching methods, and different organizational plans require objective measures of pupil performance.

There are many opinions about the role of evaluation in education. We can divide it into two functions: evaluation provides feedback to improve the educational system and process, and evaluation helps determine the result of a program over a period of time, for example, students’ achievement in mastering the learning materials. Related to the functions of tests explained above, the teacher-made tests are used to determine students’ achievement and the effectiveness of instruction in learning English at SMAN 1 Pamulang after a period of time.

2.1.5. The Characteristics of a Good Test

Lado and Harris (1969) state that every good test has three qualities: validity, reliability, and practicality. Hatch and Farhady (1969) also say that a good test should meet three basic requirements, two of which are absolutely crucial. Those two characteristics are validity and reliability; practicality, the third criterion, is less important.

2.1.5.1. Reliability

Reliability is the degree of consistency among test scores. Hughes (1989: 36) suggests how to make tests more reliable. There are two components of test reliability: the performance of candidates from occasion to occasion and the reliability of scoring.

2.1.5.2. Validity

A test is said to be valid if it measures accurately what it is intended to measure (Hughes, 1989: 22). The general concept of validity was traditionally defined as the degree to which a test measures what it claims, or purports, to be measuring (Brown, 1996, p. 231). There are three types of validity as follows.

1. Construct validity

Heaton (1975: 154) states that “if a test has construct validity, it is capable of measuring certain specific characteristics in accordance with a theory of language behaviour and learning”. This assumes the existence of certain learning theories or constructs underlying the acquisition of abilities and skills. Hughes (1989: 26) adds that “a test, part of a test, or a testing technique is said to have construct validity if it can be demonstrated that it measures just the ability which it is supposed to measure”. The word ‘construct’ refers to any underlying ability or trait which is hypothesized in a theory of language ability. One might hypothesize, for example, that the ability to read involves a number of sub-abilities, such as the ability to guess the meaning of unknown words from the context in which they are met. It would be a matter of empirical research to establish whether or not such a distinct ability existed and could be measured.
If we attempted to measure that ability in a particular test, then that part of the test would have construct validity only if we were able to demonstrate that we were indeed measuring just that ability.

Brown (1996) adds that a construct, or psychological construct as it is also called, is an attribute, proficiency, ability, or skill that happens in the human brain and is defined by established theories. Construct validity has traditionally been defined as the experimental demonstration that a test is measuring the construct it claims to be measuring. Such an experiment could take the form of a differential-groups study, wherein the performances on the test are compared for two groups: one that has the construct and one that does not. If the group with the construct performs better than the group without it, that result is said to provide evidence of the construct validity of the test. This idea is also supported by McNamara (2004), who states that a construct is the underlying ability or trait being measured by the test. Campbell and Stanley (1966) state that construct validity can be considered as labels that assign meaning to what the test is measuring.
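The differential-groups comparison described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `differential_groups` is hypothetical, the scores are invented, and the use of Welch's t statistic to express the size of the difference is one possible choice rather than the procedure prescribed by Brown (1996).

```python
from math import sqrt
from statistics import mean, variance


def differential_groups(with_construct, without_construct):
    """Compare test scores of two groups (hypothetical helper).

    with_construct: scores of examinees assumed to have the construct.
    without_construct: scores of examinees assumed not to have it.
    """
    m1, m2 = mean(with_construct), mean(without_construct)
    v1, v2 = variance(with_construct), variance(without_construct)
    n1, n2 = len(with_construct), len(without_construct)
    # Welch's t statistic: the mean difference divided by its standard error.
    t = (m1 - m2) / sqrt(v1 / n1 + v2 / n2)
    return {
        "mean_with": m1,
        "mean_without": m2,
        "difference": m1 - m2,
        "welch_t": t,
    }


# Invented scores, for illustration only.
advanced = [82, 90, 77, 85, 88, 91]
beginners = [55, 62, 48, 70, 59, 64]

result = differential_groups(advanced, beginners)
print(result)
```

A positive difference, with a t statistic that is large relative to the group sizes, would be read as evidence for construct validity in the sense described above. It is not proof, and a full study would also need a significance test and a defensible way of deciding which examinees have the construct.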