
indicators. However, in order to be valid, a test should reflect the skills or behavior to be assessed. There are five types of validity used to determine whether or not a test is valid, namely face validity, content-related evidence (content validity), criterion-related evidence, consequential validity, and construct validity.

a. Face Validity

Gronlund (1998) states that a test has face validity if students view the test as fair, relevant, and useful for improving learning (as cited in Brown, 2004, p. 26). Face validity itself refers to the degree to which a test looks right and obviously appears to measure the skills it is intended to measure. Furthermore, according to Brown (2004), the criteria of a test with face validity are that the test is well-constructed, the test has an appropriate time allotment, the items are clear and uncomplicated, the directions are clear, the tasks meet content validity, and the difficulty level presents a reasonable challenge (p. 27).

b. Content Validity

According to the APA (1954), content validity refers to the degree to which the content of assessment items reflects the content domain of interest (as cited in Miller, 2003, p. 2). Shepard (1993) adds that content validity is an indicator used to make inferences about the results; it is “evidence-in-waiting” (as cited in Miller, 2003, p. 5). This means that when a test has content validity, its items represent the skills or behavior to be measured, which is essential in evaluating achievement tests. Therefore, the test scores can effectively be used as meaningful indicators of students’ competence. For instance, a reading test would be considered valid if it measures reading skill and nothing else; it is not a valid test of speaking or vocabulary because it does not test speaking or vocabulary. However, Seif (2004) claims this does not mean that all educational objectives of a particular course must be included in the test. For the sake of test practicality, test designers should compose a set of questions that can represent the attainment of the stated educational goals. Seif (2004) claims that content validity is one of the essential considerations in composing a test (as cited in Jandhagi and Shateria, 2008, p. 2). When a test does not meet content validity, there are two possible outcomes. First, students are not able to perform the required skills that are not included in the test. Second, there may be some inappropriate questions which students are not able to answer. Therefore, the test tasks should be appropriate to the test specifications in the blueprints. This is similar to what Seif (2004) says, that evaluating the content validity of a test can be carried out by matching the sample of test questions to the test instructions (as cited in Jandhagi and Shateria, 2008, p. 2). Crocker and Algina (1986) add that this ‘matching method’ effectively ensures validity (as cited in Miller, 2003, p. 12).
According to Bachman and Palmer (1996), a blueprint is a complete plan providing the characteristics needed to develop an entire test (p. 90). It contains task specifications for all types of tasks to be included in a particular test. Blueprints are evaluation tools for checking whether or not the test items are appropriate to the test specifications stated in them. Brown (2004) states that test specifications include the general outlines of the test and the test tasks (p. 50). The test specifications refer to a certain curriculum, and they consist of only the general outlines of the materials and skills to be tested, since test designers should consider test practicality.

c. Criterion-Related Evidence