Discriminating Power, Level of Difficulty, Item Validity

The try-out results were also analyzed in terms of discriminating power, level of difficulty, item validity, and reliability. These aspects determined how many items were used in the test, so it was important to calculate them.

4.1.1.1 Discriminating Power

Discriminating power is calculated from the number of students in the upper and lower groups who answered an item correctly and the total number of students in those two groups. For item number 1, the computed discriminating power was 7.772, so the item was classified as excellent. Discriminating power has four categories: poor, satisfactory, good, and excellent. Of the 10 items in the try-out test, 2 were poor, 2 were satisfactory, 2 were good, and 4 were excellent.
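The computation described above can be sketched in Python. This is a minimal illustration only: this section does not state the exact formula, so the function below assumes one common form of the index, the proportion of correct answers in the upper group minus the proportion in the lower group. That form ranges from -1 to 1 and may therefore be on a different scale from the value reported for item number 1. The function name and the example counts are hypothetical.

```python
def discrimination_index(upper_correct: int, lower_correct: int,
                         upper_size: int, lower_size: int) -> float:
    """Assumed form: proportion correct in the upper group minus
    proportion correct in the lower group (not confirmed by the source)."""
    if upper_size <= 0 or lower_size <= 0:
        raise ValueError("group sizes must be positive")
    return upper_correct / upper_size - lower_correct / lower_size


# Hypothetical counts: 8 of 9 upper-group students and 3 of 9
# lower-group students answered the item correctly.
d = discrimination_index(upper_correct=8, lower_correct=3,
                         upper_size=9, lower_size=9)
print(f"discrimination index = {d:.3f}")
```

A larger positive value means the item separates stronger from weaker students more clearly; a value near zero or below suggests the item does not discriminate.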

4.1.1.2 Level of Difficulty

Level of difficulty is calculated from the number of students who answered an item correctly and the total number of students. A good test is neither too easy nor too difficult. As mentioned in the previous chapter, Arikunto (2006: 210) categorizes item difficulty into three levels: difficult, medium, and easy. For item number 1, for example, the index of difficulty was 0.61, so according to these criteria its difficulty level was medium. Of the 10 items in the try-out test, 3 were easy, 5 were medium, and 2 were difficult.
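As a rough sketch, the difficulty index and its category could be computed as follows. The cut-off values (0.30 and 0.70) are an assumption based on a commonly used classification and are not stated in this section; the function names and the count of 11 correct answers are likewise hypothetical, chosen only because 11 out of 18 rounds to the reported 0.61.

```python
def difficulty_index(num_correct: int, num_students: int) -> float:
    """Proportion of students who answered the item correctly."""
    if num_students <= 0:
        raise ValueError("num_students must be positive")
    return num_correct / num_students


def difficulty_category(index: float,
                        difficult_max: float = 0.30,
                        medium_max: float = 0.70) -> str:
    """Map an index to a category; both thresholds are assumed values."""
    if index <= difficult_max:
        return "difficult"
    if index <= medium_max:
        return "medium"
    return "easy"


# Hypothetical count: 11 of 18 students answered correctly.
p = difficulty_index(num_correct=11, num_students=18)
print(f"{p:.2f} -> {difficulty_category(p)}")
```

Under these assumed thresholds, an index of 0.61 falls in the medium category, which agrees with the classification reported for item number 1.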

4.1.1.3 Item Validity

Validity is the criterion that shows whether an instrument measures what it is intended to measure. Item validity is computed to obtain the validity index of each test item. The computed validity index of item number 1 was 0.764. The writer then consulted the table of r product moment with N = 18 at the 5% significance level, in which r was 0.468. Since the computed value was higher than the table value, item number 1 was considered valid. Of the 10 items, 8 were valid and 2 (numbers 6 and 9) were invalid. Therefore, only the 8 valid items were used.
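The validity check can be illustrated with a short Python sketch. It assumes that the validity index is the Pearson product-moment correlation between students' scores on one item and their total scores, which is the usual reading of "r product moment" but is not spelled out here. The function names are hypothetical; the critical value 0.468 is the table value quoted above for N = 18 at the 5% level.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation from raw-score sums."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) *
                            (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("correlation is undefined for constant scores")
    return numerator / denominator


def is_valid_item(r_item: float, r_table: float = 0.468) -> bool:
    """An item is treated as valid when its index exceeds the table value."""
    return r_item > r_table


# The index reported for item number 1 compared with the table value.
print(is_valid_item(0.764))
```

In practice, `pearson_r` would be called once per item, with the item scores as `x` and the total test scores as `y`, and each result would be passed to `is_valid_item`.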

4.1.1.4 Reliability