0.451 for the first semester of 2009 tests and 0.343 to 0.382 for the second semester of 2009 tests. Meanwhile, the reliability of the test items could be considered good,
with values ranging between 0.520 and 0.771. The effectiveness of the alternatives ranged from 62 to 94, which meant that the alternatives were functional.
Much research has been conducted using the ITEMAN program. The related studies above used ITEMAN as a tool to analyze multiple choice tests, with elementary school,
junior high school, and university students as the population and sample of the research. In ITEMAN, validity is inferred from the level of difficulty, the
discriminating power, and the proportion of the distracters. This research examines validity from the standpoint of content validity, construct validity, and face
validity. For that reason, the researcher analyzed those aspects and investigated a population whose students differed in their level of knowledge, with multiple
choice items as the focus of this research.
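As an illustration of the three item characteristics named above (level of difficulty, discriminating power, and proportion of the distracters), the following minimal Python sketch computes them for a single multiple choice item. It is not ITEMAN's own implementation: the function name item_statistics, the data layout (one answer string per student), and the choice of the point-biserial correlation as the discrimination index are assumptions made only for illustration.

```python
from statistics import mean, pstdev


def item_statistics(responses, key, item):
    """Classical statistics for one multiple choice item (illustrative sketch).

    responses: list of answer strings, one per student, e.g. "ABCD"
    key:       string holding the correct option for every item
    item:      zero-based index of the item to analyze
    """
    n = len(responses)
    # Total score of each student: number of items answered correctly.
    totals = [sum(r[i] == key[i] for i in range(len(key))) for r in responses]
    # 1 if the student answered this item correctly, else 0.
    scored = [1 if r[item] == key[item] else 0 for r in responses]

    # Level of difficulty: proportion of students answering correctly.
    p = sum(scored) / n

    # Discriminating power, here taken as the point-biserial correlation
    # between the item score and the total score (an assumed choice).
    sd = pstdev(totals)
    if 0 < p < 1 and sd > 0:
        mean_correct = mean(t for t, s in zip(totals, scored) if s)
        r_pbis = (mean_correct - mean(totals)) / sd * (p / (1 - p)) ** 0.5
    else:
        r_pbis = 0.0  # undefined when everyone or no one is correct

    # Proportion of students choosing each option (key and distracters).
    options = sorted({r[item] for r in responses} | {key[item]})
    proportions = {o: sum(r[item] == o for r in responses) / n for o in options}
    return p, r_pbis, proportions


if __name__ == "__main__":
    # Hypothetical data: four students, two items, answer key "AB".
    answers = ["AB", "AB", "AC", "BC"]
    print(item_statistics(answers, "AB", 0))
```

In this sketch, a distracter that no student chooses would appear with a proportion of zero only if it occurs in the data or is the key; a fuller version would take the list of options as an explicit argument.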
2.2. Review of Related Literature
To explain the analysis of the final semester test using the ITEMAN software program in more detail, the researcher reviews related literature on the
quality of a test, the final semester test, multiple choice tests, guidelines for constructing multiple choice items, the ITEMAN software program, and the assessment of
multiple choice tests using the ITEMAN software program.
2.2.1. Quality of Test
One commonly used tool in assessment is a test, which serves to assess the outcome of the learning process. To determine the quality of the test, it is necessary to analyze
the test before it is given to the participants. According to Arikunto (2006:205), item analysis is a systematic procedure that provides very specific information
about the test items that have been constructed. Nunnally (1978:301) states that item analysis is extremely useful. It furnishes a variety of statistical data regarding how
subjects responded to each item and how each item relates to overall performance. From the two definitions above, it can be concluded that test analysis is a
systematic activity that involves collecting and processing test data in order to obtain information from which a conclusion about the quality of the test can be
drawn.
There are two approaches that can be used to determine the quality of a test, namely the qualitative and quantitative approaches (Osterlind, 1998:84). The qualitative
approach is carried out by reviewing the items and should be done before the test is tried out. It emphasizes the assessment of material, construction, and language.
The quantitative approach, by contrast, is a method of reviewing test items based on empirical data obtained from participant responses. Item characteristics are
quantitative parameters. In determining the characteristics of an item, three things are generally considered: (1) level of difficulty, (2) discriminating power, and
(3) effectiveness of distracters. These three characteristics jointly determine the quality of the item. Linn and Gronlund (1995:47) state that a good test must have
three characteristics, namely validity,