Design, Testing Instruments, and Dependent Variables
The testing instruments
tests where a contribution was expected to be given by individuals without prompting from peers. One way that such testing can be accomplished is given in detail by Stringer and Faraclas (1987:87–90), where three venues are used:
• with participants being instructed in one,
• with individuals moving from there to the second venue for testing, and then
• proceeding to the third venue to take part in supervised activities until all have been tested.
When this procedure was tried in the adult program in Urat, it was unacceptable; there were not enough venues or testing personnel, and treating people in such an individual way seemed to cause stress.
On the first occasion, the testing was carried out in the one available venue, the communal church building, with three testers. Participants were instructed to wait their turn and then leave
after being tested. In the course of the testing, one young woman became so stressed that she began crying, and others showed signs of anxiety, while those who had been tested were reluctant to
leave. The importance of eliminating anxiety from literacy learning is discussed by Downing (1982). As the program progressed, there was less tension shown when giving individual
responses. At this stage in the program, it was obvious that another, more comfortable way had to be found to collect assessment data for comparison.
In the test on the first occasion, a number of procedures were included to make the students feel at ease. The test was explained by the test supervisor so that the participants would know
what to expect. All students were given a piece of paper with a sentence printed on it on which they were asked to write their names. This was a familiar activity which all could accomplish.
Students were then asked to read to one of the testers: first, a new sentence and then one they had been given. All contributions were recorded on audio-tape. Next, the tester asked the student to
point out specific items, that is, two words and three syllables from the sentences read. Finally, the tester chose a word from a list of known key words and asked the student to write it from
memory and then create and write a sentence including the word.
Each student was given a different sentence in the expectation that they would try to read it on their own, adding to their confidence, before being asked to read it to the tester. This worked
well for the first few, but those who could not read asked their peers to help them read their sentences correctly. The sentences were changed for these participants and the test went on, but
much closer supervision would be needed if we were to use the same procedure in future tests.
One way to avoid some of these problems was to consider collecting more descriptive and interpretive data for analysis. Such data would need to be collected by each teacher in the course
of the teaching program. Considering the short training and the lack of materials and of places to keep contributions from each student, however, such data would be difficult to collect and
analyse. At the end of the program, the teachers were certainly aware of the progress of each of their students, but to require informal, interpretive assessments to be recorded each week for
each student seemed too much to expect from volunteer teachers in the circumstances in which they worked. It would also detract from the systematic nature of the comparison.
The original testing format was continued with a number of changes to help relieve stress. In the test format there were four components: recognition of words and elements within words,
reading, comprehension, and writing (see Appendix E for the final test format). As the classes progressed, the learners were less anxious about being asked to give individual contributions. At
the beginning of each test, the participants were given more detail on what to expect to make them feel more at ease.
For each test there were two or three trained testers to interact with individual students. The first two items tested were identification of words and elements within words to give an
opportunity for each person to succeed before the reading section. Although it was less accurate, it was more convenient for the testers to have the students point to the item to be identified rather
than have multiple copies where each item could be marked in some way. To help alleviate tension, any negative feedback for an incorrect response, as well as any indication of the correct
response, was discouraged.
To control for comparability and continuity between groups, it would have been preferable for the researcher to do all of the testing, but the physical and cultural situation made that
impossible. With the help of community volunteers the process became more manageable; the teachers were able to continue with their classes while individuals went to village houses nearby
for testing. Some difficulties occurred when the testers had little or no experience with literacy testing. Although each tester practised in trial situations before the tests, problems still
arose with the use of the cassette recorders, the format and language of testing, and prompting, which is acceptable in such a cultural setting.
The underlying cultural setting of group-oriented activities made it more acceptable for students to return to the classroom after being tested. To help control for a contamination effect
through telling and copying, two or three similar excerpts for reading were prepared. These excerpts were short, covered similar content, each in a different context, and contained a similar
number of words of equal difficulty. These words included the same elements but with a variety of word formations and syntax. Testers rotated the texts so that students did not know which
actual text to expect. In the final test, all students were expected to read from four longer excerpts of texts, three of which had two different but similar sections for rotation.
There was some misunderstanding in communication when testing in both of the programs. In the Tok Pisin situation, the second language was not always understood. The Urat program
proved more difficult because it was necessary to translate the test instructions into Urat and check them for accuracy. For the early tests in Urat, even after using different drafts in trials,
there were still misunderstandings. One such misunderstanding became clear when the responses to the comprehension questions continued to be other than expected. During testing, after
the participant re-told what he or she thought had happened in the passage read, two questions were asked: “What do you think might have happened before that?” and “What do you think
might have happened after that?” It was necessary to make the questions generic because of the different texts in the instrument and the different responses.
When the Urat translation of the test format was translated back into Tok Pisin, which was used by the researcher to check the sense, the meaning was acceptable. But it was not until an
English speaker translated the questions back into English that the misunderstanding became clear. The two questions translated as: “can you remember think, sort out what comes first?”
and, “can you remember what comes at the end?” Thus the responses were correct as repeats of the first and last instances in the passage, but they were not the expected projection of thought beyond
the passage read, which would have given clear evidence of understanding.
Since there were two different instructional methods involved, the texts used on the early testing occasions were prepared according to the constraints of the material taught in the primer
series. The degree of difficulty depended on the amount of material already taught. For the reading section of the first test in the Urat program, a number of sentences were prepared by the
linguist, who followed constraints dictated by the overlap of content taught in the two primer series. Examples of such sentences were: Ta sisipe metapa. ‘She will taste the breadfruit’; and
Ti wasme wi wat. ‘She quit the game and came’. Similar texts were prepared for the tests in the Tok Pisin program by the researcher.
The test on the second occasion was based on the primers; for Urat the words, syllables, and texts were read from the primer pages, but for Tok Pisin separate texts were prepared. The
purpose for using the primers in Urat was to ascertain if the teachers were teaching the content adequately. For Urat there were three comparable sets of readings based on common vocabulary
from the Word-Building Track primer and the Gudschinsky primer. These sets were to be rotated, one for each student. For Tok Pisin there were three comparable texts prepared. In
addition, students were asked to read one new sentence and to write a dictated sentence. Since the portions to be read for occasions 1 and 2 were based on known material and were not
extensive, answers to comprehension questions were not relevant.
The third and fourth tests were fuller versions of the first test (see Appendix E). For the third test, longer portions were prepared and rotated between participants. The tester read the first
sentences to introduce the topic and help the learners to be at ease and read more confidently. The learners were expected to read the final sentences.
The final test included four different excerpts of texts. These texts were selected on the basis of familiar and unfamiliar material. The first selection came from a familiar story of a predictable
legend which had been read in class. The second selection came from a story about an accident to a man in the village. The third selection was an excerpt from a legend, with unfamiliar
language and circumstances. Similarly, the fourth selection had some unfamiliar language with unpredictable content.
The data collected from the learners was scored in a similar way for all of the tests. The appropriateness of each variable for each particular test was governed by the scope of material in
the test and the degree of expectation of skills learned. In the following section, a description of the variables and scoring procedures is given.