
established as both an individual item, but also as a name of a class of items. The actual name is of the class, not of the individual item in the class. It is natural then that the reference in 141b is made to the class with a pronoun as easily as it could be made to the item.

From the preceding discussion we have defined the idea of reference to be used in this work. A reference is an expression which designates an object. References are always made to objects in some universe of discourse, which may or may not be completely accounted for by objects in the real world. Two references are coreferential only if they designate the same entity, or two identical groups of entities. Any appeal to claims of one reference referring to another is clearly spurious. In this work I use the terms anaphoric and cataphoric only for references that, respectively, follow or precede another coreferential reference.

3.3 Referential Choice and the Environment

There are four different strategies commonly used in accounting for different referential choices based on environment: recency, episodes, memorial activation, and prominence. In this section I explain these models and what their claims are. I also lay out my theory of Goal Oriented Activation, and set forth the type of data that supports this theory but does not support the other theories. The third strategy invokes the ideas of memory more directly. From this point of view the status of a referent in short-term memory is what determines the use of referential form. This point of view is reflected in the work of Chafe and of Tomlin and Pu. The fourth strategy is based around the use of referential form to promote the goals of the language producer. Prominent among its proponents are Hinds (1977), Grimes (1978), and Fleming (1978). In the following sections each of these four approaches to reference will be explored in some depth.

3.3.1 Recency models

Recency models are based on the idea of the limited capacity of short-term memory. Given this limitation, it is asserted that older referents are replaced by newer referents. The newer referents are then replaced by other referents, either repeated old referents or totally new referents. Simple recency models are linear models that say the closer a previous reference is to the current reference, the easier it is to resolve the reference. They do not allow any hierarchical organization of memory or any ranking of referents.

Clark and Sengul (1979) proposed a modified linear model to account for their data, which showed that a reference was quicker to resolve if the previous referent was in the immediately preceding clause. This model uses a dual search strategy for the resolution of reference. It proposes that the comprehender first searches the immediately preceding clause for the referent and, only if that fails, searches each of the earlier clauses.

A second linear model of reference has developed around Givón’s attempt to account for the topicality of the referent. Much of the work within this framework is based on the heuristics developed by Givón (1983c) for measuring REFERENTIAL DISTANCE and TOPIC PERSISTENCE. These two measures form the framework for the majority of the work in the Topic Continuity volume edited by Givón (1983b). In this system of topicality measurement, the topicality of a given referent at a given time is not binary but lies on a “scalar, graded continuum” (Givón 1983e:16). Givón developed three measures of topicality: Referential Distance (RD), Topic Persistence (TP), and Interference. Of these measures only referential distance relates to the theoretical stance of recency, because RD counts the clauses since the last mention. This relates directly to Clark and Sengul’s earlier model.
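The dual search strategy attributed above to Clark and Sengul (1979) can be illustrated with a minimal sketch. It is not their implementation: the function name `resolve_reference`, the representation of each clause as a set of referent labels, and the most-recent-first order of the fallback search are all assumptions made for illustration only.

```python
from typing import Callable, Optional, Sequence, Set


def resolve_reference(
    clauses: Sequence[Set[str]],
    matches: Callable[[Set[str]], bool],
) -> Optional[int]:
    """Return the index of the clause that supplies the antecedent, or None.

    `clauses` holds the clauses that precede the reference, oldest first.
    Each clause is a set of referent labels (a hypothetical encoding).
    """
    if not clauses:
        return None
    last = len(clauses) - 1
    # Stage 1: privileged check of the immediately preceding clause.
    if matches(clauses[last]):
        return last
    # Stage 2: only if stage 1 fails, search the earlier clauses one by one.
    # Searching most recent first is an assumption, not a claim of the model.
    for i in range(last - 1, -1, -1):
        if matches(clauses[i]):
            return i
    return None


# Hypothetical three-clause context, oldest clause first.
prior = [{"woman", "basket"}, {"man"}, {"dog"}]
in_last_clause = resolve_reference(prior, lambda c: "dog" in c)
in_earlier_clause = resolve_reference(prior, lambda c: "woman" in c)
```

In this sketch a referent found in stage 1 is returned without any further search, which is the property the model uses to explain faster resolution for antecedents in the immediately preceding clause.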
While the recency model is often associated with Givón, he recognized that episode structure and other factors influence referential choice, both in his theoretical approach (Givón 1983c) and in his individual language investigations (1983a, 1983e). The theoretical basis used for justifying referential distance as a valid measure of topicality is that the fewer clauses since the last mention of a referent, the easier the topic identification. Givón ties the referential distance measure to ideas about working memory (Just and Carpenter 1992). Because working memory is of limited capacity, referents that have been mentioned recently are more apt to be readily available, i.e., identifiable to the hearer, than referents that were mentioned earlier.

A strict recency model looks at text as a linear string with no hierarchical structures.³ It views memory as a First In First Out buffer. The stream of speech is organized along a line. As speech is initiated, the hearer holds everything in working memory until it is full. Then the first part of the stream is removed to allow room for new input. This view of referential choice can be likened to beads on a string (Clark and Sengul 1979) that are accessed sequentially until a match can be made.

Givón developed three numerical measures of topicality: referential distance, topic persistence, and interference. The first two had consistent operational definitions in Givón (1983b).⁴ Referential distance (RD) was the distance, in clauses, since the last mention of a referent. Topic persistence (TP) was a measure of how long, in clauses, a referent continuously appears. While the first two counts have been consistently applied, the calculation of an interference quotient has varied among the contributors to Givón (1983b); each interprets it in their own way.
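The two consistently defined counts can be made concrete with a small sketch. It follows only the definitions given above (clauses back to the last mention, and clauses of continuous later appearance); the function names, the encoding of a text as a list of sets of referent labels, and the choice to return `None` for a first mention are assumptions for illustration, not the counting conventions of any particular study.

```python
from typing import List, Optional, Set


def referential_distance(
    clauses: List[Set[str]], index: int, referent: str
) -> Optional[int]:
    """Clauses back from clause `index` to the previous mention of `referent`.

    Returns None when there is no earlier mention (an assumed convention).
    """
    for back in range(1, index + 1):
        if referent in clauses[index - back]:
            return back
    return None


def topic_persistence(clauses: List[Set[str]], index: int, referent: str) -> int:
    """Number of clauses immediately after `index` that keep mentioning `referent`."""
    count = 0
    for clause in clauses[index + 1:]:
        if referent not in clause:
            break
        count += 1
    return count


# Hypothetical five-clause text; each set lists the referents mentioned.
text = [{"man"}, {"man", "dog"}, {"dog"}, {"dog"}, {"man"}]

rd_man_at_end = referential_distance(text, 4, "man")  # last mention was clause 1
tp_dog_at_intro = topic_persistence(text, 1, "dog")   # dog continues in clauses 2 and 3
```

Here the reintroduction of "man" in the final clause has a referential distance of 3, while "dog", introduced in the second clause, persists for 2 further clauses.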
Interference is a calculation of how many other competing referents have occurred since the last mention of a referent. The level of competition is also scalar. It varies according to animacy, gender, and similarity of semantic/grammatical roles. The use of these counts, particularly RD and TP, has given reliable crosslinguistic data.

Givón clearly did not propose distance as the sole determining factor in the choice of referential form, but as one factor, albeit a highly important one. In addition to the length of absence from the register and potential interference from other topics, he included the availability of semantic information and the availability of thematic information as components in the process of choosing referential form (Givón 1983c). While Givón clearly proposed a variety of factors that influence the choice of referential form, the quantifiable results that come from the RD and TP measurements overshadow the theoretical claims for other factors.

Givón and the other contributors to the Topic Continuity volume (Givón 1983b) found a high degree of correlation between the level of RD and TP and the choice of pronouns, verbal marking, and zero anaphora. There is by no means universal conformity, in that not all languages have all the devices. In those languages with free stressed pronouns, however, the average RD and TP was lower than for just verbal marking and zero anaphora. For those languages examined that had both unstressed or clitic pronouns and zero anaphora, the zero anaphora showed a much higher level of topicality using RD and TP.

Givón recognized that highly “topical” referents could move from the “temporary file” to the “permanent file” for a given discourse. “They thus often constitute exceptions to the text measurements that reveal the rules which govern the discourse distribution of topics that are not filed as permanently and as uniquely” (Givón 1983c:10).
The weaknesses of any strict recency model are directly related to its simplicity. Ignoring the effects of highly topical referents and episode boundaries results in a certain amount of variation from the means reported in the measurements. Often investigators who use the heuristics report only means, without providing information about the amount of variation from the mean. One prominent exception to this is Jaggar’s (1983) work on Hausa. He cites an average RD of 1.3 for zero anaphoric direct objects, but also mentions that direct objects occurring as zero anaphors have been observed with RDs of up to 16.

A strict recency approach also confounds pure distance with other factors. First, the farther apart two items are, the more likely it is that an episode boundary will occur between them. Second, the farther apart two references are, the more likely it is that another referent will intervene. From Gernsbacher’s (1990) work we know that the introduction of at least some nominals suppresses other nominals. A further problem has been the confounding of the introduction of new referents with the reintroduction of distant referents. Doing this ignores a whole class of data that has a bearing on the problem of referential management. This is one of the important changes in my methodology from earlier work. Despite these problems, the quantifiable approach that Givón and others have taken has moved the field forward.

The recency model would predict that all referents are essentially treated the same. That is, participants are not ranked in importance. It would also predict that there should be observable differences in some measure like referential distance for the different referential forms. According to the recency model, topic continuity measures should not show any discernible difference for different referential forms. The third prediction is that discourse boundary phenomena would not correlate with different referential forms. The fourth prediction is that all new referents should be introduced by the same devices that are used for distant referents. The final prediction is that even if there is any ranking of referents, it should not have any influence on the choice of referential form. In the Olo data only the prediction regarding referential distance is sustained, but even there referential distance alone does not distinguish all the referential forms.

³ Givón does not himself adopt a strict recency model in his work; however, others are apt to characterize his work in this way, either as followers or opponents.

⁴ Givón 1994b adjusted both the RD and TP methodologies, as will be discussed below. The exact manner of making the counts changed, not the essential character of what was being counted.

3.3.2 Episode models