
3 Methodology

3.1 Site selection

Figure 4 shows the data collection sites for Lohorung (Pangma, Dhupu, and Angala), Yamphu (Hedangna), and Southern Yamphu (Devitar).

Figure 4. Map of data collection sites in Sankhuwasabha district. ©2011 SIL International

Figure 5 shows the data collection site for Southern Yamphu (Rajarani) in Dhankuta district.

Figure 5. Map of data collection sites in Dhankuta district.

3.2 Subject selection

The quota sampling plan used in this survey was based on the four variables of gender, age, education, and geographic location, as these factors are known to influence language use and attitudes. Also, the people in these demographic groups often have varying levels of exposure to other languages.

Within these demographic groups, we required subjects to meet four screening criteria for wordlists and the RTT:

1. The subject is “from the village,” defined as having grown up in the village, living in the village at present, and, if they have lived elsewhere, not having done so for a significant amount of recent time.³
2. Subject has at least one parent from the target variety.
3. Subject has at least one parent who is a mother-tongue speaker of the variety, is from the village under study, and that parent spoke the variety with him/her when he/she was a child.
4. Subject speaks the variety first and best.

Only criteria one and two were required for a subject to be eligible to respond to the informal interview schedule. Note that the fourth criterion was relaxed in Dhupu. Since Lohorung language vitality is so low there, it was difficult to find speakers who could fit this criterion.

3.3 Research methods

Background research was conducted in Kathmandu prior to fieldwork. Lohorung speakers from Pangma village and Yamphu speakers from Hedangna and Num villages were interviewed and assisted with preparing various tools. The participatory methods used in this survey were facilitated by Santa Man Lawoti and Dal Bahadur Limbu in Sankhuwasabha and Dhankuta districts in September 2009. During fieldwork, wordlists, recorded text tests (RTT), and informal interviews were administered in Gairi Pangma, Dhupu,⁴ Angala, and Hedangna, Sankhuwasabha district, and Rajarani, Dhankuta district.

3.3.1 Wordlist comparisons

Description and Purpose: A comparison of wordlists to estimate the degree of lexical similarity between the speech varieties the wordlists represent.

Procedure: Wordlists were elicited in Nepali from mother-tongue Lohorung and Yamphu speakers and were transcribed using the International Phonetic Alphabet (IPA). In order to ensure that the wordlists represent the speech variety in each location, a group of at least three speakers was involved in the wordlist elicitation. A lexical similarity analysis was carried out on each pair of wordlists. A complete description of wordlist comparison and the data collected can be found in Appendix A.

Advantages: Data collection is relatively efficient. Wordlists can provide some broad insights into possible dialect groupings.

Disadvantages: Above certain levels of lexical similarity, wordlists cannot give conclusive evidence of intelligibility between the speech varieties compared.

3.3.2 Recorded Text Test (RTT)

Description and Purpose: Subjects listen to recorded stories, with comprehension questions asked within the stories. After the subject has listened to the stories, questions regarding language attitudes are asked. This helps in the assessment of subjects’ understanding of and attitudes toward actual samples of the language from various areas.

Procedure: We recorded a narrative story from a Lohorung speaker and one from a Yamphu speaker. Each story was then played for people in other Lohorung and Yamphu communities, who were not told the story’s place of origin. As subjects listened to each story, they answered comprehension questions about the story, recorded in their own dialect. After listening to each story, subjects answered questions about their understanding of and opinions toward the speech variety used by the storyteller. The tests were administered first in the community the speaker was from, to ensure we had a representative recorded text from that variety. This is referred to as the hometown test (HTT). The stories used and responses for the RTT can be found in Appendix B.

Advantages: By using actual samples of selected speech varieties, an initial assessment of intelligibility and attitudes can be made.

Disadvantages: This test can be time consuming to develop. The type of RTT used in this survey only evaluates basic understanding of narrative texts. In addition, it does not measure reading and writing ability in the second dialect.

³ This criterion was extended in a few cases to include subjects who had grown up in nearby villages, where the speech variety is still the same as the village where the interview took place. It is difficult to define a specific time period (e.g. more than the last five years) for a significant amount of recent time. Thus, this criterion is intentionally subjective, as it depends on how long the subject lived elsewhere and how long they have been back in the village relative to their age.

⁴ The RTT was not administered in Dhupu due to low Lohorung language use in that village.

3.3.3 Informal interviews

Description and Purpose: A prepared interview schedule, based on the “Sociolinguistic Questionnaire A” used by the Linguistic Survey of Nepal, guided interaction in order to gather information regarding specific sociolinguistic issues, while allowing freedom to inquire or discuss issues further if doing so might provide additional information relevant to the research questions of the survey. An additional interview schedule, dubbed the “Knowledgeable Insider Questionnaire,” was used to investigate issues relevant to each village context, which are more factual in nature than individual patterns of language choice or attitudes.

Procedure: The interview schedule was written in English and Nepali, and interviews were conducted in Nepali. An example of this procedure would be asking “What language do you usually speak with your children?” as on the planned interview schedule. If the interviewee happened to respond with two or more languages, we followed up with questions such as “Do you speak one of these languages more often than the other?” This allowed the interviews to focus more on patterns of language use and their impact on language vitality and shift than on other topics, such as generalized trends of multilingualism. The interview schedule, biographical data of respondents, and responses can be found in Appendix C. The Knowledgeable Insider Questionnaire and responses are in Appendix D.

Advantages: Depending on the length of the interview schedule, the time in administration can be minimal, allowing for relatively large numbers of people to be interviewed. The informal nature of the interviews helps subjects feel comfortable and share openly, while allowing greater depth and context for their responses.

Disadvantages: Informal interviews are limited in that subjects may only report what they want the researcher to hear, or what they believe the researcher would like to hear.

3.3.4 Dialect Mapping participatory method

Description and Purpose: This method initiates discussion of existing dialects, their geographic location, and perceived levels of comprehension between varieties.

Procedure: Participants were invited to describe where their language is spoken and the different varieties spoken. They then identified how different other varieties of their language are from their own and how well they understand other varieties. They then identified which variety they use in conversation with people from each area, and identified which variety they believe has the greatest potential to be a written standard. The complete steps and data collected through Dialect Mapping can be found in Appendix E.

Advantages: Provides a visual representation of which communities participants interact with, how well participants feel they understand other varieties, how their language may or may not be altered in these circumstances, and their attitudes towards each variety.

Disadvantages: May seem complicated or redundant to participants. Although they are a useful indicator, emic perspectives do not always match linguistic reality.

3.3.5 Domains of Language use participatory method

Description and Purpose: This method aids the investigation of language vitality. Its purpose is to help participants from the language community describe the varying situations in which they use L1, the LWC, or other languages and then identify the domains and languages that are used more frequently.

Procedure: Groups of Lohorung people were asked to identify which languages they speak on a regular basis and then list a variety of domains in which each of those languages is used. The participants then categorized the domains by their frequency. A full description of the Domains of Language Use tool and data gathered for this survey can be found in Appendix F.

Advantages: This method does not assume domains or frequency of language use; rather, the community suggests and discusses domains and frequency of language use from their own perspective.

Disadvantages: Categorizing domains may be confusing or difficult. Some people may not be comfortable making comparisons.

3.3.6 Bilingualism participatory method

Description and Purpose: This method helps language community members describe the demographics and patterns of multilingualism within their community.

Procedure: Participants listed the languages spoken most frequently in their community. They then described categories of people who speak each language well, the relative size of each category of speakers, and which categories may be increasing most quickly. A complete description of the tool as well as results for this survey can be found in Appendix G.

Advantages: This tool does not assume languages spoken in the community, but allows the community to name and discuss relevant languages themselves.

Disadvantages: This method is not very accommodating to multilingual situations exceeding the complexity of bilingualism. It does not help document or illustrate community attitudes towards their bilingual context.

3.3.7 Appreciative Inquiry participatory method

Description and Purpose: This method helps community members discuss what they are proud of and what desires they have for their language, and begin planning how to achieve those dreams. It shows what the community regards as priorities for their own language-based development.

Procedure: Participants discuss things in their L1 or culture that have made them happy or proud. They then consider how to build upon the good things they identified, or list their own dreams for their language. Next, they discuss which dreams might be accomplished sooner and which ones will take longer. Then, they identify which dreams are most important to them. Finally, participants choose a dream they would like to create a plan for, including first steps, who will be involved, and when the plan will be put into action. A full description of the Appreciative Inquiry method and results can be found in Appendix H.

Advantages: This method is very adaptable. Its emphasis is on what the community can do now to work towards their dreams for language development. Appreciative Inquiry helps build a concrete context by which to understand actual priorities that a community has for its own development.

Disadvantages: If not carried out appropriately, this method may raise false hopes among community members of outside assistance in reaching their goals.

4 Language variation and attitudes

One of the primary questions this sociolinguistic research seeks to answer is: What are the relationships between Lohorung [lbr], Yamphu [ybi], and Southern Yamphu [lrr]? Based on lexical similarity percentages, recorded text testing, attitudinal questions, and observations, we have concluded that Yamphu spoken in Hedangna and Southern Yamphu spoken in Rajarani are languages separate from, though related to, Lohorung spoken in Pangma. This section is divided into subsections that address the relationships between varieties according to the results of our research. These results include levels of lexical similarity and the degree of comprehension between the varieties, as well as attitudes expressed on questions before and after the recorded text test.

4.1 Relationship between Lohorung Pangma and Yamphu Hedangna

4.1.1 Lexical similarity results

Lexical similarity is measured by comparing the phonetic similarity of vocabularies among speech varieties. In this study, we used the procedures outlined in Blair (1990:31–32), described further in Appendix A. This method involved collecting and comparing a standardised wordlist. The researchers transcribed the wordlists using the International Phonetic Alphabet (IPA), shown in Appendix A. These wordlists were checked with other mother-tongue speakers from the same area in order to ensure accuracy. Lexical similarity calculations were made using the WordSurv computer program and expressed as percentages. A total of five wordlists were compared in this study. Wordlists for three Lohorung varieties, one Yamphu variety, and one Southern Yamphu variety were collected and will be discussed in this section and in section 5. For comparison between language varieties, the sites were chosen using information regarding the locations of Lohorung, Yamphu, and Southern Yamphu population centers.

To measure lexical similarity between Lohorung and Yamphu, wordlists were elicited in Pangma (Lohorung) and Hedangna (Yamphu). The lexical similarity between these wordlists is 65 percent. Blair (1990:23) states that if lexical similarity is below 60 percent, no intelligibility testing is required. Sixty-five percent shows a low level of lexical similarity, but still warrants intelligibility testing. To investigate further, we administered Recorded Text Tests.
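
The percentage calculation and the 60 percent screening threshold described above can be sketched as follows. This is only a minimal illustration and not the WordSurv algorithm: it assumes the analyst has already judged each wordlist item as phonetically similar or not, and the function names and the sample judgments are hypothetical.

```python
from typing import Sequence

# Threshold cited from Blair (1990:23): below 60 percent lexical
# similarity, no intelligibility testing is required.
TESTING_THRESHOLD = 60.0


def lexical_similarity(judgments: Sequence[bool]) -> float:
    """Return the percentage of compared items judged phonetically similar.

    Each entry is True if the analyst judged the two words for that
    wordlist item to be similar, and False otherwise (assumed input).
    """
    if not judgments:
        raise ValueError("no wordlist items were compared")
    return 100.0 * sum(judgments) / len(judgments)


def needs_intelligibility_testing(percent: float) -> bool:
    """Apply the screening rule: test further at or above the threshold."""
    return percent >= TESTING_THRESHOLD


# Hypothetical judgments for 20 items: 13 judged similar, 7 not.
sample = [True] * 13 + [False] * 7
percent = lexical_similarity(sample)
print(percent, needs_intelligibility_testing(percent))
```

Under this rule, the 65 percent reported for Pangma and Hedangna falls above the threshold, which is why the Recorded Text Tests were administered.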

4.1.2 Intelligibility testing results

Recorded Text Testing (RTT) was used to evaluate comprehension between Lohorung and Yamphu. An RTT was developed in each location, using a Lohorung story from Pangma and a Yamphu story from Hedangna. Further description of the testing procedure can be found in Appendix B. Table 1 displays the results of the intelligibility tests. The gray sections display the results of the hometown test (HTT), whereas the others are RTT results.

Table 1. RTT results for Lohorung (Pangma) and Yamphu (Hedangna) speakers

                              Test location
Story                         Hedangna    Pangma
Lohorung (Pangma)     Avg     65          89*
                      SD      23.5        11.0*
                      n=      10          10*
Yamphu (Hedangna)     Avg     91*         44
                      SD      9.6*        17.8
                      n=      10*         10

* Gray section in the original table (HTT result).

In order to interpret RTT results properly, three pieces of information are necessary. The first is the average percentage (shown as Avg in Table 1), which is the mean of all subjects’ individual scores on a particular story at a particular test site. Another important piece of information is a measure of how much individual scores vary from the community average, which is known as standard deviation (SD in Table 1). The third important component of the data is the size of the sample of people tested on each story (n= in Table 1). Blair (1990:25) has written about the relationship between test scores and their standard deviation, as seen in Figure 6.

High average score, high standard deviation: Situation 1. Many people understand the story well, but some have difficulty.
High average score, low standard deviation: Situation 2. Most people understand the story.
Low average score, high standard deviation: Situation 3. Many people cannot understand the story, but a few are able to answer correctly.
Low average score, low standard deviation: Situation 4. Few people are able to understand the story.

Figure 6. Relationship between test averages and standard deviation.

In general, average RTT scores of around 80 percent or higher with accompanying low standard deviations (usually ten and below; high standard deviations are about 15 and above) are taken to indicate that the subjects from the test point display adequate comprehension of the variety represented by the recording.
However, RTT average scores lower than 60 percent are interpreted to indicate inadequate comprehension.

The results of each HTT were not ideal. Average HTT scores of 95 percent or higher, with a low standard deviation (less than 10 to 12 points), are preferred. Average HTT scores of 89 percent (Lohorung) and 91 percent (Yamphu) show that the tests could have been stronger. Despite that, the differences in scores and standard deviations between each HTT and RTT reveal valuable information.

The average score of Yamphu speakers in Hedangna on the Lohorung RTT was 65 percent. With a high standard deviation of 23.5, these results show that many people cannot understand the story, but a few are able to answer correctly. There is no apparent correlation between RTT scores and factors of gender, age, education, or initial contact.

The average score on the Yamphu RTT for Lohorung speakers who took the test in Pangma was 44 percent. This is a low average score. The standard deviation among the scores was 17.8. This shows that most people do not understand the story, though some scored higher than others. With such a high standard deviation, language contact may play a role. The data show that the three Lohorung participants with the lowest scores have never been to Hedangna. Men scored higher than most women, and all of the men said they had been to Hedangna. Their higher scores could be due to more extensive travel and contact with Yamphu.

Respondents from Hedangna had both a higher average score and a higher standard deviation than those from Pangma. It is possible that people in Hedangna understood the Lohorung story better than people in Pangma understood the Yamphu story because Yamphu people are more exposed to Lohorung than Lohorung people are to Yamphu. Hedangna is more remote than Pangma. People from Hedangna travel through Pangma to reach the district headquarters, but people from Pangma have fewer reasons to travel to Hedangna.
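
The three quantities used to read the RTT results (average, standard deviation, and sample size) and the rules of thumb quoted above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the procedure used to produce Table 1: the report does not say whether a sample or population standard deviation was used (the sample form is assumed here), the approximate cutoffs ("around 80", "about 15") are treated as hard limits, and the function names and the three-score example are hypothetical.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (average, standard deviation, n) for RTT percentage scores.

    Assumption: sample standard deviation (statistics.stdev).
    """
    return mean(scores), stdev(scores), len(scores)


def interpret(avg, sd):
    """Label the spread and comprehension level using the quoted cutoffs.

    Spread: SD of 10 or below is low, 15 or above is high.
    Comprehension: average of 80 or more with low spread is adequate;
    average below 60 is inadequate; anything else is left unclassified.
    """
    if sd <= 10:
        spread = "low"
    elif sd >= 15:
        spread = "high"
    else:
        spread = "intermediate"

    if avg >= 80 and spread == "low":
        comprehension = "adequate"
    elif avg < 60:
        comprehension = "inadequate"
    else:
        comprehension = "not classified by these cutoffs"
    return spread, comprehension


# Hypothetical individual scores, only to show the summary step.
print(summarize([40, 50, 60]))

# RTT cells reported in Table 1.
print(interpret(65, 23.5))  # Hedangna subjects, Lohorung story
print(interpret(44, 17.8))  # Pangma subjects, Yamphu story
```

Averages between 60 and 80 percent, such as the 65 percent scored in Hedangna, fall outside both quoted rules, which is why the discussion leans on the standard deviation and on contact patterns to interpret them.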

4.1.3 Pre/Post-RTT question results

After Yamphu speakers in Hedangna listened to the Lohorung story, we asked them a series of questions related to the language they heard in the story. When asked where they think the storyteller is from, most respondents recognized that the storyteller is Lohorung. Two believed he was from a different Yamphu area (Num or Devitar) and two others believed he was from another country (India or Germany).

Respondents were also asked if they liked the speech of the storyteller. Half of the respondents thought the Lohorung speech was “OK,” while the other five respondents had a variety of opinions about it. The responses were quite varied, with little correlation to contact with Lohorung or their RTT score. There is a high degree of contact with speakers of Lohorung in Pangma (at least eight of ten have been to Pangma), which may explain why there is a higher degree of comprehension of Lohorung among Yamphu respondents than there is of Yamphu among Lohorung speakers.

Participants were also asked how much they felt they understood the story and to identify how different the storyteller’s speech was from their own. Only one respondent felt they understood all of the story, and they said the speech was very different from their own. All but one said the language is different from theirs. While Lohorung and Yamphu are separate languages, their identity as “brothers” and shared vocabulary create an affinity that allowed several respondents who said they only understood half of the story to also say the storyteller’s speech was only a little different from their own.

Post-RTT questions in Pangma show different patterns. After listening to the Yamphu Hedangna story, every respondent identified the storyteller as being from Hedangna. While some respondents clearly recognized the speech as Yamphu, others called it their own variety (Lohorung). Despite the fact that many people scored quite low on the RTT, they still said the speech was “good,” “OK,” or even that it was their own language.
There is a slight correlation between RTT scores and whether respondents have travelled to Hedangna. We know only that five respondents have been to Hedangna; the average score among them is 52 percent, with a standard deviation of 11.5. This implies that few, if any, of the people who have been to Hedangna understand the story. The average score of those who have not been to Hedangna is lower, with a higher standard deviation.

4.1.4 Summary

There is a distinct difference between the Lohorung spoken in Pangma and the Yamphu spoken in Hedangna. Many of the post-RTT comments reflect the strong ethnic identity of the Lohorung and Yamphu being brothers historically. This identity appears to supersede comprehension when it comes to attitudes between the two groups.

4.2 Relationship between Lohorung Pangma and Southern Yamphu Rajarani

4.2.1 Lexical similarity results

The lexical similarity between the Lohorung in Pangma and the Southern Yamphu in Rajarani is 66 percent. While this is a rather low percentage of lexical similarity, intelligibility testing was necessary to confirm that they are separate languages.

4.2.2 Intelligibility testing results

The results of the Lohorung Pangma RTT administered to people in Rajarani are displayed in Table 2.

Table 2. RTT results for Southern Yamphu speakers (Rajarani)

Story                         Rajarani scores
Lohorung (Pangma)     Avg     61
                      SD      19.3
                      n=      10

The average score for people who took the RTT in Rajarani was low at 61 percent, with a high standard deviation of 19.3. Usually, contact is a primary factor in high standard deviation. However, none of the RTT participants reported having ever been to Pangma, or even to Sankhuwasabha district. There is also no predictable demographic influence on scores.

4.2.3 Pre/Post-RTT question results

After listening to the Lohorung Pangma story, participants were asked, “What village do you think the storyteller is from?” No one could identify the location of the storyteller’s speech variety. Even though RTT scores were low and most participants said they did not understand all of the story, most (seven of ten) reported that the speech is only a little different from their own. Because none of the participants identified where the storyteller was from, these responses were based on the speech sample itself, not on the linguistic identity of the speaker.

4.2.4 Summary

Low lexical similarity and RTT scores confirm that Lohorung and Southern Yamphu (Rajarani) are different languages. Speakers of Southern Yamphu identify themselves ethnically and linguistically as Yamphu. However, the majority of respondents in Rajarani felt that Lohorung speech is only a little different from their own language.

5 Dialect variation and attitudes

Descriptions of dialect boundaries are informed by gathering lexical similarity information and testing intelligibility between language areas. In order to investigate potential dialects within Lohorung, we administered the Lohorung Pangma Recorded Text Test (RTT) in Angala, elicited wordlists in Pangma, Angala, and Dhupu, and conducted informal interviews in each location. This section will discuss the findings of these tools by comparing Angala and Dhupu with Pangma.

5.1 Lexical similarity results

In each Lohorung village we collected and analyzed a wordlist based on the guidelines in Appendix A. Sites were selected based on information regarding Lohorung population and geographic location. Pangma was chosen based on its status as the largest and oldest Lohorung community. Angala has a high Lohorung population and is in the western part of the Lohorung area. Dhupu was chosen as a data collection site because it is one of the easternmost Lohorung villages. Upon arrival, we found that, while there is a strong ethnic identity among Lohorung, there are very few people in Dhupu who speak the Lohorung language. The Lohorung wordlist participants from Dhupu did not fully meet our screening criteria, because Lohorung was not the language they spoke best. However, given the low vitality of Lohorung in Dhupu, a wordlist was elicited to document the variety that was spoken in Dhupu. Lexical similarity percentages among the three wordlists compared are shown in Table 3.

Table 3. Lexical similarity percentages matrix

Gairi Pangma
90              Dhupu
88              88       Angala

Analysis of Dhupu’s wordlist shows 90 percent lexical similarity with Pangma, which points towards likely high intelligibility between the Dhupu and Pangma varieties of Lohorung. Wordlist comparison between Gairi Pangma and Angala reveals a lexical similarity of 88 percent. This suggests that there may be intelligibility between any two of these three varieties, but testing is needed to confirm that hypothesis.

5.2 Intelligibility testing results