
various dialects are. In order to preserve a written record of the Semai dialects, copies of the compiled wordlists have been turned over to the Jabatan Hal Ehwal Orang Asli as well as the Economic Planning Unit in the Prime Minister’s Department. Ultimately, if the use of Semai is to be preserved, some form of standardization will need to take place so that important decisions, such as the choice of orthography, can be made effectively. One key question is which dialect or dialects would allow adequate communication with all speakers of Semai. Identification, documentation, and systematic comparison of the Semai dialects are critical first steps toward standardizing Semai.

1.3 The contribution of Semai to historical linguistics

The Semai language, true to its Mon-Khmer heritage, has a rich set of vowels, nearly thirty when counting all the nasal and length features. Furthermore, as Diffloth 1976a has noted, Semai has preserved a number of disyllabic and polysyllabic words, features that have largely been lost in other Mon-Khmer languages in Southeast Asia. Thus the Semai people, as well as other speakers of Aslian languages in Malaysia, have much to offer humanity as we endeavor to reconstruct the history of the Mon-Khmer languages. It is hoped that the documentation and the reconstruction of the Semai ancestor language in this report will help to further such efforts.

2 Methodology

2.1 Collection of wordlists

A wordlist of 436 words was constructed, including words from the basic 200-word Swadesh list, words that are typical of Southeast Asian languages, and words that are culturally and linguistically specific to the speakers of Central Aslian languages. The items in the wordlist were arranged by semantic category and listed in Malay and English. This wordlist was then used to elicit words from twenty-seven dialects of Semai. Dialects were selected on the basis of information gleaned from the existing literature on Semai and of asking the Semai themselves which areas spoke dialects different from their own. The following table shows the locations of the dialects selected for this research. A map showing the geographic locations of these villages is given in Appendix A.

Table 1. Wordlist locations

Kampung         District           State
Batu 17         Batang Padang      Perak
Bidor           Batang Padang      Perak
Chinggung       Batang Padang      Perak
Cluny           Batang Padang      Perak
Rasau           Batang Padang      Perak
Sungai Bil      Batang Padang      Perak
Sungkai         Batang Padang      Perak
Tapah           Batang Padang      Perak
Gopeng          Kinta              Perak
Kampar          Kinta              Perak
Bota            Perak Tengah       Perak
Tangkai Cermin  Perak Tengah       Perak
Cenan Cerah     Cameron Highlands  Pahang
Relong          Cameron Highlands  Pahang
Renglas         Cameron Highlands  Pahang
Sungai Ruil     Cameron Highlands  Pahang
Terisu          Cameron Highlands  Pahang
Bertang         Lipis              Pahang
Betau           Lipis              Pahang
Cherong         Lipis              Pahang
Kuala Kenip     Lipis              Pahang
Serau           Lipis              Pahang
Lanai           Lipis              Pahang
Pagar           Lipis              Pahang
Simoi           Lipis              Pahang
Pos Buntu       Raub               Pahang

The wordlists were generally elicited by direct questioning in Bahasa Malaysia (Malay). Once the complete list had been elicited, the data were rearranged according to the similar phonetic segments encountered, and the list was then rechecked. Grouping the elicited words by similar sounds (for instance, putting together all the words containing front vowels) made it easier to hear the often-subtle differences between similar sounds. In some cases a recording was also made of the same Semai speaker pronouncing the words that had just been elicited. These recordings were quite helpful in clearing up inconsistencies later discovered in the elicited words, and thus often removed the need to return to the same village for further checking.

The elicited wordlists were then used to determine the degree of linguistic similarity between dialects. The comparison of wordlists was used to determine the number of phonetically similar lexical items, to discover word families, to identify phonological changes in order to establish the linguistic relationship between the speech communities, and finally, to propose a reconstruction of several hundred lexical items for proto-Semai.
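
The counting step in such a comparison can be illustrated with a short sketch. This is not the procedure used in this report, which does not specify how similarity was scored. It assumes that each wordlist item has already been judged by hand and that dialects whose forms were judged phonetically similar share a label for that item. The glosses, dialect names, labels, and the function name similarity_matrix are all hypothetical placeholders, not data from the survey.

```python
from itertools import combinations

# Hypothetical similarity judgments (placeholders, not survey data).
# For each wordlist item, dialects that share a label were judged to have
# phonetically similar forms. None marks an item that was not elicited.
judgments = {
    "gloss_1": {"dialect_A": "a", "dialect_B": "a", "dialect_C": "b"},
    "gloss_2": {"dialect_A": "a", "dialect_B": "b", "dialect_C": "b"},
    "gloss_3": {"dialect_A": "a", "dialect_B": "a", "dialect_C": "a"},
    "gloss_4": {"dialect_A": "a", "dialect_B": None, "dialect_C": "a"},
}


def similarity_matrix(judgments):
    """Return the percentage of similar items for every pair of dialects.

    Items missing from either dialect of a pair are left out of that
    pair's count, so each percentage is based only on comparable items.
    """
    dialects = sorted({d for row in judgments.values() for d in row})
    result = {}
    for x, y in combinations(dialects, 2):
        shared = 0
        compared = 0
        for row in judgments.values():
            label_x, label_y = row.get(x), row.get(y)
            if label_x is None or label_y is None:
                continue  # item not elicited in one of the two dialects
            compared += 1
            if label_x == label_y:
                shared += 1
        result[(x, y)] = 100.0 * shared / compared if compared else None
    return result


if __name__ == "__main__":
    for (x, y), pct in similarity_matrix(judgments).items():
        shown = "n/a" if pct is None else f"{pct:.1f}%"
        print(f"{x} / {y}: {shown}")
```

A table of pairwise percentages of this kind only summarizes the hand judgments. Discovering word families and identifying phonological changes still depend on inspecting the forms themselves.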

2.2 Language assistant questionnaires