An extended introduction to the Central Taic languages and a review of previous research are available in
Johnson (2010).
2 Lexical Comparison
2.1 Lexical Similarity Percentages (“Lexicostatistics”) Among Zhuang Languages
In prior surveys conducted under the auspices of SIL International, a lexical similarity percentage method (sometimes called “lexicostatistics”) has often been used to narrow down the potential range of mutually
intelligible dialects (Casad 1974, Blair 1990, Bergman 1990). Phonetically similar words, or “apparent cognates,” are identified based on systematic criteria, and a percentage of similar lexemes is calculated
for each pair of languages in the analysis. In the experience of SIL’s numerous dialect intelligibility surveys, a one-way correlation has been found between this type of analysis (usually based upon the
Swadesh 200 wordlist) and the possibility of mutual intelligibility of related language varieties. Simons (1979) shows that speakers of a pair of languages sharing less than 60% lexical similarity will almost never adequately
understand each other, even on simple basic narratives. Grimes (1997) validates Simons’ 60% threshold inasmuch as wordlist similarities lower than this percentage allow us to assume that inherent
intelligibility is unlikely,
3 4
but he makes the case that above this level of similarity, nothing can be assumed about intelligibility from wordlist similarity percentages. In fact, we do have numerous
examples of languages with quite high lexical similarity for which levels of mutual inherent intelligibility are quite low, and we know that even inherent intelligibility is not always equal in both directions, as it
can be affected by various factors such as phonemic mergers and splits, retention of archaisms, etc.
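The arithmetic behind a lexical similarity percentage, and the one-way reading of the 60% threshold described above, can be sketched as follows. This is only an illustration: the function names, the glosses, the judgments, and the decision to skip items missing from one wordlist are all assumptions made for the example, and the sketch takes the similarity judgments as given rather than implementing the criteria by which “apparent cognates” are identified.

```python
def similarity_percentage(judgments):
    """Return the percentage of compared items judged phonetically similar.

    judgments: dict mapping a gloss to True (similar), False (not similar),
    or None (item missing in one list; skipped here by assumption).
    """
    compared = [j for j in judgments.values() if j is not None]
    if not compared:
        raise ValueError("no comparable items")
    return 100.0 * sum(compared) / len(compared)


def interpret(pct, threshold=60.0):
    """One-way reading of the threshold: only low scores are informative."""
    if pct < threshold:
        return "inherent intelligibility unlikely"
    return "undetermined: intelligibility testing needed"


# Hypothetical judgments for one pair of varieties (not survey data).
example = {"sun": True, "moon": True, "water": True, "dog": False, "fire": None}
pct = similarity_percentage(example)  # 3 similar of 4 compared
print(f"{pct:.0f}% -> {interpret(pct)}")
```

Note that `interpret` never returns a positive prediction of intelligibility: a score at or above the threshold only means that the question remains open.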
In the case of this survey, all of the Zhuang languages of China demonstrate a fairly large percentage of cognate vocabulary. Although the Central Taic Zhuang languages (that is, so-called “Southern Zhuang
dialects”) reportedly have more lexical variation than the Northern Taic Zhuang languages (“Northern Zhuang dialects”), there is an average lexical similarity of around
70% among Central Taic Zhuang languages in China, according to Wei and Qin (1980), and on average 65% shared vocabulary with the Northern Taic varieties, after removing recent Chinese loanwords. Wei
and Qin state that the Dai Zhuang and Nong Zhuang languages
4 5
share 64% of the core vocabulary, excluding recent Chinese loans. Prior to the fieldwork stage of our research, we independently obtained
similar results by calculating lexical similarity percentages for the subset of the data presented in Zhang et al. (1999) that were included in our own wordlist:
3 4
We use the term “inherent intelligibility” for the type of comprehension that is due to linguistic similarity between two varieties, not to speakers’ exposure to the dialect in question or to conscious or unconscious language acquisition, which
we refer to as “acquired comprehension.” Of course, when speakers of related dialects are in frequent contact with each other, it becomes impossible to separate these two types of comprehension.
4 5
Dai Zhuang and Nong Zhuang are referred to as the Wen-Ma and Yan-Guang sub-dialects, respectively, of the Southern dialect of Zhuang in Wei and Qin’s work (1980).
Lexical Similarity Among Zhuang Languages (percent)

Varieties (1) to (5) are Central Taic (Southern Zhuang); varieties (6) to (8) are Northern Taic (Northern Zhuang).

                                          (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)
(1) Wenshan County Dai Zhuang (“Tu”)        -   68   67   64   63   57   55   54
(2) Guangnan County Nong Zhuang            68    -   87   73   72   72   66   63
(3) Yanshan County Nong Zhuang             67   87    -   71   72   68   64   61
(4) Jingxi County Yang Zhuang (Guangxi)    64   73   71    -   90   68   64   68
(5) Debao County Yang Zhuang (Guangxi)     63   72   72   90    -   68   64   69
(6) Guangnan County Yei Zhuang (“Sha”)     57   72   68   68   68    -   83   71
(7) Qiubei County Yei Zhuang (“Sha”)       55   66   64   64   64   83    -   69
(8) Standard Zhuang (Wuming, Guangxi)      54   63   61   68   69   71   69    -

In calculating the above similarity percentages, we made use of a revision of Casad and Blair’s method
developed by Noel Mann of SIL International to better fit the syllable structures of tonal languages, called “syllostatistics” (Mann 2005). It should be noted that there is naturally a certain margin of error to
all of these percentages, though we have no meaningful way to compute that margin statistically. Also of note is that in computing these percentages we have established neither that similar pairs are true
historical cognates nor that dissimilar pairs are not historically related. Rather, the method used requires us to look for similar syllables based on similarity of segments in terms of articulatory distinctive features
and regular correspondences. In some cases, pairs of words are deemed to be similar because the same phonological correspondence is evidenced in at least two other instances in the dataset, although
phonetically the syllables are distinct enough that listeners might not recognize them as the same word upon first hearing. So, while these “lexical similarity percentages” may be slightly lower than the
percentage of true cognates, they are usually higher than the similarity perceived by speakers who have not had prior exposure to the other variety.
5 6
In any case, in Wei and Qin’s percentages and in our own calculations, what we see is that virtually all of the Zhuang languages remain similar enough lexically to be within the range of potential inherent
intelligibility, according to the 60% similarity cut-off established by Grimes. The only combinations excluded by this criterion are those of the Wenshan Dai Zhuang with the Northern Taic varieties: both of the
two wordlists from within Yunnan province (Qiubei and Guangnan) and that of Wuming county, Guangxi (the government standard dialect for the Zhuang nationality). The other pairs, all being above 60% similar,
leave open the possibility of inherent intelligibility, even between the Central Taic varieties (Nong and Yang Zhuang) and the Northern Taic varieties. Therefore, in this particular language cluster,
lexicostatistics, no matter what method is used to calculate similarity percentages, is of little use in identifying intelligibility groups and potential reference dialects for language standardization and spelling
conventions.
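As a check on the reasoning above, the pairwise percentages in the table can be screened against the 60% cut-off. The following is a minimal sketch, assuming the values are read from the upper half of the table in this section; the variable names and short variety labels are illustrative only. It also averages the Central Taic pairs and the Central/Northern pairs; these means are computed from our table and are not Wei and Qin’s figures, though they can be compared with them.

```python
# Short labels for the eight wordlists, in the order used in the table.
VARIETIES = [
    "Wenshan Dai", "Guangnan Nong", "Yanshan Nong", "Jingxi Yang",
    "Debao Yang", "Guangnan Yei", "Qiubei Yei", "Standard Zhuang",
]
CENTRAL = set(VARIETIES[:5])  # the remaining three are Northern Taic

# Upper triangle of the similarity table, row by row (percent).
UPPER = [
    [68, 67, 64, 63, 57, 55, 54],
    [87, 73, 72, 72, 66, 63],
    [71, 72, 68, 64, 61],
    [90, 68, 64, 68],
    [68, 64, 69],
    [83, 71],
    [69],
]

# Map each unordered pair of varieties to its percentage.
sim = {}
for i, row in enumerate(UPPER):
    for offset, pct in enumerate(row, start=1):
        sim[(VARIETIES[i], VARIETIES[i + offset])] = pct

# Pairs for which inherent intelligibility is assumed unlikely.
below = [(a, b, p) for (a, b), p in sim.items() if p < 60]

# Mean similarity within Central Taic and across the two branches.
central = [p for (a, b), p in sim.items() if a in CENTRAL and b in CENTRAL]
cross = [p for (a, b), p in sim.items() if (a in CENTRAL) != (b in CENTRAL)]
mean_central = sum(central) / len(central)
mean_cross = sum(cross) / len(cross)

for a, b, p in below:
    print(f"{a} / {b}: {p}%")
print(f"Central Taic mean: {mean_central:.1f}%; Central/Northern mean: {mean_cross:.1f}%")
```

Only the three pairs involving Wenshan Dai Zhuang and a Northern Taic variety fall below the cut-off, which is why the percentages alone cannot separate the remaining varieties into intelligibility groups.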
5 6
We thank intern Lim Chiong for her work in calculating these lexical similarity percentages.
Intelligibility testing, questionnaire results and historical ethnic identity have established that there is
indeed a significant difference between the Nong Zhuang and Dai Zhuang languages, to the degree that there is very little inherent intelligibility between these two groups, and almost none with Northern Taic
varieties like the Wuming Standard Zhuang.
6 7
To our knowledge, the Min Zhuang have not been previously researched and are mentioned in only one prior publication (Kullavanijaya
and L-Thongkum 1998), but they also seem to have a distinct ethnic identity and appear unable to understand either Nong or Dai Zhuang. So, based on the intelligibility and language use data presented
above, we are confident in identifying these three as distinct language groups, although a majority of their lexicon seems to be closely related.
2.2 Lexical Comparison from an Historically Informed Phonological Approach
Because wordlist similarity percentages yielded little information as to the distinguishing characteristics of the various Central Tai languages of Yunnan, we decided to compare the languages from a phonological perspective.
A full historical analysis, with the goal of accurately describing the linguistic history of the Central Tai Zhuang languages, is beyond the scope of the present research. Others, such as Li (1977), Luo (1997), and
Fine (2005), have already addressed this issue. So our approach will be primarily practical and synchronic, though we will take account of the reconstructions proposed in the above works in order to organize
the data more efficiently.
Although extensive wordlist collection was conducted in the Zhuang language survey carried out by the Chinese Academy of Social Sciences in the late 1950s (Zhang et al. 1999), only three Southern Zhuang
locations in Yunnan were included: two Nong villages in Yanshan and Guangnan counties, and a Dai Zhuang village in Wenshan county. As there is a good deal of phonological variation within the Yunnan
Central Taic languages, we felt it useful to collect shorter wordlists in additional locations in order to better understand the pronunciation differences, and to determine how Yunnan Southern Zhuang speakers
could best utilize the modern Zhuang orthography to represent their languages in a way that facilitates both use by writers and comprehension of written materials by readers. Finally, we also hope to
supplement previously available Central Tai data, as most comparative research to date has had to rely on a small number of representatives from the Central Tai branch.
2.3 Wordlist Construction