
Figure 7.7. Sui cluster dendrogram using narrow transcriptions and WPGMA.

In order to check the validity of these clusters we produced a multidimensional scaling (MDS) plot of the data. This is shown in figure 7.8. Relative phonetic distances between Sui varieties are plotted onto a two-dimensional plane. Separate Pandong and Yang’an clusters are unambiguous. The Sandong varieties, however, appear to be in more of a “dialect continuum”, ranging from RL (closest in pronunciation to Pandong) to JR (the most distant from all other varieties). SW (Southern) appears almost midway between the Central Sui dialects (SD, ZH) and the other Southern Sui dialects (JQ, SY, JR). TZ, which belongs to Western Sui by other analyses, appears to be closer to Central Sui (SD, ZH).

Figure 7.8. MDS plot of LDs based on narrow transcriptions.
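The text does not say which MDS variant or software produced the plot. As a minimal sketch, assuming classical (Torgerson) MDS, the following shows how a symmetric matrix of pairwise distances can be turned into two-dimensional coordinates. The function name classical_mds and the toy distances are illustrative assumptions, not the Sui data, and the power-iteration approach assumes the two largest eigenvalues of the centred matrix are positive.

```python
import math
import random


def classical_mds(dist, dims=2, iters=500, seed=0):
    """Classical (Torgerson) MDS for a symmetric n x n distance matrix.

    Illustrative sketch only: eigenvectors are found by power iteration
    with deflation, which is adequate for small matrices.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        v = [rng.uniform(-1, 1) for _ in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the converged vector.
        eigval = sum(v[i] * sum(b[i][j] * v[j] for j in range(n))
                     for i in range(n))
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][k] = v[i] * scale
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Toy example (hypothetical data): four points at the corners of a 3 x 4 rectangle.
points = [(0, 0), (3, 0), (3, 4), (0, 4)]
toy = [[math.dist(p, q) for q in points] for p in points]
coords = classical_mds(toy)
```

For distances that are exactly Euclidean in two dimensions, as in the toy example, the recovered coordinates reproduce the input distances up to rotation and reflection. Phonetic distances are generally not exactly embeddable in a plane, so a plot of this kind is an approximation.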

7.4.2 LD calculated using broad, phonemicised transcriptions

We now look at the results using broad, phonemicised transcriptions. Raw Levenshtein distances are given in table 7.4. Unsurprisingly, the figures are lower than for the narrow transcriptions. The smallest phonetic distance is 0.013, between SJ and RL (both Eastern). The greatest distance is 0.123, between JR (Southern) and the Pandong lects (PD, JL) and between SW (Southern) and JL (Pandong). The overall clusters revealed, however, are essentially identical to those revealed using the narrow, phonetic transcriptions. WPGMA clustering results are shown in figure 7.9. Again, other algorithms (including fuzzy clustering) produced identical clusters.

Table 7.4. Levenshtein distances for phonemicised transcriptions (distances under 0.05 shaded in grey)

Lects in table order: Sandong (Central, Western, Eastern, Southern): SD ZH AT TP TZ DJ SJ RL JQ SW SY JR; Pandong: PD JL; Yang’an: TN BL.

    SD     ZH     AT     TP     TZ     DJ     SJ     RL     JQ     SW     SY     JR     PD     JL     TN
ZH  0.018
AT  0.037  0.026
TP  0.039  0.032  0.026
TZ  0.041  0.032  0.033  0.044
DJ  0.028  0.035  0.049  0.054  0.064
SJ  0.041  0.052  0.065  0.069  0.077  0.020
RL  0.045  0.054  0.065  0.072  0.077  0.025  0.013
JQ  0.040  0.047  0.066  0.068  0.066  0.057  0.068  0.073
SW  0.045  0.055  0.072  0.074  0.075  0.059  0.070  0.074  0.015
SY  0.047  0.053  0.070  0.069  0.073  0.065  0.076  0.077  0.031  0.035
JR  0.051  0.056  0.076  0.077  0.076  0.068  0.080  0.083  0.023  0.025  0.032
PD  0.094  0.102  0.089  0.083  0.114  0.079  0.066  0.064  0.113  0.112  0.114  0.123
JL  0.095  0.098  0.083  0.085  0.102  0.082  0.073  0.071  0.117  0.123  0.116  0.123  0.063
TN  0.099  0.093  0.101  0.099  0.096  0.092  0.087  0.079  0.111  0.106  0.114  0.117  0.099  0.113
BL  0.103  0.096  0.105  0.104  0.101  0.095  0.092  0.085  0.112  0.109  0.119  0.118  0.112  0.122  0.034

Figure 7.9. Sui cluster dendrogram using phonemicised transcriptions and WPGMA.

The only significant difference between the two cluster charts in figures 7.7 and 7.9 is the overall hierarchy of the Pandong (PD, JL) and Yang’an (BL, TN) lects. Using narrow, phonetic transcriptions (figure 7.7), Yang’an (dark blue) is grouped with the Sandong lects at a higher level and Pandong is the outlier (light blue).
Using phonemicised transcriptions (figure 7.9), Pandong is grouped with the Sandong lects and Yang’an is the outlier. Thus the colours representing the Yang’an and Pandong dialects are reversed in figure 7.9: light blue still shows the outlier, but this time the outlier is Yang’an instead of Pandong. The clustering of LD using phonemicised transcriptions is more consistent with the historical comparative analysis (chapters 3 to 5), which shows that Yang’an is indeed an outlier and actually belongs to the Kam branch rather than the Sui branch.

To test the validity of the clusters in figure 7.9, we produced a multidimensional scaling (MDS) plot based on LD calculated using the phonemicised transcriptions. This is shown in figure 7.10. In terms of the distance between the Pandong and Yang’an clusters and the Sandong cluster, the two MDS plots are extremely similar. A simple calculation of arithmetic means shows that the average LD of Pandong from all other lects is greater than the Yang’an average when using phonetic transcriptions, whereas it is less than the Yang’an average when using phonemic transcriptions.⁷ The difference is too small to be statistically significant. Perhaps the most obvious difference between the two MDS plots is that SW very clearly groups with Southern in the second plot, whereas its position is much less certain in the first plot. In other respects (distinct Yang’an and Pandong clusters, clear clustering of the other three Southern lects, and a subclustering of TZ, SD and ZH), the two MDS plots are virtually identical.

We can conclude, then, that phonemicised transcriptions produce clusters that are just as meaningful as those from narrow, phonetic transcriptions when using LD, and that in some respects clusters based on phonemicised data are clearer and correlate more closely with historical analysis. Similar studies must be done on a variety of languages and dialects to determine whether this is a chance result for Sui or whether it holds generally across languages.

Figure 7.10. MDS plot of LDs based on phonemicised transcriptions.
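The comparison of arithmetic means for the phonemicised data can be checked against table 7.4. The sketch below takes one possible reading of "from all other lects", namely the mean distance from the lects in a group to every lect outside that group; this reading, and the names TO_SANDONG, BETWEEN and mean_external_distance, are assumptions made for illustration and may differ from the calculation actually used.

```python
# Distances copied from table 7.4 (phonemicised transcriptions).
# Each list gives the distance to the twelve Sandong lects, in table order:
# SD ZH AT TP TZ DJ SJ RL JQ SW SY JR
TO_SANDONG = {
    "PD": [0.094, 0.102, 0.089, 0.083, 0.114, 0.079,
           0.066, 0.064, 0.113, 0.112, 0.114, 0.123],
    "JL": [0.095, 0.098, 0.083, 0.085, 0.102, 0.082,
           0.073, 0.071, 0.117, 0.123, 0.116, 0.123],
    "TN": [0.099, 0.093, 0.101, 0.099, 0.096, 0.092,
           0.087, 0.079, 0.111, 0.106, 0.114, 0.117],
    "BL": [0.103, 0.096, 0.105, 0.104, 0.101, 0.095,
           0.092, 0.085, 0.112, 0.109, 0.119, 0.118],
}

# Distances among the four Pandong and Yang'an lects, also from table 7.4.
BETWEEN = {
    frozenset(("PD", "JL")): 0.063,
    frozenset(("PD", "TN")): 0.099,
    frozenset(("JL", "TN")): 0.113,
    frozenset(("PD", "BL")): 0.112,
    frozenset(("JL", "BL")): 0.122,
    frozenset(("TN", "BL")): 0.034,
}


def mean_external_distance(group):
    """Mean distance from the lects in `group` to every lect outside it.

    Assumption: within-group pairs (e.g. PD-JL) are excluded.
    """
    group = set(group)
    values = []
    for lect in group:
        values.extend(TO_SANDONG[lect])
        values.extend(d for pair, d in BETWEEN.items()
                      if lect in pair and not pair <= group)
    return sum(values) / len(values)


pandong = mean_external_distance(["PD", "JL"])
yangan = mean_external_distance(["TN", "BL"])
print(round(pandong, 3), round(yangan, 3))  # 0.099 0.103
```

Under this reading the Pandong mean is slightly below the Yang’an mean, in line with the statement above for phonemic transcriptions. The sketch says nothing about statistical significance.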

7.4.3 Sandong (SD) and Zhonghe (ZH): the most representative varieties