Figure 7.7. Sui cluster dendrogram using narrow transcriptions and WPGMA.
In order to check the validity of these clusters we produced a multidimensional scaling (MDS) plot of the data. This is shown in figure 7.8. Relative phonetic distances between Sui varieties are plotted onto a two-dimensional plane. Separate Pandong and Yang’an clusters are unambiguous. The Sandong varieties, however, appear to be in more of a “dialect continuum”, ranging from RL (closest in pronunciation to Pandong) to JR (the most distant from all other varieties). SW (Southern) appears almost midway between the Central Sui dialects (SD, ZH) and the other Southern Sui dialects (JQ, SY, JR). TZ, which belongs to Western Sui by other analyses, appears to be closer to the Central dialects (SD, ZH).
Figure 7.8. MDS plot of LDs based on narrow transcriptions.
7.4.2 LD calculated using broad, phonemicised transcriptions
We now look at the results using broad, phonemicised transcriptions. Raw Levenshtein distances are given in table 7.4. Unsurprisingly, the figures are lower than for the narrow transcriptions. The smallest phonetic distance is 0.013, between SJ and RL (both Eastern). The greatest distance is 0.123, between JR (Southern) and the Pandong lects (PD, JL), and between SW (Southern) and JL (Pandong). The overall clusters revealed, however, are essentially identical to those revealed using the narrow, phonetic transcriptions. WPGMA clustering results are shown in figure 7.9. Again, other algorithms (including fuzzy clustering) produced identical clusters.
Table 7.4. Levenshtein distances for phonemicised transcriptions (distances under 0.05 shaded in grey)
Groups: Sandong (Central, Western, Eastern, Southern), Pandong, Yang’an
     SD     ZH     AT     TP     TZ     DJ     SJ     RL     JQ     SW     SY     JR     PD     JL     TN
ZH   0.018
AT   0.037  0.026
TP   0.039  0.032  0.026
TZ   0.041  0.032  0.033  0.044
DJ   0.028  0.035  0.049  0.054  0.064
SJ   0.041  0.052  0.065  0.069  0.077  0.020
RL   0.045  0.054  0.065  0.072  0.077  0.025  0.013
JQ   0.040  0.047  0.066  0.068  0.066  0.057  0.068  0.073
SW   0.045  0.055  0.072  0.074  0.075  0.059  0.070  0.074  0.015
SY   0.047  0.053  0.070  0.069  0.073  0.065  0.076  0.077  0.031  0.035
JR   0.051  0.056  0.076  0.077  0.076  0.068  0.08   0.083  0.023  0.025  0.032
PD   0.094  0.102  0.089  0.083  0.114  0.079  0.066  0.064  0.113  0.112  0.114  0.123
JL   0.095  0.098  0.083  0.085  0.102  0.082  0.073  0.071  0.117  0.123  0.116  0.123  0.063
TN   0.099  0.093  0.101  0.099  0.096  0.092  0.087  0.079  0.111  0.106  0.114  0.117  0.099  0.113
BL   0.103  0.096  0.105  0.104  0.101  0.095  0.092  0.085  0.112  0.109  0.119  0.118  0.112  0.122  0.034
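The extremes quoted in the text can be checked directly against the table. The following is a minimal sketch, assuming the values are read as a lower-triangular matrix in the lect order SD, ZH, AT, TP, TZ, DJ, SJ, RL, JQ, SW, SY, JR, PD, JL, TN, BL; the names `LECTS`, `LOWER` and `pairs` are illustrative only and not part of the original analysis.

```python
# Lect codes in the assumed row/column order of Table 7.4.
LECTS = ["SD", "ZH", "AT", "TP", "TZ", "DJ", "SJ", "RL",
         "JQ", "SW", "SY", "JR", "PD", "JL", "TN", "BL"]

# Lower triangle of Table 7.4: row i holds the distances from LECTS[i]
# to LECTS[0] .. LECTS[i-1].
LOWER = [
    [],
    [0.018],
    [0.037, 0.026],
    [0.039, 0.032, 0.026],
    [0.041, 0.032, 0.033, 0.044],
    [0.028, 0.035, 0.049, 0.054, 0.064],
    [0.041, 0.052, 0.065, 0.069, 0.077, 0.020],
    [0.045, 0.054, 0.065, 0.072, 0.077, 0.025, 0.013],
    [0.040, 0.047, 0.066, 0.068, 0.066, 0.057, 0.068, 0.073],
    [0.045, 0.055, 0.072, 0.074, 0.075, 0.059, 0.070, 0.074, 0.015],
    [0.047, 0.053, 0.070, 0.069, 0.073, 0.065, 0.076, 0.077, 0.031, 0.035],
    [0.051, 0.056, 0.076, 0.077, 0.076, 0.068, 0.08, 0.083, 0.023, 0.025, 0.032],
    [0.094, 0.102, 0.089, 0.083, 0.114, 0.079, 0.066, 0.064, 0.113, 0.112,
     0.114, 0.123],
    [0.095, 0.098, 0.083, 0.085, 0.102, 0.082, 0.073, 0.071, 0.117, 0.123,
     0.116, 0.123, 0.063],
    [0.099, 0.093, 0.101, 0.099, 0.096, 0.092, 0.087, 0.079, 0.111, 0.106,
     0.114, 0.117, 0.099, 0.113],
    [0.103, 0.096, 0.105, 0.104, 0.101, 0.095, 0.092, 0.085, 0.112, 0.109,
     0.119, 0.118, 0.112, 0.122, 0.034],
]


def pairs(lower, names):
    """Yield (distance, row_lect, column_lect) for every cell of the triangle."""
    for i, row in enumerate(lower):
        for j, value in enumerate(row):
            yield value, names[i], names[j]


all_pairs = list(pairs(LOWER, LECTS))
smallest = min(all_pairs)
largest_value = max(value for value, _, _ in all_pairs)
largest = [(a, b) for value, a, b in all_pairs if value == largest_value]

print(smallest)       # (0.013, 'RL', 'SJ')
print(largest_value)  # 0.123
print(largest)        # [('PD', 'JR'), ('JL', 'SW'), ('JL', 'JR')]
```

Read this way, the triangle holds 120 cells (16 lects taken pairwise), and the minimum and maximum fall on the pairs named in the text.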
Figure 7.9. Sui cluster dendrogram using phonemicised transcriptions and WPGMA.
The only significant difference between the two cluster charts in figures 7.7 and 7.9 is the overall hierarchy of the Pandong (PD, JL) and Yang’an (BL, TN) lects. Using narrow, phonetic transcriptions (figure 7.7), Yang’an (dark blue) is grouped with the Sandong lects at a higher level and Pandong is the outlier (light blue). Using phonemicised transcriptions (figure 7.9), Pandong is grouped with the Sandong lects and Yang’an is the outlier. Thus the colours representing the Yang’an and Pandong dialects are reversed in figure 7.9. Light blue still shows the outlier, but this time the outlier is Yang’an instead of Pandong.
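The WPGMA merge rule behind this hierarchy can be illustrated on a small subset of table 7.4. The sketch below is an assumption-laden illustration rather than the procedure used to produce figure 7.9: it takes only six lects (the two Central Sandong lects SD and ZH, plus the Pandong and Yang’an lects), copies their phonemicised distances from the table, and uses a hypothetical helper named `wpgma`. Results on the full sixteen-lect matrix need not match this subset in detail.

```python
# Six-lect subset of Table 7.4 (phonemicised transcriptions).
LECTS = ["SD", "ZH", "PD", "JL", "TN", "BL"]
DIST = {
    ("SD", "ZH"): 0.018,
    ("SD", "PD"): 0.094, ("ZH", "PD"): 0.102,
    ("SD", "JL"): 0.095, ("ZH", "JL"): 0.098, ("PD", "JL"): 0.063,
    ("SD", "TN"): 0.099, ("ZH", "TN"): 0.093, ("PD", "TN"): 0.099,
    ("JL", "TN"): 0.113,
    ("SD", "BL"): 0.103, ("ZH", "BL"): 0.096, ("PD", "BL"): 0.112,
    ("JL", "BL"): 0.122, ("TN", "BL"): 0.034,
}


def wpgma(names, dist):
    """Return the merge history [(cluster_a, cluster_b, height), ...]."""
    # Symmetric lookup keyed by the unordered pair of cluster labels.
    d = {frozenset(pair): value for pair, value in dist.items()}
    clusters = list(names)
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair among the current clusters.
        a, b = min(
            ((x, y) for i, x in enumerate(clusters) for y in clusters[i + 1:]),
            key=lambda pair: d[frozenset(pair)],
        )
        height = d[frozenset((a, b))]
        merged = f"({a},{b})"
        # WPGMA update: the distance to the new cluster is the plain average
        # of the two old distances, whatever the sizes of the two clusters.
        for other in clusters:
            if other not in (a, b):
                d[frozenset((merged, other))] = (
                    d[frozenset((a, other))] + d[frozenset((b, other))]
                ) / 2
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
        merges.append((a, b, height))
    return merges


merges = wpgma(LECTS, DIST)
for a, b, height in merges:
    print(f"{a} + {b} at {height:.5f}")
```

On this subset the Central pair joins Pandong at 0.09725, only slightly below its distance to Yang’an (0.09775), so Yang’an is attached last, at about 0.1046. The narrowness of that margin is worth bearing in mind when reading the top level of either dendrogram.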
The clustering of LD using phonemicised transcriptions is more consistent with historical comparative analysis (chapters 3 to 5), which shows that Yang’an is indeed an outlier and actually belongs to the Kam branch rather than the Sui branch. To test the validity of the clusters in figure 7.9, we produced a multidimensional scaling (MDS) plot based on LD calculated using the phonemicised transcriptions. This is shown in figure 7.10. In terms of the distance between the Pandong/Yang’an clusters and the Sandong cluster, both MDS plots are extremely similar. A simple calculation of arithmetic means shows that the average LD of Pandong from all other lects is greater than the Yang’an average when using phonetic transcriptions, whereas it is less than the Yang’an average when using phonemic transcriptions.
This difference, however, is too small to be statistically significant.
Perhaps the most obvious difference between the two MDS plots is that SW very clearly groups with Southern in the second plot, whereas its position is much less certain in the first plot. In other respects (distinct Yang’an and Pandong clusters, clear clustering of the other three Southern lects, and a subclustering of TZ, SD and ZH), the two MDS plots are virtually identical. We can conclude, then, that phonemicised transcriptions produce clusters that are just as meaningful as those from narrow, phonetic transcriptions when using LD, and that in some respects clusters based on phonemicised data are clearer and correlate more closely with historical analysis. Similar studies must be done on a variety of languages and dialects to determine whether this is a chance result for Sui or whether it holds more generally across languages.
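The comparison of arithmetic means mentioned above can be sketched for the phonemicised data in table 7.4. This is a minimal illustration under one assumption that the text does not spell out: “all other lects” is taken here to mean every lect outside the group itself, so the within-group pairs PD–JL and TN–BL are excluded. The names `TO_SANDONG`, `CROSS` and `group_mean` are illustrative.

```python
from statistics import mean

# Distances from Table 7.4 to the twelve Sandong lects, in the column order
# SD ZH AT TP TZ DJ SJ RL JQ SW SY JR.
TO_SANDONG = {
    "PD": [0.094, 0.102, 0.089, 0.083, 0.114, 0.079,
           0.066, 0.064, 0.113, 0.112, 0.114, 0.123],
    "JL": [0.095, 0.098, 0.083, 0.085, 0.102, 0.082,
           0.073, 0.071, 0.117, 0.123, 0.116, 0.123],
    "TN": [0.099, 0.093, 0.101, 0.099, 0.096, 0.092,
           0.087, 0.079, 0.111, 0.106, 0.114, 0.117],
    "BL": [0.103, 0.096, 0.105, 0.104, 0.101, 0.095,
           0.092, 0.085, 0.112, 0.109, 0.119, 0.118],
}

# Distances between Pandong lects and Yang'an lects (also from Table 7.4).
CROSS = {
    ("PD", "TN"): 0.099, ("JL", "TN"): 0.113,
    ("PD", "BL"): 0.112, ("JL", "BL"): 0.122,
}


def group_mean(members):
    """Mean distance from the given lects to every lect outside their group."""
    values = []
    for lect in members:
        values.extend(TO_SANDONG[lect])
        values.extend(v for pair, v in CROSS.items() if lect in pair)
    return mean(values)


pandong_mean = group_mean(["PD", "JL"])
yangan_mean = group_mean(["TN", "BL"])

print(round(pandong_mean, 4))  # 0.0988
print(round(yangan_mean, 4))   # 0.1028
```

Under this reading the Pandong average is lower than the Yang’an average by about 0.004, which agrees with the direction reported for the phonemic transcriptions. The corresponding figures for the narrow transcriptions cannot be reproduced from this section.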
Figure 7.10. MDS plot of LDs based on phonemicised transcriptions.
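For readers unfamiliar with MDS, the following sketch shows one common variant, classical (Torgerson) scaling, applied to a six-lect subset of table 7.4. It is an assumption that this variant resembles the one used for figures 7.8 and 7.10; the text does not say which MDS algorithm or software was used. The eigenvectors are found by simple power iteration to keep the example free of external libraries, and all function names are illustrative.

```python
import math

LECTS = ["SD", "ZH", "PD", "JL", "TN", "BL"]
# Lower triangle for the six lects, values copied from Table 7.4.
LOWER = [
    [],
    [0.018],
    [0.094, 0.102],
    [0.095, 0.098, 0.063],
    [0.099, 0.093, 0.099, 0.113],
    [0.103, 0.096, 0.112, 0.122, 0.034],
]


def full_matrix(lower):
    """Expand a lower triangle into a symmetric matrix with a zero diagonal."""
    n = len(lower)
    d = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(lower):
        for j, value in enumerate(row):
            d[i][j] = d[j][i] = value
    return d


def double_centre(d):
    """Return B = -1/2 * J * D^2 * J, the centred inner-product matrix."""
    n = len(d)
    sq = [[x * x for x in row] for row in d]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
             for j in range(n)] for i in range(n)]


def top_eigenpair(b, iters=500):
    """Approximate the dominant eigenvalue and eigenvector by power iteration."""
    n = len(b)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v


def classical_mds(d, dims=2):
    """Return one coordinate tuple per item, using the top `dims` eigenpairs."""
    b = double_centre(d)
    n = len(b)
    axes = []
    for _ in range(dims):
        lam, v = top_eigenpair(b)
        axes.append([math.sqrt(max(lam, 0.0)) * x for x in v])
        # Deflate so that the next pass finds the next eigenpair.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return [tuple(axis[i] for axis in axes) for i in range(n)]


coords = classical_mds(full_matrix(LOWER))
for name, (x, y) in zip(LECTS, coords):
    print(f"{name}: ({x:+.4f}, {y:+.4f})")
```

The orientation and sign of the axes are arbitrary in any MDS solution, so only the relative positions of the points are meaningful. A plot built from the full sixteen-lect matrix would be needed to compare with figure 7.10.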
7.4.3 Sandong (SD) and Zhonghe (ZH): the most representative varieties