the parts of the sentence which the subject retold correctly. We wrote down any parts of the retelling which diverged from the original text. By using a different coloured pen for each subject, we could
record the retellings of the three subjects who listened to the same sets of sentences on the same response sheets.
After testing each group of sentences we asked the subject a set of questions given in appendix C to find out their own perception of how much they understood and their judgement of how similar the
dialect was to their own dialect.
8.4.6 Scoring the tests
In this section we explain how we assigned each subject an overall intelligibility score. The scoring process comprised five stages: (1) identifying the core elements of each sentence on which to base the
final scores; (2) assigning each subject a score based upon these core elements; (3) calculating an overall score for each test; (4) adjusting the scores based on the hometown test results in order to ensure that the
results for each RTT were comparable to each other; (5) adjusting the final percentages by excluding anomalous subject scores so as to raise the overall confidence limit.
8.4.6.1 Identifying core elements
For every sentence of each of the four dialects tested (Central, Southern, Pandong and Yang’an) we identified core elements on which to base our scoring. To do this we referred to two sets of materials: (1)
the literal word-for-word translations of each of the original Sui sentences; and (2) the responses of the “hometown subjects” (those who listened to and retold their own dialects). We did not base the scoring
solely upon the literal translations of the original sentences because we did not know whether any specific content would be “assumed” by local Sui speakers and would therefore tend not to be retold. In
other words, we did not know which elements of the sentences were deemed by mother-tongue speakers to be “important” enough to retell. Furthermore, some parts of the sentence may have been auditorily
clearer than others in the original recording. We did not wish to penalise subjects for not retelling parts of the recording which the hometown subjects themselves did not retell because they did not hear them
clearly.
The process of identifying core elements is best illustrated by working through an example step by step. We will use the second sentence in Group B, Sentence 15. The original sentence designed for the
RTT is given below. Although the Chinese and Sui could be described as a single “sentence”, the English translation is better rendered using several short sentences. Neither the original
Chinese nor the Sui version of this sentence indicates a particular tense, so we render all English translations in this section in the past tense for the sake of consistency.
Chinese: 冬天很冷,每个人都穿着厚厚的衣服,在家里烤火,哪里都不去。
English: It was very cold in the winter. Everyone was wearing very thick clothes and warming themselves by the fire at home. They did not go anywhere.
Table 8.4 shows a transcription of the same sentence as recorded in Pandong dialect. The Pandong version is not identical to the original version given above. The original states “very thick clothes”
whereas the Pandong version reads “lots of clothes”. It was common for our original sentences to be translated slightly differently across the four target dialects. This is why we developed a separate scoring
sheet for each dialect. Each scoring sheet was based on a different set of “core elements”.
Table 8.4. Sentence 15 (Group B), Pandong dialect (PD). Word-for-word gloss and free translation
PD:     haŋ²         ȵi:t⁷  lən⁴  h̃ja:n⁵  tɕap⁸  ʔai³        tan³  ⁿduk⁷    lən⁴  kuŋ²
Gloss:  time/season  cold   very  cold    every  CLF-person  wear  clothes  very  many

PD:     to³  ȵa:u⁶  ɣa:n²  pʰja:u¹       ɥi¹   tɕum²  nau¹   pu³  me²  pa:i¹.
Gloss:  all  at/in  home   warm oneself  fire  place  which  all  NEG  go.

Free translation: It was very cold in the winter. Everyone wore lots of clothes and warmed themselves by the fire at home. No one went anywhere.

Table 8.5 gives the retellings of the four hometown subjects (i.e., the four subjects at datapoint PD who listened to this sentence). In general, any element that occurred in the original text and was also retold by at least two hometown subjects was counted as a core element.⁵
Table 8.5. Pandong village speaker retellings of Sentence 15 (Group B), Pandong dialect
Subject ref   Retelling (as retold in Chinese dialect, with English translation)
PD03          里面很冷,在家里烤火,哪里都不去。
              It was very cold inside. People warmed themselves by the fire at home and did not go anywhere.
PD05          这里很冷,每个人都穿很多,在家里烤火,哪里都不去。
              It was very cold here, everyone wore a lot and warmed themselves by the fire at home, they did not go anywhere.
PD06          冬天很冷,每个人穿着厚厚的衣服,在家里烤火,哪里都不去。
              It was very cold in the winter. Everyone wore very thick clothing and warmed themselves by the fire at home. They did not go anywhere.
PD07          那个地方很冷,很多都穿很多衣服,在家里烤火,哪里都不去。
              It was really cold there, very many people wore lots of clothes and warmed themselves by the fire at home. They did not go anywhere.

In total we identified nine core elements for this sentence: (1) very cold; (2) everyone (lots of/many people = 0.5); (3) wear; (4) very many/very thick; (5) clothes; (6) at home; (7) warmed by the fire; (8) anywhere; (9) didn’t go. We now explain how we identified these core elements.
In the Pandong version of this sentence, “winter” was translated as “cold time” or “cold season”. Only one hometown subject, PD06, retold this, so we did not include it as a core element. However, all
four subjects retold “very cold” exactly as in the PD original, hence we identified “very cold” as a core element. “Everyone” occurs in the PD version and was retold by two of the hometown subjects, hence
“everyone” was also counted as a core element. PD07 retold this as “very many”, which in our final scoring counted as half a point because “people” is assumed and “very many people” is close in meaning
to “everyone” (see section 8.4.6.2 on score assignment). “Wear” was retold by three subjects so it counts as a core element. The subsequent “a lot of” or “very many” occurs in the original Pandong version as a
modifier of “clothes” and was retold by PD05 and PD07. In practice, its meaning is identical to “very thick” (PD06 retelling), since Sui people tend to wear many layers of clothing, which can be described as
“thick” due to the number of layers rather than very thick individual articles of clothing. Thus we identified “lots of”, “very many” or “very thick” as another core element.⁶
“Clothes” was in the original PD sentence and was retold by two hometown subjects so we counted it as a core element. Finally, “at
home”, “warmed by the fire”, “anywhere” and “didn’t go” were retold by all hometown subjects, thus they were all identified as core elements.
⁵ Usually there were only three hometown retellings for each sentence because we only tested a total of nine subjects at each location (see Section 4.3). Due to time limitations we did not conduct a separate “hometown testing” phase
in which we specifically tested the four dialects on a larger pool of hometown subjects.
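The selection rule worked through above can be sketched in Python. This is an illustrative sketch only: the function name `identify_core_elements`, the `min_subjects` parameter and the English element labels are assumptions introduced here, and the retellings are hand-coded to follow the counts reported in the text. In practice, deciding whether a retelling matches an element (for example, treating “very thick” as equivalent to “very many”) was a matter of judgement rather than string matching.

```python
from collections import Counter


def identify_core_elements(original_elements, hometown_retellings, min_subjects=2):
    """Keep elements of the original that at least `min_subjects` hometown subjects retold."""
    counts = Counter()
    for retold in hometown_retellings.values():
        # Count each subject at most once per element and ignore anything
        # that does not occur in the original recording.
        counts.update(set(retold) & set(original_elements))
    return [e for e in original_elements if counts[e] >= min_subjects]


# Elements of the Pandong version of Sentence 15 (labels are hypothetical).
original = ["winter", "very cold", "everyone", "wear", "very many", "clothes",
            "at home", "warmed by the fire", "anywhere", "didn't go"]

# Hand-coded from table 8.5, following the counts given in the text.
always = {"very cold", "at home", "warmed by the fire", "anywhere", "didn't go"}
hometown = {
    "PD03": always,
    "PD05": always | {"everyone", "wear", "very many"},
    "PD06": always | {"winter", "everyone", "wear", "clothes"},
    "PD07": always | {"wear", "very many", "clothes"},
}

core = identify_core_elements(original, hometown)
print(core)
```

With this coding, “winter” is dropped because only PD06 retold it, and the nine remaining elements correspond to the core elements listed for this sentence.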
If the recording contained a modern Chinese loanword it was not counted as a core element even if all of the hometown subjects retold it. This is because speakers of other dialects may recognise and
correctly retell it owing to its being a Chinese loanword, not necessarily implying comprehension of the target Sui dialect. However, an old Chinese loanword could be counted as a core element. This is because
old Chinese loanwords are now an integral part of the Sui language. Their pronunciation sometimes differs radically from one dialect to another due to divergent sound changes. Modern and old Chinese
loanwords can be readily distinguished by comparing their pronunciation with the modern local Chinese dialect and with Middle Chinese.
Table 8.6 gives the Southern Sui translation of Sentence 22 (Group B) along with an English word-for-word gloss. It is immediately obvious to a Chinese speaker that ji⁶ wai⁴ (from 以为 ‘to mistakenly/wrongly think’) and ȶʰen⁴ pu¹ (from 全部 ‘all’) are modern Chinese loanwords. These two words were therefore not included as core elements.
Owing to differences in the way each sentence was translated into the four target dialects, we identified a separate set of core elements for each of our four RTTs. The core element charts for each RTT are provided in appendix E for reference purposes.
Table 8.6. Sentence 22 (Group B), Southern subdialect (SY). Word-for-word gloss and free translation
SY:     ʔȵam⁵    kun⁵    kʰaːŋ⁵      laːu⁴  tsjeːŋ⁴  to¹   taŋ¹  ʔai¹
Gloss:  evening  before  wind.blows  big    open     door  come  CLF-person

SY:     kuŋ²  ji⁶ wai⁴       ʔnaŋ¹  maːŋ¹  ȶʰen⁴ pu¹  tsam²  taŋ¹.
Gloss:  many  wrongly think  have   ghost  all        hide   come.

Free translation: The night before last, a strong wind blew open the door. Lots of people wrongly thought that it was a ghost and they all hid.
8.4.6.2 Score assignment
The RTTs were scored according to the core elements identified in 8.4.6.1. Each core element retold by the subject earned one point. If the basic meaning of a core element was retold but the subject worded it
in a different way from the Chinese translation of the recorded text, it still earned them one point. A partial retelling of a core element, or a retelling which was similar in meaning to the original but not
identical, would earn half a point.
Table 8.7 illustrates how the three RTT subjects in JR (Southern) who listened to Sentence 15 of the PD (Pandong) RTT were scored. JR07 earned half a point for “more” because it is similar to the core
element “very many” but not identical in meaning. JR09 earned no points at all. This subject did not understand this sentence and appears to have made something up to avoid embarrassment.
⁶ If a subject retold “very many” as part of the phrase “very many people” but did not retell “very many” or “very thick” as a modifier of “clothing”, we would not give them the point for the core element “very many clothes”, but
only the half point for the core element “everyone”.
Table 8.7. Sentence 15 (Group B), Pandong dialect (PD). JR (Southern) retellings. Core elements gaining one point are underlined with a solid line; partial core elements gaining half a point are underlined with a dotted line
PD original:    冬天很冷,每个人都穿着很多衣服,在家里烤火,哪里都不去。
                It was very cold in the winter. Everyone wore lots of clothes and warmed themselves by the fire at home. No one went anywhere.
Max score:      9
Core elements:  very cold, everyone, wear, very many/very thick, clothes, at home, warmed by the fire, anywhere, didn’t go

JR07            很冷,在家哪里都不去,出去穿多点衣服。
                It is very cold, stay at home and don’t go anywhere, and when you go out wear more clothes.
                Score: 6.5
JR08            在家里烤火,哪里都不去。
                warm themselves by the fire at home and don’t go anywhere
                Score: 4
JR09            得一个老婆还想娶另一个。
                He has already got one wife but he wants to marry another.
                Score: 0
Average score:  3.5
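The point scheme can be sketched as follows. This is a hedged illustration, not the scoring sheet itself: the function name `score_retelling` and the element labels are hypothetical, and the allocation of full and partial points to each subject is inferred from the English translations and the totals reported in table 8.7.

```python
def score_retelling(core_elements, full, partial=()):
    """One point per core element retold in full, half a point per partial retelling."""
    score = 0.0
    for element in core_elements:
        if element in full:
            score += 1.0
        elif element in partial:
            score += 0.5
    return score


core = ["very cold", "everyone", "wear", "very many/very thick", "clothes",
        "at home", "warmed by the fire", "anywhere", "didn't go"]

# (fully retold elements, partially retold elements) per subject; inferred coding.
retellings = {
    "JR07": ({"very cold", "wear", "clothes", "at home", "anywhere", "didn't go"},
             {"very many/very thick"}),  # "more" is similar to "very many": half a point
    "JR08": ({"at home", "warmed by the fire", "anywhere", "didn't go"}, set()),
    "JR09": (set(), set()),  # unrelated retelling: no points
}

scores = {subject: score_retelling(core, full, partial)
          for subject, (full, partial) in retellings.items()}
average = sum(scores.values()) / len(scores)
print(scores, average)
```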
We should point out that we deliberately chose this example as an illustration of scoring. In reality, such a wide range of scores for the same sentence was uncommon. This sentence is the only such
example among these three JR subjects listening to the PD Group B sentences. Their scores on almost all the other sentences were within one or two points of each other, giving us confidence that the overall
comprehension score for JR listening to PD Group B was reliable.
In some instances, a subject retold a core element correctly but we did not award them a point because they retold it in the wrong place in the sentence, indicating that they had not understood it in
its original context. For example, Sentence 11 (Group A) says, “She didn’t like salty, sour or spicy food, she would only eat sweet things, so
in the end all of her teeth fell out.” One subject retold this sentence as, “She didn’t like salty, sweet or sour food, so in the end all of her teeth fell out.” This subject did not earn a point for “sweet”
because he did not understand it in the correct context of “she would only eat sweet things”.
8.4.6.3 Calculating the overall scores
After assigning scores to each individual RTT subject, we entered all the scores for each RTT into a predesigned score chart. Table 8.8 shows our score chart for Zhonghe (ZH) subjects taking the Central
(GC) RTT. As illustrated in the table, the raw scores were converted into percentages and then averaged across all nine participants to give an overall score. Note that each group of sentences had a different
maximum score depending on the total number of core elements in that group. Converting the scores for each group into percentages ensured that each group of sentences was equally weighted in the overall
score.
Table 8.8. Scores for Zhonghe (ZH) subjects taking the Central Sui (GC) RTT
Sentences tested: SD Group A
Subject ref.     Raw score   Percentage
Max score:       111         100
ZH01             96.5        86.94
ZH02             107         96.40
ZH03             104.5       94.14
Average score:   102.67      92.49

Sentences tested: SD Group B
Subject ref.     Raw score   Percentage
Max score:       112         100
ZH04             106         94.64
ZH05             104.5       93.30
ZH07             91          81.25
Average score:   100.5       89.73

Sentences tested: SD Group C
Subject ref.     Raw score   Percentage
Max score:       112         100
ZH06             100         89.29
ZH08             106         94.64
ZH09             97          86.61
Average score:   101         90.18

Overall score for ZH subjects listening to SD RTT: (92.49 + 89.73 + 90.18) ÷ 3 = 90.8
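The arithmetic behind table 8.8 can be reproduced with a short script. It is a minimal sketch: the helper name `to_percent` and the data layout are assumptions made for illustration, while the raw scores and maximum scores are those given in the table.

```python
from statistics import mean


def to_percent(raw_score, max_score):
    """Convert a raw score to a percentage of the group's maximum score."""
    return 100 * raw_score / max_score


# group -> (maximum score, raw scores of the three subjects who heard that group)
groups = {
    "A": (111, {"ZH01": 96.5, "ZH02": 107, "ZH03": 104.5}),
    "B": (112, {"ZH04": 106, "ZH05": 104.5, "ZH07": 91}),
    "C": (112, {"ZH06": 100, "ZH08": 106, "ZH09": 97}),
}

# Average percentage per group, then the unweighted mean of the three groups,
# so that each group counts equally whatever its maximum score.
group_averages = {
    name: mean([to_percent(raw, max_score) for raw in scores.values()])
    for name, (max_score, scores) in groups.items()
}
overall = mean(group_averages.values())

for name, value in group_averages.items():
    print(f"Group {name}: {value:.2f}")
print(f"Overall: {overall:.1f}")
```

Averaging percentages rather than raw scores is what keeps Group A, with a maximum of 111, on the same footing as Groups B and C, with a maximum of 112.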
8.4.6.4 Adjusting the group scores
Theoretically, all subjects taking the test in their own dialect should gain full marks because they should have complete comprehension of their own dialect. This was rarely the case, however, because the
recordings were not always crystal clear and some sentences contained words which the text provider “swallowed” during recording. In order to maximise the comparability of the RTTs to each other, we
adjusted the scores so that each group of sentences in each RTT was equally weighted.
For example, the Shuiyao hometown subjects who listened to Groups A and C of the SY RTT achieved an average of 92 on both groups of sentences. Those listening to the Group B sentences only
achieved an average of 85.3. The fact that the latter subjects achieved a higher score than the former subjects on both of the other RTTs that they took (Central and Yang’an) indicates that their lower score
of 85.3 on the SY sentences was not due to a lesser ability in “test-taking” or in “translation”. Rather, it was probably due to the fact that the Group B SY recordings were not as easy for native speakers to
understand and retell as the Group A and C recordings. Therefore, for the SY RTT, we gave Group B a greater weighting than Groups A and C in order to ensure a fair, comparable overall score for the SY
RTT.
Similarly, our hometown test results indicated that the four RTTs were not directly comparable. For example, the overall hometown test score for the SY RTT was 89.8 whereas the hometown test score
for the PD RTT was 93.5. In order to make the results of each RTT comparable to each other, we adjusted the scores for each test based on the hypothesis that the hometown subjects had 100% comprehension
of their own tests. We did this by multiplying the listeners’ scores by the group weighting coefficient illustrated in table 8.10.
Tables 8.9 and 8.10 illustrate the mechanism by which we adjusted the RTT scores. For each test, each group of sentences was weighted using a “group weighting coefficient” calculated according to the
hometown subject results. Table 8.9 shows how we calculated the weighting coefficient for each group, in this case taking the SD RTT as an example. Table 8.10 illustrates how these coefficients were used
to adjust the SD RTT results in all datapoints, taking Zhonghe (ZH) results as an example.
Table 8.9. Scores for Sandong (SD) subjects taking the Central Sui (GC) RTT (hometown test), showing the calculation of group weighting coefficients used to adjust all test scores
Sentences tested: SD Group A
Subject ref.     Raw score   Percentage   Group weighting coefficient
Max score:       111         100
SD01             95.5        86.04
SD02             105.5       95.05
SD04             98.5        88.74
Average score:   99.83       89.94        100 ÷ 89.94 = 1.11185

Sentences tested: SD Group B
Subject ref.     Raw score   Percentage   Group weighting coefficient
Max score:       112         100
SD03             104         92.86
SD05             108.5       96.88
SD06             110         98.21
Average score:   107.5       95.98        100 ÷ 95.98 = 1.04186

Sentences tested: SD Group C
Subject ref.     Raw score   Percentage   Group weighting coefficient
Max score:       112         100
SD07             105         93.75
SD08             99          88.39
SD09             103.5       92.41
Average score:   102.5       91.52        100 ÷ 91.52 = 1.09268

Overall score for SD subjects listening to SD RTT: (89.94 + 95.98 + 91.52) ÷ 3 = 92.48
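The coefficient calculation in table 8.9, and its application to listeners' scores as described above, can be sketched as follows. Two points are assumptions rather than statements from the text: the coefficients are computed from unrounded hometown averages (which reproduces the five-decimal values in the table), and adjusted scores are capped at 100, which is inferred from the values of 100 shown in table 8.10. The function names are hypothetical.

```python
from statistics import mean


def weighting_coefficient(hometown_raw_scores, max_score):
    """100 divided by the hometown subjects' average percentage for one group."""
    hometown_avg_pct = 100 * mean(hometown_raw_scores) / max_score
    return 100 / hometown_avg_pct


def adjusted_score(raw_score, max_score, coefficient, cap=100.0):
    """Listener's percentage multiplied by the coefficient; the cap is an inference."""
    return min(100 * raw_score / max_score * coefficient, cap)


max_scores = {"A": 111, "B": 112, "C": 112}
# SD hometown raw scores (table 8.9) and ZH listener raw scores (table 8.8).
hometown = {"A": [95.5, 105.5, 98.5], "B": [104, 108.5, 110], "C": [105, 99, 103.5]}
listeners = {"A": [96.5, 107, 104.5], "B": [106, 104.5, 91], "C": [100, 106, 97]}

coefficients = {g: weighting_coefficient(hometown[g], max_scores[g]) for g in hometown}
adjusted_group_avgs = {
    g: mean([adjusted_score(raw, max_scores[g], coefficients[g]) for raw in listeners[g]])
    for g in listeners
}
overall_adjusted = mean(adjusted_group_avgs.values())

for g in coefficients:
    print(f"Group {g}: coefficient {coefficients[g]:.5f}, "
          f"adjusted average {adjusted_group_avgs[g]:.2f}")
print(f"Overall adjusted score: {overall_adjusted:.2f}")
```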
Table 8.10. Calculation of adjusted scores (each group equally weighted) for Zhonghe (ZH) subjects taking the Central Sui (GC) RTT
Sentences tested: SD Group A
Subject ref.     Raw score   Percentage   Group weighting coefficient   Adjusted score
Max score:       111         100
ZH01             96.5        86.94        × 1.11185 =                   96.66
ZH02             107         96.40        × 1.11185 =                   100
ZH03             104.5       94.14        × 1.11185 =                   100
Average score:   102.67      92.49                                      98.89

Sentences tested: SD Group B
Subject ref.     Raw score   Percentage   Group weighting coefficient   Adjusted score
Max score:       112         100
ZH04             106         94.64        × 1.04186 =                   98.60
ZH05             104.5       93.30        × 1.04186 =                   97.21
ZH07             91          81.25        × 1.04186 =                   84.65
Average score:   100.5       89.73                                      93.49

Sentences tested: SD Group C
Subject ref.     Raw score   Percentage   Group weighting coefficient   Adjusted score
Max score:       112         100
ZH06             100         89.29        × 1.09268 =                   97.56
ZH08             106         94.64        × 1.09268 =                   100
ZH09             97          86.61        × 1.09268 =                   94.63
Average score:   101         90.18                                      97.40

Overall score for ZH subjects listening to SD RTT: percentage = 90.8; adjusted score = 96.59
8.4.6.5 Raising the confidence limit
An extremely wide range of test scores (as indicated by the standard deviation) among subjects at the same location listening to the same RTT would suggest that the overall average score is unreliable
in some way (Blair 1990). In all cases where the standard deviation was over 10 we examined the individual test scores to see if there were any subjects who were skewing the overall score. If such a
subject could be identified and if we had external evidence to show that their result may be unreliable (for example from information garnered from the pre-RTT questionnaire or from the interviewer’s own
observations), we excluded their result from the overall score.
For example, the original average score for PD (Pandong dialect) subjects listening to the Central Sui (GC) RTT was 45.4. The standard deviation was high, at 14.0. On closer inspection we found that
one subject had a particularly high score of 71.2 whereas none of the other subjects achieved higher than 54.9. Our pre-RTT questionnaire indicated that the high-scoring subject was a graduate of the
teacher’s college in Duyun (the prefectural capital), where he would have had contact with many Sui speakers from Sandu county, including from the Central Sui area. Furthermore, he had worked in Duyun
for several years after graduating. After excluding his score from the results, the overall average was lower, at 42.2, and the standard deviation was much lower, at 10.9.
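The screening step can be sketched in Python. Several things here are assumptions: the text does not say whether a sample or population standard deviation was used (the sketch uses the sample form), the function names are hypothetical, and the nine scores are invented purely to exercise the code, since the individual PD scores are not listed in this section. The function only flags a candidate; the decision to exclude still rests on external evidence such as the questionnaire.

```python
from statistics import mean, stdev


def flag_outlier(scores, sd_threshold=10.0):
    """Return the subject furthest from the mean if the SD exceeds the threshold, else None."""
    values = list(scores.values())
    if stdev(values) <= sd_threshold:  # sample standard deviation (assumption)
        return None
    centre = mean(values)
    return max(scores, key=lambda subject: abs(scores[subject] - centre))


def mean_without(average, n, excluded_score):
    """Average of the remaining n - 1 scores after removing one score."""
    return (average * n - excluded_score) / (n - 1)


# Hypothetical scores, NOT the survey data.
hypothetical = {"S1": 40, "S2": 42, "S3": 38, "S4": 45, "S5": 41,
                "S6": 39, "S7": 44, "S8": 43, "S9": 80}
candidate = flag_outlier(hypothetical)
remaining = {s: v for s, v in hypothetical.items() if s != candidate}
print(candidate, flag_outlier(remaining))

# Consistency check on the figures reported in the text:
# nine subjects averaging 45.4, with the 71.2 score removed.
print(round(mean_without(45.4, 9, 71.2), 1))
```

The final line recomputes the post-exclusion average from the reported figures alone; the standard deviations of 14.0 and 10.9 cannot be recomputed without the individual scores.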
Due to the low number of subjects (nine at each location), we endeavoured not to exclude more than one subject from any one set of results. It would be difficult to show that divergent results of two or
more subjects in a population of just nine are statistically significant. All RTT scores, including original scores, post-adjustment scores and information on any subjects whose results were excluded from the
final scores, are provided in appendix F.
8.5 Results