Table 1. Sites from which wordlists were obtained
Language            State   District           Village
Bhumij              Bihar   Singhbhum          Champi
Bhumij              Bihar   Singhbhum          Ladhiramsai
Bhumij              Bihar   Singhbhum          Munduy
Bhumij              Orissa  Balasore           Baigodia
Bhumij              Orissa  Mayurbhanj         Dighinuasahi
Bhumij              Orissa  Mayurbhanj         Dumadie
Bhumij              Orissa  Mayurbhanj         Madhupur
Bhumij              Orissa  Mayurbhanj         Mohuldiha
Bhumij              Orissa  Mayurbhanj         Podadiha
Mundari             Bihar   Ranchi             Chalagi
Mundari             Orissa  Mayurbhanj         Dhungarisai
Mundari             Orissa  Sundargarh         Jharmunda
Mundari Dictionary  Bihar   Ranchi             --
Bhumij Mundari      Orissa  Mayurbhanj         Udala
Ho                  Orissa  Mayurbhanj         Dillisore
Santali             Orissa  Mayurbhanj         Nayarangamotia
Santali Dictionary  Bihar   Santal Parganas ?  --
Oriya               Orissa  Cuttack            --
2.1.3 Results and analysis
The lexical similarity percentages for the speech varieties investigated were calculated and are presented in the following two tables. The capital letters at the beginning of each site label refer to the speech variety: B – Bhumij, M – Mundari, BM – Bhumij Mundari, H – Ho, and S – Santali. The first chart, in table 2, is ordered by percentage, with the highest percentages placed nearest the top.
Table 2. Lexical similarity organised by percentages within each speech variety
B – Podadiha, Mayurbhanj
96 B – Madhupur, Mayurbhanj
94 91 B – Dumadie, Mayurbhanj
92 86 86 B – Champi, Singhbhum
86 83 83 82 B – Mohuldiha, Mayurbhanj
85 80 82 85 79 B – Munduy, Singhbhum
85 83 81 81 78 75 BM – Udala, Mayurbhanj
84 82 81 81 80 76 89 B – Baigodia, Balasore
84 78 79 79 76 77 82 81 B – Ladhiramsai, Singhbhum
77 73 74 75 77 73 83 81 77 B – Dighinuasahi, Mayurbhanj
83 81 80 79 76 74 94 87 80 79 M – Dhungarisai, Mayurbhanj
78 72 76 74 72 73 78 74 83 73 75 M – Dictionary
78 72 76 73 74 77 69 71 74 68 67 79 M – Jharmunda, Sundargarh
77 70 72 72 71 71 74 72 82 74 72 84 74 M – Chalagi, Ranchi
75 71 71 71 69 72 71 70 80 66 69 75 74 77 H – Dillisore, Mayurbhanj
71 69 72 68 72 66 67 71 65 72 66 65 68 65 61 S – Nayarangamotia, Mayurbhanj
73 69 70 66 66 63 67 70 73 64 67 76 66 68 69 79 S – Dictionary
20 20 17 20 20 19 18 21 13 18 21 13 12 10 15 18 17 Oriya, Cuttack
Groups: B and BM rows, Bhumij speech varieties; M rows, Mundari speech varieties; H row, Ho; S rows, Santali.
The lexical similarity chart in table 3 is organised by geographic location, roughly from north to south. The exceptions are Jharmunda, which lies further west, and the sites for varieties other than Bhumij and Mundari.
Table 3. Lexical similarity organised by geography within each speech variety
B – Ladhiramsai, Singhbhum
79 B – Champi, Singhbhum
76 82 B – Mohuldiha, Mayurbhanj
78 86 83 B – Madhupur, Mayurbhanj
79 86 83 91 B – Dumadie, Mayurbhanj
84 92 86 96 94 B – Podadiha, Mayurbhanj
77 85 79 80 82 85 B – Munduy, Singhbhum
77 75 77 73 74 77 73 B – Dighinuasahi, Mayurbhanj
82 81 78 83 81 85 75 83 BM – Udala, Mayurbhanj
81 81 80 82 81 84 76 81 89 B – Baigodia, Balasore
82 72 71 70 72 77 71 74 74 72 M – Chalagi, Ranchi
83 74 72 72 76 78 73 73 78 74 84 M – Dictionary
80 79 76 81 80 83 74 79 94 87 72 75 M – Dhungarisai, Mayurbhanj
74 73 74 72 76 78 77 68 69 71 74 79 67 M – Jharmunda, Sundargarh
80 71 69 71 71 75 72 66 71 70 77 75 69 74 H – Dillisore, Mayurbhanj
73 66 66 69 70 73 63 64 67 70 68 76 67 66 69 S – Dictionary
65 68 72 69 72 71 66 72 67 71 65 65 66 68 61 79 S – Nayarangamotia, Mayurbhanj
13 20 20 20 17 20 19 18 18 21 10 13 21 12 15 17 18 Oriya, Cuttack
Groups: B and BM rows, Bhumij speech varieties; M rows, Mundari speech varieties; H row, Ho; S rows, Santali.
Bhumij and Mundari vocabulary
Typically, when two speech varieties have less than 60% lexical similarity, it can be concluded that they are quite distinct and, with other supporting evidence, can be classed as separate languages (Blair 1990:24). The Bhumij and Mundari wordlist sites, however, show percentages above this 60% threshold. Pairwise comparisons among the Bhumij wordlist sites, including Bhumij Mundari from Udala, yield an average of 82% similarity (45 comparisons). Comparisons of Bhumij sites with Mundari sites (including the dictionary list, which scored approximately the same as the elicited wordlists in most cases, and excluding the Udala wordlist) yield an average of 76% (36 comparisons). This indicates that some difference exists, but not enough to warrant classification as separate languages, especially since the Mundari wordlist sites, including the Bhumij Mundari wordlist, average only 77% similarity with one another (10 comparisons). For this reason, it is not possible to say from the data that Bhumij and Mundari are separate languages.
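The group averages quoted in this paragraph can be checked against the figures in table 2. The following is only a sketch of one way to do so, not the procedure used for the report: the scores are transcribed from the first 14 rows of table 2, and the short site labels and the helper names (`score`, `within`, `between`) are introduced here purely for illustration.

```python
from itertools import combinations
from statistics import mean

# Site labels in table 2 order (Bhumij, Bhumij Mundari and Mundari lists only).
# The short labels are illustrative, not the report's own notation.
SITES = [
    "B-Podadiha", "B-Madhupur", "B-Dumadie", "B-Champi", "B-Mohuldiha",
    "B-Munduy", "BM-Udala", "B-Baigodia", "B-Ladhiramsai", "B-Dighinuasahi",
    "M-Dhungarisai", "M-Dictionary", "M-Jharmunda", "M-Chalagi",
]

# Lower triangle transcribed from table 2:
# row i holds the scores of site i against sites 0..i-1.
LOWER = [
    [],
    [96],
    [94, 91],
    [92, 86, 86],
    [86, 83, 83, 82],
    [85, 80, 82, 85, 79],
    [85, 83, 81, 81, 78, 75],
    [84, 82, 81, 81, 80, 76, 89],
    [84, 78, 79, 79, 76, 77, 82, 81],
    [77, 73, 74, 75, 77, 73, 83, 81, 77],
    [83, 81, 80, 79, 76, 74, 94, 87, 80, 79],
    [78, 72, 76, 74, 72, 73, 78, 74, 83, 73, 75],
    [78, 72, 76, 73, 74, 77, 69, 71, 74, 68, 67, 79],
    [77, 70, 72, 72, 71, 71, 74, 72, 82, 74, 72, 84, 74],
]

def score(a, b):
    """Look up the similarity score for a pair of sites, in either order."""
    i, j = SITES.index(a), SITES.index(b)
    if i < j:
        i, j = j, i
    return LOWER[i][j]

def within(group):
    """All pairwise scores inside one group of sites."""
    return [score(a, b) for a, b in combinations(group, 2)]

def between(group_a, group_b):
    """All scores pairing a site from one group with a site from the other."""
    return [score(a, b) for a in group_a for b in group_b]

bhumij_only = [s for s in SITES if s.startswith("B-")]
mundari_only = [s for s in SITES if s.startswith("M-")]

# Groupings as described in the paragraph above.
bhumij_internal = within(bhumij_only + ["BM-Udala"])     # Udala included
bhumij_vs_mundari = between(bhumij_only, mundari_only)   # Udala excluded
mundari_internal = within(mundari_only + ["BM-Udala"])   # Udala included

for name, scores in [
    ("Bhumij internal", bhumij_internal),
    ("Bhumij vs Mundari", bhumij_vs_mundari),
    ("Mundari internal", mundari_internal),
]:
    print(f"{name}: {len(scores)} comparisons, mean {mean(scores):.1f}")
```

Under these assumptions the three groups contain 45, 36 and 10 comparisons, and the means come out at about 81.6, 75.5 and 76.6, which round to the 82, 76 and 77 quoted above.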
One issue that should be investigated is the relative homogeneity of Mundari, especially in comparison with Bhumij. One might expect the Mundari sites to show higher scores in Mundari-internal comparisons than the Bhumij sites show in Bhumij-internal comparisons. This expectation would be based on various descriptions of Bhumij, particularly that of Risley (1891, reprinted in 1981), who says the Bhumij tend to adopt the speech forms of whatever place they are living in. Many Bhumij communities in West Bengal have cut ties with the tribal groups and have assimilated to Bengali language and culture. If this view of the Bhumij as a non-conservative group is accurate, much greater variation would be expected between their wordlist sites than is found in the more conservative Mundari locations. This question cannot be adequately addressed because there is not enough Mundari data (one of the wordlists is from a dictionary and is therefore not strictly comparable). Nonetheless, it is interesting to note that the Bhumij wordlist points appear to show more homogeneity than the Mundari wordlist points. This might be evidence for a set of distinctly “Bhumij” vocabulary within Mundari (not separate from it, since the Bhumij-Mundari comparisons are high). As it stands, however, it cannot be concluded whether the lower in-group scores for Mundari reflect the wider geographic spread of the sites, which brings down the in-group average, or whether Mundari is indeed a macro language subsuming Bhumij, as early research suggests. Additional Mundari wordlists could help clarify this issue.
There is a large degree of variation (73–96%) when comparing Bhumij wordlists with each other, which does not seem explainable by geographical location. Both the southern Mayurbhanj district sites (Udala and Baigodia) have scores in the mid-70s to mid-80s when compared with the sites clustered around the Bihar-Orissa border. The only exception to this may be Dighinuasahi, which shows less similarity (mid-70s) with the border sites to the north. The Bhumij wordlist percentages appear to indicate no clear dialect groupings.
One question raised in the Varenkamp report (1989) was whether the people of Udala considered their speech Bhumij or Mundari. As it turns out, this issue of ambiguous language identity became one of the motivating questions for this project. The lexical similarity percentages in the previous charts seem to indicate that the Udala wordlist is slightly closer to the vocabulary of Bhumij sites than to that of Mundari sites, if such a distinction can be made.
Comparison with neighbouring languages
While not central to the team’s research goals, two Santali wordlists and a Ho wordlist from Varenkamp’s reports were used in the lexical similarity comparison. If these lists can be taken as representing their respective languages, the lexical similarity percentages confirm the belief that Santali is more distinct (68% similar on average) from Bhumij and Mundari. Ho seems to share slightly more resemblance (72% on average) with the Bhumij and Mundari wordlists. Oriya, the Indo-Aryan state language of Orissa, not surprisingly shares very little in common (10–21%) with the Munda family wordlists.
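The averages for the neighbouring languages can be checked in the same way. This is a minimal sketch, assuming that the Ho and Santali averages were taken over the 14 Bhumij, Bhumij Mundari and Mundari lists (the first 14 columns of table 2) and that the Oriya range covers all 17 Munda lists; the report does not state this explicitly, and the variable names are illustrative.

```python
from statistics import mean

# Scores transcribed from table 2, restricted to the first 14 columns
# (the Bhumij, Bhumij Mundari and Mundari lists).
HO_DILLISORE = [75, 71, 71, 71, 69, 72, 71, 70, 80, 66, 69, 75, 74, 77]
SANTALI_NAYARANGAMOTIA = [71, 69, 72, 68, 72, 66, 67, 71, 65, 72, 66, 65, 68, 65]
SANTALI_DICTIONARY = [73, 69, 70, 66, 66, 63, 67, 70, 73, 64, 67, 76, 66, 68]

# Full Oriya row of table 2: scores against all 17 Munda family lists.
ORIYA_CUTTACK = [20, 20, 17, 20, 20, 19, 18, 21, 13, 18, 21, 13, 12, 10, 15, 18, 17]

# Assumption: both Santali lists are pooled into a single average.
ho_mean = mean(HO_DILLISORE)
santali_mean = mean(SANTALI_NAYARANGAMOTIA + SANTALI_DICTIONARY)
oriya_range = (min(ORIYA_CUTTACK), max(ORIYA_CUTTACK))

print(f"Ho vs Bhumij/Mundari lists: mean {ho_mean:.1f}")
print(f"Santali vs Bhumij/Mundari lists: mean {santali_mean:.1f}")
print(f"Oriya vs Munda lists: range {oriya_range[0]}-{oriya_range[1]}")
```

With these inputs the Ho mean rounds to 72, the pooled Santali mean rounds to 68, and the Oriya scores span 10 to 21, in line with the figures quoted above.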
2.2 Intelligibility testing