4.2 The Intercept
When we regress base frequency on derived frequency, the intercept of the resulting line displays a fair amount of variation across affixes. In the graphs in Figure 1, for example, the intercept for -ment (i.e. the place where the line crosses the Y axis) is 1.25, and the intercept for -ism is 3.53. A consequence of this difference is that fewer points fall below the x=y line (and also below the more empirically motivated parsing line, not shown in Figure 1) for -ism than for -ment.
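One way to see why a higher intercept leaves fewer points below the x=y line is to ask where a fitted line crosses x=y. The sketch below is illustrative only: the intercepts 1.25 and 3.53 are the values cited above, but the shared slope of 0.5 is a hypothetical value chosen for the example, not a figure from the analysis, and `crossing_point` is a made-up helper name.

```python
def crossing_point(intercept, slope):
    """Return the x at which y = intercept + slope * x meets the line y = x.

    Returns None when the fitted line is parallel to x = y (slope == 1).
    """
    if slope == 1:
        return None
    return intercept / (1 - slope)


# Intercepts cited in the text; the slope is an assumption for illustration.
HYPOTHETICAL_SLOPE = 0.5
ment_crossing = crossing_point(1.25, HYPOTHETICAL_SLOPE)
ism_crossing = crossing_point(3.53, HYPOTHETICAL_SLOPE)
```

For any slope below 1, the fitted line dips under x=y only to the right of this crossing. Under the assumed slope, the higher intercept pushes the crossing to a larger derived frequency, so fewer forms along the line end up with a derived frequency above their base frequency.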
From a production perspective, the intercept (regardless of the slope of the line) indicates how frequent a base word needs to be before it is likely to spawn an affixed word. Affixes with high intercepts are affixes for which the base words are fairly high frequency relative to the derived forms. For such affixes, a base word needs to be fairly frequent before it is likely to spawn an affixed word.
From a perception perspective, a high intercept reflects an overall pattern in which base frequencies tend to be high relative to derived frequencies. That is, it reflects a distribution in which many words are prone to parsing, and very few are prone to whole word access. A low intercept, on the other hand, would reflect a distribution in which a larger proportion of forms fall below the parsing line. Such a distribution has a larger proportion of forms which are prone to whole word access.
Taking the claims about production and perception together, we reach a surprising conclusion — but one which is nonetheless borne out by the empirical facts. The less useful an affix is (in terms of the degree of use of the words it creates – not in terms of how many different words it could potentially create), the more likely it is to be parsed, and so the more productive it is likely to be. Relatively useless affixes remain productive because their derived forms remain low frequency relative to the frequency of the base words. This leads to high rates of parsing, and so to a robust representation of the affix.
Evidence for the existence of a relationship between the intercept and productivity can be seen in the graphs in Figure 4. These graphs are based on robust regression lines fit through individual affixes (a small subset of which were shown in Figure 1). Each point in the graphs in Figure 4 represents a single affix. The X-axis of the graphs shows the intercept for robust regression lines fit through derived and base frequency for each individual affix. Of the 80 affixes, 44 show a significant correlation between derived frequency and base frequency. All 80 are shown here, however, as the intercept is relevant (both from a production and a perception perspective), regardless of the significance of the slope of the line with which it is associated. For those with a non-significant slope, the intercept merely reflects the average base frequency.
The left panel of Figure 4 shows a significant relationship between the intercept and productivity as measured by . Affixes which return high intercept values when base frequency is regressed on derived frequency show significantly higher levels of productivity. The right panel shows the relationship between the value of the intercept and the total number of tokens containing the affix. Affixes with high token frequency (towards the top of the graph) are more likely to be represented by high frequency words, which fall below the parsing line, and so are prone to whole word access. When we regress base frequency on derived frequency, large numbers of high frequency words will lower the value of the intercept by pulling the regression line closer to the X-axis.
Figure 4: The relation between the Intercepts of the by-affix linear models regressing base frequency on derived frequency, and log productivity (left panel) and log token frequency (right panel). Every point represents an affix. The lines represent a non-parametric scatterplot smoother (Cleveland, 1979) fit through the points. rs=non-parametric correlation (Spearman’s rho). Panel annotation: rs=−0.38, p=0
The results shown in Figure 4, then, further demonstrate that the relationship between base frequency and derived frequency for a given individual affix profoundly influences that affix’s degree of productivity.
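The rank correlations reported for Figure 4 (Spearman's rho) can be computed as sketched below. This is an illustrative implementation using only the standard library; the helper names and the five data points are hypothetical and are not taken from the 80 affixes analysed here.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical by-affix intercepts and log token counts, for illustration only.
intercepts = [1.25, 3.53, 2.0, 2.8, 0.9]
log_tokens = [9.1, 5.2, 8.0, 8.4, 7.5]

rho = spearman_rho(intercepts, log_tokens)
```

Because only the ranks enter the calculation, the statistic is insensitive to the exact scale of the frequency measures, which suits comparisons across affixes whose token counts differ by orders of magnitude. On the toy values above, rho comes out at -0.3.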