4.2 The Intercept

  When we regress base frequency on derived frequency, the intercept of the resulting line displays a fair amount of variation across affixes. In the graphs in Figure 1, for example, the intercept for -ment (i.e. the place where the line crosses the Y axis) is 1.25, and the intercept for -ism is 3.53. A consequence of this difference is that fewer points fall below the x=y line (and also below the more empirically motivated parsing line, not shown in Figure 1) for -ism than for -ment.
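  The link between the intercept and the number of points below the x=y line can be sketched in a few lines of Python. This is a minimal illustration, assuming an ordinary least-squares fit and invented toy values; the names `fit_line` and `share_below_diagonal` are hypothetical, and none of the numbers come from the data behind Figure 1.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def share_below_diagonal(xs, ys):
    """Fraction of points with y < x (base frequency below derived frequency)."""
    return sum(1 for x, y in zip(xs, ys) if y < x) / len(xs)


# Hypothetical toy data (NOT from the paper): derived frequency on the x axis,
# base frequency on the y axis, for two invented affixes with the same slope.
derived = [0.0, 1.0, 2.0, 3.0, 4.0]
base_low = [1.0, 1.5, 2.0, 2.5, 3.0]    # affix with a low intercept
base_high = [3.5, 4.0, 4.5, 5.0, 5.5]   # affix with a high intercept

low_intercept, low_slope = fit_line(derived, base_low)
high_intercept, high_slope = fit_line(derived, base_high)
low_share = share_below_diagonal(derived, base_low)
high_share = share_below_diagonal(derived, base_high)

print(low_intercept, low_share)
print(high_intercept, high_share)
```

  With these toy values the two lines share a slope, yet the higher-intercept set has no points below the diagonal while the lower-intercept set has two out of five, which is the same direction of difference described for -ism and -ment.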

  From a production perspective, the intercept (regardless of the slope of the line) indicates how frequent a base word needs to be before it is likely to spawn an affixed word. Affixes with high intercepts are affixes for which the base words are fairly high frequency relative to the derived forms. For such affixes, a base word needs to be fairly frequent before it is likely to spawn an affixed word.

  From a perception perspective, a high intercept reflects an overall pattern in which base frequencies tend to be high relative to derived frequencies. That is, it reflects a distribution in which many words are prone to parsing, and very few are prone to whole word access. A low intercept, on the other hand, would reflect a distribution in which a larger proportion of forms fall below the parsing line. Such a distribution has a larger proportion of forms which are prone to whole word access.

  Taking the claims about production and perception together, we reach a surprising conclusion — but one which is nonetheless borne out by the empirical facts. The less useful an affix is (in terms of the degree of use of the words it creates – not in terms of how many different words it could potentially create), the more likely it is to be parsed, and so the more productive it is likely to be. Relatively useless affixes remain productive because their derived forms remain low frequency relative to the frequency of the base words. This leads to high rates of parsing, and so to a robust representation of the affix.

  Evidence for the existence of a relationship between the intercept and productivity can be seen in the graphs in Figure 4. These graphs are based on robust regression lines fit through individual affixes (a small subset of which were shown in Figure 1). Each point in the graphs in Figure 4 represents a single affix. The X-axis of the graphs shows the intercept for robust regression lines fit through derived and base frequency for each individual affix. Of the 80 affixes, 44 show a significant correlation between derived frequency and base frequency. All 80 are shown here, however, as the intercept is relevant (both from a production and a perception perspective), regardless of the significance of the slope of the line with which it is associated. For those with a non-significant slope, the intercept merely reflects the average base frequency.
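  The by-affix fitting step can be sketched as follows. The text does not say which robust regression method was used, so this example swaps in the Theil-Sen estimator (median of pairwise slopes) purely as a stand-in; the affix labels, the data, and the function name `theil_sen` are hypothetical.

```python
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Theil-Sen line: median pairwise slope, then median residual offset.

    Returns (intercept, slope). Used here only as one example of a robust
    fit; it is an assumption, not the method named in the text.
    """
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1  # skip vertical pairs, which have no defined slope
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return intercept, slope


# Hypothetical toy data: affix label -> (derived frequencies, base frequencies).
by_affix = {
    "affix_a": ([0.0, 1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0, 20.0]),  # one outlier
    "affix_b": ([0.0, 1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 3.0, 4.0, 3.0]),   # no trend
}

# One (intercept, slope) pair per affix, as in a by-affix analysis.
fits = {affix: theil_sen(xs, ys) for affix, (xs, ys) in by_affix.items()}

for affix, (intercept, slope) in fits.items():
    print(affix, intercept, slope)
```

  In this sketch the outlier in the first toy affix does not move the fitted line, and the second toy affix, which has no trend, gets a slope of zero and an intercept equal to its median base value. That mirrors the remark above: for an ordinary least-squares line the intercept is the mean of y minus the slope times the mean of x, so as the slope approaches zero the intercept approaches the average base frequency.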

  The left panel of Figure 4 shows a significant relationship between the intercept and the productivity measure. Affixes which return high intercept values when base frequency is regressed on derived frequency show significantly higher levels of productivity. The right panel shows the relationship between the value of the intercept and the total number of tokens containing the affix. Affixes with high token frequency (towards the top of the graph) are more likely to be represented by high frequency words, which fall below the parsing line, and so are prone to whole word access. When we regress base frequency on derived frequency, large numbers of high frequency words will lower the value of the intercept by pulling the regression line closer to the X-axis.

  Figure 4: The relation between the intercepts of the by-affix linear models regressing base frequency on derived frequency, and the log of the productivity measure (left panel) and the log of the number of tokens (right panel). Every point represents an affix. The lines represent a non-parametric scatterplot smoother (Cleveland, 1979) fit through the points. rs=non-parametric correlation (Spearman’s rho). [Panel annotation: rs=−0.38, p=0]

  The results shown in Figure 4, then, further demonstrate that the relationship between base frequency and derived frequency for a given individual affix profoundly influences that affix’s degree of productivity.
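  The rank correlation reported in Figure 4 (Spearman's rho) can be computed as the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below assumes that standard definition; the function names and the five data points are invented for illustration and do not reproduce the values behind the figure.

```python
import math


def average_ranks(values):
    """1-based ranks of values; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical toy data: one intercept and one log token count per affix.
intercepts = [1.2, 2.0, 2.8, 3.5, 4.1]
log_tokens = [9.0, 9.5, 7.0, 6.5, 6.0]

rho = spearman_rho(intercepts, log_tokens)
print(rho)
```

  A negative rho in this toy setup corresponds to the pattern described for the right panel, where affixes with many tokens tend to have lower intercepts. Because only ranks enter the calculation, the result is unchanged by any monotone rescaling of either variable, such as taking logs.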
