The linguistic features affecting the choice of ditransitive constructions in English


THE LINGUISTIC FEATURES AFFECTING THE CHOICE

OF DITRANSITIVE CONSTRUCTIONS IN ENGLISH

A THESIS

Presented as Partial Fulfilment of the Requirements to Obtain the Degree of Magister Humaniora (M. Hum.)

in English Language Studies

by

Eko Windu Prasetya

Student Number: 116332022

THE GRADUATE PROGRAM IN ENGLISH LANGUAGE STUDIES

SANATA DHARMA UNIVERSITY

YOGYAKARTA

2014



STATEMENT OF WORK ORIGINALITY

This is to certify that all the ideas, phrases, and sentences, unless otherwise stated, are the ideas, phrases, and sentences of the thesis writer. The writer understands the full consequences, including degree cancellation, if he has taken somebody else's ideas, phrases, or sentences without proper references.



STATEMENT OF CONSENT TO THE PUBLICATION OF ACADEMIC WORK FOR ACADEMIC PURPOSES

I, the undersigned, a student of Sanata Dharma University:

Name : Eko Windu Prasetya

Student Number : 116332022

for the advancement of knowledge, grant to the Sanata Dharma University Library my academic work entitled:

THE LINGUISTIC FEATURES AFFECTING THE CHOICE

OF DITRANSITIVE CONSTRUCTIONS IN ENGLISH

together with the necessary supporting materials. The writer thereby grants the Sanata Dharma University Library the right to store the work, transfer it to other media, manage it in the form of a database, distribute it on a limited basis, and publish it on the internet or in other media for academic purposes, without needing to ask the writer's permission or to pay royalties to the writer, as long as the writer's name remains cited.

This statement is made truthfully. Made in : Yogyakarta



ACKNOWLEDGEMENTS

This research would hardly have been possible to finish without the support of the many people around me. I would like to express my gratitude to those who have contributed to its completion.

I would like to express my very great appreciation to Dr. B.B. Dwijatmoko, M.A., my thesis supervisor, for his valuable and constructive suggestions during the planning and development of this research. His patient guidance, enthusiastic encouragement, useful critiques, and tireless effort in keeping my progress on schedule have been very much appreciated. I thank you for not giving up on me and for helping me reach the completion of this thesis. My deepest gratitude is also due to the members of the supervisory committee, Dr. Fr. B. Alip, M.Pd., F.X. Mukarto, Ph.D., and Drs. Barli Bram, M.Ed., Ph.D., without whose knowledge and assistance this research would not have been successful. Their feedback and suggestions have greatly improved my thesis.

I am particularly grateful for the assistance given by the lecturers in the Graduate Program. The knowledge, insights, and encouragement from Prof. Dr. Soepomo Poedjosoedarmo, Dr. J. Bismoko, Dr. Novita Dewi, M.S., M.A. (Hons), Dr. F.X. Siswadi, M.A., Prof. Dr. C. Bakdi Soemanto, S.U., and Dr. Alb. Susanto, S.J. during my study at Sanata Dharma have made me a better person.

I would also like to extend my thanks to the Graduate Program staff, Ms. Lely, for keeping me in the loop with every information update. I am also grateful for the kindness shown by Pak Antonius Mulyadi, whose many conversations with me inspired me and conveyed to me the spirit of learning from nothing.

To my friends in KBI, who did not come into my life by accident but were drawn closer by God's plan, I thank you. I thank you for the thoughts, well-wishes and prayers, phone calls, short messages, e-mails, visits, editing advice, and for being there whenever I needed a friend.

An honorable mention goes to my beloved family, who inspired and fully supported me. I also thank them for giving me not only financial but also moral and spiritual support. Last but not least, my special thanks are extended to my lovely wife, Yunita Hening Herdiyati F., S.Pd, M.Hum., for accompanying and supporting me all of the time.

Above all, I would like to express my sincere gratitude to the lovely Jesus Christ, who has blessed me in my years of study and in finishing this thesis. I thank Him for guiding me into all the truth and for making straight my paths, including during my work on the thesis. I thank Him for giving generously and without reproach all the knowledge I lack when I ask. He is the sculptor of me, and He shapes my life into the best I can be.





TABLE OF CONTENTS

TITLE PAGE
APPROVAL PAGE
DEFENSE APPROVAL PAGE
STATEMENT OF ORIGINALITY
STATEMENT OF PUBLICATION CONSENT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF APPENDICES
LIST OF ABBREVIATIONS
ABSTRACT
ABSTRAK

CHAPTER 1 INTRODUCTION
1.1 Background of the Study
1.2 Problem Limitations
1.3 Research Questions
1.4 Research Objectives
1.5 Research Benefits

CHAPTER 2 THEORETICAL REVIEW
2.1 Ditransitivity
2.1.1 Dative
2.1.2 Benefactive
2.2 The Linguistic Features Relevant to Benefactive and Dative
2.2.1 Semantic Verb Class
2.2.2 Syntactic Complexity of Theme and Beneficiary
2.2.3 Animacy of Theme and Beneficiary
2.2.4 Discourse Accessibility of Theme and Beneficiary
2.2.5 Pronominality of Theme and Beneficiary
2.2.6 Concreteness of Theme
2.2.7 Person of Beneficiary
2.2.8 Number of Theme and Beneficiary
2.2.9 Definiteness of Theme and Beneficiary
2.3 Corpus Linguistics
2.4 Probabilistic Model of Logistic Regressions
2.5 Related Studies
2.6 Theoretical Framework

CHAPTER 3 RESEARCH METHODOLOGY
3.1 Research Type
3.2 Research Data
3.3 Data Analysis
3.4 Triangulation

CHAPTER 4 ANALYSIS RESULTS AND DISCUSSION
4.1 The Syntactic and Semantic Features Affecting Benefactive Construction
4.1.1 Givenness of Beneficiary
4.1.2 Animacy of Beneficiary
4.1.3 Pronominality of Theme
4.1.4 Definiteness of Theme
4.1.5 Person of Beneficiary
4.1.6 Syntactic Complexity
4.2 Features Relevant on Dative and Benefactive
4.2.1 Effect Direction and Size of the Linguistic Features to Ditransitive
4.2.2 The Interchangeability of the Significant Features to Ditransitive

CHAPTER 5 CONCLUSIONS AND RECOMMENDATIONS
5.1 Conclusions
5.2 Recommendations

BIBLIOGRAPHY



LIST OF TABLES

Table 4.1 Classification model table showing the probabilistic binary logistic regression model accuracy with fourteen variables
Table 4.2 Multivariable binary logistic model summary table
Table 4.3 Table of fourteen variables in the equation of the probabilistic binary logistic regression model
Table 4.4 Crosstabulation of givenness of beneficiary toward ditransitivity
Table 4.5 Variable givenness of beneficiary removed from full model
Table 4.6 Crosstabulation of animacy of beneficiary toward ditransitivity
Table 4.7 Variable animacy of beneficiary removed from full model
Table 4.8 Crosstabulation of pronominality of theme toward ditransitivity
Table 4.9 Variable pronominality of theme removed from full model
Table 4.10 Crosstabulation of definiteness of theme toward ditransitivity
Table 4.11 Variable definiteness of theme removed from full model
Table 4.12 Crosstabulation of person of beneficiary toward ditransitivity
Table 4.13 Variable person of beneficiary removed from full model
Table 4.14 Variable syntactic complexity removed from full model
Table 4.15 Table of six variables in the equation



LIST OF FIGURES

Figure 2.1 Terminology in benefactive construction
Figure 2.2 Terminology in benefactive construction
Figure 3.1 The process of ten-fold cross-validation
Figure 4.1 Benefactive frequencies in COHA data set based on the PP and DO realizations
Figure 4.2 Tabular data showing distribution of syntactic complexity in benefactive alternation
Figure 4.3 Effect sizes of the significant features to the choice of benefactive construction
Figure 4.4 Model plots of observed against estimated responses
Figure 4.5 Excel processing of PP realization prediction



LIST OF APPENDICES

Appendix 1 : List of Benefactive Instances Chosen for Data Set
Appendix 2 : Annotating and Coding Process of the Data Set
Appendix 3 : The Relevance of Features Analysis
Appendix 4 : Benefactive Construction Realization Analysis
Appendix 5 : Ten-Fold Cross-Validation Analysis
Appendix 6 : External Validation Analysis
Appendix 7 : Mixed-Effect Binary Regression Analysis of Benefactive Construction



LIST OF ABBREVIATIONS

S : Subject

V : Verb

O : Object

PP : Prepositional Phrase

COHA : Corpus of Historical American English

Io : Indirect Object

DO : Direct Object

prepto : Preposition to

Op : Object of Preposition

CS : Conceptual Construction

MAva : Make Available

VoCr : Verb of Creation

VPrf : Verb of Performance

Vpre : Verb of Preparation

VIdi : Verb with Idiomatic Meaning

NP : Noun Phrase

DOC : Double Object Construction

β/B : Coefficient

x : Linguistic Feature

p : Probability

S.E. : Standard Error

df : Degree of Freedom

Sig : Significance

exp (B) : Odds Ratio


ABSTRACT

Prasetya, Eko. 2013. The Linguistic Features Affecting the Choice of Ditransitive Constructions in English. Yogyakarta: The Graduate Program in English Language Studies, Sanata Dharma University.

The alternation of the benefactive construction between the NP NP and NP PP realizations appears to be troublesome for many language users. As theoretical linguistics traditionally relies on linguistic intuition, such as grammaticality judgement, for such data, low-proficiency and non-native speakers find this benefactive construction problem hard to solve. While the problem of how language users decide which structure to use has been analyzed using syntactic, semantic, and discourse approaches, this research proposes an analysis of the benefactive construction using probabilistic grammar. This research combines corpus linguistic study with the probabilistic logit formula obtained from binary logistic regression. The data set of benefactive constructions taken from the Corpus of Historical American English (COHA) shows the tendencies in the occurrence of the construction. The probabilistic model of the benefactive is built from the analysis of the corpus data and is then entered into the logit formula to predict the occurrence of the benefactive construction.

Two research questions were explored in this research. The first research question was: What linguistic features affect the choice of benefactive construction? The second research question was: How do the significant features differ in effect size on ditransitive construction?

In an attempt to answer the research questions, twelve theories were employed: the theories of ditransitivity, semantic class of verbs, syntactic complexity, animacy of theme and beneficiary, discourse givenness of theme and beneficiary, pronominality of theme and beneficiary, concreteness of theme, person of beneficiary, number of theme and beneficiary, definiteness of theme and beneficiary, corpus linguistics, and the probabilistic model of logistic regression. The theories were used during the annotation process and to analyze the significance of the features for the benefactive construction.

A conclusion drawn from the process of selecting and annotating the data from COHA is that the choice of benefactive alternation is hardly ever influenced by a single feature on its own. Mixed-effect features appear to be the best explanation for the choice of benefactive construction. The mixed-effect binary logistic analysis shows that six linguistic features are significant for the benefactive alternation: syntactic complexity, animacy of beneficiary, givenness of beneficiary, pronominality of theme, person of beneficiary, and definiteness of theme. Based on the probabilistic analysis, these significant features together are able to predict the occurrence of the benefactive construction with 90% accuracy. Furthermore, applying the benefactive probabilistic model to dative data, and vice versa, makes the accuracy decline drastically. This result supports the finding that although the directions of the feature effects are similar, the sizes of the feature effects differ greatly between the two constructions.



To improve on the findings of this research, future research with much larger data sets could clarify the different numbers of features affecting dative and benefactive. Such research is expected to answer the question of whether dative is more complicated than benefactive, since it possesses more significant features, or whether this difference in the number of significant features is simply due to the number of instances included in the data set.



ABSTRAK

Prasetya, Eko. 2013. The Linguistic Features Affecting the Choice of Ditransitive Constructions in English. Yogyakarta: The Graduate Program in English Language Studies, Sanata Dharma University.

The problem of choosing which benefactive construction to use appears to be confusing and difficult for many language users to resolve. Because traditional theoretical linguistics relies on linguistic intuition, such as grammaticality judgement, for such data, language users with low proficiency and non-native speakers will find it difficult to solve this benefactive construction problem. While the problem of how language users decide which linguistic structure to use has been analyzed with approaches such as syntax, semantics, and discourse, this research presents an analysis of the benefactive construction using a probabilistic linguistic approach. This research combines corpus linguistic study with the probabilistic logit formula obtained from binary logistic regression. The data set of benefactive constructions, taken from the Corpus of Historical American English (COHA), provides the likelihood of occurrence of the benefactive construction. The benefactive probabilistic model is built from the analysis of the corpus data and is then computed into the logit formula to predict the occurrence of the benefactive construction.

This research was conducted to answer two questions. The first question is: What linguistic elements influence the choice of benefactive construction? The second question is: How large is the effect of the significant elements on the choice of ditransitive construction, and what form does it take?

In the effort to answer the research questions, twelve theories were used. These include the theories of ditransitivity, semantic class of verbs, syntactic complexity, animacy of theme and recipient, discourse givenness of theme and recipient, pronominality of theme and recipient, concreteness of theme, local and non-local person of the recipient, number of beneficiary and theme, definiteness of theme and recipient, corpus linguistics, and the probabilistic model of logistic regression. These theories were used during the annotation process and to analyze the elements that are significant to the choice of benefactive construction.

It can be concluded from the process of selecting and analyzing the data from COHA that the choice of benefactive is not influenced by one linguistic element alone. Mixed-effect features appear to be the best explanation for analyzing the choice of benefactive construction. The mixed-effect binary logistic analysis shows that six linguistic elements turn out to influence the choice of benefactive construction. These elements are syntactic complexity, animacy of recipient, discourse givenness of recipient, pronominality of theme, local and non-local person of the beneficiary, and definiteness of theme. Based on the probabilistic analysis, these six linguistic elements combined can predict the occurrence of the benefactive construction with 90% accuracy. In addition, applying the benefactive probabilistic model to dative data, and vice versa, results in a drastic decline in accuracy. This result supports the finding that although the directions of the elements' relevance are similar, the strength of the feature effects differs greatly between the two constructions.

To improve on the findings of this research, future research with much larger data is expected to be carried out in order to clarify the differing numbers of elements that affect dative and benefactive. Future studies with larger data are expected to answer the question of whether dative is more complicated than benefactive because it has a larger number of significant elements, and the question of whether the difference in the number of significant elements is simply due to the very different amounts of data analyzed.


CHAPTER 1

INTRODUCTION

This chapter presents the background of the study in order to preview the concept of ditransitivity, including dative and benefactive constructions, before the discussion turns to the linguistic features that affect the choice of the structures. The background gives an overview of how traditional linguistics treats benefactive constructions. In addition, it presents the problem language users face in choosing between the benefactive alternants. The final sections present the problem limitations, research questions, research objectives, and research benefits. Overall, the chapter presents the difficulty of choosing between benefactive constructions.

1.1 Background of the Study

Ditransitive verbs are defined as verbs that take a double-object construction (Quirk, Greenbaum, Leech, and Svartvik 1985). Yet the definition has interesting consequences. A non-native English speaker may often find it difficult to use verbs with the double-object construction. A construction with the SVOO structure, or Subject + Verb + Object 1 + Object 2, as in the example below shows one of the difficulties:

(1) a. *My friend said me Hi.

Although it is semantically acceptable, dative sentence (1a) with the verb said is grammatically unacceptable unless it is paraphrased into

b. My friend said Hi to me.

S V O2 to O1/PP

In the other dative example below, the verb brought requires two obligatory objects to be grammatical. If either the theme some apples or the recipient me is omitted, the construction will be judged ungrammatical.

(2) He brought me some apples.

Similarly, the benefactive construction has been quite troublesome for some language users. They are often confused about which construction to use, the benefactive PP or the double object construction.

(3) a. They don't tend to make you as much money.

b. They don't tend to make as much money for you.

The two constructions in example (3) show that the benefactive construction seems to allow alternation. The question is whether language users are entirely free to choose the construction, or whether the choice follows a certain pattern.

Theoretical linguistics traditionally relies on linguistic intuition, such as grammaticality judgment, for such data (Bresnan 2010). On this view, a grammatical pattern such as the benefactive construction is a fixed pattern to be memorized. Language users have no chance to alter the construction freely, since by doing so they are predicted to end up producing ungrammatical constructions. Even with much language exposure, language users still may not understand how to construct such an alternation in grammatical forms.

In traditional linguistics, the problem of the benefactive construction is considered complex and difficult to deal with. Some would even consider the benefactive alternation problem uninteresting for linguistic theory. Yet, thanks to the immense growth of computer-readable texts and recordings, namely 'corpora', which provide a source for the analysis, the problem seems to be solvable. A method that offers a more comprehensible analysis and explanation of such grammatical patterns is needed. Researchers can explore the natural language use recorded in a corpus, which reveals how certain patterns are naturally used.

The problem of how language users decide which structure to use has become the subject matter of many researchers in various fields. The approaches include syntactic (Quirk et al. 1972), semantic (Gries and Stefanowitsch 2004), and discourse (Collins 1995) approaches. In addition, a particular research method, probabilistic models of language, has developed rapidly. In the probabilistic approach, the research is based on the real use of grammatical constructions, mostly in natural settings, in order to predict the probability of occurrence of certain constructions. Bresnan et al. (2007) applied this probabilistic approach to explain the dative alternation in the language produced by adult speakers of American English. Theijssen et al. (2009) applied a similar approach to a data set of benefactive constructions from adult and child data. Both Bresnan et al. and Theijssen et al. report the features that are significant for dative and benefactive, respectively. Yet they did not compare the relevant features of dative and benefactive. The shared and unshared linguistic features of the two constructions remain unknown.

In computing the probability of occurrence of alternations like dative and benefactive, many models can be employed. To predict the probability of occurrence of a certain construction, a simple linear regression model, mixed-effect linear regression model, multinomial regression model, simple binary logistic model, or mixed-effect binary logistic model can be applied (Sunyoto 2007; Suharjo 2008). Several probabilistic formulas can be employed in predicting the occurrence of the alternations, including the Friedman Tukey, Normit, and Logit methods. Yet this abundance of choices can leave the researcher unsure which models and methods suit a particular probabilistic grammar study.

In Bresnan et al.'s (2007) study of the dative construction, thirteen features appear to be significant to the choice of dative. They include animacy of recipient, pronominality of recipient, discourse givenness of recipient, semantic class 'transfer', definiteness of recipient, plurality of theme, person of recipient, givenness of theme, structural parallelism in dialogue, pronominality of theme, syntactic complexity, semantic class 'communication', and definiteness of theme. Given that both dative and benefactive are ditransitive constructions, the question arises whether the two constructions share the same relevant linguistic features affecting the choice of construction.

To respond to the challenge of how to deal with the benefactive construction in a relatively easy and understandable way, and to answer the question of what linguistic features affect dative and benefactive constructions, the present research on the probabilistic benefactive construction using binary logistic models is conducted. Using models similar to those of Bresnan et al. (2007) and Theijssen et al. (2009), the present research focuses on a different data set, namely benefactive and dative instances from COHA. The data set also differs in the number of instances. While Bresnan used 1260 instances of dative and Theijssen employed 143 instances of benefactive, this research uses 400 instances of benefactive from COHA along with 80 instances of benefactive and dative from TIME for cross-validation and external validation tests. This moderate number of instances was chosen because of time constraints. Whereas Bresnan works on dative only and Theijssen studies benefactive in adult and child data, this research computes the data of the benefactive construction and compares the results with the dative alternation. This research focuses on written text produced by native speakers of American English.

The problem of choosing between ditransitive constructions, both benefactive and dative, has prompted the researcher to study the alternations. The difficulty that low-proficiency and non-native users of English have in dealing with the constructions has encouraged the researcher further. The research is considered important because, on top of that, the construction has proven troublesome even for native users of English, who rely heavily on their language intuition (Bresnan 2010).


1.2 Problem Limitations

This research limited its discussion to three focused problems. First, this research was limited to finding the features that are significant to the choice of benefactive construction, along with their effect direction and size. Second, this research was limited to the corpus probabilistic grammar model of the benefactive construction built from the mixed-effect binary logistic regression model. Third, this research was limited to the shared features that are relevant to the choice of dative and benefactive constructions.

The research used simple and mixed-effect binary logistic regression models to find the features significant to the choice of benefactive. The data from the COHA corpus, coded in SPSS, were used as the basis of the analysis. Making use of binary variables and a continuous scaled variable, the binary logistic models were able to present the features relevant to the benefactive construction. In this research, the analysis of the syntactic and semantic features was limited to the fourteen features employed by Bresnan et al. (2007) in their study of the dative construction. The effect bias of the verb senses was excluded because the fixed and random effects would be too complicated to analyze.

Additionally, the research was limited to the corpus probabilistic logit formula for predicting the occurrence of the benefactive construction. The analysis of the logit formula was intended to support the significant features revealed. The coefficients of the significant features found earlier showed the direction and size of the feature effects on the benefactive construction. From these coefficients, the logit probabilistic model of the benefactive construction was built. In practice, when the logit was applied to language in real-life contexts, it was able to predict the occurrence of the benefactive construction with high accuracy. This follows Bresnan's (2007) argument that, without access to the intuition of the language speakers, regression models allow us to observe the dynamics of syntactic alternation and to predict the alternation in a cognitively realistic way.
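The prediction step described above can be sketched in Python as follows. This is a minimal illustration, assuming the standard logistic (inverse-logit) function and assuming that the PP realization is coded as 1. The feature names, coefficients, intercept, and the 0.5 cut-off are hypothetical placeholders, not the values estimated in this research.

```python
import math

# Hypothetical intercept and coefficients, for illustration only.
# They are NOT the estimates reported in this thesis.
# Assumption: the outcome coded 1 is the PP realization, 0 is the double object (DOC).
INTERCEPT = -1.0
COEFFICIENTS = {
    "beneficiary_given": -1.5,
    "beneficiary_animate": -0.8,
    "theme_pronominal": 2.0,
    "theme_definite": 0.9,
    "beneficiary_local_person": -0.7,
    "length_difference": 0.4,  # stands in for the continuous complexity measure
}


def logit_score(features, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """Return the linear predictor: intercept + sum of coefficient * feature value."""
    return intercept + sum(coefficients[name] * value for name, value in features.items())


def pp_probability(features, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """Turn the logit score into a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-logit_score(features, coefficients, intercept)))


def predict_realization(features, threshold=0.5):
    """Label an instance 'PP' when the estimated probability reaches the threshold."""
    return "PP" if pp_probability(features) >= threshold else "DOC"


# An invented instance: given, animate, local-person beneficiary; other features absent.
instance = {"beneficiary_given": 1, "beneficiary_animate": 1, "beneficiary_local_person": 1}
print(predict_realization(instance), round(pp_probability(instance), 3))
```

Features that are absent from an instance are simply left out of the dictionary, which is equivalent to giving them the value 0 in the equation.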

Finally, the research was limited to the analysis of the significant features shared by the dative and benefactive constructions. Once the shared features were obtained, cross-model application was carried out to check the interchangeability of the two probabilistic models. This method showed how the effect sizes of the features differ when applied to dative and to benefactive. By comparing the effect sizes and testing each model on the other construction's data set, the research was able to explain whether dative and benefactive alternations were closely similar or not.
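The cross-model check described above can be illustrated with a short Python sketch. It assumes that each model is an intercept plus a list of coefficients and that accuracy is the share of correctly predicted instances at a 0.5 cut-off. The two models and the two toy data sets are invented for illustration and are not results of this research.

```python
import math


def probability(features, intercept, coefficients):
    """Logistic-model probability of the outcome coded 1 (assumed: PP realization)."""
    score = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-score))


def accuracy(model, instances, threshold=0.5):
    """Share of instances whose predicted label matches the observed label."""
    intercept, coefficients = model
    correct = 0
    for features, observed in instances:
        predicted = 1 if probability(features, intercept, coefficients) >= threshold else 0
        correct += predicted == observed
    return correct / len(instances)


# Invented models: the coefficient signs agree, but the sizes differ.
benefactive_model = (-0.5, [2.0, -1.0])
dative_model = (0.5, [0.5, -3.0])

# Invented toy data: (feature values, observed realization).
benefactive_data = [([1, 0], 1), ([0, 1], 0), ([1, 1], 1), ([0, 0], 0)]
dative_data = [([1, 0], 1), ([0, 1], 0), ([1, 1], 0), ([0, 0], 1)]

models = [("benefactive", benefactive_model), ("dative", dative_model)]
data_sets = [("benefactive", benefactive_data), ("dative", dative_data)]
for model_name, model in models:
    for data_name, data in data_sets:
        print(f"{model_name} model on {data_name} data: {accuracy(model, data):.2f}")
```

In this toy setting each model fits its own data set and loses accuracy on the other one, even though the directions of the coefficients agree. This is the kind of pattern the cross-model application is designed to detect.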

1.3 Research Questions

The research intends to answer the following questions:

1. What linguistic features affect the choice of benefactive construction?

2. How do the significant features differ in effect size on ditransitive construction?

1.4 Research Objectives

As stated previously, the research has been conducted to answer the research questions above. There are two objectives that the research tries to achieve. The first objective is to reveal the features that are significant to the choice of benefactive construction. Analyzing the features relevant to this choice will help in understanding the natural use of the construction. As acquiring such a complicated pattern is troublesome for language users, and for non-native speakers in particular, the research will help language users to realize the construction in a cognitively realistic way. The features found significant to the choice of benefactive construction will be useful for predicting that choice.

The second objective is to identify the relevant features shared by the dative and benefactive constructions and how they differ from each other. The corpus probabilistic models obtained from the logistic regression models are employed to find the direction and size of the feature effects. The cross-model application is then carried out to check whether the models are interchangeable.

To support the significant features for the benefactive construction, the logit formula of the probabilistic benefactive construction is used in predicting probabilities. The logit model is used to predict the outcomes of instances possessing some of the significant features. This logit formula provides highly accurate predictions of the benefactive PP and double object realizations. The idea of probabilistic grammar has been applied in several studies (Bybee & Hooper 2001; Bod et al. 2003; Gahl & Garnsey 2006; Gahl & Yu 2006).

The mixed-effect binary logistic model is combined with the simple binary logistic model to produce a strong analysis of feature relevance. The mixed-effect binary logistic model analyzes a single dependent variable on the basis of many independent variables that serve as predictors bound together in one equation. Here the dependent variable is the ditransitive realization of an occurrence, and the 14 independent variables are the linguistic features affecting it. The mixed-effect binary logistic model provides the effect of the variables in the equation, and the simple binary logistic model gives the effect of each variable when it is not in the equation. The effects of the variables in the equation are needed to formulate the logit probabilistic model, while the effect sizes of variables not in the equation are needed to show how each feature contributes to increasing the accuracy of the model fit. Complementing the mixed-effect binary logistic model with the simple binary logistic model thus offers a more precise and complete analysis of the features significant for dative and benefactive constructions.
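As a sketch of the equation described above, and assuming the standard textbook form of binary logistic regression, the model with fourteen predictors can be written as follows. The symbols β, x, and p follow the list of abbreviations; the intercept β_0 and the coding of the outcome (for instance, p as the probability of the PP realization) are assumptions made here for illustration.

```latex
% One dependent variable (through p) and fourteen predictors x_1, ..., x_14
\[
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \sum_{i=1}^{14} \beta_i x_i ,
\qquad
p = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{i=1}^{14} \beta_i x_i\right)}}
\]
% Odds ratio: raising x_i by one unit, with the other features held fixed,
% multiplies the odds p/(1-p) by exp(beta_i)
\[
\frac{\mathrm{odds}(x_i + 1)}{\mathrm{odds}(x_i)} = e^{\beta_i}
\]
```

Under this reading, the sign of β_i gives the direction of a feature's effect and exp(β_i), the odds ratio, gives its size when the other thirteen features are held constant.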

1.5 Research Benefits

The research offers both theoretical and practical benefits. Theoretically, this research contributes to better knowledge of corpus linguistics and probabilistic study. Analyzing the corpus data set of benefactive constructions with a corpus probabilistic binary logistic model reveals how the choice of benefactive alternation is influenced by several linguistic features. It thus suggests that the corpus probabilistic model is a powerful tool for analyzing grammatical alternation.

In addition, this research contributes to the understanding of ditransitive constructions, both dative and benefactive. While the prior research by Collins (1995) uses both dative and benefactive constructions, there is no evidence that he compares the features affecting the dative with those affecting the benefactive. This research contributes insight into the similarities and differences between the features relevant to the dative and benefactive constructions.

Practically, for readers in general, this research offers a perspective on the process of language learning and production. While language is traditionally seen as driven by fixed patterns that are rarely analyzed, this research renews the view of language users that language is probabilistic. The probabilistic model will help language users to find the appropriate construction without relying merely on pattern memorization.

In addition, the simple model built in this research can be helpful especially for non-native users of English, who are generally less proficient than native users. When non-native users know at least the significant features of the benefactive construction and enter the feature coefficients into the corpus probabilistic model, they will be able to find the more probable grammatical form of the construction. In this way, the probabilistic model predicts which benefactive construction is likely to appear, and the prediction has proven to be highly accurate.


CHAPTER 2

THEORETICAL REVIEW

This chapter presents the theoretical review used in the analysis of the research. The chapter is divided into 6 parts: theories of ditransitivity, linguistic features (semantic class of verbs, syntactic complexity, animacy of theme and beneficiary, discourse givenness of theme and beneficiary, pronominality of theme and beneficiary, concreteness of theme, person of beneficiary, number of theme and beneficiary, definiteness of theme and beneficiary), corpus linguistics, the probabilistic model of logistic regression, a review of related studies, and the theoretical framework.

2.1 Ditransitivity

Ditransitivity needs to be reviewed because its alternations are the focus of the discussion. Ditransitive constructions have been prototypically defined as combinations of a ditransitive verb with an indirect object and a direct object (Quirk et al. (1985)). A ditransitive verb requires both a direct and an indirect object. In other terms, the verb requires both a theme and a recipient/beneficiary. Ditransitive verbs occur in dative and benefactive constructions.


2.1.1 Dative

Dative verbs are the ditransitive verbs that assign the roles of recipient and goal. Syntactically, dative verbs assigning these roles must permit the to-dative alternation (Subject + Predicate + Indirect Object + Direct Object/SPIoDo → Subject + Predicate + Direct Object + preposition to + Object of preposition/SPDopreptoOp). Semantically, besides belonging to the dynamic types expressing action-process, they also have the meaning of verbs of motion or movement: caused movement, caused possession, or communication implying transfer of a message (Gropen et al. (1989)).

Many English dative verbs appear in alternative dative PP and dative NP constructions:

(1) a. I give them some money.
b. I give some money to them.

Example 1 involves what Bresnan (1978), as cited by Gropen et al. (1989), describes as denial and repetition, which presupposes that the verb give has the same meaning in both constructions. Here the alternative syntactic constructions are apparently used primarily for a shift of emphasis. Elsewhere, however, different constructions are associated with different semantics.

(2) a. I sent a package to the librarian. ~ I sent the librarian a package.
b. I sent a package to the library. ~ *I sent the library a package.


In more specific cases, however, the alternation is not allowed at all, because the alternative form is syntactically unacceptable and/or changes the meaning.

(3) a. Diana whispered the news to me.
b. *Diana whispered me the news.

To establish the terminology of the dative construction, Figure 2.1 is given below.

Terms used with the dative alternation

Term                             Example                          Structure
Prepositional dative structure   …gave [bread] [to her family]    V NP PP
Double object construction       …gave [her family] [bread]       V NP NP
Dative PP                        …gave [bread] [to her family]    V NP PP
Dative NP                        …gave [her family] [bread]       V NP NP
Theme                            …gave [bread] [to her family]    V NP PP
                                 …gave [her family] [bread]       V NP NP
Recipient                        …gave [bread] [to her family]    V NP PP
                                 …gave [her family] [bread]       V NP NP

Figure 2.1 Terminology in dative construction

2.1.2 Benefactive

Benefactive verbs are verbs that bear the benefactive role, a thematic or semantic role that shows an argument benefitting from what another argument does (Quirk et al. (1985)). Semantically, Quirk et al. (1985), as cited by Nia (2009), classify verbs into two main groups: dynamic and static. Benefactive verbs belong to the dynamic group, more specifically to action verbs and action-process verbs. According to Dowty in Jackendoff (1990), when these verbs act as a ditransitive predicator they meet the following criteria. Syntactically, benefactive verbs assigning beneficiary roles must permit the for-dative alternation (SPIoDo → SPDoprepforOp). Semantically, besides belonging to the dynamic verbs indicating action-process, they express the meanings ‘make available’, ‘creation’, ‘performance’, or ‘preparation’.

As mentioned before, benefactive verbs are verbs that assign the benefactive role to the argument or noun phrase functioning as an Indirect Object (Io) or as an Object of preposition (Op) in the construction with a preposition, like me in sentence 4.

(4) a. She gave me a piece of cake.
b. She gave a piece of cake to/for me.

Both constructions, according to Jackendoff (1990), bear the conceptual structure [CS[GO([Y], FROM [X] [TO] [Z]])], which indicates that argument Y experiences a change of possession, situation, or location as a result of the action that argument X deliberately performs. CS in this formulation stands for Conceptual Structure, the underlying logic of the proposition (clause or sentence). X is the argument that bears the agent role and occupies the syntactic function of subject in the ditransitive construction. Y is the argument that bears the patient/theme role and occupies the syntactic function of direct object. Z is the argument that bears the benefactive role and occupies the syntactic function of indirect object or object of preposition.

The benefactive verbs employed in this research are based on the classification of Dowty in Jackendoff (1990). He categorizes the benefactive verbs as carrying the meanings ‘make available’ (MAva), ‘of creation’ (VoCr), ‘of performance’ (VPrf), ‘of preparation’ (VPre), and verbs with idiomatic meanings (VIdi).

(5) a. I bought him some snacks.
b. His father built him a huge house.
c. Diane played me a romantic song.

In the examples above, the verbs bought, built, and played are benefactive verbs. However, they fall under different sub-categories. Bought is a benefactive verb carrying the meaning ‘make available’ (MAva), built carries the meaning ‘of creation’ (VoCr), and played falls under the category ‘of performance’ (VPrf).

To establish the terminology of the benefactive construction, Figure 2.2 is given below.

Terms used with the benefactive alternation

Term                                  Example                      Structure
Prepositional benefactive structure   …make [an offer] [for me]    V NP PP
Double object construction            …make [me] [an offer]        V NP NP
Benefactive PP                        …make [an offer] [for me]    V NP PP
Benefactive NP                        …make [me] [an offer]        V NP NP
Theme                                 …make [an offer] [for me]    V NP PP
                                      …make [me] [an offer]        V NP NP
Beneficiary                           …make [an offer] [for me]    V NP PP
                                      …make [me] [an offer]        V NP NP

Figure 2.2 Terminology in benefactive construction

2.2 The Linguistic Features Relevant to Benefactive and Dative

The research makes use of fourteen explanatory variables that were considered likely to influence the choice between alternative benefactive structures. As the measures of syntactic complexity or ‘weight’ are highly correlated (Arnold et al., 2000; Wasow, 2002; Szmrecsanyi, 2004a; Bresnan et al., 2007), the difference in the number of graphemic words between the theme and the beneficiary is used to measure their relative weight. The animacy, definiteness, and pronominality of the theme were taken into account. Animacy and definiteness were coded using the coding practices of Garretson (2004), and discourse accessibility was coded based on Prince (1981) and Gundel et al. (1993). Pronominality was defined to distinguish phrases headed by pronouns (personal, demonstrative, and indefinite) from those headed by non-pronouns such as nouns and gerunds. In addition to these features, concreteness of theme, person of beneficiary, and number of beneficiary and theme are taken into account. Cross-linguistic evidence suggests that number (singular/plural) and person could also have an influence (Aissen, 1999, 2003; Bresnan, 2003; Haspelmath, 2004).

2.2.1 Semantic Verb Class

In Bresnan et al. (2007), the dative verbs are classified into five semantic classes: ‘transfer’ of possession as with give, ‘future transfer’ as with offer, ‘communication’ of information as with tell, ‘prevention of possession’ as with deny, and ‘abstract’ as with give that a thought. Theijssen et al. (2009) formed a semantic categorization with four classes: ‘creation of possession’ as with produce, ‘obtaining of possession’ as with get, ‘keeping of possession’ as with keep, and ‘abstract’ as with do someone a favor.

Nia (2009), following Jackendoff (1990), divides the verbs that indicate the benefactive semantic role into 5 classes. The first class includes benefactive ditransitive verbs with the meaning ‘make available’: buy, organize, save, catch, fetch, find, get, order, and take. The second class includes benefactive verbs with the meaning ‘creation’: build, make, and write. The third class carries the meaning ‘performance’ and includes do, give, play, show, and sing. The fourth class, with the meaning ‘preparation’, includes the verbs fix and pour. The last class contains benefactive ditransitive verbs with ‘idiomatic’ meanings: bet, bear, spare, do, deal, earn, and grant.

Referring to Dowty in Jackendoff (1990), this research classifies the verbs into 4 semantic categories: verbs carrying the meanings ‘make available’, ‘creation’, ‘performance’, and ‘preparation’. The first semantic class is represented by the verb get, the second by the verb make, the third by the verb play, and the last by the verb fix (see the examples below; see also Appendices 1 and 2).

6. a. to rest while your Dutch girl - what s her name? Catrine? – gets us something to eat. " Miss Hammond followed her brother to her room,
b. the beneficiary of his work savings account. Once married, she pressured him into making her the beneficiary of a $100,000 insurance policy he had, Mallard told the jurors
c. but he doesn't provide enough help on the defensive boards. Although freshmen play the most important roles for this team, OSU has shown it can excel on
d. , I don't care. I got ta have a martini. So the bartender fixes him a martini


2.2.2 Syntactic Complexity of Theme and Beneficiary

The syntactic complexity of theme and beneficiary is one of the important predictors of word order and construction. Previous studies of word order and construction type claim that relative syntactic complexity is a considerable factor (Hawkins, 1994). In his study of relative clauses, Hawkins suggests that as the cumulative size and complexity of nominal modifiers increase, the distance between P and N increases in the prenominal order, which bears on processing efficiency. This principle places the longer, relatively more complex expression at the end of the construction.

One technique for measuring syntactic complexity is to count the number of graphemic words (Wasow (2002), Szmrecsanyi (2004)). The results of Wasow’s (2002) corpus study show a clear effect of constituent weight on syntactic alignment in dative sentences. In the double object variants the theme NP tends to be longer, whereas in the prepositional dative variants the recipient NP tends to be longer. His calculation shows that in both variants the final constituent is on average 3.5 times heavier than the constituent occurring immediately after the verb.

Bresnan et al. (2007) use a metric of relative syntactic complexity in which the complexity predictor is the signed logarithm of the absolute value of the difference between the theme and recipient lengths in words. This measure is intended to capture the relative complexity of theme and recipient as a continuous scale variable.

The present research uses a simpler measure proposed by Bresnan (2010). The relative complexity on a log scale is obtained by subtracting the natural logarithm of the theme length from the natural logarithm of the beneficiary length. This measure yields a signed value on a continuous scale. The example below illustrates the feature syntactic complexity (see also Appendices 1 and 2).

7. " he said. " Really, it's the old cliche -- I play them one game at a time. " # Elliott, the wide receiver, It's a lot of fun. "

verb : play
theme : one game : 2 (the value is counted from the number of words; the theme one game scores 2 because it consists of 2 words)
beneficiary : them : 1
length difference = beneficiary length – theme length = 1 – 2 = -1 (log scale)
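
The measure described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and graphemic words are counted by splitting on whitespace. The sketch computes both the raw word-count difference reported in example 7 and the difference of natural logarithms described in the text; for the lengths 1 and 2 these are -1 and ln 1 - ln 2 (about -0.69) respectively.

```python
import math


def word_length(phrase: str) -> int:
    """Count graphemic words by splitting on whitespace (an assumption)."""
    return len(phrase.split())


def raw_length_difference(beneficiary: str, theme: str) -> int:
    """Beneficiary length minus theme length, in words."""
    return word_length(beneficiary) - word_length(theme)


def log_length_difference(beneficiary: str, theme: str) -> float:
    """ln(beneficiary length) minus ln(theme length)."""
    return math.log(word_length(beneficiary)) - math.log(word_length(theme))


# Example 7: beneficiary "them" (1 word), theme "one game" (2 words)
print(raw_length_difference("them", "one game"))            # -1
print(round(log_length_difference("them", "one game"), 3))  # -0.693
```

A negative value means that the theme is longer than the beneficiary.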

2.2.3 Animacy of Theme and Beneficiary

Animacy is another important predictor of English word order. Recent studies claim that animacy is an important cognitive category in humans with subtle effects on English word order, showing up primarily in variation (Thompson (1990, 1995), Rosenbach (2002), Bresnan & Hay (2008)). These studies suggest that animate constituents appear before inanimate ones. An animate theme prefers the prepositional construction, which puts the theme in postverbal position. Similarly, an inanimate beneficiary favors the prepositional construction, which puts the inanimate beneficiary at the end of the sentence.

Garretson (2004) classifies animacy into nine categories: ‘human’, ‘organization’, ‘animal’, ‘place’, ‘time’, ‘concrete’, ‘nonconcrete’, ‘machine’, and ‘vehicle’. The choice between human, organization, and nonconcrete depended on how the coders interpreted the referent of the expression. Although guidelines were given about the difference between human and organization, the cut-off point remains unclear. The categories time and place were defined in a way that did not go beyond the coders’ own perception of them. Time was supposed to refer to ‘periods of time’, yet ambiguity remained because time was sometimes also coded as nonconcrete.

For Bresnan’s (2007) dative data set, animacy was coded in four categories, ‘human’, ‘organization’, ‘animal’, and ‘inanimate’, derived from Garretson et al. (2004). The categories ‘place’, ‘time’, ‘concrete inanimate’, ‘nonconcrete inanimate’, ‘machine’, and ‘vehicle’ were collapsed into a single ‘inanimate’ category. The boundary between human and organization followed the guidelines from Garretson.

For this research on the benefactive, the researcher follows the animacy coding system of Bresnan (2010). The animacy of the data was coded as human or animal (animate) versus other. This binary categorization fits the model in use, the logistic regression model, which requires binary variables (see the example below, which clarifies the animacy of theme and beneficiary; see also Appendices 1 and 2).


8. Amelia No. Baron. Baron Wildenhain Does not your face glow, when he makes you a fine speech? referring, perhaps, to love or marriage. Amelia

verb : makes
theme : a fine speech : inanimate
beneficiary : you : animate
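
The binary animacy coding adopted in this research can be sketched as a collapse of Garretson's nine categories. This is a minimal illustration, not the coding procedure actually used: the function name is hypothetical, and the label 'inanimate' for every category other than human and animal follows the labels in example 8.

```python
# Garretson's (2004) nine animacy categories, as listed in this section
GARRETSON_CATEGORIES = {
    "human", "organization", "animal", "place", "time",
    "concrete", "nonconcrete", "machine", "vehicle",
}

# Categories treated as animate in the binary coding (human or animal)
ANIMATE_CATEGORIES = {"human", "animal"}


def code_animacy(category: str) -> str:
    """Collapse a nine-way animacy category into the binary coding."""
    if category not in GARRETSON_CATEGORIES:
        raise ValueError(f"unknown animacy category: {category!r}")
    return "animate" if category in ANIMATE_CATEGORIES else "inanimate"


# Example 8: the beneficiary "you" is human, the theme "a fine speech" is nonconcrete
print(code_animacy("human"))
print(code_animacy("nonconcrete"))
```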

2.2.4 Discourse Accessibility of Theme and Beneficiary

Discourse accessibility is reviewed because it has been shown to influence the choice between alternative constructions (Halliday (1970), Thompson (1995)). These studies demonstrate the role of the tonic and the strong relation of theme/rheme to given/new. The givenness of the theme and/or beneficiary is strongly related to focus placement, and the placement of given or non-given information is the main point of interest in these alternations.

Bresnan et al. (2007) state that in many previous studies of the dative alternation, the data were coded into seven levels of discourse accessibility: ‘evoked’, ‘situationally evoked’, ‘frame inferrable’, ‘generic’, ‘containing inferrable’, ‘anchored’, and ‘new’ (Prince (1981), Gundel et al. (1993), Michaelis & Hartwell (2007)). Prince (1981:1) hypothesizes a “conspiracy of syntactic construction” designed to prevent NPs that represent unfamiliar information from occupying subject position. In this conspiracy, given information, which the speaker assumes the addressee already knows, precedes new information, which the speaker assumes he is introducing into the addressee’s consciousness (Chafe (1976)).

To keep the coding simple for modeling, this research adopts the categorization made by Bresnan (2010). The seven categories of discourse givenness were simplified into two. A theme or beneficiary phrase was defined as ‘given’ if, first, its referent was mentioned in the previous ten lines of discourse (‘evoked’), or second, it was a first or second person pronoun (denoting a ‘situationally evoked’ referent). All others were ‘non-given’. An example of given and non-given theme and beneficiary is shown below (see also Appendices 1 and 2).

9. was rather the result of principle than of personal predilection. When Mr. West had made a sketch for the Regulus, and submitted it to His Majesty, after some

verb : made
theme : a sketch : non-given
beneficiary : the Regulus : given
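
The two-way givenness rule can be sketched as follows. The sketch assumes that the check of the previous ten lines of discourse has already been made by the annotator and is passed in as a boolean; the function name and the list of first and second person pronoun forms are illustrative assumptions, not part of the coding manual.

```python
# First and second person pronoun forms (an illustrative, assumed list)
FIRST_SECOND_PERSON_PRONOUNS = {
    "i", "me", "myself", "we", "us", "ourselves",
    "you", "yourself", "yourselves",
}


def code_givenness(phrase: str, mentioned_in_previous_ten_lines: bool) -> str:
    """Return 'given' if the referent is evoked or situationally evoked."""
    is_local_pronoun = phrase.strip().lower() in FIRST_SECOND_PERSON_PRONOUNS
    if mentioned_in_previous_ten_lines or is_local_pronoun:
        return "given"
    return "non-given"


# Example 9, assuming the annotator found an earlier mention of the Regulus
print(code_givenness("a sketch", mentioned_in_previous_ten_lines=False))
print(code_givenness("the Regulus", mentioned_in_previous_ten_lines=True))
```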

2.2.5 Pronominality of Theme and Beneficiary

The feature pronominality of theme and beneficiary refers to whether the theme or beneficiary is headed by a pronoun. Different nominal expression types, such as pronouns, proper names, and common nouns, have been found to affect the choice of syntactic alternations (Silverstein (1976), Aissen (1999), O’Conor et al. (2004) in Bresnan (2010)). The various categories of nominal expressions are ranked as Local person > Pronoun 3rd > Proper noun 3rd > Human 3rd > Animate 3rd > Inanimate 3rd. The findings suggest that 1st or 2nd person pronouns are marked when they are subjects of transitive clauses, but not when they are objects.


In research on the dative construction, the nominal expressions of theme and recipient have been coded in several coding systems. Cueni (2004) in Bresnan (2007) coded the themes and recipients in the dative data set into seven categories. The nominal expression types were given the values ‘personal pronoun’ (her), ‘impersonal pronoun’ (someone), ‘demonstrative pronoun’ (that), ‘proper noun’ (Jeanne), ‘common noun’ (a native African), ‘gerund’ (employing some foreigners), and ‘partitive’ (the rest of the team). Bresnan (2007) simplified this coding system into two values. In particular, pronominality was simplified to distinguish phrases headed by personal, demonstrative, indefinite, or reflexive pronouns from those headed by non-pronouns such as nouns and gerunds.

This research adopts the categorization of Bresnan (2010), which is similar to the 2007 categorization but defines ‘pronouns’ as personal (including it, them, and generic you), demonstrative, or reflexive. Indefinites are excluded from the category. The feature is coded as a binary variable, pronoun versus non-pronoun (see the example below, which clarifies the pronominality of theme and beneficiary; see also Appendices 1 and 2).

10. dear master, you'll be cleared. Mar. Marcel (aside.) Play him some trick to frighten him and he'll confess all. Ber. Bertrand

verb : play
theme : some trick : non-pronoun
beneficiary : him : pronoun
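
The binary pronominality coding can be sketched as a lookup on the head word of the phrase. The word lists and the function name are illustrative assumptions; identifying the head is left to the annotator, since forms such as her or that can also be determiners.

```python
# Pronoun classes counted as 'pronoun' (personal, demonstrative, reflexive);
# indefinites such as "someone" are deliberately left out.
PERSONAL = {"i", "me", "you", "he", "him", "she", "her", "it",
            "we", "us", "they", "them"}
DEMONSTRATIVE = {"this", "that", "these", "those"}
REFLEXIVE = {"myself", "yourself", "himself", "herself", "itself",
             "ourselves", "yourselves", "themselves"}
PRONOUN_HEADS = PERSONAL | DEMONSTRATIVE | REFLEXIVE


def code_pronominality(head: str) -> str:
    """Code a phrase by its head word, supplied by the annotator."""
    return "pronoun" if head.strip().lower() in PRONOUN_HEADS else "non-pronoun"


# Example 10: theme "some trick" (head: trick), beneficiary "him"
print(code_pronominality("trick"))
print(code_pronominality("him"))
```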


2.2.6 Concreteness of Theme

Garretson (2004) coded the theme arguments for whether they referred to a concrete object, defined as a prototypically concrete inanimate object or substance perceivable by one of the five senses. The ‘prototypical’ limitation was used to bring the category in line with the ordinary understanding of what a concrete object is: for example, it excludes water but includes plants. While the animacy categorization above was simplified by omitting concrete and nonconcrete inanimates, the feature concreteness of theme compensates for that simplification.

This research makes use of Garretson’s (2004) categorization above, yet it treats water as a concrete object. The categorization in this research relies on whether the object can be perceived by four of the senses. When the object can be touched, tasted, smelled, or seen, it is coded as concrete. When the object can only be heard, it is coded as non-concrete (see the example below, which illustrates the feature concreteness of theme; see also Appendices 1 and 2).

11. And then, when' -- Well I hope you will then feel like getting me a new silk gown. You know, Mr. Prouty, that my white

verb : getting
theme : a new silk gown : concrete

2.2.7 Person of Beneficiary

The feature person of beneficiary is reviewed starting from the findings of Silverstein (1976). Silverstein ranked the various nominal expressions as Local person > Pronoun 3rd > Proper noun 3rd > Human 3rd > Animate 3rd > Inanimate 3rd. The findings suggest that 1st/2nd person pronouns are marked when they are subjects of transitive clauses, but not when they are objects. This categorization, however, mixes the locality of person (inclusive/exclusive) with pronominality and animacy. The feature person of beneficiary is therefore treated as a separate category.

In studies of the dative and benefactive alternations, the feature person of recipient/beneficiary is coded into two values. Bresnan et al. (2001) claim that person influences syntactic alternations in some languages and variation in English. This is in line with Cueni’s (2004) categorization, which codes inclusive and specific uses of both first and second persons as ‘local’ and third person as ‘non-local’. In research on the dative construction, Theijssen et al. (2009) annotated person of recipient with the value local or non-local. Local recipients are in the first or second person (e.g. I, me, yourself), non-local ones in the third person. In this research on the benefactive construction, the categorization system is similar to Theijssen’s. However, this research includes we and us as local, and puts inanimate beneficiaries under the category non-local. An example of the benefactive construction with the feature person of beneficiary is given below (see also Appendices 1 and 2).

12. in breakfast or dinner isn't of much account. Now, there's Dinah gets you a capital dinner, -- soup, ragout, roast fowl, dessert,

verb : gets
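
The person coding described in this section can be sketched as follows. The function name and the list of first and second person forms are illustrative assumptions. The animacy of the beneficiary is passed in as a flag because this research places inanimate beneficiaries under non-local.

```python
# First and second person forms, including we and us (an assumed list)
LOCAL_FORMS = {
    "i", "me", "myself", "we", "us", "ourselves",
    "you", "yourself", "yourselves",
}


def code_person(beneficiary: str, animate: bool = True) -> str:
    """Code the beneficiary as 'local' (1st/2nd person) or 'non-local'."""
    if not animate:
        # Inanimate beneficiaries are placed under non-local in this research
        return "non-local"
    return "local" if beneficiary.strip().lower() in LOCAL_FORMS else "non-local"


# Example 12: "Dinah gets you a capital dinner"
print(code_person("you"))
print(code_person("her husband"))
```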


2.2.8 Number of Theme and Beneficiary

Number plays an important role in the syntactic variation of grammar. Number is a typologically important category in grammar (Greenberg (1966)). Bresnan (2002) and Bresnan et al. (2007) add that number matters greatly in some types of morphosyntactic variation in English. In the dative data set, words with formal plural marking such as –s/-es, and words like fish whose context clearly indicated plurality, were coded as ‘plural’; other words were coded as ‘singular’.

This research uses the categorization of Bresnan (2007), classifying the feature number into singular and plural. For the pronoun you, the antecedent was checked to find out whether the pronoun is plural or singular. The sentence below exemplifies the feature number of theme and beneficiary (see also Appendix 3).

13. Availing herself of the decided preference shown her, she might have aimed at making her husband a party in the dispute; and, by his means, have

verb : making
theme : a party : singular
beneficiary : her husband : singular

2.2.9 Definiteness of Theme and Beneficiary

In research on predicting the syntax of the dative construction, Bresnan (2010) uses the coding system of Garretson (2004). When placing the theme or recipient phrase in the context There is/are __ permits an existential interpretation, the NP is coded as indefinite. Referring to Cueni (2004), examples of indefinite NPs include one, a little bird, more jobs, something I can eat; examples of definite NPs include her, that bag, the dog, my photo album, Diane, all my classmates.

This research employs the categorization system of Theijssen (2009), which divides the feature definiteness into two values, definite and indefinite. All (syntactic) object heads that were preceded by a definite article or a definite pronoun (e.g. demonstrative and possessive pronouns), and all objects that were proper nouns or definite pronouns themselves, were annotated as definite. The remaining objects were given the value indefinite. The example below illustrates the feature definiteness of theme and beneficiary (see also Appendix 3).

14. it might assist in the accomplishment of her hopes. You took your part – made me a promise that you would exercise all your abilities as an actor, to

verb : made
theme : a promise : indefinite
beneficiary : me : definite
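
The definiteness rule can be approximated in a short sketch. It is a simplification under stated assumptions: the word lists and the function name are illustrative, proper nouns are flagged by the annotator, and only the first word of the phrase is checked for a definite article or definite pronoun, so phrases with a quantifier before the determiner are not handled.

```python
# Definite article plus demonstrative and possessive forms (assumed list)
DEFINITE_DETERMINERS = {
    "the", "this", "that", "these", "those",
    "my", "your", "his", "her", "its", "our", "their",
}

# Pronouns treated as definite when they form the whole phrase (assumed list)
DEFINITE_PRONOUNS = {
    "i", "me", "you", "he", "him", "she", "her", "it", "we", "us",
    "they", "them", "this", "that", "these", "those",
}


def code_definiteness(phrase: str, is_proper_noun: bool = False) -> str:
    """Code an object phrase as 'definite' or 'indefinite'."""
    words = phrase.lower().split()
    if not words:
        raise ValueError("empty phrase")
    if is_proper_noun:
        return "definite"
    if len(words) == 1 and words[0] in DEFINITE_PRONOUNS:
        return "definite"
    if len(words) > 1 and words[0] in DEFINITE_DETERMINERS:
        return "definite"
    return "indefinite"


# Example 14: theme "a promise", beneficiary "me"
print(code_definiteness("a promise"))
print(code_definiteness("me"))
```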

2.3 Corpus Linguistics

The importance of Corpus Linguistics has been debated for years. The two major conflicting sides are the rationalists and the empiricists. In linguistics, rationalist theories are based on the development of a theory of mind. They aim at a theory of language that not only analyzes the external effects of human language processing but also claims to represent how the processing is actually undertaken within the human mind. Empiricist theories, on the other hand, are dominated by the observation of naturally occurring data, typically through the medium of the corpus. In this view, sentences are said to be grammatical and to be formed by natural collocation when they are attested in a corpus.

According to McEnery and Wilson (2001), language is finite and is an enumerable set that can be gathered and counted. For this reason, the corpus was seen as a source of hard data in the formation of linguistic theory and was said to be a perfect place to test linguistic theory. In addition, the four characteristics of corpus linguistic study proposed by Biber (1998) show that corpus linguistic study offers a natural environment in which to check grammatical phenomena, in this case the benefactive construction. First, corpus-based analysis is empirical, analyzing the actual patterns of use in natural texts. Second, corpus study utilizes a large collection of natural texts, known as a ‘corpus’. Third, it makes extensive use of computers for analysis, using both automatic and interactive techniques. Last, corpus study depends on both quantitative and qualitative analytical techniques. These characteristics make corpus linguistic study suitable for predicting the grammatical patterns that language users tend to use in real-life situations.


2.4 Probabilistic Model of Logistic Regressions

The probabilistic regression model makes use of corpus data as the basis of analysis and then combines the result of the analysis with a regression model in SPSS. The values of the variables in the equation reported by SPSS are then entered into the logit probabilistic formula of logistic regression. The method of dynamic probabilistic grammar has been used in several prior studies (Bybee & Hooper 2001; Bod et al. 2003; Gahl & Garnsey 2006; Gahl & Yu 2006). Bresnan (2007) then focuses the use of this dynamic probabilistic grammar on the domain of syntactic variation. The aim of the probabilistic model in syntactic variation is to predict the occurrence of a certain construction. When the probability gets closer to one, the tendency for the event to happen is high. Conversely, when the probability is closer to zero, the tendency for the event to happen is low. Below is the logit formula of the probabilistic binary logistic regression model (Sunyoto 2007; Suharjo 2008):

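$$p = \frac{e^{\text{constant} + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_n x_n}}{1 + e^{\text{constant} + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_n x_n}} = \frac{\exp(\text{constant} + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_n x_n)}{1 + \exp(\text{constant} + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_n x_n)}$$

$$\ln\frac{p}{1 - p} = \text{constant} + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_n x_n$$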

where p represents the probability that the event happens, in this case the benefactive PP or the double object construction, β is the coefficient of each linguistic feature, and x1 to xn are the values of the corresponding features. β1 represents the coefficient of feature 1, β2 represents the coefficient of feature 2, and so forth.

Once the constant and the coefficients have been obtained from the mixed-effect binary logistic model, the formula above can be completed and used to predict the occurrence of a certain grammatical alternation, in this research the benefactive construction. The probabilistic formula gives the percentage likelihood that a certain instance takes the benefactive PP or the double object construction. In addition, from the data computed in the regression models, the research can obtain the model fit accuracy of the corpus probabilistic formula of the benefactive construction.
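
The computation described here can be sketched in Python. The constant, coefficients, and feature values below are hypothetical placeholders chosen only to show the arithmetic; they are not results of this research, and the function name is likewise an assumption.

```python
import math


def predict_probability(constant, coefficients, features):
    """Logistic probability p = exp(eta) / (1 + exp(eta)),
    where eta = constant + sum of coefficient * feature value."""
    if len(coefficients) != len(features):
        raise ValueError("need one coefficient per feature")
    eta = constant + sum(b * x for b, x in zip(coefficients, features))
    # 1 / (1 + exp(-eta)) is algebraically equal to exp(eta) / (1 + exp(eta))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical values: two binary features, the first present, the second absent
p = predict_probability(constant=0.5, coefficients=[1.2, -0.8], features=[1, 0])
print(round(p, 3))

# The log-odds recover the linear predictor: ln(p / (1 - p)) = 0.5 + 1.2
print(round(math.log(p / (1 - p)), 3))
```

A probability above 0.5 would be read as a prediction of the construction coded as the event, and a probability below 0.5 as a prediction of the alternative.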

2.5 Related Studies

The first related study, by Colleman (2007), aimed to examine the semantic evolution of the English and Dutch ditransitives over the last three to four centuries (later Modern English/Dutch). He focused on the ditransitive (double object) constructions in English and Dutch, which are formally and semantically quite similar. The ditransitive consists of a subject and two unmarked NP objects. The V-slot can be filled by verbs of giving and by verbs from a number of other, semantically related classes (e.g. promise, send). For the Dutch ditransitive in particular, the semantic structure consists of the semantic core ‘beneficial transfer of a concrete entity from an agent to a willing recipient’. Extensions run along various semantic dimensions (‘direction of transfer’, ‘nature of the transferred entity’).

In the study, Colleman took the corpus data from CLMET (extended version), compiled by H. De Smet, which contains literary texts drawn from Project Gutenberg etc. in three subperiods (1710-1780, 1780-1850, 1850-1920), adding up to 15 million words (De Smet, 2005), and from a similar self-compiled corpus of (later) Modern Dutch literary texts drawn from dbnl.org in four subperiods (1640-1710, 1710-1780, 1780-1850, 1850-1920), adding up to 10.5 million words.

The findings showed that semantic retraction had taken place. In both English and Dutch, the semantic range of the ditransitive seems to have narrowed rather than broadened over the investigated period. The study concluded that 4 uses of the ditransitive in English and Dutch have either disappeared since the Late Modern period or are on the decline (in terms of lexical possibilities and/or frequency). These uses are the ‘benefactive’ ditransitive, ‘dispossession’, ‘banishment’, and the attitudinal ditransitives ‘envy’ and ‘forgive’. All 4 of them are semantically quite distant from the central ‘Agent causes Recipient to receive Patient’ sense.

Another corpus linguistic study was conducted by Nia (2009). Emphasizing that a significant feature of ditransitive verbs is that they are mostly used in sentences that bear a benefactive role, she studied benefactive verbs in the Double Object Construction (DOC) in English sentences. The benefactive role is a thematic or semantic role that shows an argument benefitting from what another argument does. The study addressed two questions: what the syntactic features of the benefactive verbs in DOC in English sentences are, and what their semantic features are.

Nia collected the data from five novels. In total, 608 data were obtained, all sentences with a ditransitive clause pattern, with the following details: 112 data taken from The Man in the Brown Suit, 128 data taken from The Runaway Jury, 248 data taken from Scarlett, 64 data taken from To Kill a Mockingbird, and 56 data taken from Gaijin. Every datum was then analyzed using conceptual structure (CS) in order to find out whether it indicates benefactive features. Data with benefactive features were then analyzed based on clause structure, the benefactive role the verbs assign, and the meanings of the benefactive verbs.

The findings revealed that, on the basis of the conceptual structure (CS), the ditransitive or benefactive verbs in the double object construction (DOC) in English assign three specific benefactive roles, namely beneficiary, recipient, and goal. Benefactive verbs may appear in two types of clauses: the DOC with the structure (S+P/V+IO+DO), and the DOC with prepositions with the structure (S+P/V+O+PREP+OP). Another valuable finding is that the benefactive verbs have the inherent meanings ‘make available’ (MAva), ‘of creation’ (VoCr), ‘of performance’ (VPrf), ‘of preparation’ (VPre), and verbs with idiomatic meanings (VIdi).

The third corpus study, which is the main reference for my study, is the study of the dative construction by Bresnan et al. (2007). Applying a probabilistic model, they were able to predict the occurrences of the dative construction with 95% model fit accuracy. The study also gave insight into how to use the mixed-effect binary logistic regression model to obtain the coefficients that are then used to formulate the corpus probabilistic model. The study found that thirteen features are relevant to the choice of dative construction. The features include definiteness of theme, semantic class communication, semantic class givenness of theme, structure parallelism, concreteness of theme, person of recipient, number of theme, definiteness of recipient, semantic class transfer of possession, discourse givenness of recipient, pronominality of recipient, and animacy of recipient.

The fourth study, on the benefactive construction, by Theijssen et al. (2009), found four features to be significant in the choice of benefactive construction. They analyzed the benefactive construction in adult and child data and found syntactic complexity, discourse givenness of theme, number of theme, and semantic verb class ‘communication’ significant to the choice of benefactive construction. The other interesting result of this study is that the adult and child data do not seem to differ greatly. They argue that this is because children tend to imitate the constructions made by adults as closely as possible. This study proposed testing the validity of the data using the ten-fold cross-validation technique, which is also employed in this research.
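The idea of ten-fold cross-validation can be sketched as follows: the data are split into ten parts, and each part in turn is held out for testing while the other nine are used for fitting. This Python sketch is only an illustration under stated assumptions: the labels are hypothetical (1 = double object, 0 = PP), and a trivial majority-class predictor stands in for the regression model, which is not reproduced here.

```python
import random

def ten_fold_indices(n_items, n_folds=10, seed=0):
    """Shuffle item indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def cross_validate(labels, n_folds=10):
    """Mean held-out accuracy of a majority-class stand-in model."""
    folds = ten_fold_indices(len(labels), n_folds)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [labels[i] for i in range(len(labels)) if i not in held]
        # Stand-in "model": always predict the majority class of the training part.
        majority = 1 if sum(train) * 2 >= len(train) else 0
        correct = sum(1 for i in held_out if labels[i] == majority)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)

# Hypothetical labels: 70 double object instances and 30 PP instances.
labels = [1] * 70 + [0] * 30
folds = ten_fold_indices(len(labels))
mean_accuracy = cross_validate(labels)
print(mean_accuracy)
```

Every instance is tested exactly once by a model that was fitted without it, so the mean accuracy indicates how well the model generalizes beyond the data used to fit it.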

2.6 Theoretical Framework

This chapter has discussed some theories of ditransitivity, semantic verb class, syntactic complexity, animacy of theme and beneficiary, discourse givenness of theme and beneficiary, pronominality of theme and beneficiary, concreteness of theme, person of beneficiary, number of theme and beneficiary, and definiteness of theme and beneficiary. This part shows how the theories help answer the research questions.

The theory of ditransitivity tells us that the construction includes the dative and the benefactive. It is very important to understand the nature of both dative and benefactive and how they differ from one another. This theory is the basis used to determine the dependent variable and is useful in answering both research questions. The theory describes the alternation of the benefactive, which can take the benefactive PP or the double object construction. Similarly, it tells us how the dative construction may appear as a prepositional dative or a double object construction. The issue of ditransitivity, especially in benefactive cases, is central to this research, whose analysis aims to find the significant features affecting the construction. The theories in Chapter 2 of semantic verb class, syntactic complexity, animacy of theme and beneficiary, discourse givenness of theme and beneficiary, pronominality of theme and beneficiary, concreteness of theme, person of beneficiary, number of theme and beneficiary, and definiteness of theme and beneficiary are also central, as these features are treated as independent variables, which are the predictors or parameters of the occurrences of the benefactive construction. The research draws on these theories to decide how the features, most of which are basically nominal, can be coded using binary values, while one feature is coded using an ordinal/continuous scale value. Therefore, the results of annotating these relevant features become the guidelines to reveal the probability of benefactive construction occurrences.
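The coding step can be pictured with a small Python sketch. It is a hedged illustration only: the annotation labels, the feature names, and the choice of syntactic complexity (measured here as a difference in word counts between theme and beneficiary) as the scale feature are assumptions made for the example, not the coding scheme of this research.

```python
def code_instance(annotation):
    """Turn one annotated instance into numeric predictors.

    Nominal features become binary values (1 or 0); the assumed scale
    feature is the word-count difference between theme and beneficiary.
    """
    theme_length = len(annotation["theme"].split())
    beneficiary_length = len(annotation["beneficiary"].split())
    return {
        "animate_beneficiary": 1 if annotation["beneficiary_animacy"] == "animate" else 0,
        "pronominal_beneficiary": 1 if annotation["beneficiary_form"] == "pronoun" else 0,
        "definite_theme": 1 if annotation["theme_definiteness"] == "definite" else 0,
        "length_difference": theme_length - beneficiary_length,
    }

# Hypothetical annotation of "She baked him a big cake".
annotation = {
    "theme": "a big cake",
    "beneficiary": "him",
    "beneficiary_animacy": "animate",
    "beneficiary_form": "pronoun",
    "theme_definiteness": "indefinite",
}

coded = code_instance(annotation)
print(coded)
```

Coded in this way, each instance becomes a row of numbers that a logistic regression can take as input.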

The related studies presented in this chapter provide the researcher with examples of corpus and probabilistic grammar research. These studies function as guidelines in conducting the corpus probabilistic model research. This research places itself among those related works, contributing an insight into how well a corpus probabilistic model can predict the benefactive and the dative at the same time. Hence, this study relies on the basic principle of corpus linguistic study, the empirical data of language use.


CHAPTER 3

RESEARCH METHODOLOGY

This chapter discusses the methodology of the research, and it is divided into three parts: research type, research data, and data analysis. The research type part explains the kind of study this research belongs to. The research data part describes the nature and the origin of the data used in this study. This part also explains how the data were collected and processed. The last part, data analysis, explains how the data were processed and interpreted.

3.1 Research Type

In general, the research belongs to the domain of corpus linguistic study, for several reasons. The first reason is that the research used a collection of data that is claimed to be natural, namely a corpus. The research is empirical, analyzing the authentic patterns of use in natural texts (Biber, 1998). The natural data is also said to be ‘real world’ text, as the instances in the corpus are simply taken from real usage of language. The corpus machine gives access to naturalistic language information, to texts which are products of real-life situations. This research took the natural language information from the corpus to build a probabilistic model of the grammar of real-life language users.

The second reason is that the research used the corpus as the foundation of analysis. The research utilized a large and principled collection of natural texts, known as a “corpus”, as the basis of analysis (Biber, 1998). The research was carried out in the domain of the modern form of corpus linguistics, where the collection of the data relies heavily on the use of computers. The research gathered the instances from a corpus of millions of words. The research analyzed the cases of benefactive construction which appeared in the COHA data set. The COHA corpus contains four hundred million words and is available on the Corpus BYU web site. The corpus consists of a large number of naturally occurring texts. Although the data is sometimes claimed not to represent the entire language, the validity tests done on the data help to show that the corpus information represents most of the language use in real-life situations.

The third reason is that this research employed both automatic and manual procedures. Corpus linguistics research makes extensive use of computers for analysis, using both automatic and interactive techniques (Biber, 1998). In this research, the computer played an important role in selecting the data from the corpus machine. The electronically readable corpus reduced the time needed to find a particular construction, in this case the benefactive construction. The researcher only needed to enter a query string, and the instances matching the query appeared on the computer screen. The researcher, however, still needed to manually select, recheck, and annotate the occurrences of the benefactive construction.

Finally, this research used both qualitative and quantitative analytical techniques. The research included the process of annotating the linguistic features of the instances. The research also involved the analysis of the significance of the features to the choice of benefactive construction. The direction and the size of the effect of the relevant features were described qualitatively. At the same time, the research observed the frequency of the benefactive occurrences. The research also involved binary-coded features and a continuous-scale feature for the dependent and independent variables. The coefficient (B), standard error (S.E.), odds ratio (exp(B)), and 95% CI of the linguistic features were obtained from the numerical analysis in SPSS. This kind of analysis belongs to the domain of quantitative research. This confirms one of the characteristics of corpus linguistic study suggested by Biber (1998).
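The relation between these quantities can be shown with a short Python sketch. It assumes the usual Wald interval, in which the odds ratio is exp(B) and the 95% CI of the odds ratio is exp(B ± 1.96 × S.E.); the coefficient and standard error below are hypothetical values, not results of this research.

```python
import math

def odds_ratio_with_ci(b, se, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a logistic coefficient.

    Assumes a Wald interval on the coefficient scale, exponentiated
    to the odds-ratio scale.
    """
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Hypothetical coefficient (B) and standard error (S.E.).
odds_ratio, lower, upper = odds_ratio_with_ci(1.2, 0.4)
print(round(odds_ratio, 3), round(lower, 3), round(upper, 3))
```

An interval for the odds ratio that excludes 1 corresponds to a coefficient that differs significantly from 0 at the 5% level, which is how the direction and size of a feature's effect can be read from such output.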

In addition, the research also belongs to the domain of probabilistic grammar. Theoretically, the research adopts the idea of Bresnan (2007) of applying a dynamic probabilistic grammar (Bybee & Hooper 2001; Bod et al. 2003; Gahl & Garnsey 2006; Gahl & Yu 2006) to the domain of syntactic variation. Besides aiming at finding the significant features relevant to the choice of benefactive construction, this research also tried to predict the occurrences of certain instances based on the features found. Given the coefficients of the significant features, the probabilistic model was able to predict which construction tends to be used by speakers.

3.2 Research Data

The data of the research were taken from the Corpus of Historical American English (COHA), which can be accessed through the web page http://corpus.byu.edu/coha/. The corpus was created in 2007 by Mark Davies, a linguistics professor at Brigham Young University, with funding from the US National Endowment for the Humanities. The corpus provides 400 million words of American English text and covers the period from 1810 to 2009. The diachronic data was taken because it presents the use of the benefactive constructions over the years, so that any possible change of usage is covered. The idea of taking



275  0.99663  predicted DO  1  1  accurate
276  0.42556  predicted PP  0  0  accurate
277  0.90721  predicted DO  1  0  inaccurate
278  0.72112  predicted DO  1  0  inaccurate
279  0.98879  predicted DO  1  1  accurate
280  0.98738  predicted DO  1  0  inaccurate
281  0.90721  predicted DO  1  0  inaccurate
282  0.99899  predicted DO  1  1  accurate
283  0.89660  predicted DO  1  0  inaccurate
284  0.68783  predicted DO  1  0  inaccurate
285  0.99899  predicted DO  1  1  accurate
286  0.71300  predicted DO  1  0  inaccurate
287  0.99970  predicted DO  1  1  accurate
288  0.68783  predicted DO  1  0  inaccurate
289  0.97040  predicted DO  1  0  inaccurate
290  0.00159  predicted PP  0  0  accurate
291  0.99899  predicted DO  1  1  accurate
292  0.99991  predicted DO  1  1  accurate
293  0.99965  predicted DO  1  1  accurate
294  0.18694  predicted PP  0  0  accurate
295  0.99304  predicted DO  1  0  inaccurate
296  0.72112  predicted DO  1  0  inaccurate
297  0.43536  predicted PP  0  0  accurate
298  0.05520  predicted PP  0  0  accurate
299  0.99979  predicted DO  1  1  accurate
300  0.86989  predicted DO  1  1  accurate
301  0.89283  predicted DO  1  0  inaccurate
302  0.99663  predicted DO  1  1  accurate
303  0.99791  predicted DO  1  1  accurate
304  0.90721  predicted DO  1  0  inaccurate
305  0.99663  predicted DO  1  1  accurate
306  0.99663  predicted DO  1  1  accurate
307  0.99899  predicted DO  1  1  accurate
308  0.99663  predicted DO  1  1  accurate
309  0.99981  predicted DO  1  1  accurate
310  0.99791  predicted DO  1  1  accurate
311  0.99663  predicted DO  1  1  accurate
312  0.99938  predicted DO  1  1  accurate
313  0.05520  predicted PP  0  0  accurate
314  0.43536  predicted PP  0  0  accurate
315  0.99938  predicted DO  1  1  accurate
316  0.99981  predicted DO  1  1  accurate
317  0.99304  predicted DO  1  1  accurate
318  0.99999  predicted DO  1  1  accurate
319  0.99304  predicted DO  1  1  accurate
320  0.99791  predicted DO  1  1  accurate
321  0.39652  predicted PP  0  0  accurate
322  0.99938  predicted DO  1  1  accurate
323  0.99791  predicted DO  1  1  accurate
324  0.99791  predicted DO  1  1  accurate
325  0.99791  predicted DO  1  1  accurate
326  0.99184  predicted DO  1  0  inaccurate
327  0.91759  predicted DO  1  0  inaccurate
328  0.99605  predicted DO  1  0  inaccurate
329  0.99981  predicted DO  1  1  accurate
330  0.99683  predicted DO  1  0  inaccurate
331  0.99791  predicted DO  1  1  accurate
332  0.39652  predicted PP  0  0  accurate
333  0.98687  predicted DO  1  1  accurate
334  0.95730  predicted DO  1  0  inaccurate
335  0.99899  predicted DO  1  1  accurate
336  0.99899  predicted DO  1  1  accurate
337  0.99663  predicted DO  1  1  accurate
338  0.99663  predicted DO  1  1  accurate
339  0.99755  predicted DO  1  1  accurate
340  0.99899  predicted DO  1  1  accurate
341  0.99993  predicted DO  1  0  inaccurate
342  0.98879  predicted DO  1  1  accurate
343  0.99994  predicted DO  1  1  accurate
344  0.99791  predicted DO  1  1  accurate
345  0.01712  predicted PP  0  0  accurate
346  0.87325  predicted DO  1  0  inaccurate
347  0.99791  predicted DO  1  1  accurate
348  0.97040  predicted DO  1  0  inaccurate
349  0.99663  predicted DO  1  0  inaccurate
350  0.67261  predicted DO  1  0  inaccurate
351  0.89660  predicted DO  1  0  inaccurate
352  0.39652  predicted PP  0  0  accurate
353  0.68783  predicted DO  1  0  inaccurate
354  0.99970  predicted DO  1  1  accurate
355  0.97040  predicted DO  1  0  inaccurate
356  0.00517  predicted PP  0  0  accurate
357  0.06416  predicted PP  0  0  accurate
358  0.05520  predicted PP  0  0  accurate
359  0.05679  predicted PP  0  0  accurate
360  0.87325  predicted DO  1  0  inaccurate
361  0.39652  predicted PP  0  0  accurate
362  0.39652  predicted PP  0  0  accurate
363  0.16383  predicted PP  0  0  accurate
364  0.95851  predicted DO  1  0  inaccurate
365  0.69424  predicted DO  1  0  inaccurate
366  0.16383  predicted PP  0  0  accurate
367  0.89283  predicted DO  1  0  inaccurate
368  0.43536  predicted PP  0  0  accurate
369  0.05679  predicted PP  0  0  accurate
370  0.05679  predicted PP  0  0  accurate
371  0.91759  predicted DO  1  0  inaccurate
372  0.99791  predicted DO  1  0  inaccurate
373  0.05679  predicted PP  0  0  accurate
374  1.00000  predicted DO  1  1  accurate
375  0.43536  predicted PP  0  0  accurate
376  0.95851  predicted DO  1  0  inaccurate
377  0.16798  predicted PP  0  0  accurate
378  0.95851  predicted DO  1  0  inaccurate
379  0.16383  predicted PP  0  0  accurate
380  0.95851  predicted DO  1  0  inaccurate
381  0.68783  predicted DO  1  0  inaccurate
382  0.39652  predicted PP  0  0  accurate
383  0.39652  predicted PP  0  0  accurate
384  0.05679  predicted PP  0  0  accurate
385  0.95851  predicted DO  1  0  inaccurate
386  0.39652  predicted PP  0  0  accurate
387  0.74460  predicted DO  1  0  inaccurate
388  0.01764  predicted PP  0  0  accurate
389  0.43536  predicted PP  0  0  accurate
390  0.16383  predicted PP  0  0  accurate
391  0.43536  predicted PP  0  0  accurate
392  0.99899  predicted DO  1  0  inaccurate
393  0.72112  predicted DO  1  0  inaccurate
394  0.95730  predicted DO  1  0  inaccurate
395  0.39652  predicted PP  0  0  accurate
396  0.96121  predicted DO  1  0  inaccurate
397  0.99938  predicted DO  1  1  accurate
398  0.96675  predicted DO  1  0  inaccurate
399  0.05679  predicted PP  0  0  accurate
400  0.16798  predicted PP  0  0  accurate

Accurate predictions: 242 of 400 = 60.5%
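The accuracy figure can be recomputed from the listed rows by counting the instances whose predicted construction matches the observed one. The Python sketch below is an illustration only: it uses just rows 275 to 279 of the table above as a sample, and it assumes that a probability above 0.5 is read as a double object prediction.

```python
def accuracy(rows, threshold=0.5):
    """Share of rows whose prediction (1 if p > threshold, else 0) equals the observed value."""
    correct = sum(1 for p, observed in rows if (1 if p > threshold else 0) == observed)
    return correct / len(rows)

# (p DO, observed) pairs copied from rows 275 to 279 of the table above.
sample_rows = [
    (0.99663, 1),
    (0.42556, 0),
    (0.90721, 0),
    (0.72112, 0),
    (0.98879, 1),
]

sample_accuracy = accuracy(sample_rows)

# Overall figure reported for the full set of 400 instances.
overall_accuracy = 242 / 400
print(sample_accuracy, overall_accuracy)
```

Three of the five sample rows are predicted correctly, and the same count over all 400 instances gives the reported 60.5%.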

Accuracy of Benefactive Probabilistic Model on Dative Constructions in TIME

No  p DO  predicted  expected  observed  accuracy
1  0.83327  predicted DO  1  1  accurate
2  0.08778  predicted PP  0  1  inaccurate
3  0.34052  predicted PP  0  1  inaccurate
4  0.03989  predicted PP  0  1  inaccurate
5  0.18831  predicted PP  0  1  inaccurate
6  0.13693  predicted PP  0  1  inaccurate
7  0.60563  predicted DO  1  1  accurate
8  0.02144  predicted PP  0  1  inaccurate
9  0.01378  predicted PP  0  1  inaccurate
10  0.18831  predicted PP  0  1  inaccurate
11  0.11910  predicted PP  0  1  inaccurate
12  0.90016  predicted DO  1  1  accurate
13  0.02251  predicted PP  0  0  accurate
14  0.05064  predicted PP  0  1  inaccurate
15  0.60563  predicted DO  1  1  accurate
16  0.10997  predicted PP  0  1  inaccurate
17  0.42043  predicted PP  0  1  inaccurate
18  0.73478  predicted DO  1  1  accurate
19  0.10997  predicted PP  0  1  inaccurate
20  0.00098  predicted PP  0  0  accurate
21  0.00570  predicted PP  0  0  accurate
22  0.28680  predicted PP  0  1  inaccurate
23  0.83327  predicted DO  1  1  accurate
24  0.22253  predicted PP  0  1  inaccurate
25  0.60563  predicted DO  1  1  accurate
26  0.11910  predicted PP  0  1  inaccurate
27  0.62691  predicted DO  1  1  accurate
28  0.60563  predicted DO  1  1  accurate
29  0.60563  predicted DO  1  1  accurate
30  0.00570  predicted PP  0  0  accurate
31  0.18831  predicted PP  0  1  inaccurate
32  0.58880  predicted DO  1  1  accurate
33  0.73478  predicted DO  1  1  accurate
34  0.20246  predicted PP  0  1  inaccurate
35  0.83327  predicted DO  1  1  accurate
36  0.18228  predicted PP  0  1  inaccurate
37  0.90796  predicted DO  1  1  accurate
38  0.60563  predicted DO  1  1  accurate
39  0.00731  predicted PP  0  1  inaccurate
40  0.05064  predicted PP  0  1  inaccurate