
Economics of Education Review 20 (2001) 81–92
www.elsevier.com/locate/econedurev

Can flexible non-linear modeling tell us anything new about educational productivity?

Bruce D. Baker
Department of Teaching and Leadership, The University of Kansas, Lawrence, KS 66045, USA

Received 28 November 1998; accepted 24 March 1999

Abstract

The objective of this study is to test, under relatively simple circumstances, whether flexible non-linear models (including neural networks and genetic algorithms) can reveal otherwise unexpected patterns of relationship in typical school productivity data. Further, it is my objective to identify useful methods by which “questions raised” by flexible modeling can be explored with respect to our theoretical understandings of educational productivity. This study applies three types of algorithm, Backpropagation, Generalized Regression Neural Networks (GRNN) and Group Method of Data Handling (GMDH), alongside linear regression modeling to school-level data on 183 elementary schools. The study finds that flexible modeling does raise unique questions in the form of identifiable non-linear relationships that go otherwise unnoticed when applying conventional methods. © 2001 Elsevier Science Ltd. All rights reserved.

JEL classification: I21

Keywords: Neural networks; Functional form

1. Introduction

Production function applications to educational research have gradually, but steadily, evolved since the Coleman Report of 1966. Among the current trends are increased emphasis on student-level data (Goldhaber & Brewer, 1996), school-level data (Harter, 1999; Murnane & Levy, 1996), greater understanding of the hierarchical design of our system of schooling (Kaplan & Elliott, 1997) [1], and the relevance of the structural nature of direct and indirect relationships within that system (Kaplan & Elliott, 1997). In addition, substantial emphasis has been placed on identifying more useful measures of the outcome, “educational productivity”. Progress in outcome measurement has, however, led to a divergence rather than a convergence of philosophies, with current preferences ranging from economic impacts on the labor market and earnings (Betts, 1996; Card & Krueger, 1996) to more basal school achievement measures such as minimal concept mastery (Harter, 1999).

[1] Although not formally a production function study, Kaplan and Elliott’s hierarchical structural model for validating policy indicators explores the sensitivity of schooling performance outcomes to changes in sets of policy indicators in a way that parallels the production function theoretical framework.

E-mail address: bdbaker@ukans.edu (B.D. Baker).
0272-7757/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0272-7757(99)00051-5

Despite the apparent substantive progress made with respect to conceptual and methodological concerns, a few basic rules continue to govern production function methodologies. First, production function studies are typically performed within the narrow confines of formal deductive hypothesis testing.
That is, the researcher begins with a question along the lines of: “Are school-level instructional expenditures per pupil related to student achievement outcomes?” Next, the researcher establishes his/her hypothesis, based on prior research and theoretical assumptions regarding the expected outcomes, and constructs a statistical model for testing the hypothesis. Although a well-understood and generally accepted paradigm, this purely deductive approach presents certain potential difficulties to the researcher. For one, this approach requires that the researcher has or finds some prior knowledge as to how the system in question works. This knowledge may ultimately be rooted in anything from valid theoretical constructs to personal or political biases, the latter becoming more prevalent when dealing with politically heated issues like educational productivity.

Problems associated with a priori understanding of the system are compounded when that understanding is applied to the development and application of a statistical model for hypothesis testing. Typically, the production function is expressed as follows:

$$f(Q, X \mid S) = 0, \tag{1}$$

such that outcomes, $Q$, are a function of schooling inputs, $X$, and non-school inputs, $S$. This function is most often analyzed in the form of a linear regression equation:

$$Q_{ij} = \beta X_{ij} + \gamma S_j + \varepsilon_{ij}, \tag{2}$$

where $Q_{ij}$ is the outcome of student $i$ in school $j$, $X_{ij}$ are the schooling inputs to that student, $S_j$ is a vector of non-schooling inputs and $\varepsilon_{ij}$ is a stochastic error term.

Linear regression applications to production function modeling and related estimation procedures are limited in a variety of ways. These limitations include, but are not limited to, difficulties with the selection of model parameters. The usefulness of linear regression models lies in our ability to interpret individual regression coefficients, their statistical significance and respective magnitudes.
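As a minimal sketch of how the linear regression form of the production function above is typically estimated, the following Python fragment recovers the two coefficients by ordinary least squares. The synthetic data, the true coefficient values (2.0 and 1.5), the reduction of the schooling and non-schooling inputs to one variable each, and the function name `ols_two_inputs` are all illustrative assumptions, not taken from the study.

```python
import random

def ols_two_inputs(q, x, s):
    """Least-squares estimates (beta, gamma) for q = beta*x + gamma*s + e.

    Solves the 2x2 normal equations directly; there is no intercept,
    mirroring the regression equation in the text.
    """
    sxx = sum(a * a for a in x)
    sss = sum(a * a for a in s)
    sxs = sum(a * b for a, b in zip(x, s))
    sxq = sum(a * b for a, b in zip(x, q))
    ssq = sum(a * b for a, b in zip(s, q))
    det = sxx * sss - sxs * sxs
    beta = (sxq * sss - sxs * ssq) / det
    gamma = (sxx * ssq - sxs * sxq) / det
    return beta, gamma

# Hypothetical data: 200 students, one schooling input x, one non-school input s.
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(200)]
s = [rng.uniform(0, 10) for _ in range(200)]
q = [2.0 * xi + 1.5 * si + rng.gauss(0, 0.5) for xi, si in zip(x, s)]

beta_hat, gamma_hat = ols_two_inputs(q, x, s)
print(f"beta_hat={beta_hat:.3f}, gamma_hat={gamma_hat:.3f}")
```

The interpretability described above rests on reading each estimate as a constant marginal effect: here, one more unit of the schooling input is associated with roughly `beta_hat` more units of outcome at every level of that input.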
More meaningful models tend to be those that are parsimonious, addressing necessarily narrow questions as exemplified by Goldhaber and Brewer (1996) in their study of the effects of teacher characteristics on student performance outcomes. While such studies provide valuable insights with respect to the question at hand, the necessity to repeatedly narrow research questions to this degree increases the probability that education researchers and economists may miss potentially important questions.

Other studies have relied on data dumping [2] of massive numbers of potential inputs, neglecting the effects of multicollinearity on both the magnitude and significance of the regression coefficients of interest. Harter (1999), for example, separately includes salaries and benefits for each category of personnel in Texas schools, eventually concluding that none of them is significant, but finding more obscure measures such as salary supplements to be positively related to performance and substitute pay to be negatively related to performance. [3] While data dumping may in some ways serve as a useful preliminary step in such complex analyses, it is unlikely to yield clear, definitive or even useful results in linear regression modeling. In addition, tools such as step-wise regression for selecting more parsimonious linear regression models from among the various predictors are generally inadequate.

[2] I choose the term dumping here where others might use data mining. The intent is not to choose a more derogatory term, but to use a term that emphasizes the distinct difference between this method and data “mining” methods discussed later. Data dumping, as it is used herein, refers to attempting to include all possible variables in a single model, whereas data mining is used to describe a process of sifting through all possible variables to find potential relationships that may be modeled.

Selection of functional form is similarly problematic in regression modeling.
Linear regression modeling, by definition, seeks to identify linear relationships between specified input and outcome measures. [4] That is, relationships are assessed on the extent to which a unit increase in X is associated with a constant change in Y. Hanushek (1996, p. 55), for example, discusses 90 studies which collectively generate 377 attempts to estimate a linear relationship, or some highly restricted variant [5], between teacher–pupil ratios and/or teacher education and student performance. Hanushek concludes that no systematic linear relationship exists. We would perhaps be wise to consider the possibility, if not the probability, that within matrices of data on schooling productivity, there are actually some non-linear relationships that are “tighter” [6] than some linear relationships. These relationships, where their curvilinear nature substantially violates assumptions of linearity, may go unrecognized or their magnitude underestimated when using linear methods (Cohn & Geske, 1990, p. 166). The only way to identify these relationships via conventional methods is to know, or at least expect, in advance that they exist and integrate them into econometric models as higher-order terms or alternative functional forms.

[3] Harter (1999, p. 294).
[4] Similarly, in structural models such as those applied by Kaplan and Elliott (1997), each direct effect in the structural equation model is represented as a linear relationship. Although combinations of direct and indirect effects may yield non-linearities, structural equation models do not explicitly allow for sets of non-linear direct effects. It is presumed that these relationships could be accommodated by either data re-scaling (log–log relationships) or the inclusion of higher-order terms (squared, cubed, etc.).
[5] For example, linear relationships between logged (ln) terms and other functional forms to be discussed later.
[6] In terms of R-squared if fitted with a curve.

A common a priori assumption of non-linearity rooted in economic theory is that of diminishing returns. As noted by Betts (1996, p. 163), “the education production function, like all well-behaved production functions [emphasis added] is subject to diminishing returns. This behavior is generally well captured by applying a log–log specification of wages relative to per pupil spending”. Others have replaced the log of spending with a quadratic function, achieving a similar interpretation (Johnson & Stafford, 1973). More recently, Figlio (1999) questioned the effectiveness of highly restrictive specifications of functional form for estimating education production functions, noting in particular the usefulness of more flexible estimation procedures. [7]

Betts’ choice of the phrase “well-behaved” is indicative of the standard mindset with which we approach production function modeling. This common econometric phrase suggests that our primary objective as researchers is to determine the extent to which reality, or data generated by the underlying processes of our reality, “behaves” according to the mathematical specification of our mental model of that process. [8] An underlying presumption is that if the data fail to conform to our model, there is either some flaw in the data or the system, rather than a flaw in our mental model. In light of this perspective, an appropriate re-framing of Hanushek’s (1996) conclusion might be: “We have yet to generate statistical findings to support that the relationship between teacher–pupil ratios and/or teacher education and student performance conforms to our mental model for that relationship”. [9]

Complementary inductive methods and analytical tools do exist, some of which can specifically provide support in the areas of parameter selection and identification of potential non-linear forms. The methods demonstrated in this study fall under the broad analytic umbrella of data mining.
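The earlier point that a curve can fit “tighter” than a line, in terms of R-squared, can be illustrated with a small sketch. The fragment below generates hypothetical data with diminishing returns and compares a strictly linear fit against a fit on logged spending. The data-generating process, the variable names and the helper `fit_r2` are assumptions made for illustration, and the semi-log form used here is a simplification of the log–log specification that Betts describes.

```python
import math
import random

def fit_r2(u, y):
    """Fit y = a + b*u by least squares and return the R-squared of the fit."""
    n = len(u)
    mean_u, mean_y = sum(u) / n, sum(y) / n
    suu = sum((ui - mean_u) ** 2 for ui in u)
    suy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y))
    b = suy / suu
    a = mean_y - b * mean_u
    ss_res = sum((yi - (a + b * ui)) ** 2 for ui, yi in zip(u, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical schools: outcome rises with spending, but at a diminishing rate.
rng = random.Random(1)
spending = [rng.uniform(1, 100) for _ in range(183)]
outcome = [3.0 * math.log(x) + rng.gauss(0, 0.3) for x in spending]

r2_linear = fit_r2(spending, outcome)                      # outcome on spending
r2_log = fit_r2([math.log(x) for x in spending], outcome)  # outcome on ln(spending)
print(f"R2 linear={r2_linear:.3f}, R2 log={r2_log:.3f}")
```

In this constructed case the logged specification fits better only because the analyst already knew to try it, which is the limitation of conventional methods that the text goes on to address.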
Data mining is the process of exploring the available data for patterns and relationships (Lemke, 1997). Data mining activities range from the visual exploration of bivariate scatterplots, often done as a preliminary to formal econometric modeling, to the use of iterative pattern learning algorithms or neural networks to search for potential relationships in data sets. While it is presumed that the development of most econometric models involves a great deal of inductive tinkering by the researcher, it is also presumed that the researcher cannot efficiently explore all possibilities or conceptualize the plethora of non-linear relationships that may exist. In addition, the human researcher brings with him/her the baggage of personal and political predisposition as to what the data should say.

[7] In particular, Figlio finds that a more flexible transcendental logarithmic (translog) functional form (Christensen, Jorgensen & Lau, 1971) provides more sensitive estimates of the spending-to-achievement relationship than a more restrictive Cobb–Douglas specification. See also Douglas and Sulock (1995).
[8] The intent of this criticism is not to make a particular example of Betts, but to exemplify how deeply rooted and broadly accepted this mindset has become. The phrasing chosen by Betts is indeed standard. It just happens that he was the author of the passage that I chose to cite.
[9] And even more specifically, that the relationship fails to conform to our “restricted mathematical representation of our existing mental model”.

Thus, this study explores the use of flexible non-linear modeling, including neural networks, as a supplement to the typical preliminary activities of induction, and as a complement to conventional deductive production function analysis. Broadly speaking, neural networks are iterative pattern learning algorithms modeled after the physiology of human cognitive processes.
Unfortunately, the term “neural network” is also frequently misused as an overarching classification encompassing other types of algorithms, including genetic algorithms, that achieve similar ends, but by different means. This study applies both neural and genetic algorithms and refers to them collectively as flexible non-linear models.

Applied to econometrics, flexible non-linear models are free of a priori assumptions of functional form, deriving deterministic equations from available data and selecting predictors that best serve the modeling objective: prediction accuracy. Cross-sectional predictive and time-series forecasting accuracy of flexible non-linear models has been validated in the fields of medicine (Buchman, Kubos, Seidler & Siegforth, 1994), real estate valuation (Worzala, Lenk & Silva, 1995), bankruptcy assessment (Odom & Sharda, 1994) and forecasting education spending (Baker & Richards, 2000). Others have noted the potential usefulness of neural networks for exploring data in social science research (Liao, 1992).

The objective of this study is to test whether flexible non-linear models can reveal otherwise unexpected patterns of relationship in typical school productivity data. This study builds on the work of Figlio (1999) by combining flexible functional form with inductive estimation algorithms to provide a more sensitive estimation of potential relationships in the given data set. The ultimate goal of this exercise is to identify useful methods and develop a framework by which “questions raised” by flexible modeling can be explored with respect to our theoretical understandings of educational productivity. This study applies three types of flexible estimation procedure, alongside linear regression modeling, to school-level data on 183 elementary schools.
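To give a sense of what “free of a priori assumptions of functional form” can mean in practice, the sketch below implements one flexible estimator in its simplest form. It assumes the standard Gaussian-kernel formulation of a generalized regression neural network, in which a prediction is a kernel-weighted average of observed outcomes. The one-dimensional input, the smoothing width `sigma`, the inverted-U data and the function name `grnn_predict` are illustrative assumptions; this section does not describe the software or settings the study actually used.

```python
import math

def grnn_predict(x_train, y_train, x_new, sigma=0.5):
    """Predict at x_new as a Gaussian-kernel-weighted average of y_train.

    No functional form is specified in advance; the shape of the fitted
    relationship comes entirely from the training data and the width sigma.
    """
    weights = [math.exp(-((x_new - xi) ** 2) / (2.0 * sigma ** 2)) for xi in x_train]
    total = sum(weights)
    return sum(w * yi for w, yi in zip(weights, y_train)) / total

# Hypothetical inverted-U relationship observed on a grid of input values.
x_train = [i * 0.25 for i in range(41)]               # 0.0, 0.25, ..., 10.0
y_train = [25.0 - (xi - 5.0) ** 2 for xi in x_train]  # peaks at x = 5

peak = grnn_predict(x_train, y_train, 5.0)
edge = grnn_predict(x_train, y_train, 1.0)
print(f"prediction at x=5: {peak:.2f}, at x=1: {edge:.2f}")
```

A straight line fitted to these same points would be flat, since the relationship is symmetric about its peak, whereas the kernel-weighted estimate follows the curve without being told its form. The cost is that no single coefficient summarizes the relationship.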
