Quantitative vs Qualitative factors

Introduction to Econometrics

  Ekki Syamsulhakim Undergraduate Program Department of Economics

  

factors

  • In previous chapters, the dependent and independent variables in our multiple regression models have had quantitative meaning.
  • Just a few examples include hourly wage rate, years of education, college grade point average, amount

  

factors

  • In empirical work, we must also incorporate qualitative factors into regression models.
  • The gender or race of an individual, the industry of a firm (manufacturing, retail, etc.), and the region in the United States where a city is located are all qualitative factors.

  

Variables

  • Most of this chapter is dedicated to qualitative independent variables.
  • Qualitative factors often come in the form of binary information:
    – a person is female or male;
    – a person does or does not own a personal computer;
    – a firm offers a certain kind of employee pension plan or it does not;

Dummy Variable

  • In all of these examples, the relevant information can be captured by defining a binary variable or a zero-one variable.
  • In econometrics, binary variables are most commonly called dummy variables, although this name is not especially descriptive.
  • In defining a dummy variable, we must decide which event is assigned the value one and which is assigned the value zero.

  

Naming Dummy Variable

  • The variable name (for dummy variables) indicates the event with the value one.
  • For example, in a study of individual wage determination, we might define female to be a binary variable taking on the value one for females and the value zero for males.
  • The same information is captured by defining male to be one if the person is male and zero if the person is female.

  

Naming Dummy Variable

  • Either of these is better than using gender because this name does not make it clear when the dummy variable is one:
    – does gender = 1 correspond to male or female?
  • Suppose in the wage example that we have chosen the name female to indicate gender.
  • Further, we define a binary variable married to equal one if a person is married and zero otherwise.
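As a minimal sketch of the naming convention, the snippet below builds the dummies female and married from raw records. The record layout, the field names gender and marital_status, and the helper make_dummy are assumptions made for illustration; they are not part of the course data.

```python
# Hypothetical raw survey records; field names and values are assumptions.
people = [
    {"gender": "female", "marital_status": "married"},
    {"gender": "male", "marital_status": "single"},
    {"gender": "female", "marital_status": "single"},
]


def make_dummy(value, event):
    """Return 1 if `value` equals the named event, else 0."""
    return 1 if value == event else 0


# The dummy's name states the event that is coded as one.
for person in people:
    person["female"] = make_dummy(person["gender"], "female")
    person["married"] = make_dummy(person["marital_status"], "married")

female = [p["female"] for p in people]
married = [p["married"] for p in people]
print(female, married)
```

Because each variable is named after the event coded as one, a reader can tell what female = 1 means without consulting a codebook.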

  

Variable

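  • In the simplest case, with a single dummy explanatory variable, we just add it as an independent variable in the equation:
    $wage = \beta_0 + \delta_0 female + \beta_1 educ + u$ (7.1)
  • We use $\delta_0$ as the parameter on female in order to highlight the interpretation of the parameters multiplying dummy variables;
    – later, we will use whatever notation is most convenient.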

  

of dummy variable

  • In model (7.1), only two observed factors affect wage: gender and education.
  • Since female = 1 when the person is female, and female = 0 when the person is male, the parameter $\delta_0$ has the following interpretation:
    – $\delta_0$ is the difference in hourly wage between females and males, given the same amount of education (and the same error term u).
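The interpretation of the parameter on female can be written as a difference in conditional expectations. This is a sketch that assumes the usual zero conditional mean condition, $E(u \mid female, educ) = 0$, which the slides do not state explicitly; $\delta_0$ denotes the parameter on female and $\beta_0$, $\beta_1$ the intercept and the parameter on educ.

```latex
\begin{aligned}
E(wage \mid female = 1, educ) &= \beta_0 + \delta_0 + \beta_1\, educ \\
E(wage \mid female = 0, educ) &= \beta_0 + \beta_1\, educ \\
\delta_0 &= E(wage \mid female = 1, educ) - E(wage \mid female = 0, educ)
\end{aligned}
```

The term $\beta_1\, educ$ cancels in the subtraction, which is why the difference is the same at every level of education.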

  

of dummy variable

  • Thus, the coefficient $\delta_0$ determines whether there is discrimination against women:
    – if $\delta_0 < 0$, then, for the same level of other factors, women earn less than men on average.
  • In (7.1), we have chosen males to be the base group or benchmark group, that is, the group against which comparisons are made.


Graphical Analysis

  • The situation can be depicted graphically as an intercept shift between males and females.
  • In Figure 7.1, the case $\delta_0 < 0$ is shown, so that men earn a fixed amount more per hour than women.
  • The difference does not depend on the amount of education, and this explains why the wage-education profiles for women and men are parallel.

  Example: wage1.xlsx

  

Dummy Variable Trap

  • You may wonder why we do not also include in (7.1) a dummy variable, say male, which is one for males and zero for females.
    – The reason is that this would be redundant.
  • In (7.1), the intercept for males is $\beta_0$, and the intercept for females is $\beta_0 + \delta_0$.
  • Since there are just two groups, we only need two different intercepts.
    – This means that, in addition to $\beta_0$, we need to use only one dummy variable; we have chosen to include the dummy variable for females.

Dummy Variable Trap

  • Using two dummy variables would introduce perfect collinearity because female + male = 1, which means that male is a perfect linear function of female.
  • Including dummy variables for both genders is the simplest example of the so-called dummy variable trap, which arises when too many dummy variables describe a given number of groups.
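The collinearity can be checked numerically. The sketch below uses a small made-up sample (the values are assumptions, not the course data) and shows that the cross-product matrix X'X is singular when an intercept, female, and male are all included, so the OLS normal equations have no unique solution. The helpers gram, det2, and det3 are written here for illustration only.

```python
# Illustrative sample (made-up values): 1 = female, 0 = male.
female = [1, 0, 0, 1, 1, 0]
male = [1 - f for f in female]   # male is an exact linear function of female
const = [1] * len(female)        # the intercept column


def gram(columns):
    """Return X'X for a design matrix given as a list of columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


xtx_trap = gram([const, female, male])  # intercept plus both dummies
xtx_ok = gram([const, female])          # intercept plus one dummy

print(det3(xtx_trap))  # zero: X'X cannot be inverted
print(det2(xtx_ok))    # nonzero: the coefficients are identified
```

Dropping either dummy, or dropping the intercept instead, removes the exact linear dependence.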

  

Interpretation of Dummy Variables Continues…

  • Interpreting Coefficients on Dummy Explanatory Variables When the Dependent Variable Is log(y)
  • Using Dummy Variables for Multiple Categories
  • Incorporating Ordinal Information by Using Dummy Variables

  

When the Dependent Variable Is log(y)

  


  

For Multiple Categories

  • We can use several dummy independent variables in the same equation.
  • For example, we could add the dummy variable married to equation (7.9).
  • The coefficient on married gives the (approximate) proportional differential in wages between those who are and are not married, holding gender, educ, exper, and tenure fixed.

  

For Multiple Categories

  • Assuming the coefficient on married is statistically significant, then:

“Suppose two people have the same education, experience, tenure, and gender, but one is married and the other is not; then the married person's wage is, on average, higher”

  

categories

  • Instead of using years of schooling, we use dummy variables:
    – primschd = 1 if the individual finishes primary school and 0 otherwise
    – secondschd = 1 if the individual finishes secondary school and 0 otherwise
    – Univd = 1 if the individual finishes university and 0 otherwise
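A possible way to construct these schooling dummies is sketched below. It assumes that each person is classified by the highest level finished, so the categories are mutually exclusive, and that people who finished no level form the base group; the slides do not spell out either point. The sample values and the helper build_dummies are hypothetical.

```python
# Hypothetical data: highest level of schooling each individual has finished.
highest_level = ["primary", "university", "none", "secondary", "primary"]

# One dummy per non-base category; "none" is the (assumed) base group.
dummy_names = {
    "primary": "primschd",
    "secondary": "secondschd",
    "university": "Univd",
}


def build_dummies(levels, names):
    """Return {dummy_name: [0/1, ...]} with one column per non-base category."""
    return {
        name: [1 if level == category else 0 for level in levels]
        for category, name in names.items()
    }


dummies = build_dummies(highest_level, dummy_names)
for name, column in dummies.items():
    print(name, column)

# Each person has at most one dummy equal to one; base-group members have none.
row_sums = [sum(column[i] for column in dummies.values())
            for i in range(len(highest_level))]
print(row_sums)
```

Leaving out a dummy for the base group is what keeps the model out of the dummy variable trap: with g categories, only g - 1 dummies enter alongside the intercept.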

  

categories

  • Consider 2 individuals having identical experience, tenure, gender, and marital status, but 1 individual has finished university and the other has finished primary school, the person who

by Using Dummy Variables

  • Ordinal variable: a variable whose values represent a rank or ordering
  • An ordinal variable can be something like:
    – Subjective well-being (4 = very happy; 3 = happy; 2 = not happy; 1 = very sad)
    – etc.
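The reason for replacing an ordinal variable with dummies can be made explicit by comparing two specifications. In this sketch, $swb$ stands for the four-point well-being score, $swb2$, $swb3$, $swb4$ are hypothetical dummies for the scores 2, 3, and 4 (score 1 is the base group), and $y$ is whatever outcome is being modelled; none of these names come from the slides.

```latex
\begin{aligned}
\text{score entered directly:}\quad
  & y = \alpha_0 + \alpha_1\, swb + \text{other factors} \\
\text{one dummy per category:}\quad
  & y = \beta_0 + \delta_1\, swb2 + \delta_2\, swb3 + \delta_3\, swb4 + \text{other factors} \\
\text{restriction implied by the first model:}\quad
  & \delta_2 = 2\delta_1, \qquad \delta_3 = 3\delta_1
\end{aligned}
```

Entering the score directly forces each one-step increase to have the same effect, while the dummy specification lets the data decide whether the steps differ.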

  

by Using Dummy Variables

  • Hamermesh and Biddle (1993) used measures of physical attractiveness in a wage equation.
by Using Dummy Variables

  Omar Barkan vs Kiwil

  


  

by Using Dummy Variables

  • Because there are so few people at the two extremes, the authors put people into one of three groups for the regression analysis: average, below average, and above average, where the base group is average.
  • Using data from the 1977 Quality of Employment Survey, after controlling for the usual productivity characteristics,

Variables

  • We can recast the previous model (the model where female and married appear separately) by adding an interaction term between female and married.
  • This allows the marriage premium to depend on gender.

  …previously…

  …with interaction femmar…

  

Variables

  • Setting female = 0 and married = 0 corresponds to the group single men, which is the base group. The intercept for single men is 0.321.
  • We can find the intercept for married men by setting female = 0 and married = 1.

  

Variables

  • We can find the intercept for single women by setting female = 1 and married = 0.
    – This gives an intercept for single women of 0.321 – 0.110 = 0.211.
    – The (log) wage difference between single women and single men is 0.211 – 0.321 = –0.110.

  

Variables

  • We can find the intercept for married women by setting female = 1 and married = 1.
    – This gives an intercept for married women of 0.321 – 0.110 + 0.213 – 0.301 = 0.321 – 0.198 = 0.123.
    – The (log) wage difference between single men and married women is 0.123 – 0.321 = –0.198.

  

of interaction dummy term

Difference table

              FEMALE    MALE
  SINGLE      -0.110    (base)
  MARRIED     -0.198    0.213
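The group intercepts and the differences relative to single men can be reproduced from the four coefficients quoted in the slides (0.321, –0.110 on female, 0.213 on married, –0.301 on the interaction). The sketch below is only a check of that arithmetic; the constant names and the helper group_intercept are made up for illustration, and the coefficients on the other regressors are ignored because they are the same for every group.

```python
# Coefficients quoted in the slides (log-wage equation with interaction).
INTERCEPT = 0.321   # single men, the base group
B_FEMALE = -0.110
B_MARRIED = 0.213
B_FEMMAR = -0.301   # coefficient on female * married


def group_intercept(female, married):
    """Intercept for the group defined by the two 0/1 dummies."""
    return (INTERCEPT + B_FEMALE * female + B_MARRIED * married
            + B_FEMMAR * female * married)


groups = {
    "single men": (0, 0),
    "married men": (0, 1),
    "single women": (1, 0),
    "married women": (1, 1),
}

base = group_intercept(0, 0)
differences = {}
for name, (female, married) in groups.items():
    intercept = group_intercept(female, married)
    differences[name] = round(intercept - base, 3)
    print(f"{name:14s} intercept = {intercept:.3f}  difference = {differences[name]:+.3f}")
```

The marriage premium differs by gender: for men it is 0.213, while for women it is 0.213 – 0.301 = –0.088, which is the point of adding the interaction term.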

  

of interaction dummy term

“Suppose two people have the same education, experience, and tenure, but one is an unmarried man and the other is a married woman; then the married woman's wage is lower, on average, by 17.96%.”
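The 17.96% figure is consistent with converting the log-wage difference of –0.198 into an exact percentage using 100·(exp(d) – 1), rather than reading 100·d directly. The sketch below shows both numbers; the function name exact_percent_difference is an assumption made for illustration.

```python
import math


def exact_percent_difference(log_diff):
    """Exact percentage difference implied by a difference in log(y)."""
    return 100.0 * (math.exp(log_diff) - 1.0)


log_diff = -0.198  # married women vs. single men, from the slides

approx = 100.0 * log_diff                   # quick reading of the coefficient
exact = exact_percent_difference(log_diff)  # exact conversion

print(f"approximate: {approx:.2f}%")
print(f"exact:       {exact:.2f}%")
```

The two readings are close for small coefficients and drift apart as the log difference grows in absolute value, so the exact conversion matters more for large estimated differentials.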