Overview of Factor Analysis
11.2 Overview of Factor Analysis
A guiding principle of science is parsimony, the explanation of relatively complex phenomena in terms of a small number of basic principles. A second guiding principle is that scientific explanations often involve abstract concepts, such as gravity, that themselves are not directly observable. Instead, the existence of these concepts is inferred from measurements and observations of the world in which we live. Factor analysis is a statistical methodology and analysis procedure consistent with these scientific principles.
parsimony: The simplest model to account for the most phenomena.
The purpose of doing a factor analysis is to build a model of the variables of interest expressed in terms of a relatively small number of underlying abstractions, what are called latent variables. A classic example of a latent variable in social science research is an attitude such as Machiavellianism or a personality trait such as Extroversion/Introversion. In a factor analysis these latent variables are operationalized as factors, abstractions not directly observed, but which can be empirically inferred and validated from the correlations between the corresponding measured variables.
latent variable: Unobserved, abstract variable.
factor: The name of a latent variable in factor analysis.
Many studies in social science research use surveys and questionnaires that consist of many items. Although the items may be presented in a randomized order, each attitude or personality characteristic of interest is typically measured by a set of items called a multi-item scale. Each person’s responses to each of the items on such a scale are then summed to form the measurement of the underlying latent variable for that person. In this context the measured variables are the items on the attitude survey, and factor analysis becomes the primary tool for item analysis. Item analysis analyzes the relations between the items and scales to guide the construction of the multi-item scales. Each derived scale typically corresponds to a factor from the factor analysis.
item analysis: Analysis of the relations of items and the corresponding scales.
One of the key issues in a factor analysis is how many factors are needed to account for the relations among the measured variables. Usually the researcher begins with a one-to-one correspondence between the factors and the attitudes of interest. For example, consider the social construct of Machiavellianism as operationalized by the Christie and Geis (1970) 20-item Mach IV scale (Table 1.2, p. 26). If the Mach IV scale assesses the construct of Machiavellianism, then presumably there is a factor that would be revealed by a factor analysis that would indicate a measurement model that links each Mach IV item with this one underlying factor. This analysis would be evidence for a unified construct of Machiavellianism.
The model that relates each measured variable to the factors is a measurement model, a concept first delineated by Charles Spearman (1904) in what has become one of the most influential papers in all of the social sciences. The measurement model that specifies a single Machiavellian factor links each item on the Mach IV scale to this one factor. A one-factor model, called a unidimensional model, postulates just a single dimension, or factor, to underlie the 20 Mach IV items.
measurement model: Relates the observed measures to the latent variables.
unidimensional model: The observed measures have only one latent variable in common.
In the specification of the unidimensional measurement model the response to each scale item is expressed in terms of two attributes. The first attribute is the underlying factor itself, that is, the extent the response to the item is attributable to what is shared with all the other items on the scale. What is not shared with the other items on the scale is unique to a specific item. For the unidimensional model of Machiavellianism the model for each of the 20 Mach IV items can be written with the following regression model (Section 9.2.2, p. 205).
$$m_i = \lambda_i F + u_i$$
This model specifies that the response to the $i$th item on the scale for a given respondent is explained in terms of the underlying attribute shared with all the other scale items, the factor $F$, which presumably is Machiavellianism, plus a contribution, $u_i$, that is unique to that item. The complete unidimensional measurement model in this situation is the set of 20 such equations, one equation for each item.
For example, consider the sixth item on the scale, “Honesty is the best policy in all cases”. The associated model for this item follows.
$$m_{06} = \lambda_{06} F + u_{06}$$
This regression coefficient, $\lambda_i$, the weight of the underlying factor of the common attribute that underlies the responses to the measured variable, is a factor pattern coefficient. The factor analysis of this complete 20-equation model provides an estimate of $\lambda_i$ for each item, that is, an estimate of how strongly the responses to the item directly depend on the one shared common factor.
factor pattern coefficient: Model coefficient that relates the measured item to a factor.
The relation between the factors and measured variables in the measurement model is a causal specification. Each factor is theorized to partially account for, or cause, the response to the item. For example, someone with a high Machiavellian attitude would then respond Agree or Strongly Agree to items that are consistent with Machiavellianism. The extent of the causal impact depends on the quality of the item, both its clarity and the extent to which it reflects the Machiavellian attitude.
The analysis also portrays the extent that the responses depend on the unique qualities of the item, the uniqueness. Each unique portion of the response to an item is the uniqueness term, the underlying basis of the response to the item not shared with the other items on the scale. By definition the unique component of each item is uncorrelated with the unique components of the remaining items. It consists, in part, of random response error. Each administration of the item potentially changes the random response error, much like flipping the same coin 10 times and getting 6 heads, and then flipping the same coin again and getting 4 heads on 10 flips. Random response error is relatively large for ambiguous items. The respondent may infer one meaning on one reading of an ambiguous item and another meaning on another reading. The resulting response to such an item is less a function of the underlying attitude and more a function of random response.
uniqueness: Variance of an item not shared with any other items.
random response error: Component of a response that is attributable to unexplained randomness.
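Uniqueness can also be expressed as a share of an item's variance. The following is a sketch of that decomposition under two assumptions that the text does not state explicitly: the unique term $u_i$ is uncorrelated with the factor $F$ (a standard assumption of the factor model), and both the item and the factor are standardized to unit variance.

```latex
\begin{align*}
\operatorname{Var}(m_i) &= \lambda_i^2 \operatorname{Var}(F) + \operatorname{Var}(u_i)
  && \text{($F$ and $u_i$ uncorrelated)} \\
1 &= \lambda_i^2 + \operatorname{Var}(u_i)
  && \text{(standardized item and factor)} \\
\operatorname{Var}(u_i) &= 1 - \lambda_i^2
\end{align*}
```

For a hypothetical item with $\lambda_i = 0.6$, the shared portion of the variance would be $0.36$ and the uniqueness $0.64$. Under these assumptions, a more ambiguous item with more random response error has a larger uniqueness and a correspondingly smaller $\lambda_i$.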
The uniqueness component also consists of systematic error, the contribution to invalidity regarding the measurement of the underlying attitude. The item may measure something stable, but it measures something other than the attitude of interest. Consider the item “I like chocolate ice cream”. Presumably this item analyzed as the 21st Mach IV item would result in its corresponding responses demonstrating a low value of $\lambda$ and a high uniqueness component. The responses to this ice cream item likely have little to do with Machiavellianism, so the responses would not be related in any meaningful sense with the responses to the actual Mach IV items. The factor analysis presumably would quantify the lack of this relationship with an estimated value of the factor pattern coefficient, $\lambda$, close to zero.
systematic response error: Stable component of a response that is attributable to the wrong content.
invalidity: The result of systematic error.
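The ice cream example can be illustrated with a small simulation. This is only a sketch: the loadings are hypothetical, the responses are continuous normal scores rather than Agree/Disagree categories, and the helper names `pearson` and `simulate_items` are invented here for illustration. Each response is generated from the one-factor model, with the unique term scaled so that every item has unit variance.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simulate_items(loadings, n, seed=1):
    """Generate n responses per item from m_i = lambda_i * F + u_i.

    F and u_i are normal draws; u_i is scaled by sqrt(1 - lambda_i**2)
    so each item has unit variance (an assumption made for convenience).
    """
    rng = random.Random(seed)
    items = [[] for _ in loadings]
    for _ in range(n):
        f = rng.gauss(0.0, 1.0)  # one respondent's level on the factor
        for i, lam in enumerate(loadings):
            u = rng.gauss(0.0, 1.0) * math.sqrt(1.0 - lam ** 2)
            items[i].append(lam * f + u)
    return items

# Hypothetical loadings: two Mach-like items and the ice cream item (lambda = 0).
loadings = [0.7, 0.6, 0.0]
mach_a, mach_b, ice_cream = simulate_items(loadings, n=5000)

r_mach = pearson(mach_a, mach_b)    # should fall near 0.7 * 0.6 = 0.42
r_ice = pearson(mach_a, ice_cream)  # should fall near 0
print(round(r_mach, 2), round(r_ice, 2))
```

With a sample this large, the two Mach-like items should correlate at roughly the product of their loadings, while the item with a zero loading should show a correlation close to zero with either of them.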
The factor analysis differentiates between the attributes of each response shared with the remaining items, the factor, and the attribute unique to each item. By definition the items correlate only to the extent that one or more common attributes generate the responses to the items. In particular, the measurement model, which regresses the measured variables, the items, onto the factors, imposes the correlational structure of the measured variables. The factor analysis reverses this process, inferring the underlying factor structure from the correlations among the variables of interest.
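Both directions described above can be sketched for the simplest case of three items and one factor. The sketch assumes standardized items and a standardized factor, with unique terms uncorrelated with each other and with the factor; under those assumptions the model implies a correlation of $\lambda_i \lambda_j$ between items $i$ and $j$. The function names and loading values are hypothetical. Factor analysis software typically estimates the coefficients from all items at once with iterative methods, so this closed-form solution for three items is an illustration of the logic, not of how an actual analysis is computed.

```python
import math

def implied_correlations(loadings):
    """Correlations the one-factor model imposes: r_ij = lambda_i * lambda_j."""
    k = len(loadings)
    return [[1.0 if i == j else loadings[i] * loadings[j] for j in range(k)]
            for i in range(k)]

def loadings_from_triad(r):
    """Recover three loadings from a 3x3 correlation matrix.

    Uses lambda_1 = sqrt(r12 * r13 / r23) and its two permutations, which
    holds only if the one-factor model fits exactly and all loadings are positive.
    """
    r12, r13, r23 = r[0][1], r[0][2], r[1][2]
    return [math.sqrt(r12 * r13 / r23),
            math.sqrt(r12 * r23 / r13),
            math.sqrt(r13 * r23 / r12)]

true_loadings = [0.8, 0.6, 0.5]          # hypothetical values
r = implied_correlations(true_loadings)  # forward: model -> correlations
recovered = loadings_from_triad(r)       # reverse: correlations -> loadings
print([round(x, 3) for x in recovered])
```

Because the correlations here are computed directly from the model rather than from sample data, the recovered values match the starting loadings up to floating-point rounding. With real data the correlations contain sampling error, and the estimates would only approximate the underlying coefficients.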