
Psychological Bulletin
1987, Vol. 101, No. 1, 126-135

Copyright 1987 by the American Psychological Association, Inc.
0033-2909/87/$00.75

Offending Estimates in Covariance Structure Analysis:
Comments on the Causes of and Solutions to Heywood Cases
William R. Dillon
Department of Marketing
Bernard M. Baruch College

Ajith Kumar
Department of Marketing
State University of New York at Albany

Narendra Mulani
Department of Marketing
Rutgers University Graduate School

In this article we discuss, illustrate, and compare the relative efficacy of three recommended approaches for handling negative error variance estimates (i.e., Heywood cases): (a) setting the offending estimate to zero, (b) adopting a model parameterization that ensures positive error variance
estimates, and (c) using models with equality constraints that ensure nonnegative (but possibly zero)
error variance estimates. The three approaches are evaluated in two distinct situations: Heywood
cases caused by lack of fit and misspecification error, and Heywood cases induced from sampling
fluctuations. The results indicate that in the case of sampling fluctuations the simple approach of
setting the offending estimate to zero works reasonably well. In the case of lack of fit and misspecification error, the theoretical difficulties that give rise to negative error variance estimates have no
ready-made methodological solutions.

Overview

Psychologists and other behavioral scientists are using structural covariance analysis with increasing regularity (Bagozzi,
1981; Bentler & Bonett, 1979; Bentler & Speckart, 1981; Fredricks & Dossett, 1983; Judd & Krosnick, 1982). Structural
analysis of covariance matrices, which subsumes confirmatory
factor analytic and structural equations models as special cases,
is a general method for analyzing multiple measurements. Its
purpose is to detect and assess latent sources of variation and
covariation in the observed measurements. This technique explicitly recognizes the role of latent variables in accounting for

the interrelations among the observed variables. Ideally, the latent variables contain all of the essential information on the
linear interrelations among a set of variables and are derived
from substantive theory. The theoretical covariance structure,
which reflects the presumed relations among the observed variables if the theory it is based on is reasonable, is compared with
the sample data to determine the fit of the hypothesized model
and ultimately the appropriateness of the theory.
The most frequently used approach for parameter estimation in structural covariance analysis is through the widely distributed computer program LISREL (Jöreskog & Sörbom, 1982). The LISREL series, now in its sixth commercial version, is also available in SPSS-X (SPSS, 1983). The newest version, LISREL VI, offers several different options for parameter estimation (maximum likelihood estimation and generalized least squares), generates automatic starting values for parameter estimates, and provides improved diagnostics and goodness-of-fit indices.

Structural Covariance Models
In presenting structural covariance models with observable
and unobservable variables it is usually convenient to use a path
diagram. Several conventions are followed in drawing path diagrams. Observable exogenous variables are denoted by Xs and
observable endogenous variables by Ys. All observable variables are represented by squares and unobservable variables by
circles. Unobservable exogenous variables are denoted by ξs and unobservable endogenous variables by ηs. The effects of endogenous variables on endogenous variables are denoted by β coefficients and the effects of exogenous variables on endogenous variables by γ coefficients. The correlations between unobservable exogenous variables are denoted by φ. The error term for each equation relating a set of exogenous and endogenous explanatory variables to an endogenous criterion variable is denoted by ζ. The regression coefficient relating each observable variable to its unobservable counterpart is denoted by λ. Errors in the measurement of exogenous variables are denoted by δ and errors in the measurement of endogenous variables by ε. The correlation between exogenous variables is depicted by
curved lines with arrowheads at both ends. This signifies that
one variable is not considered a cause of the other. Paths, in the
form of unidirectional arrows, are drawn from the variables
taken as causes (exogenous, independent) to the variables taken
as effects (endogenous, dependent). The meaning of the various
terms will become clearer in the discussions and examples that
follow.

We thank David Rindskopf for his valuable comments on an earlier
version of this article.
Correspondence concerning this article should be addressed to William R. Dillon, Department of Marketing, Box 508, Bernard M. Baruch
College, 17 Lexington Avenue, New York, New York 10010.



Perspective and Objectives
Although confirmatory factor analysis and structural equation models with unobservable variables have enhanced our
ability to investigate the interrelations among a set of variables,
and specifically to confront theory with data, there are several
problems and areas of confusion in the application of these
techniques. Consider, for example, the standard confirmatory
factor analysis model (Jöreskog & Sörbom, 1982), which can be written as

x = Λξ + δ, (1)

where x is a vector of q observed variables, ξ is a vector of m unobservable latent factors such that m < q, Λ is a q × m matrix of weights or factor loadings relating the observed variables to their latent counterparts, and δ is a vector of q unique factors representing random measurement error and indicator specificity. Under suitable conditions the variance-covariance matrix for x, denoted by Σ, can be represented as

Σ = ΛΦΛ' + Θ_δ, (2)

where Φ is the m × m covariance matrix of ξ and Θ_δ is the diagonal q × q matrix of unique variances or uniquenesses, that is, E(δδ'). A difficulty common to all methods of finding maximum likelihood estimates is that the likelihood function may not have any true maximum within the region for which all the unique variances are positive. Unfortunately, nonpositive unique variance estimates are frequently encountered (cf. Jöreskog, 1967); in fact, according to Lee (1980), "It is well known that in practice about one-third of the data yield one or more nonpositive estimates of the unique variances" (p. 313).

When nonpositive unique variance estimates are obtained the solution is said to be improper. In exploratory maximum likelihood factor analysis (i.e., Φ = I is assumed), solutions were said to be improper when one or more unique variance estimates in Θ_δ were less than an arbitrarily small positive value, say .005 (cf. Jöreskog, 1967, p. 451). Unique variance estimates can also be negative, as the maximum likelihood estimation procedure used in LISREL VI does not impose any constraints on the permissible values that parameter estimates can assume;¹ thus, variances can be negative and correlations greater than one. When negative unique variance estimates are obtained, the solution is commonly referred to as a Heywood case (Harman, 1971, p. 117). Thus, a Heywood case necessarily yields an improper solution, but not every improper solution is a Heywood case.
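To make Equation 2 concrete, the following Python sketch (a minimal illustration of ours, not part of the original article; all loadings, factor covariances, and uniquenesses are invented values) builds the implied covariance matrix for a hypothetical two-factor, five-indicator model and shows what an inadmissible (negative) uniqueness looks like.

import numpy as np

# Hypothetical factor loadings (5 indicators, 2 factors), factor covariance,
# and unique variances -- every numerical value here is illustrative only.
Lambda = np.array([[0.8, 0.0],
                   [0.7, 0.0],
                   [0.0, 0.9],
                   [0.0, 0.7],
                   [0.0, 0.6]])
Phi = np.array([[1.0, 0.5],
                [0.5, 1.0]])                        # factor covariance matrix
theta = np.array([0.36, 0.51, 0.19, 0.51, 0.64])    # unique variances (diagonal of Theta_delta)

# Equation 2: Sigma = Lambda Phi Lambda' + Theta_delta
Sigma = Lambda @ Phi @ Lambda.T + np.diag(theta)
print(np.round(Sigma, 3))

# A Heywood case corresponds to an estimated unique variance below zero,
# e.g., theta[1] = -0.25, which is inadmissible as a variance.
theta_heywood = theta.copy()
theta_heywood[1] = -0.25
print("Any negative uniqueness?", (theta_heywood < 0).any())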

Because the commonsense expectation is for error variances
to be positive and present to some degree in one's observed
measures, improper solutions, particularly Heywood cases,
have been a source of concern and consternation. Several strategies for handling Heywood cases have been suggested. In this
review we discuss, illustrate, and compare the relative efficacy
of three recommended approaches for handling negative variance estimates: (a) setting the offending estimate equal to zero,
(b) adopting a model parameterization that ensures positive error variance estimates (Bentler, 1976), and (c) using models
with equality constraints that ensure nonnegative (but possibly
zero) error variance estimates (Rindskopf, 1983). The three approaches are evaluated in two distinct situations: Heywood

cases caused by lack of fit and misspecification error, and Heywood cases induced from sampling fluctuations. The results indicate that in the case of sampling fluctuations the simple approach of setting the negative variance estimate to zero works
reasonably well. No adequate solution has, however, been proposed to handle the theoretical difficulties that give rise to negative variance estimates in the case of lack of fit and misspecification error.
We begin by providing some background on the possible reasons why Heywood cases are encountered. We then discuss the
three recommended strategies for handling negative variance
estimates. In the Method section we describe the two settings in
which the three remedial approaches are evaluated. Then in
the penultimate section, we report the results of each analysis
setting. Finally, we conclude by discussing the implications of

the findings and provide some general recommendations.
Heywood Cases
Until recently there have been few attempts to explain why
Heywood cases occur. In this section we review the conceptual
and empirical reasons for this kind of irregularity. As we demonstrate, the problem is particularly insidious because in many
instances there is little evidence to suggest which restrictions
actually cause the offending results.

Conceptual Explanations
One common explanation of why Heywood cases occur is
that the common factor model does not fit the empirical data.
Though lack of fit can cause Heywood cases, it is not the only
cause. For example, in the context of exploratory factor analysis, van Driel (1978) identified and empirically explicated three major causes of Heywood cases: (a) sampling fluctuations, in combination with a population value close to the boundary of an interpretable region (e.g., a negative error variance estimate when the true value approaches zero because of fluctuations in sampling); (b) the inappropriate fitting of the common factor model to empirical data (e.g., the patterns of signs and the magnitudes of the elements of the correlation matrix are not consistent with a single factor model); and (c) the indefiniteness (underidentification) of the model (e.g., a factor containing a number of small loadings indicates there are alternative nonunique solutions, some of which may be difficult to interpret).
Under the classical factor analytical model these three causes
are hopelessly indistinguishable. Thus, in order to analyze Heywood cases, van Driel adopted a nonclassical approach, in which the usual constraints on Θ_δ and ΛΦΛ' were relaxed.² With the
nonclassical approach the three potential causes of Heywood
cases could be distinguished by examining the confidence interval around the negative uniqueness estimate obtained by using
the corresponding estimated standard error. For example, the

¹ The only constraint imposed in LISREL VI is that Σ is positive definite (Jöreskog & Sörbom, 1982).
² In classical confirmatory analysis Θ_δ is assumed to be positive definite (i.e., all eigenvalues are strictly positive) and ΛΦΛ' is assumed to be positive semidefinite (i.e., all eigenvalues are nonnegative with at least one eigenvalue being zero).


Heywood case would be attributed to sampling variation when
the confidence interval for the offending estimate covers zero,
and the magnitude of the corresponding estimated standard error is roughly the same as the other estimated standard errors;
the Heywood case would be attributed to lack of fit when the
confidence interval for the offending estimate does not cover
zero; finally, the Heywood case would be attributed to indefiniteness when the confidence interval for the offending estimate

covers zero but the relative magnitude of the corresponding estimated standard error is so large it results in an elongated confidence interval.
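The decision rule just described can be sketched in code. The following Python fragment (our illustration, not from the article; the 2-standard-error interval and the "large standard error" multiplier are assumed cutoffs) classifies a negative uniqueness estimate given its estimated standard error and the typical standard error of the other uniquenesses.

def classify_heywood(estimate, se, typical_se, z=2.0, large_ratio=3.0):
    """Heuristic in the spirit of van Driel (1978): examine the confidence
    interval around a negative uniqueness estimate.  The z and large_ratio
    cutoffs are illustrative assumptions, not values from the article."""
    lower, upper = estimate - z * se, estimate + z * se
    covers_zero = lower <= 0.0 <= upper
    if covers_zero and se <= large_ratio * typical_se:
        return "sampling fluctuations"               # CI covers zero, SE unremarkable
    if not covers_zero:
        return "lack of fit / misspecification"      # CI excludes zero
    return "indefiniteness (underidentification)"    # CI covers zero but is very wide

# Example with made-up numbers:
print(classify_heywood(estimate=-0.05, se=0.04, typical_se=0.05))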
Empirical Identification as a Potential Explanation

The concept of empirical underidentification (Kenny, 1979)
can be used to understand why Heywood cases and such related
problems as improper solutions, factor loadings and factor correlations outside the permissible range, and large variances of
parameter estimates may occur. Empirical identification is intimately related to the concept of identification. As Koopmans
(1949) noted, in establishing the identifiability of a model one
often imposes certain seemingly innocuous assumptions that
may not be true. When these assumptions are wrong, in a strict
sense one cannot say that the model is identified, but only that
it may be identified under certain conditions.³ As Rindskopf (1984) recently discussed, the conditions generally require that certain parameters not be zero, or that parameters not equal one. Thus, whenever factor loadings are close to zero, or factor correlations are close to one, or factor correlations are close to zero, there may potentially be a problem. What constitutes "small," "large," "close to zero," or "close to one" is not important, because the exact values of parameters that can either cause or prevent improper solutions and related problems are unfortunately a matter of judgment and can change from situation to situation.⁴

Examples of empirical underidentification can be found in
Kenny (1979), McDonald and Krane (1977, 1979a), and Rindskopf (1984). A simple yet insightful example of how empirical underidentification can cause a number of problems can be seen in the following three-variable (X₁, X₂, and X₃), one-factor model:

X₁ = λ₁ξ + δ₁,
X₂ = λ₂ξ + δ₂,
X₃ = λ₃ξ + δ₃,

and

Σ = ΛΛ' + Θ_δ, (3)

where all of the above terms are as previously defined. In general, a three-variable, one-factor model is thought to be exactly
identified. However, as Kenny (1979) and McDonald and Krane
(1979) have shown, if one of the factor loadings is zero, the
model is not identified. If one of the factor loadings is close to
zero, some parameter estimates will have large standard errors.
The problem is rather insidious because the variable whose
standard error is adversely affected is not the one with the small

factor loading. Instead the factor loading standard errors and
the error variances for the other variables are the parameter estimates affected. This can be seen by examining the following
relations.

λ₁² = [cov(X₁, X₂)cov(X₁, X₃)] / (λ₂λ₃), (4)

λ₂² = [cov(X₁, X₂)cov(X₂, X₃)] / (λ₁λ₃), (5)

and

λ₃² = [cov(X₃, X₁)cov(X₂, X₃)] / (λ₁λ₂). (6)
Note that if one of the factor loadings is close to zero then in the covariance matrix of the observed variables two covariances will be near zero; hence, in practice the covariance matrix can strongly indicate which factor loading is close to zero. However, the standard error for the factor loading close to zero will not be large, as the above relations show. For example, if we arbitrarily take λ₃ as close to zero, the estimates of λ₁ and λ₂ will be very unstable with correspondingly large standard errors because these terms involve the λ₃ parameter estimate (see Equations 4 and 5). As a by-product, the error variances for these variables will also have large standard errors and estimates of error variances for some of them could be negative. It is also obvious that as λ₃ approaches zero, any values for λ₁ and λ₂ that reproduce the observed correlations are possible, regardless of whether the estimates fall within the permissible range. In other words, only the product λ₁λ₂ is well determined and hence either λ₁ or λ₂ can be greater than one, accompanied by a negative error variance. This appears to be the most common type of Heywood case.
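A small simulation makes the point concrete. The Python sketch below (our illustration; the sample size, loading values, and use of the sample analogue of Equation 4 are assumptions) generates data from a one-factor model in which λ₃ is nearly zero and shows how the implied estimates of λ₁², and hence the implied uniquenesses 1 − λ₁², fluctuate wildly and can go out of range. Substituting cov(X₂, X₃) = λ₂λ₃ into Equation 4 gives the sample-computable form λ₁² = cov(X₁, X₂)cov(X₁, X₃)/cov(X₂, X₃) used here.

import numpy as np

rng = np.random.default_rng(1)
lam = np.array([0.8, 0.7, 0.05])   # lambda_3 deliberately close to zero
n, reps = 100, 200
lam1_sq = []

for _ in range(reps):
    f = rng.standard_normal(n)                          # common factor scores
    e = rng.standard_normal((n, 3)) * np.sqrt(1 - lam**2)
    X = f[:, None] * lam + e                            # one-factor data
    S = np.cov(X, rowvar=False)
    # Sample analogue of Equation 4 after substituting cov(X2, X3) = lam2*lam3:
    lam1_sq.append(S[0, 1] * S[0, 2] / S[1, 2])

lam1_sq = np.array(lam1_sq)
print("median and extreme values of lambda_1^2 estimates:",
      np.median(lam1_sq), lam1_sq.min(), lam1_sq.max())
print("proportion of replications with implied uniqueness 1 - lambda_1^2 negative:",
      np.mean(lam1_sq > 1.0))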
The problem can become more acute as the complexity of the
model increases. Consider, for example, the structural equation
model shown in Figure 1. The model corresponds to a second-order factor model. To implement this model with the LISREL
VI computer program we need to specify the following:
MO NY=7 NE=3 NX=0 NK=1 LY=fu,fr TE=di,fr BE=ze PS=di,fr GA=fu,fr PH=sy,fr

PA pattern matrices for LY, TE, PS, and GA (1 = free parameter, 0 = fixed parameter, 1ᵗ = parameter fixed for identification)

³ This line of reasoning is also implicit in the work of Koopmans and Reiersol (1950), Reiersol (1950), and Anderson and Rubin (1956).
⁴ In discussing these problems, Rindskopf (1984) offers the following guidelines: Factor loadings less than .1 in absolute value should be considered small, whereas one smaller than .2 in absolute value should be considered suspect; a correlation greater than .95 should be considered close to one, whereas one greater than .90 should be considered suspect; similar remarks apply to correlations that are within .05-.10 of zero.
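Footnote 4's rules of thumb are easy to mechanize. The following Python helper (our paraphrase of those guidelines; the function name and return labels are ours) flags estimated loadings and correlations that fall in the ranges Rindskopf (1984) describes.

def screen_estimate(value, kind):
    """Flag estimates per the rules of thumb in footnote 4 (Rindskopf, 1984).

    kind is 'loading' or 'correlation'; returns None when nothing is flagged."""
    v = abs(value)
    if kind == "loading":
        if v < 0.1:
            return "small"
        if v < 0.2:
            return "suspect"
    elif kind == "correlation":
        if v > 0.95:
            return "close to one"
        if v > 0.90:
            return "suspect (near one)"
        if v < 0.05:
            return "close to zero"
        if v < 0.10:
            return "suspect (near zero)"
    return None

print(screen_estimate(0.08, "loading"))       # 'small'
print(screen_estimate(0.93, "correlation"))   # 'suspect (near one)'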



Figure 1. Second-order factor model. (Squares: observable variables; circles: unobservable variables. Xs: observable exogenous variables. γ: effects of exogenous variables on endogenous variables. ζ: error term for each equation relating a set of exogenous and endogenous explanatory variables to an endogenous criterion variable. η: unobservable endogenous variables. θ: unique variances. λ: regression coefficient relating each observable variable to its unobservable counterpart. ξ: unobservable exogenous variables. Unidirectional arrows indicate paths from variables taken as causes to those considered effects.)


In the LY matrix, one free parameter in each column should be set equal to one (or any other number) for identification purposes. Similarly, one element of the GA matrix should be fixed because the elements now represent factor loadings (second order) rather than structural parameters. We use a superscript t to distinguish those parameters restricted for purposes of identification.
The structural equations corresponding to this model are

η₃ = γ₃ξ₁ + ζ₃, (7)

η₁ = γ₁ξ₁ + ζ₁, (8)

η₂ = γ₂ξ₁ + ζ₂, (9)

where

var(η₃) = γ₃²var(ξ₁) + ψ₃,
var(η₁) = γ₁²var(ξ₁) + ψ₁,
var(η₂) = γ₂²var(ξ₁) + ψ₂,
cov(η₃η₁) = γ₃γ₁var(ξ₁),
cov(η₃η₂) = γ₃γ₂var(ξ₁),

and

cov(η₁η₂) = γ₁γ₂var(ξ₁),

and ψ₃, ψ₁, and ψ₂ are the variances of ζ₃, ζ₁, and ζ₂, respectively. For identification purposes, we would typically set γ₁ = 1. Under this condition we find that

γ₂ = cov(η₃η₂)/cov(η₃η₁),

var(ξ₁) = cov(η₁η₂)cov(η₃η₁)/cov(η₃η₂),

and

γ₃ = cov(η₃η₂)/cov(η₁η₂),

so that ψ₃, ψ₁, and ψ₂ are uniquely determined. However, if γ₃ is close to zero, then cov(η₃η₁) and cov(η₃η₂) will also be near zero, the standard error for γ₂ will be large, and certain of the unique and residual variances (the θs and ψs) can be negative, so that a whole range of symptoms results.
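To see why a small γ₃ wreaks havoc, the short Python check below (an illustration with invented values, not the authors' analysis) computes the implied covariances among the ηs and the ratio that defines the γ₂ estimate; perturbing the two near-zero covariances by tiny "sampling" errors moves the ratio dramatically.

import numpy as np

def implied_cov(g1, g2, g3, var_xi=1.0):
    # cov(eta_i, eta_j) = g_i * g_j * var(xi_1) for the second-order model
    return {"31": g3 * g1 * var_xi, "32": g3 * g2 * var_xi, "12": g1 * g2 * var_xi}

c = implied_cov(g1=1.0, g2=0.8, g3=0.05)            # gamma_3 close to zero
print("gamma_2 from exact covariances:", c["32"] / c["31"])   # 0.8

# Add tiny perturbations to the two near-zero covariances:
rng = np.random.default_rng(0)
noisy = [(c["32"] + e2) / (c["31"] + e1)
         for e1, e2 in rng.normal(0, 0.02, size=(200, 2))]
print("range of gamma_2 estimates:", min(noisy), max(noisy))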
Heywood Cases: Possible Solutions

As we indicated, until recently few options existed for dealing with negative variance estimates and related irregularities. In this section we discuss three approaches for handling Heywood cases and comment on the ease or difficulty of implementing them. As a framework for this discussion and the comparative analysis that follows, consider the five-manifest indicator, two-component oblique factor analytic model shown in Figure 2. This is a simple model, yet it permits a demonstration of the relative advantages and disadvantages of each of the three recommended approaches.

Figure 2. Two-component oblique factor analytic model. (φ: correlation between unobservable exogenous variables. Curved lines with arrowheads at each end depict correlations between exogenous variables. For other definitions see Figure 1.)

Setting Error Variance Estimates to Zero

Several researchers (cf. Jöreskog, 1967; Lawley & Maxwell, 1971) have suggested a simple, practical approach to handling negative error variance estimates: When negative error variance estimates occur, simply fix them at zero. Though this approach



that requires fixing error variance estimates at zero can be handled by the LISREL VI program, it suffers from a number of well-known drawbacks. First, and perhaps most serious, maximum
likelihood theory has not been proven to be valid at boundary
minima. Second, as Bentler (1976) discussed, setting negative
error variances to zero results in a mixed factor-component
model. Bentler uses the term mixed factor-component model to
describe the case where one or more of the manifest indicators
of a latent construct are assumed to be measured without error.
Setting a negative variance estimate to zero implies, at least conceptually, that the observable variable is measured without error, which is consistent with a principal-components analysis
specification. The problem with mixed factor-component
models is that the parameter estimates obtained with maximum
likelihood methods are, in a strict sense, nonunique when
viewed as nonrestricted estimates.⁵
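As an illustration of what fixing the offending estimate at zero amounts to computationally, the Python sketch below (our own schematic using a scipy optimizer and the standard ML discrepancy function; it is not the LISREL algorithm, and the sample matrix is invented so that the exact-fit solution has a negative uniqueness) fits a one-factor model twice: once with the uniquenesses unconstrained and once with them bounded at zero.

import numpy as np
from scipy.optimize import minimize

def f_ml(params, S, p):
    """Standard ML discrepancy: log|Sigma| + tr(S Sigma^-1) - log|S| - p."""
    lam, theta = params[:p], params[p:]
    Sigma = np.outer(lam, lam) + np.diag(theta)
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:
        return 1e10                      # keep the search inside admissible Sigma
    return logdet + np.trace(S @ np.linalg.inv(Sigma)) - np.linalg.slogdet(S)[1] - p

p = 3
S = np.array([[1.0, 0.9, 0.2],           # invented sample covariance matrix
              [0.9, 1.0, 0.1],
              [0.2, 0.1, 1.0]])
start = np.r_[np.full(p, 0.5), np.full(p, 0.5)]

unconstrained = minimize(f_ml, start, args=(S, p), method="L-BFGS-B")
bounded = minimize(f_ml, start, args=(S, p), method="L-BFGS-B",
                   bounds=[(None, None)] * p + [(0.0, None)] * p)

# The unconstrained fit can drive the first uniqueness below zero (a Heywood
# case); the bounded refit cannot, which mimics fixing the estimate at zero.
print("uniquenesses, unconstrained:", np.round(unconstrained.x[p:], 3))
print("uniquenesses, bounded at zero:", np.round(bounded.x[p:], 3))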

Reparameterization of the Model
Recently, several researchers have developed conceptualizations of linear structural models different from that of Jöreskog's LISREL model (cf. Bentler, 1976; Bentler & Weeks, 1980; Lee, 1980; Lee & Tsui, 1982; McDonald, 1980). These alternative conceptualizations allow the user to impose rather general types of constraints on model parameters. In theory, at least, the nature of the constraints permitted is such that all error variance estimates must be strictly positive. Unfortunately, these methods are not currently available to the typical user of structural equation models nor, in general, can they be implemented with the LISREL VI computer program. An exception is Bentler's (1976) structural factor analysis model. As we shall demonstrate, the LISREL model, in certain instances, can be viewed as a restricted version of this model and thus can be implemented with the LISREL VI computer program.
The covariance matrix of x under Bentler's (1976) structural factor analysis model has the form

Σ = Z(ΛMΛ' + I)Z. (10)

This structure has several interesting features. First, the structural form of the covariance matrix shown in Equation 10 cannot reduce to a principal-components analysis model of the form Σ = ΛΛ'. Second, it can reduce to the structural covariance matrix shown in Equation 2 by specifying Λ = ZΛ, Φ = M, and Θ_δ = Z². Interestingly, Z cannot be taken as null, because of the positive variances in x; in addition, even if some elements in Z are negative, all error variances must be strictly positive because Θ_δ = Z² > 0. Because it may not be clear how to operationalize Bentler's model with the LISREL VI computer program, we shall give details on its implementation.
we shall give details on its implementation.
Figure 3 shows Bentler's structural factor analysis model for the oblique two-component factor analytic model shown in Figure 2. There are two ξs and five ηs in the figure, where the xs are mapped directly onto the ηs. Thus, the structural equation is

η = βη + Γξ + ζ. (11)

Under standard assumptions, squaring both sides of Equation 11 yields

Bηη'B' = Γξξ'Γ' + ζζ', (12)

where, for notational convenience, B = (I − β). After premultiplication (postmultiplication) by B⁻¹ (B⁻¹'), and taking expectations, we find that

Σ = B⁻¹(ΓΦΓ')B⁻¹' + B⁻¹B⁻¹'. (13)

Thus, in relation to Equation 10 we see that B⁻¹ plays the role of Z, Γ the role of Λ, and Φ = M. Interestingly, in relation to Equation 2, Λ = B⁻¹Γ and Θ_δ = B⁻¹B⁻¹', so that unique variances must be strictly positive. In other words, if the unique variance of a manifest variable is close to zero, the factor loading (standardized) corresponding to that variable will be close to one. Denoting the elements of B⁻¹ and Γ by b and γ, respectively, the factor loading is now given by the product bγ, and the unique variance by b². Because bγ has to be close to one, b can be arbitrarily close to, but never equal to, zero. Hence b² has to be strictly positive, although close to zero.

Figure 3. Bentler's specification for the two-component oblique factor analytic model. (See Figures 1 and 2 for definitions.)
To implement this model with the LISREL VI computer program, specify the following:

NY = 5;    NX = 0;
NE = 5;    NK = 2;
LY = id;   GA = fu, fr;
TE = ze;   PH = sy, fr;
BE = di, fr;   PS = di, fi.
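The following Python fragment (a numerical illustration under assumed values, not output from LISREL) verifies the algebra of Equations 10 and 13 for a small example: whatever real value b takes, the implied unique variance b² is nonnegative, and the LISREL-style loading is recovered as bγ.

import numpy as np

# Illustrative values: one common factor (variance 1) and two indicators.
Gamma = np.array([[4.0], [3.5]])        # gamma coefficients (can be large)
b = np.array([0.2, 0.25])               # diagonal elements of B^-1 (can be near zero)
Binv = np.diag(b)
Phi = np.array([[1.0]])

# Equation 13: Sigma = B^-1 (Gamma Phi Gamma') B^-1' + B^-1 B^-1'
Sigma = Binv @ Gamma @ Phi @ Gamma.T @ Binv.T + Binv @ Binv.T

# Mapping to Equation 2: Lambda = B^-1 Gamma, Theta_delta = B^-1 B^-1'
Lambda = Binv @ Gamma                   # loadings b*gamma = 0.8, 0.875
Theta = Binv @ Binv.T                   # unique variances b^2 = 0.04, 0.0625 (always >= 0)
print(np.round(Lambda, 3), np.round(np.diag(Theta), 4))
print(np.allclose(Sigma, Lambda @ Phi @ Lambda.T + Theta))  # True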

⁵ In more technical terms, the estimation in such instances forces the analog Hessian to be positive definite and thus is inherently incapable of evaluating whether the solution is actually nonunique (Bentler, 1976).


Placing Constraints on the Unique Variance
Rindskopf (1983) has demonstrated a method for preventing
Heywood cases. His method involves the parameterization presented by Bentler and Weeks (1980) and combines a suggestion
made by Bentler (1976) with a class of models proposed by
Werts, Linn, and Jöreskog (1971) in which the unique variances
were viewed as factors with the same nominal status as other
latent variables to allow error correlation.
In the usual parameterization of structural models with LISREL the coefficients for residuals and unique variances are usually fixed at one, whereas their variances are considered free
and thus need to be estimated. As we have discussed, in the
conventional model the error variances need not be positive.
Using Bentler and Weeks's parameterization, however, Rindskopf demonstrates a simple solution to this problem: Fix the
variance of the residual or unique variance at one and estimate
the coefficient. According to Rindskopf, "Regardless of whether
the coefficient is positive or negative, the square of the coefficient will be positive, so the variance will be positive" (1983,
p. 75). We demonstrate how this can be implemented with the
LISREL VI computer program next.
To implement Rindskopf's method we simply need to introduce as many ξs as there are independent variables and common factors, whether observed or latent, and whether they represent unique or residual variables or not. In the case of our illustrative model, two of the ξs are reserved for the latent factors, whereas each of the remaining five ξs corresponds to one
of the unique factors. Thus, the factor loading matrix has as
many rows as there are observed independent variables and as
many columns as there are latent factor variables in the model.
We see, therefore, that the matrix contains more than one type
of effect: It contains the loadings that map the observed variables onto their latent counterparts as well as the effects of the
unique variables, where the squares of these unique variable
effects give the error variance estimates. To fix the variance of
the unique variables at one we need to simply set the relevant
elements in Φ equal to 1.0. Note that Φ also contains more than
one type of effect: It contains the variances and covariances of
the common factors as well as the unique variances.
Figure 4 shows Rindskopf's method in the context of the
oblique two-component factor analytic model shown in Figure
2. To distinguish the various effects we have used τs for the unique variance effects. In terms of the LISREL VI computer program we need to specify the following:
NX = 5;   NK = 7;
TD = ze;  PH = sy, fi;


where

Λ (5 × 7) =

λ₁  0   τ₁  0   0   0   0
λ₂  0   0   τ₂  0   0   0
0   λ₃  0   0   τ₃  0   0
0   λ₄  0   0   0   τ₄  0
0   λ₅  0   0   0   0   τ₅

and

Φ (7 × 7) =

1
φ₂₁  1
0    0   1
0    0   0   1
0    0   0   0   1
0    0   0   0   0   1
0    0   0   0   0   0   1

Figure 4. Two-component oblique factor analytic model with error variances represented as factors (Rindskopf's method). (See Figures 1 and 2 for definitions.)
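A short numerical check of Rindskopf's idea (our illustration with invented values, not results from the article): because the unique-variable effects enter Λ as coefficients τ whose factors have variance fixed at 1, the implied error variances are τ², which cannot be negative whatever sign or value τ takes.

import numpy as np

lam = np.array([0.8, 0.9, 0.7, 0.75, 0.7])    # common-factor loadings (illustrative)
tau = np.array([0.6, -0.4, 0.7, 0.66, 0.71])  # unique-factor coefficients; sign is irrelevant
phi21 = 0.5                                   # correlation between the two common factors

# Build the 5 x 7 Lambda: columns 1-2 are the common factors, 3-7 the unique factors.
Lambda = np.zeros((5, 7))
Lambda[:2, 0] = lam[:2]
Lambda[2:, 1] = lam[2:]
Lambda[np.arange(5), np.arange(2, 7)] = tau

Phi = np.eye(7)
Phi[0, 1] = Phi[1, 0] = phi21                 # only free off-diagonal element

Sigma = Lambda @ Phi @ Lambda.T               # no separate Theta matrix is needed
print("implied error variances (tau squared):", np.round(tau**2, 3))
print("diagonal of the implied Sigma:", np.round(np.diag(Sigma), 3))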

Method
We investigated the relative efficacy of the three recommended approaches to handling Heywood cases in two distinct situations. In the
first setting, an improper solution caused by lack of fit and misspecification error was considered. In the second setting, improper solutions induced from sampling fluctuations in combination with a true value
close to the boundary of an interpretable region were considered. We
provide further details in the next section.

Lack of Fit: A Contrived Empirical Setting
Fishbein and Ajzen (1974) reported data on attitudes toward religion.
Table 1 presents the correlations among the five attitude measures.⁶


⁶ Although correlation matrices are used for all analyses, parameter estimates obtained using correlation input are consistent with those obtained using covariance input because the maximum likelihood estimation is "scale-free," that is, parameter estimates for the covariance matrix can be obtained from the corresponding estimates for the correlation matrix by appropriate scale transformations. However, this property does not hold for the standard errors of the estimates.


These data have been used to test various conceptualizations of the structure and dimensionality of attitudes (cf. Bagozzi & Burnkrant, 1979; Dillon & Kumar, 1985). In one conceptualization, attitudes are hypothesized to be multidimensional, represented by an oblique two-component factor analytic model similar to that shown in Figure 2, in which the self-rating (X₁) and semantic differential (X₂) indicators load on the affective component and the Guttman (X₃), Likert (X₄), and Thurstone (X₅) indicators load on the cognitive component. To create an improper solution the epistemic correlation (φ) between the affective and cognitive components is set equal to .50. For these data, this is a sufficiently small value to cause lack of fit and irregularities in the θ_δ₂ parameter estimate (specifically, θ̂_δ₂ = -.248).

Sampling Fluctuations: A Monte Carlo Simulation
A small Monte Carlo study was designed in the context of the oblique
two-component factor analytic model of Figure 2. The loading for the
X₁ indicator was set at .9, which is the upper bound of indicator reliability obtained in practice. A loading of .7 was set for all other indicators.
The correlation between factors was set at .50, a moderate value as some
correlation between factors is usually posited in confirmatory factor
analysis applications. A population correlation matrix was constructed
using the rules of path analysis. Using the IMSL subroutines GGNSM
and GGUBS (IMSL, 1980),⁷ a multivariate normal population based on
the specified population correlation matrix was constructed along with
500 samples of size 62. Given the small number of indicators per factor
and the small sample size, we expected to obtain a significant number
of improper solutions.
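The design is easy to reproduce in outline. The Python sketch below (our reconstruction using NumPy rather than the IMSL routines; the assignment of indicators to factors is assumed to follow Figure 2) builds the population correlation matrix from the stated loadings and factor correlation by the rules of path analysis and draws 500 multivariate normal samples of size 62.

import numpy as np

loadings = np.array([0.9, 0.7, 0.7, 0.7, 0.7])   # X1 at .9, all others at .7
factor = np.array([0, 0, 1, 1, 1])               # X1, X2 on factor 1; X3-X5 on factor 2 (assumed)
phi = 0.50                                       # correlation between the two factors

# Path-analysis rule: corr(Xi, Xj) = li * lj if same factor, li * lj * phi otherwise.
P = np.eye(5)
for i in range(5):
    for j in range(i + 1, 5):
        r = loadings[i] * loadings[j] * (1.0 if factor[i] == factor[j] else phi)
        P[i, j] = P[j, i] = r

rng = np.random.default_rng(123)
samples = [rng.multivariate_normal(np.zeros(5), P, size=62) for _ in range(500)]
print("population correlation matrix:\n", np.round(P, 3))
print("number of samples drawn:", len(samples), "each of size", samples[0].shape[0])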
Analyses

Lack-of-Fit Results
Setting the offending estimate to zero. This approach recommends setting the offending estimate, in this case θ_δ₂, equal to zero and reestimating the model. Not surprisingly, the revised model still does not fit the data, χ² = 17.58, p < .01, GFI = .912, AGFI = .781. However, no other offending estimates appear.
Reparameterization: Bentler's model. When fitting Bentler's model, the LISREL VI computer program indicated that the free parameter corresponding to γ₂ may not be identified. In our experience, when the LISREL VI computer program warns of a possible unidentified parameter that appears to be algebraically identified, its assessment is probably correct, with empirical underidentification the likely culprit. In the present case the
Table 1
Correlations for the Self-Reported Behaviors Sample

Indicator    SR       SD       G        L        T
SR           1.000
SD            .800    1.000
G             .519     .644    1.000
L             .652     .762     .790    1.000
T             .584     .685     .744     .785    1.000

Table 2
Parameter Estimates for the Two-Component Attitude Model Using Rindskopf's Method

Parameter    LISREL estimate
λ₁           0.726
λ₂           0.902
λ₃           0.777
λ₄           0.842
λ₅           0.777
τ₁           0.600
τ₂           0.000ᵃ
τ₃           0.517
τ₄           0.373
τ₅           0.517
φ₂₁          0.500ᵇ

χ² = 17.58
0; because λᵢ can be any real number, the minimum is attained at λᵢ = 0. With Bentler's formulation b² ≈ 0 and bγ ≈ 1, which apparently allows the value of b to fluctuate in the neighborhood of zero, as γ can be arbitrarily large (so that bγ ≈ 1), leading to problems of empirical underidentification.
Unfortunately, the empirical and simulated results we report
are not very encouraging. In general, it may be extremely
difficult to detect the exact cause of the offending estimate.
Herein lies the problem, because if the exact cause of the
offending estimate were known, then corrective action could be
taken directly, without having to rely on constrained or reparameterized forms of the basic factor analytic model. In closing,
we can discuss several situations for which some guidance, albeit incomplete, can be given. First, if the model provides a reasonable fit, the respective confidence interval for the offending
estimate covers zero, and the magnitude of the corresponding
estimated standard error is roughly the same as the other estimated standard errors, the Heywood case is likely due to sampling fluctuations, and the model can be reestimated with the
offending estimate set at zero. Setting the offending estimate to
zero was evaluated very favorably in both the empirical and
simulation settings. Though this approach has been criticized
on the basis of statistical concerns, it nevertheless seems to work
well in practice. Moreover, Gerbing and Anderson (in press)
have recently shown that this approach will clean up the other
estimates adversely affected by the (original) offending estimate.
If empirical underidentification is the likely culprit, then
different starting values will likely give different numerical estimates for the (offending) parameter in question. In this case, the
model needs to be respecified so as to identify all parameters.
Consider next the case in which a good-fitting model is obtained
but there is a large (significant) offending estimate. In such cases
the problem is likely one of model overfitting (i.e., overfactoring) and a more parsimonious model should be specified. This

typically occurs in multitrait-multimethod analysis using covariance structure models (cf. Farh, Hoffman, & Hegarty, 1984).
Finally, in cases suffering from lack of fit and misspecification
error in which large (significant) negative error variance estimates are obtained, there is no satisfactory solution for handling the offending estimate, except to critically examine the
theory and data on which the model is based. In such cases there
are no ready-made methodological solutions because the problem is neither methodological nor statistical. Negative variance
estimates surface in such cases because the covariance model is
ad hoc and atheoretical; in general, post hoc remedial measures
can rarely if ever correct for theoretical deficiencies in a model.
References

Anderson, T. W., & Rubin, H. (1956). Statistical inference in factor analysis. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, 5, 111-150.
Bagozzi, R. P. (1981). Attitudes, intentions, and behavior: A test of some key hypotheses. Journal of Personality and Social Psychology, 41, 607-627.
Bagozzi, R. P., & Burnkrant, R. E. (1979). Attitude organization and the attitude-behavior relationship. Journal of Personality and Social Psychology, 37, 913-929.
Bentler, P. M. (1976). Multistructure statistical model applied to factor analysis. Multivariate Behavioral Research, 11, 3-25.
Bentler, P. M., & Bonett, D. G. (1979). Models of attitude-behavior relations. Psychological Review, 86, 452-464.
Bentler, P. M., & Speckart, G. (1981). Attitudes "cause" behaviors: A structural equation analysis. Journal of Personality and Social Psychology, 40, 226-238.
Bentler, P. M., & Weeks, D. G. (1980). Linear structural equations with latent variables. Psychometrika, 45, 289-308.
Dillon, W. R., & Kumar, A. (1985). Attitude organization and the attitude-behavior relation: A critique of Bagozzi and Burnkrant's reanalysis of Fishbein and Ajzen. Journal of Personality and Social Psychology, 49, 33-46.
Driel, O. P. van (1978). On various causes of improper solutions in maximum likelihood factor analysis. Psychometrika, 43, 225-243.
Farh, J., Hoffman, R. C., & Hegarty, W. H. (1984). Assessing environmental scanning at the subunit level: A multitrait-multimethod analysis. Decision Sciences, 15, 197-220.
Fishbein, M., & Ajzen, I. (1974). Attitudes towards objects as predictors of single and multiple behavioral criteria. Psychological Review, 81, 59-74.
Fredricks, A. J., & Dossett, D. L. (1983). Attitude-behavior relations: A comparison of the Fishbein-Ajzen and the Bentler-Speckart models. Journal of Personality and Social Psychology, 45, 501-512.
Gerbing, D. W., & Anderson, J. C. (in press). Improper solutions in the analysis of covariance structures: Their interpretability and a comparison of alternative specifications. Psychometrika.
Harman, H. H. (1971). Modern factor analysis. Chicago: University of Chicago Press.
IMSL. (1980). International Mathematical and Statistical Libraries. Houston, TX: Author.
Jöreskog, K. G. (1967). Some contributions to maximum likelihood factor analysis. Psychometrika, 32, 443-482.