
Public Opinion Quarterly, Vol. 77, Special issue, 2013, pp. 89–105

Clarifying Categorical Concepts in a Web Survey
Cleo Redline*

Cleo Redline is a senior research scientist with the National Center for Education Statistics (NCES), Department of Education, Washington, DC, USA. This work was conducted while she was a graduate student in the Joint Program in Survey Methodology at the University of Maryland. This work was supported by grants from the U.S. National Science Foundation [1024244 to Stanley Presser and C.R.] and the U.S. Bureau of the Census [YA1323-RFQ-06-0802 to Roger Tourangeau and C.R.]. Data were collected in a study conducted under a National Science Foundation Major Research Instrumentation Grant [0619956 to Jon A. Krosnick]. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the NSF, the Bureau of the Census, or the NCES. The author would like to thank the editors, three anonymous reviewers, Roger Tourangeau, Katharine G. Abraham, Frederick G. Conrad, Kent L. Norman, and Norbert Schwarz for their many helpful comments and suggestions. *Address correspondence to Cleo Redline, NCES, 1990 K Street, Room 9068, Washington, DC 20006, USA; e-mail: cleo.redline@ed.gov.

doi:10.1093/poq/nfs067

© The Author 2013. Published by Oxford University Press on behalf of the American Association for Public Opinion Research.
All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Abstract

Past research has shown that vague or ambiguous categorical concepts can be clarified through the use of definitions, instructions, or examples, but respondents do not necessarily attend to these clarifications. The present research investigates whether the presence of instructions or their placement modifies respondents' concepts so that they are better aligned with research objectives. Eight questions, modeled after major federal surveys, were administered in one wave of a twelve-month Web panel survey to a nationally representative multistage area probability sample of addresses in the United States (n = 913 completed interviews). There is some evidence to suggest that, as predicted, respondents anticipate the end of a question and are more likely to ignore instructions placed after a question than before. Respondents answer more quickly when instructions come after the question, suggesting that they spend less time processing the instructions in this position, and their answers appear to be less consistent with research intentions. As predicted, implementing instructions in a series of questions is more effective than the other approaches examined.

Introduction

One reason that survey questions may not be clear (e.g., Belson 1981; Schober and Conrad 1997) is that people categorize phenomena differently than


researchers intend (Lakoff 1987; Smith 1995). To improve survey questions, researchers provide additional clarification, i.e., definitions, instructions, and examples (e.g., Conrad et al. 2006; Conrad, Schober, and Coiner 2007; Martin 2002; Tourangeau et al. 2010). For instance, allowing interviewers to give definitions has been shown to improve respondents' understanding of researchers' intentions (Schober and Conrad 1997; see also Conrad and Schober 2000).

Several factors may affect respondents' use of such additional clarification. First, analysis of interviewer-respondent interaction has shown that respondents anticipate the end of a question and are more likely to interrupt information that comes after a question than information that comes before (e.g., Houtkoop-Steenstra 2002; Oksenberg, Cannell, and Kalton 1991; Van der Zouwen and Dijkstra 2002). (See also Cannell et al. [1989] as discussed in Schaeffer [1991].) This suggests that providing clarification before a question rather than after should lead to fewer interruptions and higher-quality data. However, the relationship between interruptions and data quality has proven inconsistent (Dykema, Lepkowski, and Blixt 1997). One reason may be that when respondents' situations are straightforward (Conrad and Schober 2000; Conrad, Schober, and Coiner 2007; Schober and Conrad 1997; Schober and Bloom 2004), not hearing inapplicable clarification does not affect the quality of their answers.
Second, it has never been firmly established that respondents skip clarifications at the end of a question when the clarification is presented visually,
although eye-movement research has shown that respondents spend more time
at the beginning of a question than at the end (Graesser et al. 2006). Providing
clarification after a question may be genuinely more problematic in the auditory channel than in the visual (Martin et al. 2007); respondents can see that
additional text follows the question in the visual channel, and they can read
and reread it as needed (Just and Carpenter 1980).
Third, although clarification can improve respondents' understanding of the intended meaning of a concept (e.g., Conrad, Schober, and Coiner 2007), it is possible that when this clarification is long and complex, it taxes working memory or is ignored. Thus, it may be better to ask a series of questions rather than one question with clarification (Conrad and Couper 2004; Conrad and Schober 2000; Couper 2008, p. 289; Fowler 1995, pp. 13–20; Schaeffer and Presser 2003; Sudman, Bradburn, and Schwarz 1996, p. 31; Suessbrick, Schober, and Conrad 2000; Tourangeau, Rips, and Rasinski 2000, pp. 38–40, 61). Decomposition is one strategy in which the subcategories of a behavioral frequency report are requested in individual questions (e.g., Belli et al. 2000; Dykema and Schaeffer 2000; Means et al. 1994; Menon 1997). As Schaeffer and Dykema (2011) note, decomposing a general category into a set of more specific categories and more specific questions is also a technique for implementing a definition and promoting clarity (see also Schaeffer and Dykema [2004]). One approach to decomposition would be to decompose a category into more specific subcategories and provide these as instructions. Another approach would be to ask a series of questions.

This paper describes a Web survey experiment designed to investigate where and how to decompose general categorical concepts. The experiment examines whether decomposing categories into subcategories and presenting the subcategories as instructions affect how respondents interpret the general category; whether placing the instructions before the question is better than putting them after; and whether transforming the instructions into a series of questions yields answers more consistent with research objectives than asking one question with instructions.

Methods

DEVELOPMENT AND PREDICTIONS

Eight items, patterned after items from major federal surveys, were administered in a Web survey. The items asked about categorical concepts, including number of residents, shoes owned, coats owned, hours worked, number of trips, furniture purchases, number of bedrooms, and number of other rooms. The categories were decomposed into subcategories, and instructions were developed that directed respondents to exclude some of the subcategories that they might be likely to include in the category of interest. For example, as shown in figure 1, in the shoe question, respondents were instructed to exclude boots, sneakers, athletic shoes, and bedroom slippers. To increase the magnitude of the effect of the instructions, an attempt was made to exclude at least one subcategory that most people would possess and be likely to include unless told otherwise, so that the mean would be lower if respondents followed the instructions. For example, the instructions for the shoe question directed respondents to exclude sneakers because it was assumed that most people own sneakers and think of sneakers as shoes. The instructions were also intentionally lengthy in order to provide a condition under which multiple questions might excel.

The multiple questions were developed by transforming the instructions into a series of questions, as recommended by Fowler (1995, p. 17). The method allowed for the manipulation of question structure while preserving the order and wording of the subcategories (except for inserting a phrase at the beginning of each of the follow-up questions to refer back to the first question in the series).1 Because the mean should be lower if respondents followed instructions and excluded subcategories, the mean for the control condition should be higher than the mean for the instruction condition, which in turn should be higher than the mean for the multiple-question condition.

1. The items underwent a number of revisions based on the advice of two outside expert reviewers and a small pretest (n = 12). (See Appendix A for question wording.)

EXPERIMENTAL CONDITIONS

Three main conditions were compared in this experiment: no instructions (the control condition), instructions, and a multiple-question approach. A factorial experiment was embedded within the instructions condition, with two orders of presenting the instructions (after/before) crossed with two font styles (same font as the question/italics). Thus, the overall design had six conditions; however, font was not shown to have any effect. Some of the results collapse across the font condition, and in some analyses the font condition is retained to reflect the design.

Figure 1 shows how the information appeared on the screen for the six conditions, combining those that varied font. Questions always began on a new line; instructions did not, so that question and instructions formed one continuous paragraph in the instruction conditions, while each new question started on a new line in the multiple-question condition. The multiple questions were presented on the same screen to ensure that respondents had the same opportunity to review subcategories in this condition as in the other conditions. The multiple questions were presented dynamically, such that only the relevant follow-up questions were administered based on prior responses. For example, if a person reported having five persons in a household, all of whom were adults, they were not then asked how many children were in the household. Prior questions and responses remained on the screen as subsequent questions were presented. The wording of these subsequent questions was not modified based on prior responses.

FOLLOW-UP QUESTIONS

Two follow-up questions were also developed for the shoes and hours worked items. These questions asked respondents to further decompose their reports about shoes and hours worked into individual subcategories (see figure 2). Respondents were unable to return to the previous experimental items and change their answers. These follow-up questions provide a second measure for determining which of the conditions elicited the most consistent reporting (see Conrad and Schober [2000] for a similar method).

DATA COLLECTION AND SAMPLE

This experiment was administered in the eleventh wave (August 1 through October 31, 2009) of a twelve-month Web panel survey to a national multistage area probability sample of addresses in the United States. In the summer of 2008, interviewers from Abt SRBI visited the selected addresses, randomly selected an adult member, conducted a brief interview, and offered a free laptop and high-speed connection in exchange for agreeing to complete a thirty-minute, secure Internet survey once a month, beginning in October 2008. The target population was adults, eighteen years or older, residing in the United States.
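The dynamic follow-up logic described above can be sketched as follows. This is a hypothetical reconstruction for the household item only; the function name, wording, and return structure are illustrative assumptions, not the survey's actual instrument code.

```python
# Hypothetical sketch of dynamic follow-up logic: a follow-up item is
# administered only when prior answers leave room for it (e.g., the
# children item is skipped when all residents are adults).

def administer_household_series(total_residents: int, adults: int) -> dict:
    """Record answers, asking the children follow-up only if needed."""
    answers = {"residents": total_residents, "adults": adults}
    if adults < total_residents:
        # Some residents are not adults, so the follow-up would be shown.
        answers["children_item_shown"] = True
    else:
        # All residents are adults: the answer is implied, item skipped.
        answers["children_item_shown"] = False
        answers["children"] = 0
    return answers

# A household of five adults is never asked about children:
print(administer_household_series(5, 5)["children_item_shown"])  # False
```

Because prior answers stayed on screen and later wording was fixed, the only adaptive element was this show/skip decision.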

The overall unweighted response rate for the initial 1,000 recruits in the twelve-month panel survey was 42.5 percent (AAPOR RR4) (Sakshaug et al. 2009). Of the 1,000 respondents who were recruited into the twelve-month panel, 913 responded in the eleventh wave, yielding a cumulative response rate of 38.8 percent (42.5 percent × 91.3 percent) (Callegaro and DiSogra 2008).
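The cumulative rate above is simply the product of the recruitment rate and the wave-eleven completion rate, in the manner of Callegaro and DiSogra (2008); a minimal check of the arithmetic:

```python
# Cumulative response rate for wave 11: recruitment rate (AAPOR RR4)
# multiplied by the share of the 1,000 recruits completing the wave.
recruitment_rate = 0.425        # 42.5 percent
wave_completion = 913 / 1000    # 91.3 percent
cumulative = recruitment_rate * wave_completion
print(f"{cumulative:.1%}")  # 38.8%
```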



Figure 1. Overview of the Experimental Conditions Using the Wording of the Shoe Question to Illustrate.

The items reported here were part of a larger questionnaire that included three other experiments and asked twenty-seven questions in all. The items appeared in fixed locations, as questions 1, 8, 9, 10, 12, 13, 17, 18, 26, and 27. Respondents were randomly assigned to one of the treatments for all items.

Results

MEAN RESPONSES

Some respondents reported extreme numbers in response to the items. Responses that were above the upper first percentile for each item were dropped.2,3 In the multiple-question condition, responses were calculated from respondents' answers to the set of questions. For example, answers to the shoe question were derived by subtracting the number reported in the second question from the number reported in the first. Resulting negative values were set to missing.4
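The cleaning and derivation steps just described can be sketched as follows; the values and the cutoff here are hypothetical, chosen only to illustrate the three rules (percentile trimming, subtraction, negatives set to missing):

```python
# Sketch of the data preparation described above (hypothetical data).
import math

def trim_extremes(values, cutoff):
    """Drop responses above the item's cutoff (its upper first percentile)."""
    return [v for v in values if v <= cutoff]

def derive_total(first_answer, second_answer):
    """Multiple-question total: first minus second; negatives -> missing."""
    diff = first_answer - second_answer
    return diff if diff >= 0 else math.nan

print(trim_extremes([4, 12, 250], cutoff=100))  # [4, 12]
print(derive_total(10, 3))  # 7
print(derive_total(2, 5))   # nan (set to missing, per footnote 4)
```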
Table 1 displays the mean response for each of the items for the three main experimental conditions, and Table 2 displays select test statistics. A one-way ANOVA comparing the three main conditions was conducted for each item.
2. The cutoff and the number of responses removed for each item were: residents > 8 (9 values);
shoes > 100 (5 values); coats > 35 (9 values); hours worked > 70 (9 values); trips > 24 (9 values);
furniture > 8 (9 values); bedrooms > 6 (5 values); and rooms > 12 (9 values). The analysis was run
with and without the outliers. The conclusions remain the same.
3. Top coding with the cutoff values did not change the conclusions.
4. This was deemed the most conservative approach because leaving them negative or setting
them to zero advanced the hypothesis of lowering the mean for the wrong reason. The number
of negative values set to missing for each item were: residents (1 value); coats (12 values); hours
worked (4 values); trips (4 values); rooms (8 values).


Figure 2. Follow-up Item to the Shoe Question.

Table 1. Mean Response (and Sample Sizes) for Eight Items by Experimental Conditions

                      Main conditions                            Instructions condition
Item           No instructions  Instructions  Multiple questions  After       Before
               Mean (n)         Mean (n)      Mean (n)            Mean (n)    Mean (n)
Residents       3.0 (174)        2.4 (572)     2.0 (148)           2.5 (293)   2.3 (279)
Shoes          13.8 (176)       10.3 (576)     7.0 (151)          10.2 (297)  10.4 (279)
Coats           6.0 (175)        4.1 (572)     2.6 (140)           4.3 (294)   3.8 (278)
Hours worked   21.4 (177)       24.3 (572)    20.2 (147)          24.5 (292)  24.0 (280)
Trips           2.9 (174)        2.2 (575)     1.3 (149)           2.2 (295)   2.2 (280)
Furniture       0.7 (177)        0.6 (574)     0.5 (152)           0.6 (296)   0.6 (278)
Bedrooms        3.0 (177)        2.7 (573)     1.8 (153)           2.8 (294)   2.5 (279)
Rooms           4.5 (174)        3.4 (569)     2.0 (145)           3.6 (292)   3.2 (277)

Note.—Responses that were greater than the upper first percentile for each individual item were dropped.

Table 2. F-Statistics from Select Tests, by Item

Item           Main conditions   Order: after vs. before   Before vs. multiple questions
               F-statistic       F-statistic               F-statistic
Residents      26.33***           3.54†                     5.73*
Shoes          11.36***           0.04 n.s.                 7.59**
Coats          30.43***           1.80 n.s.                 8.66**
Hours worked    3.02*             0.70 n.s.                 3.41†
Trips          13.47***           0.00 n.s.                12.10***
Furniture       1.16 n.s.         0.00 n.s.                 1.18 n.s.
Bedrooms       44.73***           9.70**                   33.41***
Rooms          61.49***           4.80*                    30.17***

Note.—F-statistics from the test of the main conditions are from one-way ANOVAs; all are based on two numerator degrees of freedom. F-statistics from the test of after vs. before are from two-way ANOVAs of the embedded experiment, order crossed with font; all are based on one degree of freedom. F-statistics from the test of before vs. multiple questions are from one-way ANOVAs; all are based on one degree of freedom.
***p < .001; **p < .01; *p < .05; †p < .10; n.s. = not significant.
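For readers unfamiliar with the tests summarized in Table 2, the one-way F-statistic is the ratio of the between-group to the within-group mean square. A hand computation on toy data (invented solely for illustration, not the survey's data) looks like this:

```python
# Hand-computed one-way ANOVA F-statistic on toy data.

def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares, k - 1 degrees of freedom.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    df_between = len(groups) - 1
    # Within-group sum of squares, N - k degrees of freedom.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Toy responses under the three main conditions:
control = [14, 15, 13, 16]
instructions = [10, 11, 9, 12]
multiple = [7, 8, 6, 7]
print(round(one_way_anova_f([control, instructions, multiple]), 2))  # 42.25
```

A large F, as in the toy data, indicates that the condition means differ by more than within-condition variability alone would suggest.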
