Journal of Education for Business
ISSN: 0883-2323 (Print) 1940-3356 (Online)

To cite this article: Curt J. Dommeyer (2006). The Effect of Evaluation Location on Peer Evaluations. Journal of Education for Business, 82(1), 21–26. DOI: 10.3200/JOEB.82.1.21-26
The Effect of Evaluation Location
on Peer Evaluations
CURT J. DOMMEYER
CALIFORNIA STATE UNIVERSITY, NORTHRIDGE
NORTHRIDGE, CALIFORNIA
ABSTRACT. A comparison of peer evaluations conducted outside the classroom with those conducted inside revealed that the evaluations conducted outside were more critical and less supportive of the students being rated. Moreover, the evaluations conducted outside the classroom provided more copious and more critical answers to an open-ended question. These results were likely due to the greater privacy and time allotted to the evaluators outside the classroom.
Key words: collaborative learning, peer
evaluation, rating assessment location
Copyright © 2006 Heldref Publications
Group projects are commonly used
in business schools to provide a
collaborative learning experience for students. Numerous advantages to group
projects have been cited in the literature:
(a) they can be more motivating and
challenging than individual assignments
(Denton, 1994; Dommeyer, 1986); (b)
they provide an environment in which
students can interact with and learn from
each other (Freeman, 1996; Johnson &
Johnson, 1984); and (c) they can reduce
the number of reports that an instructor
must grade (Dommeyer; Pfaff & Huddleston, 2003).
However, those who have studied
group members’ experiences with group
projects have found that a common
complaint of group members is that they
must deal with team members who do
not contribute fairly to the team’s goals
(McLaughlin & Fennick, 1987; Strong
& Anderson, 1990; Williams, Beard, &
Rymer, 1991). The team members who
contribute less than their fair share are
often referred to as social loafers or free
riders (Beatty, Haas, & Sciglimpaglia,
1996; Brooks & Ammons, 2003; Grieb
& Pharr, 2001).
Because instructors cannot see firsthand how each group member is contributing to the group’s goals, peer evaluations have become a popular way to
assess each team member’s contribution
(Beatty et al., 1996). Research on the
effects of peer evaluations reveals that
they can improve relationships among
teammates, can result in a more even
distribution of the workload, and can
enhance the quality of the final group
report (Chen & Hao, 2004; Cook, 1981;
Druskat & Wolff, 1999). The preponderance of research on peer evaluations
has focused on their development (Beatty et al.; Deeter-Schmelz & Ramsey,
1998; Haas, Haas, & Wotruba, 1998;
Johnson & Smith, 1997), validity
(Clark, 1989), reliability (Clark;
Falchikov, 1995; Macpherson, 1999),
potential for bias (Haas et al., 1998;
Fox, Ben-Nahum, & Yinon, 1989; Park
& Kristol, 1976), timing (Brooks &
Ammons, 2003; Druskat & Wolff,
1999), and frequency of use (Brooks &
Ammons).
The Present Study
One feature of peer evaluations that
apparently has been overlooked by previous researchers is the location in which
the evaluations are conducted (i.e., inside
the classroom vs. outside of the classroom). With rare exceptions, authors who have reported on their use of peer evaluations have not mentioned where the evaluations were conducted, suggesting that they regard location as an inconsequential aspect of the evaluations. Thus, no study
to date has reported on the effect of location on peer evaluations. The purpose of
the present study is to determine how
peer evaluations are affected by the location in which they are conducted.
Although administering peer evaluations inside the classroom may be convenient for both the instructor and the student, this location may affect how teammates are rated because teammates are often seated next to or near each other during the ratings. Consequently, raters may fear that their evaluations will be seen by their teammates during the evaluations or when the forms are returned to the professor. Although some students may feel negative about the performance of some of their teammates, they may be reluctant to divulge this negative information if they fear that the confidentiality of the ratings will be compromised. However, if students complete
the evaluations outside of the classroom,
they can do the evaluations in a private
location without fear that their teammates will see the evaluations. When
evaluations are done privately, students
should feel more willing to express their
true feelings and might be more likely to
provide ratings or comments that are critical rather than supportive of their teammates. With this reasoning in mind, I
propose the following hypothesis:
H1: Peer evaluations conducted outside of
the classroom will be more critical and
less supportive of teammates than those
conducted inside the classroom.
Another feature of peer evaluations
conducted outside the classroom is that
they are not constrained by any classroom time limits. Evaluations conducted
outside the classroom are more likely to
be conducted in an environment where
the students have as much time as they
want to reflect on their ratings and opinions. Thus, one might expect that evaluations conducted outside the classroom
would be more complete and, in the case
of open-ended questions, more copious.
With this logic in mind, I propose the
following hypothesis:
H2: Peer evaluations conducted outside of
the classroom will exhibit higher item
completion rates and, in the case of open-ended questions, more copious answers
than those conducted in the classroom.
METHOD
In the fall of 2003, the fall of 2004,
and the spring of 2005, students in
upper-division marketing research classes at a large California state university
were placed on teams of two or three
students to conduct survey research projects. The projects took the entire
semester and accounted for 33% of each
student’s grade. At the beginning of
each semester, the instructor told all students that they were expected to make
fair contributions to the team project
and that peer evaluations would be used
to assess individual contributions. The
instructor also informed students that
any team member who received poor
peer evaluations would receive a grade
on the project that would be below the
team grade. Those in the fall terms completed the peer evaluations inside the
classroom (n = 131) while those in the
spring term completed the peer evaluations outside of the classroom (n = 84).
The form used to collect the peer evaluations consisted of 45 descriptive
phrases and one open-ended question.
Some of the phrases described positive
attributes (e.g., “was enthusiastic about
working with me” and “had good
ideas”) while others described negative
attributes (e.g., “would not respond to
my e-mails” and “made me feel frustrated”). After reading each phrase, students
rated each teammate on a 10-point scale,
ranging from 1 (not very descriptive) to
10 (very descriptive). The open-ended
question asked the students to explain
how they felt about their teammate(s).
I applied factor analysis with varimax
rotation to the data so that the 45
descriptive phrases could be summarized in a meaningful way. Each person
on a two-person team evaluated his or
her teammate. Consequently, the 57
respondents from two-person teams
resulted in 57 peer evaluations. Each
person on a three-person team submitted two peer evaluations, one for each of
his or her two teammates. There were
158 respondents from three-person
teams, yielding 316 peer evaluations.
Thus, the factor analysis initially
included a total of 373 peer evaluations.
Because I applied the factor analysis
only to peer evaluations that did not
have any missing data, the factor analysis was based on 350 evaluations. The
items comprising each factor are shown
in Table 1 along with the factor loadings. For the sake of parsimony, only the
highest factor loading for each item is
shown in Table 1. One item was not correlated strongly with any of the factors,
so it was eliminated from the analysis.
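As a concrete illustration of this step, the sketch below runs an exploratory factor analysis with varimax rotation on a ratings matrix. It is a minimal sketch, not the author's actual procedure: the factor_analyzer package, the random stand-in data, and the variable names are all assumptions; only the 350 x 45 shape and the five-factor varimax solution come from the text.

```python
# Minimal sketch of a varimax-rotated factor analysis (assumed tooling:
# the factor_analyzer package; the ratings here are random stand-ins).
import numpy as np
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(0)
ratings = rng.integers(1, 11, size=(350, 45)).astype(float)  # 350 complete evaluations x 45 items

fa = FactorAnalyzer(n_factors=5, rotation="varimax")
fa.fit(ratings)

loadings = fa.loadings_  # 45 x 5 matrix of rotated loadings
# As in Table 1, report only the largest-magnitude loading per item.
top = np.abs(loadings).argmax(axis=1)
for item in range(45):
    print(f"item {item + 1}: factor {top[item] + 1}, loading {loadings[item, top[item]]:+.3f}")
```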
The factor analysis revealed five significant factors (i.e., factors that have an
eigenvalue greater than one). Factor 1
consists of 14 negative items with positive loadings and two positive items
with negative loadings. A high score on
this factor indicates that a person is a
social loafer and not a team player. I
labeled this factor Incompetent. Factor 2
consists of 16 items describing positive
features, all of which have positive loadings. A high score on this factor indicates that a person is a team player and
very supportive of the team’s activities.
I labeled this factor Supportive. Factor 3
consists of six items: four negative
items with positive loadings and two
positive items with negative loadings. A
person scoring high on this factor is one
who does not respond to e-mails or telephone calls and who is not available for
meetings or team activities. I labeled
this factor Unavailable. Factor 4 consists of four items that indicate that a
person is domineering, controlling, and
uncomfortable to work with. I labeled
this factor Domineering. Finally, Factor
5 consists of two items, both of which
indicate that a person is an outstanding
team performer (i.e., one who does
more than a fair share on the project). I
labeled Factor 5 Super Worker.
I used the items comprising each of
the factors to create the five scales displayed in Table 2. Before a scale score
was calculated, I reverse coded negatively loaded items so that a high score
on a negative item was equivalent to a
low score on a positive item and vice
versa. I then derived a total scale score
by simply adding each scale’s item
scores. Once I determined a total scale
score, I divided it by the number of
items comprising the scale. Thus, each
of the five scales in Table 2 could have
an average score ranging anywhere
from 1 (not very descriptive) to 10 (very
descriptive). The scores on these scales
indicate how a student felt about his or
her teammate(s).
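The reverse coding and averaging described above reduce to a few lines of code. The sketch below is illustrative only; the function name and the dict-based representation of an evaluation are assumptions, while the 1-to-10 scale and the 11 - x reversal follow from the text.

```python
def scale_score(evaluation, items, reversed_items=()):
    """Mean scale score for one peer evaluation.

    `evaluation` maps item number -> rating on the 1-10 scale.
    Reverse-coded items are flipped with 11 - x, so a 10 on a
    negatively loaded item counts the same as a 1 on a positive one.
    """
    values = [11 - evaluation[i] if i in reversed_items else evaluation[i]
              for i in items]
    return sum(values) / len(values)

# Example: the Unavailable scale is items 33-38, with 36 and 37 reverse coded.
evaluation = {i: 2 for i in range(33, 39)} | {36: 9, 37: 10}
print(scale_score(evaluation, items=range(33, 39), reversed_items={36, 37}))
```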
Because a student was on either a
two-person or three-person team, I had
to adjust some of the scale scores to take
team size into consideration.

TABLE 1. Factor Loadings on Rating Items After Varimax Rotation (n = 350)

Factor 1 (Incompetent); eigenvalue after varimax rotation 11.84; 26.91% of variance explained
 1. ...deserves a lower grade on the team project than I do.                        .824
 2. ...contributed less to the project than I did.                                  .823
 3. ...always had to be told what to do.                                            .791
 4. ...did not do much work on the project.                                         .789
 5. ...did poor quality work on the project.                                        .765
 6. ...left the most difficult parts of the project for me to do.                   .750
 7. ...had to be reminded to do work on the project.                                .744
 8. ...deserves the same grade on the project as I do.                             –.713a
 9. ...was a procrastinator, i.e., continually delayed doing project activities.    .690
10. ...is a person I’d prefer not to work with again on a future team project.      .683
11. ...made me feel frustrated.                                                     .679
12. ...was not dependable.                                                          .659
13. ...had lots of excuses for why she/he could not work on the project.            .657
14. ...was an “equal partner” on the team project.                                 –.606a
15. ...was reluctant to work on the team project.                                   .536
16. ...complained a lot about the project.                                          .502

Factor 2 (Supportive); eigenvalue 10.00; 22.73% of variance explained
17. ...had good ideas.                                                              .743
18. ...showed an interest in my accomplishments on the project.                     .731
19. ...encouraged me to give my best effort on the team project.                    .719
20. ...took pride in the work we did.                                               .690
21. ...was easy to communicate with.                                                .687
22. ...seemed intelligent.                                                          .680
23. ...worked effectively towards the goals of the team.                            .677
24. ...was fun to work with.                                                        .677
25. ...was a great person to work with.                                             .676
26. ...completed assigned responsibilities effectively.                             .649
27. ...was enthusiastic about working with me.                                      .645
28. ...completed assigned responsibilities on time.                                 .633
29. ...readily accepted feedback from me.                                           .621
30. ...is a person I would enjoy working with again on a team project.              .617
31. ...helped me to understand how to do parts of the project.                      .594
32. ...was sensitive to my feelings.                                                .552

Factor 3 (Unavailable); eigenvalue 3.86; 8.76% of variance explained
33. ...would not return my telephone calls.                                         .714
34. ...would not respond to my emails.                                              .686
35. ...was not available for meetings.                                              .575
36. ...always attended meetings.                                                   –.511a
37. ...was available when needed.                                                  –.482a
38. ...was never around when I needed his/her help.                                 .474

Factor 4 (Domineering); eigenvalue 3.07; 6.98% of variance explained
39. ...would not allow me to contribute a fair share.                               .747
40. ...made me feel guilty about not doing enough work on the project.              .745
41. ...acted like a “boss” who continually demanded things from me.                 .673
42. ...made me feel uncomfortable during our meetings.                              .667

Factor 5 (Super Worker); eigenvalue 1.55; 3.51% of variance explained
43. ...deserves a higher grade on the team project than I do.                       .831
44. ...did more than a fair share on the project.                                   .659

Note. Only the largest factor loading for each item is displayed. aItem was reverse coded when calculating the total score on the scale derived from this factor.
If a student was on a two-person team, the average
scale scores that the student gave teammates were determined as described in
the previous paragraph. However, if a
student was on a three-person team, the
student gave two sets of scale scores
(i.e., one for each teammate). To obtain
a single set of scale scores for students
on three-person teams, I determined
their final scale scores by averaging the
scale scores they gave to each of their
two teammates. Consequently, scores
for students on two-person teams indicate how the students felt about their
teammates, while scores for a person on
a three-person team reflect that student’s attitudes towards the two teammates combined.
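A short sketch of this aggregation step, assuming the per-evaluation scale scores sit in a pandas DataFrame with one row per (rater, teammate) pair; the column names are hypothetical.

```python
import pandas as pd

# One row per evaluation; raters on three-person teams contribute two rows.
evals = pd.DataFrame({
    "rater":       ["s1", "s2", "s2"],
    "supportive":  [8.0, 7.5, 6.5],
    "unavailable": [2.0, 3.0, 1.0],
})

# Averaging a rater's rows yields one set of scale scores per student,
# whether the student rated one teammate or two.
per_rater = evals.groupby("rater").mean(numeric_only=True)
print(per_rater)
```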
TABLE 2. Reliability Analysis for Scales Derived From Factor Analysis

Scale           Items comprising scalea    Cronbach’s α (n = 350)
Incompetent     1 to 16                    .97
Supportive      17 to 32                   .96
Unavailable     33 to 38                   .89
Domineering     39 to 42                   .77
Super Worker    43 to 44                   .53

aRefer to Table 1 to see items. Items 8, 14, 36, and 37 were reverse coded when calculating the scale score because each item has a negative factor loading.
To assess the reliability of the five
scales, I calculated Cronbach’s alpha for
each of the scales (see Table 2). Four of
the five scales have decent to excellent
reliability coefficients. Only the Super
Worker scale has a poor reliability coefficient, no doubt because the scale comprises only two items that are not
highly correlated. I will examine average scores on the Super Worker scale,
but any conclusions based on this scale
may be premature.
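Cronbach's alpha has a closed form, so the reliability column of Table 2 can be computed with a few lines of NumPy. The formula below is the standard one; the input array is a random stand-in for the real evaluation data.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x k_items) rating array:
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals).
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

rng = np.random.default_rng(0)
print(cronbach_alpha(rng.integers(1, 11, size=(350, 16))))  # e.g., a 16-item scale
```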
RESULTS
All statistical tests described in this
section were conducted using a .10
alpha level.
In H1, I predicted that evaluations
conducted outside of the classroom
would be more critical and less supportive than those done inside the classroom. I investigated this hypothesis by
examining how the evaluation location
affected both the scale scores and the
responses to the open-ended question.
Average scores on the five scales for
both those who evaluated their peers
while inside the classroom and those
who did the evaluations outside the
classroom are displayed in Table 3.
Although the two evaluation locations
did not differ significantly on their average scores on the Incompetent and
Super Worker scales, evaluations conducted outside of the classroom resulted
in significantly higher average scores
on the Unavailable and Domineering
scales than those conducted inside.
Moreover, evaluations conducted outside the classroom resulted in significantly lower average scores on the Supportive scale. These latter three results
support H1.
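The between-location comparisons in Table 3 are independent-samples t tests; the sketch below shows the test in SciPy, using normal stand-in data shaped like the Supportive row (the real per-student scores are not available here, so means and spreads are assumptions).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
inside = rng.normal(8.47, 1.5, size=131)   # stand-ins with Table 3's means
outside = rng.normal(7.66, 1.5, size=84)

t, p = stats.ttest_ind(inside, outside)
print(f"t = {t:.2f}, p = {p:.3f}")  # compared against the .10 alpha level
```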
The open-ended question asked students to indicate how they felt about their teammate(s). A respondent’s answers fell into one of three categories: (a) all positive comments, (b) a mixture of positive and negative comments, or (c) all negative comments. Table 4 reveals that the evaluation location had a significant effect on the types of comments made: evaluations conducted outside the classroom were less likely to fall into the all-positive category and more likely to fall into the mixed and the all-negative categories (p < .10). These results provide additional support for H1.
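The chi-square test reported in Table 4 (below) can be checked from the published percentages. The counts here are reconstructed by multiplying each percentage by its column n and rounding, which is an assumption, but they reproduce the reported χ2(2) = 4.88.

```python
from scipy import stats

# Rows: all positive, mixed, all negative; columns: inside (n = 112), outside (n = 77).
observed = [[78, 42],
            [14, 17],
            [20, 18]]
chi2, p, dof, expected = stats.chi2_contingency(observed)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3f}")  # ~4.88, p ~ .087
```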
TABLE 3. The Effect of Evaluation Location on Mean Scale Scores

Scalea (No. of items)    Inside classroom (n ≥ 125)    Outside classroom (n ≥ 80)    Difference in mean score    pb
Incompetent (16)         2.76                          3.15                          +0.39                       .197
Supportive (16)          8.47                          7.66                          –0.81                       .003
Unavailable (6)          2.05                          2.80                          +0.75                       .002
Domineering (4)          1.52                          1.94                          +0.42                       .023
Super Worker (2)         3.72                          3.68                          –0.04                       .892

aRefer to Table 1 to see items. Items 8, 14, 36, and 37 were reverse coded when calculating the scale score because each item has a negative factor loading. bDetermined with the independent-samples t test.
TABLE 4. The Effect of Evaluation Location on Response to the Open-Ended Question in Percentages

Type of comment                          Inside classroom (n = 112)    Outside classroom (n = 77)
All positive statements                  69.6                          54.5
Both positive and negative statements    12.5                          22.1
All negative statements                  17.9                          23.4
Total                                    100                           100

Note. χ2(2, N = 189) = 4.88, p = .087.
TABLE 5. The Effect of Evaluation Location on Item Completion Rate in Percentages

Type of item (No. of items)    Inside classroom (n = 131)    Outside classroom (n = 84)    Difference in %    p
Open-ended (1)                 85.5 answered                 91.7 answered                 6.2                .20a
Rating scale (45)              99.82 answered                99.87 answered                .05                .78b

aDetermined with Fisher’s exact test. bDetermined with the independent-samples t test.
Because students conducting the evaluations outside the classroom were not subject to classroom time constraints, in H2 I predicted that evaluations conducted outside the classroom would be more likely than those collected inside to exhibit higher item completion rates and to provide more copious answers to the open-ended question. The results, displayed in Table 5, show that the location of the evaluations had no significant effect on the item completion rates. However, responses to the open-ended question revealed that those doing the evaluations outside the classroom provided, on average, longer answers than those inside the classroom (50.75 words vs. 37.65 words), t(187) = 2.00, p < .025. These latter results provide partial support for H2.
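Similarly, the open-ended completion comparison in Table 5 is a Fisher's exact test on a 2 x 2 table. The counts below are reconstructed from the reported percentages and sample sizes (85.5% of 131 and 91.7% of 84), again an assumption rather than the author's raw data.

```python
from scipy import stats

# Rows: answered / did not answer; columns: inside (n = 131), outside (n = 84).
table = [[112, 77],
         [ 19,  7]]
odds_ratio, p = stats.fisher_exact(table)
print(f"Fisher's exact p = {p:.2f}")  # reported as .20
```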
DISCUSSION
Although this is a small study, the
results suggest that location has an
effect on peer evaluations. The evaluations conducted outside the classroom
were more critical and less supportive
of teammates than were those conducted inside the classroom, and the evaluations conducted outside the classroom
resulted in more copious answers to the
open-ended question. These results
probably occurred because of the
greater privacy and time afforded to the
students when they did the evaluations
outside the classroom. But it is really
unclear how students conducted the
evaluations outside the classroom. Did
they really do them privately? It is possible that some outside evaluations were
actually completed while the students
were in the company of their teammates. One must also wonder if, in fact,
students had more time to conduct the
evaluations when they were done outside the classroom. If the typical student
is taking several courses, working 30 or
more hours a week, and having an
active social life, he or she may feel
more time pressure outside the classroom than in it.
To have a better understanding of
how location affects peer evaluations,
future researchers might ask on their survey instruments how students feel during the evaluations. Do students fear that their teammates may see their evaluations? Do they feel pressured for
time? Are their evaluations independent
of the evaluations they receive from
their teammates? Do they care about the
evaluations? Do they think the evaluations will accurately reflect each teammate’s contribution to the project?
Answers to these types of questions will
provide a better understanding of the
location’s effect on the evaluations.
While this study shows that location
affects peer evaluations, it does not
reveal a strong effect. Of the three scales
that showed a significant difference,
none of the differences in mean scores
between the two locations exceeded a value of .81 on a 10-point scale (see
Table 3). Consequently, instructors can
be assured that, regardless of which
location they use to gather peer evaluations, the results will be fairly similar. However, it is not
clear at this point which location is generating the more valid evaluations. I suspect that the evaluations conducted outside the classroom are less susceptible
to peer interference and time constraints, but these assumptions need to
be verified.
It may be unnecessary to conduct
peer evaluations outside of the classroom if instructors can provide conditions inside the classroom that allay any
fears and time problems that students
may experience. When conducting evaluations inside the classroom, the
instructor, for example, could do the following to put the students at ease: (a)
tell students that the only person who
will be allowed to view the evaluations
is the instructor; (b) prior to administering the evaluations, tell students to sit
far away from any of their teammates;
(c) give students ample time to provide
their evaluations; and (d) have students
place the peer evaluations in a sealed
envelope prior to submitting them to the
instructor. If all of these conditions are
met, one would think that the evaluations conducted inside the classroom
would be very similar to those collected
outside of it.
Future researchers may wish to
examine how peer evaluations are
affected by other factors that, up until
this time, have been largely overlooked.
For example, how are peer evaluations
affected by confidentiality procedures?
Should students be allowed to read how
other students have rated them? If students know that their peer ratings are
not confidential, will this knowledge
affect the ratings they give their teammates? Future researchers may also
wish to investigate how peer evaluations
are affected by the manner in which the
evaluations are to be used. Some
instructors, for example, use peer evaluations to reward good performers and to
punish poor performers, while others
use the evaluations only to punish poor
performers. How do students want the
peer evaluations to be used and how
does their use affect the evaluations?
Classroom experiments that manipulate
the confidentiality and use of peer ratings could answer these questions.
NOTE
The author thanks the JEB editor and reviewers
for their constructive and supportive comments.
He also thanks his wife, Susan, for entering the
data in the computer program and for assisting
him with word processing.
Correspondence concerning this article should be
addressed to Curt Dommeyer, Department of Marketing, College of Business and Economics, 18111
Nordhoff Street, Northridge, California 91330.
E-mail: [email protected]
REFERENCES
Beatty, J. R., Haas, R. W., & Sciglimpaglia, D.
(1996). Using peer evaluations to assess individual performances in group class projects.
Journal of Marketing Education, 18(2), 17–27.
Brooks, C. M., & Ammons, J. L. (2003). Free riding in group projects and the effects of timing,
frequency, and specificity of criteria in peer
assessments. Journal of Education for Business, 78, 268–272.
Chen, Y., & Hao, L. (2004). Students’ perceptions
of peer evaluation: An expectancy perspective.
Journal of Education for Business, 79,
275–282.
Clark, G. L. (1989). Peer evaluations: An empirical test of their validity and reliability. Journal
of Marketing Education, 11(3), 41–58.
Cook, R. W. (1981). An investigation of student
peer evaluation on group project performance.
Journal of Marketing Education, 3(1), 50–52.
Deeter-Schmelz, D. R., & Ramsey, R. (1998). Student team performance: A method for classroom assessment. Journal of Marketing Education, 20, 85–93.
Denton, H. G. (1994). Simulating design in the
world of industry and commerce: Observations
from a series of case studies in the United Kingdom. Journal of Technology Education, 6(1),
1045–1064.
Dommeyer, C. J. (1986). A comparison of the
individual proposal and the team project in the
marketing research class. Journal of Marketing
Education, 8(1), 30–38.
Druskat, V. U., & Wolff, S. B. (1999). Effects and
timing of developmental peer appraisals in self-managing work groups. Journal of Applied Psychology, 84(1), 58–74.
Falchikov, N. (1995). Peer feedback marking:
Developing peer assessment. Innovations in
Education & Training International, 32(2),
175–187.
Fox, S., Ben-Nahum, Z., & Yinon, Y. (1989). Perceived similarity and accuracy of peer evaluations. Journal of Applied Psychology, 74(5),
781–786.
Freeman, K. A. (1996). Attitudes toward work in
project groups as predictors of academic performance. Small Group Research, 27(2),
265–282.
Grieb, T., & Pharr, S. W. (2001). Managing free-rider behavior in teams. Journal of the Academy of Business Education, 2(Fall), 37–47.
Haas, A. L., Haas, R. W., & Wotruba, T. R. (1998).
The use of self-ratings and peer ratings to evaluate performances of student group members.
Journal of Marketing Education, 20(3),
200–209.
Johnson, D. W., & Johnson, R. T. (1984). Structuring groups for cooperative learning. The
Organizational Behavior Teaching Review,
9(4), 8–17.
Johnson, C. B., & Smith, F. I. (1997). Assessment
of a complex peer evaluation instrument for
team learning and group processes. Accounting
Education, 2, 21–40.
Macpherson, K. (1999). The development of critical thinking skills in undergraduate supervisory management units: Efficacy of student peer
assessment. Assessment and Evaluation in
Higher Education, 24, 273–284.
McLaughlin, M., & Fennick, R. (1987). Collaborative writing: Student interaction and response.
Teaching English in the Two-Year College, 14,
214–216.
Park, R. C., & Kristol, D. S. (1976). Student peer
evaluations. Journal of Chemical Education,
53, 177–178.
Pfaff, E., & Huddleston, P. (2003). Does it matter
if I hate teamwork? What impacts student attitudes toward teamwork. Journal of Marketing
Education, 25, 37–45.
Strong, J. T., & Anderson, R. E. (1990). Free-riding
in group projects: Control mechanisms and preliminary data. Journal of Marketing Education,
12, 61–67.
Williams, D. L., Beard, J. D., & Rymer, J.
(1991). Team projects: Achieving their full
potential. Journal of Marketing Education,
13(2), 45–53.