239205628 Science Education Journal

This article was downloaded by: [PERI Pakistan]
On: 7 January 2010
Access details: [subscription number 778684090]
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Assessment in Education: Principles, Policy & Practice

Publication details, including instructions for authors and subscription information:
http://www.informaworld.com/smpp/title~content=t713404048

The Role of Assessment in Science Curriculum Reform
Graham Orpwood

To cite this Article: Orpwood, Graham (2001) 'The Role of Assessment in Science Curriculum Reform', Assessment in
Education: Principles, Policy & Practice, 8: 2, 135–151
To link to this Article: DOI: 10.1080/09695940125120
URL: http://dx.doi.org/10.1080/09695940125120


Assessment in Education, Vol. 8, No. 2, 2001

The Role of Assessment in Science
Curriculum Reform
GRAHAM ORPWOOD
York/Seneca Institute for Science, Technology and Education (YSISTE), 70 The Pond
Road, Toronto, Canada M3J 3M6

ABSTRACT

The argument of this article is that changes in curriculum need to be closely
linked to changes in assessment and that this is true as much of the forms of assessment as
it is of its content. Using science as the case in point, the article shows that the changes in
the goals of science education in the 1960s towards a greater emphasis on inquiry skills were
matched some 20 years later with a change in assessment to include performance assessment.
Now the new goals of science education are focused on the need to link science to the broader
social context, but assessment practices have yet to catch up with this change. Given the
relatively greater importance of assessment in the present era, the new curriculum emphasis
may well be ignored unless new approaches to assessment are designed and implemented soon.

Developing valid and reliable assessment instruments is complex at the best of times.
However, at times of major change in the curriculum, additional challenges and
dilemmas present themselves to test developers and force questions about the
sometimes competing roles of assessment in the larger educational context. In times
gone by, such competing roles might have been of only academic interest. At the
present, however, assessment—whether international, national or local—has become
of such importance, both educationally and politically, that clarifying the roles and
purposes of assessment has become a priority.
In the past 10 years, I have had the opportunity of participating in or closely
observing several science curriculum development projects and also two science
assessment projects. The curriculum development projects have been in the
Canadian context and specifically in the province of Ontario, but in the course of
undertaking these projects I have also had reason to analyse science curriculum
developments elsewhere in the world. The assessment projects with which I have
been associated have included the Third International Mathematics and Science
Study (TIMSS), a large international study for which I acted as science
coordinator, and the Assessment of Science and Technology Achievement Project
(ASAP), an Ontario project to develop curriculum and assessment resources for
classroom teachers. Despite the differences in purpose of these two projects, they
shared the challenge of developing valid science assessment instruments in a period
of significant curriculum change.
ISSN 0969-594X print/ISSN 1465-329X online/01/020135-17 © 2001 Taylor & Francis Ltd
DOI: 10.1080/09695940120062629

Curriculum development should, of course, include the development of appropriate
assessment, both for the classroom teacher to use and for any external summative
assessment. However, in my experience, many curriculum guides or policy directions
are determined by one institution, learning materials (such as textbooks) by another
and assessment (both classroom and external) by yet other individuals or examination
boards. Among these various players, there may or may not be consistency of
understanding or commitment concerning the curriculum, and the assessments (by
whomever given) may or may not have high validity. This consistency and validity (or
their absence) form the central theme of this article, which takes science as its case
in point. However, the central issue may be equally applicable to other subject areas.
The argument of the article begins with a brief contextual account of some of the
changes that have been taking place in science curriculum during the past 50 years,
at least in the English-speaking world. I shall argue that, while some of these have
constituted what can be called ‘normal’ curriculum change, others—using the new
Ontario science curriculum as a case in point—warrant the label of curriculum
‘revolutions’. Next, I examine the role of assessment during these periods of
curriculum revolution and identify one corresponding revolution in this area.
Finally, through reflecting on these experiences, I shall argue that leadership in
assessment in support of curriculum change must come through research and the
professional development of teachers, rather than through large-scale international
assessment projects.
Science Curriculum Change

From the long-term perspective, the science curriculum can be seen always to be in
a state of flux. While governments or official curriculum agencies or examination
boards may not issue a new curriculum every year, teachers are always finding new
ways to present the curriculum to students and, therefore, students always
experience a ‘new’ curriculum. Sometimes, the change is minor and simply constitutes
changing instructional routines, but at other times, especially after a new ‘official’
curriculum has been issued, the changes called for at the classroom level may be
more significant.
The International Association for the Evaluation of Educational Achievement
(IEA) has developed a useful framework for distinguishing three different senses in
which the term ‘curriculum’ is used (Robitaille et al., 1993):
· the intended curriculum: as set out or mandated in official statements of the
curriculum;
· the implemented curriculum: as actually taught or delivered in schools;
· the attained curriculum: as achieved by the students.
This is a useful framework as it enables us to conceptualise important relationships
among the three levels. In general (and to over-simplify the complexities of the
relationships among these three), governments (or other official agencies) control
the intended curriculum, teachers the implemented curriculum and students the
attained curriculum. The first two levels are never entirely synchronised and, at
times, there may be very significant slippage between them. Those involved with
assessment want to make claims about the third level, the ‘attained curriculum’.
However, since this is not directly observable, we have to use ‘indicators’ of
achievement in the form of assessment instruments, from which the attained
curriculum can be inferred. The first part of this article is focused on ways in which
the intended curriculum has changed over the years. Later, I shall consider how
approaches to assessment have reflected these changes.

Normal Curriculum Change
Of course, the science content of the intended curriculum (the ‘what’ that should be
taught and learned, which is the substance of all curricula) is constantly undergoing
refinement as science itself evolves and as Spencer’s century-old question, ‘what
knowledge is of most worth?’, is constantly given new answers. As genetics,
microbiology and ecology have become recognised as critically important elements of
biology, the traditional botany and zoology that characterised curricula of the 1950s
have given way more and more to these newer aspects of life science. Earth and
space science has moved from Geography to Science in several jurisdictions as the
importance of the scientific (as opposed to the social) aspect of these areas has
increased. Chemistry increasingly focuses its attention on matters of structure,
mechanism and energy change, rather than on the more traditional classification
and properties of materials. Curiously, school physics appears to have
retained a more traditional view of appropriate content with the term ‘modern
physics’ being used to characterise aspects of the subject discovered largely in the
period of 1890–1920. Even the inclusion of technology in some science courses can
be seen as yet another adjustment to the course content. If Kuhnian terminology
can be employed in this context, these changes in content can be seen as aspects of
‘normal’ curriculum change.

Science Curriculum Revolutions
However, overlaying this normal evolutionary change of the science curriculum, the
past 50 years have seen at least two more important changes, changes that I believe
warrant the term ‘curriculum revolutions’ [1]. These revolutionary changes, like
the paradigm shifts Kuhn described to explain the growth of science, were intended
to change, in a fundamental way, how the science curriculum was to be understood,
taught and learned. They focused less on the content of the science curriculum, and
more on the goals for or purposes of teaching and learning science. Scientific
knowledge still represents the core of the curriculum, but the question ‘Why are we
learning this stuff?’ is given a new set of answers.
Roberts’ (1982) concept of curriculum emphasis helps capture the nature of the
change involved. For Roberts, the content of science teaching is always presented in
a context (he calls it a ‘curriculum emphasis’), which communicates to the student
(often implicitly) the purpose of learning the science content. Roberts has also
described a series of seven such curriculum emphases that have characterised science
curricula during this century. While elements of most of these can be found in some
classrooms today, it was not always so. In the 1950s, barely three of Roberts’ seven
emphases were to be found, and two of these (‘correct explanations’ and ‘firm
foundations’) both see acquiring scientific knowledge as an end in itself, the only
worthwhile outcome of the curriculum: in the first case (correct explanations) because
science is ‘true’, and in the other (firm foundations) because it sets a foundation
for the further study of science. In this context, ‘normal change’ of the curriculum
can be seen as the adjustment of which scientific knowledge should be learned and
at which stage. By contrast, ‘revolutionary change’ involves the introduction of one
or more brand new or radically different curriculum emphases, a phenomenon
we observed first some 30 or 40 years ago, and are observing, once again, at the
present time. The new emphasis not only adds a dimension to the curriculum; it
also changes radically the selection of science content seen to be important and
changes the ways in which students are expected to interact with that science
content.
The first of these periods of revolutionary change began in the late 1950s and
1960s in both the British and American (i.e. USA) education systems, as well as
elsewhere in the world. During this period, science curricula became focused on the
nature and processes of the scientific discipline itself. This was the period in
England of the Nuffield science projects (and those that followed in the same
tradition) at both secondary and primary school levels. In the USA, similar
emphases were being incorporated into science curriculum projects such as PSSC
physics, ChemStudy chemistry, BSCS biology (at the secondary school level)
and Science: a process approach (SAPA), the Elementary Science Study (ESS) and
the Science Curriculum Improvement Study (SCIS, at the elementary school
level).
A whole literature sprang up, which both articulated the rationale underlying
these curriculum projects, and advocated them to teachers and schools (Hurd &
Gallagher, 1968; Hurd, 1969, 1970). For example, in a withering critique of
traditional school science, Schwab (1965) characterised it as ‘a rhetoric of
conclusions’, which ignored the underlying processes of inquiry that, he argued, more truly
represented the essence of science. Spurred on by the launch of the Russian Sputnik
in 1957, the US government poured millions of dollars into the new science
curricula with a concern to generate more and better scientists to support the
national security imperatives.
In studying science using these curricula, students were expected not just to learn
the concepts and theories of science, but also to acquire an understanding of how
science functions as a discipline and the skills associated with scientific investigation.
Hodson (1993, p. 106) has summarised the purposes of science education following
this period in terms of students ‘learning science, learning about science, and doing
science’. Learning scientific concepts, laws and theories was still seen as important,
but equally important was the context in which the content was to be set. ‘Doing
science’ meant acquiring the skills, strategies and habits of mind associated with
scientific investigation. ‘Learning about science’ referred to understanding how
science functioned as a discipline, its practice, its methods, its logic and its
epistemology. Two new emphases, scientific skill development and the structure of
science (Roberts, 1982), had been born, at least on paper.
This science curriculum revolution influenced science curriculum talk (Orpwood,
1998), in curriculum guides, textbooks and professional development workshops,
for nearly four decades. However, classroom teachers were, for the most part,
unprepared to teach these new emphases. Few had had any personal experience of
‘hands-on’ scientific inquiry or received any formal background study in the
philosophy of science. Despite the enormous amounts of money and effort that the
curriculum projects put into in-service teacher education, the innovations were
rarely fully taken up in schools, at least in the form that their developers had in mind
(e.g. Stake & Easley, 1978).
Over the subsequent decades, another chapter of the research literature was
devoted to explaining why the curriculum revolution had failed to take root fully in
American schools. There were many factors involved, but one was the failure of
assessment in school science to match the changes in direction adopted by the
curriculum. To this point we shall return after considering the second great curriculum revolution of the past half-century.
The second period of revolutionary change began slowly in the early 1980s and
has now (in the late 1990s) gathered significant momentum in many countries of the
world. If the first revolution focused attention inward towards the structure and
processes of science itself, the second balances this with attention outward towards
society and the complex relationships among science, technology, society and the
environment. A significant literature has now described the development and
rationale for the many versions of this new focus for science education (e.g. Hurd,
1975; Aikenhead, 1980; Solomon, 1981; Bybee, 1985; Fensham, 1988; Cheek,
1992a; Solomon & Aikenhead, 1994; Yager, 1996; Black & Atkin, 1996, to name
but a few). Now, in addition to acquiring basic scientific knowledge and the skills of
scientific investigation, students are being expected to understand how science is
related to technology, and how both science and technology impact on society and
the environment. This new curriculum emphasis has even acquired its own
acronym, STS (for science, technology and society) [2].
Once again, national standards and curriculum guides have begun to embrace this
second revolution (e.g. American Association for the Advancement of Science,
1995; National Research Council, 1996; Council of Ministers of Education,
Canada, 1997; Government of Ontario, 1998, 1999). Another factor is influencing
this revolution in a way that it did not in the 1960s. In the past, goals or aims were
usually stated quite independently from the science content. The result was that
textbooks, teachers and assessors were free to embrace or ignore them. Now, since
many of the newer curriculum guides are stated in the form of outcomes, which
incorporate both goals and content, the new emphasis has become an integral part
of the curriculum specifications (Orpwood & Barnett, 1997). By way of illustration,
I will describe the new Grades 1–8 science and technology curriculum in the
Canadian province of Ontario.

The Ontario Curriculum in Science and Technology: a case in point
The document that eventually became The Ontario Curriculum, Grades 1–8, Science
and Technology (Government of Ontario, 1998) was developed by a consortium of
teachers and school districts led by science educators at York University as a product
of the Assessment of Science and Technology Achievement Project (ASAP). It had a
series of features that were new for the province. It represented the first curriculum in
Ontario in 30 years that clearly articulated expectations in science for each grade of
the elementary school. It integrated the study of science with that of technology, the
first time technology education was specifically mandated in Ontario. It introduced
the study of earth and space sciences into the science curriculum (these areas having
been regarded previously as physical geography). It was set out in the form of
outcomes for what students should know and be able to do by the end of each
year. All of these could be regarded as ‘normal’ changes to the curriculum, even
though these were major changes and presented significant challenges for classroom
teachers.
However, the three goal statements represented the element of the curriculum that
was revolutionary. These are that students are intended:
· to understand the basic concepts of science and technology;
· to develop the skills, strategies and habits of mind required for scientific inquiry
and technological design;
· to relate scientific and technological knowledge to each other and to the world
outside the school (Government of Ontario, 1998, p. 4).
These goals emerged from a complex project design that combined analysis of the
following factors:
· an up-to-date view of the nature of science and technology;
· curriculum trends nationally and internationally;
· research on children’s capacity to learn;
· the experience of classroom teachers;
· consideration of the needs of Canadian society;
· a deliberated consensus about all of these and the needs of Ontario’s children.

This is not the place for an exhaustive account of all of these factors. However, two
can serve to demonstrate the origins of the goals adopted by this curriculum. For
example, the project sought to link the concepts of ‘science’ and ‘technology’ as
school subjects to the array of concepts that each embodies in the real world (see
Orpwood & Bloch, 1998, pp. 7–9). Both are, first, ‘systems of knowledge’: science
seeks to describe and explain the natural and physical world, and technology
seeks to meet human needs through inventing or modifying devices, structures,
systems or processes. Secondly, both science and technology are processes of
investigation and exploration: science through the processes of inquiry and
technology through those of design. Thirdly (and this represented a new element for
many), science and technology are both social enterprises, which exist in social,
economic, political and environmental contexts. Omission of these contexts means
that only a partial view of both science and technology is presented.
The changing needs of students in the new millennium were another major
component that emerged from the research that examined trends nationally and
internationally (Orpwood & Barnett, 1997). The project held deliberations involving
a wide variety of stakeholders that led to a clear consensus about the desirable aim
of the curriculum: to ensure that every student receives the opportunity to
develop basic scientific literacy and technological capability. These, in turn, involved
three elements:
· understanding the core concepts of science and technology;
· acquiring the skills important for life and work in the twenty-first century;
· being able to relate the knowledge and skills acquired in school to real-life
situations.
The goals that emerged from this process (which occupied many hundreds of people
and lasted two full years) are not coincidentally very similar to those appearing in
many other new curricula in jurisdictions around the world. Indeed, analysing these
curricula was a component of ASAP. However, they do differ significantly from
those of the first curriculum revolution and even more from those from before that
time.
Incidentally, at the same time that ASAP was undertaking its curriculum
development work, the Council of Ministers of Education, Canada, completed the
development of its own framework for science curriculum known as the Pan-Canadian
Science Framework (Council of Ministers of Education, Canada, 1997).
There was significant interaction between the two projects, since the principal
architect of the ASAP document (Marietta Bloch) was also a member of the
Pan-Canadian development team. The goals articulated by the Pan-Canadian
framework are entirely compatible with those in Ontario and, thus, this document
belongs to the new generation of ‘second revolution’ curriculum frameworks (Aikenhead, 2000).
The three goals, once articulated, form the conceptual glue that binds the rest of
the document together. The content is organised in five strands, which effectively
integrate the science and technology content knowledge:
· life systems;
· matter and materials;
· energy and control;
· structures and mechanisms;
· earth and space systems.

For each of these strands, at each of the eight grades, the three goals are interpreted
in the form of three overall expectations and three sections of specific expectations.
The goals are clearly integrated with the content so as to ensure that they are not
omitted during implementation.

Assessment in the Context of Curriculum Change
If the curriculum went through ongoing ‘normal’ change and discrete periods of
‘revolutionary’ change, then it would be reasonable to expect that the cousin activity
of assessment would experience parallel types of change and that current forms of
assessment are co-ordinated well with the current curricula. However, I shall argue
that this does not turn out to be the case.
If the teaching of science knowledge for its own sake, as in Roberts’ (1982) ‘correct
explanations’ and ‘firm foundations’ emphases, represents the basic curriculum
paradigm of the pre-1960s period, then measuring how much scientific knowledge
a student has acquired represents the corresponding assessment paradigm.
Throughout the world, science assessment, both in classrooms and in national or
international projects, focused on students demonstrating their scientific knowledge
chiefly by responding to questions that required recall of memorised information,
solving problems through memorised algorithms, and analysis of contrived data
or situations that parallel those encountered in school science. This pattern of
school science assessment mirrors in many ways patterns experienced in university
examinations.
The pattern described here is not restricted to the use of multiple choice items,
though these are popular in North America because of the ease and reliability of
scoring. The essay-type constructed response items and the short-answer format
(more familiar to teachers and students in Europe) are equally likely to call for recall
or the simple processing of memorised information. The point I am making here is
that ‘normal’ science assessment comprises a very limited range of student cognitive
activities, regardless of the types of assessment item used.
Given this paradigm, validity issues in science assessments usually amount to
analysis of the distribution of items of various science content areas compared to the
distribution of the science content topics in the curriculum. One of the problems of
assessment thus consists of developing an assessment that is balanced with respect
to the many science topics covered, while still maintaining an assessment of
reasonable length. In classroom tests, teachers handle this by having frequent ‘unit
tests’ covering small areas of the curriculum. School examinations handle it by a
judicious selection from the topics covered—leading, of course, to the students
having to try to second-guess which topics will be ‘on the exam’ with those who
guess best being more successful than those whose predictions are less accurate.
In large-scale achievement tests, such as TIMSS, the problem is magnified, since
the range of topics covered by curricula in the many countries is very broad. This
problem can be resolved in part through a complex test design involving a very large
pool of items and the use of multiple test booklets (Adams & Gonzalez, 1996). Even
so, the problem of test development in TIMSS was significant. With a blueprint
based loosely on the science curricula of participating nations (McKnight et al.,
1993), it involved making many compromises based on such factors as field-test
results, national preferences and the avoidance of large item-by-country interactions
(Garden & Orpwood, 1996). While, from a technical (reliability and Item Response
Theory (IRT) scaling) perspective, the TIMSS written achievement tests ‘worked’
effectively, they have continued to attract criticism from observers in a variety of
countries (e.g. Fensham, 1998). The over-arching criticism has been a challenge to
their validity, particularly in these times of curriculum change.
There was much more to TIMSS than the simple assessment of students’ science
content knowledge and I shall return to further discussion of TIMSS later. First,
however, I want to discuss a revolution in assessment that corresponds to the first
curriculum revolution described earlier.

Assessment for the 1960s Curriculum Revolution
While many of the curriculum projects that incorporated the new goals (concerning
the nature of science and the acquisition of science inquiry skills) attempted to
develop their own measures of achievement, these rarely became commonplace in
schools or in large-scale assessments. Rather, teachers and national/international
assessment projects continued to use traditional assessment measures, which,
in the main, called for recall of memorised scientific knowledge.
There were perhaps four major reasons for this. First, the assessment technology
that would permit valid assessment of students’ abilities to conduct investigations in
science had not been designed in the 1960s. The first significant ‘performance
assessments’ (as they have now become known) were designed in England in the
early 1980s by the Assessment of Performance Unit (APU, 1983), fully 20 years after
the goal of instilling ‘inquiry skills’ in students had first been introduced into the
curriculum. Even after its initial successes in the UK, the APU saw its funding cut
and it was even longer before the sort of assessment using performance tasks became
familiar in North America.
Secondly, even when newer, more authentic assessments had been developed, the
psychometric community, particularly in the United States, was anxious to
maintain the reliability of the multiple choice and other objectively scored tests
and expressed scepticism over such new measures of assessment. It is only in the last
decade that significant research on the characteristics of performance assessments
has become commonplace in the educational literature.
The third reason had to do with public credibility: universities and the public
thought they knew what traditional tests measured, and new, unproven forms of
assessment lacked the familiarity and thus the credibility of the traditional ones. This
reason is still, as we shall note later, a problem for science educators who try to make
their assessments match the intended goals of the curriculum.
The final reason, I would submit, was a professional inertia amongst teachers
themselves, particularly at the secondary school level, for whom assessment has
often tended to mimic the examinations experienced at university. Thus, when a
national assessment was developed or reviewed by a committee of teachers, the
items most likely to be considered acceptable were those of the most traditional
variety.
These factors led to a significant delay in the paradigm shift in assessment
corresponding to the curriculum revolution of the 1960s. It was the 1980s before
performance assessment even made its first significant appearance and the 1990s
before it became at all widespread. Even in TIMSS, the performance assessment
component (Harmon et al., 1997), which had initially been described as ‘integral’ to
the study, was later treated as a national option, was reported (at the international
level) separately from the paper-and-pencil assessment, and was dropped entirely
from the TIMSS-R replication study taking place in 1999.
It appears that the goals that formed the essence of the science curriculum
revolution of the 1960s are still not being assessed with the same degree of attention
as those that focus on simple recall of scientific information. Stake & Raizen (1997),
commenting on this situation in a recent review of curriculum innovations in the
United States, observe that:
most reformers in the eight projects we studied agreed that the reconceptualization of science education is incomplete if it leaves out the reconceptualization of assessment. Yet systemic educational reform calls for the use of
rigorous, objectively scored, standardized tests as bottom-line criteria.
(p. 138)
They also point out the political dilemma:
it is difficult to assure parents, taxpayers, and sceptical teachers that the
new curricula and teaching strategies will provide students the information
that achievement testing has traditionally required. Reformers who claimed
that back in the 1960s failed to be persuasive (Stake & Easley, 1978).
(Stake & Raizen, 1997, p. 132)

Assessment for the 1990s Curriculum Revolution
If assessment of the curriculum goals that characterised the 1960s revolution has
been delayed and still seeks credibility, that for the STS revolution in science
curriculum has barely surfaced at all beyond the research level. Some researchers
have recognised the problem (e.g. Aikenhead et al., 1987; Bybee, 1991; Cheek,
1992b) and some projects have attempted to tackle it (e.g. American Chemical
Society, 1988; Aikenhead & Ryan, 1992). Cheek (1992b) reports that STS components are contained in the work of the New York State Education Department, the
South Australia Senior Secondary Assessment and several examination boards in the
UK. In Canada, Alberta Education’s assessment branch has also attempted to
ensure that the STS components of the curriculum are truly reflected in their
provincial assessments.
However, for most classroom teachers and large-scale assessment projects there
remains little guidance or exemplary work that addresses how to assess student
achievement in the context of an STS-orientated curriculum. In times gone by, this
might not have mattered if teachers were convinced that an STS emphasis was
something they felt was right to integrate into their science programmes. However,
with the increasing importance of assessment and the measuring of students’
achievement of the intended outcomes, the absence of STS from classroom and
large-scale assessment is likely to have a profound influence on whether the STS-related outcomes in the curriculum are taken seriously.

Item A1
Nuclear energy can be generated by fission or fusion. Fusion is not currently being used
in reactors as an energy source. Why is this?
A. The scientific principles on which fusion is based are not yet known.
B. The technological processes for using fusion safely are not yet developed.
C. The necessary raw materials are not yet readily available.
D. Waste products from the fusion process are too dangerous.
FIG. 1. Item A1.

The test development experience of TIMSS once again provides an illustration of
some of the difficulties associated with trying to include assessment items that
address the STS emphasis in the science curriculum. The main TIMSS item pools
for 9- and 13-year-olds—TIMSS populations 1 and 2—contained few items (5% at
most) that addressed STS issues and no more than this number that focused on the
nature of scientific investigation. However, the Mathematics and Science Literacy
(MSL) component of the third TIMSS population (school-leavers) represented the
most systematic attempt to include such items, in a category of the test labelled
‘Reasoning and Social Utility’ (RSU) [3]. While, in the end, RSU was not used as
an independent reporting category, several STS items were included in this aspect
of the MSL achievement tests. Some of them, such as Item A1, call for students to
recall previously learned scientific or technological information (Fig. 1).
Others, such as Item A7, call for the application of scientific principles to a social
situation (Fig. 2).
A third type of item, of which there were very few examples in TIMSS, but which
perhaps illustrates the STS emphasis more faithfully, is exemplified by Item A11
(Fig. 3).
This item was based on a real-life scenario (described in a newspaper article) and
the original item only contained part (B). This version of the item was challenged by
the TIMSS subject-matter specialists as ‘containing no science’ and thus part (A)
was added. The second part of the item is clearly an attempt to assess students’ STS
understanding in that it invites consideration of the social and economic consequences of the introduction of a new technology.

Item A7
Some high-heeled shoes are claimed to damage floors. The base diameter of these very
high heels is about 0.5 cm and that of ordinary heels about 3 cm. Briefly explain why
the very high heels may cause damage to floors.
FIG. 2. Item A7.

Item A11
It takes 10 painters 2 years to paint a steel bridge from one end to the other. The paint
that is used lasts about 2 years, so when the painters have finished painting at one end
of the bridge, they go back to the other end and start painting again.
A. Why must steel bridges be painted?
B. A new paint that lasts 4 years has been developed and costs the same as the old
paint. Describe two consequences of using the new paint.
FIG. 3. Item A11.

The item very nearly failed to survive in the MSL achievement test because of a
variety of additional factors including:
· difficulties with scoring, since the notion of a ‘correct answer’ is dependent on the
socio-political context;
· the view of educators in many countries that it was not appropriate for
a science achievement test, even one that focused on science literacy;
· the implicit introduction of ‘values’ into a science assessment—the response one
gives to part (B) of this item requires one to adopt a value-laden ‘position’, and
think through the assumptions and consequences of that position.
Nevertheless, the item did remain in TIMSS and the results, which are currently
being analysed for another paper, show some interesting patterns of response across
the world and even within countries.
However, the difficulties encountered in the development and use of this item
remain and would appear to be endemic to STS assessment. Aikenhead and his
colleagues at the University of Saskatchewan have suggested that ‘a new generation
of standardized instruments’ (Aikenhead et al., 1987) is required or, in the language
of this article, a new revolution in assessment. Their work in developing the Views
on Science-Technology-Society (VOSTS) instrument certainly represents a challenge to the normal conception of assessment in science. In their words, ‘VOSTS
requires students to write an argumentative response—a reaction to a statement
about a STS topic. Rather than analyzing “right” and “wrong” answers, we let
students’ arguments define various positions or viewpoints on each STS topic’.
While the original VOSTS was not suitable for use in large-scale assessments, it has
since been adapted to describe students’ views on STS in Ontario (Crelinsten et al.,
1993). VOSTS overcame the ‘problem of values’ by allowing the student to adopt
any position and assessing the quality of the argument instead.
The new OECD study, Programme for International Student Assessment, known
as PISA, is also attempting to push the bounds of assessment in the area of STS.
However, it is resisting the inclusion of items that require values to be analysed.
Rather, it is presenting students with scenarios from real life and asking them to
demonstrate their abilities at using ‘scientific processes’ in the analysis of the issues
involved (for more information see the PISA Framework document, Programme for
International Student Assessment, 1999).

Task 1LS/PT02 (for grade 1, Life Systems strand)
GAME TIME
Design and make a game for a child who is not able to see. Name your game and
describe the rules so that others can play it.
What other senses will people who play your game have to use?
Name the materials you used to make the game and describe why you chose them.
Draw a picture of the game and label the parts.
Describe the rules of your game.

FIG. 4. Task 1LS/PT02.

The Assessment of Science and Technology Achievement Project (ASAP) has
developed a wide range of assessment tasks for classroom use (Orpwood et al.,
1999) corresponding to the full range of the expectations contained in the Ontario
science and technology curriculum described earlier. The focus of the collection of
500 tasks covering eight grades is on ‘what students can do with what they know’,
rather than on the traditional ‘what they know’. In the area of STS, some of the
tasks put students into real-world situations and ask them to reflect on the situation
in some important respect. In this way, the ASAP collection bears similarities to
the PISA science assessment. A few sample tasks can serve to illustrate the point
(Fig. 4).
The Grade 1, Life Systems unit is entitled ‘Characteristics and Needs of Living
Things’ and the task is focused on several expectations from the ‘skills of inquiry and
design’ section of the curriculum including asking questions about the needs of
living things, planning investigations and communicating results. In addition, the
task addresses the STS expectations of comparing the ways in which humans use
their senses to meet their needs, and describing ways in which people adapt to the
loss or limitation of sensory ability.
Not all the tasks are ‘hands-on’ in the sense of requiring students to undertake
practical work in a laboratory setting. Consider the following task, for example, from
the Grade 7 Life Systems unit on ‘Interactions within Ecosystems’ (Fig. 5).

Task 7LS/EA04 (for Grade 7, Life Systems strand)
A construction company is about to bulldoze a wood lot with a pond nearby so that
a new housing development can be built. Devise a plan so that the new houses get built
and yet the environment, and the plants and animals in it, get protected. We want this
to be a win-win situation. How can the new houses be built and yet the environment
still protected?
FIG. 5. Task 7LS/EA04.

These tasks call for students to think holistically about a real-world situation,
taking into account the competing demands of apparently conflicting positions. They
call for the creative development of solutions to a problem that clearly has no right
or wrong answers. In both cases, students must have developed prior knowledge
and skills, and in both the responses will demonstrate their abilities at these. The
focus here is on integrated, open-ended thinking of a kind not usually sought in
science assessments. Scoring responses to such questions will be hard, especially
if reliability considerations are paramount. Yet both would appear to be entirely
appropriate given the expectations of the curriculum. While these tasks are
not presented as ideal examples of STS assessment items, they represent the
direction that the needed assessment revolution must pursue if the latest curriculum revolution is to be reflected adequately in classroom assessments.

Concluding Thoughts: what counts as science assessment?
Clearly, new directions—arguably revolutions—are emerging in science curricula in
various parts of the world. The assessments required to determine students’ achievement of the new goals of science curricula, however, have been slow to catch up.
While recent progress in the use of performance assessments has focused attention
on what students can ‘do’ in science, as well as on what they ‘know’, the new
challenges presented by the STS revolution in science education have not been
systematically addressed by most assessments. Indeed, the problem of ‘what counts’
as science assessment has in many cases not developed much from the pre-revolutionary era when measuring the quantity of students’ knowledge of science
was the major focus.
Of course, the new STS revolution appears in many varieties. It is not the case
that all versions of STS curriculum are focused on the same specific goals or
integrate STS with science content in the same way or to the same extent, as
Aikenhead (1994) has pointed out. For example, the five items sampled above all
reflect some aspect of STS in that all of them link science topics with STS content.
However, each of them does so in a different way. Some (e.g. A7 and A11a) call for
students simply to apply their scientific knowledge, albeit in an STS context. Others
(such as A1) call for students to recall specific STS information. Yet others (e.g.
item A11b) require students to demonstrate little knowledge of science content, but
rather to be able to reason about the impact of science and technology in a social
context.
Those of us who advocate STS in science education have a responsibility to clarify
more precisely what we expect students to be able to demonstrate in an assessment
context if we expect STS to appear more consistently in science assessments of any
kind. What is needed is a framework that enables analysis of the varieties of STS
objective incorporated in a curriculum, and thus of the types of assessment that are
appropriate. Aikenhead’s framework provides a useful start, but it focuses on the
percentage of a complete assessment that is STS. As the items shown here demonstrate, the issue is not simply one of ‘how much’ of an assessment is STS, but also
‘what types’ of student performance are called for and how these relate to the intent
of the STS curriculum. Any move towards a more comprehensive framework for the
assessment of STS must take these complexities into account.

International and other large-scale assessments face a particular dilemma. On the
one hand, their validity is sometimes determined (as was the case in TIMSS) not
only in reference to the content of the intended curricula, but also partly in relation
to the implemented curricula. Even in countries that intend the curriculum to
include STS, implementation may lag well behind the intended curriculum changes.
It is hard, therefore, for such international projects to provide ‘leadership’ in terms
of promoting new forms of assessment having higher validity in some countries,
while also remaining ‘acceptable’ to all participants. At the same time, the political
status and high-profile consequences of large-scale international studies such
as TIMSS may encourage the maintenance of the status quo, or even slow down the
spread of curriculum revolutions across and within the countries that participate.
Leadership is therefore required from all quarters to ensure that innovations such
as performance assessment and STS assessment are not allowed to be regarded as
‘second-class’ or entirely ‘optional’ ways of assessing achievement in science education. In the case of large-scale assessments, this requires new models for addressing validity to be introduced such as that proposed (but not implemented) for
TIMSS by Shavelson et al. (American Educational Research Association, 1993). They
proposed what became known within TIMSS as the ‘flower and petals’ model,
involving a core cluster of items as an assessment for all countries, and other clusters
of items, which would be taken by those countries electing to do so. Such a model
might have gone some way to resolving the dilemma of validity across the many
countries participating in TIMSS.
At the same time, the professional inertia that resists change in assessment at the
classroom, local and national levels needs to be addressed. Here, I believe that the
key move is to integrate assessment with the professional development of teachers,
as is already the practice of the Ontario provincial assessment programme and in the
next phase of ASAP currently under way. Teachers will work on developing new
forms of assessment for their classrooms as part of an ongoing series of professional
development workshops, and thereby address together the challenges of a new STS
curriculum and of assessing it in an appropriate way.
Finally, academic leadership must be shown through greater collaboration between the curriculum and psychometric research communities. One of the casualties
of academic specialisation is that those of us schooled in the issues of curriculum,
teaching and learning are not often also up-to-date with developments in assessment, while those whose expertise lies in assessment have not had time or interest
to understand the complexities of the revolutions that have taken place in the
curriculum. Dialogue across this divide is required if the revolutions of the intended
science curriculum are to be reflected in the real and reported achievements of
students, in whose interests the entire enterprise is undertaken.

NOTES
[1] The term ‘revolution’ is also used in this way by Atkin et al. (1996).
[2] Sometimes, ‘Environment’ is added as an additional element to STS, making the acronym
STSE (see Council of Ministers of Education, Canada, 1997, for example).
[3] Orpwood & Garden (1998) describe the test development for the MSL component of
TIMSS in detail.

REFERENCES
ADAMS, R. & GONZALEZ, E. (1996) The TIMSS test design, in: M. MARTIN & D. KELLY (Eds)
Third International Mathematics and Science Study, Technical Report, Volume 1: design and
development (Chestnut Hill, Boston College).
AIKENHEAD, G. (1980) Science in Social Issues: implications for teaching (Ottawa, Science Council
of Canada).
AIKENHEAD, G. (1994) What is STS science teaching? in: J. SOLOMON & G. AIKENHEAD (Eds) STS
Education: international perspectives on reform, pp. 47–59 (New York, Teachers College
Press).
AIKENHEAD, G. (2000) STS science in Canada: from policy to student evaluation, in: D. KUMAR
& D. CHUBIN (Eds) Science, Technology, and Society: a source book on research and practice, pp.
49–89 (Kluwer, Plenum Press).
AIKENHEAD, G. & RYAN, A. (1992) The development of a new instrument: ‘Views on science-technology-society’ (VOSTS), Science Education, 76, pp. 477–491.
AIKENHEAD, G., FLEMING R.