Stephen Chappuis, Jan Chappuis,
and Rick Stiggins
Long before No Child Left Behind (NCLB), high-stakes tests were common in schools. Cut scores on tests have dictated promotion from one grade level to the next, and teachers have used them to
assign passing or failing grades. High school students
continue to take course placement exams, subject-area finals, exit exams, and college entrance tests.
Making decisions that affect individuals and groups
of students on the basis of a single measure is part of
our past and current practice.
In the past, few educators, policymakers, or
parents would have considered questioning the accuracy of these tests. Most assumed that a low score or
grade was probably justly assigned and that a decision made about a student as a result was as defensible as the evidence on which it was based.
But NCLB has exposed students to an unprecedented overflow of testing. In response to the
accountability movement, schools have added new
levels of testing that include benchmark, interim,
and common assessments. Using data from these
assessments, schools now make decisions about individual students, groups of students, instructional
programs, resource allocation, and more. We’re
betting that the instructional hours sacrificed to
testing will return dividends in the form of better
instructional decisions and improved high-stakes test
scores.
Given the rise in testing, especially in light of a
heightened focus on using multiple measures, it’s
increasingly important to address two essential
components of reliable assessments: quality and
balance.
Keys to Quality
Although it may seem as though having more assessments will mean we are more accurately estimating
student achievement, the use of multiple measures
does not, by itself, translate into high-quality
evidence. Using misinformation to triangulate on
student needs defeats the purpose of bringing in
more results to inform our decisions.
Five keys to assessment quality provide the larger
picture into which our multiple measures must fit
(Stiggins, Arter, Chappuis, & Chappuis, 2006). Only
assessments that satisfy these standards—whether
teachers’ classroom assessments, department or
grade-level common assessments, or benchmark or
interim tests—will be capable of informing sound
decisions.
Clear Purpose
The assessor must begin with a clear picture of why
he or she is conducting the assessment. Who will use
the results to inform what decisions? The assessor
might use the assessment formatively—as practice or
to inform students about their own progress—or
summatively—to feed results into the grade book.
In the case of summative tests, the reason for
assessing is to document individual or group
achievement or mastery of standards and measure
achievement status at a point in time. The purpose is
to inform others—policymakers, program planners,
supervisors, teachers, parents, and the students
themselves—about the overall level of students’
performance.
Clear Learning Targets
The assessor needs to have a clear picture of what
achievement he or she intends to measure. If we
don’t begin with clear statements of the intended
learning—clear and understandable to everyone,
including students—we won’t end up with sound
assessments.
For this key to quality, it’s important to know
the learning targets represented in the written
curriculum. The four categories of learning
targets are
• Knowledge targets, which are the facts and
concepts we want students to know. In math, a