effectiveness (Koretz, 2002). Moreover, different models can produce different results for the same teacher (Harris, Sass, & Semykina, 2010; McCaffrey, Lockwood, Koretz, & Hamilton, 2004), as can different tests plugged into the same model (Papay, 2011).

These are all important points, but the unfortunate truth is that virtually all measures can be subject to such criticism, including the one that value-added opponents tend to support: classroom observations. Observation scores can be similarly imprecise and unstable over time (Measures of Effective Teaching Project, 2012). Different protocols yield different results for the same teacher, as do different observers using the same protocol (Rockoff & Speroni, 2010).

As states put together new observation systems, most are attempting to address these issues. For instance, many are requiring that each teacher be evaluated multiple times every year by different observers. The same cannot, however, be said about value-added estimates. Too often, states fail to address the potential problems with using these measures.

Four Research-Based Recommendations

It is easy to sympathize with educators who balk at having their fates decided in part by complex, seemingly imprecise statistical models that few understand. But it is not convincing to argue that value-added scores provide absolutely no useful information about teacher performance. There is some evidence that value-added scores can predict the future performance of a teacher’s students (Gordon, Kane, & Staiger, 2006; Rockoff & Speroni, 2010) and that high value-added scores are associated with modest improvements in long-term student outcomes, such as earnings (Chetty, Friedman, & Rockoff, 2011). It is, however, equally unconvincing to assert that value-added data must be the dominant component in any meaningful evaluation system or that the value-added estimates are essential no matter how they are used (Baker et al., 2010).

By themselves, value-added data are neither good nor bad. It is how we use them that matters. There are basic steps that states and districts can take to minimize mistakes while still preserving the information the estimates provide. None of these recommendations are sexy or even necessarily controversial. Yet they are not all being sufficiently addressed in new evaluation systems.

Avoid mandating universally high weights for value-added measures.

There is no “correct” weight to give value-added measures within a teacher’s overall evaluation score. At least, there isn’t one that is supported by research. Yet many states are mandating