some teachers find themselves with accelerated learners, whereas others, like Ms. Mauclair, may find themselves with more challenging students. Existing models do not adequately control for this problem of nonrandom assignment (Rothstein, 2008).

Students’ previous teachers can create a halo (or pitchfork) effect. Researchers have discerned that the benefits for students of being placed in the classrooms of highly effective teachers can persist for years. As a result, mediocre teachers may benefit from the afterglow of students’ exposure to effective teachers. Conversely, researchers have found “little evidence that subsequent effective teachers can offset the effects of ineffective ones” (Sanders & Horn, 1998, p. 247). As a result, the value-added ratings for effective teachers may be diminished because of previous, ineffective teachers.

Teachers’ year-to-year scores vary widely. Perhaps one of the most troubling aspects of value-added measures is that the ratings of individual teachers typically vary significantly from year to year (Baker et al., 2010). For example, in one study, 16 percent of teachers who were rated in the top quartile one year had moved to the bottom two quartiles by the next year, and 8 percent of teachers in the bottom quartile had risen to the top quartile a year later (Aaronson, Barrow, & Sander, 2003).
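This kind of instability is largely what a modest correlation predicts. Goldhaber and Hansen (2010) put the year-to-year correlation of value-added scores at roughly .30 to .40; the short simulation below (an illustration of ours, not drawn from any study cited here, assuming normally distributed scores correlated at 0.35 across years) shows how much quartile churn that level of correlation implies:

```python
import math
import random

# Illustrative simulation (assumptions ours, not the article's): draw two
# years of teacher scores from a bivariate normal distribution with a
# year-to-year correlation of 0.35, the midpoint of the .30-.40 range
# reported by Goldhaber & Hansen (2010).
random.seed(0)
n = 100_000
r = 0.35
k = math.sqrt(1 - r * r)  # ensures corr(year1, year2) = r

fell = total = 0
for _ in range(n):
    year1 = random.gauss(0, 1)
    year2 = r * year1 + k * random.gauss(0, 1)
    if year1 > 0.6745:      # top quartile of a standard normal
        total += 1
        if year2 < 0:       # below the median the following year
            fell += 1

frac = fell / total
print(f"Top-quartile teachers below the median a year later: {frac:.0%}")
```

In this toy model, a substantial share of simulated top-quartile teachers land below the median the following year, so swings like those observed by Aaronson, Barrow, and Sander (2003) are consistent with correlations in this range rather than an anomaly.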
Still Better Than the Alternatives

In general, the year-to-year correlation between value-added scores lies in the .30 to .40 range (Goldhaber & Hansen, 2010). Although this correlation is not large, researchers at the Brookings Institution note that it is almost identical to the correlation between SAT scores and college grade point average (.35); yet we continue to use SAT scores in making decisions about college admissions “because even though the prediction of success from SAT/ACT scores is modest, it is among the strongest available predictors” (Glazerman et al., 2010, p. 7). Similarly, more traditional measures of teacher performance have not been tremendously accurate. For example, until recently, many teacher evaluation systems only provided binary ratings: satisfactory or unsatisfactory, with a full 99 percent of teachers receiving satisfactory (Weisberg, Sexton, Mulhern, & Keeling, 2009). Moreover, researchers have found weak correlations between principals’ ratings of teacher performance and actual student achievement; in general, principals appear to be fairly accurate in identifying top and bottom performers, but they struggle to differentiate among teachers in the middle (Jacob & Lefgren, 2008).

When faced with imperfect predictors of college success, colleges have learned to use a variety of measures to make decisions about which students to admit. The challenges posed by value-added measurement would suggest that schools take a similar approach. School leaders should heed researchers’ consistent warnings against publicly releasing individual teacher ratings or relying heavily on value-added measures to make high-stakes employment decisions. But value-added measures might reasonably be considered as one component of teacher evaluation—when taken with a healthy dose of caution and considered alongside other measures. EL

References
Aaronson, D., Barrow, L., & Sander, W.
(2003). Teachers and student achievement
in the Chicago public high schools. Chicago:
Federal Reserve Bank of Chicago.
Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F.,
Linn, R. L., Ravitch, D., et al. (2010).
Problems with the use of student test scores to
evaluate teachers. Washington, DC: Economic Policy Institute.
Braun, H. I. (2005). Using student progress to evaluate teachers: A primer on value-added models. Princeton, NJ: Educational Testing Service.
Casey, L. (2012, February 28). The true
story of Pascale Mauclair. Edwize.
Retrieved from www.edwize.org/the-true-
Clawson, L. (2012, March 4). New
York City’s flawed data fuel
right’s war on teachers. Daily Kos.
Retrieved from www.dailykos.com/
Glazerman, S., Loeb, S., Goldhaber, D., Staiger, D., Raudenbush, S., & Whitehurst, G. (2010). Evaluating teachers: The important role of value-added. Washington, DC: Brookings Institution.
Goldhaber, D., & Hansen, M. (2010).
Assessing the potential of using value-added
estimates of teacher job performance for
making tenure decisions (Working paper
31). Washington, DC: National Center
for Analysis of Longitudinal Data in Education Research.
Goe, L., Bell, C., & Little, O. (2008).
Approaches to evaluating teacher
effectiveness: A research synthesis.
Washington, DC: National Comprehensive
Center for Teacher Quality.
Jacob, B. A., & Lefgren, L. (2008). Principals
as agents: Subjective performance measurement in education (Faculty research
working papers series No. RWP05-040).
Cambridge, MA: Harvard University John
F. Kennedy School of Government.
Marzano, R. J. (2000). A new era of school
reform: Going where the research takes us.
Aurora, CO: McREL.
Rothstein, J. (2008). Student sorting and
bias in value-added estimation: Selection
on observables and unobservables. Paper
presented at the National Conference on
Value-Added Modeling, Madison, WI.
Retrieved from www.wcer.wisc.edu/news/
Sanders, W. L., & Horn, S. P. (1998). Research findings from the Tennessee value-added assessment system (TVAAS) database: Implications for educational evaluation and research. Journal of Personnel Evaluation in Education, 12(3), 247–256. Retrieved from www.sas.com/
Weisberg, D., Sexton, S., Mulhern, J., &
Keeling, D. (2009). The widget effect:
Our national failure to acknowledge and
act on differences in teacher effectiveness.
Brooklyn, NY: New Teacher Project.