and then scoring these tests using the 100-point (or percentage) scale
makes it impossible to gauge individual students’ knowledge.

FIGURE 1. First Quarter Report for a Middle School Mathematics Student
(bar graph; the score axis runs from 0.5 to 4.0 in half-point steps)

MEASUREMENT TOPICS                   Score
Number systems                       2.5
Estimation and mental computation    1.5
Ratio/Proportion/Percent             2.0
Patterns                             3.5
Equations                            2.5
Data Analysis                        1.0

How Complex Is the Content?

Even if a teacher were vigilant enough to design tests that addressed
a single topic, the tests still might
not be useful in tracking
student progress. If the first
test addressed simpler content relative
to a topic, students would generally
receive high scores. However, if the second test addressed more complex
content, students might receive lower scores even though they had
learned quite a bit about the topic. What we need is a device to
determine the level of a test’s complexity. Once we do this, we can use
the 100-point scale with some integrity in terms of tracking students’
progress.

To make classroom assessments more comparable, we can use proficiency
scales that delineate both the topic and the level of complexity being
measured. Consider the left-hand side of Figure 2 (p. 38), which
contains a generic form of the scale; this quantifies student
understanding along a continuum that goes from lack of understanding of
even the most basic concepts to understanding complex content. The
score of 3.0 contains the target instructional goal for a topic and is
the fulcrum of the scale.

Figure 2 shows that the instructional goal is for students to be able
to describe and exemplify what different plants and animals need to
survive. Score 2.0 involves simpler content: in this case, recalling
specific terminology and factual information about plants and animals.
Score 4.0 contains more complex content relative to the topic: in this
case, comparing and contrasting animals and plants. The remaining
scores in the scale all reference these three levels of content. That
is, none of the other levels contains new content. A score of 3.5
indicates competence on score 2.0 and 3.0 content and partial success
on score 4.0 content. A score of 2.5 indicates success on score 2.0
content and partial success on score 3.0 content, and so on.

In working with schools and districts, we’ve found that three levels of
content make it easy for teachers to design assessments without
sacrificing precision of measurement. More specifically, teachers can
design assessments that address one level of proficiency only (for
example, a test that covers only score 2.0 content), or they can design
tests that cover all three levels of content. When a test addresses
only one level of content, the 100-point scale makes some sense. If
students demonstrate mastery on a test of 2.0 content, they have
reached score 2.0 status on the proficiency scale. If a test addresses
all levels of proficiency (that is, items involve 2.0 content, 3.0
content, and 4.0 content), then the teacher scores each of these three
sections with an eye toward students’ competency at that particular
level of item difficulty. (For a more detailed discussion of scoring
tests using proficiency scales, see Marzano, 2010.)

An effective standards-based grading and reporting system should
eliminate the overall or “omnibus” grade.

Recommendation 2:
If you can’t get rid of the omnibus grade, provide scores on
measurement topics in addition to the grade.

If public pressure demands that students receive an overall grade or
percentage score, a school or district can still employ the benefits of
the approach shown in Figure 1 by including the bar graphs on a report
card, along with traditional omnibus grades. The top part of the report
card might display traditional grades and the bottom part, the bar
graphs. Of course, if the 0–4.0 scale is used, it must be translated
into traditional letter grades. Here’s what this might look like:
3.51 to 4.00 = A
3.00 to 3.50 = A-
2.84 to 2.99 = B+
2.67 to 2.83 = B
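
The translation from the 0–4.0 scale to letter grades listed above can be sketched as a small lookup. This is only a minimal illustration, not part of the authors' system: the names `LETTER_CUTOFFS` and `letter_grade` are hypothetical, rounding to two decimal places is an assumption, and only the four ranges given above are encoded, so any lower score returns `None` rather than a guessed grade.

```python
from typing import Optional

# Cutoffs copied from the ranges listed in the text, highest first.
# Each entry is (lowest score in the range, letter grade).
# Assumption: only these four ranges are encoded here.
LETTER_CUTOFFS = [
    (3.51, "A"),
    (3.00, "A-"),
    (2.84, "B+"),
    (2.67, "B"),
]


def letter_grade(score: float) -> Optional[str]:
    """Map a 0-4.0 scale score to a letter grade.

    Returns None when the score falls below the lowest range encoded
    in LETTER_CUTOFFS.
    """
    if not 0.0 <= score <= 4.0:
        raise ValueError("score must be between 0 and 4.0")
    # Assumption: scores are compared at two decimal places, matching
    # the precision of the listed ranges.
    rounded = round(score, 2)
    for lowest, letter in LETTER_CUTOFFS:
        if rounded >= lowest:
            return letter
    return None
```

For example, `letter_grade(3.5)` falls in the 3.00 to 3.50 range and yields `"A-"`, while `letter_grade(2.0)` yields `None` because no range for that score is encoded in this sketch.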