Assessment Insights

Temps as Test Scorers: The Truth of the Matter

September 2015 / by Dr. Stuart Kahl


Despite advances in computer-based or automated scoring of student work on academic assessments, there is still a need in large-scale testing programs (e.g., state educational assessments) for humans to score students’ responses to higher-order, constructed-response test questions and performance tasks. And every few years an issue is raised about the qualifications of persons engaged to accomplish this scoring. The testing companies typically hire thousands of temporary staff for this task, most through temp agencies. The job of these seasonal workers is to view images of student responses and assign a score to each.

What the Critics Don’t Know (or Won’t Say)

A reasonable question for a person to ask is, “Shouldn’t it be teachers who score the students’ test responses?” Unfortunately, the issue is seldom raised in the form of a question. Instead, the critics assume they know what they’re talking about and go right to the press or social media, broadcasting to tens of thousands what they consider a certainty: that a significant disservice is being done to our students because their work is being scored by temporary staff, many of whom are not and never were teachers. What many of these outspoken critics have in common is limited knowledge about the actual scoring process and the quality of its outcomes. As a result, their comments often leave out important facts: the testing companies follow proven best practices; the resulting scores demonstrate high technical quality; and the processes and results are well documented in states’ technical reports.

What Makes Scoring by Temps Effective?

The truth of the matter is that temporary scorers—whom the state agencies typically require to have a college degree and successful post-secondary coursework in the relevant discipline—can do the job just fine. That’s because the specialized expertise that is required for their success is applied up front in the test development and piloting stages of a testing program. Curriculum experts, teachers, and former teachers from each state or testing program are actively involved in the development and/or review of the test questions and scoring rubrics. Such experts also participate in the identification of “benchmark” student responses—the student work that is used to exemplify different point values and to train and qualify scorers. Thus, the actual task of scoring is to categorize student responses—to identify the rubric descriptor and sample responses a particular student’s response best matches. The scorer’s job is to consistently apply pre-established guidelines, not to personally evaluate the quality of a response using his or her own standards of performance.

Beyond meeting general qualification requirements to be hired for a project, every scorer must train and qualify to score responses to each individual test question. Furthermore, every scorer has his or her “live” scoring continuously monitored through a variety of quality control approaches, usually involving second scorings and corrective action if scorer “drift” from pre-established scores is detected. Experience has shown that when critics of this type of scoring learn more about the process and observe or experience it first-hand, they become advocates of the use of temporary scorers, the scoring process, and the inclusion of constructed-response questions in tests.
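To make the monitoring idea concrete, here is a minimal sketch of one way drift could be flagged: compare each scorer's live scores with a reference score (a second scoring or a pre-established score) and flag scorers whose exact-agreement rate falls below a cutoff. The function name, the 70 percent cutoff, the minimum number of checked responses, and the sample data are all hypothetical illustrations, not values taken from any testing program.

```python
from collections import defaultdict


def flag_drifting_scorers(checked_reads, min_exact_agreement=0.7, min_reads=3):
    """Return {scorer_id: agreement_rate} for scorers below the cutoff.

    checked_reads: iterable of (scorer_id, live_score, reference_score),
    where reference_score is a second scoring or a pre-established score.
    The cutoff and min_reads defaults are illustrative assumptions.
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    for scorer_id, live_score, reference_score in checked_reads:
        totals[scorer_id] += 1
        if live_score == reference_score:
            matches[scorer_id] += 1

    flagged = {}
    for scorer_id, n in totals.items():
        if n < min_reads:
            continue  # too few checked responses to judge this scorer
        rate = matches[scorer_id] / n
        if rate < min_exact_agreement:
            flagged[scorer_id] = rate
    return flagged


# Hypothetical sample: scorer A agrees on 3 of 4 checks, scorer B on 1 of 4.
reads = [
    ("A", 3, 3), ("A", 2, 2), ("A", 4, 4), ("A", 1, 2),
    ("B", 3, 1), ("B", 2, 2), ("B", 4, 2), ("B", 0, 1),
]
flagged = flag_drifting_scorers(reads)
print(flagged)
```

In practice a program might also track adjacent agreement (scores within one point) or other statistics; exact agreement is used here only because it is the simplest to show.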

Beyond Classroom Teachers as Scorers

Classroom teachers are unquestionably experienced scorers of student work. No matter how they choose to score responses to their own questions, they do so consistently across all the students in their classes, and they provide good feedback to those students. However, scoring the work of thousands of students from many schools across a state is another matter. Statewide testing requires many people applying proven best practices to yield consistent scores. With a well-prepared and closely monitored temp workforce, we can achieve reliable test results from these especially important types of test questions that can and do influence classroom instruction and student learning.


Topics: Accountability


Written by Dr. Stuart Kahl

As founder of Measured Progress, Dr. Stuart Kahl contributes regularly to the thought leadership of the assessment community. In recent years, his particular interests have included formative assessment, curriculum-embedded performance assessment, and new models for accountability assessment programs. The Association of Test Publishers (ATP) awarded Dr. Kahl the 2010 ATP Award for Professional Contributions and Service to Testing. He regularly publishes research papers and commentaries introducing and analyzing current issues and trends in education. A frequent speaker at industry conferences, Dr. Kahl also serves as a technical consultant to various education agencies.