Over the last decade, expectations of statewide tests have gotten a little out of hand. As Measured Progress founder Stuart Kahl facetiously puts it, politicians and policy makers want nothing less than “a single, summative, formative, adaptive, diagnostic, general achievement test that measures growth and yields immediate results that teachers can use right away to modify their instruction.” A single assessment of this kind surely doesn’t exist, but Dr. Kahl explores ways to approach that ambitious goal in a recent white paper, “How can state assessments better test deeper learning? Three models that can work.” Given states’ ongoing work to meet ESSA requirements and introduce innovation in their assessment systems, it’s a good time to consider new approaches. Read on for a few highlights from the paper.
Different needs, different tests
There would be a lot less frustration with state tests if educators and the general public recognized what a state’s summative test can and cannot do well. Educational stakeholders at the statewide, district, building, and classroom levels each need different kinds of information to inform their decisions. (See our downloadable guide to matching assessment to purpose.) District and state leaders can get excellent information from an end-of-year general achievement measure to help evaluate instructional programs, but teachers shouldn’t expect to get diagnostic information on individual students for immediate formative use from the same assessment. Teachers need ongoing information, of course; that’s why they employ formative practices to gather evidence of students’ understanding of specific, current learning targets. A test that addresses a sampling of a full year’s curricular content, such as a yearly statewide assessment, can’t be expected to diagnose gaps in a student’s learning at a deep enough level of detail to be actionable.
Further, statewide tests are created to meet many requirements, such as limited testing time, quick turnaround of results, and contained costs. The resulting tests emphasize efficiency: they cover important foundational knowledge and skills but can’t assess an important aspect of today’s college and career readiness standards, namely deeper learning. Efficient tests can’t tell us how students use higher-order thinking skills to apply foundational knowledge and skills.
This leaves us with at least two gaps.
- Teachers can’t get useful insights into students’ specific strengths and needs from annual statewide tests.
- Those tests can’t get at the deeper thinking that is so critical to student learning.
Two under-used approaches
Current thinking about balanced assessment systems and the flexibility permitted by the Every Student Succeeds Act (ESSA) suggest it’s time to consider innovative ways to address these gaps. Dr. Kahl encourages education leaders to consider two under-used approaches to statewide testing: matrix sampling and performance tasks.
Matrix sampling is a testing technique in which a large amount of content is broken down into small non-equivalent subsets of items, with each student taking only one of the small subsets. Several large-scale programs have successfully used this technique in the past, demonstrating that the results on the item subsets can be aggregated to produce very reliable group results, because the subsets taken together provide excellent coverage of the target subject-area domain. This means that not every student needs to take every item in a comprehensive test for the group-level results to be valid and reliable. Students spend less time taking tests, and teachers gain instructional time.
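To make the idea concrete, here is a minimal Python sketch of matrix sampling as described above. It is an illustration only, not the design from the white paper: the function name, the item difficulties, the rotation rule for assigning subsets, and the simplifying assumption that every student has the same chance of answering a given item correctly are all hypothetical.

```python
import random


def simulate_matrix_sampling(num_items, num_forms, num_students, seed=0):
    """Simulate matrix sampling; all numbers here are made up for illustration."""
    rng = random.Random(seed)

    # Hypothetical chance that a student answers each item correctly.
    p_correct = [rng.uniform(0.3, 0.9) for _ in range(num_items)]

    # Break the full item pool into small, non-overlapping subsets ("forms").
    forms = [list(range(f, num_items, num_forms)) for f in range(num_forms)]

    correct = [0] * num_items
    attempts = [0] * num_items
    for student in range(num_students):
        # Each student takes only one subset, assigned in rotation.
        for item in forms[student % num_forms]:
            attempts[item] += 1
            if rng.random() < p_correct[item]:
                correct[item] += 1

    # Aggregate: average the per-item percent correct over the whole pool.
    # Assumes num_students >= num_forms so every item is attempted.
    estimate = sum(c / a for c, a in zip(correct, attempts)) / num_items
    true_mean = sum(p_correct) / num_items
    return forms, estimate, true_mean


if __name__ == "__main__":
    forms, estimate, true_mean = simulate_matrix_sampling(60, 6, 3000)
    print(f"Items per student: {len(forms[0])} of 60")
    print(f"Group estimate: {estimate:.3f}  (simulated true value: {true_mean:.3f})")
```

In this toy setup each student answers only 10 of the 60 items, yet the group-level estimate is built from responses to all 60. The sketch says nothing about individual student scores, which is consistent with the point that such a test serves program evaluation rather than diagnosis.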
Performance tasks, as proposed in the paper, are not the extended-response tasks included in some assessments, but are tasks that are embedded into regular instructional units and yield teacher-scoreable student work. These tasks, which Dr. Kahl terms “curriculum-embedded performance assessments” (CEPAs), can work on two levels.
- At the classroom level, they can replace some other unit assessments and provide evidence of student capabilities, which teachers can use to inform instruction.
- At the state level, with review, field testing, and score auditing, CEPAs can contribute meaningful score points toward the total test, while getting at the deeper learning that efficient tests can’t address.
The white paper presents three models to deliver the efficient, high-level statewide assessments that policy makers are looking for. All three designs include matrix sampling, and two include CEPAs for both local and statewide use. Download the paper to learn how adopting these high-impact practices can help limit testing time and development costs, and also yield more meaningful results.
With these designs, state tests can accomplish their primary goal of program evaluation—and gather richer data. The designs represent a radical departure from many current statewide assessment models. But the time is right for change. Read the white paper.
For more on CEPAs, see also Hofman, Goodwin, and Kahl (2015), “Re-balancing assessment: Placing formative and performance assessment at the heart of learning and accountability.”