Conservative Scholar Opposes Multiple Measures

Not that letters from academics usually make much difference, especially when they're on the other side ideologically from the folks making the decisions, but here's a letter from Hoover Institution researcher Eric Hanushek from last week that was sent along to me, in which he tells Chairman Miller what a bad idea multiple measures, writ large, are for school improvement. PDF here. Keep sending those letters and secret memos in.


Alex. Let me offer just a few quick words on multiple measures: (1) They are essential if we want to promote the development of 21st century skills for all of America’s children. (2) They are affordable if we reconsider how we use the over $5 billion that has been spent on NCLB tests (see recent GAO report). (3) They are possible if we draw on the assessment expertise of our nation’s most accomplished teachers who are ready to lead the way (see www.teacherleaders.org).

Barnett Berry

thanks for your comment, barnett --
i know that you and lots of other smart folks think MM are the way to go. however, i'm not worried about their affordability or achievability, but rather the dilution or muting of the strong signals that the current, highly imperfect AYP rating system sends.

what did you think about the merrow piece on pbs about great teachers' problems with NCLB? are the teacher leaders you know as universally opposed? do the teacher leaders you know work in the most troubled schools?

Thanks Alex. I always appreciate your intellectual honesty! With new technologies and new investments in teachers’ skills I am certain that we can create a multi-dimensional accountability system including both large-scale standardized testing as well as district (Nebraska) and locally (NY Performance Assessment Consortium) developed assessments. We need to imagine, invest, and enable teacher leaders – like so many on Teacher Leaders Network.

John Merrow's research into NCLB included a 3-day virtual conversation with 23 Teacher Leaders Networkers – most of whom work in high-needs schools. Anthony Cody is one who was featured by Merrow. We will have a brief policy paper released soon – one that more fully captures his voice and others on the impact of NCLB on student learning and the teaching profession. I suspect many TLNers will be weighing in on your blog soon.


I think the best place to start in responding to this letter is with Hanushek's own words. He writes that NCLB has led to "dramatic changes" in our schools, and that "it simply is no longer possible to write off a group of students as uneducable."

That was not my experience. I attended a state-mandated training a few years ago focused on meeting the mandates of NCLB, and was told not to waste my time on students who were "below basic," because they were unlikely to rise to the level of proficient, and raising them from "below basic" to "basic" would not improve our scores.

NCLB has created a statistical game where data is attended to in order to yield maximum gain on scores, but the needs of individual students are neglected. I believe standardized tests have a role as a small part of an array of assessments. But research shows the most powerful assessment that can be done is classroom assessment, performed by the teacher on a routine basis. This assessment tells the teacher where her students are, and that can guide instruction, and also be used to give direct feedback to the students, resulting in growth. This is assessment for learning. Teachers skilled in this sort of assessment know you need to give students different avenues to express what they understand. Some students can express themselves best in writing -- others through speaking to the class. Others can do excellent visual presentations. If all you do is give multiple choice tests you rob knowledge of life, and turn it into a quiz game. As a teacher I have much better things to do than test preparation.

I wonder if these people actually ever go to schools and talk to students. If these students are being well-served by NCLB, after six years, they should know it. It would be fascinating to hear the dialogue that would ensue if these folks were to go to an urban school during the month the tests are given and find out how the students that are the supposed beneficiaries feel about them.

I would like to take Anthony Cody's point one step further. He says "Some students can express themselves best in writing -- others through speaking to the class. Others can do excellent visual presentations." Valuing assessments that consider these types of skills would not only allow different students to more accurately demonstrate their skills, it would also encourage fostering the types of higher level 21st century communications skills that will be critical for students competing in an increasingly crowded labor market.

In his letter, Hanushek states that "basic skills are just what enable most other learning to proceed," but when NCLB-driven evaluation revolves around basic skills, this is often where learning stops in schools necessarily focused on AYP and test scores. If we want to encourage higher levels of learning, why not allow for evaluations that allow them to surface? Otherwise, like Anthony warns above about "below basic" students, high-achieving students are also being cheated out of opportunities to grow and show what they're capable of.

Anthony Cody's NCLB training experience jibes with the latest study, reported on in the Aug. 1 EdWeek ("Study: Low fliers gain less under NCLB"). The study, by U. of Chicago economists Neal & Schanzenbach, supports Cody's perception that schools are focusing on students in the middle--the so-called "bubble kids"--in order to boost scores and meet AYP.

This teaching-to-the-middle approach, of course, leaves many behind.

Interestingly, the multi-dimensional accountability system Mr. Berry and many of the other supporters of multiple measures cite as an example of success -- Nebraska's STARS system -- was struck down by the Nebraska state legislature earlier this year for failing to provide "meaningful accountability." The Omaha World-Herald reports that the state law now requires a single, statewide standard for evaluating student proficiency.

Let me just share that in my own state of Ohio, one of the state measures does indeed account for the progress of students from "below basic" to "basic," as well as two levels of achievement above the proficient mark. Nothing in NCLB has prevented this additional means of evaluating schools. I might also add that this year particularly as results are reported, I have noted especially good press coverage of some of these reporting nuances. While the "absolutes" of AYP and SI or DI status are still there (even though these are a bit fuzzy when you take into account the safe harbor provisions, 2 or 3 year averaging, at-risk or hold provisions, minimum n sizes and all), it is becoming pretty clear that school success is not a matter of "a single test."

At the same time that one large urban district moved up a category based on its "Performance Index" (the one that takes into account all five levels of scores) and made AYP in all categories, largely based on safe harbor scores, a suburban district with an excellent history dropped down based on not making AYP (despite absolute scores that probably exceeded those of the urban district that made it by virtue of safe harbor).

In fact, all of these possibilities are derived from the same set of tests that NCLB mandates (and a few that the state added in social studies and science). I understand that there will be an additional "growth measure" soon--based on students' growth from year to year--still derived from the same tests.

The state has also initiated reporting on some additional measures (AP participation, etc). In short--there is nothing in NCLB to prohibit any of this. What I keep fearing from the multiple measures camps is building new tests to make every kid (and their school) look good.

Most schools don't have much experience in assessing kids (or teaching kids) in ways that are differentiated to learning styles. Why do we think that they will suddenly be proficient?

To Margo,
As a highly accomplished teacher, and a mom (grandmom), I understand your concerns and those of others about multiple measures. Consider this though: The current testing system, flaws and all, leaves the work of the majority of teachers totally unexamined. Standardized testing is not nearly rigorous enough to tell us what we really need to know to improve teaching and learning in our public schools.

Actually, there are many effective teachers out here whose work is not reflected in the test scores, and who regularly use differentiated instruction and performance-based assessments.

Like any reform, NCLB included, the use of multiple measures should be approached thoughtfully and implemented carefully--both of which are more likely to happen if highly accomplished teachers are included throughout the process.

RE: Russo's comment that multiple measures will "dilute or mute the strong signal" sent by a school's not making AYP. What, precisely, is that strong signal? That the school in question is failing? We already know that AYP protocols (such as subgroup size) can and have been negotiated and tweaked--just as state assessments have been modified--to blur critical information about student achievement in a particular school.

In the end, we pretty much already know which schools are doing a credible job and which aren't, and spending vast amounts of time and ink on re-writing the rules and tests doesn't help us achieve our real goal: figuring out how to make things a whole lot better for the kids whose educational opportunities are below par.

Which is why multiple measures over time are the ONLY way to get a clearer picture of student achievement. An annual test score is inadequate and limited (although it doesn't hurt kids to be tested occasionally, especially if a test is correctly framed as a measure of what kids know and can do...not a frightening punishment for school organization). The fact that different assessments often give us different information is proof positive that relying on a single score is folly--learning is too complex and multi-faceted, and kids too changeable, to believe that one number is truth. The goal of multiple forms of assessment is giving teachers a bigger, clearer picture, in order to diagnose difficulties and prescribe instruction to improve learning.

Eric Hanushek's comment that using multiple measures represents "backing off from clear, consistent and measurable standards" reveals his inability to conceptualize non-test assessments as valid and reliable. We already have examples of performance and constructed assessments for students that yield considerably more information than a bubble sheet. Michigan used to administer a fifth-grade writing test, where students submitted drafts plus polished pieces of writing, allowing a neutral outside assessor to evaluate student understanding of the writing process. This was a more complex--and useful, and cheat-proof--assessment than a multiple-choice test, and gave the students' teachers some valuable information on how their students compared to other 5th grade writers in Michigan, as well as some information about how to improve instruction.

Margo/Mom is correct when she says that many schools and teachers lack knowledge about creating useful assessments, but that is not a reason to turn away from an improved evaluation process. Her comments, Hanushek's and Russo's all implicate the real critical issue underlying the multiple measures discourse: trust. If we don't believe our schools (or states) are capable of creating better, more informative forms of assessment, then we'll stick doggedly to easily administered standardized tests--the ones that validate our beliefs that public schools are "failing."

Teacher expertise has always been leveraged to create multiple kinds of measures of student learning…some utilizing standardized tests and others reflecting the nature of curriculum and teacher knowledge of students. The high stakes test focus of NCLB has forced teachers away from this use of their expertise and into overly committing resources/time to small sets of discrete facts. It feels as if we’re trying to simplify the complexities of learning into 4 multiple choice answers.

Measuring student achievement in many ways allows data to strategically guide the classroom vs an environment where a single measure is used as a sledgehammer for accountability. Effective teaching practices call for us to evaluate what our students are learning, adjust to know/address what they don’t know and differentiate if they have already mastered the curriculum standards. I am compelled to match the purpose of what I need to know about what my student has learned to the way that I measure. Yes, the typical high stakes test can measure the bottom layer of learning, but that is not the focus of our classrooms. It is not where we engage their intellects, capture student curiosity or prepare our students for careers.

Let me be clear. I believe in tangible, measurable assessments that provide feedback to me, to my parents, to my students, to my administration, to my school board and to the stakeholders of my community, and that provide evidence that my students are mastering the curriculum standards in my courses. We have long known that effective classroom assessments reflect the dynamic nature of the classroom learning. No single measure can capture it or provide insight about what is happening or needs to happen. Rather it is a combining of many kinds of measurements over time that yields information that makes a difference for students and guides a teacher.

The most effective school improvement efforts are those grassroots movements where teachers band together to review many kinds of data about their students, craft curriculum and instruction around that information, and come together with trusted colleagues to continuously reflect on and improve student learning. These kinds of efforts have sustainability through buy-in from students, parents and teachers. It is why I must disagree with the sender of the letter who seems to think that this kind of coordinated, focused work is backing away from the ideals of maximizing student learning.

