
Thompson: Cross Examination


Much of Steven Brill’s New Yorker article on the "Rubber Room" has a ring of truth, but I do not know enough about those disputes to comment. Had Brill adequately cross-examined the other charges, however, the last third of his article would have been edited out.

Brill uses the stupid and inflammatory words of teachers against them, which is fair so long as he also cross-examines their factual assertions. But Brill seemed deaf to the false and inflammatory nature of his own sources’ words.

Since it is so important that we follow the lead of Randi Weingarten and fairly and efficiently remove ineffective teachers, let me first take Deputy Chancellor Chris Cerf's blunderbuss statement and translate it into a constructive position.

Taking Cerf’s formulation of sacrificing one innocent person to convict ten guilty ones, the proper question draws on George Soros' metaphor: if you have ten bottles of water and only one is poisoned, all ten are worthless. Who would enter a profession with a 10% chance per year that the invalid use of a statistical model would damage or destroy your career through no fault of your own?
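The arithmetic behind that question is worth making explicit. A minimal sketch, assuming (purely for illustration) an independent 10% annual risk of wrongful misclassification compounded over a 30-year career:

```python
# Illustrative arithmetic only: assumes an independent 10% annual risk
# of a career-damaging misclassification, over a 30-year career.
annual_risk = 0.10
years = 30

# Probability of never being wrongly flagged in any single year
p_survive = (1 - annual_risk) ** years

# Probability of at least one wrongful flag over the whole career
p_flagged = 1 - p_survive

print(f"Chance of at least one wrongful flag in {years} years: {p_flagged:.0%}")
```

Under those assumed numbers, a teacher faces roughly a 96 percent chance of being wrongly flagged at least once over a full career.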

Why did Brill fail to cross-examine the statement of Dan Weisberg of The New Teacher Project that "the teacher has a 'value-added' that can be reduced to a number. You take that, along with other measures, and you can really rate a teacher"? Neither did he cross-examine the claim in "The Widget Effect" that these models allow for "an expedited one-day hearing" to determine whether a termination was fair. In fact, the sum of the evidence for those extraordinary statements is one sentence, documented by one footnote, which I will explore.

But first, we should cross-examine the source statement from "The Widget Effect" and see whether it supports Weisberg's and Brill's extreme recommendations. "Value added can be a useful supplement to a performance evaluation system where a credible model is available and may be appropriate for wider use as student assessments and value added models evolve," wrote Weisberg when facing cross-examination by editors, as opposed to guiding a reporter who is even more of a novice in the field of education.

The TNTP wants to end a teacher's career today using models that may evolve into credible systems? 

The TNTP statement may say nothing, but it is documented by footnote #56, which cites some of education's best social scientists. The research of Jesse Rothstein, Dan Koretz, Dan Goldhaber, Laura Hamilton, and other scholars at RAND and elsewhere directly refutes the statements in the New Yorker and the thrust of the TNTP's arguments.

"The short answer to this question is, the claims of developers of Value-Added methods (VAM) notwithstanding, VAM methods as currently developed are of limited usefulness as a tool for any routine assessment purpose, but are well enough developed to be quite useful as research tools. In particular, it seems apparent that currently available VAM methods should not be used for high-stakes assessment purposes," wrote the RAND authors, and VAMs "will almost always make teachers from less affluent communities look like they are performing less effectively than teachers from more affluent communities. Similarly, it is usually true that teachers within the same school often do not have randomly assigned students, and as a result such comparisons, in the absence of statistical controls, cannot even be made within schools."
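The nonrandom-assignment point can be made concrete with a toy simulation. Every number and the tracking rule below are my own assumptions, not drawn from any cited study: when students are sorted into classrooms by ability, one teacher's future students already show higher prior-year gains before she has taught them a single lesson.

```python
import random

random.seed(0)

# Toy illustration (all parameters assumed): students are tracked into
# 5th-grade classrooms by ability, so prior-year (4th-grade) gains
# already differ across the 5th-grade teachers -- the kind of pattern a
# Rothstein-style falsification test looks for.
N = 2000
ability = [random.gauss(0, 1) for _ in range(N)]
gain_4th = [a + random.gauss(0, 0.5) for a in ability]  # 4th-grade gain tracks ability

# Nonrandom assignment: the top half by ability goes to teacher A
order = sorted(range(N), key=lambda i: ability[i])
teacher = {i: ("B" if rank < N // 2 else "A") for rank, i in enumerate(order)}

mean_A = sum(gain_4th[i] for i in range(N) if teacher[i] == "A") / (N // 2)
mean_B = sum(gain_4th[i] for i in range(N) if teacher[i] == "B") / (N // 2)
print(f"Mean 4th-grade gain of teacher A's future students: {mean_A:.2f}")
print(f"Mean 4th-grade gain of teacher B's future students: {mean_B:.2f}")
```

Without statistical controls, a naive comparison credits teacher A with gains her students earned the year before they entered her classroom.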

Rothstein's cited study, of course, shows how a value-added model correlates 5th grade teachers with their students' prior 4th grade scores, gains those teachers could not possibly have caused, thus ingeniously refuting real-world applications of VAMs for evaluation purposes. Steven Rivkin, however, has described the usefulness of VAMs for supplementing, as opposed to driving, evaluations:

"When principals have first-hand knowledge about the classroom (including information not available for statistical analysis, such as time devoted to the tested material and classroom composition), they can contextualize the results. Such information can provide supportive evidence and strengthen the principal’s hand in efforts to remove ineffective teachers ..."

Marsh, Pane, and Hamilton write: "RAND’s research studies and others raise concerns about the consequences of high-stakes state testing and excessive reliance on test data (e.g., Hamilton, 2003). While some responses to testing and test results, such as individualization of instruction, have the potential to improve educational outcomes, others may be less productive, such as increased time spent on test-taking strategies, increased focus on problem styles and formats that appear on state tests, or targeting instruction on 'bubble kids.'

... Finally, there is a risk of excessive testing, due to the addition of progress tests and other assessments intended to prepare students for state tests. Reducing the number of assessments may be a useful reform strategy, as multiple assessments may take time away from instruction ... Despite the popularity of value-added modeling, little is known about how the information generated by these models is understood and used by educators."

Or we could summarize with this account of the work of Koretz et al. in Measuring Up:

"This monograph clarifies the primary questions raised by the use of VAM for measuring teacher effects, reviews the most important recent applications of VAM, and discusses a variety of statistical and measurement issues that might affect the validity of VAM inferences. The authors identify numerous possible sources of error and bias in teacher effects and recommend a number of steps for future research into these potential errors. They conclude that the research base is currently insufficient to support the use of VAM for high-stakes decisions about individual teachers or schools. It is important that policymakers, practitioners, and VAM researchers work together, so that research is informed by the practical needs and constraints facing users of VAM and that implementation of the models is informed by an understanding of what inferences and decisions the research currently supports."

And before considering other versions of the TNTP ideals, we should muse about this:

"Consider a framework that automatically fires the bottom x percent of teachers in terms of value added in a single year or average value added over a number of years. Unobserved differences among schools and classrooms almost certainly influence the estimates, and test error certainly introduces a degree of randomness. Consequently, mistakes will be made; however, outcomes could still improve compared with the system without such decision rules. Nonetheless, the implications of adding such risk and uncertainty may necessitate a substantial salary increase, and these monetary costs as well as costs associated with increased turnover would have to be weighed against any improvements in the composition of teachers."
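The "mistakes will be made" caveat can be sketched numerically. In this sketch the assumptions are mine: true teacher effects and single-year measurement error are both normally distributed with equal variance, in line with the test-error concern quoted above.

```python
import random

random.seed(1)

# Toy sketch (assumptions mine): true teacher effects ~ N(0, 1), and
# single-year measurement noise of the same magnitude.
N_TEACHERS = 10_000
CUTOFF = 0.10  # automatically fire the bottom 10% by estimated value added

true_effect = [random.gauss(0, 1) for _ in range(N_TEACHERS)]
estimate = [t + random.gauss(0, 1) for t in true_effect]  # noise as large as signal

# Fire everyone below the 10th percentile of the noisy estimate
threshold = sorted(estimate)[int(N_TEACHERS * CUTOFF)]
fired = [i for i, e in enumerate(estimate) if e < threshold]

# Count genuinely above-average teachers swept up by the rule
wrongly_fired = sum(1 for i in fired if true_effect[i] > 0)
print(f"Fired: {len(fired)}, above-average among them: {wrongly_fired}")
```

Even in this simple setup, a nontrivial share of the teachers dismissed by the automatic rule are in fact above average, which is precisely the risk that the salary-premium argument above tries to price.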

And even if the bottom x percent are not automatically fired, they are still indicted as suspect teachers, placing them at the mercy of the y percent of administrators caught up in the blame game and guilt by association. - John Thompson

Comments


John,

This "fire the bad teachers" thing looks like a trend that will not go away. Do you see anything teachers themselves--and/or teachers' unions--can do to get out in front of this issue as opposed to just reacting to it?

Steve

Hi, John. Two points that bear out what's in your post:

1.) Aaron Pallas's review of value-added student performance data from NYC shows absolutely no consistency from one year to the next: "there really is no pattern to the results, and certainly not a pattern that demonstrates consistency or stability from one year to the next." http://gothamschools.org/2009/09/03/aaron-pallas-progress-measurement-on-reports-still-random/

2.) Barnett Berry released a report today that pokes holes in a number of the arguments about teacher staffing. Among the myths he takes on: If you topple the "barriers" posed by traditional certification, effective teachers will simply flood into struggling schools; If you dismiss incompetent teachers--a laudable goal in itself--struggling schools will have all the great teachers they need; Teacher tenure is the biggest barrier to firing bad teachers; Financial incentives are enough to lure great teachers into the schools that need them most. http://bit.ly/jFh4


Disclaimer: The opinions expressed in This Week In Education are strictly those of the author and do not reflect the opinions or endorsement of Scholastic, Inc.