Thompson: Asking the Right Questions
It's hard to know how to take California's RttT statement that "we additionally expect that evaluation systems will incorporate peer evaluation, using the Peer Assistance and Review model, as appropriate," but there is less doubt about Pennsylvania's words "districts have committed to using this tool (Value Added Models) in a manner that is consistent with due process rights." And since Tennessee is committed to firing up the 30% of its teachers who don't meet growth targets (making a mockery of Secretary Duncan's position on using multiple measures for evaluations), we must ask the proper legal questions. We must also understand why the group of teachers who can't meet growth targets will inevitably include many of the best in the profession, tackling the toughest challenges while handicapped by policies beyond their control.
Tom Kane misses the point, saying that "the (value added) system doesn't have to be perfect to get started." But the question is when is the model valid enough to be legal. Imagine that Kane is testifying as an expert witness on the validity of VAMs in a termination case.
Kane summarizes his research on unconscious ways that selectivity can creep into the assignment of students and how it "suggest(s) principals are not introducing unaccounted-for effects in their assignment of students" ... but "more studies should be done to confirm the accuracy of assessments and improve the ability of the models to tease out the true teacher effects in student outcomes."
The defense attorney could then thank Dr. Kane for his conscientiousness in dealing with the all-important issue of the selectivity of students. Kane's own care in addressing the issue of selectivity is testimony to both his professionalism and the importance of factoring out selectivity when devising growth models.
Then Kane is asked how those growth models can tease out the effects of "the Big Sort." Surely the conscious choices of moving to the suburbs, magnets, and charters add a selectivity that is orders of magnitude greater than any unconscious selectivity. Given the amount of "creaming" that occurs between 5th and 6th grade, how can the test score growth patterns in elementary be relevant to the value added targets for middle school teachers facing incomparable circumstances? How can it be determined that failure to raise test score growth is not the result of peer influences, or district policies made inevitable by the Big Sort? If the "lying eyes" of a teacher's colleagues, students, and their families say that the hardcore inner city middle school teacher is fantastic, how can statistical engineering based on elementary school conditions prove them wrong?
Although they do not realize it, the Education Trust's data on the growth of the Achievement Gap over time, as well as "the Matthew Effect," further explain why data from earlier years may have no relevance to secondary school performance issues. Students who learn how to be students and to read for comprehension are likely to improve their performance with or without a great teacher, while students who have just been drilled on decoding are doubly vulnerable when they reach their teens. And, of course, that is when the toxicity of the environments of so many poor children becomes crippling. As the toughest schools face a greater critical mass of the most traumatized kids, peer effects are compounded, further confounding the statistical engineering.
It is hard to see how society can improve the quality of teaching provided to secondary students by punishing their subsequent teachers for factors during the elementary or preschool years. The more that a class of teenagers face funerals, chaos at home and at school, addiction, incarceration and other ills, the more difficult it becomes to compare their performance data to their previous records. I am not arguing that it is more difficult to teach in the older grades. I am just explaining why statistical models devised and tested in elementary schools are not likely to be valid for secondary schools.
The big factor is self-segregation, and a great example can be found in Zip Code 73120. One of the richest areas in the state, the district has a per capita income of $30,000, with only 572 poor families and 3012 poor individuals. As 5th graders, the kids of 73120 attend schools with poverty rates of 23%, 50%, 69%, and 76%, with the special education populations ranging from 5.5% to 14%. The numbers of suspensions are 0, 14, 33, and 86. The pass rates for 5th grade Reading range from 81% to 95%.
Between 5th and 6th grade the district's student population drops by nearly 1/5th as the numbers of special education and ELL students increase, but in 73120 the drop-off is more remarkable as the majority of their students move to magnet, enterprise, and charter schools, including four that have gained national recognition. So, the 6th graders arrive at a school which has 2,020 disciplinary actions, which lost its middle class students after the campus policeman was hospitalized in a riot, and where 1/4th of their class are on IEPs. Are the 6th grade teachers to be blamed for a Reading pass rate of 44%? (This data is a compilation of both print and electronic records, some of which has not been published.)
And by the way, the challenges are much greater in a middle school with 100% poverty like mine.
As frustrated as I am at the recent rhetoric of Mass Insight, for instance, I suspect that that organization deals with these issues in an honorable manner, but that has no relevance to the bigger legal picture. Just because some schools or districts would not abuse growth models does not mean that all systems would not use them to indict teachers in tougher environments as ineffective. And in many schools, being indicted for low test score growth would be no different than being charged and convicted as ineffective. Teachers in those schools would never have any peace of mind, as test score data prejudiced harried administrators, who also needed to defend themselves against unfair use of growth models. What self-respecting teacher would remain in a high-poverty school where he or she had a 15 to 20% chance per year of having his or her career damaged or destroyed just because of the limits of statistical engineering?
But if that data and the evaluation process were controlled by a peer review committee, not just the administrators who are responsible for the school environment, we could skip the legal Battle of Verdun that will be inevitable in many districts as they implement their RttT plans.
Explanatory Footnote: Although I was a legal historian, I am not an attorney so I bounced my thoughts off of Marc Dean Millot, who is. He graciously allowed me to share the following:
Some simple observations:
• Under an "at will" arrangement an employer may discharge an employee for any reason or no reason, but not the wrong reasons as defined by law (e.g., discrimination on the basis of gender, race, religion, etc.). The critical factor here would not be the relationship of VAM to reality, but the consistent application of VAM to employees. But this is generally a private sector context.
• Unless stated otherwise, government employment is considered an entitlement, where employees can be generally discharged only for causes defined by law (budget cuts, violation of laws, poor performance) and according to procedures affording due process established by law. In general before an agency can change those causes or procedures, they will either go to the legislature for a change in the statute, or some administrative rule-making process with notice, hearings, a record of decision etc. to change the relevant regulations. I am not sure whether the legislature or agency would need to establish much more than a minimal rational basis for VAM, but I think the change would require some formal process. Assuming the change passed the rational test, the question might be whether some "protected class" (e.g., African Americans) was disadvantaged by VAM strictly because of factors related to its class status. There are a great number of these cases, and I just don't know the details of court rulings to offer much guidance.
• Where the state permits collective bargaining with public employees, the bargaining agreement rules and changes must be negotiated in good faith. There would be a question of whether VAM falls within the permissible scope of bargaining, whether even with a union contract the district has discretion in this matter.
We have 50 states and 15,000 school districts each with its own legal and regulatory environment and practices regarding teacher employment so....
Expect a lot of litigation that will throw a wrench into implementation of these RTTT provisions for at least several years. The teachers unions will put a lot of money into this - it's just too much of a loss to them not to put up a fight. Indeed I can see other public employee unions helping out - the implications for their jobs are huge.
If I wanted to stop VAM, the first thing I'd do is go to the courts for an injunction to buy some time. Then I'd get the political machine running in the state legislatures to head it off.
But I'd have to wait until I knew who won the grants. I would not want to be blamed for losing my state's RTTT grant, and I would not want to waste time gearing up for a fight in states that didn't win a grant.
The question here is whether anyone is really serious about implementing VAM if they win the grant. If you want to win a grant you have to say what the feds want to hear. Once you win the grant, everything's re-negotiable. This is the essence of what RAND colleague Tom Glennan described as education's "compliance mentality."
Just to be clear, my view of value added is that it should apply to all inputs - teachers, principals, superintendents, and the firms providing products, services and programs aimed at improving teacher performance and student achievement. It is entirely unprincipled to say the state of the art in these evaluation techniques is sufficiently advanced to hold one part of the "value chain" accountable for its role, but to refuse to apply those same techniques and tools to the other parts of the chain. As best I can tell they have been used to assess every part of the chain (in addition to teachers, they are the stuff of WWC's gold standard, they've been used to assess the charter concept and charter operators, SES providers, superintendents, etc., etc.) and the debate over all these studies suggests VA Methods apply equally well/poorly to all inputs. If everyone in the system were held to the same system of accountability, everyone might be just a tad more focused on what it actually reveals about contributions to student outcomes, how the system works, where it's weak, how it might be improved and just how important it should be to matters of job security and compensation. Everyone is too interested in the other guy's accountability.
Until we can get everyone to agree that they will be accountable under the same VAM, I'd say the system isn't ready to be used. Well, for research, yes; for accountability, no.