The following are excerpts from a recent post on the blog The Baseline Scenario - What Happened To The Global Economy and What We Can Do About It:
"Resist The Temptation To 'Race To Nowhere'"
"This guest post is contributed by Kathryn McDermott and Lisa Keller. McDermott is Associate Professor of Education and Public Policy and Keller is Assistant Professor in the Research and Evaluation Methods Program, both at the University of Massachusetts, Amherst.
On March 29, the U.S. Department of Education announced that Delaware and Tennessee were the first two states to win funding in the "Race to the Top" grant competition. A key part of the reason why these two states won was their experience with "growth modeling" of student progress measured by standardized test scores, and their plans for incorporating the growth data into evaluation of teachers. The Department of Education has $3.4 billion remaining in the Race to the Top fund, and other states are now scrutinizing reviewer feedback on their applications and trying to learn from Delaware's and Tennessee's successful applications as they strive to win funds in the next round.
One of the Department's priorities is to link teachers' pay to their students' performance; indeed, states with laws that forbid using student test scores in this way lost points in the Race to the Top competition. A few months ago, James pointed out some of the general flaws in the pay-for-performance logic; here, our goal is to raise general awareness of some statistical issues that are specific to using test scores to evaluate teachers' performance.
Using students' test scores to evaluate their teachers' performance is a core component of both Delaware's and Tennessee's Race to the Top applications. The logic seems unassailable: everybody knows that some teachers are more effective than others, and there should be some way of rewarding this effectiveness. Because students take many more state-mandated tests now than they used to, it seems logical that there should be some way of using those test scores to make the kind of effectiveness judgments that currently get made informally, on less scientific grounds.
The problem is that even if you accept the assumption that standardized tests convey useful information about what students have learned (which we both do, in general), measuring the performance gains (or losses) of students in a particular classroom is far more complicated than subtracting the students' September test scores from their June test scores and averaging out the gains.
The first problem has to do with class sizes..."
". . . The second general problem has to do with how students end up with particular classmates and particular teachers. . . ."
". . . In both Delaware and Tennessee, students' test-score growth will be combined with other kinds of information to make judgments about teacher performance. Considering data from multiple sources will help overcome some of the issues we've raised here. However, looking at "what the numbers say" appeals to policy makers who crave simple indicators of complex phenomena. Legislators and governors don't have to pass a statistics exam before taking office, and they haven't had an especially good record of listening to educational testing experts before they mandate new uses for test results. For example, the relevant professional associations have jointly endorsed a set of principles on appropriate uses of tests which, among other things, caution against using a particular test for purposes other than those for which its validity has been studied and confirmed.
Despite this caution, policy makers tend to pile extra uses onto tests once they've required that students take them . . . The tendency has also been for quantitative performance indicators, even if of somewhat dubious quality, to dominate over other forms of evaluation. We worry that something similar will happen with the use of student performance in determining teachers' pay, promotion, or retention. "The numbers" look objective to people outside schools, while other measures like analysis of lesson plans or documentation of classroom observations seem by comparison to be imprecise means by which the "education establishment" can continue to protect the incompetent.
Educators have welcomed the Obama administration's willingness to eliminate some of the less logical components of No Child Left Behind, such as the "adequate yearly progress" benchmarks based on unfounded assumptions about how schools improve and on definitions of "proficiency" driven more by political expediency than by an objective definition of what students need to learn in order to succeed in further education and careers. However, even though we're now "racing to the top" rather than trying to ensure "no child left behind," we still risk basing reasonable-sounding policies on unreasonable assumptions and racing (with apologies to Talking Heads) on a road to nowhere. . . ."