Curmudgucation: More Teacher Effectiveness Mirages
The Fordham Institute has a new report entitled “Teacher Effectiveness and Improvement in Charter and Traditional Public Schools.” Despite what it claims to study, the report is a near-perfect demonstration of Campbell’s Law in action.
The study starts with a question that, as used car salesmen put it, assumes the sale:
Study after study has found that urban charter schools, and non-profit charter networks in particular, tend to be more successful at boosting student achievement than traditional public schools in similar settings. But why?
We’re not going to get bogged down in the details of this study, including this assertion that charter school superiority is a proven thing, because none of them matter when it comes to understanding why this study is fatally flawed.
The fleshed-out version of the question under study is this– we know that more experienced teachers are generally better at their craft than newbies (an assertion that Fordham didn’t make back in the days when they were part of the Let’s Get Rid Of Teacher Job Protections crowd), but we also know that charters mostly have newbie teachers, so how is it that charters get these superior results with fresh-out-the-wrapper staff?
The report was written by Matthew P. Steinberg (George Mason University) and Haisheng Yang (U of Penn grad student). They worked from a pile of data from the PA department of education covering 2007 to 2017.
We could dig deeply into this report, but there’s no reason to. All we really have to see is this sentence:
Like other studies, this one uses estimates of teachers’ value-added—that is, their contribution to students’ English language arts (ELA) and math achievement growth— as a proxy for their effectiveness.
So once again, “teacher effectiveness” is being used as a synonym for “scores on a single standardized test of math and reading soaked in the widely-debunked VAM formula.” This is bunk. They try to prop it up with this–
Although such estimates cannot capture other valuable aspects of teaching practice and behaviors, research shows that (in addition to learning more math and English language arts) students assigned to teachers with higher value-added scores are more likely to go to college and earn higher salaries later in life.
That assertion about later salaries is cited from Chetty, Friedman, & Rockoff, 2014, a work that is problematic at best and bunk at worst. Meanwhile, even folks in the ed reform community have caught on to the fact that raising a child’s Big Standardized Test score doesn’t lead to that child having a better life.
But using test scores as a proxy for “student achievement” or “teacher effectiveness” is a critical tactic for ed reformsters, because writing a whole paper about how one set of teachers is better at raising test scores isn’t very sexy or exciting. “What I really want from a teacher is for her to get my kid to do better on that one standardized test they take every spring, and nothing else matters as much,” said no parent ever. Likewise, while there are a million interpretations of “good teacher,” very few of them are “teacher whose students get good scores on that one big test.”
Using test scores as a measure of teaching quality and student achievement isn’t just a bad, inaccurate measurement–it triggers Goodhart’s Law, or its somewhat better known sibling, Campbell’s Law. The idea here is that the more you use a quantitative social measure for social decision-making, the more it will tend to disrupt and corrupt the processes it’s supposed to be monitoring. If you like a pithier version, take Strathern’s restatement of Goodhart–
When a measure becomes a target, it ceases to be a good measure.
Which brings us back to this piece of research. Steinberg and Yang find that CMO-run charters seem to have better-trained teachers, and they posit that hiring practices, training practices, or the charter chain tendency to force all teachers to follow the prescribed procedures–not just curricular, but pedagogical–might be the answer.
But if we stop using bad proxies and just say plainly what we’re talking about, there’s little mystery here. The premise of the study is that newbie charter teachers get better test scores than public school teachers. That points to just one thing–more focused test prep. This is easy to enforce with newbie teachers in a restrictive teaching environment because they don’t have enough of a well-established professional identity to push back. Meanwhile, teachers in public schools are trying to balance the demand for raising test scores against the demand to actually teach.
In short, the explanation laid out by this report is that charter schools (at least the ones studied here) train teachers to do test prep instead of training them to actually teach.
Mike Petrilli suggested on Twitter this morning that I’m being cynical here with my reading of the report, but I think it’s far more cynical to keep arguing in 2020 for using a single set of test scores and a long-since-discredited number crunching formula as a measure of true teacher quality. It is possible, as Petrilli says, that some charters are doing a great job of teaching junior teachers to “teach well,” but as long as the premise is that “teach well” means “have students who get high scores on the BS Test,” we’ll never, ever know.
“Well, then, how do we figure out which teachers are doing a good job?” has been the complaint for the past couple of decades, and I agree that this is a tough nut to crack, but that does not mean that we settle on a bad answer. It is hard to cure certain types of cancer, but that does not mean that we should settle for “drink bleach and sacrifice a frog under a full moon” as a cure. We do not, like the proverbial drunk, search for our car keys under a streetlight a hundred feet from where we dropped the keys because the light is better there.
It has been true since we ushered in test-centric schooling under NCLB–the discussion about teacher quality is worth having, but we cannot have it if we insist on using as “data” something that does not measure teacher quality.
There are other problems with this particular study. Most notably, since it is based on test scores, it makes sense to look at the students who are taking those tests and the long-known techniques of cherry picking and push-out used by many charter schools to ensure that they have a good crop of quality test-takers. Or we could talk about longer school days, or simply organizing the vast amount of school time around the BS Test. Any of these would explain the charters’ alleged test edge. This study doesn’t address any of that.
But it doesn’t matter. Any study that accepts the premise that BS Test-based VAM scores are a measure of teacher quality is wasting time.
January 22, 2021