Insomnia? Read This Post. Charter School Efficacy Research Design
Posted: February 21st, 2013 | Author: Michael Goldstein
Sorry folks. We’re going uber-nerdy today. I want to figure out what I think by writing out what I think. You’re free to come along for the ride. If not, this is way more fun to read.
Today we have:
1. Collin Hitt
2. Matt Di Carlo
3. Me
1. Collin Hitt
First up is Collin Hitt. He is from the Illinois Policy Institute. Here’s his blog about the great folks at SEED, and another blog from 2 months ago, which reads:
So you say charter schools don’t work. That’s an empirical claim. It needs to be backed up by evidence. Here’s a helpful guide to the most rigorous research available. Once you’ve tackled this material, you’ll be in position to prove your point.
As you probably know, the gold standard method of research in social science is called random assignment. Charter schools are particularly well-suited for random assignment evaluations, since they’re usually required by law to admit students by lottery. The lotteries are fair to families – that’s why they’re put in place. But they also allow researchers to make fair comparisons between students who win or lose lotteries to attend charter schools.
To date, nine (updated to: 10) lottery-based evaluations of charter schools have been released. Let’s go through them, starting with the earliest work.
Here’s the punch line:
Altogether, these studies have remarkably similar findings that urban charter schools are producing significant gains in reading or math, or both. Suburban charter schools perform less well – you could cite this fact, but frankly this is a minor concern in the battle to close the racial achievement gap in American education.
2. Matt Di Carlo
He works for the Shanker Institute. He strikes me as fair and nerdy. I like both these characteristics, and share the second one (and aspire to the first).
He writes:
Among the more persistent arguments one hears in the debate over charter schools is that the “best evidence” shows charters are more effective….
The basic point is that we should essentially dismiss – or at least regard with extreme skepticism – the two dozen or so high-quality “non-experimental” studies, which, on the whole, show modest or no differences in test-based effectiveness between charters and comparable regular public schools.
In contrast, “randomized controlled trials” (RCTs), which exploit the random assignment of admission lotteries to control for differences between students, tend to yield positive results. Since, so the story goes, the “gold standard” research shows that charters are superior, we should go with that conclusion.
RCTs, though not without their own limitations, are without question powerful, and there is plenty of subpar charter research out there. That said, however, the “best evidence” argument is not particularly compelling (and it’s also a distraction from the positive shift away from obsessing about whether charters do or don’t work toward an examination of why). A full discussion of the methodological issues in the charter school literature would be long and burdensome, but it might be helpful to lay out three very basic points to bear in mind when you hear this argument.
1 – Only a relatively tiny handful of charters have ever been part of an RCT….
You can read his whole critique here.
He goes on to argue:
In general, charter schools are one of the few reforms under heavy discussion today that actually has a solid, long-term research base. If some charter supporters want to argue that this body of evidence – RCT and otherwise – suggests that charters might be more effective with low-income students, or when they set up shop in urban areas, there’s a case to be made there (though the evidence is far more mixed than is sometimes implied).
However, the research overall is rather clear – there are good, bad and medium charters, and the same goes for regular public schools.
Those who cling to the “best evidence” theory certainly score points for social scientific caution, but taking this viewpoint too far – i.e., essentially ignoring all research but RCTs – seems like wishful thinking more than anything else. And, more importantly, it distracts from the far more important task of trying to explain the wide variation in measured charter performance in terms of concrete policies and practices, which can inform all schools, regardless of their governance structures. The charter movement as a whole seems to be moving in this direction. Let’s hope this continues.
3. Me
I’m not fully sure what I think. One reason I write this blog is the actual writing helps me decide what I think. So off we go.
a. I Heart RCT
Like Collin Hitt, I really value RCTs as “gold standard research.” I believe in the scientific method. I wrote an essay about that here.
As a sidebar, my brain is now asking Why. Why do I value this particular type of study?
My brain says:
*I’m influenced by scholars like MIT’s Josh Angrist and several from Harvard (Roland Fryer, Tom Kane, Marty West, etc), who champion this method, not as perfect, but as better than other methods.
*I’m influenced by Tom Loveless, a scholar predisposed to pointing out weaknesses in other research that is less empirically strong.
*Many nights when my wife and I talk about how our day went, I’m struck by the contrast between education research (which seems to go in “fad waves”) and cancer research (my wife’s work).
Closing the achievement gap and curing cancer are tough problems. If I had to bet on where bigger progress will be made in the next 10 years, I’d bet cancer, because for all their faults (and there are many), RCTs are standard operating procedure there. Learning in that field is generally reliable (insert a million exceptions here). Researcher A discovers a little thing. Researchers B-Z get to build on that.
Let me restate. There are many limitations of RCTs. Ed Liu inserted some yesterday into the comments section of this blog, including specifically that an edu-RCT is different from a medi-RCT.
For those of you youngsters thinking of going to a PhD program in the social sciences, Christopher Winship wrote an extremely wonky paper about RCT limitations, which is here. Disclosure: Chris once bought me a sandwich at the Harvard Faculty Club, and I thought it was soggy. Plus no chips, just some upscale lettuce on the side.
Anyway, what do they say about democracy? It’s the worst form of government except all the others that have been tried? That’s RCTs in education research.
End of sidebar.
Annnnnnnnway, Hitt’s point — 10 gold standard studies of charters, all positive — shouldn’t be taken lightly.
b. I agree with Di Carlo
Here’s a simple thought experiment. If you could do an RCT of every charter school in the nation, what do you guess you would find? Two likely choices.
Choice 1. A mix of good, bad, and medium charters, in roughly equal proportions.
Choice 2. Many more good charters than bad (i.e., something similar to what we find in the 10 small RCT studies that Hitt mentions).
My guess: Choice 1. That is, my guess is for every Boston, where the RCTs show amazing results for charters, there’s a Rhode Island, where the charter sector is unimpressive.
Like Di Carlo, I think the larger studies of charters are less empirically strong, but they’re not worthless. A study like CREDO NYC, which came out yesterday, and is not an RCT, has findings similar to Hoxby’s RCT. So I would guess that CREDO USA is probably a decent proxy for all charter quality. It’s. A. Guess. I could be wrong. I am often wrong.
c. What I wish
Some charter opponents approvingly quote Macke Raymond of Stanford for her non-RCT CREDO study (when it says nationally only 17% of charters are “better”), then denigrate Macke Raymond of Stanford for her CREDO study when it says a particular subset of charters (like NYC or NJ) is unusually good. And vice versa for some charter advocates.
In a fantasy world, folks from very effective (while still highly imperfect!) charters wouldn’t have to invest so much energy just defending their right to exist. If that happened, discussion would move more to what Di Carlo hopes:
…the far more important task of trying to explain the wide variation in measured charter performance in terms of concrete policies and practices, which can inform all schools, regardless of their governance structures.
That’s my fantasy.

Any traction on that Phase 1/2/3 idea from your Ed Next piece?
You’ve missed some REALLY important points. I’ll give you two of them:
1) RCTs are biased. That’s a technical term, meaning systematic error.
The bias in this case? RCTs, as Di Carlo of Shanker Blog pointed out recently (and I pointed out years ago), can only work with over-subscribed charter schools. Any charter school that does not have a surplus of applicants does not need to do a lottery, and there is no random assignment. Which charter schools are most likely to have the most applicants? The better ones, right? (How we define “better” and what “better” means to families might vary, but it’s got to be some form of better.)
This means that RCTs should not be expected to have the worst charter schools, or necessarily even average charter schools. The bias is that charter schools included in these oversubscription-based RCTs will not be representative of the full range of charter schools.
On the other hand, there are no such restraints on which schools the students who do not get into the charter schools end up attending, and which therefore serve as the comparison in these RCT studies.
Except that’s not true. Which TPSs are most likely to have lots of families looking for an alternative? The best TPSs? Probably not, right? There’s a bias there towards worse TPSs for the comparison.
A bias towards the best charter schools being included and towards the worst TPS being included.
What do we do with that? Does that mean that RCTs are useless?
No. It means they are limited. It means we cannot use them to judge overall charter school quality or overall comparisons between charter schools and TPSs.
1) Unless the overall charter average is far below the TPS average, the RCT approach SHOULD show included charter schools outperforming included TPSs. Any time we don’t see that, we know there’s a big problem with those charters.
2) They can still offer some evidence about the relative magnitude differences between these subsets of schools in different places/times. That is, the same study in Boston, NYC and Minneapolis, to compare between the cities.
But due to the selection effects of schools, results of RCT studies cannot be generalized to the entire range of charter schools, or to the entire range of public schools. These studies only tell us (consistently) that the better charter schools outperform the worse public schools, by the criteria these studies examine.
There actually are a LOT of other problems with applying the RCT approach, but this is the most obvious one.
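To make the over-subscription point concrete, here is a toy simulation. Everything in it is invented for illustration: the quality scale, the number of schools, the seat count, and the rule that better schools tend to draw more applicants are all assumptions, not estimates from any of the studies discussed. It only shows the mechanism: if demand tracks quality at all, the schools that hold lotteries will look better than the sector as a whole.

```python
import random
import statistics

def simulate_schools(n_schools=5000, seats=100, seed=0):
    """Return (quality, applicants) pairs for hypothetical charter schools."""
    rng = random.Random(seed)
    schools = []
    for _ in range(n_schools):
        quality = rng.gauss(0.0, 1.0)  # arbitrary quality scale, mean 0
        # Assumption: demand rises with quality, plus noise unrelated to quality.
        demand_shift = 0.5 * quality + rng.gauss(0.0, 0.5)
        applicants = int(seats * (1.0 + demand_shift))
        schools.append((quality, max(applicants, 0)))
    return schools

def mean_quality(schools):
    """Average quality over a list of (quality, applicants) pairs."""
    return statistics.mean(q for q, _ in schools)

SEATS = 100
all_schools = simulate_schools(seats=SEATS)
# Only schools with more applicants than seats hold a lottery,
# so only these can appear in a lottery-based study.
lottery_schools = [(q, a) for q, a in all_schools if a > SEATS]

print(f"all charters:       mean quality {mean_quality(all_schools):+.2f}")
print(f"lottery charters:   mean quality {mean_quality(lottery_schools):+.2f}")
print(f"share with lottery: {len(lottery_schools) / len(all_schools):.0%}")
```

The size of the gap depends entirely on the made-up 0.5 weight linking quality to demand. Set it to zero and the gap disappears, which is the sense in which this is a claim about selection and not about charters themselves.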
I actually think Rhode Island’s charter schools are good. The least impressive ones are the high schools, and they’re saddled with a math test twice as difficult to pass as the MCAS, which makes them look bad.
Also, our charter sector grew during a period of rapidly expanding enrollment in Providence, which makes the whole process *much* more pleasant.
2) There are two most important questions about charter schools.
A) Should a family send their kids to a charter school? Is that charter school a better option for their child than the available TPS?
B) Are charter schools a good policy for the whole community/system?
These are different questions and require different approaches to answer.
The first question does not have to pay attention to the cost of charter schools on non-charter schools and students at those traditional public schools. But the second does.
Costs?
Here’s a really basic one: Peer Effects. Put an average kid in a classroom with above average kids and he will do better than when put in a classroom with below average kids. Peer effects are not just some notion. They are well studied and often are included when analyzing student data.
As a parent, I would want to see studies that do NOT control for peer effects. I wouldn’t care what happens if both kinds of kids are sent equally to both schools. I would only care about the classmates my child actually will have. Studies that provide answers to the first question should NOT include peer effects.
As a policy-maker, I WOULD want to see studies that control for peer effects. I would care which option works better when we control for who is attending the school. Peer effects are one way to capture some of the cost to other schools.
So, which kind of study are RCTs? They answer question A. The idea of an RCT is to control for everything else through the random selection. The issue is what should be controlled, and that depends on which question you are trying to answer.
RCTs do not tell policy-makers or the community that charter schools are better for the community or system than TPSs. Rather, they tell individual families that the local charter school is better than the local TPS. At most.
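A small worked example of the two questions, as a sketch only. The model (outcome = the school's own effect plus a spillover from classmates), the 0.3 peer weight, and all of the numbers are hypothetical and chosen to make the contrast obvious; none of them come from a study.

```python
def expected_score(school_effect, peer_mean, peer_weight=0.3):
    """Toy model: outcome = school's own effect + spillover from classmates.

    peer_weight is a made-up number, not an estimate from the literature.
    """
    return school_effect + peer_weight * peer_mean

# Hypothetical schools: identical "school effect", different classmates.
charter = {"school_effect": 0.10, "peer_mean": 0.50}
tps = {"school_effect": 0.10, "peer_mean": -0.20}

# Question A (a family's view): compare the schools as they actually are,
# classmates included.
family_view = expected_score(**charter) - expected_score(**tps)

# Question B (a policy-maker's view): hold the classmates fixed (here at 0)
# and compare only what the schools themselves add.
policy_view = (expected_score(charter["school_effect"], peer_mean=0.0)
               - expected_score(tps["school_effect"], peer_mean=0.0))

print(f"difference a family would see:       {family_view:.2f}")
print(f"difference with peers held constant: {policy_view:.2f}")
```

In this made-up case the family sees a real advantage while the peers-held-constant comparison shows none, because the whole gap comes from who is in the room.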
Again, there are lots of other issues about RCT studies and you should understand them before you cheer too loud for their limited findings.
I wrote about some of them a few years ago here: http://morethoughtful.blogspot.com/2009/09/what-is-gold-standard.html
Hi Ceolaf and thanks for the comments.
1. For what it’s worth, both in this blog and many others, I’ve discussed RCT limitations. For example, I linked to Winship’s paper, who makes some of the same points you make above. Similarly, peer effects were discussed in the comments yesterday. And obviously I linked to Di Carlo’s blog precisely to point out issues with RCTs — because it was a good explanation.
I don’t know, it’s a little weird to carefully write out those limitations, and then have you say HEY WHAT ABOUT ALL THE LIMITATIONS?
2. Anyway, I do agree on your “a” and “b” setup. I think there are 3 schools of thought on “b.”
Charters are always good policy b/c they give parents freedom to choose.
Charters are never good policy because (long list of stuff).
But many people/voters are somewhere in the middle. For them, first it matters if charters are “good” however defined, because if they’re not good, then there’s no need to further consider the whole community issue.
Hey Sean. No traction! Well, a wee bit. We have a Harvard scholar doing some small stuff with us. But most of the field thinks we already know what works (“we just need the will to do it!”), some of the rest of the folks still prefer qualitative research, and fewer still care about individual teacher decisions. But check out SREE, there’s some interesting stuff.
Tom: Okay, corrected!
Who is moving to Providence?
Hi Michael,
I was under the impression that charter schools, as alternatives to what I guess you could call “standard public” schools, were generally different from each other. Isn’t the point of a charter to work from a more-or-less unique vision of what schooling should be?
So I’m always surprised that charter schools are lumped together as if they were all franchises of the same idea. Studies of charter schools end up being seen as the entire universe of charter schools vs. the entire universe of public schools. For that matter, public schools aren’t all that monolithic either (though perhaps more so than charter schools).
What am I missing?
Hi Bill,
You make a very good point. Still, I’d say:
a. Lumping
There is a contentious debate about whether charter schools are a good idea. There’s some need to lump, even with the limitations of any analysis, to even discuss this policy question. The “average” charter does matter in this context.
b. Unique?
I wouldn’t say the 6,300 charters are unique per se. I would say there are about 6 or 7 models that describe most of those 6,300.
Moreover, in Boston, there are 20 “Commonwealth” charter schools. 14 of the 20 would describe themselves as very similar to one another.
c. Anti-lumping
Individual charters do get evaluated as individual schools. By the state. By parents. By their boards of trustees. By potential hires.
The “average charter” gets studied a lot.
What happens less frequently is an effort to distill the 6,300 charters into 6 or 7 models, and then analyze which one seems most effective.
Wow, that’s quite a distillation, though it does make sense when you think about it.
Have the 6 or 7 models been defined/detailed in any coherent way, or is it just your sense of it?
Your last comment seems to me crucial: analyzing which model seems most effective.
Thanks.
I’d add on to Ceolaf’s points about the RCTness of this research by pointing out that it’s really not.
In “real” RCT all subjects are randomly assigned. Unless the lottery includes ALL eligible children, it’s not randomized. It’s a selection amongst the people who want to be elsewhere.
If you could do a study that really did randomly assign students that would be very interesting indeed. It would certainly help to address some of the other issues here — like the charters v. worst TPS rather than average TPS problem.
Yep. In fact, another wrinkle is that the studies are “intent to treat.”
200 kids apply.
90 selected by lottery.
Study measures effect of winning the lottery.
Only some of those 90 kids actually attend the charter school. (Others reject their admission offer, go somewhere else.)
The results of those kids, the ones who win the admission lottery but DON’T attend the charter, actually get included in the “treatment group.” Even though they don’t spend a single day at the charter.
Therefore, including those kids is technically correct but probably UNDERSTATES the impact of actually attending the charter (what we really care about).
I.e., say 60 lottery winners actually attend the school. And they have math gains of twice the normal rate.
And 30 lottery winners choose other schools. They make normal math gains.
The study of these 90 winners would show “average gains” at 1.67 times the normal rate.
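Here is that arithmetic as a short sketch, plus the usual way of backing out the effect of actually attending: divide the winner-vs-loser gap by the share of winners who enrolled (often called the Bloom no-show adjustment). Two things are assumptions added for illustration and are not stated in the example above: lottery losers gain at the normal rate (1.0), and no lottery loser ends up attending the charter. The function names are made up.

```python
def itt_gain(attenders, attender_gain, decliners, decliner_gain):
    """Average gain across ALL lottery winners (the intent-to-treat group)."""
    winners = attenders + decliners
    return (attenders * attender_gain + decliners * decliner_gain) / winners

def attend_effect(winner_gain, loser_gain, attend_rate):
    """Rescale the winner-vs-loser gap by the share of winners who attended.

    Assumes lottery losers never enroll and that winning the lottery
    matters only through attending the school.
    """
    return (winner_gain - loser_gain) / attend_rate

# Numbers from the example above; gains are multiples of the normal rate.
winners_avg = itt_gain(attenders=60, attender_gain=2.0,
                       decliners=30, decliner_gain=1.0)

LOSER_GAIN = 1.0  # assumption: lottery losers gain at the normal rate
itt_gap = winners_avg - LOSER_GAIN
per_attender = attend_effect(winners_avg, LOSER_GAIN, attend_rate=60 / 90)

print(f"lottery winners average {winners_avg:.2f}x the normal rate")
print(f"intent-to-treat gap:    {itt_gap:.2f}")
print(f"effect of attending:    {per_attender:.2f}")
```

Under those assumptions the rescaled number recovers the full one-extra-unit gain of the kids who attended, while the intent-to-treat gap shows only two thirds of it.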
*
The other thing that surprises people: the LOWER the kid upon arrival, the BIGGER his gains at No Excuses charters. Example KIPP Lynn:
“The average reading gains are driven almost completely by SPED and LEP students, whose reading scores rise by roughly 0.35 standard deviations for each year spent at KIPP Lynn.”
“Therefore, including those kids is technically correct but probably UNDERSTATES the impact of actually attending the charter (what we really care about).”
Depends on the audience, right? Policymakers and researchers may care about “intent to treat,” school leaders “treatment on the treated.”
What enhances its RCT-ness (and thus rigor + validity) is the “intent to treat” design.
Hey bro,
But policymakers and researchers don’t care about the “offer” of a charter, do they? I.e., they use intent-to-treat because it’s the best measurement in a world of imperfect measurements (I think we agree there).
If there existed an equivalent measurement of “charter effect,” who’d prefer intent-to-treat? Am I missing something?
Hi Bill,
I think I recall various articles and papers on this, but I don’t know any off top of head….