Sunday, March 25, 2007

A Question of Scale: Class-size Reduction and America's Misplaced Priorities


Everyday Americans and politicians alike think class-size reduction is a key to any plan to improve education in America. I used to agree. Then I ran the numbers. I've since come to the conclusion that class-size reduction is a $40 billion mistake. Allow me to explain.

Though much research has shown (and common sense confirms) that teacher quality is the key variable when it comes to student improvement, teacher quality is hard to measure. And without accurate measurement, it is impossible to compare the impact of the various known components of teacher quality to the impact of other seemingly helpful interventions such as reducing class size, instituting after-school programs, or hiring additional school counselors. Lacking a scale for comparison, we can't evaluate financial trade-offs, and politicians are likely to go for popular, feel-good programs, of which class-size reduction is the American favorite. But new research has given us exactly the tools we need to make precise comparisons, and the financial cost-benefit analysis this research makes possible is simply damning for class-size reduction.

The best source of this information is a new study called "How and Why Do Teacher Credentials Matter for Achievement" (Clotfelter, Ladd, and Vigdor 2007). Using value-added methodology*, this study took 10 years of data for every student and teacher in the state of North Carolina and used it to analyze the impact of various factors -- including teacher experience and credentials, student socio-economic background, and class-size -- on student achievement. Though the study was not new in conception, its massive data set suggests that its results may be the most reliable to date.

Of particular interest are the following results (for math achievement**):

Input                                             Increase in student achievement

The first 3 to 5 years of teacher experience      7.2% - 9.1% of a standard deviation (SD)
     (vs. a brand new teacher)
Teacher having a regular teaching license         3.3% - 5.9% SD
     (vs. having an emergency license)
Teacher same race as student                      2.0% - 2.9% SD
All additional years of teacher experience        2.0% - 2.8% SD
     (beyond 5 up to 27)
Teacher is National Board certified               2.0% - 2.8% SD
Teacher scored 1 SD above average on              1.1% - 1.5% SD
     a teaching licensure exam***
Reducing class-size by 5 students per teacher     1.0% - 2.5% SD
Teacher attended a competitive college            0.7% - 1.0% SD
     (vs. attending an uncompetitive college)
Reducing class-size by 1 student per teacher      0.2% - 0.5% SD
Teacher has an advanced degree                    –0.3% - 0.2% SD
     (usually MA)


                                                                             
The numbers above represent the amount of improvement that students experienced when given the input listed. It's very important to note that the results above are measured in percent of a standard deviation, not in percent of a test score. If you aren't statistically inclined enough to interpret what standard deviations mean, don't worry; the percentages above still tell us the relative impact of these factors, which is the crucial factor when comparing trade-offs.

Comparing those relative factors, you'll notice several things. First of all, teacher experience and licensure rank near the top of the list while class-size reduction and teachers having master's degrees rank at the bottom. But more important than the ranking is how much smaller an impact class-size reduction has than some of these other factors. For example, having a teacher who is not a novice (7.2% - 9.1% SD) exerts an influence roughly 3½ to 7 times greater than the impact of reducing class size by 5 students (1.0% - 2.5% SD).
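One way to reproduce that "3½ to 7 times" range is to pair the two low ends and the two high ends of the intervals. The post does not spell out how the ratio was computed, so that pairing is an assumption, and the variable names are only illustrative.

```python
# Effect sizes from the table above, in percent of a standard deviation (low, high).
experience_first_years = (7.2, 9.1)   # non-novice vs. brand new teacher
class_size_minus_5 = (1.0, 2.5)       # 5 fewer students per teacher

# Assumption: compare low end with low end and high end with high end.
ratio_low_ends = experience_first_years[0] / class_size_minus_5[0]
ratio_high_ends = experience_first_years[1] / class_size_minus_5[1]

print(f"low ends:  {ratio_low_ends:.1f}x")
print(f"high ends: {ratio_high_ends:.1f}x")
```

Under this pairing the two ratios come out at about 7.2 and 3.6, which matches the range quoted above.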

But the really mind-blowing results come when you start comparing class-size reduction to giving students a teacher with a reasonably good (though not unlikely) combination of teacher credentials (estimated by adding the relevant values in the table above). Clotfelter, Ladd, and Vigdor do their own estimates of this sort and come up with a combined effect size of 15% - 20% SD for math (and 8% - 12% SD for reading) for a well-credentialed teacher.

The first time I read this portion of the study I said to myself, "Yes, class size is less important," but that finding is not particularly novel to anyone who follows this sort of research. However, when I decided to actually compare how much less important class size is, my jaw dropped. The effect size of teacher credentials is 8 to 10 times that of a major class-size reduction in math and 6 to 8 times as big in reading.

To really understand the importance of these differences in scale you need to look at a few statistics in order to put a dollar amount next to a policy choice. America currently has a nationwide student-to-teacher ratio of about 16 to 1 (roughly 50 million public school students divided by roughly 3.1 million teachers). The average teacher salary in America currently floats around $48,000, which means we are spending roughly $150 billion a year on teacher salaries. Cutting class size in half would require doubling the number of teachers and therefore doubling the amount we spend on teacher salaries, putting the figure at around $300 billion. (This does not even account for the increased costs of benefits, additional classrooms, and other factors needed to enact such a decrease in class size.)
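The back-of-the-envelope figures in this paragraph can be checked in a few lines. This is just a sketch using the rounded inputs quoted above; the variable names are mine.

```python
# Rounded national figures quoted in the post.
students = 50_000_000      # public school students
teachers = 3_100_000       # public school teachers
avg_salary = 48_000        # average teacher salary, dollars per year

students_per_teacher = students / teachers
payroll = teachers * avg_salary            # current salary bill
payroll_if_halved = 2 * payroll            # twice the teachers at the same pay
extra_cost = payroll_if_halved - payroll   # added annual cost of halving class size

print(f"students per teacher: {students_per_teacher:.1f}")
print(f"current payroll:      ${payroll / 1e9:.1f} billion")
print(f"extra cost of halving: ${extra_cost / 1e9:.1f} billion")
```

The payroll comes to $148.8 billion, which the post rounds to $150 billion. Like the post, this sketch ignores benefits and classroom costs.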

The next important question is, "What would we get for such a monumental expenditure?" Based on the results in North Carolina we can estimate. If class-size reduction has a linear effect size (granted, an assumption worth exploring), cutting class size in half (i.e. reducing it by 8 students per teacher nationwide) would achieve an effect size in math eight times that of reducing it by one student per teacher, in other words 1.6% to 4% SD. To give you a sense of the meaning of that effect size, it is in the same ballpark as giving students a teacher who is of the same race (about 2% - 3% SD). The former intervention would cost about $150 billion nationwide (or better than half of the $250 billion that all the states combined spend on education) and the latter would cost roughly nothing.
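The linear extrapolation used here is simple enough to write down. The sketch below assumes, as the paragraph does, that each one-student reduction contributes the same effect; that linearity is the assumption the post-script revisits, not a finding of the study.

```python
# Math effect of one fewer student per teacher, in percent of a SD (low, high).
effect_per_student = (0.2, 0.5)
students_removed = 8   # halving a 16-to-1 ratio

# Assumption: the effect scales linearly with the size of the reduction.
effect_of_halving = tuple(round(e * students_removed, 1) for e in effect_per_student)

same_race_teacher = (2.0, 2.9)   # from the table, for comparison

print("halving class size:", effect_of_halving)
print("same-race teacher: ", same_race_teacher)
```

The extrapolated range of 1.6% to 4.0% SD overlaps the same-race range, which is the comparison the paragraph draws.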

The analysis so far has not compared the impact of class-size reduction to the impact of teacher effectiveness differences that have nothing to do with credentials. Put simply, quite a lot of research has shown that there is a huge difference in the performance of teachers that is not attributable to differences in credentials. (See for example the seminal "Teacher Effects on Longitudinal Student Achievement," Jordan, Mendro, and Weerasinghe 1997, or more recently Kane, Rockoff and Staiger's work, published in simplified form as "Photo Finish" in Education Next and available at http://www.hoover.org/publications/ednext/4612527.html).

It's not possible to directly compare the effect sizes from the North Carolina data set to those in these other studies, but what these other studies have shown is that in much the same way that teacher credentials are an order of magnitude more important than class size, teacher talent is an order of magnitude more important than teacher credentials. This subject is actually worthy of another post entirely. The simple upshot is that if a direct comparison shows that the cost-benefit ratios of reducing class size and improving teacher credentials are in different categories, the cost-benefit ratios of reducing class size compared to increasing the concentration of teaching talent are on different planets.

So how about, instead of setting a goal of halving class size, which would require us to spend $150 billion annually, we set a goal of doubling teacher pay, so that the average is around $100K, and simultaneously impose a ruthlessly competitive tenure process where districts only offer tenure to teachers with dramatically high value-added?

For those who think that all this discussion of halving class size (i.e. cutting it by 8 students per teacher) is a crazy hypothetical, it's worth looking at the trend over the past 30 to 40 years. Since 1970 class size has gone down in America by about 6 pupils per teacher, a roughly 27% decrease. There is no indication that teacher quality has gone up by 27% or even 2.7% in that same time period.

Judging by the education news media, I must be the only person running these numbers. Every other news article, position statement by a politician, teacher blog, or opinion on the street, assumes that reducing class size is a high priority and should be done whenever possible. But it is quite simply the most misplaced priority out there and no one seems to be drawing any attention to this fact. The mild experiments in different versions of performance pay, about which there have been such intense fights, represent just peanuts compared to the money that has been sunk, and that prevailing sentiment proposes to keep sinking, into class-size reduction.

Of course, increasing the teacher talent pool would require recruiting a large number of new teachers. So here's a proposal for what we could do with the money we would have saved from not decreasing class size. How about completely free college for anyone who meets a high academic threshold (3.5 GPA, 700 or higher Math SAT) and commits to majoring in math or science and then teaching math or science in a low-income school for 4 years after college? (I mention these hypothetical credentials based on the 7% of a SD increase you get from having a teacher with licensure test scores not 1, but 2 SDs above the mean.) This approach to teacher recruitment would also help 100,000 students a year with college access and in particular could offer high-achieving low-income students a powerful opportunity to attend college while serving the communities from which they come. (I imagine this might have the side effect of improving the student-teacher racial matching mentioned above, which was measured as having an effect size similar to that of halving class size.)

If we assume a good state college costs $20,000 a year (tuition, room, board, books, everything) for four years and say we had 100,000 takers for our national program each year, it would only cost us $8 billion for each cohort (or roughly half of current Title I costs). If on the other hand we spent the same $8 billion reducing class size, we could hire about 160,000 new teachers and achieve a nationwide decrease in class size of about 0.8 pupils per teacher.**** Based on Clotfelter, Ladd, and Vigdor’s numbers, the latter intervention would likely produce a nationwide effect size of less than 0.2% to 0.5% of a SD (0.2% - 0.5% of a SD is what we would get from a full 1 pupil reduction in class size). Not 2% - 5% SD, but 0.2% - 0.5% SD. For $8 billion.
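Here is a rough sketch of the two uses of the same $8 billion, using the post's rounded inputs. Scaling the effect size in proportion to the class-size change is the linear assumption again, and the variable names are illustrative.

```python
# Scholarship program: cost of one cohort.
college_cost_per_year = 20_000
years = 4
takers = 100_000
cohort_cost = college_cost_per_year * years * takers

# Same money spent on hiring instead.
avg_salary = 48_000
hires_exact = cohort_cost / avg_salary   # the post rounds this down to about 160,000
hires_used = 160_000

students = 50_000_000
teachers = 3_100_000
ratio_before = students / teachers
ratio_after = students / (teachers + hires_used)
reduction = ratio_before - ratio_after   # pupils per teacher

# Assumption: effect scales linearly from the 1-student figures (0.2% - 0.5% SD).
effect = tuple(round(e * reduction, 2) for e in (0.2, 0.5))

print(f"cohort cost: ${cohort_cost / 1e9:.0f} billion")
print(f"hires at $48,000 each: {hires_exact:,.0f}")
print(f"class-size reduction: {reduction:.2f} pupils per teacher")
print(f"estimated effect: {effect[0]}% - {effect[1]}% SD")
```

Dividing $8 billion by $48,000 gives closer to 167,000 hires than 160,000, but either figure rounds to a reduction of about 0.8 pupils per teacher.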

As I mentioned above, states and school districts have actually enacted this horrendously inefficient second intervention. If we simply undid the 6 pupil per teacher, nationwide class-size decrease of the past thirty years, which would require laying off roughly 830,000 teachers nationwide (about 27%), and put us back to our previous roughly 22 to 1 ratio (not terrible), we would have an additional $40 billion or so a year to spend (830,000 teachers times $48,000 per year in salaries). If instead of teacher recruitment through financial aid we wanted to use this money to increase teacher salary, it would allow us to raise salaries for the remaining two and a quarter million teachers by almost $18,000 a year across the board. That's a raise that people considering teaching would notice.
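The savings and the across-the-board raise follow directly from the figures in this paragraph. A minimal sketch, with my own variable names:

```python
teachers_now = 3_100_000
teachers_cut = 830_000     # roughly 27% of the current workforce
avg_salary = 48_000

savings = teachers_cut * avg_salary          # freed up each year
teachers_left = teachers_now - teachers_cut
raise_per_teacher = savings / teachers_left  # spread evenly over everyone left

print(f"annual savings: ${savings / 1e9:.2f} billion")
print(f"remaining teachers: {teachers_left:,}")
print(f"raise per remaining teacher: ${raise_per_teacher:,.0f}")
```

The savings work out to just under $40 billion and the raise to between $17,500 and $18,000 per teacher, in line with the "almost $18,000" quoted above.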

If districts wanted to be more strategic than proposed above, we could raise salaries for 75% of those teachers (about 1.7 million of them) by $10,000 across the board ($17 billion total) and raise salaries for the top quartile (roughly 560,000 teachers) by approximately $40,000 per teacher ($23 billion total). Frankly, I would be for this even if top quartile was determined not by value-added, but by peer-review and principal evaluations since I believe we'd still be getting enough overlap between actually good teachers and people being highly compensated. We could then have those $100K a year teachers that so many of us dream of (or dream of being) and not just a few of them.
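The tiered version can be sketched the same way. The budget and headcount below are the post's rounded figures ($40 billion, about 2.27 million remaining teachers); splitting the workforce into an exact top quarter is my simplification.

```python
budget = 40_000_000_000      # rounded annual savings from the paragraph above
teachers_left = 2_270_000    # 3.1 million minus 830,000

top_quartile = teachers_left // 4
everyone_else = teachers_left - top_quartile

base_raise = 10_000
cost_of_base_raise = everyone_else * base_raise
pool_for_top = budget - cost_of_base_raise
top_raise = pool_for_top / top_quartile

print(f"bottom 75%: {everyone_else:,} teachers, ${cost_of_base_raise / 1e9:.1f} billion")
print(f"top 25%:    {top_quartile:,} teachers, ${pool_for_top / 1e9:.1f} billion")
print(f"raise per top-quartile teacher: ${top_raise:,.0f}")
```

With these inputs the top-quartile raise lands a little above $40,000, consistent with the approximate split described above.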

Maybe we could do all this without laying off any teachers if we just don’t replace the almost 1 million who are nearing retirement. But to do that we’d have to convince all of America to give up our most beloved, and misguided, policy intervention. We'd have to convince Americans that the bottom line in improving student achievement is teacher quality not teacher quantity. We'd have to convince them that there is a real trade-off being made and that every time you spend money reducing class-size you are not spending it producing, recruiting and retaining effective teachers. We’d have to convince them that bigger classes are actually better for education.

--Dewey


*Value-added methodology is a way of evaluating student-progress as opposed to just absolute test scores. To radically oversimplify, if a teacher takes a student who is scoring about a 60 on an exam at the beginning of a year and teaches that student to the level where she scores an 80 on the same (or a very similar) exam by the end of the year that teacher has added value of 20 points. If another teacher took a student who was scoring 85 and took her to scoring 90 that teacher added 5 points. Even though the second teacher's student scored significantly higher, we would say the first teacher did a better job helping her student grow, in fact a dramatically better job. This is a very gross oversimplification of value-added, but it conveys the basic idea. Most value-added models take into account various other factors besides just the student's initial test score, in order to not hold teachers accountable for factors outside of their control that differentiate their students from the students of other teachers.
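The toy example in this footnote amounts to a subtraction. The sketch below covers only that oversimplified version; real value-added models adjust for other student factors, and the function name is hypothetical.

```python
def simple_value_added(score_start: float, score_end: float) -> float:
    """Gain on the same (or a very similar) exam over one school year."""
    return score_end - score_start

teacher_one = simple_value_added(60, 80)   # student went from 60 to 80
teacher_two = simple_value_added(85, 90)   # student went from 85 to 90

print("teacher one added", teacher_one, "points")
print("teacher two added", teacher_two, "points")
```

The second teacher's student finishes with the higher score, yet the first teacher shows four times the gain.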

**The fact that these scores are for math, not reading, is actually quite important. In general the impact that school-based factors have on reading test scores is significantly lower than the impact that school-based factors have on math scores, and conversely the impact of non-school factors such as parental education levels is much bigger for reading than for math. Given that parents typically interact with their children more through language than through math, it makes intuitive sense that parental (and peer or community) impact on language skills would be larger than parental impact on math skills. The relevant effect sizes of class-size reduction on reading achievement are 1.0% - 2.0% SD for a reduction of 5 students per teacher.

***Scoring one standard deviation higher than average would mean you scored in roughly the top 16% of test takers. Scoring two standard deviations higher than average would mean you scored in the top 2% of test takers.
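These percentile figures can be checked with Python's standard library, assuming test scores follow a normal distribution. That assumption is mine for illustration; the footnote does not state the distribution of licensure scores.

```python
from statistics import NormalDist

# Assumption: scores are normally distributed (standard normal after scaling).
scores = NormalDist(mu=0.0, sigma=1.0)

share_above_1sd = 1 - scores.cdf(1.0)
share_above_2sd = 1 - scores.cdf(2.0)

print(f"share scoring above +1 SD: {share_above_1sd:.1%}")
print(f"share scoring above +2 SD: {share_above_2sd:.1%}")
```

Under that assumption the shares are about 15.9% and 2.3%, which the footnote rounds to 16% and 2%.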

****Figures come from the following rough calculation: Current ratio: 50 million current public school students divided by 3.1 million current public school teachers = 16.1 students per teacher. Ratio after spending $8 billion on a class-size reduction initiative: 50 million current students divided by 3.26 million teachers = 15.3 students per teacher. Difference: 16.1 – 15.3 = 0.8 student per teacher reduction. This assumes that the new Math and Science corps replaces retiring teachers as opposed to being added to the current number of teachers. 


Post-script:
The one thing that this analysis doesn't address is the potential for non-linear effect sizes for class-size reduction. Clotfelter et al. actually found striking non-linearity in the effect size for increased teacher test scores, with a teacher scoring 1 SD better than average producing barely a 1% SD increase in student achievement, but a teacher scoring 2 SD better than average on the licensure test producing a whopping 7% SD increase. Frankly, non-linear effect sizes for class size seem more likely than not. At the extremes it is obvious that teaching 40 elementary-age children with even mild discipline issues starts to become ridiculous, and teaching a class of 4 children is essentially a form of tutoring. At some point in the future I'm going to come back to this issue. I'll just say for the moment that the kinds of changes in class size discussed in this post, i.e. changes of 6 pupils per teacher, are not creating these extreme cases in which we would predict severe non-linearity. More on this in time to come.

13 comments:

dorian said...

Fascinating. I guess it would also be worth finding out to what extent the teacher-credentials listed as being important in the beginning of your post are already satisfied in the current system (to find out how much room there is for improvement in this area).

And I agree that it would be worthwhile to find a way of measuring the "intangible good teaching qualities" that teachers possess to varying degrees irrelevant of credentials.

TurbineGuy said...

Excellent 2nd post. Consider yourself linked.

One observation.

Isn't it possible that reducing class size would actually have a negative effect, instead of a small positive one.

We already have difficulty attracting teachers into teaching, especially in math and science. Doubling the number of teachers would have to entail reducing standards and quality.

Wouldn't the net effect of the lower average teacher quality more than cancel out the small benefits of reduced class size?

H. said...

Wonderfully content-rich post. I hope you'll be writing a lot more.

Do you think that class size might have a significant effect on the number of teachers who stay beyond the critical first years, and thus indirectly on the average length of teacher experience?

(A different question would then be whether - assuming a greater survival rate among teachers with smaller class sizes - the right candidates are selected for.)

CrypticLife said...

Wow. Came here through Rory's (parentalcation) posting, and I'm not disappointed. I also always figured teacher quality was more important than class size, but the difference is astounding.

I like your financial analysis as well. One minor issue with it, though, is you don't consider the administrative costs of running such a program of guaranteed tuition. Also, I suspect the program would have little draw without also increasing teacher salaries.

Unknown said...

I'll have to get that paper and read it. You've been linked. Excellent post.

Catherine Johnson said...

wow!

Catherine Johnson said...

For some reason, I can't post a link to The Common School. I keep getting a message saying that the link is causing Mozilla to run slowly.

I'll keep trying.

Catherine Johnson said...

btw, we have 100K teachers in my district, along with a very high rate of tutoring and parent reteaching.

slide and glide

Catherine Johnson said...

If you have time, I'd love to get citations for the research on home contribution to math versus reading.

I had been noticing this for quite awhile, but haven't seen research supporting this observation.

Dewey said...

Hello all and thanks so much for your comments.

Let me see if I can address several of them all at once.

To rigamarole, regarding the current state of teacher credentials: there is actually quite a rich set of research on this subject. The problems are particularly acute for minority and poor students. A few leaders in the research are the academics Richard Ingersoll and Helen Ladd, and the organizations The Education Trust and The National Council on Teacher Quality. See for example http://www.pubpol.duke.edu/research/papers/SAN06-08.pdf and also http://www2.edtrust.org/NR/rdonlyres/8DE64524-592E-4C83-A13A-6B1DF1CF8D3E/0/AllTalk.pdf

There is a ton more to say on this so I'll perhaps come back to it in a later post.

Oops something came up. More comments to other folks in just a bit.

Jen said...

That figure of 16 kids in a class is total teachers, right? I'm guessing that includes most non-administrative staff of a school -- special ed, aides, speech therapists, etc. That or there are huge pockets of extremely small classes unreported around this country!

Our district (urban) certainly doesn't have 16 in a class -- 25 is probably more of a reasonable estimate and I'd guess most elementary teachers would say something in the 18-24 range would be acceptable to them. Under 30 in HS? The non-Catholic private schools around here use their 16-18 student class sizes as a selling point, not a "we're just the same as you are" point!

I keep thinking about grading/parent contact/recordkeeping time. Imagine if a HS teacher only spent 5 minutes per kid (let's say per week) on all of the above. 5 classes, 25 kids each = 125 kids = over 10 hours a week. That doesn't even count the parent conference that takes 45 minutes out of a week. If that same teacher adds another 25 students (5 per class) you're at 12 1/2 hours a week. For 5 minutes.

So is that teacher going to take more or less time to look at homework? More likely to actually read and grade it or just check it as "done" regardless of quality, more likely to read a rough draft or even a finished essay carefully, or just scan it for obvious problems?

Now on the teacher talent item? I'd *love* to see that quantified. I agree with your assessment that some sort of teacher/principal/maybe even parent rating system would quickly sort out the top and bottom 20% of teachers. Lose the bottom, reward the top.

Nancy Flanagan said...

Interesting interpretation of one rather small piece of the Clotfelter study. Arguing about class size as a factor in student achievement is hardly a new phenomenon, however. Eric Hanushek has been pushing the fallacy of reducing class size in his research for, literally, decades, in dozens of papers.

Whenever I read about "exactly the tools we need" and "precise comparisons" to evaluate school performance, I know that I'm looking at the work of someone who sincerely believes in the "reality" of tiny quantitative distinctions. We may have good evidence that class size matters little when comparing residual learning measured by standardized tests--but try telling that to Soccer Mom and Banker Dad: Go ahead and put your kid into a kindergarten class of 35! Research shows that the lack of attention won't make a bit of difference when they take the SATs!

The supposed gold standard in class size research--the Tennessee STAR study--acknowledges that teacher quality matters more than twice as much as class size. Even the best and hardest-working teacher, however, runs out of time and steam when classes get too large. The master English teacher with 35 students will assign half as many essays as the master English teacher with 16 kids, and provide half the feedback. This may not make a difference to some students--but it will to others. It's just common sense, which is why policy-makers persist, in spite of the research evidence, in proposing class size reductions.

The STAR study was also clear that you get the biggest achievement bang from class size reduction with at-risk students (who haven't had rich literacy and larger-world experiences before school) and with kids in the very early grades. That, too, makes sense. Perhaps we should be reducing class size in the early grades and in high-needs elementary schools; perhaps the problem is uniform class size restrictions across the board, instead of where they'll do the most good.

Reductions in overall class size since 1970 are largely due to Special Education regulation and legislation, which restrict class sizes for students with learning disabilities. You can argue that these restrictions are not necessary, but that's a different argument from the one you're making about how to recapture the economic good old days, by raising both class size and teacher talent.

About this "teacher talent" assertion: it would be interesting to know how, precisely, using these great tools we now have, you plan to measure quality teaching. If you use Clotfelter's numbers, your best bet is teacher retention, since that's the optimum strategy for increasing effectiveness. Competitive college admission (which presumes high entrance exam scores and those high GPAs) and advanced degrees won't do it. Where do these high-quality teachers come from? How are they molded? This is an issue educators have been struggling with for decades, as well. There is a great deal more involved in teacher efficacy than credentials, it seems.

Finally--a lot of your calculations assume validity that would have to be tested by further research. The strange thing about social science research is that the numbers may say one thing, but it feels different on the ground.

BTW, I am an instrumental music teacher and my average class size last year was 67 students. This is an issue I am intimately familiar with.

Anonymous said...
This comment has been removed by a blog administrator.