Regression to the Mean

Extreme results tend to be followed by more average ones - not because of any intervention, but because that's how variation works.

Also known as Regression toward the mean · Statistical regression · Mean reversion

What regression to the mean means

Regression to the mean is the statistical phenomenon in which extreme values in a dataset - unusually high or unusually low results - tend to be followed by values closer to the overall average. It’s not a force that causes things to move toward the middle. It’s a consequence of the fact that extreme outcomes involve an element of luck, and luck doesn’t repeat reliably.

The concept was first described by Sir Francis Galton in the 1880s, when he noticed that unusually tall parents tended to have children who were shorter than them (though still above average), and unusually short parents tended to have children who were taller than them. He called it “regression towards mediocrity.” The word “regression” stuck, though the principle is far from mediocre - it’s one of the most important and most misunderstood ideas in statistics.

Regression to the mean matters because we consistently mistake it for evidence that something has changed. A student who scores exceptionally well on one exam and less well on the next hasn’t necessarily lost ability. A patient who feels terrible one week and better the next hasn’t necessarily benefited from treatment. An athlete who performs brilliantly one season and less so the next hasn’t necessarily declined. In each case, the most likely explanation is that an extreme performance was followed by a more typical one - exactly as statistics would predict.

How regression to the mean works

The role of luck in extreme results

Every measurable outcome is a combination of consistent factors (skill, talent, conditions) and variable factors (luck, timing, random fluctuation). A student’s exam result reflects both their underlying knowledge and a host of random variables: which questions came up, how they slept the night before, whether they happened to revise the right topics.

When someone achieves an extreme result - unusually high or unusually low - the random component is likely to have been unusually favourable or unfavourable. And because random factors don’t repeat in the same way, the next result is likely to have a more typical random component - pulling the overall result back toward the average.
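A minimal simulation can make this concrete. The sketch below assumes each exam score is simply a fixed skill plus independent, normally distributed luck. The names (`sit_exam`, `SKILL_SD`, `LUCK_SD`) and all the numbers are illustrative assumptions, not taken from any real data.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

N_STUDENTS = 10_000
MEAN_SCORE = 60.0  # assumed average ability
SKILL_SD = 10.0    # assumed spread of stable ability
LUCK_SD = 10.0     # assumed spread of exam-day luck


def sit_exam(skill):
    """One exam score = stable skill + fresh, independent luck."""
    return skill + random.gauss(0, LUCK_SD)


skills = [random.gauss(MEAN_SCORE, SKILL_SD) for _ in range(N_STUDENTS)]
first = [sit_exam(s) for s in skills]
second = [sit_exam(s) for s in skills]  # same skills, new luck

# Select the top 10% using the first exam only.
cutoff = sorted(first)[int(0.9 * N_STUDENTS)]
top = [i for i in range(N_STUDENTS) if first[i] >= cutoff]

overall_mean = statistics.mean(first)
top_first_mean = statistics.mean(first[i] for i in top)
top_second_mean = statistics.mean(second[i] for i in top)

print(f"overall mean:           {overall_mean:.1f}")
print(f"top group, first exam:  {top_first_mean:.1f}")
print(f"top group, second exam: {top_second_mean:.1f}")
```

Nobody's skill changes between the two exams, yet the top group's second-exam average falls back toward the overall mean while staying above it. With the equal spreads assumed here, roughly half of the group's lead would be expected to disappear; a larger `LUCK_SD` makes the pull stronger.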

This isn’t mysterious. It’s arithmetic. If you roll two dice and get twelve, your next roll will almost certainly be less than twelve (35 chances in 36) - not because the dice have changed, but because twelve requires both dice to land on six, which is unlikely to happen twice in a row.
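The dice arithmetic can be checked by listing every outcome. This short sketch assumes two fair six-sided dice and uses exact fractions rather than simulation.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair six-sided dice.
rolls = [a + b for a, b in product(range(1, 7), repeat=2)]

p_twelve = Fraction(rolls.count(12), len(rolls))
p_below_twelve = Fraction(sum(r < 12 for r in rolls), len(rolls))
p_twelve_twice = p_twelve ** 2  # independent rolls multiply
expected_sum = Fraction(sum(rolls), len(rolls))

print(p_twelve)        # 1/36
print(p_below_twelve)  # 35/36
print(p_twelve_twice)  # 1/1296
print(expected_sum)    # 7
```

Dice are the limiting case: there is no skill component at all, so the best guess for the next roll is always the average of seven, whatever came before.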

Why we miss it

Regression to the mean is constantly occurring all around us, but we rarely notice it because our brains are wired to find causes for changes. When performance improves, we look for what caused the improvement. When it declines, we look for what went wrong. The possibility that neither the improvement nor the decline was caused by anything - that both are simply the natural fluctuation of a variable outcome around its average - doesn’t satisfy our need for explanation.

This is where regression to the mean connects to post hoc reasoning. If a manager criticises an employee after a bad week and the employee improves the following week, the manager concludes that the criticism worked. If a doctor prescribes a remedy when symptoms are at their worst and the patient improves, the doctor concludes that the remedy worked. In both cases, the improvement was the statistically expected outcome regardless of the intervention. But because the intervention coincided with the improvement, it gets the credit.

The psychologist Daniel Kahneman described this as one of the most significant sources of error in human judgement. It leads us to overvalue punishments (which tend to be followed by improvement because they’re administered after poor performance), undervalue rewards (which tend to be followed by decline because they’re administered after strong performance), and draw false conclusions about the effectiveness of almost everything we do.
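The punishment-and-reward illusion can be reproduced in a model where feedback has, by construction, no effect at all. In this sketch, performance is a constant ability plus random noise; the thresholds for “criticism” and “praise” and every number are assumptions chosen purely for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

TRUE_ABILITY = 50.0  # assumed constant: feedback changes nothing in this model
NOISE_SD = 10.0      # assumed day-to-day random variation
N_ATTEMPTS = 20_000

scores = [random.gauss(TRUE_ABILITY, NOISE_SD) for _ in range(N_ATTEMPTS)]

after_criticism = []  # change in score following a poor attempt
after_praise = []     # change in score following a strong attempt

for today, tomorrow in zip(scores, scores[1:]):
    if today < TRUE_ABILITY - NOISE_SD:    # poor attempt: "criticised"
        after_criticism.append(tomorrow - today)
    elif today > TRUE_ABILITY + NOISE_SD:  # strong attempt: "praised"
        after_praise.append(tomorrow - today)

mean_after_criticism = statistics.mean(after_criticism)
mean_after_praise = statistics.mean(after_praise)

print(f"average change after criticism: {mean_after_criticism:+.1f}")
print(f"average change after praise:    {mean_after_praise:+.1f}")
```

Scores rise on average after “criticism” and fall after “praise”, even though neither does anything in this model. Anyone watching only these before-and-after changes would conclude the opposite.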

Regression in both directions

It’s important to understand that regression to the mean works in both directions. Unusually good results tend to be followed by less good ones, and unusually bad results tend to be followed by less bad ones. This means that after a period of poor performance, things will probably improve without any intervention at all - and after a period of exceptional performance, things will probably get worse.

This symmetry is what makes regression so easy to exploit and so hard to detect. Almost any intervention applied after a low point will appear to work, because improvement was coming anyway. And almost any change made after a high point will appear to cause a decline, because decline was coming anyway.

Regression to the mean in everyday life

Medicine and health

Regression to the mean is one of the biggest challenges in evaluating medical treatments. People typically seek treatment when their symptoms are at their worst - an extreme point. Even without effective treatment, symptoms are statistically likely to improve simply because extreme lows tend to be followed by more moderate states. This is why randomised controlled trials with placebo groups are essential in medicine: they separate genuine treatment effects from statistical regression.

Alternative medicine benefits enormously from regression to the mean. People try unconventional remedies when conventional medicine hasn’t helped - usually at the point of maximum suffering. When symptoms improve (as they statistically tend to), the remedy gets the credit. This isn’t fraud in most cases. It’s a genuine misattribution caused by a statistical phenomenon that most people don’t know about.

Sport

The “sophomore slump” in sport - the tendency for rookies who had exceptional first seasons to perform less well in their second - is primarily a regression effect. A player who had an outstanding debut season probably benefited from some favourable random variation. The second season, with more typical luck, produces more typical results. It looks like decline, but it’s better understood as a return to baseline.

Similarly, “form” in sport is often less meaningful than it appears. A team on a winning streak is described as having momentum. A team on a losing streak is described as being in crisis. In many cases, both are simply experiencing the natural fluctuation of outcomes around their true ability level. The streak ends not because something changes, but because extreme sequences are unlikely to persist.

Business and management

In organisational settings, regression to the mean leads to systematic misjudgements about what works. Companies that perform poorly and then hire a new CEO often see improvement - which is attributed to the new leader. But a company that was performing at an extreme low was statistically likely to improve regardless of who took charge.

Performance management suffers similarly. Employees selected for remedial coaching based on poor performance tend to improve afterward - but much of that improvement would have occurred without the coaching. Conversely, high performers selected for stretch assignments sometimes disappoint - not because the assignment was wrong, but because their exceptional performance included a random component that didn’t repeat.

Education

Teachers often notice that praising a student for exceptional work seems to be followed by a decline, while criticising a student for poor work seems to be followed by improvement. This can lead to the damaging conclusion that criticism is more effective than praise. But regression to the mean predicts exactly this pattern: extreme performances in either direction tend to be followed by more average ones, regardless of the response they receive.

Understanding this doesn’t mean praise and criticism don’t matter. They do. But it means you can’t evaluate their effectiveness by looking at what happens next in individual cases. You need to look at patterns across many instances - which is exactly what makes regression to the mean so important for anyone making decisions based on data.

How to account for regression to the mean

Compare to a control group

The gold standard for detecting regression effects is comparison. If you intervene after poor performance and see improvement, compare it to a similar group, ideally chosen at random from the same pool, that received no intervention. If both groups improved by the same amount, the improvement can’t be credited to the intervention. Regression is the likelier explanation.
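Here is a sketch of why the comparison works, assuming a hypothetical coaching programme whose true effect is set to zero. The lowest-scoring 20% of employees are selected and split at random into a coached group and a control group. All names (`measure`, `mean_change`, `TRUE_EFFECT`) and figures are illustrative assumptions.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

N_EMPLOYEES = 10_000
SKILL_SD = 10.0    # assumed spread of stable ability
NOISE_SD = 10.0    # assumed random variation in each measurement
TRUE_EFFECT = 0.0  # assumed: the coaching does nothing


def measure(skill):
    """One performance measurement = stable skill + fresh noise."""
    return skill + random.gauss(0, NOISE_SD)


skills = [random.gauss(100.0, SKILL_SD) for _ in range(N_EMPLOYEES)]
before = [measure(s) for s in skills]

# Select the lowest-scoring 20%, then split them at random.
ranked = sorted(range(N_EMPLOYEES), key=lambda i: before[i])
selected = ranked[: N_EMPLOYEES // 5]
random.shuffle(selected)
half = len(selected) // 2
coached, control = selected[:half], selected[half:]


def mean_change(group, effect):
    """Average (after - before) for a group, adding any true effect."""
    return statistics.mean(measure(skills[i]) + effect - before[i] for i in group)


coached_change = mean_change(coached, TRUE_EFFECT)
control_change = mean_change(control, 0.0)
estimated_effect = coached_change - control_change

print(f"coached group change: {coached_change:+.1f}")
print(f"control group change: {control_change:+.1f}")
print(f"estimated effect:     {estimated_effect:+.1f}")
```

Both groups improve substantially, which is the regression effect. Only the difference between them estimates what the coaching added, and here that difference is close to zero. Setting `TRUE_EFFECT` to a non-zero value shows how a real effect would appear on top of the shared improvement.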

Be sceptical of before-and-after claims

Any time someone presents evidence in the form “things were bad, we did X, and then things got better,” ask whether regression to the mean could explain the improvement. This is especially important when the intervention was applied at an extreme point - the very moment when regression is most likely.

Think in averages, not episodes

Single data points are unreliable. Trends are meaningful. If you want to know whether someone is improving, declining, or staying steady, look at the long-term average rather than comparing the most recent result to the one before it. Individual variation around the mean is normal and expected - it’s not a signal unless it persists.

Expect extreme results to moderate

If someone just had an extraordinary result - in either direction - the best prediction for their next result is something closer to their average. This isn’t pessimism or optimism. It’s statistical literacy. And it’s one of the most useful habits of mind you can develop.
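One common way to put a number on “closer to their average” is the standard linear prediction below. It assumes that consecutive results share the same mean and spread and are roughly linearly related. The symbols are notation introduced here for illustration: μ is the long-run average, x_now is the latest result, and r is the correlation between consecutive results, which in this setting usually lies between 0 and 1.

```latex
\hat{x}_{\text{next}} = \mu + r \,\bigl(x_{\text{now}} - \mu\bigr),
\qquad
r = \frac{\sigma_{\text{skill}}^{2}}{\sigma_{\text{skill}}^{2} + \sigma_{\text{luck}}^{2}}
\quad \text{(if each result is skill plus independent luck)}
```

For example, with an average of 70 and r = 0.5, a result of 90 gives a prediction of 70 + 0.5 × (90 − 70) = 80. When r is close to 1 (mostly skill), little regression is expected; when r is close to 0 (mostly luck), the best guess is simply the average.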

Regression to the mean is one of those concepts that, once you understand it, you see everywhere. It explains the sophomore slump, the effectiveness illusion, the pattern of criticism-then-improvement that shapes so many relationships. It doesn’t make the world less interesting. It makes your understanding of it more accurate.

How to spot it

Watch for people claiming credit for improvement that would have happened anyway. Notice when a treatment, programme, or intervention is deemed effective because performance improved after a low point - without considering that a return to average was the most likely outcome regardless. Pay attention when the “sophomore slump” or “beginner’s luck” is attributed to psychology rather than statistics.

A thought to hold onto

If you only intervene after an extreme result, almost anything you do will look like it worked - because the next result was going to be closer to average anyway.

Why it matters now

In a world increasingly driven by data, metrics, and evidence-based decision-making, regression to the mean is one of the most important statistical concepts to understand - and one of the most commonly overlooked. It causes us to misjudge the effectiveness of treatments, the performance of employees, the value of strategies, and the impact of policies.