Regression To The Mean

The tendency for extremely high or extremely low scores to become more moderate (i.e., closer to the mean) over time.

May 24, 2021

Nobel Prize-winning psychologist Daniel Kahneman wrote a book about the biases that cloud our reasoning and distort our perception of reality. One of the errors he examines in Thinking, Fast and Slow is our failure to account for regression toward the mean.

The notion of regression to the mean was first worked out by Sir Francis Galton. The rule goes that, in any series with complex phenomena that are dependent on many variables, where chance is involved, extreme outcomes tend to be followed by more moderate ones.

"Regression toward the mean" sounds a lot more sophisticated than it actually is. In simpler language, it is an important term in statistics because it describes a situation in which something or someone performs better or worse than usual and then returns to "average" or "normal".

This is one of the reasons it’s dangerous to extrapolate from small sample sizes, as the data might not be representative of the distribution. It’s also why James March argues that the longer someone stays in their job, “the less the probable difference between the observed record of performance and actual ability.” Anything can happen in the short run, especially in any effort that involves a combination of skill and luck. (The ratio of skill to luck also impacts regression to the mean.)

“Regression to the mean is not a natural law. Merely a statistical tendency. And it may take a long time before it happens.”
— Peter Bevelin

The rules of regression suggest that when evaluating performance or hiring, we must rely on track records more than outcomes of specific situations. Otherwise, we are prone to be disappointed.

When Kahneman was giving a lecture to the Israeli Air Force about the psychology of effective training, one of the officers shared his experience that praising his subordinates led to worse performance, whereas scolding them led to an improvement in subsequent efforts. As a consequence, he had grown generous with negative feedback and rather wary of giving too much praise.

Kahneman immediately spotted that it was regression to the mean at work. He illustrated the misconception with a simple exercise you may want to try yourself. He drew a circle on a blackboard and then asked the officers, one by one, to throw a piece of chalk at the center of the circle with their backs facing the blackboard. He then repeated the experiment and recorded each officer’s performance in the first and second trials.

Naturally, those who did incredibly well on the first try tended to do worse on their second try, and vice versa. The fallacy immediately became clear: the change in performance would have occurred with or without feedback. That again is not to say that feedback does not matter at all – maybe it does, but the officer had no evidence to conclude it did.
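The exercise is easy to reproduce on a computer. The sketch below is not Kahneman's procedure; it assumes a toy model in which every score is a fixed "skill" plus fresh random "luck" on each trial, and all names (simulate_two_trials, top_and_bottom_change, skill_sd, luck_sd) are made up for illustration. Nobody receives any feedback between the two trials.

```python
import random


def simulate_two_trials(n_people=10_000, skill_sd=1.0, luck_sd=1.0, seed=0):
    """Return (first, second) score lists; each score = fixed skill + fresh luck."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_people):
        skill = rng.gauss(0.0, skill_sd)  # stays the same across both trials
        first.append(skill + rng.gauss(0.0, luck_sd))
        second.append(skill + rng.gauss(0.0, luck_sd))
    return first, second


def mean(values):
    return sum(values) / len(values)


def top_and_bottom_change(first, second, fraction=0.1):
    """Average change (second - first) for the best and worst first-trial scorers."""
    order = sorted(range(len(first)), key=lambda i: first[i])
    k = max(1, int(len(first) * fraction))
    bottom, top = order[:k], order[-k:]
    top_change = mean([second[i] - first[i] for i in top])
    bottom_change = mean([second[i] - first[i] for i in bottom])
    return top_change, bottom_change


first, second = simulate_two_trials()
top_change, bottom_change = top_and_bottom_change(first, second)
print(f"best 10% on trial 1, average change:  {top_change:+.2f}")
print(f"worst 10% on trial 1, average change: {bottom_change:+.2f}")
```

Under these assumptions the best first-trial performers get worse on average and the worst get better, even though nothing was done to either group. Setting luck_sd to 0 makes the effect vanish, and raising it relative to skill_sd makes the effect stronger, which is the skill-to-luck ratio at work.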

The Imperfect Correlation

In order to understand why regression to the mean happens and how we can make sure we are aware of it when it occurs, we must first understand correlation.

The word correlation is used in everyday life to denote some form of association. However, in statistical terms we use correlation to denote association between two quantitative variables.

The correlation coefficient is measured on a scale that varies from -1 to +1. Complete correlation between two variables is expressed by either +1 or -1. When one variable increases as the other increases, the correlation is positive; when one decreases as the other increases, it is negative. Complete absence of correlation is represented by 0.

There are few if any phenomena in human sciences that have a correlation coefficient of 1. There are, however, plenty where the association is weak to moderate and there is some explanatory power between the two phenomena.

Kahneman observed a general rule: Whenever the correlation coefficient is imperfect, there will be regression to the mean.
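The rule can be written down compactly. What follows is a standard textbook sketch rather than anything quoted from the book, and the notation is introduced here for illustration: assume both variables are expressed as z-scores z_X and z_Y (distance from the mean in standard deviations), and let ρ be their correlation coefficient. The best linear prediction of one score from the other is then:

```latex
\[
  \hat{z}_Y = \rho \, z_X ,
  \qquad\text{so}\qquad
  \lvert \hat{z}_Y \rvert = \lvert \rho \rvert \, \lvert z_X \rvert < \lvert z_X \rvert
  \quad\text{whenever } \lvert \rho \rvert < 1 \text{ and } z_X \neq 0 .
\]
```

For example, with ρ = 0.5, someone who scores 2 standard deviations above the mean on the first measure is predicted to score only 1 standard deviation above the mean on the second. The prediction stops shrinking toward the mean only when ρ is exactly +1 or -1.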

This effect can be illustrated with a simple example:

Assume you are at a party and ask why it is that highly intelligent women tend to marry men who are less intelligent than they are. Most people, even those with some training in statistics, will quickly jump in with a variety of causal explanations, ranging from avoidance of competition to the fear of loneliness that these women face. A topic of such controversy is likely to stir up a great debate.

Now, what if we asked why the correlation between the intelligence scores of spouses is less than perfect? This question is hardly as interesting and there is little to guess – we all know this to be true. The paradox lies in the fact that the two questions happen to be algebraically equivalent. Kahneman explains:

[…] If the correlation between the intelligence of spouses is less than perfect (and if men and women on average do not differ in intelligence), then it is a mathematical inevitability that highly intelligent women will be married to husbands who are on average less intelligent than they are (and vice versa, of course). The observed regression to the mean cannot be more interesting or more explainable than the imperfect correlation.

Assuming that the correlation is imperfect, the chance that both partners are in the top 1% on any characteristic is far smaller than the chance that one partner is in the top 1% and the other is in the bottom 99%.
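A quick simulation gives a feel for the size of this gap. This is only a sketch under stated assumptions: both partners' scores are standardised and normally distributed, and the correlation of 0.4 is an arbitrary illustrative value, not an estimate from any study. The function names are hypothetical.

```python
import math
import random


def simulate_couples(n=100_000, rho=0.4, seed=1):
    """Draw n pairs of standardised scores whose correlation is about rho."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(1.0 - rho * rho)
    couples = []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)
        # b shares a fraction rho of a's score; the rest is independent noise.
        b = rho * a + noise_scale * rng.gauss(0.0, 1.0)
        couples.append((a, b))
    return couples


def share_of_top_partners_matched(couples, top_fraction=0.01):
    """Among couples whose first partner is in the top fraction,
    return the share whose second partner is in the top fraction too."""
    k = max(1, int(len(couples) * top_fraction))
    cutoff_a = sorted(a for a, _ in couples)[-k]
    cutoff_b = sorted(b for _, b in couples)[-k]
    top_a = [(a, b) for a, b in couples if a >= cutoff_a]
    both = sum(1 for _, b in top_a if b >= cutoff_b)
    return both / len(top_a)


share = share_of_top_partners_matched(simulate_couples())
print(f"top-1% partners whose spouse is also top 1%: {share:.1%}")
```

With an imperfect correlation, only a small minority of top-1% partners turn out to have a top-1% spouse; the rest are, by construction, paired with someone less extreme. Passing rho=1.0 to simulate_couples is the only setting in which every top-1% partner is matched.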

Correlation Does Not Imply Causation

We should be especially wary of the regression to the mean phenomenon when trying to establish causality between two factors. Whenever correlation is imperfect, the best will, on average, appear to get worse and the worst will appear to get better over time, regardless of any additional treatment. This is something that the general media and sometimes even trained scientists fail to recognise.

Consider the example Kahneman gives:

Depressed children treated with an energy drink improve significantly over a three-month period. I made up this newspaper headline, but the fact it reports is true: if you treated a group of depressed children for some time with an energy drink, they would show a clinically significant improvement. It is also the case that depressed children who spend some time standing on their head or hug a cat for twenty minutes a day will also show improvement.

Whenever coming across such headlines it is very tempting to jump to the conclusion that energy drinks, standing on the head or hugging cats are all perfectly viable cures for depression. These cases, however, once again embody the regression to the mean:

Depressed children are an extreme group, they are more depressed than most other children—and extreme groups regress to the mean over time. The correlation between depression scores on successive occasions of testing is less than perfect, so there will be regression to the mean: depressed children will get somewhat better over time even if they hug no cats and drink no Red Bull.

Awareness of the regression to the mean phenomenon itself is a great first step towards a more careful approach to understanding luck and performance. If there is anything to be learned from the regression to the mean it is the importance of track records rather than relying on one-time success stories.

Personally, this phenomenon has really helped me realise why it is easier to achieve success in something than to maintain it. One-time success can come from a variety of factors like luck, extreme willpower, or sincere dedication over a short run, but if we stop putting in the effort and start to rest on our laurels, the success is short-lived, since things start to regress to the mean.

The best way to have regression to the mean work in our favour rather than against us is to take more shots and improve our average, so that if and when something does regress to its mean, the mean itself is pretty spectacular. This is really where the value of good daily habits wins over random bursts of inspiration and hard work.

So the next time you are in despair for not being where you were supposed to be by now, know that every small step counts. Almost everyone regresses to the mean, so instead of comparing ourselves to someone else’s current state, we really just need to have more days where we show up for ourselves and have those small wins.

Saloni's Newsletter
