A university was accused of discriminating against women because its overall admission rate for women was lower than for men. Yet in every department, the admission rate for women was higher. How can a conclusion hold in each part and fail in the whole? It is neither magic nor error: it is Simpson's paradox, one of the most treacherous statistical traps, and a reminder that data misleads those who read it carelessly.
Simpson's paradox, explained
Simpson's paradox occurs when a trend that appears in each of several groups disappears or reverses once the groups are combined. In the university case, women applied mostly to the most competitive departments, which admit a small share of all applicants, while men applied mostly to the easier ones. No department treated women worse, but the uneven mix of applications produced a misleading total. The culprit is a hidden variable (often called a confounding variable): here, the department chosen.
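To make the reversal concrete, here is a minimal sketch in Python. The department names and all the figures are invented for illustration and do not come from any real university; the only point is to show how rates that favor women in each department can still add up to a total that favors men.

```python
# Hypothetical admissions figures, invented for illustration only.
# Layout: department -> {group: (admitted, applicants)}
admissions = {
    "easy department": {"women": (18, 20), "men": (64, 80)},
    "competitive department": {"women": (24, 80), "men": (4, 20)},
}


def rate(admitted, applicants):
    """Admission rate as a fraction between 0 and 1."""
    return admitted / applicants


# Rates inside each department.
for department, groups in admissions.items():
    for group, (admitted, applicants) in groups.items():
        print(f"{department:<24} {group:<6} {rate(admitted, applicants):.0%}")

# Totals per group, summing admitted and applicants across departments.
totals = {}
for groups in admissions.values():
    for group, (admitted, applicants) in groups.items():
        so_far_admitted, so_far_applicants = totals.get(group, (0, 0))
        totals[group] = (so_far_admitted + admitted, so_far_applicants + applicants)

for group, (admitted, applicants) in totals.items():
    print(f"{'overall':<24} {group:<6} {rate(admitted, applicants):.0%}")
```

With these made-up numbers, women are admitted at 90% versus 80% in the easy department and at 30% versus 20% in the competitive one, yet overall they are admitted at 42% versus 68% for men. Most women applied where almost nobody gets in, and that alone drives the total.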

Why this matters for your business
This is not an academic curiosity. Suppose a new sales process shows a worse overall conversion rate than the old one. If the new process was tested mostly on difficult customers and the old one mostly on easy ones, the total lies: once you segment by customer type, the new process may be better in both segments. Acting on the total would mean abandoning a process that is actually superior.
Other traps in the same family
- Averages that hide: a comfortable average can combine two groups at opposite extremes, neither near the average.
- Survivorship bias: analyzing only those who "survived" (customers who stayed, projects that finished) and forgetting those who did not.
- Small samples: the extremes (the best and the worst store) tend to be the smallest ones, because results from small samples swing more by pure chance.
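Two of these traps are easy to see with a few lines of Python. The sketch below uses invented satisfaction scores and an assumed true conversion rate of 10%; the standard error formula is the usual one for an observed proportion and assumes each visitor converts independently.

```python
import math
import statistics

# Averages that hide: hypothetical satisfaction scores (0 to 10)
# from two very different customer groups.
unhappy = [1, 2, 2, 3]
delighted = [8, 9, 9, 10]
scores = unhappy + delighted

average = statistics.mean(scores)
# Distance between the average and the customer closest to it.
closest_gap = min(abs(score - average) for score in scores)
print(f"average score: {average}, closest customer is {closest_gap} points away")


# Small samples: how much an observed rate wobbles by chance alone.
def standard_error(p, n):
    """Standard error of an observed proportion with true rate p and n trials."""
    return math.sqrt(p * (1 - p) / n)


true_rate = 0.10  # assumed identical for every store
for visitors in (50, 500, 5000):
    wobble = standard_error(true_rate, visitors)
    print(f"{visitors:>5} visitors: typical chance swing of about {wobble:.1%}")
```

The average of 5.5 describes no actual customer: the nearest score is 2.5 points away. In the second part, every store has the same underlying rate, but a store with 50 visitors swings about ten times more than one with 5000, so the small stores are the ones most likely to land at the top or bottom of a ranking.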
The common thread: the hidden variable
Almost all these traps share the same root: there is a factor behind the scenes we are not seeing. Simpson's paradox is the most dramatic example, but the lesson is general — an aggregated number can hide opposite realities in its components. Looking only at the total is looking at the shadow, not the object.
How not to fall into the trap
The defense is the habit of segmenting. Before acting on a total, ask: if I split this by relevant groups (customer type, region, period), does the story hold? Often it does, and you can act with more confidence. When it does not, you have just avoided a wrong decision based on a total that lied.
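The "does the story hold?" question can be turned into a small check. What follows is only a sketch: the function name compare_by_segment, the data layout, and the sales figures are all hypothetical, chosen to mirror the sales process scenario described earlier.

```python
def compare_by_segment(data, a, b):
    """Compare options a and b overall and inside each segment.

    data maps segment -> {option: (successes, trials)}.
    Returns (overall_winner, {segment: winner}). Ties go to b.
    """
    totals = {a: [0, 0], b: [0, 0]}
    segment_winners = {}
    for segment, options in data.items():
        rates = {}
        for option in (a, b):
            successes, trials = options[option]
            totals[option][0] += successes
            totals[option][1] += trials
            rates[option] = successes / trials
        segment_winners[segment] = a if rates[a] > rates[b] else b
    overall_rates = {option: s / t for option, (s, t) in totals.items()}
    overall_winner = a if overall_rates[a] > overall_rates[b] else b
    return overall_winner, segment_winners


# Hypothetical sales data: (conversions, customers contacted).
sales = {
    "difficult customers": {"new": (30, 150), "old": (5, 50)},
    "easy customers": {"new": (20, 50), "old": (52, 150)},
}

overall, by_segment = compare_by_segment(sales, "new", "old")
story_holds = all(winner == overall for winner in by_segment.values())

print("overall winner:", overall)
print("winner by segment:", by_segment)
print("does the story hold?", story_holds)
```

With these invented figures the old process wins on the total (28.5% versus 25%), while the new process wins in both segments (20% versus 10% with difficult customers, 40% versus about 35% with easy ones), so the check reports that the story does not hold. A mismatch like this is a signal to look at the mix of cases behind each option, not a verdict on its own.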
An example of discipline
A marketing team saw that a campaign was performing worse than average overall. Before canceling it, they segmented by channel and found it was excellent in two channels and terrible in a third, which dragged the total down. The right decision was not to cancel the campaign but to turn off the third channel and reinforce the other two. The habit of segmenting turned a "cancel this" into a profitable optimization.
In practice
Next time a total tells you a clear story, be a little suspicious and split it by the groups that matter. Data is powerful, but only when we read it with enough care to look for the hidden variable. Would the last number you based a decision on survive being segmented?