Why probability probably doesn’t exist (but it is useful to act like it does)
By David Spiegelhalter
Illustration: Sébastien Thibault
Life is uncertain. None of us know what is going to happen. We know little of what has
happened in the past, or is happening now outside our immediate experience.
Uncertainty has been called the ‘conscious awareness of ignorance’1 — be it of the
weather tomorrow, the next Premier League champions, the climate in 2100 or the
identity of our ancient ancestors.
In daily life, we generally express uncertainty in words, saying an event “could”, “might”
or “is likely to” happen (or have happened). But uncertain words can be treacherous.
When, in 1961, the newly elected US president John F. Kennedy was informed about a
CIA-sponsored plan to invade communist Cuba, he commissioned an appraisal from his
military top brass. They concluded that the mission had a 30% chance of success — that
is, a 70% chance of failure. In the report that reached the president, this was rendered as
“a fair chance”. The Bay of Pigs invasion went ahead, and was a fiasco. There are now
established scales for converting words of uncertainty into rough numbers. Anyone in
the UK intelligence community using the term ‘likely’, for example, should mean a
chance of between 55% and 75% (see go.nature.com/3vhu5zc).
Attempts to put numbers on chance and uncertainty take us into the mathematical
realm of probability, which today is used confidently in any number of fields. Open any
science journal, for example, and you’ll find papers liberally sprinkled with P values,
confidence intervals and possibly Bayesian posterior distributions, all of which are
dependent on probability.
And yet, any numerical probability, I will argue — whether in a scientific paper, as part of
weather forecasts, predicting the outcome of a sports competition or quantifying a
health risk — is not an objective property of the world, but a construction based on
personal or collective judgements and (often doubtful) assumptions. Furthermore, in
most circumstances, it is not even estimating some underlying ‘true’ quantity.
Probability, indeed, can only rarely be said to ‘exist’ at all.
Chance interloper
Probability was a relative latecomer to mathematics. Although people had been
gambling with astragali (knucklebones) and dice for millennia, it was not until the
French mathematicians Blaise Pascal and Pierre de Fermat started corresponding in the
1650s that any rigorous analysis was made of ‘chance’ events. Like the release from a
pent-up dam, probability has since flooded fields as diverse as finance, astronomy and
law — not to mention gambling.
What’s more, as emphasized by the philosopher Ian Hacking2, probability is “Janus-faced”: it
handles both chance and ignorance. Imagine I flip a coin, and ask you the
probability that it will come up heads. You happily say “50–50”, or “half”, or some other
variant. I then flip the coin, take a quick peek, but cover it up, and ask: what’s your
probability it’s heads now?
Note that I say “your” probability, not “the” probability. Most people are now hesitant to
give an answer, before grudgingly repeating “50–50”. But the event has now happened,
and there is no randomness left — just your ignorance. The situation has flipped from
‘aleatory’ uncertainty, about the future we cannot know, to ‘epistemic’ uncertainty,
about what we currently do not know. Numerical probability is used for both these
situations.
There is another lesson in here. Even if there is a statistical model for what should
happen, this is always based on subjective assumptions — in the case of a coin flip, that
there are two equally likely outcomes. To demonstrate this to audiences, I sometimes
use a two-headed coin, showing that even their initial opinion of “50–50” was based on
trusting me. This can be rash.
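For readers who like to see the arithmetic, here is a minimal sketch of that dependence on trust. The prior weight given to the coin being two-headed is an invented number; the point is only that the stated probability of heads is a consequence of the assumed model, not a property of the coin alone.

```python
# A minimal sketch: the "50-50" answer is itself a model, conditional on
# trusting that the coin is fair. The prior weight on a two-headed coin is an
# invented number, purely for illustration.
p_two_headed = 0.05                       # your distrust of the person flipping
p_fair = 1.0 - p_two_headed

# P(heads) = P(fair) x 0.5 + P(two-headed) x 1.0
p_heads = p_fair * 0.5 + p_two_headed * 1.0
print(f"your probability of heads: {p_heads:.3f}")   # 0.525 rather than 0.500
```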
Some assumptions that people use to assess probabilities will have stronger
justifications than others. If I have examined a coin carefully before it is flipped, and it
lands on a hard surface and bounces chaotically, I will feel more justified with my 50–50
judgement than if some shady character pulls out a coin and gives it a few desultory
turns. But these same strictures apply anywhere that probabilities are used — including
in scientific contexts, in which we might be more naturally convinced of their supposed
objectivity.
Here’s an example of genuine scientific, and public, importance. Soon after the start of
the COVID-19 pandemic, the RECOVERY trials started to test therapies in people
hospitalized with the disease in the United Kingdom. In one experiment, more than
6,000 people were randomly allocated to receive either the standard care given in the
hospital they were in, or that care plus a dose of dexamethasone, an inexpensive
steroid3. Among those on mechanical ventilation, the age-adjusted daily mortality risk
was 29% lower in the group allocated dexamethasone compared with the group that
received only standard care (95% confidence interval of 19–49%). The P value — the
calculated probability of observing such an extreme relative risk, assuming a null
hypothesis of no underlying difference in risk — can be calculated to be 0.0001, or
0.01%.
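As a rough illustration of where such numbers come from, here is a minimal sketch of the standard calculation of a relative risk, its 95% confidence interval and a P value from a two-by-two table. The counts are invented for illustration; this is not the trial’s age-adjusted analysis.

```python
# A minimal sketch (not the RECOVERY analysis) of how a relative risk, its 95%
# confidence interval and a P value are typically computed from a 2x2 table.
# The counts below are invented purely for illustration.
import math

deaths_treat, n_treat = 100, 500          # hypothetical treatment arm
deaths_ctrl, n_ctrl = 140, 500            # hypothetical control arm

rr = (deaths_treat / n_treat) / (deaths_ctrl / n_ctrl)   # relative risk

# Large-sample standard error of log(RR).
se = math.sqrt(1/deaths_treat - 1/n_treat + 1/deaths_ctrl - 1/n_ctrl)
lo, hi = (math.exp(math.log(rr) + z * se) for z in (-1.96, 1.96))

# Two-sided P value for the null hypothesis of no difference in risk (RR = 1).
z_stat = math.log(rr) / se
p_value = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z_stat) / math.sqrt(2.0))))

print(f"RR = {rr:.2f}, 95% CI ({lo:.2f}, {hi:.2f}), P = {p_value:.4f}")
```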
This is all standard analysis. But the precise confidence interval and P value rely on more
than just assuming the null hypothesis. They also depend on all of the assumptions in the
statistical model, such as the observations being independent: that there are no factors
that cause people treated more closely in space and time to have more-similar outcomes.
But there are many such factors, whether it’s the hospital in which people are being
treated or changing care regimes. The precise value also relies on all of the participants in
each group having the same underlying probability of surviving 28 days. This will differ for
all sorts of reasons.
None of these false assumptions necessarily mean that the analysis is flawed. In this case,
the signal is so strong that a model allowing, say, the underlying risk to vary between
participants will make little difference to the overall conclusions. If the results were
more marginal, however, it would be appropriate to do extensive analysis of the model’s
sensitivity to alternative assumptions.
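A crude way to probe that sensitivity is to ask how much the confidence interval widens when the independence assumption is relaxed, for instance by inflating the standard error with a hypothetical ‘design effect’ for clustering. The numbers below are invented; the point is only that a strong signal survives a fair amount of such inflation.

```python
# A rough sensitivity sketch with invented numbers: inflate the standard error
# of a log relative risk by a hypothetical "design effect" (to mimic clustering
# by hospital, or underlying risks varying between participants) and see how
# much the 95% confidence interval widens.
import math

log_rr, se = math.log(0.71), 0.08         # illustrative estimate and standard error

for design_effect in (1.0, 1.5, 2.0):     # 1.0 corresponds to assuming independence
    se_adj = se * math.sqrt(design_effect)
    lo, hi = (math.exp(log_rr + z * se_adj) for z in (-1.96, 1.96))
    print(f"design effect {design_effect}: 95% CI for RR ({lo:.2f}, {hi:.2f})")
```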
To use the much-quoted aphorism, “all models are wrong, but some are useful”4.
The dexamethasone analysis was particularly useful because its firm conclusion
changed clinical practice and saved hundreds of thousands of lives. But the probabilities
that the conclusion was based on were not ‘true’ — they were a product of subjective, if
reasonable, assumptions and judgements.
I will add the caveat here that I am not talking about the quantum world. At the sub-
atomic level, the mathematics indicates that causeless events can happen with fixed
probabilities (although at least one interpretation states that even those probabilities
express a relationship with other objects or observers, rather than being intrinsic
properties of quantum objects)5. But equally, it seems that this has negligible influence
on everyday observable events in the macroscopic world.
I can also avoid the centuries-old arguments about whether the world, at a non-quantum
level, is essentially deterministic, and whether we have free will to influence the course
of events. Whatever the answers, we would still need to define what an objective
probability actually is.
Imprecise expression of uncertainty helped to persuade John F. Kennedy to back a CIA-organized
invasion of Cuba. Credit: Michael Ochs Archives/Getty
Many attempts have been made to do this over the years, but they all seem either flawed
or limited. These include frequentist probability, an approach that defines probability as
the theoretical proportion of events that would be seen in infinitely many repetitions of
essentially identical situations — for example, repeating the same clinical trial in the
same population with the same conditions over and over again, like Groundhog Day.
This seems rather unrealistic. The UK statistician Ronald Fisher suggested thinking of a
unique data set as a sample from a hypothetical infinite population, but this seems to be
more of a thought experiment than an objective reality. Or there’s the semi-mystical idea
of propensity, that there is some true underlying tendency for a specific event to occur
in a particular context, such as my having a heart attack in the next ten years. This seems
practically unverifiable.
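The frequentist idea can at least be mimicked on a computer: simulate a long run of ‘identical’ coin flips and watch the proportion of heads settle down. The sketch below is only an illustration of the definition, under the assumption that the simulated generator behaves like a fair coin.

```python
# A minimal illustration of the frequentist definition: probability as the
# long-run proportion of heads in (simulated) repetitions. The flip counts are
# arbitrary; the point is only that the running proportion settles down.
import random

random.seed(1)
heads = 0
for i in range(1, 100_001):
    heads += random.random() < 0.5        # simulate one 'fair' coin flip
    if i in (10, 100, 1_000, 10_000, 100_000):
        print(f"after {i:>6} flips: proportion of heads = {heads / i:.4f}")
```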
There is a limited range of well-controlled, repeatable situations of such immense
complexity that, even if they are essentially deterministic, fit the frequentist paradigm
by having a probability distribution with predictable properties in the long run. These
include standard randomizing devices, such as roulette wheels, shuffled cards, spun
coins, thrown dice and lottery balls, as well as pseudo-random number generators, which
rely on non-linear, chaotic algorithms to give numbers that pass tests of randomness.
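As a toy illustration of that last point, a deterministic but chaotic rule such as the logistic map can be thresholded to produce bits, and a basic frequency (monobit) test then finds roughly equal numbers of zeros and ones. The map and the test are illustrative choices, not a production or cryptographic generator.

```python
# A toy sketch: a deterministic, non-linear (chaotic) rule can still produce
# output that passes a simple test of randomness. The logistic map and the
# monobit frequency test are illustrative choices, not a production generator.
def chaotic_bits(seed, n):
    x = seed
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)           # logistic map in its chaotic regime
        yield 1 if x > 0.5 else 0         # threshold the state to get one bit

bits = list(chaotic_bits(seed=0.123456, n=100_000))
# Frequency (monobit) test: the proportion of ones should be close to 0.5.
print(f"proportion of ones: {sum(bits) / len(bits):.4f}")
```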
doi: https://doi.org/10.1038/d41586-024-04096-5
Correction 18 December 2024: The picture caption in this story did not accurately
capture the circumstances surrounding the Cuba invasion.
References
5. Rovelli, C. Helgoland: The Strange and Beautiful Story of Quantum Physics (Penguin,
2022).
6. Misak, C. Frank Ramsey: A Sheer Excess of Powers (Oxford Univ. Press, 2020).