Habits of Highly Mathematical People: Discussing Definitions
The most common question students have about mathematics is “when will I
ever use this?” Many math teachers would probably struggle to give a coherent
answer, beyond being very good at following precise directions. They will say
“critical thinking” but not much else concrete. Meanwhile, the same teachers
must, with a straight face, tell their students that the derivative of arccosine is
important. (It goes beyond calculus, in case you were wondering.)
1. Discussing definitions
2. Coming up with counterexamples
3. Being wrong often and admitting it
4. Evaluating many possible consequences of a claim
5. Teasing apart the assumptions underlying an argument
6. Scaling the ladder of abstraction
Discussing definitions
A primary skill that mathematicians develop is fluidity with definitions. There’s
a lot more to this than it sounds at first. What I mean by this is that
mathematicians obsess over the best and most useful meaning of every word
they use. Mathematicians need logical precision because they work in the
realm of things which can be definitively proven or disproven. And if
something can be done “definitively,” it must necessarily be definable.
Let me start with a mathematical example that has some relationship to
real life: the word “random.” Randomness as a concept has
plagued mathematics for much of its recent history because it’s difficult to nail
down a precise definition of what it means for an event to be random.
Statisticians deal with this conundrum by saying that things can’t be random,
but rather processes can be random and you can define the probability of an
event happening as a result of the process. That was a very brief overview, but
it’s the foundation for pretty much all of statistics.
But it’s not the only definition of randomness, because we intuitively want to
say, for example, that flipping a coin and getting 20 heads in a row is “less
random” than getting HTHHTHHHTTTHTHHTHHTH. Mathematicians
looked at the situation and decided the statistical definition of randomness is
not enough, and invented a second definition called Kolmogorov
complexity. Very roughly, an event is called “Kolmogorov random” if the
shortest computer program that produces the event is as long as the
description of the event. (This uses a purely mathematical definition of a
“computer” that was invented before actual computers; think of Alan Turing.)
Colloquially, you can imagine that a Kolmogorov random event requires the
description of the event itself to be written out, in full, in the source code of the
computer program that produces it.
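As a rough illustration of the two definitions, here is a minimal Python sketch. It rests on loud assumptions: Kolmogorov complexity itself cannot be computed, so the sketch substitutes the size of a zlib-compressed string as a crude stand-in, and the helper names `sequence_probability` and `compressed_size` are made up for this example.

```python
import random
import zlib


def sequence_probability(flips: str) -> float:
    """Probability of seeing this exact sequence from a fair coin."""
    return 0.5 ** len(flips)


def compressed_size(flips: str) -> int:
    """Length in bytes of the zlib-compressed sequence (a crude proxy only)."""
    return len(zlib.compress(flips.encode("ascii"), 9))


twenty_heads = "H" * 20
mixed = "HTHHTHHHTTTHTHHTHHTH"

# Statistical view: the two exact outcomes are equally likely.
print(sequence_probability(twenty_heads) == sequence_probability(mixed))

# Compression view, on longer sequences so zlib's fixed overhead matters less.
rng = random.Random(0)  # arbitrary fixed seed so the run is repeatable
long_heads = "H" * 1000
long_mixed = "".join(rng.choice("HT") for _ in range(1000))
print(compressed_size(long_heads), compressed_size(long_mixed))
```

The all-heads sequence compresses far better than the mixed one, which matches the intuition that it is “less random.” The proxy is imperfect in an instructive way: the mixed sequence here comes from a seeded pseudorandom generator, so a short program does produce it, and its true Kolmogorov complexity is low even though zlib cannot see that.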
Everyone has to deal with new definitions, whether it’s a new definition of
marriage and gender, or the legal definition of “intent,” “reasonable,” or
“privacy.” A seasoned mathematician will readily notice that the government
has no useful definition of “religion.” Being able to think critically about
definitions is a foundation of informed discourse.
1. Often, when coming up with a new definition, one has a set of examples
and counterexamples that one wants the definition to adhere to. So
examples and counterexamples help guide one to build good definitions.
2. When encountering an existing definition for the first time, the first thing
every mathematician does is write down examples and counterexamples to
understand it better.
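The two points above can be sketched in a few lines of Python. This is a hypothetical illustration, not something from the discussion above: the word being defined is “prime,” and the names `naive_prime`, `refined_prime`, and `mismatches` are invented for the sketch.

```python
def naive_prime(n: int) -> bool:
    """First attempt: no number strictly between 1 and n divides n."""
    return all(n % d != 0 for d in range(2, n))


def refined_prime(n: int) -> bool:
    """Second attempt: same test, but numbers below 2 are excluded."""
    return n > 1 and naive_prime(n)


# Things the definition should include, and things it should exclude.
examples = [2, 3, 5, 7, 11]
counterexamples = [1, 4, 6, 8, 9]


def mismatches(definition) -> list:
    """Return every test case the candidate definition gets wrong."""
    missed = [n for n in examples if not definition(n)]
    wrongly_included = [n for n in counterexamples if definition(n)]
    return missed + wrongly_included


print(mismatches(naive_prime))    # the naive definition lets 1 through
print(mismatches(refined_prime))  # no mismatches on these cases
```

Passing a handful of cases does not prove a definition is the right one, but a single mismatch is enough to send you back to revise it.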
However, examples and counterexamples go beyond just thinking about
definitions. They help one evaluate and make sense of claims. Anyone who has
studied mathematics knows this pattern well, and it goes by the name of
“conjecture and proof.”
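To make the pattern concrete, here is a small Python sketch of hunting for a counterexample. The conjecture is a classic illustration chosen for this sketch, not one discussed above: that n² + n + 41 is prime for every non-negative integer n. The helper names `is_prime` and `first_counterexample` are hypothetical.

```python
def is_prime(n: int) -> bool:
    """Trial division; fine for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def first_counterexample(claim, candidates):
    """Return the first candidate for which the claim fails, or None."""
    for x in candidates:
        if not claim(x):
            return x
    return None


def conjecture(n: int) -> bool:
    """The claim under test: n^2 + n + 41 is prime."""
    return is_prime(n * n + n + 41)


print(first_counterexample(conjecture, range(100)))  # prints 40
```

The claim survives forty checks in a row (n = 0 through 39) and then fails at n = 40, where the value is 1681 = 41 × 41. A pile of supporting examples is only evidence; one provable counterexample settles the matter.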
As a bad analogy, maybe you conjecture that the Earth is the center of the
universe. Then you try to come up with examples that support the claim: maybe
you make a toy model that shows how the Earth could be the center of the
universe, if the universe were as simple as the toy. Or you could go make some
measurements involving the sun and moon and come up with evidence that the
claim is false, that actually the Earth revolves around the sun. The difference in
mathematics is that the “evidence” against a claim is a counterexample, and it’s
only called such if it’s provable.
“Evidence” in mathematics is often just a temporary placeholder until the truth
is discovered, though for some high profile math problems mathematicians
have found nothing but “evidence,” even after hundreds of years of study.
Being wrong often and admitting it
I witness scenarios like this all the time, but only in the context of
mathematics. The only reason it can happen is that both mathematicians,
regardless of who is actually right, are not only willing to accept they’re wrong,
but eager to radically switch sides when they see the potential for a
flaw in their argument.
Sometimes I will be in a group of four or five people, all discussing a claim, and
I’ll be the only one who disagrees with the majority. If I provide a good
argument, everyone immediately accepts they were wrong without remorse or
bad feelings. More often I’m in the majority, being forced to retrace, revise, and
polish my beliefs.
Having to do this so often (foster doubt, be wrong, admit it, and start over)
distinguishes mathematical discourse even from much-praised scientific
discourse. There’s no p-value hacking or lobbying agenda, and there’s very
little fame outside of the immediate group of people you’re talking to. There’s
just the search for insight and truth. The mathematical habit is putting your
personal pride or embarrassment aside for the sake of insight.
Evaluating many possible consequences of a claim
10. Almost all JFK conspiracy theories must be false, simply because they’re
mutually inconsistent. Once you realize that, and start judging the competing
conspiracy theories by the standards you’d have to judge them by if at most one
could be true, enlightenment may dawn as you find there’s nothing in the way of
just rejecting all of them.
Another:
Indeed, exploring the limits of a claim is the mathematician’s bread and butter.
It’s one of the simplest high-level tools one has for evaluating the validity of a
claim before going through the details of the argument. It can also be
used as a litmus test for deciding which arguments are worthwhile to
understand in detail.
Sometimes, probing the limits of an argument results in an even better and more
elegant theorem that includes the original claim. More often, you simply realize
you were wrong. So this habit is a less formal variation on being wrong often and
coming up with counterexamples.
Teasing apart the assumptions underlying an argument
When you face a situation like this in mathematics, you spend a lot of time
going back to the basics. You ask questions like, “What do these words mean in
this context?” and, “What obvious attempts have already been ruled out, and
why?” More deeply, you’d ask, “Why are these particular open questions
important?” and, “Where do they see this line of inquiry leading?”
I’m not taking a political stance either way, but rather pointing out that, if a
mathematician is in a situation of complete bafflement, teasing apart
underlying assumptions is part of the playbook. A large part of what has been
pointed out as the “liberal media underestimating Trump” comes from not
answering these kinds of questions. Instead they tweet the most misguided
quotes they can find in the hopes of popping Trump supporters’ filter bubbles.
Which, if poll data is to be believed, is not very effective…
Scaling the ladder of abstraction
The last habit is a concept I’m borrowing from Bret Victor. It’s the idea that
when you’re reasoning about a problem, there are many different resolutions at
which you can think (he calls each one a “rung”). In Victor’s example, if you’re designing a
car-driving algorithm, you can study it at the finest resolution, where you are
writing an algorithm and watching it behave in a single execution.
At a higher level, you can control different parameters of the algorithm (and
time) with a slider, elevating one algorithm into a family of algorithms that can
be tuned. And you can further generalize which parameters and behaviors are
tunable to find a way to search through the space of all possible algorithms. As
you go, you’re looking for high-level patterns that can help you achieve your
end goal: designing a great car-driving algorithm back down at the bottom
rung (the finest resolution).
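Here is a toy Python sketch of those rungs. It is not Victor’s car-driving algorithm: the steering rule, the `gain` parameter, and the names `simulate`, `sweep`, and `best_gain` are all assumptions made up for illustration.

```python
def simulate(gain: float, steps: int = 50, start_offset: float = 1.0) -> float:
    """Bottom rung: one run of one algorithm.

    The "car" starts start_offset away from the center line and, on each
    step, steers back by gain times its current offset. Returns the total
    absolute offset accumulated over the run (lower is better).
    """
    offset = start_offset
    total_error = 0.0
    for _ in range(steps):
        offset -= gain * offset
        total_error += abs(offset)
    return total_error


def sweep(gains) -> dict:
    """Next rung: treat gain as a slider and look at the whole family."""
    return {gain: simulate(gain) for gain in gains}


def best_gain(gains) -> float:
    """Higher rung: search the family for the setting with the least error."""
    results = sweep(gains)
    return min(results, key=results.get)


candidate_gains = [0.1, 0.5, 1.0, 1.5, 1.9]
print(sweep(candidate_gains))
print(best_gain(candidate_gains))
```

Each function sits one rung above the previous one: `simulate` watches a single execution, `sweep` turns that execution into a tunable family, and `best_gain` searches the family. Whatever pattern the search reveals gets carried back down to the single run you actually care about.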
You might start at the lowest rung of the ladder, understanding some examples
of a definition to get your bearings, then jump up to the main theorem of the
paper and understand how it’s perceived as a huge improvement over previous
work. You might see they use some technique from the 1950s in a field you’re not
familiar with, but you use that idea as a black box and try to understand the
high-level proof of the main theorem, stepping down one rung. Then maybe
you go to the open problem section to see what work is left to do, and if it
seems enticing enough you can prepare yourself to do that by reading the rest
of their paper in detail.
End notes
I don’t want to imply that developing the habits of highly mathematical people
is unambiguously a good thing. In the real world, many of these habits are a
double-edged sword. Anyone who has gone through an undergraduate math
education has known a person (or been that person) who regularly points out that
statement X is not precisely true in the very special case of Y that nobody
intended to include as part of the discussion in the first place. It takes a lot of
social maturity beyond the bare mathematical discourse to understand when
this is appropriate and when it’s just annoying.
The point is that adhering too religiously to these principles in every situation
will make people think you’re an asshole or make you feel like a buffoon more
than it will help you. It’s knowing when it matters to hold to these principles
that allows one to wield mathematical thinking skills like a chef’s knife, safely
and efficiently slicing up ideas and arguments into their essential forms.