The Hidden Dangers in Algorithmic Decision Making
A robot judge in Futurama was all fun and games, until COMPAS was created.
The quiet revolution of artificial intelligence looks nothing like the way movies
predicted it; AI seeps into our lives not by overtaking them as sentient robots, but by
steadily creeping into areas of decision-making that were previously exclusive to
humans. Because it is so hard to spot, you might not have even noticed how much of
your life is influenced by algorithms.
Picture this — this morning, you woke up, reached for your phone, and checked
Facebook or Instagram, in which you consumed media from a content feed created by
an algorithm. Then you checked your email; only the messages that matter, of course.
Everything negligible was automatically dumped into your spam or promotions folder.
You may have listened to a new playlist on Spotify that was suggested to you based on
the music that you’d previously shown interest in. You then proceeded with your
morning routine before getting in your car and using Google Maps to see how long
your commute would take today.
In the span of half an hour, the content you consumed, the music you listened to, and
your ride to work relied on brain power other than your own — it relied on predictive
modelling from algorithms.
Machine learning is here. Artificial intelligence is here. We are right in the midst of the
information revolution, and while it’s an incredible time and place to be, we must be
wary of the implications that come with it. Having a machine tell you how long
your commute will be, what music you should listen to, and what content you would
likely engage with are all relatively harmless examples. But while you’re scrolling
through your Facebook newsfeed, an algorithm somewhere is determining someone’s
medical diagnosis, their parole eligibility, or their career prospects.
At face value, machine learning algorithms look like a promising solution for
mitigating the wicked problem that is human bias, and all the ways it can negatively
impact the lives of millions of people. The idea is that the algorithms in AI are capable
of being more fair and efficient than humans could ever be. Companies, governments,
organizations, and individuals worldwide are handing off decision-making to algorithms
for many reasons: they are seen as more reliable, easier, less costly, and more
time-efficient. However, there are still some concerns to be aware of.
Bias can be defined broadly as a deviation from some rational decision or norm, and
can be statistical, legal, moral, or functional. We see bias in our everyday lives as well
as on a societal scale. Oftentimes, one perpetuates the other.
For example, on your way home, you may choose to take a route through a “safer”
neighbourhood. But what determines this? Perhaps the area you avoid is home to
people who are lower on the spectrum of socio-economic privilege. While it is not
necessarily the case that the less privileged are more likely to participate in criminal
activities, your bias, whether explicit or implicit, urges you to take a different route.
On a larger scale, these
areas may be more heavily patrolled by police, which, in turn, could lead to a higher
arrest rate than a more affluent neighbourhood, giving off the illusion of a higher
crime rate, regardless of the actual amount of crime that goes on there. This vicious
cycle only seems to reinforce our initial biases.
For example, consider a hiring algorithm built to seek out the candidates most likely to
succeed. Its training dataset may consist of 200 resumes from the top-performing
candidates in the company. The algorithm then seeks out patterns and correlations,
which contribute to its predictive power when it analyzes how likely a new applicant is
to succeed.
We’ll start with how algorithms are trained: a machine learning model learns from
datasets chosen by its programmers. From this training data, it recognizes and
leverages patterns, associations, and correlations in the statistics.
For example, an algorithm can be trained to distinguish between a cat and a dog by
being fed thousands of pictures of different cats and dogs. Classification is the easier
task; applying an algorithm to the kind of judgement call a human makes is far more
multifaceted. Take AI in the criminal justice system, specifically a system that assists
judges in deciding whether or not to grant parole to an offender. Engineers can feed it
thousands of decisions and cases made by humans in the past, but all the AI can learn
from that is the outcome of each decision. It does not possess the sentience to
understand that humans are influenced by countless variables, and that rationality is
not always what drives human decision-making. Computer scientists have coined this
problem ‘selective labelling.’ Human biases are learned through years of societal
integration, cultural accumulation, media influence, and more. All of these learned
biases seep into the algorithms as they learn; just like humans, algorithms don’t start
off biased. But if given a flawed dataset, they may end up that way.
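To make this concrete, below is a minimal sketch of the kind of hiring classifier described earlier, trained only on past outcomes. The features, numbers, and choice of scikit-learn are my own illustrative assumptions, not a description of any real hiring system.

```python
# Minimal sketch of a hiring classifier trained on past outcomes.
# All data, feature names, and numbers are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200  # e.g. 200 historical resumes

# Two features: years of experience and a demographic flag (0 or 1).
experience = rng.normal(5, 2, n)
group = rng.integers(0, 2, n)

# Suppose past human decisions favoured group 0, largely regardless of skill.
hired = ((experience + rng.normal(0, 1, n) > 4) & (group == 0)).astype(int)

X = np.column_stack([experience, group])
model = LogisticRegression(max_iter=1000).fit(X, hired)

# The model faithfully learns the historical preference: the weight on the
# demographic flag is strongly negative, not because the group matters for
# competence, but because the labels encoded a biased decision.
print("coefficients (experience, group):", model.coef_[0])
```

Nothing in the code is malicious; the bias arrives entirely through the labels the model is given to imitate.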
Societal Reflection
Algorithms are taught to make predictions based on the information fed to them and
the patterns they extract from that information. Given that humans show all types of
biases, an algorithm trained on a dataset representative of our environment can learn
those biases as well. In this sense, algorithms are like mirrors: the patterns they detect
reflect the biases that exist in our society, both explicit and implicit.
Take Tay, Microsoft’s original chatbot, for example. Tay was designed to simulate the
tweets of a teenage girl by learning from interactions with Twitter users. However, in
less than 24 hours, the internet saw Tay go from tweeting innocent things like “humans
are super cool” to quite worrisome ones, such as “Hitler was right I hate the jews,”
simply by virtue of the tweets surrounding it. Microsoft removed the tweets, explaining
that Tay had shown no issues in its initial testing phase, which used a training dataset
of filtered, non-offensive tweets. Clearly, that filtering went out the window once Tay
came online. This points to a possible method of bias alleviation: monitoring and
filtering incoming data as algorithms are put into use and engage with the real world.
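A rough sketch of what that kind of filtering might look like is below. The blocklist, function names, and keyword matching are hypothetical simplifications; a production system would rely on a trained content classifier rather than a word list.

```python
# Crude sketch of filtering incoming data before it reaches the training set.
# The blocklist and helper names are placeholders invented for illustration.
BLOCKLIST = {"hate", "nazi"}  # placeholder terms only

def is_acceptable(text: str) -> bool:
    """Reject any message containing a blocked term."""
    return set(text.lower().split()).isdisjoint(BLOCKLIST)

def ingest(stream, training_buffer):
    """Append only messages that pass the filter to the training buffer."""
    for message in stream:
        if is_acceptable(message):
            training_buffer.append(message)

buffer = []
ingest(["humans are super cool", "i hate everyone"], buffer)
print(buffer)  # only the first message makes it into the training data
```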
Word Embedding
Word embedding represents words as numerical vectors, so that words used in similar
contexts end up close together in vector space. However, a problem with word
embedding is that it has the potential to amplify existing gender associations. One
study by Bolukbasi et al. at Boston University explored word embeddings trained on
libraries of natural-language content such as news articles, press releases, and books;
the training period seldom involves many human engineers. Bolukbasi and colleagues
found that the resulting embeddings encoded strong gender stereotypes, placing ‘man’
close to ‘computer programmer’ while placing ‘woman’ close to ‘homemaker.’
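For a sense of how such associations can be probed, here is a small sketch using publicly available GloVe vectors through the gensim library. The model and library are my own choices for illustration, not the setup used in the study; the analogy it runs mirrors the one in the paper’s title.

```python
# Sketch of probing a pretrained embedding for gender associations, loosely in
# the spirit of the Bolukbasi et al. analogy test. Assumes the gensim package;
# the small GloVe model is downloaded on first use.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")

# Analogy: man is to programmer as woman is to ... ?
print(vectors.most_similar(positive=["woman", "programmer"],
                           negative=["man"], topn=5))

# How strongly does each profession lean toward "she" versus "he"?
for word in ["nurse", "engineer", "homemaker", "programmer"]:
    lean = vectors.similarity(word, "she") - vectors.similarity(word, "he")
    print(f"{word}: {lean:+.3f}")
```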
An intuitive fix might be to simply remove gender from the data altogether.
Unfortunately, it’s more complicated than that. This is called the ‘unaware’ approach
to algorithm building. By removing the sensitive attribute, the premise is that gender
will become a negligible factor when it comes to judging job competency. However,
because algorithms are trained to identify patterns within the statistics, the existing
correlations, stereotypes, and inequalities that are so embedded into society emerge
wherever we go; they exist in reality, so they exist in the datasets we train algorithms
on too. Machine learning will pick up on observable features associated with gender
even when gender is not explicitly stated. For example, a hiring classifier may place
weight on the length of one’s military service and associate it with competency or
loyalty; in Israel, however, men typically serve three years while women serve two.
Now you have an attribute that is closely correlated with gender, but having removed
the sensitive information itself, you have also removed the context necessary to make
an objective decision. For this very reason, an unaware algorithm can sometimes be
more biased than its fully informed counterpart.
On the other hand, the ‘aware’ approach does use gender information, and can take
into account the tendency for women to serve shorter military terms. Mitigating these
problems often involves a trade-off between accuracy and fairness; the two cannot
exist perfectly in the same realm. The unaware approach is fairer as a process: it does
not take sensitive attributes into account during training. However, it can lead to a
biased outcome. The aware approach is less fair as a process, since it takes sensitive
classifications and information into account, but it can end up with a more objective
outcome.
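The sketch below illustrates this proxy effect on synthetic data. The numbers, feature names, and model are invented for illustration; the point is that even though gender is never shown to the model, its predicted hiring rates still differ by gender through the service-length proxy.

```python
# Sketch of the 'unaware' approach leaking bias through a proxy feature.
# The numbers are invented: assume men typically serve ~3 years of military
# service and women ~2, and that past hiring rewarded longer service.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 1000
gender = rng.integers(0, 2, n)                      # 0 = man, 1 = woman
service = np.where(gender == 0, 3.0, 2.0) + rng.normal(0, 0.3, n)
skill = rng.normal(0, 1, n)                         # actually job-relevant

# Historical labels reward service length even though only skill should matter.
hired = (skill + 0.8 * service + rng.normal(0, 0.5, n) > 2.4).astype(int)

# 'Unaware' model: gender is dropped, but service length remains as a proxy.
X_unaware = np.column_stack([skill, service])
unaware = LogisticRegression(max_iter=1000).fit(X_unaware, hired)

# Predicted hiring rates still differ by gender, via the proxy alone.
pred = unaware.predict(X_unaware)
print("predicted hire rate, men:  ", pred[gender == 0].mean())
print("predicted hire rate, women:", pred[gender == 1].mean())
```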
Feedback Loops/Self-Perpetuation
Furthermore, machine learning is prone to being stuck in feedback loops, which can
end up perpetuating bias. For example, when machine-based prediction is used in
criminal risk assessment, someone who is black is more likely to be rated as high-risk
than someone who is white. This is largely due to the disparity in recorded criminal
histories between black and white people, which itself reflects human racial bias. And
because the machine has labelled yet another black person as high-risk, this new
addition to the collection of data further tips the scale to be biased against black
defendants. In this case, the system has not only reflected patterns learned from
human bias but has also reinforced its own learning.
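Here is a toy simulation of such a loop. The groups, counts, and update rule are invented purely to show how a model that learns from its own labels locks in an initial disparity rather than correcting it.

```python
# Toy simulation (not any real system): each round, the model labels new
# cases at roughly the rate present in its current data, and those labels
# are appended back into the data it learns from next round. The initial
# disparity between the groups, inherited from biased human records, never
# washes out because the model keeps generating its own evidence.
import numpy as np

rng = np.random.default_rng(2)

# (high-risk labels, total records) per group; group B starts with more
# high-risk labels due to historically heavier policing, not higher risk.
records = {"A": [30, 100], "B": [45, 100]}

for round_number in range(5):
    for group, (high, total) in records.items():
        p_flag = high / total                       # model mirrors its data
        new_flags = int((rng.random(50) < p_flag).sum())
        records[group] = [high + new_flags, total + 50]
    rates = {g: round(h / t, 3) for g, (h, t) in records.items()}
    print(round_number, rates)
```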
Surrogate Objectives
Besides problems within the training data, there are many other ways in which bias
can make its way into an algorithm’s process. Our next exploration concerns the
construct validity of the measures that feed algorithms: does what you’re measuring
actually capture what you need it to? And when it doesn’t measure accurately, what
are the consequences?
Social media algorithms no longer show posts in chronological order; instead, a
machine learning algorithm filters through everything you have ever engaged with.
The goal is to measure engagement: based on your previous interests, the algorithm
then shows you more of the content it believes you are likely to engage with. The
higher the engagement rate on a piece of content, the more likely the algorithm is to
push that content onto other people’s newsfeeds. In a perfect world, this makes sense:
posts that are popular should, in theory, be better content. Otherwise, why would they
perform so well?
Unfortunately, humans are not as discerning as this algorithm needs us to be for it to
work the way it should. The content that consistently performs best is often composed
of fake news, celebrity gossip, political slander, and other things that do nothing for
the betterment of the world. But because the algorithm can’t understand that, echo
chambers form, and the cycle continues.
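In code, this surrogate objective can amount to something as simple as the hypothetical ranking sketch below: the feed orders posts by predicted engagement alone, with no notion of accuracy or value.

```python
# Hypothetical feed ranking driven purely by predicted engagement. Post data
# and scores are made up; nothing here measures accuracy or usefulness.
posts = [
    {"title": "Local council publishes budget report", "predicted_engagement": 0.02},
    {"title": "Celebrity feud erupts on live TV", "predicted_engagement": 0.31},
    {"title": "Outrageous claim about political rival", "predicted_engagement": 0.44},
]

# The surrogate objective in one line: rank by engagement, nothing else.
feed = sorted(posts, key=lambda p: p["predicted_engagement"], reverse=True)
for post in feed:
    print(f'{post["predicted_engagement"]:.2f}  {post["title"]}')
```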
Many of the decisions involved in hiring are also being handed off to AI, in areas such
as resume screening, aptitude analysis, and candidate comparison. Job recruiting is an
extremely time-consuming process and carries high costs for everyone involved, and
even higher costs if a mistake is made. The National Association of Colleges and
Employers estimated the cost of hiring an employee to be around $7,600 at a mid-sized
company of 0–500 employees. By letting an algorithm do the heavy lifting, a company
can devote much of its resources and funds elsewhere, and hopefully still end up with a
successful choice.
However, surrogate objectives become a problem in this process, as many desirable job
traits are very difficult to operationalize. Some of the industry buzzwords these days
include ‘creativity,’ ‘communication,’ and ‘productivity,’ all of which are incredibly hard
to measure. The most common test for measuring creativity is the alternative uses test,
in which one comes up with unconventional uses for common items. Based on this
measure, an employee may be assigned a ‘creativity aptitude’ score, which then is part
of a training dataset that screens prospective employees for the same trait. The
problem is that the alternative uses test captures only one aspect of creativity:
divergent thinking. It neglects all other aspects of creativity, some of which may be
very valuable for company culture. You end up with a staff of creatives who are all
creative in the same
way — ironically boring.
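A screening step built on such a score might boil down to the hypothetical sketch below: one number and one threshold, selecting a single flavour of creativity.

```python
# Hypothetical screening on a single operationalised trait. The score stands
# in for an alternative-uses-test result; names and threshold are invented.
candidates = [
    {"name": "A", "alt_uses_score": 0.90},  # strong divergent thinker
    {"name": "B", "alt_uses_score": 0.40},  # creative in ways the test misses
    {"name": "C", "alt_uses_score": 0.85},
]

# One proxy measure, one threshold: only one flavour of creativity survives.
shortlist = [c for c in candidates if c["alt_uses_score"] >= 0.80]
print([c["name"] for c in shortlist])
```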
Conclusion
Although we’ve illuminated many of the problems that AI models can introduce, there
are a multitude of reasons companies may switch away from a human-centered
decision-making approach. As previously mentioned, despite all of its flaws, artificial
intelligence can still be more objective than humans. Because of this, we see a
continued use of artificial intelligence in decision- and prediction-based tasks. But less
biased is not equivalent to unbiased — what happens when an algorithm makes a
biased decision? How do we decide who should take responsibility? It is not as if we
can punish an algorithm for making a biased prediction (what would we do, erase it?).
Arguably, the best way to keep track of accountability is to keep accurate and detailed
records of the processes of AI decision-making. That is, the processes and data by
which the decisions come to be made need to be transparent, so that if anything should
go wrong, some third-party auditor is able to retrace the steps leading up to the
outcome to locate the source of the problem. Bills and laws have been established to
keep practices transparent for this purpose.
Of course, this method is not without problems of its own. Auditing is not always
feasible for artificial intelligence built on big data (extremely large datasets), nor is it
always applicable to systems engaged in deep learning, which involve both large
datasets and complex networks. Algorithmic autonomy and transparency seem to have
an inverse relationship: as these algorithms become better at ‘learning’ and
adjusting, it becomes more difficult to understand where the biases occur. While
auditing is effective for simpler models, we may need a different way to alleviate bias
for complex algorithms.
Another way of mitigating bias is aimed at the trainers and creators of the AI. By
making them aware of their own prejudices, we have a better chance of keeping those
prejudices out of the algorithms. It’s important to note that human bias exists and is
hard to mitigate, being in part an evolutionary trait, but we are becoming increasingly
aware of the biases our own brains are susceptible to. To conclude, algorithms can be
part of alleviating institutional bias, if we remain educated, aware, smart, and selective.
References
Bright, Peter. “Microsoft Terminates Its Tay AI Chatbot After She Turns Into a Nazi.”
Ars Technica, March 24, 2016.
Courtland, Rachel. “Bias Detectives: The Researchers Striving to Make Algorithms
Fair.” Nature, Springer Nature, June 21, 2018.
Shapiro, Stewart. “The Objectivity of Mathematics.” Synthese, vol. 156, no. 2, 2007,
pp. 337–381.
Bolukbasi, T., Chang, K., Zou, J., Saligrama, V., and Kalai, A. “Man Is to Computer
Programmer as Woman Is to Homemaker? Debiasing Word Embeddings.” Microsoft
Research New England, 2016.