
Media and Artificial Intelligence

Matthew Gentzkow

On March 17, 2014, a magnitude 4.4 earthquake shook southern California. The first story
about the quake on the LA Times’ website—a brief, factual account posted within minutes—
was written entirely by an algorithm.1 Since then, “robot reporters” have produced stories in
major news outlets on topics ranging from minor league baseball games to corporate earnings
announcements.2 Some have speculated that future media will consist largely of content
produced by artificial intelligence (AI).3

While AI and related technologies will indeed have a transformative impact on media markets,
automated production of content—whether news or entertainment—is likely to be a minor
part of this story for the foreseeable future. Unlike industries such as manufacturing and
transportation, where thousands of jobs consist mainly of repetitive tasks that are well within
the capability of current technologies, most of the value in media is in the production of
complex content that depends heavily on judgment, interpretation, creativity, and
communication, where humans continue to dominate algorithms and will do so for many years
to come.

Instead, the major impact of AI has been and will continue to be on the demand side of
media—not the production of content, but the process by which this content is matched to
consumers. Future improvements in AI have the potential to profoundly alter this process for
both good and ill.

The basic economics of media mean demand-side matching plays a uniquely important role.
News stories, social media posts, songs, and movies are all prototypical “experience goods”
whose quality and fit with a consumer’s tastes can only be judged once they have been
consumed. Marginal costs are low, and the nature of demand varies greatly across consumers
and time. Together, these factors mean that the market produces oceans of content of widely
varying quality and appeal that must be sorted and filtered in order to produce social value.
Effective matching, whether by traditional mechanisms such as human editing and well-known
media brands, or by modern algorithms, is what converts this mass into a comprehensible,
entertaining, and informative set of goods. It is a key factor determining the level of trust in media and the extent to which media can be manipulated by governments, advertisers, or other third parties seeking to persuade. It is what has been thoroughly upended by the advent of social media, which puts a decentralized, algorithm-driven process of matching in place of the centralized broadcasting model that has dominated media for centuries. And it is where we need to focus our attention if we are to address the current crisis of media and democracy.

1 Oremus, Will, 2014, "The first news report on the L.A. earthquake was written by a robot," Slate, March 17.
2 "Automated journalism," Wikipedia.
3 See, e.g.: Bruno, Nicola, 2011, "Will machines replace journalists?" Nieman Reports; Carlson, Matt, 2015, "The robotic reporter," Digital Journalism, 3:3, 416-431.

There are three main dimensions along which this matching process can fail. First, quite simply,
consumers may not be able to find what they want. Despite tremendous progress in search and
related technologies, sifting through the mass of content to find the pieces that maximize a
consumer’s utility remains a formidable problem. Second, what consumers want may not be
well aligned with what is best for society. Scholars have long pointed out that individual and
social objectives are likely to diverge in media, as consumers do not fully account for the way
their own decisions to be more or less informed about various issues spill over and affect others
via the political process. Third, actors such as governments and firms may seek to capture
media in order to shape the selection of content consumers see for their own ends.

AI has the potential to dramatically improve the efficiency with which the market matches
content to consumers. However, the potential gains, and also the possible negative
consequences, vary greatly across these three dimensions.

AI and Search
The most obvious gains from AI will come in making it easier for consumers to find the media
content that they want. This “search” problem encompasses not only search technologies
strictly defined, but also recommendations, reviews, and an array of other technologies that
help consumers navigate content.

At first glance, search appears to be a prototypical application in which the gains to AI should
be large. In general, AI will be effective in domains with (i) a tightly specified decision problem;
(ii) measurable, clearly defined objectives; and (iii) a large volume of data on prior cases. Choosing a
piece of content to satisfy a consumer’s immediate demand clearly satisfies (i). Clicks, viewing
time, and other easily captured metrics easily satisfy (ii). And online interactions produce vast
amounts of data sufficient for (iii).

The gains to AI in search and recommendation problems have indeed been substantial. The
“Netflix Challenge”—how to use prior data on individual consumers’ movie ratings to predict
future ratings—was a canonical application of machine learning. Google search, Amazon
product recommendations, and the Facebook news feed all rely heavily on AI technologies.
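As a concrete illustration of the kind of machine learning involved, here is a minimal matrix-factorization sketch in the spirit of the Netflix Challenge: factorize the sparse user-by-movie rating matrix into low-dimensional taste and attribute vectors, and predict unseen ratings from their dot products. The data and hyperparameters below are synthetic and illustrative; the winning entries were vastly more elaborate ensembles.

```python
# Minimal matrix-factorization sketch in the spirit of the Netflix Challenge.
# Synthetic data and illustrative hyperparameters only.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, k = 100, 50, 5

# Synthetic training triples: (user, movie, rating in 1..5).
ratings = [(rng.integers(n_users), rng.integers(n_movies), rng.integers(1, 6))
           for _ in range(2000)]

U = 0.1 * rng.standard_normal((n_users, k))   # latent user tastes
V = 0.1 * rng.standard_normal((n_movies, k))  # latent movie attributes

lr, reg = 0.01, 0.05  # learning rate, L2 penalty
for _ in range(20):
    for u, m, r in ratings:
        err = r - U[u] @ V[m]        # prediction error on this rating
        u_old = U[u].copy()
        U[u] += lr * (err * V[m] - reg * U[u])
        V[m] += lr * (err * u_old - reg * V[m])

# Predicted rating of movie 7 by user 3.
print(U[3] @ V[7])
```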

Yet in another sense, the gains from AI have been surprisingly small. People have been
predicting for decades that the defining feature of digital media will be the personalization of
search and matching—going beyond simply sorting web pages or movies to show those most
relevant to a query, and instead using rich information about a consumer’s prior choices and
characteristics to select content uniquely suited to their individual tastes. Though people have
been forecasting a revolution in the quality of personalization for as long as the internet has
existed, this promise remains largely unrealized.

Google search today involves essentially no personalization.4 The only major exception is the
use of location data to define locally relevant results. Two users at the same location entering
the same query will see the same results in the overwhelming majority of cases. While
personalized recommendations are certainly prominent on sites like Netflix and Amazon, their
quality remains by most accounts surprisingly poor. If I log in to Netflix today, four out of five of
my “personalized recommendations” are for additional episodes of television series I have
already watched. Amazon’s “Recommendations for You” page offers mostly products I have
already purchased, or products very similar to those I have already purchased—suggesting, for example, that since I recently bought an electric toothbrush, I might like to buy another one.
Even on Facebook, where personalization of content and advertising is at the core of the
business, evidence suggests that much of what drives variation in users’ newsfeeds is the set of
items their friends share (combined with non-personalized predictions of the overall popularity
of content) rather than finely tailored individual recommendations.5

What explains this personalization paradox? One possibility is that the predictions of a
revolution in personalization have just been premature, and that AI technology is now reaching
the point where the promise will finally be realized. There is certainly no doubt that progress
will continue, and there will likely be domains where frontier technologies do produce large
gains.

There may be a more fundamental answer to the paradox, however. Consider three different
tasks that a search algorithm might perform. The first is providing an interface through which a
consumer can communicate what they are looking for at a particular moment—e.g., parsing the
text of a Google search like “Indonesia tsunami news” to determine its meaning. The second is
ranking content in terms of its average quality or relevance—e.g., determining that a tsunami
story on WSJ or CNN is on average preferred to a similar sounding story on an obscure political
blog. The third is personalization—e.g., using consumer characteristics or past behavior to

4 Hannak, Aniko, Piotr Sapiezynski, Arash Molavi Kakhki, Balachander Krishnamurthy, David Lazer, Alan
Mislove, and Christo Wilson, 2013, “Measuring Personalization of Web Search,” jin Proceedings of the 22Nd
International Conference on World Wide Web, 527–38, WWW ’13, New York, NY, USA: ACM.
5Bakshy, Eytan, Solomon Messing, and Lada A. Adamic, 2015, “Exposure to Ideologically Diverse News and
Opinion on Facebook,” Science 348 (6239): 1130–32.
determine that one consumer might prefer the WSJ story while another might prefer the CNN
story.
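To make the decomposition concrete, the three tasks can be written as three terms of a single ranking score. The sketch below is purely schematic; the function names, weights, and stubbed personalization term are hypothetical illustrations, not any real engine's code.

```python
# Schematic ranking score combining the three tasks. All names, weights,
# and the personalization stub are hypothetical.
from dataclasses import dataclass

@dataclass
class Item:
    text: str
    avg_quality: float  # task 2: average quality/relevance, common to everyone

def query_match(query: str, item: Item) -> float:
    """Task 1 (interface): crude word overlap between query and item text."""
    q, t = set(query.lower().split()), set(item.text.lower().split())
    return len(q & t) / max(len(q), 1)

def personal_adjust(user_prefs: dict, item: Item) -> float:
    """Task 3 (personalization): a user-specific tilt, stubbed as a lookup."""
    return user_prefs.get(item.text, 0.0)

def score(query: str, item: Item, user_prefs: dict,
          w_match: float = 1.0, w_quality: float = 1.0,
          w_personal: float = 0.2) -> float:
    return (w_match * query_match(query, item)
            + w_quality * item.avg_quality
            + w_personal * personal_adjust(user_prefs, item))

wsj = Item("indonesia tsunami coverage wsj", avg_quality=0.9)
blog = Item("indonesia tsunami hot take obscure blog", avg_quality=0.2)
print(score("indonesia tsunami news", wsj, user_prefs={}))
print(score("indonesia tsunami news", blog, user_prefs={}))
```

In this framing, the personalization paradox is the observation that, in practice, the third term appears to move rankings far less than the first two.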

The relative return to improving each of these tasks depends on the extent to which tastes are
correlated between consumers and within consumers over time. Personalization will be most
important in a world where the key dimension is stable individual differences in preferences—
some consumers always like to read highbrow stories about tsunamis while others always like
to read lowbrow stories, say. The other tasks become more important to the extent that what a given consumer wants at one moment can be quite different from what she wants at another, and that, for a given need, consumers agree to a significant degree about what is most relevant.
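The argument can be summarized in a simple decomposition of the utility that consumer i derives from content item c at time t (the notation is mine, not the essay's sources):

```latex
u_{ict} \;=\; \underbrace{q_{ct}}_{\text{common value of item } c \text{ at time } t}
\;+\; \underbrace{\theta_i' \beta_c}_{\text{stable individual taste}}
\;+\; \underbrace{\varepsilon_{ict}}_{\text{transient idiosyncratic taste}}
```

Query interpretation targets the moment-to-moment variation in the first term, common ranking targets its level, and personalization exploits only the second term. If the variance of stable individual tastes is small relative to the other components, the returns to personalization are correspondingly small.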

A possible explanation for the personalization paradox, then, is that we have tended to overestimate the importance of stable individual differences relative to the other kinds of variation.
Figure 1 shows one example consistent with this hypothesis based on web browsing data from
2008. Each point in the plot is an online news or politics site. The x axis shows the average
utility liberals get from the site and the y axis shows the average utility conservatives get from
the site, where both are inferred from each group’s likelihood of visiting the sites.

A world where stable individual differences were key would be one where this plot sloped
downward—some sites give high utility to conservatives and low utility to liberals while others
do the reverse. In that world, knowing the searcher’s ideology and customizing content to it
would be critical. In fact, the plot is clearly upward sloping with a high positive correlation. It is
true that conservatives like foxnews.com relatively more and that liberals like nytimes.com
relatively more, but this kind of variation is swamped by the fact that everyone likes both of
these sites more than smaller sites and blogs.
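As a back-of-the-envelope version of the exercise behind Figure 1, one could proxy each group's utility from a site by the log of that group's visit share and correlate the two series across sites. The numbers below are invented, and the original paper infers utilities from a structural model of browsing choices rather than this crude proxy.

```python
# Crude proxy for the Figure 1 exercise: log visit shares as "utility,"
# correlated across sites for the two groups. Data are made up.
import numpy as np

sites = ["nytimes.com", "foxnews.com", "cnn.com", "obscureblog.example"]
visit_share_lib = np.array([0.20, 0.05, 0.15, 0.001])  # hypothetical
visit_share_con = np.array([0.10, 0.18, 0.12, 0.002])  # hypothetical

u_lib = np.log(visit_share_lib)
u_con = np.log(visit_share_con)

# Strongly positive if everyone mostly agrees on which sites are worth visiting.
print(np.corrcoef(u_lib, u_con)[0, 1])
```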

There is no question that the quality of search and recommendation systems will continue to
improve dramatically with advances in AI. It may well be, however, that these gains continue to
be more about improved communication with users and overall ranking of content than about
personalization.

AI and Bias
Many of the deepest problems in media today stem not from an inability to give consumers
what they want, but from the fact that what they appear to want is not aligned with what is
good for society. Some may demand celebrity news and puppy videos rather than information
that would make them more informed citizens. Others may prefer misleading partisan content
or outright misinformation rather than more balanced and accurate political news. A
substantial risk in an AI-driven future is that algorithms become ever more expert at catering to
these tastes, with disastrous consequences for society.

Can AI also be part of the solution? It certainly has a role to play. Facebook and others have
devoted significant effort to training algorithms to identify misinformation. Google can in
principle tune its algorithms to weight social objectives as well as the likelihood of clicks, for
example by showing accurate information about the Holocaust rather than Holocaust denial
sites in response to the query “did the Holocaust happen?”6
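Schematically, such tuning amounts to blending an engagement objective with a credibility objective. The toy function below is hypothetical; neither the weights nor the scores correspond to any real system, but it shows how a modest weight on accuracy can flip a ranking.

```python
# Toy blended objective: engagement (predicted clicks) vs. credibility.
# Purely illustrative; no real system's weights or scores.
def blended_score(p_click: float, credibility: float, alpha: float) -> float:
    """alpha = 0 ranks on clicks alone; alpha = 1 on credibility alone."""
    return (1 - alpha) * p_click + alpha * credibility

for alpha in (0.0, 0.4):
    denial = blended_score(p_click=0.9, credibility=0.05, alpha=alpha)
    accurate = blended_score(p_click=0.6, credibility=0.95, alpha=alpha)
    print(alpha, round(denial, 2), round(accurate, 2))
# At alpha = 0 the engaging denial site ranks first (0.9 > 0.6);
# at alpha = 0.4 the accurate source overtakes it (0.74 > 0.56).
```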

If we return to the criteria for what makes problems amenable to AI solutions, however, it is
clear that we should expect AI to be far less effective in addressing bias than in improving
search. Social objectives such as promoting truth and healthy democracy are much harder to
define precisely than giving consumers what they want, and there are few cases in which they
are easily quantifiable. Training data for search is generated automatically by consumer clicks;
training data for identifying misinformation, in contrast, must typically be coded by human fact
checkers. For other forms of bias, there are essentially no training cases because we lack hard
measures of the broader social impact of most content.

Consistent with this prediction, most efforts to fight bias and misinformation to date have
relied primarily on human judgment. While Facebook’s efforts to fight misinformation certainly
involve AI, most of the effective strategies have been things like downranking sites that
consumers report trusting less, adding “article context” information with additional detail
about sources, and filtering articles based on fact checking. These all involve far more human
judgment than AI. Similarly, Google’s adjustments to cases like Holocaust denial have relied to a
significant degree on changing instructions to human raters rather than changing the objectives
of AI algorithms.

We can hope that future developments in AI will make it more effective in aligning media
content with social good. For the near future, however, most progress is likely to continue to
come from human intelligence as curator, editor, and counterweight to the forces pulling more
and more strongly toward satisfying short-run consumer demand.

AI and Capture
Probably the oldest, and possibly the most serious, concern is that media may be captured by
third parties that shape or filter content to serve their own objectives. A leading case today is
the massive censorship apparatus of the Chinese government. Other autocratic governments
engage in similar activity on a smaller scale, and even democratic governments frequently intervene to try to suppress content they find objectionable. Governments try to affect not only what their own citizens see but also what is seen abroad, as in the case of Russian interference in US and European elections. The advertising that fuels most digital markets is itself a form of third-party intervention.

6 Sullivan, Danny, 2016, "Google's top results for 'did the Holocaust happen' now expunged of denial sites," searchengineland.com.

How is AI likely to change the risk of media capture? Here, again, AI has the potential both to dramatically worsen the dangers and to be a key part of the solution.

On one hand, the Chinese government can use AI to more effectively screen objectionable
content, monitor citizens to identify dissidents and impending protests, and target propaganda
messages to maximize their effectiveness. Russian intelligence operatives can use AI to
optimize their foreign influence campaigns, testing large volumes of content to determine what
works best. Commercial advertisers can similarly use these tools to optimize and target
content.

On the other hand, AI may also provide a robust defense against such manipulation. Consumers
in autocratic countries can use AI to detect propaganda images and other content that has
been manipulated from its original source. Better search technologies from international
sources can help consumers evade domestic controls. Facebook and other social media
companies can use AI to identify foreign interference in elections.

Again, the key question is to what extent the relevant objectives can be defined and measured
at large scale. Identifying social media posts that mention sensitive topics such as Tibet or that
comment critically on the government should be right in the sweet spot in this respect, given
the ability of modern natural language processing tools to disambiguate meaning. Online
surveillance to identify dissidents or impending protests is also well suited to AI, though in
these cases the number of past examples that can be used for training is much smaller in scale.
Optimizing for actual persuasive impact is a much harder task. While it is easy to observe the
reach of propaganda or advertising, determining its effectiveness is much harder, particularly
when the goal is to affect a long-run outcome like support for a regime rather than a short-run
outcome like internet purchases.
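To see why topic screening sits in the sweet spot, note that a generic bag-of-words classifier trained on a handful of hand-labeled examples can already flag posts that mention sensitive topics. The posts and labels below are invented for illustration.

```python
# Generic text classifier for flagging posts on sensitive topics.
# Toy posts and hand-coded labels, invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

posts = ["great weather in Beijing today",
         "rally planned outside the provincial office tomorrow",
         "new restaurant opened downtown",
         "join the protest against the demolition on Saturday"]
labels = [0, 1, 0, 1]  # 1 = mentions a sensitive topic

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(posts, labels)
print(clf.predict(["protest gathering at the square"]))
```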

Some of the most relevant research on this problem to date comes from work by Bei Qin, David
Strömberg, and Yanhui Wu on the content of Chinese social media.7 They show, on one hand,
that Chinese social media is actually full of government criticism and discussion of sensitive
topics, suggesting either that the regime prefers not to suppress these topics or that their
technology does not yet allow them to do so comprehensively. (Which explanation is correct
has important implications for the way we should expect censorship to evolve with better AI.)

At the same time, these authors show that machine learning applied to social media provides a potentially powerful surveillance tool, with even simple algorithms able to predict the occurrence of future protests or unrest with high fidelity.

7 Qin, Bei, David Strömberg, and Yanhui Wu, 2017, "Why does China allow freer social media? Protests versus surveillance and propaganda," Journal of Economic Perspectives, 31(1).
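A stylized version of that prediction exercise, using synthetic data in place of the paper's much richer features and a deliberately simple model:

```python
# Stylized surveillance sketch: predict whether a locality sees unrest from
# simple counts of unrest-related posts. Synthetic data; the actual
# specification in Qin, Strömberg, and Wu differs.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
post_counts = rng.poisson(5, size=500)  # unrest-related posts per week
unrest = (rng.random(500) < 1 / (1 + np.exp(3 - 0.5 * post_counts))).astype(int)

model = LogisticRegression().fit(post_counts.reshape(-1, 1), unrest)
print(model.score(post_counts.reshape(-1, 1), unrest))  # in-sample accuracy
```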

Conclusion
There is no question that AI will have profound impacts on media markets. While automation of
production may play some role, the unique properties of media goods mean the more
important effects are likely to occur on the demand side. Here, there is great potential for social
good, as AI can make it easier for consumers to navigate the bewildering mass of online content
through search and personalized recommendations, and to identify cases where third parties
are attempting to manipulate them. There is also cause for concern, as AI may tilt content more
heavily toward consumer demand in domains where this is at odds with social good, and AI
tools may be used to more effectively persuade and deceive.

Figure 1: Average utility of news sites for conservatives and liberals

Source: Matthew Gentzkow and Jesse M. Shapiro, 2011, “Ideological segregation online and
offline,” Quarterly Journal of Economics, 126(4), Model Appendix.
