Machine Learning Twitter

This document discusses Twitter's implementation of machine learning across several areas of its platform and business. It describes how Twitter uses machine learning to improve user experience by personalizing timelines and photos, protect online discussions by detecting hate speech, and help partners identify security flaws. The key applications of machine learning at Twitter include timeline ranking, photo cropping, hate speech detection, and providing data to partners for analysis. Twitter's internal machine learning team, Cortex, works to streamline these processes and innovations.

MACHINE LEARNING IMPLEMENTATION BY TWITTER

By
Reshma Menon
2027153
INTRODUCTION

Clayton Christensen coined the term ‘disruptive technologies’ and introduced the concept in
the 1995 article “Disruptive Technologies: Catching the Wave,” which he co-authored with
Joseph Bower. The article is aimed at management executives who decide on funding and
purchasing strategies in their companies.

Disruptive technology can be described as a technology that changes the basis of market
competition by changing the performance metrics or standards along which firms
compete (Danneels, 2004). Innovative change in the economy is referred to as ‘creative
destruction’, a term that also covers the new circumstances such changes bring to the
economic and political landscape, wherein entire businesses and industries become obsolete to
make way for new enterprises and models.

Machine Learning is one such disruptive technology. It is the field of study that develops a
computer’s capability to learn without being explicitly programmed to perform a task
(JavaTPoint, 2017). It includes the study of computer algorithms that automatically improve
the accuracy of their predictions through experience. Because it learns from experience, much
as humans do, it is seen as a subset of Artificial Intelligence (AI).

Hence, while AI is the broader study of the creation of intelligent machines that simulate
human-like cognitive capabilities, Machine Learning is an application of AI that enables
computer systems to improve through experiential learning, essentially
building the knowledge base necessary to make informed decisions.

The two core methods of Machine Learning are supervised and unsupervised learning.
Supervised learning trains a model on labelled examples, whereas unsupervised learning looks
for structure in unlabelled data.

Neural Networks: A neural network is a Machine Learning model inspired by the human
brain, and it can be used in both supervised and unsupervised learning. It is an interconnected
web of nodes wherein each node is responsible for a simple computation, similar to the neurons
present in a human brain (Goyal, 2020).
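
As a minimal illustration of the "simple computation" performed by each node, the sketch below computes a weighted sum of a node's inputs and passes it through a sigmoid activation, then chains a few nodes into a tiny network. The function names, weights and inputs are hypothetical values chosen for illustration; they do not come from any model used by Twitter.

```python
import math

def neuron(inputs, weights, bias):
    """One node: weighted sum of its inputs passed through a sigmoid activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weight_rows, biases):
    """A layer is a group of nodes that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical weights: two inputs -> two hidden nodes -> one output node.
hidden = layer([0.5, -1.0], [[1.0, 2.0], [-1.0, 0.5]], [0.0, 0.1])
output = neuron(hidden, [1.5, -2.0], 0.2)
print(hidden, output)
```

In a trained network the weights and biases are not set by hand as they are here; they are adjusted automatically from data.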

Deep Learning: Deep Learning can be considered an evolved version of machine learning
that uses multi-layered neural networks to continually analyse data, enabling machines to make
informed and accurate decisions without human intervention. It uses a logical structure which
is similar to the decision-making process of a human (Grossfeld, 2020).
TWITTER

Twitter is an American company, initially launched as a microblog and later expanded into a
social networking service where users can post and interact with messages called ‘tweets’. It
is also called the “SMS of the internet”.

Twitter’s business model is similar to that of other social networking platforms:

• Users are required to create a profile
• Users can post tweets of up to 280 characters
• Unregistered users can only read tweets; they cannot post, like or share tweets
• Users can post using the desktop site, mobile app or even SMS
• Users can choose the user accounts they would like to follow

Around 2017, Twitter shifted its focus to video content and content creators, since videos
have higher rates of engagement than text and banner advertisements. To incentivise and
encourage video content creators, the company shares a portion of its revenue with them:
creators receive 70% of this revenue, and the remainder goes to Twitter (Das, 2017).

The company offers the following to its stakeholders:

• User benefits – Users share content across the globe, enjoy real-time content and
participate in conversations with other users through tweets
• Advertiser benefits – Global reach, unique advertisement formats and real-time
engagement with the target audience
• Data Partner benefits – Access, parsing and analysing of data to generate insights

Twitter generates revenue primarily from Advertising (almost 80%) and Data Licensing.

A company or individual can advertise either by:

• Promoting a tweet so that it appears in users’ timelines
• Promoting an entire account
• Promoting a specific trend, for example, #SwachBharat or #Titanic

Data licensing is where Twitter sells its public data, which the company calls ‘Firehose’
and which amounts to approximately 500 million tweets per day, to different companies.
Companies can analyse this data to understand trends in consumer preferences, potential
niche areas of the market and the general behaviour of the target audience, and tailor their
products accordingly.

IMPLEMENTATION OF MACHINE LEARNING

Twitter implemented machine learning (specifically, deep learning) for the following reasons:

o Protecting the online community:

In a world where free speech is considered a birthright, it is inevitable that people will
disagree on opinions and thoughts. Sometimes, such disagreements get out of hand
and escalate into full-fledged feuds.
Twitter feuds are a common phenomenon, wherein users – movie celebrities,
politicians, businessmen and the general public – find comfort behind their screens
to voice their thoughts and pass judgement on issues and other people. Often,
such conversations take an ugly turn and are coloured with racist, unpatriotic
and terroristic tones, resulting in an aggravated audience.
o Better user experience:
As mentioned earlier, the company generates a majority of its revenue from
advertisements. However, promoting an advertisement of a retirement plan to a
teenager may not be useful and will hamper the user experience by cluttering their
timeline with irrelevant promotional activity.
Hence, Twitter uses ML to analyse Big Data efficiently and display promotions and
advertisements according to the user’s tastes and preferences, the accounts they
follow, the hashtags they follow or post on their timeline and the company websites
that the user accesses via Twitter.
As the site uses cookies to collect the above data, all users are required to read and agree to
a privacy and data licensing agreement before creating a Twitter account.

The Twitter ML Platform consists of the ML tools and services provided by Twitter Cortex
– an AI-driven internal core team responsible for facilitating machine learning innovations
within the company. Cortex aims at streamlining complex ML processes, allowing engineers
to focus on other functions such as modelling, experimentation and user experience.

The various uses and implementations of Machine Learning by the company are:
1. Timeline Ranking:
Twitter introduced the setting “Show me the best Tweets first”, which shows users the
Tweets that they are most likely to care about by displaying them at the top of their
timelines.
The system requests feedback to check whether the ranking was correct and uses the
responses to learn and improve future timelines. The system analyses the tweet itself
(its recency, number of likes and retweets), the tweet’s author (does the user follow the
author, or have they previously engaged with them?) and the user’s past preferences.
This has significantly helped users by allowing them to essentially customise and
curate their account feed, and it ensures the relevance of the content they see
(Koumchatzky & Andryeyev, 2017).
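
The three groups of signals described above (the tweet itself, its author and the user's past behaviour) can be illustrated with a small scoring sketch. This is only an assumption-laden toy: the field names, the formula and the hand-picked weights are hypothetical, whereas the production system described by Koumchatzky & Andryeyev learns its ranking from data with deep learning rather than using fixed weights.

```python
import math
from dataclasses import dataclass

@dataclass
class Tweet:
    text: str
    age_hours: float                    # recency of the tweet
    likes: int
    retweets: int
    follows_author: bool                # does the user follow the author?
    past_engagements_with_author: int   # user's earlier likes/replies to this author

def relevance_score(t: Tweet) -> float:
    """Toy relevance score; the weights 2.0, 0.5 and 1.0 are illustrative assumptions."""
    recency = math.exp(-t.age_hours / 24.0)              # newer tweets score higher
    popularity = math.log1p(t.likes + 2 * t.retweets)    # dampened engagement counts
    affinity = (1.0 if t.follows_author else 0.0) + math.log1p(t.past_engagements_with_author)
    return 2.0 * recency + 0.5 * popularity + 1.0 * affinity

def rank_timeline(tweets):
    """Return tweets ordered from most to least relevant."""
    return sorted(tweets, key=relevance_score, reverse=True)

timeline = rank_timeline([
    Tweet("old tweet from a stranger", 48.0, 10, 2, False, 0),
    Tweet("fresh tweet from a friend", 1.0, 10, 2, True, 5),
])
print([t.text for t in timeline])
```

In a learned system, the feedback the user gives on the ranking would be used to adjust such weights automatically instead of fixing them in code.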

2. Make Photos More Engaging:

Twitter uses facial recognition and machine learning to crop photos to show the most
interesting and engaging parts of the picture.
Cortex initially focused on identifying which features of a picture are salient – the team
gathered data from eye-tracking studies, which record
the part of a picture that people view first. Based on this data, the team built a neural
network model. However, given the massive user base, the model had to be reduced
in size in order to reduce the load time on the server.

o The team engaged in knowledge distillation, wherein they trained a smaller network to
imitate the larger model. An ensemble of large networks is used to generate predictions on a
set of images, and then the predictions, along with third-party saliency data, are used to
train a smaller and faster network.

[Figure not reproduced. Source: Twitter]

o The team also performed pruning to proactively remove features that were unhelpful or
irrelevant to the performance of the neural network and incurred huge costs for computation
(Dar, 2018).
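
The two size-reduction steps above can be sketched in a few lines. The sketch is a toy under stated assumptions, not Twitter's implementation: the "teachers" are three hypothetical hand-written models standing in for the large ensemble, the "images" are random three-number feature vectors, the student is a plain linear model, and pruning is done by the simple magnitude criterion (zeroing small weights), which is swapped in here because the source does not say which pruning criterion the team used.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical "large" teacher ensemble: (weights, bias) per teacher.
TEACHERS = [([2.0, -1.0, 0.5], 0.1), ([1.5, -0.5, 0.0], -0.2), ([2.5, -1.5, 1.0], 0.0)]

def teacher_prediction(x):
    """Average the ensemble's outputs; these soft predictions become training targets."""
    preds = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b) for ws, b in TEACHERS]
    return sum(preds) / len(preds)

def student_prediction(weights, bias, x):
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def mse(weights, bias, xs, targets):
    errs = [(student_prediction(weights, bias, x) - t) ** 2 for x, t in zip(xs, targets)]
    return sum(errs) / len(errs)

def distill(xs, targets, steps=500, lr=0.1):
    """Train the small student by gradient descent to imitate the teacher's outputs."""
    weights, bias, n = [0.0] * len(xs[0]), 0.0, len(xs)
    for _ in range(steps):
        errors = [student_prediction(weights, bias, x) - t for x, t in zip(xs, targets)]
        grads = [2.0 / n * sum(e * x[j] for e, x in zip(errors, xs)) for j in range(len(weights))]
        weights = [w - lr * g for w, g in zip(weights, grads)]
        bias -= lr * 2.0 / n * sum(errors)
    return weights, bias

def prune(weights, threshold):
    """Magnitude pruning: drop (zero out) weights that contribute very little."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]

rng = random.Random(0)
images = [[rng.random() for _ in range(3)] for _ in range(200)]  # stand-in image features
soft_targets = [teacher_prediction(x) for x in images]
initial_loss = mse([0.0] * 3, 0.0, images, soft_targets)
student_w, student_b = distill(images, soft_targets)
final_loss = mse(student_w, student_b, images, soft_targets)
pruned_w = prune(student_w, threshold=0.05)
print(initial_loss, final_loss, pruned_w)
```

The point of the design is that the student never sees the original labels: it only learns to reproduce what the larger ensemble predicts, which is what allows a much smaller model to be served.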
3. Detecting Hate Speech:
Twitter performs sentiment analysis to detect abusive or hateful speech. Sentiment analysis
is a technique used for classifying texts and discerning opinions and emotions in text.
There are three different approaches to performing the task:
o Lexicon-based – which uses a pre-defined lexicon library to check the occurrence of
words in the text under review
o ML-based – which uses text classifiers such as logistic regression
o Deep learning techniques – which learn complex features using neural networks

The system continually parses and analyses millions of tweets, flags potential hate speech
and trends on the platform, and provides an easy-to-read analysis to the internal teams in
the form of a dashboard.
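
The lexicon-based approach listed above is the simplest of the three to sketch. In the example below, the word list, its weights and the flagging threshold are hypothetical placeholders chosen for illustration; a real lexicon would be far larger and curated, and nothing here reflects the lexicon or thresholds Twitter actually uses.

```python
import re

# Hypothetical lexicon: word -> severity weight (illustrative only).
OFFENSIVE_LEXICON = {"hate": 2, "stupid": 1, "idiot": 2, "disgusting": 1}

def lexicon_score(text):
    """Sum the weights of all lexicon words that occur in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(OFFENSIVE_LEXICON.get(word, 0) for word in words)

def flag_tweet(text, threshold=2):
    """Flag a tweet for review when its lexicon score reaches the threshold."""
    return lexicon_score(text) >= threshold

for tweet in ["I hate this stupid idea", "Lovely weather today"]:
    print(tweet, "->", lexicon_score(tweet), flag_tweet(tweet))
```

A lexicon cannot account for context, sarcasm or quoted speech, which is one reason the ML-based and deep learning approaches are used alongside it.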

4. Spot Critical Security Flaws:

Data Partners that purchase public data from Twitter include companies that analyse
vulnerability information, which tells system administrators which bugs need to be
patched or resolved.
For example, researchers at Ohio State University, in association with FireEye, a
security company, described a new system that reads millions of tweets for
mentions of software security vulnerabilities and uses algorithms trained with machine
learning to assess the severity of each threat based on how it is described.
The research found that Twitter data can not only predict the majority of
security flaws that appear days later on the US National Vulnerability
Database, but that Natural Language Processing can also be used to
approximately predict which of these vulnerabilities will be assigned a ‘high’ or
‘critical’ severity rating, with greater than 80% accuracy (Greenberg, 2019).
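
The idea of judging severity from how a vulnerability is described can be illustrated with a very small text classifier. The source does not describe the researchers' actual model, so the sketch below swaps in a multinomial Naive Bayes classifier with add-one smoothing as a stand-in; the four training phrases and their labels are invented purely for illustration and are not real tweets or real research data.

```python
import math
from collections import Counter

# Invented toy training data (NOT real tweets): (description, severity label).
TRAINING = [
    ("remote code execution exploit in the wild", "high"),
    ("critical remote exploit allows full takeover", "high"),
    ("minor display glitch in settings page", "low"),
    ("low impact typo in error message", "low"),
]

def tokenize(text):
    return text.lower().split()

def train(examples):
    """Count words per label and documents per label."""
    word_counts, doc_counts = {}, Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the label with the highest log-probability under Naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_logp = None, -math.inf
    for label, counts in word_counts.items():
        logp = math.log(doc_counts[label] / total_docs)   # prior for this label
        total_words = sum(counts.values())
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero the probability.
            logp += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

model = train(TRAINING)
print(classify("remote code execution reported", model))
print(classify("minor typo in message", model))
```

A classifier of this kind is only as good as its labelled examples, so a realistic system would need a large set of tweets matched against the severity ratings later published for the same vulnerabilities.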

IMPLICATIONS FOR MANAGERS

1. Efficient use of resources:

As Machine Learning is trained to learn and tackle complex processes that require
massive data parsing and analysis, it frees engineers and other organisational
staff to focus on other aspects of the site, such as data modelling, experimenting with
prototypes of feature upgrades and bug fixes, and enhancing the user
experience based on feedback.

2. Reduction in costs incurred:

Optimal utilisation of resources significantly reduces the costs incurred by
the company. This improves the operating profit of the firm and supports
sustainable profit growth.

3. Better user experience:

Allowing users to curate their Twitter timeline gives them control over
the feed they see in their account. This allows for an enhanced user experience
and increased usage of the platform by consumers.
Also, de-cluttering the feed and providing relevant advertisements and promotions helps
to better familiarise the target audience with the product.

4. Opportunity to utilise market data:

Companies partnering with Twitter for public data can analyse market trends and
better understand the target audience in terms of their tastes and preferences, and the
features of their product that are well received and appreciated by consumers.
Several companies have a Twitter account where they address the grievances of their
consumers and receive real-time feedback on their products.
This helps Twitter to retain partner clients and strengthen client relationships for
future endeavours.

5. Employee Morale and Satisfaction:

As employees are free to perform functions that add to their knowledge and
skills and can delegate mundane and routine tasks to the system, they experience an
uplift in their overall morale and satisfaction levels.
Employees have reasonable working hours and perform tasks on a project basis, which
allows them to choose the type of work that matches their capabilities and is to their
liking.
This reduces overall labour turnover and increases employee retention within the
company.
SOCIAL AND ETHICAL ISSUES

1. Violation of privacy laws:

Sharing of user data raises data privacy issues. Although the company requires
users to read and accept an agreement which allows it to sell data to
partner clients for analysis, sharing individual data that users did not intend to be
sold to third parties may still compromise their privacy.

2. Platform for influencing opinions:

Several public figures such as politicians, movie celebrities and social media
influencers tend to use Twitter for lobbying their personal agendas and influencing
their followers to agree with their opinions or ideas.
In the name of a movement, users tend to turn a conversation into a feud that may
spill over into the real world as a racist or culturally offensive movement, resulting
in chaos and discord in the community.

3. Security threats to users:

Often, links meant to be promotional turn out to be clickbait that redirects the user to
an unknown or unsecured site, where the user unintentionally downloads malware.
Malware refers to malicious software and can be a computer virus, Trojan or spyware
that aims at deceiving the user and infecting the system to either destroy or retrieve
private data. Also, user accounts can be hacked by individuals with malicious intent,
who can then post confidential data, fake news or other information to wrongly influence
the followers of the account.

4. Disseminating incorrect information:

There are several incidents on Twitter where accounts with similar names post fake news and
confuse the public. Recently, the Finance Ministry clarified that a reported proposal to
reduce pensions by 20% was false and that no such proposal was being considered by the
government. However, the news did create significant panic among the public.
Mitigating Risks:
• Twitter has introduced new labels and warning messages to provide additional context
and information on specific tweets that contain disputed or misleading information
related to COVID-19. The company has also introduced labels for tweets that contain
synthetic and manipulated media.
• The company can apply stricter controls such as firewalls to reduce the chances of
user accounts being hacked. Twitter can use two-step verification, which involves a
password and a code or OTP sent to the user via SMS.
• The company can provide a monthly statistics report to users displaying the type and
amount of data shared with partner clients for better transparency on the usage and
sharing of individual data.
• The company currently uses machine learning to detect hate speech in tweets, and it
flags and even shuts down user accounts that indulge in inappropriate behaviour on
the social networking platform.

Conclusion:

Artificial intelligence and Machine learning are exciting and promising concepts in the world
of data analysis and process automation. The adoption of such concepts not only by
technology firms but also by companies in the service and manufacturing industries indicates
the myriad possibilities for applying human-like intelligence in functions across
industries and markets.

We are in an era where technologies once hypothesised in science fiction novels and movies
are now becoming a reality. While technological advancements are intended for the
betterment of society, it is important that we move towards this revolution with caution and
perseverance.
References
Danneels, E. (2004). Disruptive Technology Reconsidered: A Critique and Research Agenda. Journal
of Product Innovation Management, 246-258.
Dar, P. (2018, January 27). Twitter is using Machine Learning to Make Photos More Engaging.
Retrieved from Analytics Vidhya:
https://www.analyticsvidhya.com/blog/2018/01/twitter-using-machine-learning-make-photos-engaging/
Das, S. (2017, July 21). How Does Twitter Make Money? Twitter Business Model. Retrieved from
feedough:
https://www.feedough.com/how-does-twitter-make-money/#:~:text=Twitter%20business%20model%20is%20similar,unregistered%20can%20only%20read%20them.
Goyal, K. (2020, February 13). Machine Learning vs Neural Networks. Retrieved from UpGrad:
https://www.upgrad.com/blog/machine-learning-vs-neural-networks/#:~:text=Machine%20Learning%20uses%20advanced%20algorithms,modelling%20using%20graphs%20of%20neurons.
Greenberg, A. (2019, July 3). Machine Learning Can Use Tweets to Spot Critical Security Flaws.
Retrieved from Wired:
https://www.wired.com/story/machine-learning-tweets-critical-security-flaws/
Grossfeld, B. (2020, January 23). Deep learning vs machine learning: a simple way to understand the
difference. Retrieved from Zendesk:
https://www.zendesk.com/blog/machine-learning-and-deep-learning/
JavaTPoint. (2017). Difference between Artificial intelligence and Machine learning. Retrieved from
javaTpoint:
https://www.javatpoint.com/difference-between-artificial-intelligence-and-machine-learning#:~:text=On%20a%20broad%20level%2C%20we,data%20without%20being%20programmed%20explicitly.
Koumchatzky, N., & Andryeyev, A. (2017, May 9). Using Deep Learning at Scale in Twitter’s
Timelines. Retrieved from Blog.Twitter:
https://blog.twitter.com/engineering/en_us/topics/insights/2017/using-deep-learning-at-scale-in-twitters-timelines.html