Google NLP: NLP (Natural Language Processing)
• The field of study that focuses on the interactions between human language and
computers is called Natural Language Processing.
• It sits at the intersection of computer science, artificial intelligence, and computational
linguistics.
• “Natural Language Processing is a field that covers computer understanding and manipulation
of human language, and it’s ripe with possibilities for newsgathering,” Anthony
Pesce said in Natural Language Processing in the kitchen.
WHAT IS NATURAL LANGUAGE PROCESSING (NLP)?
• NLP is a way for computers to analyze, understand, and derive meaning from human
language in a smart and useful way.
• By utilizing NLP, developers can organize and structure knowledge to perform tasks such
as automatic summarization, translation, named entity recognition, relationship
extraction, sentiment analysis, speech recognition, and topic segmentation.
• NLP is used to analyze text, allowing machines to understand how humans speak. This
human-computer interaction enables real-world applications like automatic text
summarization, sentiment analysis, topic extraction, named entity recognition, part-of-speech
tagging, relationship extraction, stemming, and more.
• NLP is widely regarded as a difficult problem in computer science. Human language is rarely precise or
plainly spoken.
• To understand human language is to understand not only the words, but the concepts and how
they’re linked together to create meaning. Despite language being one of the easiest things for the
human mind to learn, the ambiguity of language is what makes natural language processing a difficult
problem for computers to master.
• “By analyzing language for its meaning, NLP systems have long filled useful roles, such as correcting
grammar, converting speech to text and automatically translating between languages.”
WHAT CAN DEVELOPERS USE NLP ALGORITHMS FOR?
• NLP algorithms have a variety of uses. Essentially, they allow developers to create
software that understands human language.
• Due to the complicated nature of human language, NLP can be difficult to learn and
implement correctly.
• However, with the knowledge gained from this article, you will be better equipped to use
NLP successfully.
• Some of the projects developers can use NLP algorithms for are:
a. Summarize blocks of text using Summarizer to extract the most important and central ideas while ignoring
irrelevant information.
b. Create a chat bot using Parsey McParseface, a language parsing deep learning model made by Google that
uses Part-of-Speech tagging.
c. Automatically generate keyword tags from content using AutoTag, which leverages LDA, a technique that
discovers topics contained within a body of text.
d. Identify the type of each extracted entity, such as a person, place, or organization, using Named Entity
Recognition.
e. Use Sentiment Analysis to identify the sentiment of a string of text, from very negative to neutral to
very positive.
f. Reduce words to their root, or stem, using PorterStemmer, or break up text into tokens using Tokenizer.
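Item (f) can be illustrated with a minimal sketch using only the Python standard library. This is not the PorterStemmer or Tokenizer algorithm named above: the `tokenize` and `naive_stem` helpers and the `SUFFIXES` list are hypothetical names chosen for illustration, and the stemmer simply strips one common suffix, whereas the real Porter algorithm applies several ordered rule phases.

```python
import re

# Hypothetical suffix list, for illustration only. The real Porter
# algorithm uses several ordered rule phases with extra conditions.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def tokenize(text):
    """Split text into lowercase word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def naive_stem(token):
    """Strip the first matching suffix, keeping a stem of at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


tokens = tokenize("The parser tagged words quickly.")
stems = [naive_stem(t) for t in tokens]
print(tokens)  # ['the', 'parser', 'tagged', 'words', 'quickly']
print(stems)   # ['the', 'parser', 'tagg', 'word', 'quick']
```

Note that "tagged" becomes "tagg" rather than "tag": a naive suffix stripper makes mistakes like this, which is one reason production code would more likely rely on an established stemmer.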
OPEN SOURCE NLP LIBRARIES
• These libraries provide the algorithmic building blocks of NLP in real-world applications. Algorithmia provides
a free API endpoint for many of these algorithms, without ever having to set up or provision servers and
infrastructure.
• Apache OpenNLP: a machine learning toolkit that provides tokenizers, sentence segmentation, part-of-speech
tagging, named entity extraction, chunking, parsing, coreference resolution, and more.
• Natural Language Toolkit (NLTK): a Python library that provides modules for processing text, classifying, tokenizing,
stemming, tagging, parsing, and more.
• Stanford NLP: a suite of NLP tools that provides part-of-speech tagging, named entity recognition, coreference
resolution, sentiment analysis, and more.
• MALLET: a Java package that provides Latent Dirichlet Allocation, document classification, clustering, topic modeling,
information extraction, and more.
A FEW NLP EXAMPLES
• Use Summarizer to automatically summarize a block of text, extracting topic sentences and
ignoring the rest.
• Generate keyword topic tags from a document using LDA (Latent Dirichlet Allocation), which
determines the most relevant words from a document. This algorithm is at the heart of
the Auto-Tag and Auto-Tag URL microservices.
• Sentiment Analysis, based on StanfordNLP, can be used to identify the feeling, opinion, or
belief of a statement, from very negative, to neutral, to very positive. Often, developers will
use an algorithm to identify the sentiment of a term in a sentence, or use sentiment analysis to
analyze social media.
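The sentiment scale described above (very negative, through neutral, to very positive) can be sketched with a simple word-lexicon approach. This is a swapped-in technique for illustration, not the StanfordNLP-based algorithm mentioned above: the `LEXICON` values, the thresholds, and the function names are all assumptions made for this example.

```python
import re

# Hypothetical mini-lexicon, for illustration only. Real systems use
# far larger lexicons or trained models.
LEXICON = {
    "great": 2, "love": 2, "good": 1,
    "bad": -1, "poor": -1, "terrible": -2, "hate": -2,
}


def sentiment_score(text):
    """Average the lexicon scores of the matching words; 0.0 if none match."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [LEXICON[w] for w in words if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def sentiment_label(text):
    """Map the average score onto a five-step scale (thresholds are assumptions)."""
    score = sentiment_score(text)
    if score <= -1.5:
        return "very negative"
    if score < 0:
        return "negative"
    if score == 0:
        return "neutral"
    if score < 1.5:
        return "positive"
    return "very positive"


print(sentiment_label("I love this, it is great"))        # very positive
print(sentiment_label("The service was terrible"))        # very negative
print(sentiment_label("The package arrived on Tuesday"))  # neutral
```

A lexicon scorer like this ignores negation and context ("not good" still scores as positive), which is part of why model-based approaches are typically preferred for analyzing social media text.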