Artificial Intelligence - Natural Language Processing



What is Natural Language Processing?

Natural Language Processing (NLP)refers to AI method of communicating with an intelligent systems using a natural language such as English. Processing of natural language is required when you want an intelligent system like robot to perform as per your instructions, when you want to hear decision from a dialogue based clinical expert system, etc.

The field of NLP involves making computers to perform useful tasks with the natural languages humans use. The input and output of an NLP system can be speech or a written text.

Key Terms in NLP

Some of the key terms and concepts in natural language processing are listed below −

  • Phonology − It is study of organizing sound systematically.
  • Morphology − It is a study of construction of words from primitive meaningful units.
  • Morpheme − It is primitive unit of meaning in a language.
  • Syntax − It refers to arranging words to make a sentence. It also involves determining the structural role of words in the sentence and in phrases.
  • Semantics − It is concerned with the meaning of words and how to combine words into meaningful phrases and sentences.
  • Pragmatics − It deals with using and understanding sentences in different situations and how the interpretation of the sentence is affected.
  • Discourse − It deals with how the immediately preceding sentence can affect the interpretation of the next sentence.
  • World Knowledge − It includes the general knowledge about the world.

Techniques in NLP

Techniques in NLP are methods and algorithms used to process, analyze, and understand human language and data. Some of the common NLP techniques are −

Techniques in NLP
  • Tokenization − It is a technique in NLP that involves splitting a sentence or phrase into smaller units known as tokens.
  • Part-to-Speech Tagging − This technique in NLP involves the process of identifying and labeling works in a sentence based in their part of speech (noun, verb, adjective).
  • Named Entity Recognition (NER) − This technique in NLP is used to identify named entities in the text, such as people, organizations, locations, dates, and more.
  • Semantic Analysis − This technique in NLP determines the sentiment expressed in a piece of text.

Steps in NLP

For the better understanding, analysis of written and spoken language effectively the following 5 NLP steps are followed

NLP Steps
  • Lexical Analysis − It involves identifying and analyzing the structure of words. Lexicon of a language means the collection of words and phrases in a language. Lexical analysis is dividing the whole chunk of text into paragraphs, sentences, and words.
  • Syntactic Analysis (Parsing) − It involves analysis of words in the sentence for grammar and arranging words in a manner that shows the relationship among the words. The sentence such as “The school goes to boy” is rejected by English syntactic analyzer.
  • Semantic Analysis − It draws the exact meaning or the dictionary meaning from the text. The text is checked for meaningfulness. It is done by mapping syntactic structures and objects in the task domain. The semantic analyzer disregards sentence such as “hot ice-cream”.
  • Discourse Integration − The meaning of any sentence depends upon the meaning of the sentence just before it. In addition, it also brings about the meaning of immediately succeeding sentence.
  • Pragmatic Analysis − During this, what was said is re-interpreted on what it actually meant. It involves deriving those aspects of language which require real world knowledge.

Components of NLP

There are two components of NLP as given −

Natural Language Understanding (NLU)

NLU helps machine to understand and analyse human language by extracting metadata from content which includes concepts, entities, keywords, emotion, relations, and semantic roles. Understanding involves the following tasks −

  • Mapping the given input in natural language into useful representations.
  • Analyzing different aspects of the language.

Natural Language Generation (NLG)

It is the process of producing meaningful phrases and sentences in the form of natural language from some internal representation.

It involves −

  • Text planning − It includes retrieving the relevant content from knowledge base.
  • Sentence planning − It includes choosing required words, forming meaningful phrases, setting tone of the sentence.
  • Text Realization − It is mapping sentence plan into sentence structure.

Challenges in NLP

Natural Language Processing often faces various challenges due to the complexity and diversity of human language. The most common challenge would be ambiguity, below the different levels of ambiguity −

  • Lexical Ambiguity − It is at very primitive level such as word-level. For example, treating the word “board” as noun or verb?
  • Syntax Level Ambiguity − A sentence can be parsed in different ways. For example, “He lifted the beetle with red cap.” − Did he use cap to lift the beetle or he lifted a beetle that had red cap?
  • Referential ambiguity − Referring to something using pronouns. For example, Rima went to Gauri. She said, “I am tired.” − Exactly who is tired?
Advertisements