
Natural Language Processing (NLP) Overview

CHAPTER 1
Introduction to NLP

 What is NLP?
 Natural Language Processing (NLP) is a field of Artificial
Intelligence (AI) that enables machines to understand,
interpret, and generate human language. It combines
linguistics, computer science, and machine learning to
bridge the gap between human communication and machine
understanding.
 Why is NLP Important?
 NLP powers many everyday applications, such as:
 Voice Assistants (Siri, Alexa, Google Assistant)
 Chatbots & Customer Support
 Machine Translation (Google Translate)
 Sentiment Analysis (Social media monitoring)
 Text Summarization & Information Retrieval (Search
engines)
 Key Components of NLP
 1. Text Preprocessing
 Before a machine can understand human language, raw text must be cleaned
and structured.
 Tokenization: Splitting text into words or sentences.
 Example: "AI is amazing!" → ["AI", "is", "amazing", "!"]
 Stopword Removal: Removing common words (e.g., "is", "the", "and").
 Stemming & Lemmatization: Reducing words to their root form.
 Stemming: "running" → "run" (simple cut-off)
 Lemmatization: "running" → "run" (linguistically correct)
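A minimal sketch of these preprocessing steps, assuming the NLTK library is installed and its punkt, stopwords, and wordnet data packages have been downloaded:

import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

# One-time downloads (uncomment on first run):
# nltk.download("punkt"); nltk.download("stopwords"); nltk.download("wordnet")

tokens = word_tokenize("AI is amazing!")          # ['AI', 'is', 'amazing', '!']

stop_words = set(stopwords.words("english"))
filtered = [t for t in tokens if t.lower() not in stop_words]  # drops 'is'

print(PorterStemmer().stem("running"))                    # 'run' (simple suffix cut-off)
print(WordNetLemmatizer().lemmatize("running", pos="v"))  # 'run' (dictionary base form)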
 2. Syntactic Analysis (Syntax Processing)
 Understanding sentence structure.
 Part-of-Speech (POS) Tagging: Identifying words as nouns, verbs,
adjectives, etc.
 Parsing: Analyzing sentence grammar.
 Example: "The cat sat on the mat." → (Subject: "cat", Verb: "sat", Object: "mat")
 3. Semantic Analysis (Meaning Extraction)
 Understanding the meaning behind words and sentences.
 Named Entity Recognition (NER): Identifying names, places, dates.
 Example: "Elon Musk founded Tesla in 2003." → (Person: "Elon Musk",
Organization: "Tesla", Year: "2003")
 Word Sense Disambiguation: Understanding word meaning based on
context.
 Example: "I went to the bank." (Financial institution vs. Riverbank)
 4. Sentiment Analysis
 Determining whether text is positive, negative, or neutral.
 Example: "This movie is fantastic!" → Positive
 Used in social media monitoring, customer feedback
analysis, and brand reputation tracking.
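One way to sketch this with NLTK's VADER analyzer (assumes the vader_lexicon data package has been downloaded; the +/-0.05 cutoffs are a common convention, used here as an assumption):

from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()
scores = sia.polarity_scores("This movie is fantastic!")

# 'compound' is a normalized score in [-1, 1]
if scores["compound"] >= 0.05:
    print("Positive")
elif scores["compound"] <= -0.05:
    print("Negative")
else:
    print("Neutral")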
 5. Machine Translation
 Automatic translation of text between languages.
 Example: Google Translate (English ↔ French)
 Uses deep learning models like Transformer-based
architectures.
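A hedged sketch with the Hugging Face transformers pipeline, using one of the publicly released Helsinki-NLP Marian translation checkpoints (the specific model name is an assumption for illustration):

from transformers import pipeline

translator = pipeline("translation_en_to_fr", model="Helsinki-NLP/opus-mt-en-fr")
result = translator("Natural language processing is fascinating.")
print(result[0]["translation_text"])
# e.g. "Le traitement du langage naturel est fascinant."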
 6. Text Generation
 AI-generated content using models like GPT-4 and ChatGPT.
 Example: Writing essays, summarizing articles, generating
code.
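GPT-4 and ChatGPT are reachable only through hosted APIs, so as a local stand-in, here is a sketch with the small, openly available GPT-2 model via the transformers pipeline:

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
out = generator("Natural language processing will", max_new_tokens=25)
print(out[0]["generated_text"])  # continuation quality is far below GPT-4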
History of NLP

 1. 1950s: Turing Test, rule-based systems.
 2. 1960s-70s: ELIZA chatbot, early translation systems.
 3. 1980s: Statistical NLP, corpus linguistics.
 4. 1990s-2000s: Machine learning models in NLP.
 5. 2010s-Present: Deep learning, BERT, GPT models.
Phases of NLP
 1. Lexical Analysis (Tokenization & Morphological Processing)
 Goal: Breaking down text into individual words, phrases, or symbols (tokens).
 🔹 Tokenization – Splitting a sentence into words (tokens).
🔹 Stemming – Reducing words to their root forms (e.g., "running" → "run").
🔹 Lemmatization – Converts words to their dictionary base form (e.g., "better" → "good").
🔹 Stopword Removal – Removing common words like "is," "the," and "and."
🔹 Part-of-Speech (POS) Tagging – Assigning grammatical categories to words.
 2. Syntactic Analysis (Parsing/Grammar Check)
 Goal: Analyzing sentence structure based on grammar rules.
 🔹 Parsing – Understanding the grammatical structure of a sentence.
🔹 Dependency Parsing – Identifies relationships between words (subject, object, verb).
🔹 Constituency Parsing – Breaks sentences into sub-phrases (e.g., noun phrase, verb
phrase).
🔹 Grammar Checking – Detects errors in syntax (e.g., misplaced words, subject-verb
agreement).
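As a toy illustration of constituency parsing, a sketch using NLTK's chart parser with a hand-written grammar (the grammar is invented for this one sentence, not a general English grammar):

import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V PP
PP -> P NP
Det -> 'the'
N -> 'cat' | 'mat'
V -> 'sat'
P -> 'on'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the cat sat on the mat".split()):
    tree.pretty_print()  # draws the noun-phrase / verb-phrase constituents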
 3. Semantic Analysis (Understanding Meaning)
 Goal: Extracting the true meaning of words and sentences.
 🔹 Word Sense Disambiguation (WSD) – Resolving word meanings based on context.
🔹 Named Entity Recognition (NER) – Identifying names, locations, and entities in text.
🔹 Semantic Role Labeling (SRL) – Assigning semantic roles to words (who did what to whom).
🔹 Lexical Semantics – Understanding synonyms, antonyms, and relationships between
words.
 4. Discourse Integration (Context Analysis Across Sentences)
 Goal: Understanding how different sentences in a document relate to each other.
 🔹 Coreference Resolution – Identifying references to the same entity (e.g., "he,"
"it").
🔹 Anaphora Resolution – Identifying what pronouns refer to in the text.
🔹 Discourse Parsing – Structuring text into a meaningful flow (cause-effect,
contrast, elaboration).

 5. Pragmatic Analysis (Real-World Context & Intent Recognition)
 Goal: Interpreting text based on real-world knowledge and speaker intention.
 🔹 Sentiment Analysis – Detecting emotions (positive, negative, neutral).
🔹 Speech Act Recognition – Understanding intent (question, command, request).
🔹 Conversational AI Understanding – Detecting sarcasm, humor, and
implications.

 6. Speech Processing (Speech Recognition & Generation)
 Goal: Converting spoken language into text and vice versa.
 🔹 Automatic Speech Recognition (ASR) – Converts speech into text (e.g., Siri,
Google Assistant).
🔹 Text-to-Speech (TTS) – Converts text into human-like speech (e.g., screen
readers).
🔹 Phonetics & Prosody Analysis – Understanding tone, pitch, and pronunciation.
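A hedged sketch of both directions, assuming the third-party SpeechRecognition and pyttsx3 packages; the audio file name is a hypothetical placeholder:

import speech_recognition as sr   # pip install SpeechRecognition
import pyttsx3                    # pip install pyttsx3

# ASR: transcribe a local WAV file ("sample.wav" is hypothetical)
recognizer = sr.Recognizer()
with sr.AudioFile("sample.wav") as source:
    audio = recognizer.record(source)
print(recognizer.recognize_google(audio))  # sends audio to Google's web API

# TTS: speak a sentence with the local offline engine
engine = pyttsx3.init()
engine.say("Natural language processing converts text to speech.")
engine.runAndWait()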
Ambiguity in NLP

 Lexical: Words with multiple meanings.
 Syntactic: Different sentence structures.
 Semantic: Context-dependent meanings.
 Pragmatic: Tone and situation variations.
 1. Lexical Ambiguity (Words with Multiple Meanings)
 Lexical ambiguity arises when a word has more than one
meaning, and the context does not make it immediately clear
which meaning is intended. This often occurs with homonyms
(words that sound the same but have different meanings) and
polysemes (words with related but distinct meanings).
 Examples:
 "Bank" – Can refer to a financial institution or the side of a
river.
 "Bat" – Can mean a flying mammal or a piece of sports
equipment.
 "Light" – Can mean "not heavy" or "illuminated."
 Lexical ambiguity is often resolved through context. For
instance, in "He deposited money in the bank," the
financial institution meaning is clear. However, in "He sat on
the bank," it likely refers to the riverbank.
 2. Syntactic Ambiguity (Different Sentence Structures)
 Syntactic ambiguity occurs when a sentence or phrase can be
parsed in more than one way due to its structure. This often
happens with phrases that can be interpreted differently depending
on grouping or punctuation.
 Examples:
 "I saw the man with the telescope."
 Does it mean I used a telescope to see the man?
 Or does it mean I saw a man who had a telescope?
 "Visiting relatives can be annoying."
 Does it mean the act of visiting relatives is annoying?
 Or does it mean relatives who visit are annoying?
 Syntactic ambiguity is often clarified through punctuation or
rewording:
✔ "I used a telescope to see the man."
✔ "Relatives who visit can be annoying."
 3. Semantic Ambiguity (Context-Dependent Meanings)
 Semantic ambiguity occurs when a sentence contains words or
phrases that can be interpreted in multiple ways based on
meaning. Unlike lexical ambiguity (where a word has multiple
dictionary meanings), semantic ambiguity arises from how
meanings interact within a sentence.
 Examples:
 "She can't bear children."
 Does it mean she cannot tolerate children?
 Or does it mean she is unable to have children?
 "Old friends and teachers attended the reunion."
 Does it mean friends who are old and teachers?
 Or does it mean old friends and also teachers?
 Semantic ambiguity is resolved by adding clarity through phrasing:
✔ "She is unable to have children."
✔ "Both old friends and teachers attended the reunion."
 4. Pragmatic Ambiguity (Tone and Situation Variations)
 Pragmatic ambiguity occurs when the meaning of a sentence
depends on the context, speaker intention, or tone, rather than
just words and grammar. It involves implied meaning, which is
influenced by situational context, cultural norms, and
conversational maxims.
 Examples:
 "Can you pass the salt?"
 A literal interpretation: Do you have the ability to pass the salt?
 A pragmatic interpretation: Please pass the salt. (A polite request)
 "It's cold in here."
 A literal interpretation: Stating the temperature is low.
 A pragmatic interpretation: Please close the window or turn on the
heater.
 Pragmatic ambiguity is often clarified through intonation, body
language, and situational cues.
Challenges of NLP

 1. Context Understanding
 2. Sarcasm & Humor Detection
 3. Polysemy & Homonyms
 4. High Computational Costs
 5. Data Bias & Fairness
 6. Multilingual Processing
 1. Context Understanding
 One of the biggest challenges in NLP is understanding
context in human language. Words and phrases often
derive meaning from their surrounding text, making it
difficult for machines to interpret them correctly.
 Examples:
 Pronoun Resolution: "John told Mark he won the lottery."
→ Who won, John or Mark?
 Context Dependency: "I saw a man on a hill with a
telescope." → Did I use a telescope, or did the man have it?
 Challenges:
 Resolving ambiguous references.
 Understanding implied meanings in conversations.
 Interpreting conversational nuances in dialogue systems.
 2. Sarcasm & Humor Detection
 Detecting sarcasm and humor is difficult because these
forms of expression rely heavily on tone, cultural
context, and prior knowledge.
 Examples:
 Sarcasm: "Oh, great! Another Monday morning meeting!"
(Actually means the opposite.)
 Humor: "Why did the scarecrow win an award? Because he
was outstanding in his field!"
 Challenges:
 Sarcasm often contradicts the literal meaning.
 Humor depends on wordplay, context, and sometimes
culture.
 Emotion detection is needed for proper interpretation.
 3. Polysemy & Homonyms
 Polysemy refers to words with multiple meanings, while homonyms
are words that sound or look the same but have different meanings.
These cause lexical ambiguity in NLP.
 Examples:
 Polysemy:
 "Bank" → (Financial institution) OR (Side of a river)
 "Light" → (Not heavy) OR (Bright)
 Homonyms:
 "Bark" → (Sound made by a dog) OR (Tree covering)
 "Lead" → (To guide) OR (A metal)
 Challenges:
 Machines struggle to disambiguate meanings without full sentence
context.
 Dictionary-based approaches fail when words have figurative meanings.
 Requires large annotated datasets to train models effectively.
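The scale of the problem is easy to see by listing WordNet senses for one ambiguous word (assumes NLTK's wordnet data; the first entries typically include the riverbank and financial senses):

from nltk.corpus import wordnet as wn

for syn in wn.synsets("bank")[:4]:
    print(syn.name(), "-", syn.definition())
# e.g. bank.n.01 - sloping land (especially the slope beside a body of water)
#      depository_financial_institution.n.01 - a financial institution ...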
 4. High Computational Costs
 NLP models, especially deep learning-based ones like GPT,
BERT, and LLaMA, require massive computational
resources for training and inference.
 Examples:
 Training a BERT-scale model → Requires vast amounts of text and extensive GPU/TPU resources.
 Inference with large models → Must respond in real time, making them expensive to serve at scale.
 Challenges:
 High energy consumption and carbon footprint.
 Slower response times in real-time applications.
 Need for optimization techniques like pruning, quantization,
and distillation.
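As a concrete instance of one such optimization, a hedged sketch of post-training dynamic quantization in PyTorch, applied to a toy stand-in network (the layer sizes are invented for illustration):

import torch
import torch.nn as nn

# Toy stand-in for a much larger transformer feed-forward block
model = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 2))

# Convert Linear weights to int8; activations stay in floating point
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
print(quantized)  # Linear layers replaced by DynamicQuantizedLinear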
 5. Data Bias & Fairness
 NLP models learn from biased training data, leading to unfair or
discriminatory outputs. This issue arises due to historical biases
present in text corpora.
 Examples:
 Gender Bias:
 "A doctor is most likely a man."
 "A nurse is most likely a woman."
 Racial & Cultural Bias:
 AI-generated job recommendations may favor certain demographics over
others.
 Challenges:
 Need for ethical AI development.
 Fair representation of diverse linguistic and cultural backgrounds.
 Bias mitigation techniques (reweighting datasets, adversarial
training).
 6. Multilingual Processing
 Languages vary in grammar, syntax, word order, and
meaning, making multilingual NLP a complex challenge.
 Examples:
 Word Order Differences:
 English: "I eat an apple."
 Japanese: "I apple eat." (Word order changes.)
 Low-Resource Languages:
 English has abundant training data, but many languages (like
Amharic or Khmer) have limited text datasets, making model
training difficult.
 Challenges:
 Adapting models to under-resourced languages.
 Handling code-switching (mixing languages in a sentence).
 Cross-lingual transfer learning without losing accuracy.
Applications of NLP

 - Chatbots & Virtual Assistants
 - Machine Translation (Google Translate)
 - Text Summarization
 - Sentiment Analysis
 - Speech Recognition
 - Spam Detection
 - Medical NLP
 - Search Engines
Future of NLP

 Advanced AI Models (GPT-4, ChatGPT)
 Better Context Understanding
 Emotionally Aware NLP
 More Efficient Multilingual NLP
 1. Advanced AI Models (GPT-4, ChatGPT, etc.)
 What It Means:
 Modern AI models like GPT-4, ChatGPT, LLaMA, and Claude represent a significant
leap in NLP technology. These models leverage transformers, large-scale datasets,
and deep learning to generate highly coherent and contextually relevant text.
 Key Features:
 Bigger & Better Training Data:
 These models are trained on vast amounts of text from books, websites, academic
papers, and social media to understand diverse linguistic patterns.
 Improved Understanding & Generation:
 They produce more human-like responses, making them useful for chatbots, customer
support, education, and content creation.
 Few-Shot & Zero-Shot Learning:
 GPT-4 can perform tasks with little to no prior examples, making it adaptable for various
domains without extensive fine-tuning.
 Challenges & Future Scope:
 Ethical Concerns: Bias in training data can lead to unfair or misleading outputs.
 Computational Costs: Training these models requires massive GPUs/TPUs, making
them expensive and energy-intensive.
 Hallucinations: These models sometimes generate false or misleading
information, requiring better fact-checking mechanisms.
 2. Better Context Understanding
 What It Means:
 Traditional NLP models struggled with understanding context, but newer models
incorporate better memory and reasoning mechanisms to grasp complex
conversations, documents, and long-form content.
 Improvements in Context Handling:
 Longer Context Retention:
 Models like GPT-4 can maintain context over several paragraphs or even pages,
improving long-form interactions.
 Coreference Resolution:
 AI can now better track references like pronouns and named entities, reducing
ambiguity.
 Example:
 "Alice told Bob that she would call later." (Who is "she"? AI now resolves this better.)
 Multi-Turn Conversations:
 Chatbots & Virtual SSAssistants can hold meaningful conversations across
multiple turns, remembering previous responses.
 Disambiguation of Meaning:
 AI can differentiate between meanings based on surrounding text.
 Example:
 "I went to the bank." → (Financial institution or riverbank?)
 Challenges & Future Scope:
 Implied Context Handling: AI still struggles with implicit meanings and
sarcasm.
 Memory Limitations: Even advanced models have a token limit, restricting
how much past conversation they can remember.
 3. Emotionally Aware NLP
 What It Means:
 Future NLP models are being designed to detect, interpret, and respond to
human emotions, making AI interactions more natural and empathetic.
 How It Works:
 Sentiment Analysis: AI can classify text as positive, negative, or neutral,
helping in social media monitoring, customer feedback analysis, and mental
health applications.
 Tone Detection: AI can adjust its response based on anger, happiness,
sadness, or frustration detected in the user’s text.
 Adaptive Responses: Chatbots in customer service, therapy, and virtual
assistants can modify their replies based on user sentiment.
 Challenges & Future Scope:
 Sarcasm & Irony Detection: Still a major challenge as AI often struggles to
differentiate between serious and sarcastic remarks.
 Cultural Variations: Emotions are expressed differently across cultures,
requiring diverse training datasets.
 Ethical Concerns: Should AI simulate emotions, or should it just detect and
acknowledge them?
 4. More Efficient Multilingual NLP
 What It Means:
 Modern NLP models are becoming more efficient at handling multiple
languages, enabling seamless translation, speech recognition, and cross-lingual
communication.
 Key Advancements:
 Zero-Shot Translation: AI can translate languages without direct training
examples (e.g., English ↔ Swahili even if there’s little parallel data).
 Code-Switching Support: Models can handle sentences mixing languages.
 Example: "I need a break, चलो कुछ अच्छा देखते हैं।"
 Better Grammar & Syntax Adaptation: AI models now adapt to language-
specific grammar rules more effectively.
 Challenges & Future Scope:
 Low-Resource Languages: Many languages lack sufficient training data, leading
to lower accuracy.
 Regional Dialects & Slang: NLP models struggle with variations in language
across different regions.
 Bias & Fairness: Some languages receive better support than others, leading to
imbalanced performance.
