NLP - PPT - CH 1
Natural Language Processing
Pushpak Bhattacharya
Aditya Joshi
Chapter 1
Introduction
• Understand the relationship between linguistics, probability, and data in the context
of Natural Language Processing (NLP)
• This work explores the field of natural language processing (NLP): computational
techniques that process languages used by humans (i.e., ‘natural’ languages)
• One of the key reasons for the advancement of the human race is language
• In the case of written texts, symbols are the codes that carry meaning and, hence,
convey ideas
• Scripts of languages fall into three categories depending on how they treat vowels
• In the ‘abjad’ category, as in Arabic, vowels are mostly dropped altogether
• Sign languages, by contrast, are used by people who may not be able to produce conventional sounds
in a language
• Lexical ambiguity arises primarily from the multiple meanings of lexical units, specifically
words
• Structural ambiguity arises from how phrases are linked to other sentential units
• The ambiguity arises because ‘love’ can mean a zero score in tennis, which interacts with the
meaning of the word ‘nothing’
• The sentence has five words, and the intended meaning is ‘it is reported by
the Government of Maharashtra that COVID-19 cases have increased’
v. Multimodal NLP (e.g., emotion recognition based on facial expressions and spoken
words)
vi. Sentiment and emotion analysis (e.g., the recent ChatGPT tool)
• While structure brings a sense of determinism to language, it often misses the point
that language is not deterministic or fossilized
i. No grammar, as far as we know, can capture all and only the phenomena of the
language
ii. Given the left-hand side (LHS) of a production rule, multiple right-hand sides (RHS)
are possible, as the grammar sketch after this list illustrates
• As a result, new and unexpected language phenomena are witnessed, while some
fade away
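One way to see point ii concretely: a minimal sketch using NLTK (the toolkit is an assumption here, not something the slides prescribe), in which the nonterminals NP and VP each have multiple right-hand sides, so a single sentence receives two parse trees.

```python
# A toy CFG in which one LHS (NP, VP) admits multiple RHS,
# producing two parse trees for one sentence (PP-attachment ambiguity).
import nltk

grammar = nltk.CFG.fromstring("""
S   -> NP VP
NP  -> 'I' | Det N | NP PP
VP  -> V NP | VP PP
PP  -> P NP
Det -> 'a' | 'the'
N   -> 'man' | 'telescope'
V   -> 'saw'
P   -> 'with'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("I saw the man with a telescope".split()):
    print(tree)  # one tree attaches the PP to the VP, the other to the NP
```

The two trees correspond to ‘seeing by means of a telescope’ versus ‘the man who has a telescope’: exactly the structural ambiguity described earlier.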
“Use a dataset of textual examples as the ‘training data’ (in the sense of past data), to
learn a model that makes predictions on the ‘test data’ (in the sense of future data)”
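A minimal sketch of that train/predict paradigm, assuming scikit-learn; the tiny sentiment dataset and its labels are invented purely for illustration.

```python
# Train on "past" data, then predict on "future" data: the core ML paradigm.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train_texts = ["great movie", "loved it", "terrible film", "hated it"]
train_labels = ["pos", "pos", "neg", "neg"]      # training data (past)
test_texts = ["great film", "hated the movie"]   # test data (future)

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)  # vocabulary learned on training data only
X_test = vectorizer.transform(test_texts)        # reused, unchanged, on test data

model = MultinomialNB().fit(X_train, train_labels)
print(model.predict(X_test))                     # predicted labels for the unseen sentences
```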
i. Rules many times proved woefully inadequate to capture the full spectrum and
complexity of language phenomena
ii. NLP must choose among possible meanings by assigning numerical scores to them
• In general, disambiguation in NLP at any layer amounts to choosing the best among
possible labels, strings, trees, or graphs
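Computationally, choosing the best candidate amounts to an argmax over scored options; a minimal sketch follows, with made-up scores for the ‘love’ example above.

```python
# Disambiguation as an argmax: score each candidate reading, keep the best.
# The readings and their scores are invented purely for illustration.
candidate_scores = {
    "love = zero score in tennis": 0.7,
    "love = deep affection": 0.3,
}
best_reading = max(candidate_scores, key=candidate_scores.get)
print(best_reading)  # the highest-scoring reading wins
```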
• Deep learning introduced a continuum in which similar linguistic units lie close to
each other, and it prevailed because:
i. It could handle data sparsity
ii. It could learn similar vectors for similar linguistic units
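A minimal sketch of this continuum, using invented 3-dimensional vectors as stand-ins for real embeddings (word2vec- or GloVe-style vectors typically have hundreds of dimensions).

```python
# Words as dense vectors: cosine similarity measures how "close"
# two units are on the continuum. The vectors are toy values.
import numpy as np

emb = {
    "king":  np.array([0.90, 0.80, 0.10]),
    "queen": np.array([0.85, 0.75, 0.20]),
    "apple": np.array([0.10, 0.20, 0.90]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(emb["king"], emb["queen"]))  # high: similar units are close
print(cosine(emb["king"], emb["apple"]))  # low: dissimilar units are far apart
```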
i. Learning representations of text
ii. Learning to solve particular NLP tasks: DNN for NLP relies on layers of neural
networks learning representations of text (see the sketch below)
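A minimal PyTorch sketch of this layered view (PyTorch, the layer sizes, and the pooling choice are all assumptions for illustration): lower layers learn representations of the text, and the top layer solves the task.

```python
# Lower layers learn representations; the top layer solves the task.
import torch
import torch.nn as nn

class TinyTextClassifier(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=32, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)   # representation layer
        self.classify = nn.Linear(emb_dim, num_classes)  # task-specific layer

    def forward(self, token_ids):        # token_ids: (batch, seq_len)
        vectors = self.embed(token_ids)  # dense representations per token
        pooled = vectors.mean(dim=1)     # one vector per sentence
        return self.classify(pooled)     # task scores (logits)

model = TinyTextClassifier()
batch = torch.randint(0, 1000, (4, 7))   # 4 fake sentences of 7 token ids
print(model(batch).shape)                # torch.Size([4, 2])
```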
• This book takes a historical perspective, describing the foundations and applications
of NLP through its three generations: rule-based, statistical, and deep learning-based
• Several NLP problems, such as POS tagging, parsing, natural language inference,
sentiment analysis, and question answering, are discussed