0% found this document useful (0 votes)

26 views

517-c-30070-Assignment - chapter NLP

Uploaded by

Snehil Mahajan

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

26 views

517-c-30070-Assignment - chapter NLP

Uploaded by

Snehil Mahajan

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 9

BCM ARYA MODEL SR. SEC.

SCHOOL, SHASTRI NAGAR, LUDHIANA

SUBJECT: ARTIFICIAL INTELLIGENCE

CHAPTER 7: NATURAL LANGUAGE PROCESSING

One (01) Mark Questions

1. What is a Chabot?
A chatbot is a computer program that's designed to simulate human conversation
through voice commands or text chats or both. Eg: Mitsuku Bot, Jabberwacky etc.

2. What is the full form of NLP?

Natural Language Processing

3. While working with NLP what is the meaning of?

a. Syntax
b. Semantics
Syntax: Syntax refers to the grammatical structure of a sentence.
Semantics: It refers to the meaning of the sentence.

4. What is the difference between stemming and lemmatization?

Stemming is a technique used to extract the base form of the words by removing affixes
from them. It is just like cutting down the branches of a tree to its stems. For example, the
stem of the words eating, eats, eaten is eat.
Lemmatization is the grouping together of different forms of the same word. In search
queries, lemmatization allows end users to query any version of a base word and get
relevant results.

5. What is the full form of TFIDF?

Term Frequency and Inverse Document Frequency

6. What is meant by a dictionary in NLP?

Dictionary in NLP means a list of all the unique words occurring in the corpus. If some
words are repeated in different documents, they are all written just once as whilecreating
the dictionary.
7. What is term frequency?
Term frequency is the frequency of a word in one document. Term frequency can easily
be found from the document vector table as in that table we mention the frequency of
each word of the vocabulary in each document.

8. Which package is used for Natural Language Processing in Python programming?

Natural Language Toolkit (NLTK). NLTK is one of the leading platforms for building
Python programs that can work with human language data.

9. What is a document vector table?

Document Vector Table is used while implementing Bag of Words algorithm.
In a document vector table, the header row contains the vocabulary of the corpus and
other rows correspond to different documents.
If the document contains a particular word it is represented by 1 and absence of word is
represented by 0 value.
10. What do you mean by corpus?
A corpus is a large and structured set of machine-readable texts that have been produced
in a natural communicative setting.

Two (02) Mark Questions

1. What are the types of data used for Natural Language Processing applications?
Natural Language Processing takes in the data of Natural Languages in the form of
written words and spoken words which humans use in their daily lives and operates on
this.

2. Differentiate between a script-bot and a smart-bot. (Any 2 differences)

Script-bot Smart-bot
 A scripted chatbot doesn’t carry  Smart bots are built on NLP and
even a glimpse of A.I ML.
 Script bots are easy to make  Smart –bots are comparatively
difficult to make.
 Script bot functioning is very  Smart-bots are flexible and
limited as they are less powerful. powerful.
 Script bots work around a script ● Smart bots work on bigger
which is programmed in them databases and other resources
directly
 No or little language processing ● NLP and Machine learning skills
skills are required.
 Limited functionality ● Wide functionality

3. Give an example of the following:

 Multiple meanings of a word
 Perfect syntax, no meaning

 Example of Multiple meanings of a word –

His face turns red after consuming the medicine
Meaning - Is he having an allergic reaction? Or is he not able to bear the taste of that
medicine?
 Example of Perfect syntax, no meaning-
Chickens feed extravagantly while the moon drinks tea.
This statement is correct grammatically but it does not make any sense. In Human
language, a perfect balance of syntax and semantics is important for better
understanding.

4. What is inverse document frequency?

To understand inverse document frequency, first we need to understand document
frequency.
Document Frequency is the number of documents in which the word occurs irrespective
of how many times it has occurred in those documents.
In case of inverse document frequency, we need to put the document frequency in the
denominator while the total number of documents is the numerator.
For example, if the document frequency of a word “AMAN” is 2 in a particular document
then its inverse document frequency will be 3/2. (Here no. of documents is 3)
5. Define the following:
● Stemming
● Lemmatization
Stemming is a technique used to extract the base form of the words by removing affixes
from them. It is just like cutting down the branches of a tree to its stems. For example, the
stem of the words eating, eats, eaten is eat.
Lemmatization is the grouping together of different forms of the same word. In search
queries, lemmatization allows end users to query any version of a base word and get
relevant results.

6. What do you mean by document vectors?

Document Vector contains the frequency of each word of the vocabulary in a particular
document.
In document vector vocabulary is written in the top row. Now, for each word in the
document, if it matches with the vocabulary, put a 1 under it. If the same word appears
again, increment the previous value by 1. And if the word does not occur in that document,
put a 0 under it.
7. What is TFIDF? Write its formula.
Term frequency–inverse document frequency, is a numerical statistic that is intended to
reflect how important a word is to a document in a collection or corpus.
The number of times a word appears in a document divided by the total number of
words in the document. Every document has its own term frequency.

8. Which words in a corpus have the highest values and which ones have the least?
Stop words like - and, this, is, the, etc. have highest values in a corpus. But these words
do not talk about the corpus at all. Hence, these are termed as stopwords and are mostly
removed at the pre-processing stage only.
Rare or valuable words occur the least but add the most importance to the corpus.
Hence, when we look at the text, we take frequent and rare words into consideration.

9. Does the vocabulary of a corpus remain the same before and after text
normalization? Why?
No, the vocabulary of a corpus does not remain the same before and after text
normalization. Reasons are –
● In normalization the text is normalized through various steps and is lowered to minimum
vocabulary since the machine does not require grammatically correct statements but the
essence of it.
● In normalization Stop words, Special Characters and Numbers are removed.
● In stemming the affixes of words are removed and the words are converted to their base
form.
So, after normalization, we get the reduced vocabulary.
10. What is the significance of converting the text into a common case?
In Text Normalization, we undergo several steps to normalize the text to a lower level.
After the removal of stop words, we convert the whole text into a similar case,
preferably lower case. This ensures that the case-sensitivity of the machine does not
consider same words as different just because of different cases.

11. Mention some applications of Natural Language Processing.

Natural Language Processing Applications-
● Sentiment Analysis.
● Chatbots & Virtual Assistants.
● Text Classification.
● Text Extraction.
● Machine Translation
● Text Summarization
● Market Intelligence
● Auto-Correct

12. What is the need of text normalization in NLP?

Since we all know that the language of computers is Numerical, the very first step that
comes to our mind is to convert our language to numbers.
This conversion takes a few steps to happen. The first step to it is Text Normalization.
Since human languages are complex, we need to first of all simplify them in order to
make sure that the understanding becomes possible. Text Normalization helps in
cleaning up the textual data in such a way that it comes down to a level where its
complexity is lower than the actual data.

13. Explain the concept of Bag of Words.

Bag of Words is a Natural Language Processing model which helps in extracting features
out of the text which can be helpful in machine learning algorithms. In bag of words, we
get the occurrences of each word and construct the vocabulary for the corpus.
Bag of Words just creates a set of vectors containing the count of word occurrences in the
document (reviews). Bag of Words vectors are easy to interpret.
14. Explain the relation between occurrence and value of a word.

plot of occurrence of words versus their value

As shown in the graph, occurrence and value of a word are inversely proportional. The
words which occur most (like stop words) have negligible value. As the occurrence of
words drops, the value of such words rises. These words are termed as rare or valuable
words. These words occur the least but add the most value to the corpus.

15. What are the applications of TFIDF?

TFIDF is commonly used in the Natural Language Processing domain. Some of its
applications are:
 Document Classification - Helps in classifying the type and genre of a document.
 Topic Modelling - It helps in predicting the topic for a corpus.
 Information Retrieval System - To extract the important information out of a corpus.
 Stop word filtering - Helps in removing the unnecessary words out of a text body.

16. What are stop words? Explain with the help of examples.
“Stop words” are the most common words in a language like “the”, “a”, “on”, “is”, “all”.
These words do not carry important meaning and are usually removed from texts. It is
possible to remove stop words using Natural Language Toolkit (NLTK), a suite of libraries
and programs for symbolic and statistical natural language processing.

17. Differentiate between Human Language and Computer Language.

Humans communicate through language which we process all the time. Our brain keeps
on processing the sounds that it hears around itself and tries to make sense out of them
all the time.
On the other hand, the computer understands the language of numbers. Everything that
is sent to the machine has to be converted to numbers. And while typing, if a single
mistake is made, the computer throws an error and does not process that part. The
communications made by the machines are very basic and simple.

Four 04 Mark Questions

1. Create a document vector table for the given corpus:

Document 1: We are going to Mumbai
Document 2: Mumbai is a famous place.
Document 3: We are going to a famous place.
Document 4: I am famous in Mumbai.

We Are going to Mumbai is a famous place I am in

1 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 1 1 1 1 1 0 0 0
1 1 1 1 0 0 1 1 1 0 0 0
0 0 0 0 1 0 0 1 0 1 1 1
2. Classify each of the images according to how well the model’s output matches the
data samples:

Here, the red dashed line is model’s output while the blue crosses are actual data
samples.

● The model’s output does not match the true function at all. Hence the model is said to be
under fitting and its accuracy is lower.
● In the second case, model performance is trying to cover all the data samples even if they
are out of alignment to the true function. This model is said to be over fitting and this too
has a lower accuracy
● In the third one, the model’s performance matches well with the true function which
states that the model has optimum accuracy and the model is called a perfect fit.

3. Explain how AI can play a role in sentiment analysis of human beings?

The goal of sentiment analysis is to identify sentiment among several posts or even in the
same post where emotion is not always explicitly expressed.
Companies use Natural Language Processing applications, such as sentiment analysis, to
identify opinions and sentiment online to help them understand what customers think
about their products and services (i.e., “I love the new iPhone” and, a few lines later “But
sometimes it doesn’t work well” where the person is still talking about the iPhone) and
overall *
Beyond determining simple polarity, sentiment analysis understands sentiment in
context to help better understand what’s behind an expressed opinion, which can be
extremely relevant in understanding and driving purchasing decisions.
4. Why are human languages complicated for a computer to understand? Explain.
The communications made by the machines are very basic and simple. Human
communication is complex. There are multiple characteristics of the human language
that might be easy for a human to understand but extremely difficult for a computer to
understand.

For machines it is difficult to understand our language. Let us take a look at some of them
here:
Arrangement of the words and meaning - There are rules in human language. There are
nouns, verbs, adverbs, adjectives. A word can be a noun at one time and an adjective some
other time. This can create difficulty while processing by computers.
Analogy with programming language- Different syntax, same semantics: 2+3 = 3+2 Here
the way these statements are written is different, but their meanings are the same that
is 5. Different semantics, same syntax: 2/3 (Python 2.7) ≠ 2/3 (Python 3) Here the
statements written have the same syntax but their meanings are different. In Python 2.7,
this statement would result in 1 while in Python 3, it would give an output of 1.5.
Multiple Meanings of a word - In natural language, it is important to understand that a
word can have multiple meanings and the meanings fit into the statement according to
the context of it.

Perfect Syntax, no Meaning - Sometimes, a statement can have a perfectly correct syntax
but it does not mean anything. In Human language, a perfect balance of syntax and
semantics is important for better understanding.
These are some of the challenges we might have to face if we try to teach computers
how to understand and interact in human language.

5. What are the steps of text Normalization? Explain them in brief.

Text Normalizationin Text Normalization, we undergo several steps to normalize the

text to a lower level.

Sentence Segmentation - Under sentence segmentation, the whole corpus is divided into
sentences. Each sentence is taken as a different data so now the whole corpus gets
reduced to sentences.

Tokenisation- After segmenting the sentences, each sentence is then further divided
into tokens. Tokens is a term used for any word or number or special character
occurring in a sentence. Under tokenisation, every word, number and special character
is considered separately and each of them is now a separate token.
Removing Stop words, Special Characters and Numbers - In this step, the tokens which
are not necessary are removed from the token list.
Converting text to a common case -After the stop words removal, we convert the whole
text into a similar case, preferably lower case. This ensures that the case-sensitivity of the
machine does not consider same words as different just because of different cases.
Stemming In this step, the remaining words are reduced to their root words. In other
words, stemming is the process in which the affixes of words are removed and the words
are converted to their base form.
Lemmatization -in lemmatization, the word we get after affix removal (also known as
lemma) is a meaningful one. With this we have normalized our text to tokens which are
the simplest form of words present in the corpus. Now it is time to convert the tokens
into numbers. For this, we would use the Bag of Words algorithm

6. Through a step-by-step process, calculate TFIDF for the given corpus and mention
the word(s) having highest value.
Document 1: We are going to Mumbai
Document 2: Mumbai is a famous place.
Document 3: We are going to a famous place.
Document 4: I am famous in Mumbai.

Term Frequency
Term frequency is the frequency of a word in one document. Term frequency can easily
be found from the document vector table as in that table we mention the frequency of
each word of the vocabulary in each document.

We Are Going to Mumbai is a famous Place I am in

1 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 1 1 1 1 1 0 0 0
1 1 1 1 0 0 1 1 1 0 0 0
0 0 0 0 1 0 0 1 0 1 1 1
Inverse Document Frequency
The other half of TFIDF which is Inverse Document Frequency. For this, let us first
understand what does document frequency mean. Document Frequency is the number of
documents in which the word occurs irrespective of how many times it has occurred in
those documents. The document frequency for the exemplar vocabulary would be:

We Are going to Mumbai is a Famous place I am in

2 2 2 2 3 1 2 3 2 1 1 1

Talking about inverse document frequency, we need to put the document frequency in
the denominator while the total number of documents is the numerator. Here, the total
number of documents are 3, hence inverse document frequency becomes:

We Are going to Mumbai is a Famous Place I am in

4/2 4/2 4/2 4/2 4/3 4/1 4/2 4/3 4/2 4/1 4/1 4/1

The formula of TFIDF for any word W becomes:

TFIDF(W) = TF(W) * log (IDF(W)) The words having highest value are – Mumbai, Famous

7. Normalize the given text and comment on the vocabulary before and after the
normalization:
Raj and Vijay are best friends. They play together with other friends. Raj likes to
play football but Vijay prefers to play online games. Raj wants to be a footballer.
Vijay wants to become an online gamer.
Normalization of the given text:
Sentence Segmentation:
1. Raj and Vijay are best friends.
2. They play together with other friends.
3. Raj likes to play football but Vijay prefers to play online games.
4. Raj wants to be a footballer.
5. Vijay wants to become an online gamer.

Tokenization:

Raj and Vijay Raj and Vijay are best friends .

are best
friends.

They play They play Together with other friends .

together with
other friends

Same will be done for all sentences.

Removing Stop words, Special Characters and Numbers:
In this step, the tokens which are not necessary are removed from the token list.
So, the words and, are, to, an, (Punctuation) will be removed.

Converting text to a common case:

After the stop words removal, we convert the whole text into a similar case, preferably
lower case.
Here we don’t have words in different case so this step is not required for given text.
Stemming:
In this step, the remaining words are reduced to their root words. In other words, stemming
is the process in which the affixes of words are removed and the words are converted to their
base form.
Word Affixes Stem
Likes -s Like

Prefers -s Prefer
Wants -s want
In the given text Lemmatization is not required.
Given Text
Raj and Vijay are best friends. They play together with other friends. Raj likes to play
football but Vijay prefers to play online games. Raj wants to be a footballer. Vijay wants to
become an online gamer.

Normalized Text
Raj and Vijay best friends They play together with other friends Raj likes to play football but
Vijay prefers to play online games Raj wants to be a footballer Vijay wants to become an
online gamer.

100 NLP Questions
100% (5)
100 NLP Questions
23 pages
Introduction To Anthropology Notes
100% (1)
Introduction To Anthropology Notes
5 pages
P.S.Senior Secondary School Class X - Artificial Intelligence - 2021-22 Natural Language Processing Question and Answers
No ratings yet
P.S.Senior Secondary School Class X - Artificial Intelligence - 2021-22 Natural Language Processing Question and Answers
7 pages
NLP-Questions Class 10 Ai
No ratings yet
NLP-Questions Class 10 Ai
8 pages
Lemmatization Is The Grouping Together of Different Forms of The Same Word. in Search
No ratings yet
Lemmatization Is The Grouping Together of Different Forms of The Same Word. in Search
11 pages
Q_ClassX_AI_Ch7
No ratings yet
Q_ClassX_AI_Ch7
6 pages
NLP
No ratings yet
NLP
7 pages
Cbse - Department of Skill Education Artificial Intelligence
No ratings yet
Cbse - Department of Skill Education Artificial Intelligence
11 pages
NLP and Evaluation
No ratings yet
NLP and Evaluation
23 pages
Ai Notes
No ratings yet
Ai Notes
11 pages
NLP notes
No ratings yet
NLP notes
3 pages
UNIT 6- NLP NOTES
No ratings yet
UNIT 6- NLP NOTES
7 pages
NLP Qa
No ratings yet
NLP Qa
10 pages
Q - ClassX - AI - NATURAL LANGUAGE PROCESSING
No ratings yet
Q - ClassX - AI - NATURAL LANGUAGE PROCESSING
10 pages
NLP Notes
No ratings yet
NLP Notes
10 pages
Natural Language Processing (NLP)
No ratings yet
Natural Language Processing (NLP)
5 pages
NLP - CH-6
No ratings yet
NLP - CH-6
4 pages
NLP Key Points
No ratings yet
NLP Key Points
3 pages
Natural Language Processing Revision Notes
No ratings yet
Natural Language Processing Revision Notes
4 pages
X_AI-NLP Worksheet
No ratings yet
X_AI-NLP Worksheet
2 pages
Nlp Revision Notes
No ratings yet
Nlp Revision Notes
6 pages
C10_AI_UNIT 3_NLP_ HALF YEARLY
No ratings yet
C10_AI_UNIT 3_NLP_ HALF YEARLY
37 pages
Unit-I QB
No ratings yet
Unit-I QB
5 pages
NLP_unit_2_imp
No ratings yet
NLP_unit_2_imp
4 pages
UNIT 6 NATURAL LANGUAGE PROCESSING.docx
No ratings yet
UNIT 6 NATURAL LANGUAGE PROCESSING.docx
10 pages
pdf NLP
No ratings yet
pdf NLP
7 pages
Sample Paper Questions - NLP (Part 2)
No ratings yet
Sample Paper Questions - NLP (Part 2)
7 pages
Ch-6 NLP
No ratings yet
Ch-6 NLP
4 pages
Question Bank For Seen Pre-Board - AI - Grade 10 - 2021-22
No ratings yet
Question Bank For Seen Pre-Board - AI - Grade 10 - 2021-22
7 pages
Unit 6 - AI (NLP)
No ratings yet
Unit 6 - AI (NLP)
37 pages
9783293-CLASS10 AI Worksheet PART B UNIT6 Natural Language Processing (1)
No ratings yet
9783293-CLASS10 AI Worksheet PART B UNIT6 Natural Language Processing (1)
3 pages
NLP Class10.PDF
No ratings yet
NLP Class10.PDF
9 pages
NLP_AI_X
No ratings yet
NLP_AI_X
6 pages
AIUnit 6 10
No ratings yet
AIUnit 6 10
8 pages
NLP
No ratings yet
NLP
14 pages
Quest_NLP
No ratings yet
Quest_NLP
13 pages
Natural Language Processing
No ratings yet
Natural Language Processing
10 pages
Nlp and Evaluation -Mcq
No ratings yet
Nlp and Evaluation -Mcq
10 pages
LP v Oral Questions and Answers
No ratings yet
LP v Oral Questions and Answers
4 pages
64 Natural Language Processing Interview Questions and Answers-18 Juli 2019
No ratings yet
64 Natural Language Processing Interview Questions and Answers-18 Juli 2019
30 pages
VIVA Q&A
No ratings yet
VIVA Q&A
5 pages
NLP Quiz Seg 1 to 4
No ratings yet
NLP Quiz Seg 1 to 4
9 pages
Natural Language Processing_compressed
No ratings yet
Natural Language Processing_compressed
17 pages
Dealing With Textual Data
No ratings yet
Dealing With Textual Data
67 pages
Natural Language Processing
No ratings yet
Natural Language Processing
6 pages
Important 2 Marks
No ratings yet
Important 2 Marks
11 pages
Natural Language Processing (NLP) Introduction:: Top 10 NLP Interview Questions For Beginners
No ratings yet
Natural Language Processing (NLP) Introduction:: Top 10 NLP Interview Questions For Beginners
24 pages
Week 6: Introduction To Natural Language Processing
No ratings yet
Week 6: Introduction To Natural Language Processing
18 pages
Unit 6 (NLP)
No ratings yet
Unit 6 (NLP)
8 pages
Motivation Video: Mitsuku Vs Cleverbot - AI (Artificial Intelligence)
No ratings yet
Motivation Video: Mitsuku Vs Cleverbot - AI (Artificial Intelligence)
45 pages
AI_NLP
No ratings yet
AI_NLP
9 pages
nlp
No ratings yet
nlp
35 pages
DLT Unit-5
No ratings yet
DLT Unit-5
48 pages
Natural Language Processing_NOTES
No ratings yet
Natural Language Processing_NOTES
4 pages
NLP Q&A1a Text Processing
No ratings yet
NLP Q&A1a Text Processing
16 pages
Assignment of AI Finished
No ratings yet
Assignment of AI Finished
16 pages
2 Marks
No ratings yet
2 Marks
11 pages
Natural Language Processing
From Everand
Natural Language Processing
Ajit Singh
No ratings yet
Python Text Mining: Perform Text Processing, Word Embedding, Text Classification and Machine Translation
From Everand
Python Text Mining: Perform Text Processing, Word Embedding, Text Classification and Machine Translation
Alexandra George
No ratings yet
Programming Language Concepts: Improving your Software Development Skills
From Everand
Programming Language Concepts: Improving your Software Development Skills
Oliver Wegner
No ratings yet
iOS Programming Nuts and bolts
From Everand
iOS Programming Nuts and bolts
Keith Lee
4/5 (1)
mock test ENG & MATHS
No ratings yet
mock test ENG & MATHS
1 page
vector 2025
No ratings yet
vector 2025
2 pages
one dimensional motion
No ratings yet
one dimensional motion
19 pages
vector addition
No ratings yet
vector addition
9 pages
517 c 19816 Data Science Assignment
No ratings yet
517 c 19816 Data Science Assignment
2 pages
Poem Wonderful Words
No ratings yet
Poem Wonderful Words
2 pages
AG - Key Word Transformations - Practice
No ratings yet
AG - Key Word Transformations - Practice
7 pages
Joseph - Saussure The Accidental Father of Structuralism
No ratings yet
Joseph - Saussure The Accidental Father of Structuralism
19 pages
Eureka Write It Right
No ratings yet
Eureka Write It Right
19 pages
THIRD PERIODICAL TEST English 3
No ratings yet
THIRD PERIODICAL TEST English 3
2 pages
Using Tag Questions: by Dejavu 79
No ratings yet
Using Tag Questions: by Dejavu 79
12 pages
Second Language Acquisition Essay
No ratings yet
Second Language Acquisition Essay
7 pages
Test Bank Ministry Exam 2003-2011 With Answer
No ratings yet
Test Bank Ministry Exam 2003-2011 With Answer
42 pages
Engl9 WLP Week 1
No ratings yet
Engl9 WLP Week 1
25 pages
(Online) CONSONANTS 2021
No ratings yet
(Online) CONSONANTS 2021
20 pages
Direct Speech
0% (2)
Direct Speech
8 pages
ECA2plus_Tests_Grammar check 3.4B
No ratings yet
ECA2plus_Tests_Grammar check 3.4B
1 page
Unidad 4
No ratings yet
Unidad 4
3 pages
For The Final Exam - Vocabulary List - 12th English
No ratings yet
For The Final Exam - Vocabulary List - 12th English
2 pages
1596711088434-Desciptive questionsSWR - Website - 06.08.2020
No ratings yet
1596711088434-Desciptive questionsSWR - Website - 06.08.2020
29 pages
Spanish December Exam 2024 paper 2-1
No ratings yet
Spanish December Exam 2024 paper 2-1
6 pages
Verbal Tenses - Resumen
No ratings yet
Verbal Tenses - Resumen
5 pages
THE POOR FISHERMAN English Make Up
No ratings yet
THE POOR FISHERMAN English Make Up
3 pages
Nouns
No ratings yet
Nouns
4 pages
Avaliac3a7c3a3o Diagnc3b3stica 8c2ba Ano
No ratings yet
Avaliac3a7c3a3o Diagnc3b3stica 8c2ba Ano
10 pages
Grade 4
No ratings yet
Grade 4
51 pages
Lexicology Seminar 4-5
No ratings yet
Lexicology Seminar 4-5
11 pages
Critiquing Poetry Including Your Own
No ratings yet
Critiquing Poetry Including Your Own
4 pages
2015 STG Polysemy HbCogLing-1
No ratings yet
2015 STG Polysemy HbCogLing-1
19 pages
Cvo Unit 1 Nice To Meet You (AMO-PAM)
No ratings yet
Cvo Unit 1 Nice To Meet You (AMO-PAM)
25 pages
Language Acq Theories Chart
No ratings yet
Language Acq Theories Chart
2 pages
Time Exp With Hacer
No ratings yet
Time Exp With Hacer
8 pages
Subject + To Be (Am/is/are) + Past Participle (PP) : No Tense Rumus Passive Voice Contoh Passive Voice
No ratings yet
Subject + To Be (Am/is/are) + Past Participle (PP) : No Tense Rumus Passive Voice Contoh Passive Voice
1 page
Degrees of Comparison of Adjectives Asynchronous Activity
No ratings yet
Degrees of Comparison of Adjectives Asynchronous Activity
2 pages

517-c-30070-Assignment - chapter NLP

Uploaded by

517-c-30070-Assignment - chapter NLP

Uploaded by

BCM ARYA MODEL SR. SEC.

SCHOOL, SHASTRI NAGAR, LUDHIANA

SUBJECT: ARTIFICIAL INTELLIGENCE

One (01) Mark Questions

2. What is the full form of NLP?

3. While working with NLP what is the meaning of?

4. What is the difference between stemming and lemmatization?

5. What is the full form of TFIDF?

6. What is meant by a dictionary in NLP?

8. Which package is used for Natural Language Processing in Python programming?

9. What is a document vector table?

Two (02) Mark Questions

2. Differentiate between a script-bot and a smart-bot. (Any 2 differences)

3. Give an example of the following:

 Example of Multiple meanings of a word –

4. What is inverse document frequency?

6. What do you mean by document vectors?

11. Mention some applications of Natural Language Processing.

12. What is the need of text normalization in NLP?

13. Explain the concept of Bag of Words.

plot of occurrence of words versus their value

15. What are the applications of TFIDF?

17. Differentiate between Human Language and Computer Language.

Four 04 Mark Questions

1. Create a document vector table for the given corpus:

We Are going to Mumbai is a famous place I am in

3. Explain how AI can play a role in sentiment analysis of human beings?

5. What are the steps of text Normalization? Explain them in brief.

Text Normalizationin Text Normalization, we undergo several steps to normalize the

We Are Going to Mumbai is a famous Place I am in

We Are going to Mumbai is a Famous place I am in

We Are going to Mumbai is a Famous Place I am in

The formula of TFIDF for any word W becomes:

Raj and Vijay Raj and Vijay are best friends .

They play They play Together with other friends .

Same will be done for all sentences.

Converting text to a common case:

You might also like