NLP - Notes
NLP (Natural Language Processing) is dedicated to making it possible for computers
to comprehend and process human languages. It is a subfield of linguistics,
computer science, information engineering, and artificial intelligence that studies
how computers interact with human languages, particularly how to train computers
to handle and analyze massive volumes of natural language data.
Automatic summarization -- useful for condensing documents and other written
material, as well as for gathering information from social media and other online
sources.
Sentiment analysis -- to better understand what Internet users are saying about a
company's goods and services, businesses use natural language processing tools
such as sentiment analysis to identify customer requirements.
Virtual assistants -- these days digital assistants such as Google Assistant,
Cortana, Siri, and Alexa play a significant role in our lives; not only can we
communicate with them, they can also make everyday tasks easier.
*Chatbots*
A chatbot is one of the most widely used NLP applications. Many chatbots on the
market today employ the same strategy as in the example above.
The computer, on the other hand, understands only machine language. All input must
be converted to numbers before being fed to the machine, and if a single typing
error is made, the machine throws an error and skips over that input. Machines use
only extremely simple and elementary forms of communication.
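To make this concrete, here is a minimal Python sketch of the idea that text must
be encoded as numbers before a machine can work with it. The four-word vocabulary
and the encode function are invented for illustration, not taken from any real
chatbot.

```python
# Toy vocabulary mapping each known word to an integer id (made-up example).
vocabulary = {"hello": 0, "how": 1, "are": 2, "you": 3}

def encode(sentence):
    """Map each known word to its integer id. An unknown word raises a
    KeyError, mirroring how a strict system rejects unexpected input."""
    return [vocabulary[word] for word in sentence.lower().split()]

print(encode("Hello how are you"))  # [0, 1, 2, 3]
```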
*Data Processing*
Data processing is a method of manipulating data: the conversion of raw data into
meaningful, machine-readable information.
Since human languages are complex, we first need to simplify them to make
understanding possible. Text normalization cleans up textual data and brings it
down to a level where its complexity is lower than that of the raw data. Let us go
through text normalization in detail.
*Text normalization*
The process of converting a text into a canonical (standard) form is known as text
normalization. For instance, the canonical form of the words "goood" and "gud" is
"good".
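A tiny Python illustration of this idea, assuming a hand-made lookup table (the
table itself is invented for this example):

```python
# Map noisy spellings to a canonical form using a small lookup table.
canonical = {"goood": "good", "gud": "good"}

def normalize(word):
    # Fall back to the lowercased word when no canonical form is known.
    return canonical.get(word.lower(), word.lower())

print(normalize("Goood"))  # good
print(normalize("gud"))    # good
```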
*Sentence segmentation*
Under sentence segmentation, the whole corpus is divided into sentences. Each
sentence is treated as a separate piece of data, so the whole corpus is reduced to
a list of sentences.
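A simple rule-based sketch in Python that splits on sentence-ending punctuation;
real segmenters also handle abbreviations and other edge cases:

```python
import re

def segment_sentences(corpus):
    # Split after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r'(?<=[.!?])\s+', corpus.strip())
    return [s for s in sentences if s]

corpus = "NLP is fun. Machines read numbers! Can they understand us?"
print(segment_sentences(corpus))
# ['NLP is fun.', 'Machines read numbers!', 'Can they understand us?']
```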
*Tokenisation*
Sentences are first broken into segments, and each segment is then divided into
tokens. Any word, number, or special character that appears in a sentence is
referred to as a token. Tokenisation treats each word, number, and special
character as a separate entity and creates a token for each of them.
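A minimal Python sketch using a regular expression that treats each word, number,
and special character as its own token:

```python
import re

def tokenize(sentence):
    # \w+ captures words and numbers; [^\w\s] captures single special characters.
    return re.findall(r"\w+|[^\w\s]", sentence)

print(tokenize("I paid $20 for 2 books!"))
# ['I', 'paid', '$', '20', 'for', '2', 'books', '!']
```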
*Stemming*
The remaining words are reduced to their root words in this step. In other words,
stemming is the process of stripping words of their affixes and reducing them to
their root forms; the resulting root is not always a meaningful word.
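For example, with NLTK's PorterStemmer (assuming `nltk` is installed; any
affix-stripping stemmer would illustrate the same idea):

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["playing", "played", "plays", "studies"]:
    print(word, "->", stemmer.stem(word))
# playing -> play, played -> play, plays -> play, studies -> studi
```

Note that "studi" is not a real word, which is exactly the shortcoming that
lemmatisation (next section) addresses.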
*Lemmatisation*
Stemming and lemmatisation are alternative techniques to one another, since both
function by removing affixes. However, lemmatisation differs from stemming in that
the word resulting from the removal of the affix (known as the lemma) is always
meaningful.
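For example, with NLTK's WordNetLemmatizer (assuming `nltk` is installed and the
WordNet data has been downloaded with `nltk.download('wordnet')`); unlike the
stemmer above, the output is a real word:

```python
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("studies"))          # study
print(lemmatizer.lemmatize("better", pos="a"))  # good
```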
*Bag of Words*
A bag of words is a representation of text that describes the occurrence of words
within a document. It has two components: a vocabulary of known words, and a
measure of the presence of those known words.
Bag of words is a natural language processing model that helps extract features
from text that machine learning techniques can use. We count the occurrences of
each word in the bag of words and build the corpus's vocabulary.
Step-by-step approach to implementing the bag of words algorithm (see the sketch
after this list):
1. Text Normalisation: Collect the data and pre-process it.
2. Create Dictionary: Make a list of all the unique words occurring in the corpus
(the vocabulary).
3. Create document vectors: For each document in the corpus, count how many times
each word from the unique list occurs.
4. Repeat this to create document vectors for all the documents.
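A short Python sketch of these four steps on a made-up two-document corpus:

```python
def bag_of_words(documents):
    # Step 1: text normalisation (here just lowercasing and splitting).
    tokenized = [doc.lower().split() for doc in documents]

    # Step 2: build the vocabulary of unique words.
    vocabulary = sorted({word for doc in tokenized for word in doc})

    # Steps 3-4: one count vector per document.
    vectors = [[doc.count(word) for word in vocabulary] for doc in tokenized]
    return vocabulary, vectors

docs = ["the cat sat on the mat", "the dog sat"]
vocab, vectors = bag_of_words(docs)
print(vocab)    # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(vectors)  # [[1, 0, 1, 1, 1, 2], [0, 1, 0, 0, 1, 1]]
```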
*Term Frequency*
The measurement of how frequently a term occurs within a document is called term
frequency. The simplest calculation is to count the occurrences of each word.
However, there are ways to adjust that value based on the length of the document
or on the frequency of the most frequent term, for example by dividing the raw
count by the total number of words in the document.
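A minimal Python sketch of term frequency, with the raw count normalised by
document length so long documents do not dominate:

```python
def term_frequency(document):
    tokens = document.lower().split()
    counts = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    # Normalised TF: raw count divided by the total number of tokens.
    return {token: count / len(tokens) for token, count in counts.items()}

print(term_frequency("the cat sat on the mat"))
# 'the' appears 2 times out of 6 tokens, so its TF is about 0.33;
# every other word has TF of about 0.17.
```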