POS Tagging (1)
Tagging is a kind of classification that may be defined as the automatic assignment of descriptors to tokens. Here the descriptor is called a tag, and it may represent part-of-speech information, semantic information and so on.
Part-of-Speech (PoS) tagging, then, may be defined as the process of assigning one of the parts of speech to a given word; it is generally called POS tagging. In simple words, POS tagging is the task of labelling each word in a sentence with its appropriate part of speech. We already know that parts of speech include nouns, verbs, adverbs, adjectives, pronouns, conjunctions and their sub-categories.
Most POS tagging approaches fall under Rule-based POS tagging, Stochastic POS tagging or Transformation-based tagging.
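As a quick, hedged illustration, the following sketch tags a sentence with NLTK's off-the-shelf perceptron tagger (a stochastic tagger). It assumes NLTK is installed; the exact names of the data packages it downloads can vary slightly between NLTK versions.

# Sketch: POS tagging with NLTK's built-in tagger.
# Assumes nltk is installed; data package names may differ by NLTK version.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tokens = nltk.word_tokenize("The quick brown fox jumps over the lazy dog")
print(nltk.pos_tag(tokens))
# e.g. [('The', 'DT'), ('quick', 'JJ'), ('fox', 'NN'), ...] (exact tags depend on the model)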
Transformation-based Tagging
Transformation-based tagging is also called Brill tagging. It is an instance of transformation-based learning (TBL), a rule-based algorithm for automatically assigning POS tags to the given text. TBL captures linguistic knowledge in a readable form and transforms one state to another by applying transformation rules.
It draws inspiration from both of the previously explained taggers, rule-based and stochastic. Like a rule-based tagger, it relies on rules that specify which tags should be assigned to which words. Like a stochastic tagger, it is a machine-learning technique in which the rules are automatically induced from data.
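As a hedged sketch of how such rules are induced in practice, the code below trains a Brill tagger with NLTK on a slice of the Penn Treebank sample, starting from a simple unigram baseline. The corpus, the fntbl37 template set and the rule limit are illustrative choices, not anything prescribed by the text.

# Sketch: inducing Brill transformation rules with NLTK.
# Assumes nltk is installed and the 'treebank' sample has been fetched
# via nltk.download('treebank'); corpus and settings are illustrative.
from nltk.corpus import treebank
from nltk.tag import DefaultTagger, UnigramTagger
from nltk.tag.brill import fntbl37
from nltk.tag.brill_trainer import BrillTaggerTrainer

train_sents = treebank.tagged_sents()[:3000]

# Initial state: a unigram lookup tagger, falling back to NN for unknown words.
baseline = UnigramTagger(train_sents, backoff=DefaultTagger("NN"))

# Learn up to 10 transformation rules that correct the baseline's mistakes.
trainer = BrillTaggerTrainer(baseline, fntbl37(), trace=0)
brill_tagger = trainer.train(train_sents, max_rules=10)

for rule in brill_tagger.rules():        # the induced, human-readable rules
    print(rule)
print(brill_tagger.tag("The dogs bark loudly".split()))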
Example
Hidden Markov Models (HMMs), which underlie stochastic tagging, can be illustrated with a coin-tossing experiment. Suppose a sequence of hidden coin-tossing experiments is carried out and we see only the observation sequence of heads and tails. The actual details of the process, such as how many coins are used and the order in which they are selected, are hidden from us. By observing this sequence of heads and tails alone, we can build several HMMs to explain it. Following is one form of Hidden Markov Model for this problem −
We assume that there are two states in the HMM, each corresponding to the selection of a different biased coin. The following matrix gives the state transition probabilities −
A = [ a11  a12 ]
    [ a21  a22 ]
Here,
aij = the probability of a transition from state i to state j.
a11 + a12 = 1 and a21 + a22 = 1
P1 = the probability of heads for the first coin, i.e. the bias of the first coin.
P2 = the probability of heads for the second coin, i.e. the bias of the second coin.
We can also create an HMM model assuming that there are 3 coins or more.
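To make the two-coin model concrete, here is a minimal simulation sketch in NumPy. The numeric values chosen for A, P1, P2 and the initial distribution are illustrative assumptions, not values given above.

# Sketch: generating a heads/tails sequence from the two-coin HMM.
# The hidden state is which biased coin is tossed; we observe only H/T.
import numpy as np

rng = np.random.default_rng(0)

A  = np.array([[0.7, 0.3],      # a11, a12: transitions out of state 1 (coin 1)
               [0.4, 0.6]])     # a21, a22: transitions out of state 2 (coin 2)
P  = np.array([0.9, 0.2])       # P1, P2: probability of heads for each coin
pi = np.array([0.5, 0.5])       # initial state distribution

def simulate(T):
    """Return the hidden state sequence and the observed H/T sequence."""
    states, obs = [], []
    s = rng.choice(2, p=pi)                 # pick the first coin
    for _ in range(T):
        obs.append("H" if rng.random() < P[s] else "T")
        states.append(s + 1)                # 1-based, matching the text
        s = rng.choice(2, p=A[s])           # move to the next coin
    return states, obs

hidden, observed = simulate(10)
print("hidden coins :", hidden)
print("observations :", "".join(observed))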
This way, we can characterize an HMM by the following elements −
N, the number of states in the model (in the above example N = 2, only two states).
M, the number of distinct observation symbols that can appear in each state (in the above example M = 2, i.e., H or T).
A, the state transition probability distribution (the matrix A in the above example).
P, the probability distribution of the observable symbols in each state (in our example, P1 and P2).
I, the initial state distribution.
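Given these five elements, one common way to score how well a particular model explains an observed heads/tails sequence is the forward recursion for P(observations | model). The sketch below reuses the illustrative numbers from the simulation above and is only one possible formulation, not something prescribed by the text.

# Sketch: forward algorithm, P(observation sequence | HMM) for the coin model.
import numpy as np

A  = np.array([[0.7, 0.3], [0.4, 0.6]])   # state transition probabilities
B  = np.array([[0.9, 0.1],                # state 1: P(H), P(T), i.e. P1 and 1 - P1
               [0.2, 0.8]])               # state 2: P(H), P(T), i.e. P2 and 1 - P2
pi = np.array([0.5, 0.5])                 # initial state distribution
sym = {"H": 0, "T": 1}                    # the M = 2 observation symbols

def likelihood(obs):
    """Sum over all hidden state paths of P(path, obs) given the model."""
    alpha = pi * B[:, sym[obs[0]]]            # initialise with the first symbol
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, sym[o]]    # propagate through A, then emit
    return alpha.sum()

print(likelihood("HHTHTT"))   # probability this model assigns to the sequence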
When an HMM is used for POS tagging, the hidden states are the tags and the observations are the words. The tagger looks for the tag sequence C1,..., CT that is most probable given the word sequence W1,..., WT; by Bayes' rule, and because the probability of the words themselves does not depend on the chosen tags, this is equivalent to finding the sequence that maximizes
PROB (C1,..., CT) * PROB (W1,..., WT | C1,..., CT) (1)
Two simplifying assumptions make the two probabilities in equation (1) tractable to estimate.
First Assumption
The first probability in equation (1) is approximated by assuming that the probability of a tag depends only on the previous tag (bigram model), on the previous two tags (trigram model) or, in general, on the previous n-1 tags (n-gram model), which, mathematically, can be expressed as follows −
PROB (C1,..., CT) = Πi=1..T PROB (Ci | Ci-n+1,..., Ci-1) (n-gram model)
PROB (C1,..., CT) = Πi=1..T PROB (Ci | Ci-1) (bigram model)
The beginning of a sentence can be accounted for by assuming an initial probability for each tag.
PROB (C1 | C0) = PROBinitial (C1)
Second Assumption
The second probability in equation (1) can be approximated by assuming that a word appears in a category independently of the words in the preceding or succeeding categories, which can be expressed mathematically as follows −
PROB (W1,..., WT | C1,..., CT) = Πi=1..T PROB (Wi | Ci)
Now, on the basis of the above two assumptions, our goal reduces to finding a sequence C
which maximizes
Πi=1...T PROB(Ci|Ci-1) * PROB(Wi|Ci)
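To show what maximizing this product means, the following sketch scores every candidate tag sequence for a tiny two-word sentence by brute force. The tag set and all probability values are made-up illustrative numbers; in practice the probabilities would be estimated from a corpus as described below, and the search would use dynamic programming rather than enumeration.

# Sketch: choose the tag sequence C maximising
#   prod_i PROB(Ci | Ci-1) * PROB(Wi | Ci)
# by exhaustive search. All numbers below are hypothetical illustrations.
from itertools import product

TAGS = ["NOUN", "VERB"]

# PROB(Ci | Ci-1); "<s>" is the pseudo-tag C0 for the start of the sentence.
trans = {("<s>", "NOUN"): 0.8, ("<s>", "VERB"): 0.2,
         ("NOUN", "NOUN"): 0.3, ("NOUN", "VERB"): 0.7,
         ("VERB", "NOUN"): 0.6, ("VERB", "VERB"): 0.4}

# PROB(Wi | Ci), the lexical probabilities.
emit = {("dogs", "NOUN"): 0.2, ("dogs", "VERB"): 0.01,
        ("bark", "NOUN"): 0.02, ("bark", "VERB"): 0.3}

def score(words, tags):
    p, prev = 1.0, "<s>"
    for w, c in zip(words, tags):
        p *= trans[(prev, c)] * emit[(w, c)]
        prev = c
    return p

words = ["dogs", "bark"]
best = max(product(TAGS, repeat=len(words)), key=lambda tags: score(words, tags))
print(best, score(words, best))   # expected best sequence: ('NOUN', 'VERB')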
The question that arises here is whether converting the problem to the above form has really helped us. The answer is yes, it has. If we have a large tagged corpus, then the two probabilities in the above formula can be calculated as −
PROB (Ci = VERB | Ci-1 = NOUN) = (# of instances where VERB follows NOUN) / (# of instances where NOUN appears) (2)
PROB (Wi | Ci) = (# of instances where Wi appears with tag Ci) / (# of instances where Ci appears) (3)
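The counting in equations (2) and (3) is straightforward to implement. The sketch below estimates both probabilities from a tiny hand-made tagged corpus; the corpus itself is an illustrative assumption.

# Sketch: estimating PROB(Ci | Ci-1) and PROB(Wi | Ci) by counting,
# exactly as in equations (2) and (3). The toy corpus is hypothetical.
from collections import Counter

corpus = [  # a list of (word, tag) sentences
    [("dogs", "NOUN"), ("bark", "VERB")],
    [("cats", "NOUN"), ("chase", "VERB"), ("dogs", "NOUN")],
]

tag_count = Counter()        # of instances where tag C appears
bigram_count = Counter()     # of instances where tag Cj follows tag Ci
word_tag_count = Counter()   # of instances where word W appears with tag C

for sent in corpus:
    prev = None
    for word, tag in sent:
        tag_count[tag] += 1
        word_tag_count[(word, tag)] += 1
        if prev is not None:
            bigram_count[(prev, tag)] += 1
        prev = tag

def prob_transition(ci, prev):           # equation (2)
    return bigram_count[(prev, ci)] / tag_count[prev]

def prob_emission(word, ci):             # equation (3)
    return word_tag_count[(word, ci)] / tag_count[ci]

print(prob_transition("VERB", "NOUN"))   # P(VERB | NOUN) = 2 / 3 on this corpus
print(prob_emission("dogs", "NOUN"))     # P(dogs | NOUN) = 2 / 3 on this corpus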