Assignment Two
Jimma University
ZERIHUN
Why Morphological Analysis Is Important
Morphological analysis is the process of providing grammatical information about a word on
the basis of the properties of the morphemes it contains. It is an integral part of larger natural
language processing projects such as text-to-speech synthesis, information extraction, and
machine translation. It draws on morphology, the subdiscipline of linguistics that deals with the
internal structure of words. Example: consider the following sets of English word pairs:
Verb Noun
Bake Baker
Eat Eater
Run Runner
Write Writer
In these word pairs we observe a systematic form-meaning correspondence: the presence of –er
in the words in the right column correlates with the meaning component ‘one who Vs’ where V
stands for the meaning of the corresponding verb in the left column.
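The form-meaning correspondence above can be sketched as a small function. This is only an illustration under simplified spelling assumptions (drop nothing after a final e, double the final consonant of a short consonant-vowel-consonant verb); the name agent_noun is hypothetical, and the rules are meant for the four verbs in the table, not for English in general.

```python
VOWELS = "aeiou"

def agent_noun(verb: str) -> str:
    """Derive the 'one who Vs' noun from a verb by adding the suffix -er."""
    v = verb.lower()
    if v.endswith("e"):
        return v + "r"                      # bake -> baker, write -> writer
    # Simplified rule: double a final consonant that follows a single vowel
    # in a consonant-vowel-consonant ending (run -> runner).
    if (len(v) >= 3 and v[-1] not in VOWELS + "wxy"
            and v[-2] in VOWELS and v[-3] not in VOWELS):
        return v + v[-1] + "er"
    return v + "er"                         # eat -> eater

for verb in ["bake", "eat", "run", "write"]:
    print(verb, "->", agent_noun(verb))
```

The doubling rule is deliberately naive and would over-apply to longer verbs, which is one reason real systems keep spelling rules separate from the lexicon.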
Morphological analysis also helps determine part-of-speech structure in syntactic parsing and
in the analysis of a sentence. Information about verbal inflection is especially important for
word order. Moreover, a word may consist of two or more meaningful parts. The parts of a word
that carry the smallest units of meaning are known as morphemes. For example, the word
precancellation can be morphologically analyzed into three separate morphemes: the prefix pre-,
the root cancella, and the suffix -tion. The interpretation of a morpheme stays the same across
the words it appears in, so humans can break an unknown word into morphemes to understand its
meaning. For example, adding the suffix -ed to a verb conveys that the action of the verb took
place in the past. Morphemes that cannot be divided further and have meaning by themselves are
called lexical morphemes (e.g. table, chair). Morphemes (e.g. -ed, -ing, -est, -ly, -ful) that
are combined with a lexical morpheme are known as grammatical morphemes (e.g. worked,
consulting, smallest, likely, useful). Grammatical morphemes that occur only in combination
with another morpheme are called bound morphemes (e.g. -ed, -ing).
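As a rough sketch of how a word can be broken into a prefix, a root, and a suffix, the following function strips one known prefix and one known suffix. The affix lists and the name split_affixes are assumptions made for this illustration; there is no dictionary check, so the function would wrongly split words such as "ready" (re + ady).

```python
# Hypothetical affix inventories, limited to the examples in the text.
PREFIXES = ["pre", "un", "re"]
SUFFIXES = ["tion", "ing", "est", "ful", "ed", "ly"]

def split_affixes(word):
    """Return (prefix, root, suffix); missing affixes are empty strings."""
    w = word.lower()
    # Take the first listed prefix that the word starts with, if any.
    prefix = next((p for p in PREFIXES if w.startswith(p)), "")
    rest = w[len(prefix):]
    # Take the first listed suffix that leaves a non-empty root.
    suffix = next((s for s in SUFFIXES
                   if rest.endswith(s) and len(rest) > len(s)), "")
    root = rest[: len(rest) - len(suffix)]
    return prefix, root, suffix

for word in ["precancellation", "worked", "smallest", "useful", "table"]:
    print(word, "->", split_affixes(word))
```

A word such as "table" comes back with empty affixes, which matches the idea of a lexical morpheme that cannot be divided further.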
A morphological analyzer and a morphological generator are two essential, basic tools for
building natural language processing applications. They supply information concerning the
morphosyntactic properties of the words they analyze or construct.
A morphological analyzer is a program for analyzing the morphology of an input word. The
analyzer reads the inflected surface form of each word in a text and provides its lexical form
together with grammatical information: for nouns it provides gender, number, and case, and for
verbs it provides tense, aspect, and modality. Generation is the inverse process: given a root
and its grammatical features, the generator produces the corresponding word form of the root.
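The inverse relationship between analysis and generation can be illustrated with a lookup table. This is a toy sketch, not a real analyzer: the table ANALYSES, its four entries, and the feature labels are all made up for the example, and a real system would compute these mappings with rules instead of listing them.

```python
# Hypothetical toy table: surface form -> (root, grammatical features).
ANALYSES = {
    "cat": ("cat", ("noun", "singular")),
    "cats": ("cat", ("noun", "plural")),
    "walk": ("walk", ("verb", "present")),
    "walked": ("walk", ("verb", "past")),
}

# Generation is the inverse mapping: (root, features) -> surface form.
GENERATION = {analysis: surface for surface, analysis in ANALYSES.items()}

def analyze(surface):
    """Return (root, features) for a surface form, or None if unknown."""
    return ANALYSES.get(surface.lower())

def generate(root, features):
    """Return the surface form for a root and its features, or None."""
    return GENERATION.get((root, tuple(features)))

print(analyze("walked"))
print(generate("cat", ("noun", "plural")))
```

Because each table entry is unique, generating from the result of an analysis gives back the original word.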
The analyzer includes a recognition engine and algorithms for identifying suffixes and finding
a stem within the input word. A morphological analyzer takes a complete word form as its input
and returns the syntactic and morphological properties of the word as its output. Morphological
analyzers are composed of three parts:
Morpheme lexicon
Set of rules governing the spelling and composition of morphologically complex words.
Decision algorithm
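The three parts listed above could fit together roughly as follows. This is a minimal sketch under stated assumptions: the lexicon entries, the four suffix rules, the single spelling rule (restoring a dropped final e), and the "first match wins" decision strategy are all chosen for illustration only.

```python
# 1. Morpheme lexicon (hypothetical): stem -> part of speech.
LEXICON = {"bake": "verb", "heat": "verb", "vessel": "noun", "large": "adj"}

# 2. Rules: (suffix, part of speech the stem must have, feature it marks).
RULES = [
    ("ed", "verb", "past"),
    ("ing", "verb", "progressive"),
    ("s", "noun", "plural"),
    ("er", "adj", "comparative"),
]

def candidate_stems(base):
    """Spelling rule: the stem may have lost a final e (bak + ed -> bake)."""
    return [base, base + "e"]

# 3. Decision algorithm: accept the first rule whose stem is in the lexicon
#    with the required part of speech.
def analyze(word):
    w = word.lower()
    if w in LEXICON:
        return (w, LEXICON[w], "base")
    for suffix, pos, feature in RULES:
        if w.endswith(suffix):
            for stem in candidate_stems(w[: -len(suffix)]):
                if LEXICON.get(stem) == pos:
                    return (stem, pos, feature)
    return None  # no analysis found

for word in ["baked", "heating", "vessels", "larger", "baker"]:
    print(word, "->", analyze(word))
```

The lexicon check is what stops the rules from accepting arbitrary strings: "baker" ends in -er, but "bake" is not listed as an adjective, so this toy analyzer returns no analysis for it.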
Example
Word Tag
Heat verb (noun)
Water noun (verb)
In prep (noun, adv)
A det (noun)
Large adj (noun)
Vessel noun
Part-of-speech tagging is the process of assigning a part-of-speech marker to each word in an
input text. The input to a tagging algorithm is a sequence of (tokenized) words and a tagset,
and the output is a sequence of tags, one per token. Tagging is a disambiguation task: words
are ambiguous (they have more than one possible part of speech), and the goal is to find the
correct tag for the situation. For example, book can be a verb (book that flight) or a noun
(hand me that book). That can be a determiner (Does that flight serve dinner) or a
complementizer (I thought that your flight was earlier). The goal of POS tagging is to resolve
these ambiguities, choosing the proper tag for the context.
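To make the idea of tagging as disambiguation concrete, here is a small sketch that picks one tag per token using context. The lexicon, the tag names, and the three hand-written rules are assumptions made only to cover the book and that examples; practical taggers are usually trained on annotated text instead.

```python
# Hypothetical lexicon: word -> possible tags (first one is the default).
LEXICON = {
    "book": ["noun", "verb"], "that": ["det", "comp"], "flight": ["noun"],
    "hand": ["noun", "verb"], "me": ["pron"], "i": ["pron"],
    "thought": ["verb", "noun"], "your": ["poss"], "was": ["verb"],
    "earlier": ["adv"],
}

def tag(tokens):
    """Return one tag per token, resolving ambiguity from the context."""
    tags = []
    for i, token in enumerate(tokens):
        options = LEXICON.get(token.lower(), ["noun"])  # unknown -> noun
        following = (LEXICON.get(tokens[i + 1].lower(), ["noun"])
                     if i + 1 < len(tokens) else [])
        previous = tags[-1] if tags else None
        if previous == "det" and "noun" in options:
            choice = "noun"        # a determiner is followed by a noun
        elif previous is None and "verb" in options:
            choice = "verb"        # assume a sentence-initial imperative
        elif "det" in options and "comp" in options:
            # "that" is a determiner only if a possible noun follows it
            choice = "det" if "noun" in following else "comp"
        else:
            choice = options[0]
        tags.append(choice)
    return tags

print(tag("book that flight".split()))
print(tag("hand me that book".split()))
print(tag("I thought that your flight was earlier".split()))
```

The same word receives different tags in different sentences: book is tagged as a verb in the first sentence and as a noun in the second.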
Part-of-speech tagging in itself may not be the solution to any particular NLP problem. It is,
however, a prerequisite step that simplifies many different problems. Part-of-speech (hereafter
POS) tags are useful for building parse trees, which are used in building named entity
recognizers (most named entities are nouns) and in extracting relations between words. POS
tagging is also essential for building lemmatizers, which reduce a word to its root form.
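A short sketch can show why a lemmatizer benefits from POS tags: the same surface form may reduce to different root forms depending on its tag. The function lemmatize and its suffix rules are hypothetical and far simpler than a real lemmatizer, which would also use a dictionary and handle irregular forms.

```python
def lemmatize(word, pos):
    """Reduce a word to its root form using its POS tag (simplified rules)."""
    w = word.lower()
    if pos == "verb":
        # Strip a verbal suffix if enough of the word remains.
        for suffix in ("ing", "ed", "s"):
            if w.endswith(suffix) and len(w) > len(suffix) + 2:
                return w[: -len(suffix)]
    elif pos == "noun":
        # Strip a plural -s, but leave words such as "glass" alone.
        if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
            return w[:-1]
    return w

print(lemmatize("meeting", "verb"))   # treated as a form of "meet"
print(lemmatize("meeting", "noun"))   # kept as the noun "meeting"
print(lemmatize("books", "noun"))
```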
Example