NLP Textbook Continued
The rules can be categorized into two parts:

a. Lexical rules: word level
b. Context-sensitive rules: sentence level

The transformation-based approaches use a predefined set of handcrafted rules as well as automatically induced rules that are generated during training.

Basic idea:
1. Assign all possible tags to words.
2. Remove tags according to a set of rules of the type: if word+1 is an adjective, adverb or quantifier, and the following position is a sentence boundary, and word-1 is not a verb like "consider", then eliminate non-adverb tags; else eliminate the adverb tag.

Typically more than 1000 hand-written rules are used, but they may also be machine-learned.

Consider the sequence "promised to back the bill". First, all possible tags are assigned (promised: VBN/VB/VBD; to: TO; back: VB; the: DT; bill: NN). Then rules are applied to remove impossible tags. Sample rule: if VBD is an option alongside VBN, eliminate VBN.

[Figure 2: A sentence with NP, VP and PP: the parse tree of "The girl plucked the flower with a long stick", with the NP "the girl", the VP "plucked the flower with a long stick" and the PP "with a long stick".]

Noun phrase
A noun phrase is a phrase whose head is a noun or a pronoun, optionally accompanied by a set of modifiers. It can function as a subject, object or complement. The modifiers of a noun phrase can be determiners or adjective phrases. The obligatory constituent of a noun phrase is the noun head; all other constituents are optional. These structures can be represented using phrase structure rules. As discussed earlier, phrase structure rules are of the form A → B C, which states that constituent A can be rewritten as two constituents B and C. These rules specify which elements can occur in a phrase and in what order. Using this notation, we can represent the phrase structure rules for noun phrases as follows:

NP → Pronoun
NP → Det Noun
NP → Noun
NP → Adj Noun
NP → Det Adj Noun

We can combine all these rules into a single phrase structure rule:

NP → (Det) (Adj) Noun | Pronoun

The constituents in parentheses are optional. This rule states that a noun phrase consists of a noun, possibly preceded by a determiner and an adjective (in that order). This rule does not cover all possible NPs. A noun phrase may include post-modifiers and more than one adjective. For example, it may include a prepositional phrase (PP). More than one adjective is handled by allowing an adjective phrase (AP) in place of the adjective in the rule. After incorporating PP and AP in the phrase structure rule, we get the following:

NP → (Det) (AP) Noun (PP)

The following are a few examples of noun phrases:

they .... (1)
the foggy morning .... (2)
chilled water .... (3)
the beautiful lake in Kashmir .... (4)
cold banana shake .... (5)

Let's see how the above phrases can be generated using phrase structure rules. Phrase (1) consists only of a pronoun. Phrase (2) consists of a determiner, an adjective (foggy) that stands for an entire adjective phrase, and a noun. Phrase (3) comprises an adjective phrase and a noun. Phrase (4) consists of a determiner (the), an adjective phrase (beautiful), a noun (lake) and a prepositional phrase (in Kashmir). Phrase (5) consists of an adjective followed by a sequence of nouns. A sequence of nouns is termed a nominal. None of the phrase structure rules discussed so far is able to handle nominals, so we modify our rules to cover this situation:

NP → (Det) (AP) Nom (PP)
Nom → Noun | Noun Nom

A noun phrase can act as a subject, an object or a predicate.
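To make the NP rules concrete, here is a small illustrative sketch (not from the textbook) using NLTK's CFG machinery. Plain CFG notation has no shorthand for optional constituents, so NP → (Det) (AP) Nom (PP) is expanded into explicit alternatives; the tiny lexicon is assumed just for the demo.

```python
# Illustrative sketch of the NP rules in NLTK (toy lexicon assumed).
# The parenthesized optional constituents of NP -> (Det) (AP) Nom (PP)
# become alternative productions in plain CFG notation.
import nltk

np_grammar = nltk.CFG.fromstring("""
    NP -> Pronoun | Nom | Det Nom | AP Nom | Det AP Nom
    NP -> Nom PP | Det Nom PP | AP Nom PP | Det AP Nom PP
    Nom -> Noun | Noun Nom
    AP -> Adj
    PP -> Prep NP
    Det -> 'the'
    Adj -> 'beautiful' | 'cold'
    Noun -> 'lake' | 'kashmir' | 'banana' | 'shake'
    Prep -> 'in'
    Pronoun -> 'they'
""")

parser = nltk.ChartParser(np_grammar)
for tree in parser.parse("the beautiful lake in kashmir".split()):
    tree.pretty_print()   # example (4): NP -> Det AP Nom PP
```

The same grammar also accepts the nominal example (5), "cold banana shake", via Nom → Noun Nom.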
Verb phrase
Analogous to the noun phrase is the verb phrase, which is headed by a verb. There is a wide range of constituents that can modify the verb, which makes verb phrases a bit more complex. The verb phrase organises the various elements of the sentence that depend on the verb. The following are some examples of verb phrases:

The boy kicked the ball. .... (1)
Khushboo slept in the garden. .... (2)
The dog gave the girl a book. .... (3)
The boy gave the girl a book with a blue cover. .... (4)

As you can see from these examples, a verb phrase can consist of a verb alone {VP → Verb}, a verb followed by an NP {VP → Verb NP in (1)}, a verb followed by a PP {VP → Verb PP in (2)}, a verb followed by two NPs {VP → Verb NP NP in (3)}, or a verb followed by two NPs and a PP {VP → Verb NP NP PP in (4)}. In general, the number of NPs in a VP is limited to two, whereas it is possible to add more than two PPs:

VP → Verb (NP) (NP) (PP)*

Things are further complicated by the fact that objects may also be entire clauses, as in the sentence "I know that Taj is one of the Seven Wonders." Hence, we must also allow for an alternative phrase structure rule in which the NP is replaced by S.

Prepositional phrase
Prepositional phrases are headed by a preposition. They consist of a preposition, possibly followed by some other constituent, usually a noun phrase:

We played volleyball on the beach.

We can also have a prepositional phrase that consists of just a preposition:

John went outside.

The phrase structure rule that captures both eventualities is:

PP → Prep (NP)

Adjective phrase
The head of an adjective phrase (AP) is an adjective. An AP consists of an adjective, which may be preceded by an adverb and followed by a PP. Here are some examples:

Ashish is in love.
The train is very late.
My sister is fond of animals.

The phrase structure rule for adjective phrases is:

AP → (Adv) Adj (PP)

Adverb phrase
An adverb phrase consists of an adverb, possibly preceded by a degree adverb. Here is an example:

Time passes very quickly.

AdvP → (Intens) Adv

Sentences
Having discussed phrase structures, we now turn to sentences. A sentence can have varying structure. The four commonly found structures are the declarative structure, the imperative structure, the yes-no question structure and the WH-question structure.

Sentences with a declarative structure have a subject followed by a predicate. The subject of a declarative sentence is a noun phrase and the predicate is a verb phrase, e.g., "I like horse riding." The phrase structure rule for declarative sentences is:

S → NP VP

Sentences with an imperative structure usually begin with the verb phrase and lack a subject. The subject of these types of sentences is implicit and is understood to be "you". These types of sentences are used for commands and suggestions, and hence are called imperative. The grammar rule for this kind of sentence structure is:

S → VP

Examples of this kind of sentence are as follows:

Look at the door.
Give me the book.
Stop talking.
Show me the latest design.

Sentences with a yes-no question structure ask questions that can be answered with yes or no. These sentences begin with an auxiliary verb, followed by a subject NP, followed by a VP. Here are some examples:

Do you have a red pen?
Is the game over?
Can you show me your album?

We expand our grammar by adding another rule for this structure:

S → Aux NP VP

Sentences with a WH-question structure are more complex. These sentences begin with WH words (who, which, where, what, why and how). A WH question may include a WH phrase as the subject or may include another subject. Consider the following example:

Which team won the match?
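The sentence-level and phrase-level rules above can be combined into one toy grammar and used to parse the Figure 2 sentence. The sketch below (an illustrative demo with an assumed mini-lexicon, not the book's code) uses NLTK's chart parser; it returns two trees, because the PP "with a long stick" can attach either to the VP or to the object NP, a form of syntactic ambiguity discussed later in this chapter.

```python
# Toy grammar combining the S, NP, VP and PP rules from this section.
# Parsing the Figure 2 sentence yields TWO trees (PP attachment ambiguity).
import nltk

grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> Det Nom | Det Nom PP
    Nom -> Noun | AP Nom
    AP -> Adj
    VP -> Verb NP | VP PP
    PP -> Prep NP
    Det -> 'the' | 'a'
    Noun -> 'girl' | 'flower' | 'stick'
    Adj -> 'long'
    Verb -> 'plucked'
    Prep -> 'with'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the girl plucked the flower with a long stick".split()):
    print(tree)   # two parses: PP attached to the VP, or inside the object NP
```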
Relations among lexemes and their senses: homonymy, polysemy, synonymy, hyponymy

One way to approach lexical semantics is to study the relationships among lexemes (a lexeme is an abstract representation of a "word", the lexical entry in a dictionary). The semantics of a lexeme can be understood by analysing its relationships with other lexemes. Lexical semantic information is useful for a wide variety of NLP applications. This section discusses a variety of relationships that hold among lexemes and their senses.

Homonymy
The first relationship that we discuss is homonymy, which is perhaps the simplest relationship that exists among lexemes. Homonyms are words that have the same form but different, unrelated meanings. A classic example of homonymy is bank (river bank or financial institution). A related idea is that of homophones, which refers to words that are pronounced in the same way but differ in meaning or spelling (e.g., be and bee, bear and bare).

Polysemy
Many words have more than one meaning or sense. Unlike homonyms, polysemes are words with related meanings. This linguistic phenomenon is called polysemy or lexical ambiguity. Words that have several senses are ambiguous and are called polysemous. For example, the word "chair" can refer to a piece of furniture, a person, the act of presiding over a discussion, etc. The word "employ" is polysemous, as its two meanings, to hire (employ a person) and to accept (employ an idea), are related. In a particular use, only one of these meanings is correct.

Hyponymy
The hypernym is a word with the more general sense. The word automobile is a hypernym of car and truck. The hyponym is a word with the more specific meaning. In the relationship between car and automobile, car is a hyponym of automobile. Antonymy is a semantic relationship that holds between words that express opposite meanings. The word good is an antonym of bad, and white is an antonym of black.

Synonymy
Synonymy defines the relationship between different words that have similar meaning. A simple way to decide whether two words are synonyms is to check for substitutability: two words are synonyms in a context if they can be substituted for each other without changing the meaning of the sentence.

These relationships are useful in organising words in lexical databases. One widely known lexical database is WordNet, discussed in the next topic.

4. WordNet
WordNet is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts can be navigated with a browser. WordNet is also freely and publicly available for download. WordNet's structure makes it a useful tool for computational linguistics and natural language processing.

WordNet superficially resembles a thesaurus, in that it groups words together based on their meanings. However, there are some important distinctions. First, WordNet interlinks not just word forms (strings of letters) but specific senses of words. As a result, words that are found in close proximity to one another in the network are semantically disambiguated. Second, WordNet labels the semantic relations among words, whereas the groupings of words in a thesaurus do not follow any explicit pattern other than meaning similarity.

Structure
The main relation among words in WordNet is synonymy, as between the words shut and close or car and automobile.
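These groupings can be inspected directly. A minimal sketch using NLTK's WordNet interface (assuming nltk is installed and the wordnet corpus has been downloaded via nltk.download('wordnet')) lists the first few senses of the homonym "bank", each with its gloss and its synonym set:

```python
# Sketch: inspect the senses (synsets) of the homonym "bank" with NLTK.
from nltk.corpus import wordnet as wn

for synset in wn.synsets('bank')[:4]:                 # first few senses only
    print(synset.name(), '-', synset.definition())    # sense id and gloss
    print('   synonyms:', [lemma.name() for lemma in synset.lemmas()])
```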
Synonyms, words that denote the same concept and are interchangeable in many contexts, are grouped into unordered sets (synsets). Each of WordNet's 117,000 synsets is linked to other synsets by means of a small number of "conceptual relations." Additionally, a synset contains a brief definition (gloss) and, in most cases, one or more short sentences illustrating the use of the synset members. Word forms with several distinct meanings are represented in as many distinct synsets. Thus, each form-meaning pair in WordNet is unique.

The most frequently encoded relation among synsets is the super-subordinate relation (also called hyperonymy, hyponymy or IS-A relation). It links more general synsets like {furniture} to increasingly specific ones like {bed} and {bunkbed}. Thus, WordNet states that the category furniture includes bed, which in turn includes bunkbed; conversely, concepts like bed and bunkbed make up the category furniture. All noun hierarchies ultimately go up to the root node {entity}. The hyponymy relation is transitive: if an armchair is a kind of chair, and if a chair is a kind of furniture, then an armchair is a kind of furniture. WordNet distinguishes among types (common nouns) and instances (specific persons, countries and geographic entities). Thus, armchair is a type of chair, while Barack Obama is an instance of a president. Instances are always leaf (terminal) nodes in their hierarchies.

Meronymy, the part-whole relation, holds between synsets like {chair} and {back, backrest}, {seat} and {leg}. Parts are inherited from their superordinates: if a chair has legs, then an armchair has legs as well. Parts are not inherited "upward", as they may be characteristic only of specific kinds of things rather than the class as a whole: chairs and kinds of chairs have legs, but not all kinds of furniture have legs.

Verb synsets are arranged into hierarchies as well; verbs towards the bottom of the trees (troponyms) express increasingly specific manners characterizing an event, as in {communicate}-{talk}-{whisper}. The specific manner expressed depends on the semantic field; volume (as in the example above) is just one dimension along which verbs can be elaborated. Others are speed (move-jog-run) or intensity of emotion (like-love-idolize).

There are only a few adverbs in WordNet (hardly, mostly, really, etc.), as the majority of English adverbs are derived from adjectives via morphological affixation (surprisingly, strangely, etc.).

Applications of WordNet
WordNet has found numerous applications in problems related to IR and NLP. Some of these are given below:

Concept identification in natural language: WordNet can be used to identify concepts pertaining to a term, to suit them to the full semantic richness and complexity of a given information need.

Word sense disambiguation: WordNet combines features of a number of the other resources commonly used in disambiguation work. It offers sense definitions of words, identifies synsets of synonyms, defines a number of semantic relations and is freely available. This makes it the (currently) best known and most utilized resource for word sense disambiguation.

Automatic query expansion: WordNet semantic relations can be used to expand queries so that the search for a document is not confined to the pattern-matching of query terms, but also covers synonyms.

Document summarization: WordNet has found useful application in text summarization. A few approaches utilize information from WordNet to compute lexical chains.
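The relations described above are directly navigable in code. A small sketch, again assuming NLTK with its wordnet corpus downloaded, walks the hypernym (IS-A) chain and the part meronyms of car:

```python
# Sketch: navigate WordNet relations with NLTK.
from nltk.corpus import wordnet as wn

car = wn.synset('car.n.01')
print(car.hypernyms())                              # immediate IS-A parents
print([s.name() for s in car.hypernym_paths()[0]])  # chain up to entity.n.01
print(car.part_meronyms()[:5])                      # some parts of a car
```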
5. Robust Word Sense Disambiguation (WSD): Dictionary-Based Approach
In many cases, a single word in a language corresponds to more than one concept, for example the noun bank (financial institution or river bank) or the verb run (to move fast, or to direct and manage). But this does not create a problem for us. We hardly give any thought to what constitutes the correct meaning of a word or phrase, yet we generally arrive at the correct one. The process is almost effortless: people are good at resolving ambiguity by considering the context of the written text or conversation. Except in jokes and puns, where it is intended, ambiguity is not even perceived as such. However, ambiguities existing at different levels are one of the major challenges in computational linguistics.

Let us try to understand why ambiguity is difficult. It is difficult because it increases the range of possible interpretations of natural language. Suppose each word in an 8-word sentence is ambiguous and has three possible interpretations. The total number of interpretations of the whole sentence is 3^8 = 6561. Further, syntactic and pragmatic ambiguities make the actual number of interpretations even larger. Resolving all these interpretations in a reasonable amount of time is difficult, and there are words with a much larger number of senses than the three considered in this example. This gives a clear picture of the difficulty involved in the automatic interpretation of natural languages.

Ambiguity is a property of linguistic expressions. Ambiguity means being capable of being understood in more than one way, of having more than one meaning; it refers to a situation where an expression can have more than one interpretation. Ambiguity can occur at four different levels:

Lexical
Lexical ambiguity is the ambiguity of a single word. A word can be ambiguous with respect to its internal structure or to its syntactic class. For example, in the sentence "look at the screen", look is a verb, whereas in "she gave me a stern look", it is a noun. However, this type of ambiguity is treated as part-of-speech tagging in NLP and is considered to have been solved with reasonable accuracy.

Syntactic
There are different ways in which a sequence of words can be grammatically structured. Each structuring leads to a different interpretation. For example, in "the man saw the girl with the telescope", it is unclear whether the man saw a girl carrying a telescope, or whether he saw her through the telescope. It is the syntax, not the meaning of any word, which is unclear: the interpretation depends on whether the prepositional phrase "with the telescope" is attached to the girl or to the man's act of seeing.

Semantic
Semantic ambiguity occurs when the meanings of the words themselves can be misinterpreted: the meanings of the words in a phrase can be combined in different ways, leading to different interpretations. Consider the headline "Iraqi head seeks arms". The homograph "head" can be interpreted as a noun meaning either a chief or the anatomical head of a body. Likewise, the homograph "arms" can be interpreted as a plural noun meaning either weapons or body parts.

Pragmatic
Pragmatic ambiguity refers to a situation where the context of a phrase gives it multiple interpretations. For example: "Give it to the kids." Here "it" may refer to many things depending on the context. Consider a larger context:

Cake is on the table. I have prepared some snacks. Give it to the kids.

It is not clear whether "it" refers to the cake, the snacks, or both. Perhaps a still larger context can help us:

Cake is on the table. I have prepared some snacks. Give it to the kids. Kids enjoyed cake and snacks.

Now it is clear that "it" refers to both the snacks and the cake; resolving the ambiguity required discourse processing. Through discourse we understand that words have different meanings based on the context of their usage in the sentence.
If we talk about human languages, then they are ambiguous too, because many words can be interpreted in multiple ways depending upon the context of their occurrence. Word sense disambiguation, in natural language processing (NLP), may be defined as the ability to determine which meaning of a word is activated by the use of that word in a particular context. Lexical ambiguity, syntactic or semantic, is one of the very first problems that any NLP system faces. Part-of-speech (POS) taggers with a high level of accuracy can solve a word's syntactic ambiguity. On the other hand, the problem of resolving semantic ambiguity is called WSD (word sense disambiguation), and resolving semantic ambiguity is harder than resolving syntactic ambiguity. For example, consider the two distinct senses that exist for the word "bank":

The bank will not be accepting cash on Saturdays.
The river overflowed the bank.

The occurrence of the word bank clearly denotes distinct meanings. In the first sentence, it means a commercial (finance) bank, while in the second sentence it refers to the river bank. Hence, if these sentences were disambiguated by WSD, the correct meanings would be assigned as follows:

The bank/financial institution will not be accepting cash on Saturdays.
The river overflowed the bank/riverbank.

The evaluation of WSD requires the following two inputs:

A dictionary: The very first input for the evaluation of WSD is a dictionary, which is used to specify the senses to be disambiguated.

Test corpus: Another input required by WSD is a hand-annotated test corpus that has the target or correct senses. Test corpora can be of two types:
* Lexical sample: this kind of corpus is used in systems that are required to disambiguate a small sample of words.
* All-words: this kind of corpus is used in systems that are expected to disambiguate all the words in a piece of running text.

Dictionary-Based or Knowledge-Based Approach (WSD)
As the name suggests, for disambiguation these methods primarily rely on dictionaries, thesauri and lexical knowledge bases. They do not use corpus evidence for disambiguation. The Lesk method is the seminal dictionary-based method, introduced by Michael Lesk in 1986. The Lesk definition, on which the Lesk algorithm is based, is "measure overlap between sense definitions for all words in context". However, in 2000, Kilgarriff and Rosenzweig gave the simplified Lesk definition as "measure overlap between sense definitions of a word and its current context", which means identifying the correct sense of one word at a time. Here the current context is the set of words in the surrounding sentence or paragraph.

The Lesk algorithm is based on the assumption that words in a given "neighbourhood" (section of text) will tend to share a common topic. A simplified version of the Lesk algorithm is to compare the dictionary definition of an ambiguous word with the terms contained in its neighbourhood. Versions have been adapted to use WordNet. An implementation might look like this:

1. For every sense of the word being disambiguated, count the number of words that occur both in the neighbourhood of that word and in the dictionary definition of that sense.
2. Choose the sense with the highest count.
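A minimal sketch of this simplified Lesk procedure follows, using WordNet glosses as the dictionary. The helper name and the bag-of-words overlap measure are illustrative choices; NLTK also ships a ready-made variant as nltk.wsd.lesk.

```python
# Sketch of simplified Lesk: pick the sense whose gloss shares the most
# words with the surrounding context (WordNet glosses as the dictionary).
from nltk.corpus import wordnet as wn

def simplified_lesk(word, context_words):
    """Return the synset of `word` whose gloss overlaps most with the context."""
    context = set(w.lower() for w in context_words)
    best_sense, best_overlap = None, -1
    for sense in wn.synsets(word):
        gloss = set(sense.definition().lower().split())
        overlap = len(gloss & context)          # count shared words
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

sense = simplified_lesk('bank', 'the river overflowed the bank'.split())
print(sense.name(), '-', sense.definition())
```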
A frequently used example illustrating this algorithm is the context "pine cone". The following dictionary definitions are used:

Pine:
1. kinds of evergreen tree with needle-shaped leaves
2. waste away through sorrow or illness

Cone:
1. solid body which narrows to a point
2. something of this shape whether solid or hollow
3. fruit of certain evergreen trees

As can be seen, the overlaps between sense definitions are:

Pine#1 ∩ Cone#1 = 0
Pine#2 ∩ Cone#1 = 0
Pine#1 ∩ Cone#2 = 1
Pine#2 ∩ Cone#2 = 0
Pine#1 ∩ Cone#3 = 2
Pine#2 ∩ Cone#3 = 0

The best intersection is Pine#1 ∩ Cone#3 = 2 (the shared words are "evergreen" and "tree"), so these two senses are selected.

Expected Questions
1. What is semantic analysis? Why is semantic analysis difficult? Discuss different approaches to semantic analysis.
2. What is semantic analysis? Discuss, with examples, the following relationships between word meanings: homonymy, polysemy, synonymy and hyponymy.
3. What is WordNet? How is "sense" defined in WordNet? Explain with an example.
4. What do you mean by word sense disambiguation (WSD)? Discuss the dictionary-based approach for WSD.
5. What do you mean by word sense disambiguation (WSD)? Discuss knowledge-based WSD.
6. What do you mean by word sense disambiguation (WSD)? Discuss the machine learning based (Naive Bayes) approach for WSD.

Pragmatics
Pragmatics is a subfield of linguistics that studies the ways in which context contributes to meaning. Pragmatics encompasses speech act theory, conversational implicature, talk in interaction and other approaches to language behaviour in philosophy, sociology, linguistics and anthropology. Unlike semantics, which examines meaning that is conventional or "coded" in a given language, pragmatics studies how the transmission of meaning depends not only on the structural and linguistic knowledge (e.g., grammar, lexicon, etc.) of the speaker and listener, but also on the context of the utterance, any pre-existing knowledge about those involved, the inferred intent of the speaker, and other factors. In this respect, pragmatics explains how language users are able to overcome apparent ambiguity, since meaning relies on the manner, place, time, etc. of an utterance.

Pragmatics deals with using and understanding sentences in different situations and with how the interpretation of the sentence is affected. The ability to understand another speaker's intended meaning is called pragmatic competence.

Pragmatics is different from semantics, which concerns the relations between signs and the objects they signify. Semantics refers to the specific meaning of language; pragmatics, by contrast, involves all of the other social cues that accompany language. Pragmatics focuses not on what people say but on how they say it and how others interpret their utterances in social contexts. Utterances are literally the units of sound you make when you talk; the meanings behind those utterances are what pragmatics studies.

[Cartoon: the utterance "We need to talk" can mean many different things ("you're fired", "you forgot something", "write it in an email", etc.) depending on context.]

Examples:
"Can I cut you?" means very different things if I am standing next to you in a queue or if I am holding a knife.
"Is that water?" calls for a different action in a chemistry lab than at a dining table.
If you go to your editor and ask her to suggest a better sentence structure for a line, her immediate question to you will be, "What's the context?"

Most of the time, due to the flexibility of natural language, complexities arise in interpreting the meaning of an isolated statement. Pragmatic analysis uses the context of the utterance: when, why, by whom, where and to whom something was said.
It deals with intentions like criticizing, informing, promising, requesting, and so on. For example, if I say "You are late", is it information or criticism? In discourse integration, the aim is to analyze the statement in relation to the preceding or succeeding statements, or even the overall paragraph, in order to understand its meaning. Take this one: "Chloe wanted it." (What "it" refers to depends on the context established around Chloe.) Pragmatic analysis thus interprets meaning beyond what semantic analysis provides.

[Figure: An example of the stages of NLP analysis for a sentence such as "The dog is chasing the boy in the playground": lexical analysis (part-of-speech tagging: Det Noun Aux Verb Det Noun Prep Det Noun), syntactic analysis (parsing into noun phrase, complex verb, noun phrase and prepositional phrase), semantic analysis (Dog(d1), Boy(b1), Playground(p1), Chasing(d1,b1,p1)), inference (Scared(x) if Chasing(_,x,_), hence Scared(b1)) and pragmatic analysis (speech act: a person saying this may be reminding another person to get the dog back).]

Pragmatics can be defined as follows:

"It is the study of speaker meaning." It is concerned with the study of meaning as communicated by a speaker and interpreted by a listener.

"It is the study of contextual meaning." It involves interpretation of what influences what is said.

"It is the study of how more gets communicated than is said." This type of study explores how a great deal of what is unsaid is still recognized as part of what is communicated.

"It is the study of the expression of relative distance." On the assumption of how close or distant the listener is, speakers determine how much needs to be said.

Machine Translation
Machine translation (MT) is the standard name for computerized systems responsible for the production of translations from one natural language into another, with or without human assistance. It is a subfield of computational linguistics that investigates the use of computer software to translate text or speech from one language to another. Development of fully-fledged bilingual machine translation systems with limited electronic resources and tools is a challenging and demanding task. In order to achieve reasonable translation quality in open-domain tasks, corpus-based machine translation approaches require large parallel corpora, which are not always available, especially for less-resourced language pairs. On the other hand, the rule-based machine translation process is extremely difficult and fails to accurately analyze a large corpus of unrestricted text. Even though there have been efforts towards building English to Indian language and Indian language to Indian language translation systems, we do not have an efficient translation system as of today.

Why should we be interested in using computers for translation at all? The first and most important reason is that there is too much that needs to be translated, and human translators cannot cope. Another reason is that technical materials are too boring for human translators: they do not like translating them, so they will look for help from computers. Another major requirement is that terminology be used consistently; human translators do not like to repeat the same translation again and again and tend to seek variety, which is not good for technical translation. Machine-based translation can increase consistency in technical translation. Besides, many users would like to have translations immediately.

Machine Translation System Approaches
Like translation done by humans, machine translation does not simply involve substituting words in one language for words in another; it requires the application of linguistic knowledge such as morphology, syntax and semantics. The main approaches are the following.

1. Direct Translation
One of the simplest machine translation approaches, in which translation is done word by word with the help of a bilingual dictionary.

2. Rule-Based Translation
A rule-based machine translation (RBMT) system consists of a collection of various rules, called grammar rules, a bilingual lexicon or dictionary, and software programs to process the rules.

3. Interlingua-Based Translation
In this approach, the translation consists of two stages: the source language (SL) is first converted into an intermediate Interlingua (IL) form, from which the target text is generated. The main advantage of the Interlingua approach is that the analyzer and parser for the SL are independent of the generator for the target language (TL); this requires complete resolution of ambiguity in the source language text.

4. Statistical-Based Approach
Statistical machine translation (SMT) is a data-oriented statistical framework based on knowledge and statistical models extracted from bilingual or multilingual corpora of the languages involved. In SMT, each sentence is translated according to the probability distribution p(e | f) that a target sentence e is a translation of a source sentence f. Finding the best translation is done by picking the translation with the highest probability, as shown in Equation 1:

ê = argmax_e p(e | f) = argmax_e p(f | e) p(e)   .... (1)

5. Example-Based Approach
The basic idea of this MT system is to reuse examples of already existing translations. An example-based translation system uses a bilingual corpus as its main knowledge base, and it is essentially translation by analogy.

6. Knowledge-Based Translation
Knowledge-based machine translation (KBMT) requires complete understanding of the source text before translation into the target text. KBMT is implemented on the basis of world knowledge: it must be supported by world knowledge about the meanings of words and their combinations.

7. Principle-Based MT
Principle-based machine translation (PBMT) systems are based on the principles-and-parameters approach to grammar.

Question Answering System

[Figure: A question answering pipeline for Hindi/Marathi: input question processing, chunker, triple extraction; output is an answer in Hindi/Marathi.]

Among the processing steps are: tokenizing the input question (Step 1), extracting the query triple from the chunked, grouped list (Step 5), traversing the ontology to fetch the answer (Step 7), and formulating the answer as natural language text (Step 8).

[Sample input questions and output answers for a Hindi query and a Marathi query are shown in Devanagari.]

A question answering system for the Marathi natural language uses the concept of an ontology as a formal representation of the knowledge base for extracting answers. An ontology is used to express domain-specific knowledge about semantic relations and restrictions in the given domains. The ontologies are developed with the help of domain experts, and the query is analyzed both syntactically and semantically. The results obtained are accurate enough to satisfy the query raised by the user. The level of accuracy is enhanced since the query is analyzed semantically.

A question answering system is much more effective than traditional search engines, as it provides an accurate and precise answer to a question rather than providing links to relevant documents or a set of matching contents. Question answering systems have become part of the daily life of users; over a period of time, many personal assistant software products like Siri, Cortana and Google Now have been developed which provide precise and accurate answers to user questions.

4. Sentiment Analysis
Sentiment analysis (SA) is a natural language processing task that deals with finding the orientation of opinion in a piece of text with respect to a topic.
It deals with analyzing emotions, feelings, likes and dislikes, and the attitude of a speaker or writer from a given piece of text. SA involves identifying the sentiments expressed and then classifying their polarity. The aim is to determine the attitude or inclination of a communicator, through the contextual polarity of their speaking or writing, towards the subject they are communicating about. The attitude may be reflected in the communicator's own judgment, the emotional state of the subject, or any emotional communication intended to affect a reader or listener. The purpose of sentiment analysis is thus to determine a person's state of mind; this information can be mined from texts, communities, blogs, social media, news articles or comments.

There are different classification levels in SA: document level, sentence level and aspect level. Document-level SA aims to classify the opinion of a whole document as expressing a positive or a negative sentiment. Sentence-level SA aims to classify the sentiment expressed in each sentence, which involves first identifying whether the sentence is subjective or objective. Aspect-level SA aims to classify the sentiment with respect to the specific aspects of entities, which is done by identifying the entities and their aspects.

Sentiment Classification Techniques
Sentiment classification is a task under sentiment analysis (SA) that deals with automatically tagging text as positive or negative.

[Figure 5: Sentiment classification techniques. Machine learning approach: supervised learning with decision tree, linear, rule-based and probabilistic (Naive Bayes, Bayesian network, maximum entropy) classifiers. Lexicon-based approach: dictionary-based and corpus-based methods.]

Machine Learning Approach
The machine learning method uses several learning algorithms to determine the sentiment by training on a known dataset. The machine learning approach applicable to sentiment analysis mostly belongs to supervised classification. In machine-learning-based techniques, two sets of documents are needed: a training set and a test set. The training set is used by an automatic classifier to learn the differentiating characteristics of documents, and the test set is used to check how well the classifier performs.

A) Supervised learning
The supervised learning method depends on the existence of labeled training documents. It proceeds in two steps: training (learn a model using the training data) and testing (test the model on unseen data to assess its accuracy). There are many kinds of supervised classifiers, like decision tree classifiers, linear classifiers, rule-based classifiers and probabilistic classifiers.

I) Probabilistic classifiers
Probabilistic classifiers use mixture models for classification. The mixture model assumes that each class is a component of the mixture. Each mixture component is a generative model that provides the probability of sampling a particular term for that component.

a) Naive Bayes classifier (NB)
The Naive Bayes classification model computes the posterior probability of a class based on the distribution of the words in the document. The model works with bag-of-words (BoW) feature extraction, which ignores the position of a word in the document. It uses Bayes' theorem to predict the probability that a given feature set belongs to a particular label:

P(label | features) = P(label) × P(features | label) / P(features)
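A compact sketch of Naive Bayes sentiment classification over bag-of-words counts follows; the four training documents and their labels are invented purely for illustration.

```python
# Sketch: Naive Bayes sentiment classifier over bag-of-words features
# (toy training data, assumed for the demo).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_docs = ["I loved this movie", "what a great phone",
              "terrible battery life", "I hate the ending"]
train_labels = ["pos", "pos", "neg", "neg"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_docs, train_labels)      # estimates P(label) and P(word|label)
print(model.predict(["what a great movie"]))   # -> ['pos']
```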
b) Bayesian Network (BN)
A Bayesian network model is a directed acyclic graph whose nodes represent random variables and whose edges represent conditional dependencies. A BN is considered a complete model for the variables and their relationships; therefore, a complete joint probability distribution (JPD) over all the variables is specified for the model. In text mining, the computational complexity of BN is very expensive, which is why it is not frequently used.

c) Maximum Entropy classifier (ME)
The MaxEnt classifier, also known as a conditional exponential classifier, converts labeled feature sets to vectors using an encoding. This encoded vector is then used to calculate weights for each feature, which can then be combined to determine the most likely label (or score, or probability) for a feature set.

II) Linear classifiers
A document is typically presented to the machine as a feature vector. There are many kinds of linear classifiers; among them is the Support Vector Machine (SVM). SVMs are a form of classifier that attempts to determine good linear separators between different classes. Text data are well suited for SVM classification because of the sparse nature of text, in which few features are irrelevant, but they tend to be correlated with one another and generally organized into linearly separable categories. The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier: it does not depend on probabilities.

Neural Network (NN)
A neural network consists of many neurons, where the neuron is its basic unit. The units (neurons) are arranged in layers, which convert an input vector into some output. Each unit takes an input, applies an (often nonlinear) function to it and then passes the output on to the next layer. Generally the networks are defined as feed-forward: a unit feeds its output to all the units on the next layer, and there is no feedback to the previous layer. Weightings are applied to the signals passing from one unit to another, and it is these weightings which are tuned in the training phase to adapt the neural network to the particular problem at hand; this is the learning phase. Multilayer neural networks are used for non-linear boundaries: the multiple layers induce multiple piecewise linear boundaries, which are used to approximate enclosed regions belonging to a particular class. The outputs of the neurons in the earlier layers feed into the neurons in the later layers. The training process is more complex because the errors need to be back-propagated over the different layers.

III) Decision tree classifier
A decision tree classifier provides a hierarchical decomposition of the training data space in which a condition on an attribute value is used to divide the data. The condition or predicate is the presence or absence of one or more words. The division of the data space is done recursively until the leaf nodes contain certain minimum numbers of records, which are then used for classification.

IV) Rule-based classifiers
In rule-based classifiers, the data space is modeled with a set of rules. The left-hand side of a rule represents a condition on the feature set, expressed in disjunctive normal form, while the right-hand side is the class label. The conditions are on term presence; term absence is rarely used because it is not informative in sparse data. There are a number of criteria for generating rules, and the training phase constructs all the rules depending on these criteria. The two most common criteria are support and confidence. The support is the absolute number of instances in the training data set which are relevant to the rule. The confidence refers to the conditional probability that the right-hand side of the rule is satisfied if the left-hand side is satisfied. Both decision trees and decision rules tend to encode rules on the feature space, but the decision tree tends to achieve this goal with a hierarchical approach.
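The hierarchical encoding that a decision tree provides can be seen in a toy sketch; the spam/ham data below is invented for illustration, and real systems train on far larger corpora.

```python
# Sketch: decision tree over TF-IDF term features (toy spam/ham data).
# The learned tree is a hierarchy of presence conditions on terms.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline

docs = ["cheap pills buy now", "meeting at noon",
        "win money now", "lunch tomorrow maybe"]
labels = ["spam", "ham", "spam", "ham"]

tree = make_pipeline(TfidfVectorizer(), DecisionTreeClassifier(max_depth=3))
tree.fit(docs, labels)
print(tree.predict(["free money now"]))   # typically -> ['spam']
```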
B) Unsupervised learning
Unsupervised learning tries to find hidden structure in unlabelled data. Since the examples given to the learner are unlabeled, there is no error or reward signal to evaluate a potential solution. The main purpose of text classification is to classify documents into a certain number of predefined categories, and in order to accomplish that, a large number of labelled training documents are used for supervised learning, as illustrated before. In text classification it is sometimes difficult to create these labelled training documents, but it is easy to collect unlabelled documents. (By contrast, the KNN classifier mentioned earlier predicts the class of a test instance from its similarities to labeled instances in the training set, using distance-weighted voting.)

Lexicon-based approach
For example, one starts with positive and negative word lexicons and analyzes the document for sentiment words found in those lexicons. If the document contains more positive lexicon words, it is positive; otherwise it is negative. The lexicon-based technique for sentiment analysis is unsupervised learning because it does not require prior training in order to classify the data. There are three methods to construct a sentiment lexicon: manual construction, corpus-based methods and dictionary-based methods. The manual construction of a sentiment lexicon is a difficult and time-consuming task. In dictionary-based techniques, the idea is to first collect a small set of opinion words manually with known orientations, and then to grow this set by searching the WordNet dictionary for their synonyms and antonyms. The newly found words are added to the seed list, and the next iteration starts; the iterative process stops when no more new words are found. The dictionary-based approach has the limitation that it cannot find opinion words with domain-specific orientations. Corpus-based techniques rely on syntactic patterns in large corpora. Corpus-based methods can produce opinion words with relatively high accuracy, but most of them need very large labeled training data. This approach has a major advantage that the dictionary-based approach does not have: it can help find domain-specific opinion words and their orientations.

Basic sentiment analysis of text documents follows a straightforward process:
* Break each text document down into its component parts (sentences, phrases, tokens and parts of speech)
* Identify each sentiment-bearing phrase and component
* Assign a sentiment score to each phrase and component
* Optional: combine scores for multi-layered sentiment analysis
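A toy sketch of this lexicon-based process follows; the two word sets stand in for a real sentiment lexicon and are hand-made assumptions for the demo.

```python
# Toy lexicon-based scorer: count hits against small positive/negative
# word lists (illustrative lexicons, not a real resource).
POSITIVE = {"good", "great", "excellent", "love", "amazing"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "awful"}

def lexicon_score(text):
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    return "negative" if score < 0 else "neutral"

print(lexicon_score("great camera but terrible battery and poor screen"))  # negative
```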
5. Text Categorization or Classification
Text categorization is the process of assigning tags or categories to text according to its content. It is one of the fundamental tasks in natural language processing (NLP), with broad applications such as sentiment analysis, topic labelling, spam detection and intent detection. Unstructured data in the form of text is everywhere: emails, chats, web pages, social media, support tickets, survey responses, and more. Text can be an extremely rich source of information, but extracting insights from it can be hard and time-consuming due to its unstructured nature. Businesses are turning to text classification for structuring text in a fast and cost-efficient way to enhance decision-making and automate processes.

Text classification is the process of dividing a given set of documents into one or more predefined classes, and this classification of text is done automatically. Usually machine learning techniques are used for automatic text classification. There are mainly two types of techniques, namely supervised and unsupervised learning methods. Supervised learning methods assign predefined class labels to the testing documents using classification algorithms, whereas in unsupervised learning methods the grouping of testing documents is done using techniques like clustering. There is also a semi-supervised learning method, where parts of the documents are labelled by an external mechanism.

Text classifiers can be used to organize, structure and categorize pretty much anything. For example, news articles can be organized by topic, support tickets can be organized by urgency, chat conversations can be organized by language, brand mentions can be organized by sentiment, and so on.

Text Classification Techniques
A growing number of machine learning methods, specifically supervised learning methods, have been applied to text classification. These include Decision Tree (C4.5), K-Nearest Neighbor (K-NN), Naive Bayes (and other Bayesian approaches), Neural Networks (NN), regression-based methods, vector-based methods, etc. Several clustering techniques are also used, like K-means, Suffix Tree Clustering, Label Induction Grouping (LINGO), Semantic Hierarchical Online Clustering (SHOC), etc.

A) Supervised Learning Methods
The supervised learning techniques are explained below:

a) Decision Tree
A decision tree classifier builds a hierarchy of conditions on the presence or absence of terms, as described in the previous section.

b) K-Nearest Neighbor (KNN)
KNN is a statistical approach to text classification, where an object is classified by the voting of the labeled training examples at the smallest distance from it. This classification method is outstanding in its simplicity and is one of the widely used techniques for text categorization.

c) Neural Network (NN)
A neural network, also called an artificial neural network, is a mathematical model inspired by biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation. Different neural network approaches have been applied to document categorization problems. While some of them use the simplest form of neural network, known as the perceptron, which consists only of an input and an output layer, others build more sophisticated neural networks with a hidden layer between the two.

d) Naive Bayes (NB)
A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong independence assumptions. A naive Bayes classifier assumes that the presence or absence of a feature of a class is unrelated to the presence or absence of any other feature. Depending on the precise nature of the probabilistic model, naive Bayes classifiers can be trained very efficiently in a supervised learning setting.

e) Vector-Based Methods
There are two types of vector-based methods: the centroid algorithm and support vector machines. Of these two algorithms, the centroid algorithm is the simplest.

Centroid Algorithm: During the learning stage, only the average feature vector for each class is calculated and set as the centroid vector for the category. This algorithm may not succeed if the number of categories is very large. The centroid algorithm computes the similarity of a test document with each centroid using the cosine similarity measure and assigns the document to the class with whose centroid it has the greatest similarity.

Support Vector Machine (SVM): The main idea of SVM is to find a hyperplane that best separates the documents, such that the margin, the distance separating the border of each subset from the nearest vector documents, is as large as possible. The nearest samples to the hyperplane, named support vectors, are selected. The calculated hyperplane permits separating the space into two areas; to classify new documents, one determines which area of the space they fall into and assigns them the corresponding category.
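Both vector-based methods are available in scikit-learn. In the sketch below (toy data, assumed for the demo), NearestCentroid plays the role of the centroid algorithm and LinearSVC that of a linear SVM. Note that scikit-learn's NearestCentroid uses Euclidean distance by default, which on length-normalized TF-IDF vectors behaves similarly to the cosine measure described above.

```python
# Sketch: centroid and SVM text classifiers on TF-IDF vectors (toy data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestCentroid
from sklearn.svm import LinearSVC

docs = ["the team won the match", "stocks fell sharply today",
        "a thrilling final over", "markets rallied on rate cuts"]
labels = ["sports", "finance", "sports", "finance"]

vectorizer = TfidfVectorizer().fit(docs)
X = vectorizer.transform(docs)

for clf in (NearestCentroid(), LinearSVC()):
    clf.fit(X, labels)
    pred = clf.predict(vectorizer.transform(["the final match today"]))
    print(type(clf).__name__, pred)        # both typically -> ['sports']
```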
B) Clustering Techniques
Clustering of documents is mainly used to minimize the amount of text by categorizing or grouping similar data items. This grouping is a common way for humans to process information, and clustering techniques provide automated tools to build such groupings. The following is a brief introduction to some of the clustering techniques:

a) K-means Algorithm
K-means partitions the documents into k clusters by repeatedly assigning each document to the cluster with the nearest centroid and recomputing the centroids.

b) Label Induction Grouping (LINGO) Algorithm
The LINGO algorithm is based on the vector space model. First it extracts frequent phrases and frequent words from the input documents; these become user-readable cluster labels. It then performs dimensionality reduction of the term-document matrix with the Singular Value Decomposition (SVD) method, finds the labels of the clusters, and assigns documents to those cluster labels based on the similarity value.

c) Ontology Based Classification
Traditional classification methods ignore the relationships between words; they consider each term independently when computing the result. But there exist semantic relations between terms, such as synonymy, hyponymy, etc. Therefore, for better classification results, there is a need to understand the semantics of the text document. Ontology has different meanings for different users; in this classification task, an ontology stores words that are related to a particular domain. Therefore, with the use of a domain-specific ontology, it becomes easy to classify a document even if the document does not contain the class name in it.

Example: Automatic Text Categorization of Marathi Language Documents
The designed system takes as input a set of Marathi language text documents. These documents undergo preprocessing steps, which include input validation, tokenization, stop-word removal, stemming and morphological analysis. Then features are extracted from the preprocessed tokens. At last, supervised learning methods and ontology-based classification are applied, producing as output classified Marathi documents with their class labels. The supervised learning methods considered here are Naive Bayes (NB), Modified K-Nearest Neighbor (MKNN) and Support Vector Machine (SVM).

Text Summarization

[Figure: A Rich Semantic Graph (RSG) based text summarization pipeline: input document; tokenization, filtration and named entity recognition; syntax analysis and POS tagging (OpenNLP / Stanford parser); preprocessed sentences; RSG generation for the whole document; reduction rules.]

This approach consists of the following phases:

a. Marathi text document as input.

b. Rich Semantic Graph (RSG) creation phase: In the rich semantic graph creation step, the input text document is split into sentences and a graph is produced for the complete document. For every sentence, the system classifies the words into predefined categories such as name, location and organization. After this it generates a graph for every sentence and concatenates the rich semantic sub-graphs. At last, the sub-graphs are combined together to represent the complete document correctly.

c. Rich Semantic Graph (RSG) reduction phase: The reduction phase aims to reduce the rich semantic graph obtained from the source document to a smaller graph. Here a set of rules is applied to the obtained rich semantic graph to reduce it by merging, consolidating or deleting graph nodes.

d. Summary generation from reduced RSG: The summarized text generation phase aims to obtain the abstractive text summary from the reduced Rich Semantic Graph (RSG).
To reach the target, this phase uses the domain ontology, which holds the data required in the same domain as the RSG, to produce the final output. In addition, the WordNet ontology is used to obtain multiple synonyms. The multiple candidate texts obtained are assessed, and the most highly ranked text is chosen as the summary.

[Example: a single input text document in Marathi (Devanagari script) and its reduced, meaningful summary.]

[Example: NER-tagged output data, with entity tags such as <Name> and <Month> marking entities in Marathi text.]

Questions
Q. Write a note on:
1. Machine translation
2. Information retrieval vs information extraction
3. Question answering system
4. Text categorization
5. Text summarization
6. Sentiment analysis
7. Named entity recognition
