Natural Language Processing
Artificial Intelligence (AI): the study of how to make computers do things at which, at the moment, people are better (Rich and Knight [1991]).
A theory of how the human mind works (Mark Fox).
AI Objectives
Make machines smarter
Understand what intelligence is
Make machines more useful (practical purpose)
A computer can be considered smart only when a human interviewer, conversing with both an unseen human being and an unseen computer, cannot determine which is which (the Turing test).
Major AI Areas
Expert Systems
Natural Language Processing
Speech Understanding
Robotics and Sensory Systems
Computer Vision and Scene Recognition
Neural Computing
Fuzzy Logic
Interaction Level
Natural Language Processing (NLP) is a technique that makes machines more human-like, thereby reducing the distance between human beings and machines. In a simple sense, NLP lets humans communicate with machines easily. NLP applications are very useful in everyday life, for example a machine that takes instructions by voice.
Interaction Level
The level at which computer and human interact. Natural language (NL) is used to bring the interaction level closer to the human.
[Figure: interaction levels between Computer and Human: Command-line, Graphical UI, NL UI]
Natural Language?
Natural language is one of the fundamental aspects of human behavior.
Provides easy interaction with computers.
Refers to the languages spoken by people, e.g. English, Japanese, Hindi, as opposed to artificial languages such as C++, Java, etc.
[Figure: NLP and related areas: Robotics, Expert System, Information Retrieval, Machine Translation, Language Analysis, Semantics, Parsing]
NLP is used to extract the meaning from input in order to perform a useful task as a result.
Automatic analysis of human language by computer algorithms.
Huge amounts of data
Internet = at least 20 billion pages, and increasing exponentially
Applications for processing large amounts of text require NLP expertise
Text-based applications
These include searching for a certain topic or keyword in a database, extracting information from a large document, translating from one language to another, and summarizing text for different purposes.
Dialogue-based applications
Typical examples are question-answering systems, services provided over a telephone without an operator, teaching systems, voice-controlled machines (that take instructions by speech) and general problem-solving systems.
Phonology
Deals with the interpretation of speech sounds within and across words.
Three types of rules are used in phonological analysis:
1) phonetic rules, for sounds within words;
2) phonemic rules, for variations of pronunciation when words are spoken together;
3) prosodic rules, for fluctuation in stress and intonation across a sentence.
Morphology
Morphology is the first stage of analysis once input has been received. It looks at the ways in which words break down into their components and how that affects their grammatical status.
Morphology
Morphemes are the smallest meaningful units of language.
cars = car + PLU
children = child + PLU
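The car+PLU and child+PLU decomposition can be sketched with a few suffix rules plus a table of irregular plurals. This is only a toy illustration: the names IRREGULAR_PLURALS and analyze are assumptions made for this example, and the rules mis-split words such as "bus".

```python
# Toy morphological analyzer: splits a noun into stem + PLU (plural) marker.
# IRREGULAR_PLURALS and analyze() are illustrative names, not from the slides.
IRREGULAR_PLURALS = {"children": "child", "men": "man", "mice": "mouse"}

def analyze(word):
    """Return the list of morphemes for a (possibly plural) noun."""
    lower = word.lower()
    if lower in IRREGULAR_PLURALS:              # irregular forms need a lookup table
        return [IRREGULAR_PLURALS[lower], "PLU"]
    if lower.endswith("ies") and len(lower) > 4:
        return [lower[:-3] + "y", "PLU"]        # cities -> city + PLU
    if lower.endswith(("ches", "shes", "xes", "sses")):
        return [lower[:-2], "PLU"]              # boxes -> box + PLU
    if lower.endswith("s") and not lower.endswith("ss"):
        return [lower[:-1], "PLU"]              # cars -> car + PLU
    return [lower]                              # no plural marker found

print("+".join(analyze("cars")))
print("+".join(analyze("Children")))
```

Real analyzers rely on much larger lexicons and rule sets; the point here is only that regular forms can be handled by rules while irregular forms need stored entries.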
Syntax
Syntax involves applying the rules of the target language's grammar. Its task is to determine the role of each word in a sentence and to organize this data into a structure that is more easily manipulated for further analysis.
Issues in Syntax
1. "The dog ate my homework" - who did what?
Identify the part of speech (POS):
dog = noun; ate = verb; homework = noun
English POS tagging accuracy: about 95% (can be improved)
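As a rough sketch of POS tagging for the homework sentence, the simplest possible tagger looks each word up in a dictionary. LEXICON and tag are hypothetical names for this example, and defaulting unknown words to "noun" is an assumption; practical taggers use context and statistics rather than a fixed table.

```python
# Toy lexicon-based POS tagger. LEXICON and tag() are illustrative assumptions.
LEXICON = {
    "the": "determiner", "my": "determiner",
    "dog": "noun", "homework": "noun",
    "ate": "verb",
}

def tag(sentence):
    """Tag each word by dictionary lookup; unknown words default to 'noun'."""
    return [(w, LEXICON.get(w.lower(), "noun")) for w in sentence.split()]

for word, pos in tag("the dog ate my homework"):
    print(f"{word} = {pos}")
```

A lookup table cannot handle words with more than one possible tag, which is one reason tagging accuracy stays below 100%.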
Issues in Syntax
Parse tree for "Ravindra loves Khusi":
Sentence
  NP (Ravindra)
    Noun (R): Ravindra
  VP (loves Khusi)
    Verb: loves
    NP
      Noun (K): Khusi
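The NP/VP structure of the Ravindra and Khusi example can be produced by a very small parser. The sketch below assumes a three-rule grammar (S -> NP VP, NP -> Noun, VP -> Verb NP) and hard-coded word lists; the grammar, the word lists and the function names are all assumptions made for illustration.

```python
# Toy parser for the grammar: S -> NP VP, NP -> Noun, VP -> Verb NP.
# The word lists and function names are illustrative assumptions.
NOUNS = {"Ravindra", "Khusi"}
VERBS = {"loves"}

def parse_np(words, i):
    """NP -> Noun. Returns (subtree, next position)."""
    if i < len(words) and words[i] in NOUNS:
        return ("NP", ("Noun", words[i])), i + 1
    raise ValueError(f"expected a noun at position {i}")

def parse_vp(words, i):
    """VP -> Verb NP. Returns (subtree, next position)."""
    if i < len(words) and words[i] in VERBS:
        obj, j = parse_np(words, i + 1)
        return ("VP", ("Verb", words[i]), obj), j
    raise ValueError(f"expected a verb at position {i}")

def parse_sentence(text):
    """S -> NP VP. Raises ValueError if the sentence does not fit the grammar."""
    words = text.split()
    subject, i = parse_np(words, 0)
    predicate, i = parse_vp(words, i)
    if i != len(words):
        raise ValueError("unexpected trailing words")
    return ("S", subject, predicate)

print(parse_sentence("Ravindra loves Khusi"))
```

The nested tuples mirror the tree: the subject NP and the VP, which in turn contains the verb and the object NP.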
Semantics
Semantics is the examination of the meaning of words and sentences. Semantics conveys useful information relevant to the scenario as a whole.
Issues in Semantics
Understand language! How?
plant = industrial plant
plant = living organism
Words are ambiguous
Importance of semantics?
Issues in Semantics
Start from examples previously tagged by a human (e.g. 100 occurrences of "plant")
Train a learning algorithm
How to choose the learning algorithm?
How to obtain the 100 tagged examples?
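One possible answer to "which learning algorithm?" is a Naive Bayes classifier over the words that surround the ambiguous word. The sketch below is an assumption for illustration, not the method the slides prescribe: the four tagged sentences are invented (a real system would use the 100 human-tagged examples), and the names TAGGED, train and classify are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-tagged examples for the ambiguous word "plant".
TAGGED = [
    ("the plant produces steel and cars", "industrial"),
    ("workers at the plant went on strike", "industrial"),
    ("water the plant every morning", "organism"),
    ("the plant grew new green leaves", "organism"),
]

def train(examples):
    """Count how often each context word occurs with each sense."""
    word_counts = defaultdict(Counter)
    sense_counts = Counter()
    for text, sense in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(text.split())
    return word_counts, sense_counts

def classify(text, word_counts, sense_counts):
    """Pick the sense with the highest Naive Bayes score (add-one smoothing)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, n in sense_counts.items():
        score = math.log(n / total)                      # prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in text.split():
            score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

model = train(TAGGED)
print(classify("green leaves on the plant", *model))
print(classify("the steel plant hired workers", *model))
```

With so few examples the result depends heavily on which words happen to appear in the training sentences, which is why obtaining enough tagged data is a question in its own right.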
Pragmatics
Pragmatics is the sequence of steps that exposes the overall purpose of the statement being analyzed. The statement is broken down into ambiguous entities, which are then disambiguated to facilitate understanding.
Discourse
Concerns how the immediately preceding sentences affect the interpretation of the next sentence. For example, interpreting pronouns and interpreting the temporal aspects of the information.
Issues in Discourse
Anaphora resolution: resolving referring expressions
The dog entered my room. It scared me.
Mary bought a book for Kelly. She didn't like it.
"She" refers to Mary or Kelly? Possibly Kelly.
"It" refers to what? The book.
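A common baseline for anaphora resolution links each pronoun to the most recent earlier mention of a compatible type. The sketch below assumes a tiny hand-written lexicon (ENTITY_TYPE and PRONOUN_TYPE are hypothetical names) and is meant only to show the idea. It gives the reading suggested above for the Mary and Kelly example, but for the dog example it picks "room" instead of "dog", which shows why recency alone is not enough.

```python
# Hypothetical mini-lexicon: which nouns are people and which are things.
ENTITY_TYPE = {"Mary": "person", "Kelly": "person", "book": "thing",
               "dog": "thing", "room": "thing"}
PRONOUN_TYPE = {"she": "person", "he": "person", "it": "thing"}

def resolve(text):
    """Link each pronoun to the most recent earlier mention of a matching type."""
    mentions = []   # entities seen so far, in order of appearance
    links = {}
    for raw in text.replace(".", " ").split():
        if raw in ENTITY_TYPE:
            mentions.append(raw)
        elif raw.lower() in PRONOUN_TYPE:
            wanted = PRONOUN_TYPE[raw.lower()]
            for candidate in reversed(mentions):   # most recent mention first
                if ENTITY_TYPE[candidate] == wanted:
                    links[raw] = candidate
                    break
    return links

print(resolve("Mary bought a book for Kelly. She didn't like it."))
print(resolve("The dog entered my room. It scared me"))
```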
Symbolic Approach:
Performs analysis of linguistic phenomena
Based on theories of language representation
Research
Microsoft Natural Language Processing Group
The team is broadening the scope of the NLP effort by developing parallel systems in several languages. The languages covered are Chinese, English, French, German, Japanese, Korean and Spanish.
Research
Canon Natural Language Processing Group
Research and development of large-vocabulary speech understanding software for interactive spoken systems.
Applications of NLP
Google: Translate.google.com
Machine Translation
Machine Translation (MT) is the process of translating text from a source language into a target language.
There are 2 types of MT:
Rule-based MT
Statistical MT
Machine Translation
Rule-based MT
Explicit use and manual creation of linguistically informed rules and representations
Statistical MT
Corpus based, i.e. learned from examples of translations called parallel or bilingual corpora
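To illustrate what "learned from examples of translations" can mean, the sketch below estimates word translation probabilities from a three-sentence parallel corpus with expectation-maximization, in the style of a simplified IBM Model 1 (no NULL word). The corpus and the names CORPUS, train_model1 and best_translation are assumptions for this example; a real statistical MT system also needs a language model and a decoder.

```python
from collections import defaultdict

# Tiny hypothetical parallel corpus: (English source, German target).
CORPUS = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a book", "ein buch"),
]

def train_model1(corpus, iterations=10):
    """Estimate t[(target, source)] = P(target word | source word) with EM."""
    pairs = [(src_text.split(), tgt_text.split()) for src_text, tgt_text in corpus]
    t = defaultdict(lambda: 1.0)   # uniform start; the E-step normalizes it
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in pairs:
            for tw in tgt:
                # E-step: share each target word among the source words
                z = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    c = t[(tw, sw)] / z
                    count[(tw, sw)] += c
                    total[sw] += c
        # M-step: renormalize the expected counts per source word
        t = defaultdict(float, {k: v / total[k[1]] for k, v in count.items()})
    return t

def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {tw: p for (tw, sw), p in t.items() if sw == source_word}
    return max(candidates, key=candidates.get)

table = train_model1(CORPUS)
for word in ["the", "house", "book", "a"]:
    print(word, "->", best_translation(table, word))
```

No dictionary is given to the program: "house" ends up paired with "haus" only because "das" is better explained by "the", which also occurs in the second sentence pair.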
ANGLABHARTI (1991) is a machine-aided translation system from IIT Kanpur, designed specifically for translating English to Indian languages. Anglabharti uses a pseudo-interlingua approach. It analyses English only once and creates an intermediate structure called PLIL (Pseudo Lingua for Indian Languages).
Question Answering
A question answering system automatically answers questions posed by humans in natural language.
Three steps are involved in question answering:
Question manipulation and classification
Matching
Answer selection
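The three steps can be sketched as three small functions. Everything in this example is an assumption made for illustration: the two-entry FACTS list, the mapping from question words to answer types, and word overlap as the matching score. Practical systems use far richer question analysis and retrieval.

```python
import re

# Hypothetical knowledge base: (sentence, answer type, answer).
FACTS = [
    ("ANGLABHARTI was designed at IIT Kanpur", "place", "IIT Kanpur"),
    ("ANGLABHARTI was introduced in 1991", "time", "1991"),
]
QUESTION_TYPES = {"where": "place", "when": "time", "who": "person"}

def tokens(text):
    """Lowercased set of words and numbers in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def classify(question):
    """Step 1: map the question word to an expected answer type."""
    first = question.lower().split()[0]
    return QUESTION_TYPES.get(first)

def match(question, expected_type):
    """Step 2: score facts of the expected type by word overlap."""
    q = tokens(question)
    return [(len(q & tokens(sentence)), reply)
            for sentence, kind, reply in FACTS if kind == expected_type]

def answer_question(question):
    """Step 3: select the answer of the best-scoring candidate, if any."""
    candidates = match(question, classify(question))
    if not candidates:
        return None
    score, best = max(candidates)
    return best if score > 0 else None

print(answer_question("Where was ANGLABHARTI designed?"))
print(answer_question("When was ANGLABHARTI introduced?"))
```

Classifying the question first narrows the candidates, so that a "where" question is never answered with a date.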
Future of NLP
There are many applications we can dream of with NLP techniques. How about robots that understand and follow spoken instructions, or driving by talking to the car, as in some science fiction movies? These could all be real one day. Imagine a computer system that can follow simple human instructions and do whatever we want it to do. How convenient would that be? But let's leave all that to the future.
Conclusions
A lot of research is going into developing new applications and investigating new techniques and approaches that will make statistical NLP more feasible. We can therefore expect to see improved applications of NLP in the near future.
Thank You