Lec1-UNIT5 -MORE SIMPLER
Introduction
Part 1
Sudeshna Sarkar
17 July 2019
Natural Language Processing
• NLP is focused on developing systems that allow
computers to communicate with people using natural
language.
• Also concerns how computational methods can aid the
understanding of human language.
• Automating Language
– Analysis: Language → Representation
– Generation: Representation → Language
– Acquisition: Obtaining the representation and necessary
algorithms, from knowledge and data
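A minimal sketch of analysis and generation as inverse mappings between language and a representation. The (subject, verb, object) triple and the names `analyse` and `generate` are hypothetical, chosen only for illustration. The pattern is hand-written rather than acquired from knowledge or data, and it handles only three-word sentences.

```python
import re
from typing import Optional, Tuple

# Hypothetical toy representation: a (subject, verb, object) triple.
Triple = Tuple[str, str, str]


def analyse(sentence: str) -> Optional[Triple]:
    """Analysis: language -> representation (toy pattern, not a real parser)."""
    match = re.fullmatch(r"(\w+) (\w+) (\w+)\.?", sentence.strip())
    if match is None:
        return None  # anything beyond "word word word" is out of scope here
    subject, verb, obj = match.groups()
    return (subject.lower(), verb.lower(), obj.lower())


def generate(triple: Triple) -> str:
    """Generation: representation -> language."""
    subject, verb, obj = triple
    return f"{subject.capitalize()} {verb} {obj}."


if __name__ == "__main__":
    representation = analyse("Mary likes tea.")
    print(representation)
    if representation is not None:
        print(generate(representation))
```

Real systems replace the hand-written pattern with representations and algorithms obtained from knowledge and data, which is the acquisition step.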
Language Processing
• Goals can be very ambitious
– True text understanding
– Good quality translation
• Or goals can be practical
– Web search engines
– Question Answering
– Machine Translation services on the Web
– Speech synthesis
– Voice recognition
– Conversational Agents
– Summarization
• Natural language technology is not yet perfected,
but it is already good enough for several useful applications
Logistics
• Moodle / Piazza forum for slides and discussion
• Moodle for assignment upload
• Textbook: Dan Jurafsky and James Martin, Speech and
Language Processing
• Instructor: Sudeshna Sarkar
• Teaching Assistants:
– Debanjana Kar
– Ishani Mondal
– Sukannya Purakayastha
Logistics
• Attendance: Compulsory (5 marks)
• Assignments / Projects / Quiz: 40 marks
• Midterm: 25 marks
• Endterm: 30 marks
Pre-requisites
• Probability and Statistics
• Machine learning
• Neural networks
Course Focus
• Linguistic Issues
• Modeling Techniques
– Probabilistic
– ML
• Engineering Methods
• Multilinguality
• Science Goal: Understand the way language operates
• Engineering Goal: Build systems that analyse and generate language
What does it mean to “know” a language?
[Diagram: NL → R (analysis); R → NL (generation)]
Examples of End Systems
• Text classification
• Machine translation, information extraction, dialog
interfaces, question answering…
• human-level comprehension
Brief History of NLP
• 1940s–1950s: Foundational Insights
– Two foundational paradigms
• Automaton
• Probabilistic / Information-Theoretic Models
• 1957-1970: The two camps
– Symbolic paradigm: Chomsky and others on formal
language theory and generative syntax
– Stochastic paradigm
• AI-complete
• The city council refused the demonstrators a permit
because they ______ violence
– "feared": "they" refers to the city council
– "advocated": "they" refers to the demonstrators
25-Jul-19
The Reductionist Approach
• Source Language Analysis
– Text Normalization
• Target Language Generation
– Text Rendering