
Hi, I'm Dan Jurafsky, and Chris Manning and I are very happy to welcome you to our course on natural language processing.

This is a particularly exciting time to be working on natural language processing. The vast amount of data on the web and social media has made it possible to build fantastic new applications. Let's look at one of them: question answering. You may know that IBM's Watson won the Jeopardy! challenge on February 16, 2011, answering questions like "William Wilkinson's book inspired this author's most famous novel." And you may know that the answer is Bram Stoker, who famously wrote Dracula.

Another important task is information extraction. For example, imagine that I have the following email from my colleague Chris about scheduling a meeting. We'd like software to automatically notice that there are dates, like "tomorrow"; times, like "10 to 11:30"; and a room, like "Gates 159"; extract that information; and create a new calendar entry, populating the calendar with structured information: the event, date, start time, and end time. Modern email and calendar programs are capable of doing this directly from text.
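As a concrete aside, here is a minimal sketch of that kind of extraction using regular expressions. The patterns and the CalendarEntry fields are illustrative assumptions of mine, not what any particular mail program actually does.

    import re
    from dataclasses import dataclass

    @dataclass
    class CalendarEntry:  # hypothetical structure for the extracted event
        date: str
        start: str
        end: str
        location: str

    def extract_meeting(text):
        # Very rough patterns for relative dates, time ranges, and room names.
        date = re.search(r"\b(today|tomorrow|Monday|Tuesday|Wednesday|Thursday|Friday)\b", text, re.I)
        times = re.search(r"\b(\d{1,2}(?::\d{2})?)\s*(?:-|to)\s*(\d{1,2}(?::\d{2})?)\b", text)
        room = re.search(r"\b(Gates\s*\d+)\b", text)
        return CalendarEntry(
            date=date.group(1) if date else "",
            start=times.group(1) if times else "",
            end=times.group(2) if times else "",
            location=room.group(1) if room else "",
        )

    print(extract_meeting("Let's meet tomorrow from 10 to 11:30 in Gates 159."))

Real systems use much richer models than a few patterns, but the output, a structured calendar entry pulled out of free text, is the same idea.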
Another application of this kind of information extraction involves sentiment analysis. Imagine that you're interested in cameras and you're reading a lot of camera reviews on the web. We'd like to automatically determine, from the reviews, that what people care about in cameras are particular attributes: if they're buying a camera, they want to know whether it has good zoom, affordability, or size and weight. So we want to automatically determine those attributes, and then, for any particular attribute, automatically determine how the reviewers felt about it. For example, if a reviewer said "nice and compact to carry," that's a positive sentiment, while a phrase like "flimsy" is a negative sentiment. We'd like to automatically detect the sentiment of each sentence and then aggregate over each feature, say zoom or affordability. We might decide that the reviewers really liked this camera's flash but weren't so happy about its ease of use. We measure the positive and negative sentiment about each attribute and then aggregate those.
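Here is a rough sketch of that aggregation step. The tiny word lists and attribute detector below are stand-ins I'm assuming for illustration; a real system would use trained classifiers for both.

    from collections import defaultdict

    # Toy stand-ins: a real system would learn these from data.
    POSITIVE = {"nice", "compact", "great", "sharp"}
    NEGATIVE = {"flimsy", "awkward", "blurry"}
    ATTRIBUTES = {"zoom", "flash", "size", "weight", "affordability", "ease of use"}

    def sentence_sentiment(sentence):
        words = set(sentence.lower().split())
        return len(words & POSITIVE) - len(words & NEGATIVE)

    def sentence_attributes(sentence):
        return {a for a in ATTRIBUTES if a in sentence.lower()}

    def aggregate(review_sentences):
        scores = defaultdict(list)
        for sentence in review_sentences:
            s = sentence_sentiment(sentence)
            for attr in sentence_attributes(sentence):
                scores[attr].append(s)
        # Average sentiment per attribute across all review sentences.
        return {a: sum(v) / len(v) for a, v in scores.items()}

    print(aggregate([
        "The flash is great and the body is nice and compact",
        "The zoom feels flimsy",
    ]))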
Machine translation is another important application, and machine translation can be fully automatic. For example, we might have a source sentence in Chinese, and here is Stanford's Phrasal MT system translating it into English. But MT can also be used to help human translators: here we might have an Arabic text, and the human translator producing the English might get help from the MT system, for example a list of possible next words that the system can generate automatically.

Let's look at the state of the art in language technology. Like every field, NLP is divided into specialties and sub-specialties, and a number of these problems are pretty close to solved. For example, spam detection: while it's very hard to eliminate spam completely, our mailboxes are not 99 percent spam, and that's because spam detection is a relatively easy text classification task.
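To give a flavor of what such a classifier looks like, here is a minimal sketch of a Naive Bayes spam classifier, a method we'll cover later in the course, trained on a tiny made-up corpus; the example messages and the add-one smoothing choice are my own illustrative assumptions.

    import math
    from collections import Counter

    # Tiny made-up training corpus: (label, text) pairs.
    TRAIN = [
        ("spam", "win money now click here"),
        ("spam", "cheap meds click now"),
        ("ham",  "meeting tomorrow at ten in gates 159"),
        ("ham",  "lecture notes for the nlp course"),
    ]

    counts = {"spam": Counter(), "ham": Counter()}
    docs = Counter()
    for label, text in TRAIN:
        docs[label] += 1
        counts[label].update(text.split())

    vocab = set(w for c in counts.values() for w in c)

    def score(label, words):
        # log P(label) + sum of log P(word | label), with add-one smoothing
        logp = math.log(docs[label] / sum(docs.values()))
        total = sum(counts[label].values())
        for w in words:
            logp += math.log((counts[label][w] + 1) / (total + len(vocab)))
        return logp

    def classify(text):
        words = text.split()
        return max(("spam", "ham"), key=lambda lab: score(lab, words))

    print(classify("click here to win money"))  # expected: spam
    print(classify("notes from the meeting"))   # expected: ham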
A couple of important component tasks are part-of-speech tagging and named entity tagging; we'll talk about those later in the course, and they work at pretty high accuracies. We're going to get about 97 percent accuracy in part-of-speech tagging, and we'll see how that's important for parsing.
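For instance, an off-the-shelf tagger can label each word with its part of speech. The sketch below assumes NLTK and its standard tokenizer and tagger models are installed; the exact resource names and tags can vary by version.

    import nltk

    # One-time downloads (resource names may differ across NLTK versions).
    nltk.download("punkt")
    nltk.download("averaged_perceptron_tagger")

    tokens = nltk.word_tokenize("Fed raises interest rates half a percent")
    print(nltk.pos_tag(tokens))
    # e.g. [('Fed', 'NNP'), ('raises', 'VBZ'), ('interest', 'NN'), ('rates', 'NNS'), ...]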
In other tasks, we're making good progress. These aren't fully commercial or completely solved, but there are systems out there that are being used. We talked about sentiment analysis, the task of deciding thumbs up or thumbs down on a sentence or a product. There are component technologies like word sense disambiguation: deciding whether we're talking about a rodent or a computer mouse when people mention a mouse in a search (there's a small sketch of one classic approach just below). We'll talk about parsing, which is now good enough to be used in lots of applications, and machine translation is usable on the web.
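Here is that word sense sketch, using NLTK's implementation of the classic Lesk algorithm, which picks the WordNet sense whose dictionary gloss overlaps most with the surrounding context. This is just one simple disambiguation method, not the only one in use, and it assumes NLTK and WordNet are installed.

    import nltk
    from nltk.wsd import lesk

    nltk.download("wordnet")  # sense inventory; newer NLTK versions may need extra data

    context = "click the left mouse button to open the file".split()
    sense = lesk(context, "mouse")
    print(sense, "-", sense.definition() if sense else "no sense found")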
A number of applications, however, are still quite hard. For example, answering hard questions, like how effective a given medicine is at treating a given disease, by looking at the web or by summarizing the information we find, is quite hard. Similarly, while we've made some progress on deciding that the sentence "XYZ Company acquired ABC Company yesterday" means something similar to "ABC has been taken over by XYZ," the general problem of detecting that two phrases or sentences mean the same thing, the paraphrase task, is still quite hard. Even harder is the task of summarization: reading a number of, let's say, news articles saying that the Dow Jones is up, the S&P 500 has jumped, and housing prices rose, and aggregating that to tell the user something like "in summary, the economy is good." And finally, there's one of the hardest tasks in natural language processing: carrying on a complete human-machine dialogue. Here's a simple example, asking what movie is playing when and buying movie tickets, and you can get applications that do that today. But the general problem of understanding everything the user might ask for, and returning a sensible response, is quite difficult.
Why is natural language processing so difficult? One cute example is the kind of ambiguity problem called a crash blossom. Ambiguity is any case where a surface form has multiple interpretations, and a crash blossom is the name for a headline with two meanings, where the ambiguity causes a humorous interpretation. Reading the first headline, "Violinist Linked to JAL Crash Blossoms," you might think that the main verb is "linked" and ask what the violinist is being linked to: Japan Airlines crash blossoms. Well, what are crash blossoms? This headline gave the phenomenon its name, because in the interpretation the headline writer intended, the main verb is "blossoms," the one doing the blossoming is the violinist, and the fact about being linked to the JAL crash is a modifier of "violinist." There are similar kinds of syntactic ambiguity. In "Teacher Strikes Idle Kids," the writer intended the main verb to be "idle": the strikes caused the kids to be idle. But of course the humorous interpretation is that the teacher is striking; "strikes" is the verb, and we have a teacher striking idle kids. Another important kind of ambiguity is word sense ambiguity. In our third example, "Red Tape Holds Up New Bridges," the writer intended "holds up" to mean something like "delays"; call that sense one of "holds up." But the amusing interpretation uses the second sense of "holds up," which we might write down as "supports," and now we get the interpretation that literal red tape, as opposed to bureaucratic red tape, is actually supporting a bridge. And we can see lots of other kinds of ambiguity in these actual headlines.
Now, it turns out that it's not just amusing headlines that have ambiguity. Ambiguity is pervasive throughout natural language text. Let's look at a sensible, non-ambiguous-looking headline from the New York Times. The headline, shortened here a bit, is "Fed raises interest rates," and that seems unambiguous. We have a verb, "raises." What gets raised? A noun phrase, "interest rates." Together they form a verb phrase, "raises interest rates," and "Fed" makes a little noun phrase, so we say this is a sentence with a noun phrase, "Fed," and a verb phrase headed by "raises," where what gets raised is "interest rates." This is called a phrase structure parse; we'll talk about phrase structure later in the course. We could also write a dependency parse: we say the head verb, "raises," has an argument which is "Fed" and another dependent which is "rates," and "rates" itself has a dependent, "interest." So we can see the main verb is "raises."

Well, another interpretation of the very same sentence, one that people don't see but that parsers see right away, is that it's not "raises" that's the main verb of the sentence but "interest." Somebody interests something, and the thing that gets interested is "rates." And what is interesting these rates? It's "Fed raises," raises by the Fed. So it's a completely different sentence with a different interpretation, that something is interesting the rates, whatever that could mean, and it seems an unlikely interpretation to people. But for a parser, this is a perfectly reasonable interpretation that we have to learn how to rule out. In fact, the sentence can get even more difficult: the actual headline was somewhat longer, "Fed raises interest rates half a percent." Here we could imagine that "rates" is the verb, and what does the rating is "Fed raises interest," the interest in raises by the Fed, with "raises" doing the interesting and "Fed" modifying "raises"; so we might get yet another dependency structure. So, whether with our phrase structure parse or our dependency parse, and even more so as we add more words, we get more and more ambiguities that have to be resolved in order to build a parse for each sentence.
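If you want to see this kind of analysis in practice, here is a small sketch using spaCy's off-the-shelf dependency parser; it assumes spaCy and its small English model are installed. A statistical parser like this is the kind of system that has learned, from data, to rule out the implausible readings.

    import spacy

    # Assumes: pip install spacy && python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")

    doc = nlp("Fed raises interest rates half a percent")
    for token in doc:
        # Each token points at its head word, giving a dependency parse.
        print(f"{token.text:10} --{token.dep_:8}--> {token.head.text}")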
Now, about the format of the course: there are in-video quizzes, and most lectures will include a little quiz. They're there just to check basic understanding; they're simple multiple-choice questions, and you can retake them if you get them wrong. Let's see one right now.

A number of other things make natural language understanding difficult. One of them is the non-standard English that we frequently see in text like Twitter feeds, with unusual capitalization, unusual spellings of words, hashtags, user IDs, and so on. All of the parsers and part-of-speech taggers that we're going to make use of are often trained on very clean newspaper English, but the actual English in the wild will cause us a lot of problems. We'll have a lot of segmentation problems: for example, if we see the string "york-new" as part of "New York-New Haven," how do we know that the correct segmentation is "New York" and "New Haven," as in the New York-New Haven Railroad, and not something like "york-new," which is not a word the way "in-law" is? We have to solve the segmentation problem correctly. We have problems with idioms and with new words that haven't been seen before. And we'll also have problems with entity names, like the movie "A Bug's Life," which has English words in it, so it's often difficult to know where the movie name starts and ends. This comes up very often in biology, where genes and proteins are named with English words.
The task of natural language understanding is very difficult. What tools do we need? Well, we need knowledge about language, knowledge about the world, and a way to combine these knowledge sources. Generally the way we do this is to use probabilistic models built from language data. For example, if we see the word "maison" in French, we are very likely to translate it as the word "house" in English. On the other hand, if we see the phrase "avocat général" in French, we are very unlikely to translate it as "general avocado." Training these probabilistic models in general can be very hard, but it turns out that we can do an approximate job with rough text features, and we'll introduce those rough text features as we go along.
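To make the idea of a probabilistic model built from data concrete, here is a minimal sketch that estimates word translation probabilities by counting over a tiny, made-up set of word-aligned French-English pairs; real MT systems estimate these probabilities from millions of aligned sentences, not from counts this small.

    from collections import Counter

    # Tiny made-up word-aligned data: (french_word, english_word) pairs.
    ALIGNED = [
        ("maison", "house"), ("maison", "house"), ("maison", "home"),
        ("avocat", "lawyer"), ("avocat", "lawyer"), ("avocat", "avocado"),
    ]

    pairs = Counter(ALIGNED)
    totals = Counter(f for f, _ in ALIGNED)

    def p_translation(english, french):
        # Relative-frequency estimate of P(english | french).
        return pairs[(french, english)] / totals[french]

    print(p_translation("house", "maison"))    # 2/3
    print(p_translation("avocado", "avocat"))  # 1/3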
So our goal in this class is to teach the key theory and methods of statistical natural language processing. We'll talk about the Viterbi algorithm, Naive Bayes and maximum entropy classifiers, N-gram language modeling, and statistical parsing. We'll talk about the inverted index, TF-IDF, and vector models of meaning that are important in information retrieval. And we'll do this for practical, robust, real-world applications: we'll talk about information extraction, about spelling correction, about information retrieval. As for the skills you'll need: some simple linear algebra, so you should know what a vector is and what a matrix is; some basic probability theory; and you need to know how to program in either Java or Python, because there will be weekly programming assignments, with your choice of language. We're very happy to welcome you to our course on Natural Language Processing, and we look forward to seeing you in the following lectures.
