Understanding: Chapter 14: Rich & Knight Dr. Suthikshn Kumar
Understanding
What is Understanding? To understand something is to transform it from one representation into another, where the second representation has been chosen to correspond to a set of available actions that could be performed, and where the mapping has been designed so that for each event an appropriate action will be performed. The success or failure of an understanding program can rarely be measured in an absolute sense; it must instead be measured with respect to a particular task to be performed. Example: the sentence "I need to go to New York as soon as possible" is interpreted differently by an airline reservation system and by a family friend.
Computer understanding
Applied to images, speech and text.
What makes Understanding Hard?
1. The complexity of the target representation into which the matching is being done.
2. The type of the mapping: one-one, many-one, one-many or many-many.
3. The level of interaction of the components of the source representation.
4. The presence of noise in the input to the understander.
The results of understanding this story could be presented using the conceptual dependency model. This representation is considerably more complex than that for the simple query. Constructing such a complex representation is more difficult than constructing a simple one since more information must be extracted from the input sentences. Extracting information often requires the use of additional knowledge about the world described by the sentences.
Type of Mapping
Understanding is the process of mapping an input from its original form to a more useful one. The simplest kind of mapping to deal with is one-to-one, i.e., each different statement maps to a single target representation that is different from that arising from any other statement. Very few input systems are totally one-to-one. Consider the language of arithmetic expressions in many programming languages:
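A := B + C * D
[Figure: parse tree of the expression, with := at the root, A and + as its children, B and * as the children of +, and C and D as the children of *.]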
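The one-to-one mapping from an assignment statement to its tree can be sketched as follows. This is only an illustrative sketch: the function names `tokenize` and `parse_assignment` and the nested-tuple tree format are assumptions made here, and the grammar covers only identifiers, `:=`, `+` and `*`.

```python
import re

def tokenize(text):
    # Split into identifiers and the three operators used in the example.
    return re.findall(r"[A-Za-z_]\w*|:=|[+*]", text)

def parse_assignment(text):
    """Map 'A := B + C * D' to a nested tuple (operator, left, right)."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def term():
        # term := identifier ('*' identifier)*
        node = take()
        while peek() == "*":
            take()
            node = ("*", node, take())
        return node

    def expr():
        # expr := term ('+' term)*   ('*' binds tighter than '+')
        node = term()
        while peek() == "+":
            take()
            node = ("+", node, term())
        return node

    target = take()
    if take() != ":=":
        raise ValueError("expected ':='")
    return (":=", target, expr())

tree = parse_assignment("A := B + C * D")
print(tree)  # (':=', 'A', ('+', 'B', ('*', 'C', 'D')))
```

Each well-formed statement in this small language yields exactly one tree, which is what makes the mapping easy to perform.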
One-to-one mapping
Although one-to-one mappings are generally the simplest to perform, they are rare in interesting input systems for several reasons. One important reason is that in many domains, inputs must be interpreted not absolutely but relatively, with respect to some reference point. When images are being interpreted, size and perspective change as a function of the viewing position, so a single object will look different in different images. Examples: pictures taken close to the scene and pictures taken from farther away. Example in English: "a tall giraffe" and "a tall poodle", where "tall" is relative to the kind of animal described.
Many-to-one mapping
Natural languages, in both spoken and written form, require many-to-one mappings. Speech: no two people speak alike. In fact, one person does not always say a given word the same way, so speech spectrograms of the same word look different. All of the following sentences map to the same search in an English front end to a keyword data retrieval system:
Tell me all about the last presidential election.
I'd like to see all the stories on the last presidential election.
I am interested in the last presidential election.
(SEARCH KEYWORD = ELECTION & PRESIDENT)
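A minimal sketch of such a many-to-one front end is given below. The `KEYWORDS` table and the function `to_search` are hypothetical names chosen for illustration; a real retrieval front end would need a much richer vocabulary and some treatment of words such as "last".

```python
import re

# Hypothetical vocabulary: surface word -> canonical search keyword.
KEYWORDS = {
    "election": "ELECTION",
    "elections": "ELECTION",
    "presidential": "PRESIDENT",
    "president": "PRESIDENT",
}

def to_search(sentence):
    """Reduce a sentence to the set of search keywords it mentions."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return frozenset(KEYWORDS[w] for w in words if w in KEYWORDS)

sentences = [
    "Tell me all about the last presidential election.",
    "I'd like to see all the stories on the last presidential election.",
    "I am interested in the last presidential election.",
]
queries = {to_search(s) for s in sentences}
print(len(queries))  # 1: all three sentences map to the same search
```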
One-to-Many mapping
One-to-many mappings often require a great deal of domain knowledge in order to make the correct choice among the available representations. An example: "They are flying planes":
(They are (flying airplanes))
(They (are flying) airplanes)
(They are (flying planing-tools))
(They (are flying) planing-tools)
This sentence in isolation is ambiguous. Clues, both from previous sentences and from the physical context in which the sentence occurs, usually make one of the interpretations appear to be correct. English has the properties of a many-to-many mapping: there are many ways to say the same thing, and a given statement may have many meanings.
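The four readings of "They are flying planes" arise from two independent choices: the bracketing of "are flying" and the sense of "planes". The sketch below enumerates them and then filters by context. The names `STRUCTURES`, `SENSES`, `readings` and `select` are hypothetical, and substring matching merely stands in for the domain knowledge a real system would apply.

```python
from itertools import product

# Two bracketings of the sentence (hypothetical templates).
STRUCTURES = [
    "(They are (flying {noun}))",   # "flying" modifies the noun
    "(They (are flying) {noun})",   # "are flying" is the verb group
]
# Two senses of the word "planes".
SENSES = ["airplanes", "planing-tools"]

def readings():
    """One input sentence maps to every structure/sense combination."""
    return [s.format(noun=n) for n, s in product(SENSES, STRUCTURES)]

def select(hints):
    """Keep only the readings that contain every context hint."""
    return [r for r in readings() if all(h in r for h in hints)]

print(len(readings()))                      # 4
print(select(["are flying", "airplanes"]))  # ['(They (are flying) airplanes)']
```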
In most interesting understanding contexts, each input is composed of several components (lines, words, symbols). The mapping process is simplest if each component can be mapped without concern for the other components of the statement. Programming languages provide good examples of languages in which there is very little interaction among the components of an input. In many natural language sentences, on the other hand, changing a single word can alter not just a single node of the interpretation but its entire structure. The components of an English sentence typically interact more heavily with each other than do the components of artificial languages such as programming languages. Example: local ambiguity in a speech problem:
The cat scares all the birds away.
A cat's cares are few.
The correct grouping cannot be determined without looking at the larger context in which the sounds occurred. In image understanding problems, a similar problem involving local indeterminacy arises.
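The local ambiguity in "cat scares" versus "cat's cares" can be illustrated by enumerating every way of splitting an unbroken stream into known words. In this sketch letters stand in for sounds, and the `LEXICON` set and `segmentations` function are assumptions made for illustration only.

```python
def segmentations(stream, lexicon):
    """Return every way to split `stream` into a sequence of lexicon words."""
    if not stream:
        return [[]]
    results = []
    for end in range(1, len(stream) + 1):
        word = stream[:end]
        if word in lexicon:
            # Try each split of the remainder after this candidate word.
            for rest in segmentations(stream[end:], lexicon):
                results.append([word] + rest)
    return results

# Hypothetical lexicon; letters are a stand-in for the sound stream.
LEXICON = {"the", "a", "cat", "cats", "scares", "cares"}

print(segmentations("catscares", LEXICON))
# [['cat', 'scares'], ['cats', 'cares']]
```

Both groupings are locally acceptable, so the choice has to come from the rest of the sentence.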
Understanding is the process of interpreting an input and assigning it meaning. In many understanding situations, the input to which meaning should be assigned is not always the input that is presented to the understander. Because of the complex environment in which understanding usually occurs, other things often interfere with the basic input before it reaches the understander. In speech understanding, we must take an input signal and separate the speech component from the background noise component in order to understand the speech. In image understanding, if you look out of your car window in search of a particular store sign, the image you see of the sign may be interfered with by many things, such as your windshield wipers or trees alongside the road. In text, typing errors are common if language is being used interactively to communicate with a computer system.
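For typed input, one simple way to separate the intended word from typing noise is to replace an unknown word by its closest vocabulary entry. The sketch below uses Python's standard `difflib.get_close_matches`; the `VOCABULARY` list, the `repair` function and the 0.8 similarity cutoff are assumptions chosen for illustration, not part of the original material.

```python
import difflib

# Hypothetical vocabulary of an airline reservation front end.
VOCABULARY = ["reservation", "airline", "flight", "ticket", "york"]

def repair(word, vocabulary=VOCABULARY):
    """Return the closest vocabulary word, or the input if nothing is close."""
    if word in vocabulary:
        return word
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else word

print(repair("resrvation"))  # reservation
```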
Waltz Algorithm
1. Find the lines at the border of the scene boundary and label them. These lines can be found by finding an outline such that no vertices are outside it. We do this first because this labeling will impose additional constraints on the other labelings in the figure.
2. Number the vertices of the figure to be analyzed. These numbers will correspond to the order in which the vertices will be visited during the labeling process. To decide on the numbering, do the following:
   a. Start at any vertex on the boundary of the figure. Since boundary lines are known, the vertices involving them are more highly constrained than are interior ones.
   b. Move from the vertex along the boundary to an adjacent unnumbered vertex and continue until all boundary vertices have been numbered.
   c. Number interior vertices by moving from a numbered vertex to some adjacent unnumbered one. By always labeling a vertex next to one that has already been labeled, maximum use can be made of constraints.
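The vertex-numbering step can be sketched as follows, assuming the boundary has already been found in step 1 and is supplied as a list of vertices in walk order. The function `number_vertices`, the adjacency-dictionary format and the example figure (a hexagonal outline with one interior vertex, as in a line drawing of a cube) are illustrative assumptions.

```python
def number_vertices(adjacency, boundary_walk):
    """Assign visiting numbers: boundary vertices first, then interior ones.

    adjacency: dict mapping each vertex to a list of adjacent vertices.
    boundary_walk: boundary vertices in the order met walking the outline.
    """
    order = list(boundary_walk)
    numbered = set(order)
    remaining = set(adjacency) - numbered
    while remaining:
        # Only pick an interior vertex adjacent to an already numbered one.
        candidates = sorted(
            v for v in remaining
            if any(n in numbered for n in adjacency[v])
        )
        if not candidates:
            raise ValueError("figure is not connected")
        v = candidates[0]
        order.append(v)
        numbered.add(v)
        remaining.remove(v)
    return {v: i for i, v in enumerate(order, start=1)}

# Hypothetical cube drawing: outline a..f, interior vertex g.
cube = {
    "a": ["b", "f", "g"], "b": ["a", "c"], "c": ["b", "d", "g"],
    "d": ["c", "e"], "e": ["d", "f", "g"], "f": ["e", "a"],
    "g": ["a", "c", "e"],
}
numbers = number_vertices(cube, ["a", "b", "c", "d", "e", "f"])
print(numbers["g"])  # 7: the interior vertex is visited last
```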
3. Visit each vertex V in order and attempt to label it by doing the following:
   a. Using the set of possible vertex labelings given in the figure, attach to V a list of possible labelings.
   b. See whether some of these labelings can be eliminated on the basis of local constraints.
   c. Use the set of labelings just attached to V to constrain the labelings at vertices adjacent to V.
This algorithm will always find the unique, correct figure labeling if one exists. If a figure is ambiguous, the algorithm will terminate with at least one vertex still having more than one labeling attached to it. This algorithm was applied to a larger class of figures in which cracks and shadows might occur.
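The constraint-propagation idea behind the labeling step can be sketched as below. This is a simplified illustration rather than the full procedure: the function `waltz_filter`, the data layout and the three-vertex candidate sets are hypothetical, and the candidate labelings are supplied by hand instead of being drawn from the real catalog of vertex types. The only constraint enforced is that two vertices sharing a line must give that line the same label.

```python
from collections import deque

def waltz_filter(lines_at, candidates):
    """Discard vertex labelings that no neighbouring vertex can agree with.

    lines_at: vertex -> tuple of names of the lines meeting at that vertex.
    candidates: vertex -> set of label tuples, aligned with lines_at[vertex].
    """
    cands = {v: set(c) for v, c in candidates.items()}
    queue = deque(cands)
    while queue:
        v = queue.popleft()
        for u in cands:
            shared = [l for l in lines_at[u] if l in lines_at[v]]
            if u == v or not shared:
                continue

            def compatible(lab_u):
                # Some labeling at v must agree with lab_u on every shared line.
                return any(
                    all(lab_u[lines_at[u].index(l)] == lab_v[lines_at[v].index(l)]
                        for l in shared)
                    for lab_v in cands[v]
                )

            kept = {lab for lab in cands[u] if compatible(lab)}
            if kept != cands[u]:
                cands[u] = kept
                queue.append(u)  # u changed, so its neighbours are rechecked
    return cands

# Hypothetical three-vertex figure with lines pq, qr and rp.
lines_at = {"P": ("rp", "pq"), "Q": ("pq", "qr"), "R": ("qr", "rp")}
candidates = {
    "P": {("+", "+"), ("-", "-")},
    "Q": {("+", "-"), ("+", "+")},
    "R": {("-", "+")},
}
result = waltz_filter(lines_at, candidates)
print(result["P"], result["Q"], result["R"])
```

In this toy figure every vertex ends with exactly one labeling; a vertex left with several labelings would correspond to the ambiguous case described above.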
Summary
In this chapter, we outlined the major difficulties that confront programs designed to perform perceptual tasks. We also described the use of the constraint satisfaction procedure as one way of surmounting some of those difficulties. The problems of speech and image understanding are important in the construction of stand-alone programs that solve one particular task. They also play an important role in the larger field of robotics, which has as its goal the construction of intelligent robots capable of functioning with some degree of autonomy.