
INFO 2950: Prof. Carla Gomes Gomes@cs - Cornell.edu

This document provides an introduction to modeling computation using grammars and formal languages. It discusses how grammars can be used to generate sentences in a language and determine if a given sentence is part of that language. It then covers three common structures used to model computation: grammars, finite state machines, and Turing machines. Grammars are used to model both programming languages and natural languages. The document gives examples of phrase structure grammars and defines key terminology such as productions, derivations, and the language generated by a grammar.

Uploaded by

moin khan

INFO 2950

Prof. Carla Gomes



Module
Modeling Computation:
Languages and Grammars
Rosen, Chapter 12.1
Modeling Computation

Given a task:
Can it be performed by a computer?
We learned earlier that some tasks are unsolvable.
For the tasks that can be performed by a computer, how can they be
carried out?
We learned earlier the concept of an algorithm.
– A description of a computational procedure.
How can we model the computer itself, and what it is doing when it
carries out an algorithm?

Models of Computation –
we want to model the abstract process of computation itself.
We’ll cover three types of structures used in modeling computation:
Grammars
• Used to generate the sentences of a language and to determine whether a given sentence is in a language.
• Formal languages, generated by grammars, provide models for programming languages (Java, C, etc.) as well as natural languages; they are important for constructing compilers.
Finite-state machines (FSMs)
• An FSM is characterized by a set of states, an input alphabet, and a transition function that assigns a next state to each pair of a state and an input. We’ll study FSMs with and without output. They are used in language recognition (they are equivalent to certain grammars), but also for other tasks such as controlling vending machines.
Turing machines
• An abstraction of a computer; used to compute number-theoretic functions.

Early Models of Computation

Recursive Function Theory – Kleene, Church, Turing, Post, 1930’s (before computers!!)
Turing Machines – Turing, 1930’s (defined: computable)
RAM Machines – von Neumann, 1940’s (“real computer”)
Cellular Automata – von Neumann, 1950’s (Wolfram 2005; physics of our world?)
Finite-state machines, pushdown automata – various people, 1950’s
VLSI models – 1970’s (integrated circuits in which thousands of transistors form a single chip)
Parallel RAMs, etc. – 1980’s
Computers as Transition Functions
A computer (or really any physical system) can be modeled as having, at any given time, a specific state s ∈ S from some (finite or infinite) state space S.
Also, at any time, the computer receives an input symbol i ∈ I and produces an output symbol o ∈ O.
– Where I and O are sets of symbols.
• Each “symbol” can encode an arbitrary amount of data.
A computer can then be modeled as simply being a transition function T: S×I → S×O.
– Given the old state and the input, this tells us what the computer’s new state and its output will be a moment later.
Every model of computing we’ll discuss can be viewed as just being some special case of this general picture.
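
The transition-function picture can be made concrete with a small sketch. The machine below is an illustrative assumption, not something from the slides: its state is the running parity of the bits seen so far, and `parity_step` and `run` are hypothetical names.

```python
from typing import Callable, Iterable, List, Tuple

State = int   # S = {0, 1}: parity of the input bits seen so far
Symbol = int  # I = O = {0, 1}

def parity_step(state: State, symbol: Symbol) -> Tuple[State, Symbol]:
    """T(s, i) = (s', o): fold the input bit into the parity and output it."""
    new_state = state ^ symbol
    return new_state, new_state

def run(transition: Callable[[State, Symbol], Tuple[State, Symbol]],
        state: State,
        inputs: Iterable[Symbol]) -> Tuple[State, List[Symbol]]:
    """Apply the transition function once per input symbol."""
    outputs = []
    for symbol in inputs:
        state, out = transition(state, symbol)
        outputs.append(out)
    return state, outputs

final_state, outputs = run(parity_step, 0, [1, 0, 1, 1])
print(final_state, outputs)  # prints: 1 [1, 1, 0, 1]
```

Any other model in this module could be plugged into `run` by swapping in a different transition function.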
Language Recognition Problem

Let a language L be any set of some arbitrary objects s, which will be dubbed “sentences.”
– The “legal” or “grammatically correct” sentences of the language.
Let the language recognition problem for L be:
– Given a sentence s, is it a legal sentence of the language L?
• That is, is s ∈ L?

Surprisingly, this simple problem is as general as our very notion of computation itself! Hmm…
Ex: an addition “language” whose sentences have the form “num1-num2-(num1+num2)”; recognizing a sentence of this language amounts to checking an addition.
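
As a rough illustration of why recognition is as general as computation, here is a sketch of a recognizer for the addition language. The exact sentence syntax (decimal numbers separated by hyphens) and both function names are assumptions made for this example.

```python
from typing import Optional

def in_addition_language(sentence: str) -> bool:
    """Is the sentence of the form num1-num2-(num1+num2)?"""
    parts = sentence.split("-")
    if len(parts) != 3:
        return False
    if not all(p.isascii() and p.isdigit() for p in parts):
        return False
    num1, num2, total = (int(p) for p in parts)
    return num1 + num2 == total

def add_via_recognizer(num1: int, num2: int, limit: int = 1000) -> Optional[int]:
    """Recover num1 + num2 using only membership queries (brute force)."""
    for candidate in range(limit):
        if in_addition_language(f"{num1}-{num2}-{candidate}"):
            return candidate
    return None  # the sum is not below the search limit

print(in_addition_language("2-3-5"))  # True
print(add_via_recognizer(17, 25))     # 42
```

The second function shows the point of the slide: a procedure that can only answer "is s in L?" is already enough to compute the sum, if inefficiently.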
Languages and Grammars
Finite-State Machines with Output
Finite-State Machines with No Output
Language Recognition
Turing Machines
Languages & Grammars

Phrase-Structure Grammars
Types of Phrase-Structure Grammars
Derivation Trees
Backus-Naur Form
Intro to Languages

English grammar tells us if a given combination of words is a valid sentence.

The syntax of a sentence concerns its form, while the semantics concerns its meaning.
e.g. the mouse wrote a poem

From a syntax point of view this is a valid sentence.

From a semantics point of view, not so fast… perhaps in Disneyland.

Natural languages (English, French, Portuguese, etc.) have very complex rules of syntax, and these rules are not necessarily well-defined.

Formal Language

Formal language – is specified by a well-defined set of rules of syntax.

We describe the sentences of a formal language using a grammar.

Two key questions:
1 – Is a combination of words a valid sentence in a formal language?
2 – How can we generate the valid sentences of a formal language?

Formal languages provide models for both natural languages and programming languages.

Grammars

A formal grammar G is any compact, precise mathematical definition of a language L.
– As opposed to just a raw listing of all of the language’s legal sentences, or just examples of them.
A grammar implies an algorithm that would generate all legal sentences of the language.
– Often, it takes the form of a set of recursive definitions.
A popular way to specify a grammar recursively is to specify it as a phrase-structure grammar.
Grammars (Semi-formal)

Example: A grammar that generates a subset of the English language

sentence → noun_phrase predicate
noun_phrase → article noun
predicate → verb

article → a
article → the

noun → boy
noun → dog

verb → runs
verb → sleeps
A derivation of “the boy sleeps”:

sentence ⇒ noun_phrase predicate
⇒ noun_phrase verb
⇒ article noun verb
⇒ the noun verb
⇒ the boy verb
⇒ the boy sleeps

A derivation of “a dog runs”:

sentence ⇒ noun_phrase predicate
⇒ noun_phrase verb
⇒ article noun verb
⇒ a noun verb
⇒ a dog verb
⇒ a dog runs
Language of the grammar:

L = { “a boy runs”,
“a boy sleeps”,
“the boy runs”,
“the boy sleeps”,
“a dog runs”,
“a dog sleeps”,
“the dog runs”,
“the dog sleeps” }
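
The eight sentences above can be enumerated mechanically. The following is a minimal sketch; the dictionary encoding and the name `expand` are assumptions for illustration, and the recursion only terminates because this grammar has no recursive rules.

```python
from itertools import product

# Each nonterminal maps to its list of right-hand sides.
GRAMMAR = {
    "sentence": [["noun_phrase", "predicate"]],
    "noun_phrase": [["article", "noun"]],
    "predicate": [["verb"]],
    "article": [["a"], ["the"]],
    "noun": [["boy"], ["dog"]],
    "verb": [["runs"], ["sleeps"]],
}

def expand(symbol):
    """Return every list of terminal words derivable from `symbol`."""
    if symbol not in GRAMMAR:  # a terminal stands for itself
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine every expansion of each right-hand-side symbol, in order.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results

language = {" ".join(words) for words in expand("sentence")}
print(len(language))  # 8
```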

Notation

noun → boy
noun → dog

Each line is a production rule. The symbol on the left (noun) is a variable, or non-terminal; the symbol on the right (boy, dog) is a terminal. Both are symbols of the vocabulary.
Basic Terminology

► A vocabulary (or alphabet) V is a finite nonempty set of elements called symbols.
• Example: V = {a, b, c, A, B, C, S}
► A word (or sentence) over V is a string of finite length of elements of V.
• Example: Aba
► The empty (or null) string λ is the string with no symbols.
► V* is the set of all words over V.
• Example: V* = {λ, Aba, BBa, bAA, cab, …}
► A language over V is a subset of V*.
• We can give some criteria for a word to be in a language.
Phrase-Structure Grammars

A phrase-structure grammar (abbr. PSG) G = (V,T,S,P) is a 4-tuple, in which:
– V is a vocabulary (set of symbols)
• The “template vocabulary” of the language.
– T ⊆ V is a set of symbols called terminals
• Actual symbols of the language.
• Also, N :≡ V − T is a set of special “symbols” called nonterminals. (Representing concepts like “noun”)
– S ∈ N is a special nonterminal, the start symbol.
• In our example the start symbol was “sentence”.
– P is a set of productions (to be defined).
• Rules for substituting one sentence fragment for another.
• Every production rule must contain at least one nonterminal on its left side.
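
One way to hold the 4-tuple in code is sketched below. The class name `PSG`, its field names, and the restriction to single-character symbols are assumptions for this example; the check mirrors the conditions listed above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

@dataclass(frozen=True)
class PSG:
    vocabulary: FrozenSet[str]               # V
    terminals: FrozenSet[str]                # T
    start: str                               # S
    productions: Tuple[Tuple[str, str], ...] # P, as (left side, right side)

    @property
    def nonterminals(self) -> FrozenSet[str]:
        """N = V - T."""
        return self.vocabulary - self.terminals

    def is_well_formed(self) -> bool:
        if not self.terminals <= self.vocabulary:   # T must be a subset of V
            return False
        if self.start not in self.nonterminals:     # S must be in N
            return False
        for lhs, rhs in self.productions:
            if not set(lhs + rhs) <= self.vocabulary:
                return False
            # Every left side needs at least one nonterminal.
            if not any(sym in self.nonterminals for sym in lhs):
                return False
        return True

# Illustrative grammar: S -> aSb, S -> λ (the empty string "").
g = PSG(frozenset("abS"), frozenset("ab"), "S", (("S", "aSb"), ("S", "")))
print(g.is_well_formed())  # True
```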
Phrase-structure Grammar

► EXAMPLE:

Let G = (V, T, S, P),
where V = {a, b, A, B, S},
T = {a, b},
S is the start symbol,
P = {S → ABa, A → BB, B → ab, A → Bb}.

G is a phrase-structure grammar.
What sentences can be generated with this grammar?
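
One way to answer the question is to search all derivations by machine. The sketch below does a breadth-first search over sentential forms; the function name `generate` and the `max_len` cutoff are assumptions, and the cutoff is only safe here because no production in this grammar shortens a string.

```python
from collections import deque

def generate(productions, start, nonterminals, max_len):
    """Collect all terminal strings reachable from `start`, ignoring forms longer than max_len."""
    seen = {start}
    queue = deque([start])
    language = set()
    while queue:
        form = queue.popleft()
        if not any(sym in nonterminals for sym in form):
            language.add(form)  # only terminals left: a sentence
            continue
        for lhs, rhs in productions:
            pos = form.find(lhs)
            while pos != -1:  # try the production at every position
                new_form = form[:pos] + rhs + form[pos + len(lhs):]
                if len(new_form) <= max_len and new_form not in seen:
                    seen.add(new_form)
                    queue.append(new_form)
                pos = form.find(lhs, pos + 1)
    return language

P = [("S", "ABa"), ("A", "BB"), ("B", "ab"), ("A", "Bb")]
print(sorted(generate(P, "S", set("ABS"), max_len=10)))
# prints: ['abababa', 'abbaba']
```

So this grammar generates exactly two sentences, one for each choice of production for A.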
Derivation

Definition

Let G = (V,T,S,P) be a phrase-structure grammar.

Let w0 = l z0 r (the concatenation of l, z0, and r) and w1 = l z1 r be strings over V.

If z0 → z1 is a production of G, we say that w1 is directly derivable from w0 and we write w0 ⇒ w1.

If w0, w1, …, wn are strings over V such that w0 ⇒ w1, w1 ⇒ w2, …, wn-1 ⇒ wn, then we say that wn is derivable from w0, and write w0 ⇒* wn.

The sequence of steps used to obtain wn from w0 is called a derivation.
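
The definition translates almost word for word into code. This is a sketch under the assumption that every symbol is a single character; `directly_derives` and `is_derivation` are hypothetical names.

```python
def directly_derives(w0, w1, productions):
    """True if w0 = l z0 r and w1 = l z1 r for some production z0 -> z1."""
    for z0, z1 in productions:
        pos = w0.find(z0)
        while pos != -1:
            left, right = w0[:pos], w0[pos + len(z0):]
            if left + z1 + right == w1:
                return True
            pos = w0.find(z0, pos + 1)
    return False

def is_derivation(steps, productions):
    """True if each string in `steps` is directly derivable from the one before it."""
    return all(directly_derives(a, b, productions)
               for a, b in zip(steps, steps[1:]))

P = [("S", "ABa"), ("A", "BB"), ("B", "ab"), ("A", "Bb")]
print(is_derivation(["S", "ABa", "BbBa", "abbBa", "abbaba"], P))  # True
print(is_derivation(["S", "abbaba"], P))                          # False
```

The second call fails because abbaba is derivable from S but not directly derivable: the intermediate steps are missing.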


Language

Let G = (V,T,S,P) be a phrase-structure grammar. The language generated by G (or the language of G), denoted by L(G), is the set of all strings of terminals that are derivable from the start symbol S.

L(G) = {w ∈ T* | S ⇒* w}

Language L(G)

► EXAMPLE:
Let G = (V, T, S, P), where V = {a, b, A, S}, T = {a, b}, S is the start symbol, and P = {S → aA, S → b, A → aa}.

The language of this grammar is given by L(G) = {b, aaa}:

• We can derive aA from S using S → aA, and then derive aaa using A → aa.

• We can also derive b using S → b.


Another example

Grammar: G = (V, T, S, P), with V = {a, b, S}, T = {a, b}, and P:
S → aSb
S → λ

Derivation of the sentence ab:
S ⇒ aSb ⇒ ab
(using S → aSb, then S → λ)

Derivation of the sentence aabb:
S ⇒ aSb ⇒ aaSbb ⇒ aabb
(using S → aSb twice, then S → λ)

Other derivations:
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaaaSbbbb ⇒ aaaabbbb

So, what’s the language of the grammar with the productions S → aSb and S → λ?

Language of the grammar with the productions S → aSb, S → λ:
L = {aⁿbⁿ : n ≥ 0}
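
Both directions, generating and recognizing, fit in a few lines for this grammar (productions S → aSb and S → λ). The function names are assumptions; the recognizer simply undoes S → aSb from the outside in.

```python
def derive(n):
    """Apply S -> aSb n times, then S -> λ; return every sentential form."""
    forms = ["S"]
    for _ in range(n):
        forms.append(forms[-1].replace("S", "aSb"))
    forms.append(forms[-1].replace("S", ""))
    return forms

def in_language(w):
    """Recognize strings of n a's followed by n b's, n >= 0."""
    while w.startswith("a") and w.endswith("b"):
        w = w[1:-1]  # strip one matching a...b pair
    return w == ""

print(" => ".join(derive(2)))  # S => aSb => aaSbb => aabb
print(in_language("aaabbb"), in_language("aab"))  # True False
```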

PSG Example – English Fragment

We have G = (V, T, S, P), where:


V = {(sentence), (noun phrase),
(verb phrase), (article), (adjective),
(noun), (verb), (adverb), a, the, large,
hungry, rabbit, mathematician, eats, hops,
quickly, wildly}
T = {a, the, large, hungry, rabbit, mathematician,
eats, hops, quickly, wildly}
S = (sentence)
P = (see next slide)
Productions for our Language

P = { (sentence) → (noun phrase) (verb phrase),


(noun phrase) → (article) (adjective) (noun),
(noun phrase) → (article) (noun),
(verb phrase) → (verb) (adverb),
(verb phrase) → (verb),
(article) → a, (article) → the,
(adjective) → large, (adjective) → hungry,
(noun) → rabbit, (noun) → mathematician,
(verb) → eats, (verb) → hops,
(adverb) → quickly, (adverb) → wildly }
A Sample Sentence Derivation

(sentence)
⇒ (noun phrase) (verb phrase)
⇒ (article) (adjective) (noun) (verb phrase)
⇒ (article) (adjective) (noun) (verb) (adverb)
⇒ the (adjective) (noun) (verb) (adverb)
⇒ the large (noun) (verb) (adverb)
⇒ the large rabbit (verb) (adverb)
⇒ the large rabbit hops (adverb)
⇒ the large rabbit hops quickly

On each step, we apply a production to a fragment of the previous sentence template to get a new sentence template. Finally, we end up with a sequence of terminals (real words), that is, a sentence of our language L.
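
The same step-by-step substitution can be automated. The sketch below picks a random production for each nonterminal until only terminals remain; the dictionary encoding and the name `random_sentence` are assumptions for illustration.

```python
import random

PRODUCTIONS = {
    "sentence": [["noun phrase", "verb phrase"]],
    "noun phrase": [["article", "adjective", "noun"], ["article", "noun"]],
    "verb phrase": [["verb", "adverb"], ["verb"]],
    "article": [["a"], ["the"]],
    "adjective": [["large"], ["hungry"]],
    "noun": [["rabbit"], ["mathematician"]],
    "verb": [["eats"], ["hops"]],
    "adverb": [["quickly"], ["wildly"]],
}
TERMINALS = {"a", "the", "large", "hungry", "rabbit", "mathematician",
             "eats", "hops", "quickly", "wildly"}

def random_sentence(rng, symbol="sentence"):
    """Expand `symbol` into a list of terminal words, choosing productions at random."""
    if symbol not in PRODUCTIONS:  # terminals are left as they are
        return [symbol]
    rhs = rng.choice(PRODUCTIONS[symbol])
    words = []
    for s in rhs:
        words.extend(random_sentence(rng, s))
    return words

rng = random.Random(0)  # fixed seed so repeated runs agree
for _ in range(3):
    print(" ".join(random_sentence(rng)))
```

Every printed line is a sentence of L, though not necessarily a sensible one: the grammar captures syntax, not semantics.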
Another Example

Let G = (V, T, S, P) with V = {a, b, A, B, S}, T = {a, b}, start symbol S, and
P = {S → ABa, A → BB, B → ab, AB → b}.

One possible derivation in this grammar is:
S ⇒ ABa ⇒ Aaba ⇒ BBaba ⇒ Bababa ⇒ abababa.
Defining the PSG Types
Type 0: Phrase-structure grammars – no restrictions on the production rules.
Type 1: Context-Sensitive PSG:
– All after fragments are either at least as long as the corresponding before fragments, or empty:
if b → a, then |b| ≤ |a| ∨ a = λ.
Type 2: Context-Free PSG:
– All before fragments have length 1 and are nonterminals:
if b → a, then |b| = 1 (b ∈ N).
Type 3: Regular PSGs:
– All before fragments have length 1 and are nonterminals.
– All after fragments are either single terminals, or a pair of a terminal followed by a nonterminal:
if b → a, then a ∈ T ∨ a ∈ TN.
Types of Grammars -
Chomsky hierarchy of languages

Venn diagram of grammar types (nested, from outermost to innermost):

Type 0 – Phrase-structure Grammars
Type 1 – Context-Sensitive
Type 2 – Context-Free
Type 3 – Regular
Classifying grammars

Given a grammar, we need to be able to find the smallest class to which it belongs. This can be determined by answering three questions:
– Are the left-hand sides of all of the productions single non-terminals?
– If yes, does each of the productions create at most one non-terminal, and is it on the right?
• Yes – regular. No – context-free.
– If not, can any of the rules reduce the length of a string of terminals and non-terminals?
• Yes – unrestricted. No – context-sensitive.
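
The three questions can be turned into a short decision procedure. This is a sketch that follows the questions as worded here rather than any textbook's formal definitions; `classify` is a hypothetical name and symbols are assumed to be single characters.

```python
def classify(productions, nonterminals):
    """Return the smallest class selected by the three questions."""
    def count_nonterminals(s):
        return sum(1 for sym in s if sym in nonterminals)

    # Question 1: is every left-hand side a single nonterminal?
    if all(len(lhs) == 1 and lhs in nonterminals for lhs, _ in productions):
        # Question 2: at most one nonterminal on each right side, placed last?
        def right_linear(rhs):
            n = count_nonterminals(rhs)
            return n == 0 or (n == 1 and rhs[-1] in nonterminals)
        if all(right_linear(rhs) for _, rhs in productions):
            return "regular"
        return "context-free"
    # Question 3: can any rule shorten the string?
    if any(len(rhs) < len(lhs) for lhs, rhs in productions):
        return "unrestricted"
    return "context-sensitive"

print(classify([("S", "11S"), ("S", "0")], {"S"}))  # regular
print(classify([("S", "aSb"), ("S", "")], {"S"}))   # context-free
print(classify([("S", "ABa"), ("A", "BB"), ("B", "ab"), ("AB", "b")],
               {"S", "A", "B"}))                    # unrestricted
```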
Definition: Context-Free Grammars

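Grammar G = (V, T, S, P), where V is the vocabulary, T is the set of terminal symbols, S is the start variable, and P is the set of productions.

Productions of the form:
A → x
where A is a non-terminal and x is a string of variables and terminals.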
Derivation Tree of A Context-free Grammar

► Represents a derivation in the grammar using an ordered rooted tree.
► The root represents the start symbol.
► Internal vertices represent the nonterminal symbols that arise in the derivation.
► Leaves represent the terminal symbols.
► If the production A → w arises in the derivation, where w is a word, the vertex that represents A has as children vertices that represent each symbol in w, in order from left to right.
Language Generated by a
Grammar

Example: Let G = ({S, A, a, b}, {a, b}, S, {S → aA, S → b, A → aa}). What is L(G)?
Easy: We can just draw a tree of all possible derivations.
– We have: S ⇒ aA ⇒ aaa.
– and S ⇒ b.
Answer: L = {aaa, b}.
[Figure: a tree with root S and children aA and b, with aaa below aA. An example of a derivation tree, also called a parse tree or sentence diagram.]
Example: Derivation Tree

► Let G be a context-free grammar with the productions P = {S → aAB, A → Bba, B → bB, B → c}. The word w = acbabc can be derived from S as follows:
S ⇒ aAB ⇒ a(Bba)B ⇒ acbaB ⇒ acba(bB) ⇒ acbabc
Thus, the derivation tree is given as follows:

S
├─ a
├─ A
│  ├─ B
│  │  └─ c
│  ├─ b
│  └─ a
└─ B
   ├─ b
   └─ B
      └─ c
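
The tree for acbabc can be written down as nested data and checked by machine. The encoding below, a (symbol, children) pair per vertex, and the helper names are assumptions for illustration.

```python
def leaf(symbol):
    """A terminal vertex: a symbol with no children."""
    return (symbol, [])

# Derivation tree for acbabc with S -> aAB, A -> Bba, B -> bB, B -> c.
tree = ("S", [
    leaf("a"),
    ("A", [("B", [leaf("c")]), leaf("b"), leaf("a")]),
    ("B", [leaf("b"), ("B", [leaf("c")])]),
])

def tree_yield(node):
    """Read the leaves from left to right."""
    symbol, children = node
    if not children:
        return symbol
    return "".join(tree_yield(child) for child in children)

def uses_only(node, productions):
    """Check that every internal vertex and its children form a production."""
    symbol, children = node
    if not children:
        return True
    rhs = "".join(child[0] for child in children)
    return (symbol, rhs) in productions and all(
        uses_only(child, productions) for child in children)

P = {("S", "aAB"), ("A", "Bba"), ("B", "bB"), ("B", "c")}
print(tree_yield(tree))    # acbabc
print(uses_only(tree, P))  # True
```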
Backus-Naur Form

sentence → noun phrase verb phrase
noun phrase → article [adjective] noun
verb phrase → verb [adverb]
article → a | the
adjective → large | hungry
noun → rabbit | mathematician
verb → eats | hops
adverb → quickly | wildly

Square brackets [] mean “optional”. Vertical bars mean “alternatives”.
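
The shorthand can be expanded back into plain productions. In this sketch the rules are already tokenized, an optional item is marked as ("opt", name), and alternatives are separate list entries; that encoding and the function names are assumptions for this example.

```python
from itertools import product

BNF = {
    "sentence": [["noun phrase", "verb phrase"]],
    "noun phrase": [["article", ("opt", "adjective"), "noun"]],
    "verb phrase": [["verb", ("opt", "adverb")]],
    "article": [["a"], ["the"]],
    "adjective": [["large"], ["hungry"]],
    "noun": [["rabbit"], ["mathematician"]],
    "verb": [["eats"], ["hops"]],
    "adverb": [["quickly"], ["wildly"]],
}

def expand_alternative(items):
    """Turn one alternative with optional items into plain right-hand sides."""
    choices = []
    for item in items:
        if isinstance(item, tuple) and item[0] == "opt":
            choices.append([[item[1]], []])  # item present, or item absent
        else:
            choices.append([[item]])
    return [[sym for part in combo for sym in part]
            for combo in product(*choices)]

def to_productions(bnf):
    """List every (left side, right side) production the shorthand stands for."""
    return [(lhs, rhs)
            for lhs, alternatives in bnf.items()
            for alt in alternatives
            for rhs in expand_alternative(alt)]

for lhs, rhs in to_productions(BNF)[:3]:
    print(lhs, "->", " ".join(rhs))
print(len(to_productions(BNF)))  # 15
```

The count of 15 matches the fifteen productions written out one by one for this grammar earlier.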
Generating Infinite Languages

A simple PSG can easily generate an infinite language.
Example: S → 11S, S → 0 (T = {0,1}).
The derivations are:
– S ⇒ 0
– S ⇒ 11S ⇒ 110
– S ⇒ 11S ⇒ 1111S ⇒ 11110
– and so on…
L = {(11)*0} – the set of all strings consisting of some number of concatenations of 11 with itself, followed by 0.
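
Because this language is regular, a regular expression recognizes it directly. The sketch below pairs a generator that follows the two productions with a recognizer built on Python's `re` module; the function names are assumptions.

```python
import re

def derive(k):
    """Apply S -> 11S k times, then S -> 0."""
    form = "S"
    for _ in range(k):
        form = form.replace("S", "11S")
    return form.replace("S", "0")

PATTERN = re.compile(r"(11)*0")

def in_language(w):
    """True if w is some number of copies of 11 followed by a single 0."""
    return PATTERN.fullmatch(w) is not None

print([derive(k) for k in range(3)])          # ['0', '110', '11110']
print(in_language("110"), in_language("10"))  # True False
```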
Another example

Construct a PSG that generates the language L = {0ⁿ1ⁿ | n ∈ N}.
– 0 and 1 here represent symbols being concatenated n times, not integers being raised to the nth power.
Solution strategy: Each step of the derivation should preserve the invariant that the number of 0’s = the number of 1’s in the template so far, and all 0’s come before all 1’s.
Solution: S → 0S1, S → λ.
Context-Sensitive Languages

The language {aⁿbⁿcⁿ | n ≥ 1} is context-sensitive but not context-free.
A grammar for this language is given by:
S → aSBC | aBC
CB → BC
aB → ab
bB → bb
bC → bc
cC → cc
The left-hand sides aB, bB, bC, and cC each pair a terminal with a non-terminal.
A derivation from this grammar is:
S ⇒ aSBC
⇒ aaBCBC (using S → aBC)
⇒ aabCBC (using aB → ab)
⇒ aabBCC (using CB → BC)
⇒ aabbCC (using bB → bb)
⇒ aabbcC (using bC → bc)
⇒ aabbcc (using cC → cc)
which derives a²b²c².
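
The derivation generalizes to any n ≥ 1. The sketch below applies one production per step in a fixed order, which is an assumption of this example and differs slightly from the order used above: it first grows the string, then moves every B in front of every C, then converts nonterminals to terminals from left to right.

```python
def derive_anbncn(n):
    """Derive n a's, n b's, then n c's (n >= 1) using only the grammar's productions."""
    form = "S"
    for _ in range(n - 1):
        form = form.replace("S", "aSBC", 1)  # S -> aSBC
    form = form.replace("S", "aBC", 1)       # S -> aBC
    while "CB" in form:                      # CB -> BC until all B's precede all C's
        form = form.replace("CB", "BC", 1)
    for lhs, rhs in [("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]:
        while lhs in form:                   # rewrite the leftmost occurrence each time
            form = form.replace(lhs, rhs, 1)
    return form

print(derive_anbncn(2))  # aabbcc
print(derive_anbncn(3))  # aaabbbccc
```

The context matters here: B can only become b when an a or b sits immediately to its left, which is what forces the letters into the right order.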
