
MODULE 03

3. Properties of Regular Languages

3.1 Proving Languages not to be Regular
3.2 The Pumping Lemma for Regular Languages
3.3 Closure Properties of Regular Languages
 3.3.1 Closure of regular languages under Boolean operations – union
 3.3.2 Complementation
 3.3.3 Intersection and difference
 3.3.4 Equivalence and minimization of automata
3.4 Context-Free Grammars and Languages: definition of context-free grammars
3.5 Derivations using a grammar
3.6 Leftmost and rightmost derivations
3.7 The language of a grammar and sentential forms; exercise problems
3.8 Parse trees – constructing a parse tree, the yield of a parse tree, inference, derivations and parse trees
3.9 Applications of context-free grammars – markup languages
3.10 XML and document-type definitions
3.11 Ambiguity in grammars and languages – ambiguous grammars
3.1 Proving Languages not to be Regular

The Pumping Lemma is used for proving that a language is not regular. Here is the Pumping Lemma.

If L is a regular language, then there is an integer n > 0 with the property that:

(*) for any string x ∈ L where |x| ≥ n, there are strings u, v, w such that
(i) x = uvw,
(ii) v ≠ ε,
(iii) |uv| ≤ n,
(iv) uv^k w ∈ L for all k ∈ N.

To prove that a language L is not regular, we use proof by contradiction.

Here are the steps.


1. Suppose that L is regular.

2. Since L is regular, we apply the Pumping Lemma


and assert the existence of a number n > 0 that satisfies
the property (*).

3. Give a particular string x such that

(a) x ∈ L,
(b) |x| ≥ n.
This is the trickiest part. A wrong choice here will make step 4 impossible.

4. By the Pumping Lemma, there are strings u, v, w such that (i)-(iv) hold. Pick a particular number k ∈ N and argue that uv^k w ∉ L, thus yielding our desired contradiction.

What follows is an example proof using the Pumping Lemma.

3.2 The Pumping Lemma for Regular Languages


Theorem
Let L be a regular language. Then there exists a constant c such that for every string w in L with |w| ≥ c, we can break w into three strings, w = xyz, such that −

 |y| > 0
 |xy| ≤ c
 For all k ≥ 0, the string xy^k z is also in L.

Applications of Pumping Lemma


The Pumping Lemma is applied to show that certain languages are not regular. It should never be used to show that a language is regular.
 If L is regular, it satisfies the Pumping Lemma.
 If L does not satisfy the Pumping Lemma, it is non-regular.
(The converse does not hold: some non-regular languages also satisfy the Pumping Lemma, so satisfying it proves nothing.)

Method to prove that a language L is not regular


 At first, we have to assume that L is regular.
 So, the Pumping Lemma should hold for L, with some constant c.
 Use the Pumping Lemma to obtain a contradiction −
o Select w ∈ L such that |w| ≥ c.

o The lemma then splits w = xyz with |y| ≥ 1 and |xy| ≤ c; the remaining string is z.

o Select k such that the resulting string xy^k z is not in L. The argument must work for every split permitted by the lemma.

Hence L is not regular.


Problem
Prove that L = {a^i b^i | i ≥ 0} is not regular.

Solution −
 At first, we assume that L is regular and n is the pumping-lemma constant.
 Let w = a^n b^n. Thus |w| = 2n ≥ n.
 By the Pumping Lemma, w = xyz, where |xy| ≤ n and |y| > 0.
 Since |xy| ≤ n, both x and y lie within the leading a's. Let x = a^p, y = a^q, and z = a^r b^n, where p + q + r = n and q ≠ 0 (since |y| > 0).
 Let k = 2. Then xy^2 z = a^p a^(2q) a^r b^n.
 Number of a's = (p + 2q + r) = (p + q + r) + q = n + q.
 Hence, xy^2 z = a^(n+q) b^n. Since q ≠ 0, xy^2 z is not of the form a^n b^n.
 Thus, xy^2 z is not in L. Hence L is not regular.
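The argument above can be checked mechanically for a small n. The sketch below (illustrative code, not part of the proof) tries every split w = xyz of a^n b^n allowed by the lemma and confirms that pumping with k = 2 always leaves the language:

```python
# Brute-force illustration of the pumping-lemma argument for
# L = {a^i b^i | i >= 0}: for w = a^n b^n, EVERY split w = xyz with
# |xy| <= n and |y| > 0 fails to pump, i.e. x y^2 z leaves the language.

def in_L(s):
    """Membership test for L = {a^i b^i}."""
    i = len(s) // 2
    return s == "a" * i + "b" * i

def every_split_fails(n, k=2):
    w = "a" * n + "b" * n
    for xy_len in range(1, n + 1):          # |xy| <= n
        for y_len in range(1, xy_len + 1):  # |y| > 0
            x = w[:xy_len - y_len]
            y = w[xy_len - y_len:xy_len]
            z = w[xy_len:]
            if in_L(x + y * k + z):         # pumped string still in L?
                return False                # some split pumped successfully
    return True                             # no split survives pumping

print(every_split_fails(7))  # True: no valid split of a^7 b^7 can be pumped
```

Because |xy| ≤ n forces y to consist only of a's, every pumped string has more a's than b's, which is exactly what the proof argues.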
3.3 Properties of Regular Languages
So far we have seen different ways of specifying a regular language: DFA, NFA, ε-NFA, regular expressions and regular grammars. We noted that all these formalisms are equal in power by showing the equivalences. Regular expressions and grammars are considered generators of a regular language, while the machines (DFA, NFA, ε-NFA) are considered acceptors of the language.
Now we will look at the properties of regular languages. The properties can be broadly classified into two parts: (A) closure properties and (B) decision properties.

(A) Closure Properties


1. Complementation
If a language L is regular, its complement L' is regular.
Let DFA(L) denote the DFA for the language L. Modify the DFA as follows to obtain DFA(L').

1. Change the final states to non-final states.

2. Change the non-final states to final states.

Since there now exists a DFA(L'), L' is regular.
This can be shown by an example using a DFA. Let L denote the language containing strings that begin and end with a, over Σ = {a, b}. The DFA for L is given below (figure omitted); for instance, it accepts the string aa via q0 → q1 → q2.

Note: q3 denotes the dead state. Once you enter q3, you remain in it forever.

L' denotes the language that does not contain strings that begin and end with a. Of the four possibilities for a non-empty string over {a, b} −
 begins with a and ends with a (this is L)
 begins with a and ends with b
 begins with b and ends with a
 begins with b and ends with b
L' contains the last three, together with ε.
The DFA for L' is obtained by flipping the final states of DFA(L) to non-final states and vice-versa.
The DFA for L' is given below.


In DFA(L') (figure omitted):

 q0 ensures ε is accepted.

 q1 ensures all strings that begin with a and end with b are accepted.

 q3 ensures all strings that begin with b (ending with either a or b) are accepted.

Important Note: While specifying the DFA for L, we have also included the dead state q3. It is important to include the dead state(s) if we are going to derive the complement DFA, since the dead state(s) too would become final in the complementation. If we didn't add the dead state(s) originally, the complement would not accept all strings it is supposed to accept.
In the above example, if we didn't include q3 originally, the complement would not accept strings starting with b. It would only accept strings that begin with a and end with b, which is only a subset of the complement.
CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER COMPLEMENTATION.
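The flip-the-finals construction is easy to sketch in code. The transition table below is an assumed encoding of a DFA for "begins and ends with a" (including the dead state q3); it need not match the omitted figure exactly:

```python
# Complementing a DFA by flipping final and non-final states.
# q1: begins with a, currently ends with a (final in L)
# q2: begins with a, currently ends with b
# q3: dead state for strings that begin with b

def run_dfa(dfa, s):
    delta, start, finals = dfa
    state = start
    for ch in s:
        state = delta[(state, ch)]
    return state in finals

delta = {
    ("q0", "a"): "q1", ("q0", "b"): "q3",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q1", ("q2", "b"): "q2",
    ("q3", "a"): "q3", ("q3", "b"): "q3",
}
L  = (delta, "q0", {"q1"})              # begins and ends with a
Lc = (delta, "q0", {"q0", "q2", "q3"})  # complement: finals flipped

print(run_dfa(L, "aba"))   # True  - begins and ends with a
print(run_dfa(Lc, "aba"))  # False - its complement rejects it
print(run_dfa(Lc, "ba"))   # True  - begins with b, lands in q3, now final
```

Note how the dead state q3 becomes a final state in the complement, exactly as the note above warns.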

2. Union

If L1 and L2 are regular, then L1 ∪ L2 is regular.


This is most easily proved using regular expressions. If L1 is regular, there exists a regular expression R1 that describes it. Similarly, if L2 is regular, there exists a regular expression R2 that describes it. R1 + R2 then denotes a regular expression that describes L1 ∪ L2. Therefore, L1 ∪ L2 is regular.
This again can be shown using an example. If L1 is the language of strings that begin with a and L2 is the language of strings that end with a, then L1 ∪ L2 denotes the language of strings that either begin with a or end with a.

- a(a+b)* is the regular expression that denotes L1.


- (a+b)*a is the regular expression that denotes L2.
- L1 ∪ L2 is denoted by the regular expression
a(a+b)* + (a+b)*a. Therefore, L1 ∪ L2 is regular.
In terms of DFA, we can say that a DFA(L1 ∪ L2) accepts those strings that are accepted by either
DFA(L1) or DFA(L2) or both.

 DFA(L1 ∪ L2) can be constructed by adding a new start state and a new final state.
 The new start state connects to the two start states of DFA(L1) and DFA(L2) by ε-transitions.
 Similarly, two ε-transitions are added from the final states of DFA(L1) and DFA(L2) to the new final state.
 Convert the resulting ε-NFA to its equivalent DFA.

As an exercise you can try this approach of DFA construction for union for the given example.
CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER UNION.
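Instead of building the ε-NFA explicitly, a quick way to test union membership is to run both DFAs side by side (the product idea) and accept when either accepts. The two DFAs below are illustrative encodings of a(a+b)* and (a+b)*a:

```python
# Union of two regular languages: run both DFAs in lockstep
# and accept if EITHER component accepts.

def run_dfa(dfa, s):
    delta, start, finals = dfa
    state = start
    for ch in s:
        state = delta[(state, ch)]
    return state in finals

# L1: strings over {a,b} that begin with a   (regex a(a+b)*)
L1 = ({("s", "a"): "yes", ("s", "b"): "no",
      ("yes", "a"): "yes", ("yes", "b"): "yes",
      ("no", "a"): "no", ("no", "b"): "no"}, "s", {"yes"})

# L2: strings over {a,b} that end with a     (regex (a+b)*a)
L2 = ({("0", "a"): "1", ("0", "b"): "0",
      ("1", "a"): "1", ("1", "b"): "0"}, "0", {"1"})

def union_accepts(d1, d2, s):
    return run_dfa(d1, s) or run_dfa(d2, s)

print(union_accepts(L1, L2, "ab"))  # True: begins with a
print(union_accepts(L1, L2, "ba"))  # True: ends with a
print(union_accepts(L1, L2, "bb"))  # False: neither
```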

3. Intersection
If L1 and L2 are regular, then L1 ∩ L2 is regular.
Since regular languages are closed under union and complementation (shown above), De Morgan's law can be applied to show that regular languages are closed under intersection too.

L1 and L2 are regular ⇒ L1' and L2' are regular (by the complementation property)
          ⇒ L1' ∪ L2' is regular (by the union property)
          ⇒ (L1' ∪ L2')' = L1 ∩ L2 is regular (by De Morgan's law and complementation)
In terms of DFA, we can say that a DFA(L1 ∩ L2) accepts those strings that are accepted by both
DFA(L1) and DFA(L2).
CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER INTERSECTION.
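The same parallel-simulation sketch works for intersection: run both DFAs in lockstep and accept only when both accept. The DFA encodings below are illustrative ("begins with a" and "ends with a", so the intersection is "begins and ends with a"):

```python
# Intersection of two regular languages: run both DFAs in lockstep
# and accept only if BOTH components accept.

def run_dfa(dfa, s):
    delta, start, finals = dfa
    state = start
    for ch in s:
        state = delta[(state, ch)]
    return state in finals

L1 = ({("s", "a"): "yes", ("s", "b"): "no",       # begins with a
      ("yes", "a"): "yes", ("yes", "b"): "yes",
      ("no", "a"): "no", ("no", "b"): "no"}, "s", {"yes"})
L2 = ({("0", "a"): "1", ("0", "b"): "0",          # ends with a
      ("1", "a"): "1", ("1", "b"): "0"}, "0", {"1"})

def intersection_accepts(d1, d2, s):
    return run_dfa(d1, s) and run_dfa(d2, s)

print(intersection_accepts(L1, L2, "aba"))  # True: begins AND ends with a
print(intersection_accepts(L1, L2, "ab"))   # False: does not end with a
```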

4. Concatenation
If L1 and L2 are regular, then L1 . L2 is regular.

For example, if L1 = a* and L2 = b*, then L1 . L2 = a*b* (note that a*b* is not the same language as (ab)*).
This can be easily proved by regular expressions. If R1 is a regular expression denoting L1 and R2 is a regular expression denoting L2, then R1 . R2 is a regular expression denoting L1 . L2. Therefore, L1 . L2 is regular.
In terms of automata, an NFA for L1 . L2 can be constructed by adding an ε-transition from each final state of DFA(L1) - which then ceases to be a final state - to the start state of DFA(L2).
You can try showing this using an example.
CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER CONCATENATION.
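Python's re module illustrates the regular-expression side of this closure; the pattern below is an assumed encoding of the a*b* example:

```python
import re

# Concatenation of regular expressions: if R1 denotes L1 and R2 denotes L2,
# the pattern R1R2 denotes L1 . L2.  Here L1 = a*, L2 = b*.
concat = re.compile(r"a*b*\Z")  # \Z anchors the match at the end of the string

print(bool(concat.match("aaabb")))  # True:  in a* . b*
print(bool(concat.match("ab")))     # True
print(bool(concat.match("ba")))     # False: a's must come before b's
```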

5. Kleene star
If L is regular, then L* is regular.
This can be easily proved by regular expressions.
If L is regular, then there exists a regular expression R denoting it.
We know that if R is a regular expression, R* is a regular expression too. R* denotes the language L*. Therefore L* is regular.
In terms of automata, we add ε-transitions from each final state of DFA(L) back to its start state, and add a new start state that is also final (so that ε is accepted) with an ε-transition to the old start state. The resulting ε-NFA accepts L*. You can try showing this for an example.
CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER KLEENE STAR.

6. Difference
If L1 and L2 are regular, then L1 - L2 is regular.
We know that L1 - L2 = L1 ∩ L2'.

L1 and L2 are regular ⇒ L1 and L2' are regular (by the complementation property)
          ⇒ L1 ∩ L2' is regular (by the intersection property)
          ⇒ L1 - L2 is regular (by the identity above)
In terms of DFA, we can say that a DFA(L1 - L2) accepts those strings that are accepted by DFA(L1) but not accepted by DFA(L2). You can try showing this for an example.
CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER DIFFERENCE.

7. Reverse
If L is regular, then L^R (the reversal of L) is regular.
Let DFA(L) denote the DFA of L. Make the following modifications to construct an automaton for L^R.

1. Make the final state of DFA(L) the start state.

In case there is more than one final state in DFA(L), first add a new state, add ε-transitions from the old final states (which cease to be final) to it, and use this new state as the start state.

2. Make the old start state of DFA(L) the only final state.

3. Reverse the direction of all the arrows.

The result is in general an NFA, which can be converted to an equivalent DFA. You can try showing this using an example.
You can try showing this using an example.


CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER REVERSAL.

Difference Between Union & Concatenation

The Kleene star of a concatenation gives

(ab)* = {ε, ab, abab, ababab, ...}

while the Kleene star of a union gives

(a+b)* = (a|b)* = (a ∪ b)* = {ε, a, b, aa, ab, ba, bb, …}

so the latter contains every string over {a, b}, while the former contains only repetitions of the block ab.
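A short enumeration (illustrative code) makes the difference concrete:

```python
from itertools import product

# Contrast (ab)* with (a+b)* by listing their short members.
ab_star = {"ab" * i for i in range(3)}              # (ab)*: eps, ab, abab
union_star = {"".join(p) for n in range(3)          # (a+b)*: ALL strings
              for p in product("ab", repeat=n)}     # over {a,b}, length <= 2

print(sorted(ab_star))     # ['', 'ab', 'abab']
print(sorted(union_star))  # ['', 'a', 'aa', 'ab', 'b', 'ba', 'bb']
```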
3.4 Introduction to Grammars and Languages:
In the literary sense of the term, grammars denote syntactical rules for conversation in natural languages. Linguists have attempted to define grammars since the inception of natural languages like English, Sanskrit, Mandarin, etc.

The theory of formal languages finds its applicability extensively in the field of Computer Science. Noam Chomsky gave a mathematical model of grammar in 1956 which is effective for writing computer languages.

Grammar
A grammar G can be formally written as a 4-tuple (N, T, S, P) where −
 N (also written V_N) is a set of variables or non-terminal symbols.
 T (also written ∑) is a set of terminal symbols.
 S is a special variable called the start symbol, S ∈ N.
 P is a set of production rules for terminals and non-terminals. A production rule has the form α → β, where α and β are strings over V_N ∪ ∑ and at least one symbol of α belongs to V_N.


Example
Grammar G1 −
({S, A, B}, {a, b}, S, {S → AB, A → a, B → b})
Here,
 S, A, and B are Non-terminal symbols;
 a and b are Terminal symbols
 S is the Start symbol, S ∈ N
 Productions, P : S → AB, A → a, B → b

Example
Grammar G2 −
({S, A}, {a, b}, S, {S → aAb, aA → aaAb, A → ε})
Here,
 S and A are Non-terminal symbols.
 a and b are Terminal symbols.
 ε is an empty string.
 S is the Start symbol, S ∈ N
 Production P : S → aAb, aA → aaAb, A → ε

Definition of context-free grammars,


A context-free grammar is a formal grammar which is used to generate all the strings in a given formal language.

A context-free grammar G can be defined by a 4-tuple:

G = (V, T, P, S)

where,

V describes a finite set of non-terminal symbols,

T describes a finite set of terminal symbols,

P describes a set of production rules,

S ∈ V is the start symbol.

In a CFG, the start symbol is used to derive a string. You can derive a string by repeatedly replacing a non-terminal by the right-hand side of a production, until all non-terminals have been replaced by terminal symbols.

Example:

L = {wcw^R | w ∈ (a, b)*}

Production rules:

1. S → aSa
2. S → bSb
3. S → c

Now check that the string abbcbba can be derived from the given CFG:

S ⇒ aSa
 ⇒ abSba
 ⇒ abbSbba
 ⇒ abbcbba

By applying the productions S → aSa and S → bSb recursively and finally applying the production S → c, we get the string abbcbba.
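The derivation can be mirrored by a recursive membership checker (a sketch; the function name is made up) that peels matching outer symbols exactly as the productions S → aSa and S → bSb do:

```python
# Recursive membership test for L = {w c w^R | w in {a,b}*}, mirroring
# S -> aSa | bSb | c: peel matching outer symbols, bottom out at "c".

def in_wcwr(s):
    if s == "c":                          # production S -> c
        return True
    if len(s) >= 3 and s[0] == s[-1] and s[0] in "ab":
        return in_wcwr(s[1:-1])           # production S -> aSa or S -> bSb
    return False

print(in_wcwr("abbcbba"))  # True: S => aSa => abSba => abbSbba => abbcbba
print(in_wcwr("abcab"))    # False: outer symbols do not match
```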

3.5 Derivations using a Grammar

Strings may be derived from other strings using the productions in a grammar. If a grammar G has a production α → β, we can say that xαy derives xβy in G. This derivation is written as −
xαy ⇒G xβy

Example
Let us consider the grammar −
G2 = ({S, A}, {a, b}, S, {S → aAb, aA → aaAb, A → ε } )
Some of the strings that can be derived are −
S ⇒ aAb using production S → aAb
⇒ aaAbb using production aA → aaAb
⇒ aaaAbbb using production aA → aaAb
⇒ aaabbb using production A → ε
The set of all strings that can be derived from a grammar is said to be the language
generated from that grammar. A language generated by a grammar G is a subset
formally defined by
L(G) = {w | w ∈ ∑*, S ⇒G* w}
If L(G1) = L(G2), the Grammar G1 is equivalent to the Grammar G2.

Example
If there is a grammar
G: N = {S, A, B} T = {a, b} P = {S → AB, A → a, B → b}
Here S produces AB, and we can replace A by a and B by b. The only accepted string is ab, i.e.,
L(G) = {ab}

Example
Suppose we have the following grammar −
G: N = {S, A, B} T = {a, b} P = {S → AB, A → aA|a, B → bB|b}
The language generated by this grammar −
L(G) = {ab, a^2 b, ab^2, a^2 b^2, ………}
= {a^m b^n | m ≥ 1 and n ≥ 1}

Construction of a Grammar Generating a Language


We'll consider some languages and construct grammars G which produce those languages.

Example
Problem − Suppose L(G) = {a^m b^n | m ≥ 0 and n > 0}. We have to find out the grammar G which produces L(G).

Solution
Since L(G) = {a^m b^n | m ≥ 0 and n > 0},
the set of strings accepted can be rewritten as −


L(G) = {b, ab,bb, aab, abb, …….}
Here, the start symbol has to take at least one ‘b’ preceded by any number of ‘a’
including null.
To accept the string set {b, ab, bb, aab, abb, …….}, we have taken the productions −
S → aS , S → B, B → b and B → bB
S → B → b (Accepted)
S → B → bB → bb (Accepted)
S → aS → aB → ab (Accepted)
S → aS → aaS → aaB → aab(Accepted)
S → aS → aB → abB → abb (Accepted)
Thus, we can prove every single string in L(G) is accepted by the language generated
by the production set.
Hence the grammar −
G: ({S, B}, {a, b}, S, {S → aS | B, B → b | bB})
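The grammar can be checked by brute force: enumerate every string it derives up to a length bound via breadth-first search over sentential forms (illustrative code; RULES encodes S → aS | B, B → b | bB):

```python
from collections import deque

# Breadth-first enumeration of strings derivable from S -> aS | B, B -> b | bB.
# Non-terminals are uppercase, terminals lowercase.

RULES = {"S": ["aS", "B"], "B": ["b", "bB"]}

def generate(max_len=5):
    seen, out = set(), set()
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        if all(c.islower() for c in form):   # no non-terminals left: a string
            out.add(form)
            continue
        i = next(i for i, c in enumerate(form) if c.isupper())  # leftmost NT
        for rhs in RULES[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            # bound the search by the number of terminals already produced
            if len(new.replace("S", "").replace("B", "")) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return out

print(sorted(generate(3), key=lambda s: (len(s), s)))
# ['b', 'ab', 'bb', 'aab', 'abb', 'bbb']
```

Every generated string has the shape a^m b^n with m ≥ 0 and n > 0, and every such string up to the bound appears, as claimed.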

Example
Problem − Suppose L(G) = {a^m b^n | m > 0 and n ≥ 0}. We have to find out the grammar G which produces L(G).

Solution −
Since L(G) = {a^m b^n | m > 0 and n ≥ 0}, the set of strings accepted can be rewritten as −
L(G) = {a, aa, ab, aaa, aab, abb, …….}


Here, the start symbol has to take at least one ‘a’ followed by any number of ‘b’
including null.
To accept the string set {a, aa, ab, aaa, aab, abb, …….}, we have taken the
productions –

S → aA, A → aA , A → B, B → bB ,B → λ

S → aA → aB → aλ → a (Accepted)

S → aA → aaA → aaB → aaλ → aa (Accepted)

S → aA → aB → abB → abλ → ab (Accepted)

S → aA → aaA → aaaA → aaaB → aaaλ → aaa (Accepted)

S → aA → aaA → aaB → aabB → aabλ → aab (Accepted)

S → aA → aB → abB → abbB → abbλ → abb (Accepted)

Thus, we can prove every single string in L(G) is accepted by the language generated
by the production set.
Hence the grammar −
G: ({S, A, B}, {a, b}, S, {S → aA, A → aA | B, B → λ | bB })

Chomsky Classification of Grammars

According to Noam Chomsky, there are four types of grammars − Type 0, Type 1, Type 2, and Type 3. The following table shows how they differ from each other −

Grammar Type | Grammar Accepted          | Language Accepted               | Automaton
Type 0       | Unrestricted grammar      | Recursively enumerable language | Turing machine
Type 1       | Context-sensitive grammar | Context-sensitive language      | Linear-bounded automaton
Type 2       | Context-free grammar      | Context-free language           | Pushdown automaton
Type 3       | Regular grammar           | Regular language                | Finite state automaton

The scope of each type of grammar nests as Type 3 ⊆ Type 2 ⊆ Type 1 ⊆ Type 0 (illustration omitted).

Type - 3 Grammar
Type-3 grammars generate regular languages. Type-3 grammars must have a single
non-terminal on the left-hand side and a right-hand side consisting of a single terminal
or single terminal followed by a single non-terminal.
The productions must be in the form X → a or X → aY
where X, Y ∈ N (Non terminal)
and a ∈ T (Terminal)
The rule S → ε is allowed if S does not appear on the right side of any rule.

Example

X → ε
X → a | aY
Y → b

Type - 2 Grammar
Type-2 grammars generate context-free languages.
The productions must be in the form A → γ
where A ∈ N (Non terminal)
and γ ∈ (T ∪ N)* (String of terminals and non-terminals).
The languages generated by these grammars are recognized by a non-deterministic pushdown automaton.

Example

S → X a
X → a
X → aX
X → abc
X → ε
Type - 1 Grammar
Type-1 grammars generate context-sensitive languages. The productions must be in
the form
αAβ→αγβ
where A ∈ N (Non-terminal)
and α, β, γ ∈ (T ∪ N)* (Strings of terminals and non-terminals)
The strings α and β may be empty, but γ must be non-empty.
The rule S → ε is allowed if S does not appear on the right side of any rule. The
languages generated by these grammars are recognized by a linear bounded
automaton.

Example

AB → AbBc
A → bcA
B → b
Type - 0 Grammar
Type-0 grammars generate recursively enumerable languages. The productions have no restrictions: they are arbitrary phrase-structure grammars, including all formal grammars.
They generate the languages that are recognized by a Turing machine.
The productions can be of the form α → β, where α is a string of terminals and non-terminals with at least one non-terminal (so α cannot be null) and β is a string of terminals and non-terminals.

Example
S → ACaB
Bc → acB
CB → DB
aD → Db

3.6 Leftmost and Rightmost Derivations,
3.7 The Language of a Grammar and Sentential Forms, Exercise Problems,
3.8 Parse Trees – constructing a parse tree, the yield of a parse tree, inference, derivations and parse trees

Context-Free Grammar Introduction

Definition − A context-free grammar (CFG) consisting of a finite set of grammar rules


is a quadruple (N, T, P, S) where
 N is a set of non-terminal symbols.
 T is a set of terminals where N ∩ T = ∅.
 P is a set of rules, P: N → (N ∪ T)*, i.e., the left-hand side of a production rule does not have any right context or left context.
 S is the start symbol.

Example

 The grammar ({A}, {a, b, c}, P, A), P : A → aA, A → abc.
 The grammar ({S}, {a, b}, P, S), P: S → aSa, S → bSb, S → ε
 The grammar ({S, F}, {0, 1}, P, S), P: S → 00S | 11F, F → 00F | ε

Generation of Derivation Tree


A derivation tree or parse tree is an ordered rooted tree that graphically represents how a string is derived from a context-free grammar.

Representation Technique
 Root vertex − Must be labeled by the start symbol.
 Vertex − Labeled by a non-terminal symbol.
 Leaves − Labeled by a terminal symbol or ε.
If S → x1 x2 …… xn is a production rule in a CFG, then in the parse tree the root S has children x1, x2, ……, xn in order (figure omitted).
There are two different approaches to draw a derivation tree −
There are two different approaches to draw a derivation tree −
Top-down Approach −
 Starts with the starting symbol S
 Goes down to tree leaves using productions

Bottom-up Approach −
 Starts from tree leaves
 Proceeds upward to the root which is the starting symbol S

Derivation or Yield of a Tree


The derivation or the yield of a parse tree is the final string obtained by concatenating the labels of the leaves of the tree from left to right, ignoring the ε-labels. However, if all the leaves are ε, the yield is ε.
Example
Let a CFG {N,T,P,S} be
N = {S}, T = {a, b}, Starting symbol = S, P = S → SS | aSb | ε
One derivation from the above CFG is “abaabb”
S → SS → aSbS → abS → abaSb → abaaSbb → abaabb
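The yield computation can be sketched on trees encoded as nested tuples (an assumed representation; parse trees are usually drawn, not typed). The tree below is one parse of "abaabb" under S → SS | aSb | ε:

```python
# Computing the yield of a parse tree represented as nested tuples
# (label, children...).  Leaves are terminal strings, with "" for epsilon.

def tree_yield(node):
    if isinstance(node, str):      # leaf: a terminal or epsilon ("")
        return node
    label, *children = node
    return "".join(tree_yield(c) for c in children)  # leaves, left to right

# S -> S S; first S -> a S b with inner S -> eps; second S -> a S b,
# whose inner S -> a S b with S -> eps.
tree = ("S",
        ("S", "a", ("S", ""), "b"),
        ("S", "a", ("S", "a", ("S", ""), "b"), "b"))

print(tree_yield(tree))  # abaabb
```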
Sentential Form and Partial Derivation Tree
A partial derivation tree is a sub-tree of a derivation tree/parse tree such that either all
of its children are in the sub-tree or none of them are in the sub-tree.
Example
If in any CFG the productions are −
S → AB, A → aaA | ε, B → Bb | ε
the partial derivation tree can be the following − (figure omitted)
If a partial derivation tree contains the root S, its yield is called a sentential form. The yield of the above sub-tree is also a sentential form.

Leftmost and Rightmost Derivation of a String


 Leftmost derivation − A leftmost derivation is obtained by applying production
to the leftmost variable in each step.
 Rightmost derivation − A rightmost derivation is obtained by applying
production to the rightmost variable in each step.

Example
Let the production rules in a CFG be
X → X+X | X*X | X | a
over the alphabet {a, +, *}.
The leftmost derivation for the string "a+a*a" may be −
X → X+X → a+X → a+X*X → a+a*X → a+a*a
The stepwise derivation of the above string is shown as a tree (figure omitted).
The rightmost derivation for the above string "a+a*a" may be −
X → X*X → X*a → X+X*a → X+a*a → a+a*a
The stepwise derivation of the above string is shown as a tree (figure omitted).
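Both derivations can be replayed mechanically (a sketch; step is a made-up helper) by always rewriting the leftmost or rightmost occurrence of X:

```python
# Replaying leftmost and rightmost derivations of "a+a*a" for the
# grammar X -> X+X | X*X | a.  Each step replaces the leftmost (or
# rightmost) occurrence of X with a chosen right-hand side.

def step(form, rhs, leftmost=True):
    i = form.find("X") if leftmost else form.rfind("X")
    return form[:i] + rhs + form[i + 1:]

# Leftmost: X => X+X => a+X => a+X*X => a+a*X => a+a*a
form = "X"
for rhs in ["X+X", "a", "X*X", "a", "a"]:
    form = step(form, rhs, leftmost=True)
print(form)  # a+a*a

# Rightmost: X => X*X => X*a => X+X*a => X+a*a => a+a*a
form = "X"
for rhs in ["X*X", "a", "X+X", "a", "a"]:
    form = step(form, rhs, leftmost=False)
print(form)  # a+a*a
```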
Left and Right Recursive Grammars
In a context-free grammar G, if there is a production in the form X → Xa where X is a
non-terminal and ‘a’ is a string of terminals, it is called a left recursive production.
The grammar having a left recursive production is called a left recursive grammar.
If in a context-free grammar G there is a production of the form X → aX, where X is a non-terminal and ‘a’ is a string of terminals, it is called a right recursive production. A grammar having a right recursive production is called a right recursive grammar.
If a context free grammar G has more than one derivation tree for some string w ∈
L(G), it is called an ambiguous grammar. There exist multiple right-most or left-most
derivations for some string generated from that grammar.
Problem
Check whether the grammar G with production rules −
X → X+X | X*X |X| a
is ambiguous or not.

Solution
Let’s find out the derivation tree for the string "a+a*a". It has two leftmost derivations.
Derivation 1 − X → X+X → a+X → a+X*X → a+a*X → a+a*a
Parse tree 1 − (figure omitted)

Derivation 2 − X → X*X → X+X*X → a+X*X → a+a*X → a+a*a
Parse tree 2 − (figure omitted)
Since there are two parse trees for a single string "a+a*a", the grammar G is ambiguous.

3.9 Applications of context-free grammars – markup languages

(From notes on applications of context-free grammars by Gur Saran Adhar.) Grammars are used to describe programming languages. Most importantly, there is a mechanical way of turning the description as a context-free grammar (CFG) into a parser, the component of the compiler that discovers the structure of the source program and represents that structure as a tree. For example, the Document Type Definition (DTD) feature of XML (Extensible Markup Language) is essentially a context-free grammar that describes the allowable tags and the ways in which these tags may be nested. For example, one could describe a sequence of characters that is intended to be interpreted as a phone number.

Example-1: Typical programming languages use parentheses and or brackets in a


nested and balanced fashion. That is, we must be able to match some left parenthesis
against a right parenthesis that appears immediately to its right, remove both of them
and repeat. If we eventually eliminate all the parenthesis, then the string was balanced.

Examples of strings with balanced parentheses are (()), ()(), (()()), while )( and (() are not balanced. A grammar with the following productions generates all and only the strings with balanced parentheses:

B → BB | (B) | λ
The first production, B → BB, says that concatenation of two strings of balanced
parenthesis is balanced. That is, we can match the parenthesis in two strings
independently. The second production, B → (B), says that if we place a pair of
parenthesis around a balanced string, then the result is balanced. The third production,
B → λ is the basis, which says that an empty string is balanced.
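Membership in this language is equivalent to a simple counter check (a sketch, not the grammar itself): scan left to right, count depth, and require that the count never dips below zero and ends at zero.

```python
# Counter-based balance check, equivalent to membership in the language
# generated by B -> BB | (B) | lambda.

def balanced(s):
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1  # '(' opens, ')' closes
        if depth < 0:                    # a ')' with no matching '('
            return False
    return depth == 0                    # everything opened was closed

for s in ["(())", "()()", "(()())", ")(", "(()"]:
    print(s, balanced(s))
# (()) True, ()() True, (()()) True, )( False, (() False
```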

Example-2:

There are numerous aspects of typical programming language that behave like
balanced parentheses. Beginning and ending of code blocks, such as begin and end in
Pascal, or the curly braces { . . . } of C, are examples.

There is a related pattern that appears occasionally, where "parentheses" can be balanced, with the exception that there can be unbalanced left parentheses.

An example is the treatment of if and else in C. An if-clause can appear unmatched by any else-clause, or it may be balanced by a matching else-clause.

A grammar that generates the possible sequences of if and else (represented by i and e, respectively) is:

S → SS | iS | iSe | λ

For instance, ieie, iie, and iei are possible sequences of if's and else's, and each of these strings is generated by the above grammar. Some examples of illegal sequences not generated by the grammar are ei, ieeii, and iee.

Example-3:

We give below a CFG that describes some parts of the structure of HTML (Hypertext Markup Language):

Char → a | A | . . .
Text → λ | Char Text
Doc → λ | Element Doc
Element → Text | <EM> Doc </EM> | <P> Doc | <OL> List </OL>
List → λ | ListItem List
ListItem → <LI> Doc

Example-4:

Let G be a grammar with the set of variables:
V = {S, <Noun phrase>, <Verb phrase>, <Adjective phrase>, <Noun>, <Verb>, <Adjective>}
the alphabet set:
Σ = {big, stout, John, bought, white, car, Jim, cheese, ate, green}
and the rules:

(1) S → <Noun phrase> <Verb phrase>
(2) <Noun phrase> → <Noun> | <Adjective phrase> <Noun> | λ
(3) <Verb phrase> → <Verb> <Noun phrase>
(4) <Adjective phrase> → <Adjective phrase> <Adjective> | λ
(5) <Noun> → John | car | Jim | cheese
(6) <Verb> → bought | ate
(7) <Adjective> → big | stout | white | green

Then the grammar generates, in particular, the following strings:
John bought car
Jim ate cheese
big Jim ate green cheese
John bought big car
big stout John bought big white car
Unfortunately, the grammar also generates sentences like:
big stout car bought big stout car
big cheese ate Jim
green Jim ate green big Jim.

3.10 XML and document-type definitions

Document Type Definition – DTD

A Document Type Definition (DTD) describes the tree structure of a document and something about its data. It is a set of markup declarations that define a type of document for the SGML family of languages (GML, SGML, HTML, XML).
A DTD can be declared inside an XML document (inline) or as an external reference. A DTD determines how many times a node may appear and how its child nodes are ordered.

There are two data types, PCDATA and CDATA:

 PCDATA is parsed character data.
 CDATA is character data, not usually parsed.

Syntax:
<!DOCTYPE element DTD identifier
[
first declaration
second declaration
.
.
nth declaration
]>
Example:

The DTD below describes an address document (tree figure omitted).

XML Document with an Internal DTD:

<?xml version="1.0"?>

<!DOCTYPE address [

<!ELEMENT address (name, email, phone, birthday)>

<!ELEMENT name (first, last)>

<!ELEMENT first (#PCDATA)>

<!ELEMENT last (#PCDATA)>

<!ELEMENT email (#PCDATA)>

<!ELEMENT phone (#PCDATA)>

<!ELEMENT birthday (year, month, day)>

<!ELEMENT year (#PCDATA)>

<!ELEMENT month (#PCDATA)>


<!ELEMENT day (#PCDATA)>

]>

<address>

<name>

<first>Rohit</first>

<last>Sharma</last>

</name>

<email>[email protected]</email>

<phone>9876543210</phone>

<birthday>

<year>1987</year>

<month>June</month>

<day>23</day>

</birthday>

</address>

The DTD above is interpreted like this:

 !DOCTYPE address defines that the root element of this document is


address.

 !ELEMENT address defines that the address element must contain four
elements: “name, email, phone, birthday”.
 !ELEMENT name defines that the name element must contain two elements:
“first, last”.

 !ELEMENT first defines the first element to be of type “#PCDATA”.


 !ELEMENT last defines the last element to be of type “#PCDATA”.

 !ELEMENT email defines the email element to be of type “#PCDATA”.

 !ELEMENT phone defines the phone element to be of type “#PCDATA”.

 !ELEMENT birthday defines that the birthday element must contain three
elements “year, month, day”.
 !ELEMENT year defines the year element to be of type “#PCDATA”.
 !ELEMENT month defines the month element to be of type
“#PCDATA”.
 !ELEMENT day defines the day element to be of type “#PCDATA”.

XML document with an external DTD:


<?xml version="1.0"?>

<!DOCTYPE address SYSTEM "address.dtd">

<address>

<name>

<first>Rohit</first>

<last>Sharma</last>

</name>

<email>[email protected]</email>

<phone>9876543210</phone>

<birthday>
<year>1987</year>

<month>June</month>

<day>23</day>

</birthday>

</address>

address.dtd:

 <!ELEMENT address (name, email, phone, birthday)>

 <!ELEMENT name (first, last)>


 <!ELEMENT first (#PCDATA)>
 <!ELEMENT last (#PCDATA)>

 <!ELEMENT email (#PCDATA)>

 <!ELEMENT phone (#PCDATA)>

 <!ELEMENT birthday (year, month, day)>


 <!ELEMENT year (#PCDATA)>
 <!ELEMENT month (#PCDATA)>
 <!ELEMENT day (#PCDATA)>
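The standard library's xml.etree.ElementTree does not validate against a DTD, but it can at least confirm that a document's element structure matches the content models declared above (a sketch with a placeholder email address; full DTD validation needs an external validator):

```python
import xml.etree.ElementTree as ET

# Parse an address document and check its structure against the DTD's
# content models: address(name,email,phone,birthday), name(first,last),
# birthday(year,month,day).

doc = """<?xml version="1.0"?>
<address>
  <name><first>Rohit</first><last>Sharma</last></name>
  <email>rohit@example.com</email>
  <phone>9876543210</phone>
  <birthday><year>1987</year><month>June</month><day>23</day></birthday>
</address>"""

root = ET.fromstring(doc)
print(root.tag)                                    # address
print([child.tag for child in root])               # ['name', 'email', 'phone', 'birthday']
print([child.tag for child in root.find("name")])  # ['first', 'last']
```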
Output: (browser rendering of the document omitted)

3.11 Ambiguity in grammars and languages – ambiguous
grammars.
Ambiguity in Grammar
A grammar is said to be ambiguous if there exists more than one leftmost derivation or
more than one rightmost derivation or more than one parse tree for the given input
string. If the grammar is not ambiguous, then it is called unambiguous.

If the grammar has ambiguity, then it is not good for compiler construction. No method
can automatically detect and remove the ambiguity, but we can remove ambiguity by re-
writing the whole grammar without ambiguity.

Example 1:
Let us consider a grammar G with the production rule

1. E → I
2. E → E + E
3. E → E * E
4. E → (E)
5. I → ε | 0 | 1 | 2 | ... | 9

Solution:

For the string "3 * 2 + 5", the above grammar can generate two parse trees by leftmost derivation (trees omitted): one grouping the expression as (3 * 2) + 5 and another as 3 * (2 + 5).
Since there are two parse trees for a single string "3 * 2 + 5", the grammar G is ambiguous.

Example 2:
Check whether the given grammar G is ambiguous or not.

1. E → E + E
2. E → E - E
3. E → id

Solution:

From the above grammar String "id + id - id" can be derived in 2 ways:

First leftmost derivation

1. E → E + E
2. → id + E
3. → id + E - E
4. → id + id - E
5. → id + id - id

Second leftmost derivation

1. E → E - E
2. → E + E - E
3. → id + E - E
4. → id + id - E
5. → id + id - id

Since there are two leftmost derivation for a single string "id + id - id", the grammar G is
ambiguous.

Example 3:
Check whether the given grammar G is ambiguous or not.

1. S → aSb | SS
2. S → ε

Solution:

For the string "aabb" the above grammar can generate two parse trees (trees omitted).

Since there are two parse trees for a single string "aabb", the grammar G is ambiguous.

Example 4:
Check whether the given grammar G is ambiguous or not.

1. A → AA
2. A → (A)
3. A → a
Solution:

For the string "a(a)aa" the above grammar can generate two parse trees (trees omitted).

Since there are two parse trees for a single string "a(a)aa", the grammar G is
ambiguous.
