Pcdunit2 Continuation
Recognition of Tokens
Regular Expression
• A regular expression is a pattern which
specifies a set of strings of characters; it is
said to match certain strings.
• A declarative way of describing
regular languages
Example
• letter ( letter | digit )*
(an identifier: a letter followed by zero or more letters or digits)
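As a rough illustration, the pattern letter ( letter | digit )* can be tried out with Python's re module. The names IDENTIFIER and is_identifier are ours, and "letter" and "digit" are assumed to mean ASCII letters and digits.

```python
import re

# letter ( letter | digit )*, assuming letter = [A-Za-z] and digit = [0-9]
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")

def is_identifier(s: str) -> bool:
    """Return True when the whole string matches the identifier pattern."""
    return IDENTIFIER.fullmatch(s) is not None

print(is_identifier("count1"))  # True
print(is_identifier("1count"))  # False
```

fullmatch is used rather than search so that the entire string, not just a substring, has to fit the pattern.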
Terminology of Languages
• Alphabet : a finite set of symbols (ASCII characters)
• String :
– Finite sequence of symbols on an alphabet
– Sentence and word are also used to mean string
– ε is the empty string
– |s| is the length of string s.
• Language: a set of strings over some fixed alphabet
– ∅, the empty set, is a language
– {ε}, the set containing only the empty string, is a language
• Operators on Strings:
– Concatenation: xy represents the concatenation of strings x and y.
– sε = s and εs = s
– s^n = s s s ... s (n times), s^0 = ε
Operations on Languages
• Concatenation:
– L1L2 = { s1s2 | s1 ∈ L1 and s2 ∈ L2 }
• Union
– L1 ∪ L2 = { s | s ∈ L1 or s ∈ L2 }
• Exponentiation:
– L^0 = {ε}, L^1 = L, L^2 = LL
• Kleene Closure
– L* = ⋃ (i ≥ 0) L^i
L* denotes “zero or more concatenations of” L
• Positive Closure
– L+ = ⋃ (i ≥ 1) L^i
L+ denotes “one or more concatenations of” L
Example
• L1 = {a,b,c,d} L2 = {1,2}
• L1L2 = {a1,a2,b1,b2,c1,c2,d1,d2}
• L1 ∪ L2 = {a,b,c,d,1,2}
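For finite languages these operations can be sketched with Python sets of strings. The function names below are ours, and since the closures are infinite in general, closure_up_to only collects the powers up to a chosen bound.

```python
from itertools import product

def concat(l1, l2):
    """L1L2: every string of l1 followed by every string of l2."""
    return {x + y for x, y in product(l1, l2)}

def power(lang, n):
    """L^n, with L^0 = {""} (the empty string stands in for epsilon)."""
    result = {""}
    for _ in range(n):
        result = concat(result, lang)
    return result

def closure_up_to(lang, max_n, positive=False):
    """Union of L^i for i from 0 (or 1 if positive) up to max_n."""
    start = 1 if positive else 0
    out = set()
    for i in range(start, max_n + 1):
        out |= power(lang, i)
    return out

L1 = {"a", "b", "c", "d"}
L2 = {"1", "2"}
print(sorted(concat(L1, L2)))  # the eight strings a1, a2, ..., d2
print(sorted(L1 | L2))         # union is the built-in set operator |
```

The only difference between the two closures is whether L^0, and with it the empty string, is included.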
Transition Diagrams for Identifiers and Keywords
[Figure: transition diagrams; the accepting states call install_num( )]
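A transition diagram for identifiers and keywords can be simulated with an explicit state variable. The sketch below is only one possible reading: the two states, the KEYWORDS table and the name scan_word are all assumptions of ours, not taken from the slides.

```python
KEYWORDS = {"if", "then", "else"}  # hypothetical keyword table

def scan_word(text, pos=0):
    """Follow the diagram from text[pos]; return (kind, lexeme, next_pos) or None."""
    state = 0          # state 0: start, state 1: inside a word
    i = pos
    while True:
        ch = text[i] if i < len(text) else ""   # "" marks end of input
        if state == 0:
            if ch.isascii() and ch.isalpha():
                state = 1
                i += 1
            else:
                return None                      # no edge leaves the start state
        else:
            if ch.isascii() and ch.isalnum():
                i += 1                           # loop on letter or digit
            else:
                lexeme = text[pos:i]             # the "other" edge: accept, do not consume ch
                kind = "keyword" if lexeme in KEYWORDS else "id"
                return kind, lexeme, i

print(scan_word("if x"))   # ('keyword', 'if', 2)
print(scan_word("x1+y"))   # ('id', 'x1', 2)
```

Keywords are recognized here by looking the lexeme up in a table after the word has been scanned, so they need no states of their own.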
Write an RE for the language L of all strings over the alphabet Σ = {0, 1}
that end with 00
R = (0+1)* 00
Strings over Σ = {a,b} whose third symbol from the right end is a:
R = (a+b)* a (a+b)(a+b)
Construct an RE for the language L of all strings with at least two b’s over
the alphabet Σ = {a,b}
R = (a+b)* b (a+b)* b (a+b)*
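Strings of 1’s of even length (11, 1111, 111111, ...): (11)*, which also accepts ε; use (11)+ to exclude it
Strings of 1’s of odd length: 1(11)*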
Construct an RE for the language L over Σ = {a,b} in which the
total number of a’s is divisible by 3.
R = (aaa)* when the string contains only a’s
R = (b* a b* a b* a)* b* in general (zero a’s counts as divisible by 3)
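A candidate expression can be sanity-checked by brute force: enumerate every string up to a small length and compare the regex verdict with a direct count of a’s. The sketch below (helper names are ours) does this for two candidates, one closing the group with * followed by a trailing b*, and one closing it with +.

```python
import re
from itertools import product

def strings_up_to(alphabet, max_len):
    """Yield every string over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            yield "".join(tup)

def mismatches(pattern, predicate, alphabet, max_len=8):
    """Strings on which the regex and the predicate disagree."""
    rx = re.compile(pattern)
    return [s for s in strings_up_to(alphabet, max_len)
            if (rx.fullmatch(s) is not None) != predicate(s)]

def a_count_divisible_by_3(s):
    return s.count("a") % 3 == 0

star_form = r"(b*ab*ab*a)*b*"    # group closed with *, trailing b*
plus_form = r"(b*ab*ab*ab*)+"    # group closed with +

print(mismatches(star_form, a_count_divisible_by_3, "ab"))
print(mismatches(plus_form, a_count_divisible_by_3, "ab", max_len=2))
```

The first call finds no disagreement up to length 8. The second reports the strings that contain no a at all: zero is divisible by 3, but a + closure demands at least three a’s. A check like this is evidence, not a proof, since only short strings are tried.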
Write the RE for the language L of strings over {a,b,c} that
contain exactly one b
R = (a/c)* b (a/c)*
Write the RE for the language L of strings over {a,b,c} that
contain no two consecutive b’s
R = (b/ε) (a/c/ab/cb)*
Write the RE for the language L of strings over the alphabet
{a,b,c} containing an even number of a’s
R = ((b/c)* a (b/c)* a)* (b/c)*
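The even-number-of-a’s language can also be recognized by a two-state automaton that remembers only the parity of the a’s read so far. The sketch below, with names of our choosing, compares that automaton with the expression above on all short strings; Python writes the alternation b/c as the character class [bc].

```python
import re
from itertools import product

# ((b/c)* a (b/c)* a)* (b/c)* in Python regex syntax
EVEN_A = re.compile(r"([bc]*a[bc]*a)*[bc]*")

def even_as_dfa(s):
    """Two-state automaton: state 0 = even number of a's so far, 1 = odd."""
    state = 0
    for ch in s:
        if ch == "a":
            state = 1 - state      # an a flips the parity
        elif ch not in "bc":
            return False           # symbol outside the alphabet {a,b,c}
    return state == 0              # accept only in the even state

def agree(max_len=6):
    """True when regex and automaton agree on every string up to max_len."""
    return all(
        (EVEN_A.fullmatch("".join(t)) is not None) == even_as_dfa("".join(t))
        for n in range(max_len + 1)
        for t in product("abc", repeat=n)
    )

print(agree())
```

Each pass through the starred group consumes exactly two a’s, which is what returns the automaton to its even state.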
Describe the language denoted by each of the following REs
• 0(0/1)* 0
The set of all strings of 0’s and 1’s of length at least 2 that start and end with 0
• (0/1)* 0 (0/1) (0/1)
The set of all strings of 0’s and 1’s whose third symbol from the right end is 0
• 0* 10*10*10* (or a*ba*ba*ba* over {a,b})
The set of all strings of 0’s and 1’s containing exactly three 1’s
• (00/11)* ((01/10) (00/11)* (01/10) (00/11)*)*
The set of all strings of 0’s and 1’s with an even number of 0’s and an even number of 1’s
Parse trees are comparatively less dense than syntax trees.
Syntax trees are comparatively more dense than parse trees.
Considering the following grammar-
E → E + T | T
T → T x F | F
F → ( E ) | id
( a + b ) * ( c – d ) + ( ( e / f ) * ( a + b ))
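A syntax tree for such an expression can be built with a small recursive-descent parser. The sketch below makes several assumptions of its own: the grammar's multiplication sign x is written as *, the operators - and / are added at the same levels as + and *, the left-recursive rules are replaced by loops, and a tree node is a tuple (operator, left, right).

```python
import re

TOKEN = re.compile(r"[A-Za-z_]\w*|[-+*/()]")

def tokenize(text):
    tokens = TOKEN.findall(text)
    # Reject input containing characters that no token covers.
    if "".join(tokens) != "".join(text.split()):
        raise SyntaxError("unexpected character in input")
    return tokens

def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():                      # E -> E + T | T, as T followed by (+ T)*
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():                      # T -> T * F | F, as F followed by (* F)*
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():                    # F -> ( E ) | id
        tok = advance()
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is None or not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse("id + id * id"))
print(parse("(a + b) * (c - d) + ((e / f) * (a + b))"))
```

Parentheses leave no node in the result, which is one reason a syntax tree is more compact than the parse tree of the same string. In the second tree the subtree for a + b appears twice.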