
PRACTICAL FILE

BE (CSE) 6th Semester


(Group: 4)

Compiler Design
January, 2018 – May, 2018

Submitted By
Om Prakash Sharma
Roll Number:
UE153068
Submitted To
Mr. Aakashdeep Sir

Computer Science and Engineering


University Institute of Engineering and Technology
Panjab University, Chandigarh – 160014, INDIA
2018
PRACTICAL No.1
Aim: - Implementation of Lexical Analyzer for C++ Language.
Theory: -
Lexical analysis is the first phase of a compiler, also known as the scanner. It
converts the input program into a sequence of tokens.

Token: -
A lexical token is a sequence of characters that can be treated as a unit in the
grammar of a programming language.

Examples of tokens:
▪ Type tokens (id, number, real, . . .)
▪ Punctuation tokens (IF, void, return, . . .)
▪ Alphabetic tokens (keywords)

Implementation: - Python's list data structure is used to store all the
default operators, keywords, and symbols. Each character in the file is then
compared against them. The isnumeric() and isalpha() functions are also used
in this program for comparison. Dedicated if blocks and string comparisons
detect single-line and multi-line comments, as well as two-character
operators such as >>, ++ and --.

Code: -
f = open("program.txt", "r")
str = f.read()
f.close()

ops = ['+', '*', '/', '-', "++", "--"]

keywords = ["auto","break","bool","case","char","const","continue","default",
            "do","double","else","enum","extern","float","for","goto",
            "if","int","iostream","include","long","namespace","main","std",
            "register","return","short","signed","sizeof","static",
            "struct","switch","typedef","union","unsigned","using",
            "void","volatile","while","true","false"]

symbols = ["(", ")", "{", "}", ";", ",", "<<", ">>", "#", "<", ">"]

temp = []
i = 0

while i < len(str):
    char = str[i]
    if char == '/' and str[i+1] == '*':          # Checking Multi-line Comments
        j = i + 2
        while not (str[j] == "*" and str[j+1] == "/"):
            temp.append(str[j])
            j = j + 1
        print('"', ''.join(temp).strip(), '" is a Multi-line Comment')
        temp = []
        i = j + 1                                # skip past the closing */
    elif char == '/' and str[i+1] == '/':        # Checking Single-line Comments
        j = i + 2
        while str[j] != "\n":
            temp.append(str[j])
            j = j + 1
        print('"', ''.join(temp).strip(), '" is a Single-line Comment')
        temp = []
        i = j
    elif char == '"':                            # Checking String Literals
        j = i + 1
        while str[j] != '"':
            temp.append(str[j])
            j = j + 1
        print('" ', ''.join(temp).strip(), ' " is a string')
        temp = []
        i = j
    elif char.isnumeric():                       # Checking Numbers
        j = i
        while str[j].isnumeric():
            temp.append(str[j])
            j = j + 1
        print(''.join(temp).strip(), ' is a number')
        temp = []
        i = j - 1
    elif char in ops:                            # Checking Operators
        if char == "+" and str[i+1] == "+":      # Checking ++
            i = i + 1
            char = "++"
        elif char == "-" and str[i+1] == "-":    # Checking --
            i = i + 1
            char = "--"
        print(char, "is an operator")
    elif char in symbols:                        # Checking Symbols
        if char == "<" and str[i+1] == "<":      # Checking <<
            i = i + 1
            char = "<<"
        elif char == ">" and str[i+1] == ">":    # Checking >>
            i = i + 1
            char = ">>"
        print(char, "is a symbol")
    elif char.isalnum():                         # accumulate identifier/keyword
        temp.append(char)
    elif (char == " " or char == "\n") and len(temp) != 0:
        if ''.join(temp) in keywords:
            print(''.join(temp).strip(), 'is a keyword')
        else:
            print(''.join(temp).strip(), 'is an identifier')
        temp = []
    i = i + 1
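As a quick sanity check (this trace is mine, not one of the snapshots below): if program.txt contains the single line

int n = 5;

the scanner prints that int is a keyword, n is an identifier, 5 is a number, and ; is a symbol. Note that = is skipped silently, since it appears in neither the ops nor the symbols list.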
OUTPUT SNAPSHOT: -

Entered Test case 1.

#include <iostream>
using namespace std;
int main()
{
int n, i;
bool isPrime = true;

cout <<"Enter a positive integer ";


cin >> n;

for (i = 2; i <= n / 2; ++i)


{
if (n % i == 00)
{
isPrime = false;
break;
}
}
if (isPrime)
cout << "This is a prime number";
else
cout << "This is not a prime number";

return 0;
}
Output Snapshot.

Entered Test Case 2.

Entered Program: -

#include <iostream>
using namespace std;
int main()
{
string str = "C++ Programming";

// you can also use str.length()


cout << "String Length = " << str.size();

return 0;
}

Output Snapshot
Entered Test Case 3.
Entered Program

#include <iostream>
using namespace std;

int main()
{
cout << "Hello, World!";
return 0;
}

Output Snapshot
PRACTICAL No.2

Aim: - Implementation of Lexical Analyzer using Flex Tools.


Theory: -
Flex is a fast lexical-analyzer generator. You specify the scanner you want as
a set of patterns to match and actions to apply for each token. flex takes
your specification, generates a combined NFA that recognizes all your
patterns, converts it to an equivalent DFA, minimizes the automaton as much
as possible, and generates C code that implements it. flex is modeled on an
earlier tool, lex, designed by Lesk and Schmidt, and bears many similarities
to it.

A flex Input File: -

flex input files are structured as follows:

%{
Declarations
%}
Definitions
%%
Rules
%%
User subroutines

The optional Declarations and User subroutines sections hold ordinary C code
that is copied verbatim into the generated C file: declarations go to the top
of the file, user subroutines to the bottom.
The optional Definitions section is where you set options for the scanner and
give names to regular expressions, a simple substitution mechanism that makes
the entries in the Rules section more readable.
The required Rules section is where you specify the patterns that identify
your tokens and the action to perform upon recognizing each token.
Flex Rules

Character classes [0-9]: alternation of the characters in the listed range
(in this case 0|1|2|3|4|5|6|7|8|9). More than one range may be specified,
e.g. [0-9A-Za-z], as may individual characters, as in [aeiou0-9].

Character exclusion [^...]: the first character of a character class may be ^
to indicate the complement of the set of characters specified. For example,
[^0-9] matches any non-digit character.

Arbitrary character . : the period matches any single character except newline.

Single repetition x? : 0 or 1 occurrences of x.

Nonzero repetition x+ : x repeated one or more times; equivalent to xx*.

Specified repetition x{n,m} : x repeated between n and m times.

Beginning of line ^x : match x at the beginning of a line only.

End of line x$ : match x at the end of a line only.

Context sensitivity ab/cd : match ab, but only when followed by cd. The
lookahead characters are left in the input stream to be read as part of the
next token.

Literal strings "x" : this means x even if x would normally have special
meaning. Thus "x*" may be used to match x followed by an asterisk. The special
meaning of a single character can also be turned off by preceding it with a
backslash, e.g. \. matches exactly the period character and nothing more.

Definitions {name} : replaced by the earlier-defined pattern called name. This
kind of substitution lets you reuse pattern pieces and define more readable
patterns.
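These basic operators behave the same way in most regex engines; the following sketch (my own illustration, using Python's re module rather than flex itself) demonstrates a few of them:

import re

# character class spanning several ranges, with nonzero repetition
print(re.fullmatch(r"[0-9A-Fa-f]+", "1aF") is not None)   # True: hex digits
# complemented class: any single non-digit character
print(re.fullmatch(r"[^0-9]", "x") is not None)           # True
# optional single repetition and bounded repetition
print(re.fullmatch(r"ab?c{2,3}", "acc") is not None)      # True: b omitted
# a backslash strips special meaning: \. matches a literal period
print(re.fullmatch(r"3\.14", "3.14") is not None)         # True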
Implementation: - The regular expressions used are: -
DIGIT [0-9]

LETTER [A-Za-z]

Comment [//]

LST [ \n\t\r]+

ID ({LETTER}|_)({LETTER}|{DIGIT})*

String (\"({LETTER}|[ ]+)*\")

SEPERATORS [;,.:]

Code: -
%{
#include <math.h>
#include <conio.h>
%}
DIGIT [0-9]
LETTER [A-Za-z]
Comment [//]
LST [ \n\t\r]+
ID ({LETTER}|_)({LETTER}|{DIGIT})*
String (\"({LETTER}|[ ]+)*\")
SEPERATORS [;,.:]
Keywords (include|iostream|bool|auto|double|using|cin|cout|namespace|std|int|struct|break|else|long|switch|case|enum|register|typedef|char|extern|return|union|const|float|short|unsigned|continue|for|signed|void|default|goto|sizeof|volatile|do|if|static|while)
%%

{LST}+

{SEPERATORS} {printf("A Seperator: %s\n",yytext);}

{DIGIT}+ {printf("An integer: %s\n",yytext);}

{String} {printf("A String: %s\n",yytext);}

{Keywords} {printf("A keyword: %s\n",yytext);}

{ID} {printf("An Identifier: %s\n",yytext);}

"("|"#"|")"|"++"|"+"|"--"|"-"|"*"|"/"|"%"|"<<"|"<"|">>"|">"|"="|"=="|"!="|">="|"<="|"&&"|"&"|"||"|"|"|"^"|"["|"]"|"{"|"}"|"?" {printf("An Operator: %s\n",yytext);}

. {printf("Unrecognized character: %s\n",yytext);}

%%

main()
{
    yyin = fopen("program.txt", "r");
    yylex();
    getch();
}

int yywrap()
{
    return 1;
}
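Assuming the specification above is saved as lexer.l (the file name is mine), a typical build is:

flex lexer.l
gcc lex.yy.c -o lexer

which generates lex.yy.c and compiles it into a standalone scanner. Note that conio.h and getch() are Windows-specific; on other platforms those two lines can simply be dropped.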
OUTPUT SNAPSHOT: -
Entered Program 1.
#include <iostream>
using namespace std;
int main()
{
int n, i;
bool isPrime = true;

cout <<"Enter a positive integer ";


cin >> n;

for (i = 2; i <= n / 2; ++i)


{
if (n % i == 00)
{
isPrime = false;
break;
}
}
if (isPrime)
cout << "This is a prime number";
else
cout << "This is not a prime number";

return 0;
}
Output: -

Entered Program 2.
#include <iostream>
using namespace std;

int main()
{
string str = "C++ Programming";

// you can also use str.length()


cout << "String Length = " << str.size();

return 0;
}

Output: -
PRACTICAL No. 3

Aim: - To Convert Regular Expression into NFA.


Theory: -

The regular expression parser we will create here supports these
three operations:

1. Kleene closure or star operator ("*")
2. Concatenation (for example: "ab")
3. Union operator (denoted by the character "|")

However, many additional operators can be simulated by combining these
three operators. For instance:

1. A+ = AA* (at least one A)
2. [0-9] = (0|1|2|3|4|5|6|7|8|9)
3. [A-Z] = (A|B|...|Z), etc.

What is an NFA?

NFA stands for nondeterministic finite automaton. An NFA can be seen as a
special kind of finite-state machine, which is in a sense an abstract model
of a machine with a primitive internal memory.

Let us look at the mathematical definition of an NFA.

An NFA A consists of:

a. A finite set I of input symbols
b. A finite set S of states
c. A next-state function f from S x I into P(S)
d. A subset Q of S of accepting states
e. An initial state s0 from S

denoted as A(I, S, f, Q, s0).
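As a concrete illustration (my own sketch, not part of the original file), this definition maps directly onto Python data structures. Here is the NFA for a|b written out by hand, together with a simulator that tracks the set of reachable states:

I  = {"a", "b"}                       # finite set of input symbols
S  = {1, 2}                           # finite set of states
f  = {(1, "a"): {2}, (1, "b"): {2}}   # next-state function from S x I into P(S)
Q  = {2}                              # accepting states
s0 = 1                                # initial state

def accepts(w):
    current = {s0}                    # an NFA can be in several states at once
    for ch in w:
        nxt = set()
        for st in current:
            nxt |= f.get((st, ch), set())
        current = nxt
    return bool(current & Q)

print(accepts("a"))    # True
print(accepts("ab"))   # False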


Implementation: - The list data structure is used throughout the program.
The variable s_id assigns state numbers.

The list Transitions stores all the transitions in the format

[init_state, input_character, final_state]

The list state keeps track of the start and final state of each sub-NFA as
it is built.

Code: -
# Python program to convert infix expression to postfix

# Class to convert the expression

class Conversion:

# Constructor to initialize the class variables

def __init__(self, capacity):

self.top = -1

self.capacity = capacity

# This array is used a stack

self.array = []

# Precedence setting

self.output = []

self.precedence = {'|': 1, '.': 2, '*': 3}

# check if the stack is empty

def isEmpty(self):

return True if self.top == -1 else False


# Return the value of the top of the stack

def peek(self):

return self.array[-1]

# Pop the element from the stack

def pop(self):

if not self.isEmpty():

self.top -= 1

return self.array.pop()

else:

return "$"

# Push the element to the stack

def push(self, op):

self.top += 1

self.array.append(op)

# A utility function to check is the given character

# is operand

def isOperand(self, ch):


return ch.isalpha()

# Check if the precedence of operator is strictly

# less than top of stack or not

def notGreater(self, i):

try:

a = self.precedence[i]

b = self.precedence[self.peek()]

return True if a <= b else False

except KeyError:

return False

# The main function that converts given infix expression

# to postfix expression

def infixToPostfix(self, exp):

# Iterate over the expression for conversion

for i in exp:

# If the character is an operand,

# add it to output

if self.isOperand(i):

self.output.append(i)

# If the character is an '(', push it to stack


elif i == '(':

self.push(i)

# If the scanned character is an ')', pop and

# output from the stack until and '(' is found

elif i == ')':

while ((not self.isEmpty()) and self.peek() != '('):

a = self.pop()

self.output.append(a)

if (not self.isEmpty() and self.peek() != '('):

return -1

else:

self.pop()

# An operator is encountered

else:

while (not self.isEmpty() and self.notGreater(i)):

self.output.append(self.pop())

self.push(i)

# pop all the operator from the stack

while not self.isEmpty():

self.output.append(self.pop())
regex = "".join(self.output)

return regex

# Driver program to test above function

exp = "a.b*.(a|b)*"

obj = Conversion(len(exp))

regex = obj.infixToPostfix(exp)

print(regex)

s_id = 1

Transitions = []

state = []

for i in range(len(regex)):

ch = regex[i]

if ch == "a" or ch == "b":

st_state = s_id

end_state = s_id + 1

state.append([st_state, end_state])

s_id = s_id + 2

Transitions.append([st_state , ch , end_state])

elif ch == ".":

b = state.pop()
a = state.pop()

st_state = s_id

end_state = s_id + 1

state.append([st_state,end_state])

s_id = s_id + 2

Transitions.append([st_state , "e" , a[0]])

Transitions.append([b[1], "e" , end_state])

Transitions.append([a[1] , "e" , b[0]])

elif ch == "|":

b = state.pop()

a = state.pop()

st_state = s_id

end_state = s_id + 1

state.append([st_state, end_state])

s_id = s_id + 2

Transitions.append([st_state, "e", a[0]])

Transitions.append([st_state, "e", b[0]])

Transitions.append([a[1], "e", end_state])

Transitions.append([b[1], "e", end_state])

elif ch == "*":

a = state.pop()

st_state = s_id
end_state = s_id + 1

state.append([st_state, end_state])

s_id = s_id + 2

Transitions.append([st_state, "e", a[0]])

Transitions.append([st_state, "e", end_state])

Transitions.append([a[1], "e", a[0]])

Transitions.append([a[1], "e", end_state])

for i in range(len(Transitions)):

print(Transitions[i])

OUTPUT SNAPSHOT: -

Test case 1.
Test Case 2.

Test Case 3.
PRACTICAL No. 4
Aim: - To convert NFA into DFA.
Theory: -
Deterministic Finite Automata (DFA)

A DFA consists of the 5-tuple {Q, ∑, q, F, δ}:

Q : set of all states.
∑ : set of input symbols (the symbols the machine takes as input).
q : initial state (the starting state of the machine).
F : set of final states.
δ : transition function, defined as δ : Q X ∑ --> Q.

In a DFA, for a particular input character, the machine goes to exactly one
state. The transition function is defined on every state for every input
symbol. Also, null (or ε) moves are not allowed, i.e., a DFA cannot change
state without consuming an input character.
For example, the DFA below with ∑ = {0, 1} accepts all strings ending with 0.
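A minimal sketch of that DFA in Python (my own illustration; the state names are arbitrary):

# DFA over {0, 1} that accepts exactly the strings ending with 0
delta = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}
start, final = "q0", {"q1"}

def run(w):
    st = start
    for ch in w:
        st = delta[(st, ch)]   # exactly one successor per (state, symbol)
    return st in final

print(run("1010"))   # True: ends with 0
print(run("101"))    # False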
Implementation: -
Two main functions are used in this program:

1. epsilon_closure(T) – takes a set of NFA states as input and returns the
epsilon closure of that set.

2. Move(a, T) – takes two inputs, an input character a and a set of states T,
and returns a list of all the states that can be reached by applying input
"a" to the states in T.

The list DFA_trans stores the transitions in the format

[initial_state, Input Character, final_state]

Code: -
# epsilon closure and Move function

def epsilon_closure(T):

t_stack=T[:]

#print("T is ",T,"t_stack is",t_stack,"result is",result)

while len(t_stack)!=0:

t = t_stack.pop()

for i in range(len(Transitions)):

if Transitions[i][0]==t and Transitions[i][1]=="e":

u = Transitions[i][2]

if u not in T:

T.append(u)

t_stack.append(u)

T.sort()
return(T)

#print(epsilon_closure([5]))

def Move(a, T):

result = []

for i in range(len(T)):

for t in Transitions:

if t[0]==T[i] and t[1]==a:

if t[2] not in result:

result.append(t[2])

result.sort()

return result

T=epsilon_closure([7])

print(T)

T=Move("a",T)

print(T)

# NFA to DFA Conversion . . . . . .

NFA_st_state = []

NFA_st_state.append(state[-1][0])

DFA_states = []

DFA_states.append([epsilon_closure(NFA_st_state),"unmark"])

#print(DFA_states)
DFA_trans=[]

input_symbol = ["a","b"]

#flag=0

def unmark(T):

for t in T:

if t[1]=="unmark":

return True

def Not_in(U, DFA_states):

for i in range(len(DFA_states)):

if DFA_states[i][0]==U:

return False

return True

while unmark(DFA_states):

for i in range(len(DFA_states)):

if DFA_states[i][1]=="unmark":

T=DFA_states[i][0]

DFA_states[i][1]="mark"

break

#print(T)

#print(DFA_states)

for ch in input_symbol:
U = Move(ch,T)

U = epsilon_closure(U)

if len(U)!=0:

DFA_trans.append([T,ch,U])

if Not_in(U, DFA_states):

DFA_states.append([U,"unmark"])

#print(DFA_states)

print("DFA_states")

for i in range(len(DFA_states)):

print(DFA_states[i][0])

for i in range(len(DFA_states)):

if state[-1][0] in DFA_states[i][0]:

print("\nstart state is ", DFA_states[i][0])

for i in range(len(DFA_states)):

if state[-1][1] in DFA_states[i][0]:

print("Final state is ", DFA_states[i][0])

print("\nDFA_transitions\n")

print("initial state ", "Input Character ", "final state")

for i in range(len(DFA_trans)):

print(DFA_trans[i][0],"\t ",DFA_trans[i][1],"\t \t ",DFA_trans[i][2])


OUTPUT SNAPSHOT: -
Test Case 1. (a|b)*

Test Case 2. (a|b)


Test Case 3. (a.b)
PRACTICAL NO. 5
AIM: - To write code for Left Recursion Removal.
Theory: -

A production is left-recursive if the leftmost symbol on its right side is the
same as the nonterminal on its left side. For example,
expr → expr + term.

If one were to code this production in a recursive-descent parser, the parser
would go into an infinite loop.

We can eliminate left recursion by introducing new nonterminals and new
production rules.

For example, the left-recursive grammar is:

E → E + T | T
T → T * F | F
F → (E) | id

We can redefine E and T without left recursion as:

E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
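In general, a left-recursive pair of productions A → Aα | β (where β does not start with A) is rewritten as

A → βA'
A' → αA' | ε

and this is exactly the transformation the program below applies, printing 'e' for ε.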

Implementation: -
String formatting is used for the rewriting in the following manner:

new_trans1.append([trans[i][1:] + s_state + "'"]) and
new_trans.append([trans[i] + s_state + "'"])

For A -> Aa | b,

new_trans holds the production A -> bA'
and new_trans1 holds the productions A' -> aA' | e


Code: -
c=int(input("No, of start state"))

for i in range(c):

s_state = input("Input Start State ")

count = int(input("Enter Number Of Projections "))

trans = []

for i in range(count):

trans.append(input("Enter Transition "))

print(s_state, "->", trans)

new_trans=[]

new_trans1=[]

for i in range(len(trans)):

if trans[i][0]==s_state:

new_trans1.append([trans[i][1:] + s_state +"'"])

else:

new_trans.append([trans[i] + s_state +"'"])

new_trans1.append("e")

print(s_state, "->", new_trans)

print(s_state,"' ->", new_trans1)
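As a quick check (my own trace, not one of the snapshots below): for start state A with the two productions Aa and b, the program prints

A -> ['Aa', 'b']
A -> [["bA'"]]
A ' -> [["aA'"], 'e']

that is, A -> bA' and A' -> aA' | e.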


OUTPUT SNAPSHOT: -

Test Case 1.

Test Case 2.
Test Case 3.
PRACTICAL NO. 6
AIM: - To implement Recursive Descent Parser.
Theory: - Recursive Descent Parsing
Recursive descent is a top-down parsing technique that constructs the parse
tree from the top, reading the input from left to right. It uses a procedure
for every terminal and non-terminal entity. This parsing technique recursively
parses the input to build a parse tree, which may or may not require
backtracking; the grammar associated with it (if not left-factored) cannot
avoid backtracking. A form of recursive-descent parsing that does not require
any backtracking is known as predictive parsing.
This technique is regarded as recursive because it uses context-free grammars,
which are recursive in nature.

Implementation: -
The Grammar used for the parser is

E -> T | T + E

T -> int | int * T | (E)

Variables pt and save are used for backtracking and to keep track of the
input pointer.

If the pointer reaches the end of the string entered by the user, the input
is accepted by the parser; otherwise it is rejected.


Code: -
"""
The Grammar used for the parser is

E -> T | T + E

T -> int | int * T | (E)
"""

string = []
pt = 0

n = int(input("Enter length of string\t"))
for i in range(n):
    string.append(input())      # one token per line, e.g. ( int )

def match(char):
    global pt
    if pt == len(string):
        return False
    if string[pt] == char:
        pt = pt + 1
        return True
    else:
        return False

def E1():
    return T()

def E2():
    return T() and match('+') and E()

def E():
    global pt
    save = pt
    if E2() == True:
        return True
    else:
        pt = save               # backtrack and try the other alternative
        return E1()

def T1():
    return match('int')

def T2():
    return match('int') and match('*') and T()

def T3():
    return match('(') and E() and match(')')

def T():
    global pt
    save = pt
    if T2() == True:
        return True
    else:
        pt = save
        if T3() == True:
            return True
        else:
            pt = save
            return T1()

def main():
    global pt
    E()
    if pt == len(string):
        print("Accepted", pt)
    else:
        print("Not Accepted", pt)

if __name__ == "__main__":
    main()
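As a quick trace (mine, not taken from the snapshots below): for the three tokens (, int, and ), T2 fails on the opening parenthesis, T3 then consumes the parenthesized expression, and main() prints Accepted 3. For the two tokens int and *, the dangling * makes T2 fail, backtracking leaves only the single int matched, and main() prints Not Accepted 1.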
OUTPUT SNAPSHOT: -
1. Entered String ( int * int )

2. Entered String ( int )

3. Entered String ( int + int )


EXPERIMENT NO. 7
AIM: - To implement Code for Left Factoring Removal.
Theory: -
Left factoring is a grammar transformation that is useful for producing a
grammar suitable for predictive, or top-down, parsing. When the choice
between two alternative A-productions is not clear, we may be able to
rewrite the productions to defer the decision until enough of the input has
been seen to make the right choice.

For example, if we have the two productions

stmt -> if expr then stmt else stmt
      | if expr then stmt

then on seeing the input if, we cannot immediately tell which production to
choose to expand stmt.
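In general, two productions with a common prefix, A → αβ1 | αβ2, are left-factored into

A → αA'
A' → β1 | β2

deferring the choice between β1 and β2 until after the common prefix α has been read. The program below performs this transformation, printing 'e' when one alternative is exactly the common prefix.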

Implementation: -
compare(a, b): this function compares two strings that start with the same
character and returns the length of their longest common prefix.

The minimum prefix common to all the productions in a group is found by
calling compare(a, b) inside a for loop, with the variable min keeping track
of the minimum count.

String formatting is then used for printing the desired output.


CODE: -
def compare(a, b):
    # length of the longest common prefix of a and b
    count = 0
    if len(a) < len(b):
        j = len(a)
    else:
        j = len(b)
    for i in range(j):
        if a[i] == b[i]:
            count = count + 1
        else:
            break
    return count

s_state = input("Input Start State ")
count = int(input("Enter Number Of Projections "))
trans = []
for i in range(count):
    trans.append(input("Enter Transition "))
print(s_state, "->", trans)

# pair each production with its index, keyed by its first character
s1 = []
for i in range(len(trans)):
    s1.append([trans[i][0], i])

done = []
l1 = []
l2 = []
for i in range(len(s1)):
    t = []
    if s1[i][0] not in done:
        t.append(s1[i])
        done.append(s1[i][0])
        # collect every production starting with the same character
        for j in range(i + 1, len(s1)):
            if s1[j][0] == s1[i][0]:
                t.append(s1[j])
        if len(t) > 1:
            # shortest common prefix over the whole group
            min = compare(trans[t[0][1]], trans[t[1][1]])
            for k in range(1, len(t) - 1):
                if compare(trans[t[k][1]], trans[t[k + 1][1]]) < min:
                    min = compare(trans[t[k][1]], trans[t[k + 1][1]])
            print("Common String is ", trans[t[0][1]][:min])
            l1.append([trans[t[0][1]][:min] + s_state + "'"])
            for k in range(len(t)):
                if len(trans[t[k][1]][min:]) == 0:
                    l2.append("e")
                else:
                    l2.append([trans[t[k][1]][min:]])
        else:
            l1.append([trans[t[0][1]]])

print(s_state, " -> ", end=" ")
print(l1)
print(s_state, "' -> ", end=" ")
print(l2)
OUTPUT SNAPSHOT: -
Test Case 1.

Test Case 2.

Test Case 3.
EXPERIMENT NO. 8

AIM: - To find the FIRST and FOLLOW sets of a grammar input by the user.

THEORY: -
The construction of both top-down and bottom-up parsers is aided by two
functions, FIRST and FOLLOW, associated with a grammar G. During top-down
parsing, FIRST and FOLLOW allow us to choose which production to apply,
based on the next input symbol.

Rules to compute the FIRST set:

1. If x is a terminal, then FIRST(x) = { x }.
2. If x -> ε is a production rule, then add ε to FIRST(x).
3. If X -> Y1 Y2 Y3 ... Yn is a production, then:
   1. FIRST(X) = FIRST(Y1)
   2. If FIRST(Y1) contains ε, then FIRST(X) = { FIRST(Y1) – ε } U { FIRST(Y2) }
   3. If FIRST(Yi) contains ε for all i = 1 to n, then add ε to FIRST(X).

Rules to compute the FOLLOW set:

1) FOLLOW(S) = {$}, where S is the starting non-terminal.

2) If A -> pBq is a production, where p, B and q are any grammar symbols,
then everything in FIRST(q) except ε is in FOLLOW(B).

3) If A -> pB is a production, then everything in FOLLOW(A) is in FOLLOW(B).

4) If A -> pBq is a production and FIRST(q) contains ε,
then FOLLOW(B) contains {FIRST(q) – ε} U FOLLOW(A).
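As a short worked example (my own, not from the original file): for the grammar S → AB, A → a | ε, B → b, the rules give FIRST(A) = {a, ε}, FIRST(B) = {b}, and FIRST(S) = {a, b} (by rule 3.2, since FIRST(A) contains ε). For the FOLLOW sets, FOLLOW(S) = {$}, FOLLOW(A) = FIRST(B) = {b}, and FOLLOW(B) = FOLLOW(S) = {$}.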

CODE: -

Finding First of the Grammar


n = int(input("Enter Total no. of non-terminals in the Grammar "))
nt = []
for i in range(n):
a = input("Enter the non-terminal ")
nt.append(a)
prod = []
for i in range(n):
print("Enter the number of productions of ",nt[i])
m = int(input())
temp = []
for j in range(m):
p = input("Enter the production ")
temp.append(p)
prod.append(temp)

for i in range(n):
print(nt[i]," -> ",end=' ')
#for j in range(len(prod[i])):
print(prod[i][:],sep=" | ")

first = []
for i in range(n):
first.append([])

# Recursion is used for finding the first in case of epsilon

def findfirst(c):
    global first
    n = nt.index(c)
    for item in prod[n]:
        if item[0] == "e":
            first[n].append("e")  # adding epsilon to first if in production
        elif item[0].islower() == True or item[0].isalpha() == False:
            if item == "id":      # add the full token id instead of just i
                first[n].append("id")
            else:
                first[n].append(item[0])  # add terminal starting the production
        elif item[0].isupper() == True:
            findfirst(item[0])
            for f in first[nt.index(item[0])]:
                first[n].append(f)
            if "e" in first[nt.index(item[0])]:
                findfirst(item[1])        # recursive call on the next symbol
                for f in first[nt.index(item[1])]:
                    first[n].append(f)

    return first[n]

# these calls are hard-coded for the non-terminals of the test grammar
findfirst("E")
findfirst("X")
findfirst("Y")

for i in range(n):
print("first of",nt[i],"is",first[i])
Finding Follow of the Grammar

follow = []
for i in range(n):
follow.append([])

# Rule 1. Place $ in Follow(S)


follow[0].append("$")

# Rule 2. If there is a production A -> aBb then everything in
# FIRST(b) except e is in FOLLOW(B)

for i in range(len(nt)):
    for j in range(len(prod)):
        for k in range(len(prod[j])):
            for l in range(len(prod[j][k])):
                if prod[j][k][l] == nt[i] and l != len(prod[j][k]) - 1:
                    if prod[j][k][l+1].islower() == True or prod[j][k][l+1].isalpha() == False:
                        follow[i].append(prod[j][k][l+1])
                    elif prod[j][k][l+1].isupper() == True:
                        for m in first[nt.index(prod[j][k][l+1])]:
                            if m != "e" and m not in follow[nt.index(prod[j][k][l])]:
                                follow[nt.index(prod[j][k][l])].append(m)

# Rule 3. If there is a production A -> aB, or a production A -> aBb where
# FIRST(b) contains e, then everything in FOLLOW(A) is in FOLLOW(B)

for i in range(len(prod)):
    for j in range(len(prod[i])):
        if len(prod[i][j]) >= 2:
            if prod[i][j][-1].isupper() == True:
                for m in follow[i]:
                    if m not in follow[nt.index(prod[i][j][-1])]:
                        follow[nt.index(prod[i][j][-1])].append(m)
            if prod[i][j][-2].isupper() == True and prod[i][j][-1].isupper() == True and "e" in first[nt.index(prod[i][j][-1])]:
                for m in follow[i]:
                    if m not in follow[nt.index(prod[i][j][-2])]:
                        follow[nt.index(prod[i][j][-2])].append(m)

for i in range(n):
print("follow of",nt[i],"is",follow[i])
OUTPUT SNAPSHOT: -
Test Case 1.
The Grammar Entered by the User

First Set of the Non-terminals in the Grammar

Follow Sets of the Non-terminals in the Grammar


Test Case 2.
The Grammar Entered by the User

First Set of the Non-terminals in the Grammar

Follow set of the non-terminals in the Grammar
