University College London: Module Code COMP2010
ASSESSMENT PATTERN COMP2010C
DATE 09-May-14
TIME 14:30
2013/14-COMP2010C-001-EXAM-84
©2013 University College London
Section A (Answer BOTH Questions)
1. Finite State Automata and Lexical Analysis
(a) Write a regular expression, over the ASCII alphabet, to capture comments defined as a string
surrounded by /* and */ without an intervening */, unless it appears between quotes "". [5]
(b) Wikipedia defines CamelCase as "CamelCase (camel case) or medial capitals is the practice of writing
compound words or phrases such that each word or abbreviation begins with a capital letter.".
Consider the following description of identifiers: "Identifiers are alphanumeric, but must start with
a lowercase letter and may not contain consecutive uppercase letters."
i. Write a DFA that accepts these identifiers. [6]
ii. Ignoring numbers, does your DFA accept only identifiers in CamelCase? Why or why not? [3]
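As an informal reading of the identifier rule quoted in part (b), the following Python sketch checks a string directly rather than by constructing a DFA. The function name `is_identifier` is hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters and digits only.

```python
import string

LOWER = set(string.ascii_lowercase)
UPPER = set(string.ascii_uppercase)
DIGIT = set(string.digits)

def is_identifier(text):
    """Return True if text follows the quoted identifier rule (ASCII assumed)."""
    # Must be non-empty and start with a lowercase letter.
    if not text or text[0] not in LOWER:
        return False
    previous_upper = False
    for ch in text[1:]:
        if ch in UPPER:
            # Two uppercase letters in a row are not allowed.
            if previous_upper:
                return False
            previous_upper = True
        elif ch in LOWER or ch in DIGIT:
            previous_upper = False
        else:
            # Anything outside letters and digits is rejected.
            return False
    return True
```

A check of this kind can serve as a reference when testing a hand-drawn automaton against sample strings.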
(c) For Σ = {a, b, c}, produce an NFA that accepts the following regular expression: [10]
(d) Convert the following NFA over the alphabet Σ = {a, b} into a DFA that recognizes the same
language. [10]
faa:
t1 = n EQ 1
CJUMP t1 label1
t2 = n - 1
PARAM t2
t3 = CALL faa
t4 = n MULT t3
return t4
label1:
return 1
[15]
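For orientation, the three-address routine above can be read as the recursive function sketched below in Python. This is one interpretation, not part of the question: it assumes that CJUMP transfers control to its label when the condition holds, and that PARAM followed by CALL passes a single argument.

```python
def faa(n):
    """Recursive reading of the three-address routine faa."""
    # Compare n with 1; when equal, jump to the label that returns 1.
    if n == 1:
        return 1
    # t2 = n - 1; PARAM t2; t3 = CALL faa
    t3 = faa(n - 1)
    # t4 = n MULT t3; return t4
    return n * t3
```

Under this reading the routine computes the factorial of n for n ≥ 1. Like the original code, it never reaches the base case for n < 1.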
(c) Translate the following Java code into TAC IR for a register machine. Optimise it as much as possible
during the translation, and provide a list of all optimisations that have been applied. To simplify
the translation, you can make the following assumptions:
• Java method Math.sin(double x) can be called with the name _sin.
• Java method Math.cos(double x) can be called with the name _cos.
• Java static field Math.PI can be accessed with IR variable name PI.
double a = 0;
double b = 0;
int i = n;
while(i > 0){
[15]
Total for Question 2: [34]
3. Syntax Analysis
(a) Remove left recursion from the following grammar. [8]
S → Xx | y | Sz
X → Xz | Sxd | yd | ε
(b) The C statement "(a) * b" is ambiguous: it can be interpreted either as the multiplication of two
variables or as the cast of the dereferenced value of b to the type a. To the lexer, a is simply
an identifier; it has no way of determining whether a is a variable or a type. Scannerless parsing
solves this and other tokenization ambiguities by combining lexing and parsing, by parsing at the
granularity of characters, not lexemes, i.e. contiguous sequences of characters.
i. How does scannerless parsing solve the ambiguity of "(a) * b"? [6]
ii. How would you need to modify your solution to the course project to implement scannerless
parsing? [4]
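The lexer's blindness described in part (b) can be illustrated with a minimal tokenizer sketch in Python. The token classes and the `tokenize` function are hypothetical and cover only the characters in the example statement; they are not taken from the course project.

```python
import re

# Hypothetical token classes for a tiny C-like fragment.
TOKEN_SPEC = [
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("STAR", r"\*"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def tokenize(source):
    """Split source into (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_RE.match(source, position)
        if match is None:
            raise ValueError(f"unexpected character {source[position]!r}")
        # lastgroup names the alternative that matched.
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens
```

Calling `tokenize("(a) * b")` yields LPAREN, IDENT, RPAREN, STAR, IDENT whether a was declared as a variable or as a type, because the tokenizer never consults declarations.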
(c) Consider the following grammar.
S → xA
A → Bx | D
B → xS | CxD
C → yD | ε
D → zD | ε
i. Compute the FIRST and FOLLOW sets for the above grammar. [5]
ii. Construct the LL(1) parsing table for this grammar. [5]
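FIRST sets are commonly computed by iterating to a fixed point, and the Python sketch below shows that general technique. It assumes the grammar in part (c) reads S → xA, A → Bx | D, B → xS | CxD, C → yD | ε, D → zD | ε; the names `GRAMMAR`, `first_of_sequence` and `compute_first` are hypothetical.

```python
EPSILON = "ε"

# Assumed reading of the grammar in part (c). Each right-hand side is a
# tuple of symbols; the empty tuple stands for an ε-production.
GRAMMAR = {
    "S": [("x", "A")],
    "A": [("B", "x"), ("D",)],
    "B": [("x", "S"), ("C", "x", "D")],
    "C": [("y", "D"), ()],
    "D": [("z", "D"), ()],
}

def first_of_sequence(symbols, first):
    """FIRST of a symbol sequence, given the current FIRST sets."""
    result = set()
    for symbol in symbols:
        if symbol not in GRAMMAR:
            # A terminal starts the sequence; nothing after it matters.
            result.add(symbol)
            return result
        result |= first[symbol] - {EPSILON}
        if EPSILON not in first[symbol]:
            return result
    # Every symbol in the sequence can derive the empty string.
    result.add(EPSILON)
    return result

def compute_first():
    """Grow the FIRST sets until no production adds anything new."""
    first = {nonterminal: set() for nonterminal in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for nonterminal, productions in GRAMMAR.items():
            for production in productions:
                new = first_of_sequence(production, first)
                if not new <= first[nonterminal]:
                    first[nonterminal] |= new
                    changed = True
    return first
```

FOLLOW sets can be computed with the same fixed-point pattern, reusing `first_of_sequence` on the symbols that come after each nonterminal occurrence.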
(d) When is a grammar ambiguous? Your answer should define an ambiguous grammar, and discuss a
concrete example of that grammar's ambiguity. [4]
Total for Question 3: [32]
memberList
} recName;
where record is a keyword; memberList is a list of semicolon-separated member declarations, each
member being defined as type memberName; recName is the name of the declared record. A member
can have a type of int, float, string, or another record. An empty record (i.e. one without any
member) is not allowed. Examples of record declarations are given below:
record{
int age;
string firstName;
string lastName;
} person;
record{
person manager;
int size;
string teamName;
} team;
[16]
(c) Consider the declaration of records below, which deliberately contains a fault. Answer the following
questions.
1. Describe what the fault is.
2. Can a parser detect the fault? Elaborate your answer with the reason why.
3. With a carefully implemented compiler of this language, describe when and how the fault will
be recognised by the compiler.
record{
string firstName;
string lastName;
string country;
} person;
record{
string title;
string isbn;
pearson author;
} book;
[10]
Total for Question 4: [32]