University College London: Module Code COMP2010

Computer science, maths and statistics.

Uploaded by

Dragan Petkanov
Copyright
© All Rights Reserved

UNIVERSITY COLLEGE LONDON

EXAMINATION FOR INTERNAL STUDENTS

MODULE CODE COMP2010

ASSESSMENT PATTERN COMP2010C

MODULE NAME Compilers

DATE 09-May-14

TIME 14:30

TIME ALLOWED 2 Hours 30 Minutes

2013/14-COMP2010C-001-EXAM-84
©2013 University College London TURN OVER
Section A (Answer BOTH Questions)
1. Finite State Automata and Lexical Analysis
(a) Write a regular expression, over the ASCII alphabet, to capture comments defined as a string
surrounded by /* and */ without an intervening */, unless it appears between quotes "". [5]
(b) Wikipedia defines CamelCase as follows: "CamelCase (camel case) or medial capitals is the practice of
writing compound words or phrases such that each word or abbreviation begins with a capital letter."
Consider the following description of identifiers: "Identifiers are alphanumeric, but must start with
a lowercase letter and may not contain consecutive uppercase letters."
i. Write a DFA that accepts these identifiers. [6]
ii. Ignoring numbers, does your DFA accept only identifiers in CamelCase? Why or why not? [3]
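As an illustration only, the identifier description in (b) can be read as a four-state machine. The sketch below is one possible encoding, not a model answer: it assumes ASCII letters and digits, and the class name IdentifierDfa and the state names are hypothetical.

```java
/** Hand-coded DFA sketch for the identifier description in part (b). */
final class IdentifierDfa {
    // START: nothing read yet. LOWER_OR_DIGIT: last character was a lowercase
    // letter or a digit. UPPER: last character was an uppercase letter.
    // DEAD: rejecting sink state.
    enum State { START, LOWER_OR_DIGIT, UPPER, DEAD }

    static State step(State s, char c) {
        boolean lower = c >= 'a' && c <= 'z';
        boolean upper = c >= 'A' && c <= 'Z';
        boolean digit = c >= '0' && c <= '9';
        switch (s) {
            case START:
                // The first character must be a lowercase letter.
                return lower ? State.LOWER_OR_DIGIT : State.DEAD;
            case LOWER_OR_DIGIT:
                if (lower || digit) return State.LOWER_OR_DIGIT;
                return upper ? State.UPPER : State.DEAD;
            case UPPER:
                // A second uppercase letter in a row is rejected.
                if (lower || digit) return State.LOWER_OR_DIGIT;
                return State.DEAD;
            default:
                return State.DEAD;
        }
    }

    static boolean accepts(String input) {
        State s = State.START;
        for (int i = 0; i < input.length(); i++) {
            s = step(s, input.charAt(i));
        }
        // Both non-start, non-dead states are accepting.
        return s == State.LOWER_OR_DIGIT || s == State.UPPER;
    }

    public static void main(String[] args) {
        System.out.println(accepts("camelCase")); // true
        System.out.println(accepts("parseXML"));  // false
    }
}
```

Under this reading a digit separates two uppercase letters, so a string such as a1B2C is accepted; whether that is intended is left to the description in the question.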
(c) For Σ = {a, b, c}, produce an NFA that accepts the following regular expression: [10]

(b | ε)(a(Σ - {a})*a | cc)*b

(d) Convert the following NFA over the alphabet Σ = {a, b} into a DFA that recognizes the same
language. [10]

Total for Question 1: [34]

COMP2010 -2- Turn Over


2. Intermediate Code and Code Optimisation
(a) Provide a benefit and a disadvantage of the Intermediate Representation for stack machines, compared
to that for register machines, each with a brief description. [4]
(b) The following is a Three Address Code (TAC) Intermediate Representation for a register machine:
it contains a function called faa, which takes an integer parameter n. Identify and describe what
the code does in plain English, and write down the high-level language counterpart in Java-like
pseudocode.

faa:
t1 = n EQ 1
CJUMP t1 label1
t2 = n - 1
PARAM t2
t3 = CALL faa
t4 = n MULT t3
return t4
label1:
return 1

[15]
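For orientation, one plausible line-by-line reading of the TAC is sketched below. It assumes that CJUMP transfers control to its label when the condition is true and that PARAM followed by CALL passes a single argument; the class name TacReading is hypothetical.

```java
/** Line-by-line Java reading of the TAC function faa (a sketch, not a model answer). */
final class TacReading {
    static int faa(int n) {
        boolean t1 = (n == 1);   // t1 = n EQ 1
        if (t1) {                // CJUMP t1 label1 (assumed: jump when t1 is true)
            return 1;            // label1: return 1
        }
        int t2 = n - 1;          // t2 = n - 1
        int t3 = faa(t2);        // PARAM t2; t3 = CALL faa
        int t4 = n * t3;         // t4 = n MULT t3
        return t4;               // return t4
    }

    public static void main(String[] args) {
        System.out.println(faa(5)); // 120
    }
}
```

Under these assumptions the only base case is n equal to 1, so the recursion as written never reaches it for inputs below 1.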
(c) Translate the following Java code into TAC IR for a register machine. Optimise it as much as possible
during the translation, and provide a list of all optimisations that have been applied. To simplify
the translation, you can make the following assumptions:
• Java method Math.sin(double x) can be called with the name _sin.
• Java method Math.cos(double x) can be called with the name _cos.
• Java static field Math.PI can be accessed with IR variable name PI.

public double bar(int n, double x){
    double a = 0;
    double b = 0;
    int i = n;
    while(i > 0){
        a += Math.sin(2 * Math.PI + 4 * i);
        b = Math.cos(2 * Math.PI + x);
        i -= 1;
    }
    return a + b;
}

[15]
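The kinds of optimisation the question invites can be illustrated at source level before moving to TAC. The sketch below is a hand-optimised Java variant, not the expected TAC answer. It assumes the loop decrements i by one per iteration (the decrement statement is read as i -= 1), and the names BarOptimised, barOptimised, twoPi and fourI are hypothetical.

```java
/** Source-level sketch of loop optimisations for bar; not a TAC translation. */
final class BarOptimised {
    // The method as read from the question; the decrement is an assumption.
    static double bar(int n, double x) {
        double a = 0;
        double b = 0;
        int i = n;
        while (i > 0) {
            a += Math.sin(2 * Math.PI + 4 * i);
            b = Math.cos(2 * Math.PI + x);
            i -= 1;
        }
        return a + b;
    }

    static double barOptimised(int n, double x) {
        double twoPi = 2 * Math.PI;   // common subexpression, computed once outside the loop
        double a = 0;
        double b = 0;
        if (n > 0) {
            // Loop-invariant assignment hoisted; the guard keeps b at 0 when
            // the original loop body would never have run.
            b = Math.cos(twoPi + x);
        }
        int fourI = 4 * n;            // strength reduction: tracks 4 * i by subtraction
        for (int i = n; i > 0; i--) {
            a += Math.sin(twoPi + fourI);
            fourI -= 4;
        }
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(bar(5, 0.3));
        System.out.println(barOptimised(5, 0.3));
    }
}
```

The guard on n matters: hoisting the assignment to b unconditionally would change the result for n less than or equal to 0, where the original returns 0.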
Total for Question 2: [34]



Section B (Answer ONE Question)

3. Syntax Analysis
(a) Remove left recursion from the following grammar. [8]

S → Xx | y | Sz
X → Xz | Sxd | yd | ε

(b) The C statement "(a) * b" is ambiguous: it can be interpreted either as the multiplication of two
variables or as the cast of the dereferenced value of b to the type a. To the lexer, a is simply
an identifier; it has no way of determining whether a is a variable or a type. Scannerless parsing
solves this and other tokenization ambiguities by combining lexing and parsing, by parsing at the
granularity of characters, not lexemes, i.e. contiguous sequences of characters.
i. How does scannerless parsing solve the ambiguity of "(a) * b"? [6]
ii. How would you need to modify your solution to the course project to implement scannerless
parsing? [4]
(c) Consider the following grammar.

S → xA
A → Bx | D
B → xS | CxD
C → yD | ε
D → zD | ε

i. Compute the FIRST and FOLLOW sets for the above grammar. [5]
ii. Construct the LL(1) parsing table for this grammar. [5]
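The FIRST and FOLLOW computations in (c) follow the usual fixed-point iteration, which can be sketched as below. This is an illustration under stated assumptions: the grammar is read as S → xA, A → Bx | D, B → xS | CxD, C → yD | ε, D → zD | ε; an empty right-hand side stands for ε; the string "eps" marks ε and "$" marks end of input. The class name FirstFollow and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Fixed-point computation of FIRST and FOLLOW for the grammar as read from (c). */
final class FirstFollow {
    static final String EPS = "eps";
    static final String END = "$";
    static final Map<String, List<List<String>>> G = new LinkedHashMap<>();
    static {
        G.put("S", List.of(List.of("x", "A")));
        G.put("A", List.of(List.of("B", "x"), List.of("D")));
        G.put("B", List.of(List.of("x", "S"), List.of("C", "x", "D")));
        G.put("C", List.of(List.of("y", "D"), List.<String>of()));
        G.put("D", List.of(List.of("z", "D"), List.<String>of()));
    }

    // FIRST of a symbol sequence, using the FIRST sets computed so far.
    static Set<String> firstOf(List<String> seq, Map<String, Set<String>> first) {
        Set<String> out = new TreeSet<>();
        for (String sym : seq) {
            if (!G.containsKey(sym)) {          // terminal: it starts the sequence
                out.add(sym);
                return out;
            }
            Set<String> f = first.get(sym);
            for (String t : f) if (!t.equals(EPS)) out.add(t);
            if (!f.contains(EPS)) return out;   // non-nullable: stop here
        }
        out.add(EPS);                           // every symbol nullable, or seq empty
        return out;
    }

    static Map<String, Set<String>> first() {
        Map<String, Set<String>> first = new LinkedHashMap<>();
        for (String nt : G.keySet()) first.put(nt, new TreeSet<>());
        boolean changed = true;
        while (changed) {                       // iterate until no set grows
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : G.entrySet())
                for (List<String> rhs : e.getValue())
                    changed |= first.get(e.getKey()).addAll(firstOf(rhs, first));
        }
        return first;
    }

    static Map<String, Set<String>> follow(Map<String, Set<String>> first) {
        Map<String, Set<String>> follow = new LinkedHashMap<>();
        for (String nt : G.keySet()) follow.put(nt, new TreeSet<>());
        follow.get("S").add(END);               // start symbol is followed by end of input
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : G.entrySet())
                for (List<String> rhs : e.getValue())
                    for (int i = 0; i < rhs.size(); i++) {
                        String sym = rhs.get(i);
                        if (!G.containsKey(sym)) continue;
                        Set<String> rest = firstOf(rhs.subList(i + 1, rhs.size()), first);
                        boolean restNullable = rest.remove(EPS);
                        changed |= follow.get(sym).addAll(rest);
                        if (restNullable)       // inherit FOLLOW of the left-hand side
                            changed |= follow.get(sym)
                                    .addAll(new ArrayList<>(follow.get(e.getKey())));
                    }
        }
        return follow;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> first = first();
        System.out.println("FIRST  = " + first);
        System.out.println("FOLLOW = " + follow(first));
    }
}
```

The resulting sets are only as reliable as the reading of the grammar stated above; a different reading of any production changes them.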
(d) When is a grammar ambiguous? Your answer should define an ambiguous grammar, and discuss a
concrete example of that grammar's ambiguity. [4]
Total for Question 3: [32]



4. AST and Semantic Analysis


(a) Provide at least two weaknesses of a hash-based symbol table. [6]
(b) Write a context-free grammar G to define a list of custom record declarations. A single record
declaration is as follows:
record{
memberList
} recName;

where record is a keyword; memberList is a list of semicolon-separated member declarations, each
member being defined as type memberName; recName is the name of the declared record. A member
can have type int, float, string, or another record. An empty record (i.e. one without any
member) is not allowed. Examples of record declarations are given below:
record{
int age;
string firstName;
string lastName;
} person;

record{
person manager;
int size;
string teamName;
} team;

[16]
(c) Consider the declaration of records below, which deliberately contains a fault. Answer the following
questions.
1. Describe what the fault is.
2. Can a parser detect the fault? Justify your answer.
3. With a carefully implemented compiler of this language, describe when and how the fault will
be recognised by the compiler.

record{
string firstName;
string lastName;
string country;
} person;

record{
string title;
string isbn;
pearson author;
} book;

[10]
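The kind of check a semantic analysis pass might run over record declarations can be sketched as a symbol-table lookup of each member's type name. The sketch below is an illustration only: it assumes that a record name becomes usable as a type once its own declaration has been processed (declare before use), and the names RecordTypeCheck, BUILT_IN and undeclaredTypes are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch of a member-type lookup against a symbol table of known type names. */
final class RecordTypeCheck {
    static final Set<String> BUILT_IN = Set.of("int", "float", "string");

    // decls maps each record name to its member type names, in source order.
    // Returns one message per member whose type name is not known at that point.
    static List<String> undeclaredTypes(Map<String, List<String>> decls) {
        Set<String> known = new HashSet<>(BUILT_IN);
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, List<String>> d : decls.entrySet()) {
            for (String type : d.getValue()) {
                if (!known.contains(type)) {
                    errors.add(type + " in record " + d.getKey());
                }
            }
            known.add(d.getKey());   // assumed: usable as a type after its declaration
        }
        return errors;
    }

    public static void main(String[] args) {
        Map<String, List<String>> decls = new LinkedHashMap<>();
        decls.put("person", List.of("string", "string", "string"));
        decls.put("book", List.of("string", "string", "pearson"));
        System.out.println(undeclaredTypes(decls)); // [pearson in record book]
    }
}
```

An insertion-ordered map is used so that declarations are visited in source order, which the declare-before-use assumption depends on.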
Total for Question 4: [32]

COMP2010 -5- END OF EXAM
