Attribute Grammars PDF
whole_part
= whole_part digit
| {digit} digit
;
The names of the node classes for these rules are AWholePart and ADigitWholePart. Each name is formed
from the letter A, followed by the name inside the braces (when there is such a name, as in the second
rule above), followed by the non-terminal on the left-hand side of the production rule (whole_part), with
underscores removed and each word capitalized. The name of the method you have to override and the
type of its parameter are based on these class names.
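The naming convention can be sketched as a small helper. This is only an illustration: the class NodeNames and its methods are hypothetical and are not part of SableCC, which applies this naming itself when it generates the node classes.

```java
// Hypothetical helper (not part of SableCC) that mimics the naming rule described above.
final class NodeNames {
    // Converts an underscore-separated grammar name such as "whole_part" to "WholePart".
    static String camel(String name) {
        StringBuilder sb = new StringBuilder();
        for (String piece : name.split("_")) {
            if (piece.isEmpty()) continue;
            sb.append(Character.toUpperCase(piece.charAt(0)));
            sb.append(piece.substring(1));
        }
        return sb.toString();
    }

    // altName is the name inside the braces, or null when the alternative has none.
    static String nodeClassName(String nonTerminal, String altName) {
        String alt = (altName == null) ? "" : camel(altName);
        return "A" + alt + camel(nonTerminal);
    }

    // Name of the method to override: prefix is "in" (before) or "out" (after traversal).
    static String methodName(String prefix, String nonTerminal, String altName) {
        return prefix + nodeClassName(nonTerminal, altName);
    }
}
```

For example, NodeNames.methodName("out", "whole_part", "digit") yields the method name outADigitWholePart, whose parameter type is ADigitWholePart.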
Attribute Evaluation
Attribute evaluation rules are implemented and the order of evaluation is determined through method
overrides. For example, in order to insert an evaluation rule before traversing an AWholePart node we
would have to override method inAWholePart. Similarly, to insert an evaluation rule after traversing an
AWholePart node, we have to override outAWholePart. Both of these methods take a single argument of
type AWholePart (see Figure 1).
We only have to override a method when we need to insert some action either before or after the traversal
of the parent node's children. The following shows a small sample of such a class. The superclass
DepthFirstAdapter does a depth first traversal of the parse tree generated by SableCC. This example
shows the method overrides required if actions have to be inserted both before and after the traversal of an
AWholePart node and an ADigitWholePart node.
________________________________________________________________________________
import analysis.*;
import node.*;

class YourTreeDecorator extends DepthFirstAdapter {
    // These methods are public in DepthFirstAdapter, so the overrides must be public too.
    @Override
    public void inAWholePart(AWholePart node) {
        // your code for execution prior to traversing an AWholePart node.
    }
    @Override
    public void outAWholePart(AWholePart node) {
        // your code for execution after traversing an AWholePart node.
    }
    @Override
    public void inADigitWholePart(ADigitWholePart node) {
        // your code for execution prior to traversing an ADigitWholePart node.
    }
    @Override
    public void outADigitWholePart(ADigitWholePart node) {
        // your code for execution after traversing an ADigitWholePart node.
    }
}
Figure 1: A program fragment of a class to traverse parse trees.
____________________________________________________________
Usually "in" methods have to be overridden to implement inherited attributes, and "out" methods must be
overridden to implement synthesized attributes. For example, by definition, a synthesized attribute of a
parent node is computed from the attributes of its children. Therefore, the child nodes must have been
traversed (so the attributes of the children have been computed) prior to computing the attribute of the
parent. Hence, the evaluation rules for computing a synthesized attribute for a parent must be evaluated
after evaluating the rules for (i.e. traversing) the child nodes. To do this the "out" methods would have to
be overridden. Figure C gives an example of the computation of a synthesized attribute.
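The ordering argument can be illustrated without SableCC. The sketch below uses hand-written stand-in classes (MiniNode, MiniWalker and TraversalOrderDemo are hypothetical names, not generated code) to record the order in which the "in" and "out" hooks fire during a depth-first traversal. As a simplification, the stand-in also fires the hooks for the leaf nodes that play the role of tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written stand-in for a parse tree node; SableCC generates its own node classes.
class MiniNode {
    final String label;
    final List<MiniNode> children = new ArrayList<>();
    MiniNode(String label, MiniNode... kids) {
        this.label = label;
        for (MiniNode k : kids) children.add(k);
    }
}

// Stand-in for a depth-first adapter: "in" fires before the children, "out" after them.
class MiniWalker {
    final List<String> trace = new ArrayList<>();
    void in(MiniNode n)  { trace.add("in " + n.label); }
    void out(MiniNode n) { trace.add("out " + n.label); }
    void visit(MiniNode n) {
        in(n);
        for (MiniNode child : n.children) visit(child);
        out(n);
    }
}

class TraversalOrderDemo {
    static List<String> run() {
        // Tree for the numeral "12": the outer node uses whole_part = whole_part digit,
        // the inner node uses whole_part = {digit} digit.
        MiniNode inner = new MiniNode("ADigitWholePart", new MiniNode("digit 1"));
        MiniNode outer = new MiniNode("AWholePart", inner, new MiniNode("digit 2"));
        MiniWalker walker = new MiniWalker();
        walker.visit(outer);
        return walker.trace;
    }

    public static void main(String[] args) {
        for (String line : run()) System.out.println(line);
    }
}
```

In the recorded trace, "out ADigitWholePart" appears before "out AWholePart", so an attribute set in the child's "out" hook is already available when the parent's "out" hook runs.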
Setting and Retrieving Attributes
Before we can implement an attribute grammar, it is important that we understand how to set and retrieve
attributes; this will be important for labs 4 and 5 and for the final exam. The calls of methods setOut and
getOut allow us to set an attribute for a particular node and then to later retrieve that attribute. In the
examples that follow (and in labs 4 and 5), only one attribute will be associated with any particular node.
Consider, for example, the following production rule and its associated evaluation rule. The subscripts
distinguish the two occurrences of whole_part in the production rule: subscript 0 denotes the occurrence
on the left-hand side and subscript 1 the occurrence on the right-hand side.
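SableCC Syntax Specification        Evaluation Rule
whole_part = whole_part digit       whole_part0.value = whole_part1.value * 10 + digit.value

[The method listing for this figure, with lines numbered (1) through (5), is missing from this copy.]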
Figure 2: An example of a production rule and its associated action; the method above shows how the
action would be implemented.
_____________________________________________________________________________________
Figure 2 illustrates the way that attributes are set and then later retrieved. The node parameter in the
method above represents the syntax tree for the parent node, the left-hand side of the above rule. The
SableCC tool generates methods so the children of a syntax tree node can easily be retrieved. For example,
the above rule has two children (whole_part and digit); they can be retrieved through calls to
getWholePart() and getDigit() (see lines 1 and 3 above).
Line 1 of the method in Figure 2 shows how an attribute is retrieved. The method getOut retrieves the
attribute of its argument, in this case, the attribute of node.getWholePart() (i.e., whole_part1.value).
The attribute for the node representing whole_part1 must have been set previously by another rule;
otherwise the value attribute will be null. In this example, the attribute has type Long (the
built-in Java class). An attribute can be an instance of any user-defined or built-in class (i.e., any
subclass of Object). Usually attributes have to be downcast, as in line 1, since method getOut returns an
Object.
Line 3 of the above method shows how the string value of a token can be retrieved. Retrieving the string
value of a token, through method getText(), will be needed for tokens such as identifiers in labs 4 and 5.
Method getText() is only defined for token classes, not for classes representing non-terminals. Also, there
is no method named translateDigit generated by the SableCC tool, so it would have to be implemented
elsewhere (see Figure C below for the complete example).
Line 4 of the method shows the action needed to compute the value of the new attribute (a new Long
object).
Line 5 shows how the attribute for the parent node is set. In this case, the attribute for the non-terminal on
the left-hand side is set to the value computed in line 4, which must have type Long.
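The steps described above can be sketched in a self-contained form. Everything below is an assumption made for illustration: TDigitStub and WholePartStub are hand-written stand-ins for the token and node classes that SableCC would generate, getOut and setOut are emulated with a map rather than inherited from the generated adapter, and translateDigit is given one possible implementation. The step numbers in the comments follow the prose; step (2) is not described in the text and is omitted.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in (assumption) for the generated token class of digit.
class TDigitStub {
    private final String text;
    TDigitStub(String text) { this.text = text; }
    String getText() { return text; }
}

// Stand-in (assumption) for the generated whole_part node classes.
class WholePartStub {
    private final WholePartStub wholePart; // null for the {digit} alternative
    private final TDigitStub digit;
    WholePartStub(WholePartStub wholePart, TDigitStub digit) {
        this.wholePart = wholePart;
        this.digit = digit;
    }
    WholePartStub getWholePart() { return wholePart; }
    TDigitStub getDigit() { return digit; }
}

class WholePartDecorator {
    // Emulates the attribute table behind getOut/setOut: one attribute per node.
    private final Map<Object, Object> out = new HashMap<>();
    Object getOut(Object node) { return out.get(node); }
    void setOut(Object node, Object value) { out.put(node, value); }

    // Assumed helper: converts the text of a digit token to its numeric value.
    static long translateDigit(String text) { return Long.parseLong(text); }

    // whole_part0.value = whole_part1.value * 10 + digit.value
    void outAWholePart(WholePartStub node) {
        Long left = (Long) getOut(node.getWholePart());              // (1) retrieve and downcast
        String text = node.getDigit().getText();                     // (3) string value of the token
        Long value = Long.valueOf(left * 10 + translateDigit(text)); // (4) compute the new attribute
        setOut(node, value);                                         // (5) set the parent's attribute
    }

    // whole_part.value = digit.value
    void outADigitWholePart(WholePartStub node) {
        setOut(node, Long.valueOf(translateDigit(node.getDigit().getText())));
    }
}
```

Calling outADigitWholePart on the node for "1" and then outAWholePart on the node for "12" leaves a Long with value 12 attached to the parent. Calling them in the opposite order fails, because the child's attribute is still null when the parent reads it.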
Figures A, B, C, and D give a complete example of an attribute grammar (Figure A) and how it would be
implemented using a SableCC parser specification (Figure B) and a Java program (Figures C and D) that
decorates the tree.
Important Note on Exceptions
When using the framework generated by SableCC, we always have to be very careful about setting the
attributes. We have to know what type each attribute should have so that we set it properly. Because
setOut accepts any Object, the compiler reports no error if we set an attribute to the wrong type (or
fail to set it at all); the mistake only shows up at run time.
If the parent expects the child node to have an attribute, then the action associated with the child node must
properly set that attribute using setOut. Note carefully the attribute types in Figure C below. PWholePart
always has an attribute with type Long because its two subclasses, AWholePart and ADigitWholePart
(the nodes representing the two alternatives), set the attribute to a Long object. However, PFractionPart,
i.e., AFractionPart and ADigitFractionPart, always has an attribute with type Double. Also, the type of
the attribute for PNumeral is always Double.
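The following sketch shows why such mistakes surface only at run time. It does not use SableCC; the class AttributeTypeDemo and its method are hypothetical, and the Object parameter stands for whatever getOut would return.

```java
// Hypothetical demonstration (no SableCC needed) of reading an attribute as a Long.
class AttributeTypeDemo {
    // Reports what happens when a parent treats the given attribute as a Long.
    static String tryReadAsLong(Object attribute) {
        try {
            Long value = (Long) attribute;        // compiles for any Object
            return "value " + value.longValue();  // fails here if the attribute is null
        } catch (ClassCastException e) {
            return "ClassCastException";
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryReadAsLong(Long.valueOf(7)));     // attribute set correctly
        System.out.println(tryReadAsLong(Double.valueOf(7.0))); // child set a Double
        System.out.println(tryReadAsLong(Integer.valueOf(7)));  // child set an Integer
        System.out.println(tryReadAsLong(null));                // child never set it
    }
}
```

Note the Integer case: passing an int literal where an Object is expected boxes it to an Integer, not a Long, so the parent's cast to Long fails even though the number itself is correct.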
A Simple Example
Figures A, B, and C give a very simple example of an attribute grammar and how it would be implemented
using the SableCC tool and the Java programming language. This attribute grammar translates a numeral
into its base 10 decimal equivalent. Notice that the syntax directed definition (attribute grammar) in Figure
A is more mathematical and abstract than the implementation given in Figure C; it is more abstract in the
sense that the type of the value attribute is not specified (we just know that it is a decimal number).
However, in the implementation in Figure C, the type of the attribute is very important. If a child node
fails to set its attribute, or sets it to the wrong type, then the parent will get a null pointer
exception or class cast exception. Carefully study this example before starting lab 4. Lab 4 assumes that
this example is well understood.
______________________________________________________________________________________
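Production Rule      Evaluation Rule
N -> W               N.value = W.value
N -> W .             N.value = W.value
N -> W . F           N.value = W.value + F.value
N -> . F             N.value = F.value
W -> W D             W0.value = W1.value * 10 + D.value
W -> D               W.value = D.value
F -> D F             F0.value = (D.value + F1.value) / 10
F -> D               F.value = D.value / 10
D -> 0               D.value = 0
D -> 1               D.value = 1
D -> 2               D.value = 2
D -> 3               D.value = 3
D -> 4               D.value = 4
D -> 5               D.value = 5
D -> 6               D.value = 6
D -> 7               D.value = 7
D -> 8               D.value = 8
D -> 9               D.value = 9
Figure A: An attribute grammar that translates a numeral into its base 10 decimal equivalent.
______________________________________________________________________________________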
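The evaluation rules can be checked by hand with a small program that applies them directly to a string of digits instead of a parse tree. This is a sketch under stated assumptions, not the SableCC implementation: the class NumeralRules is hypothetical, and the rules used for the fraction part, F.value = D.value / 10 for F -> D and F0.value = (D.value + F1.value) / 10 for F -> D F, are assumed from ordinary base 10 arithmetic.

```java
// Hypothetical helper that applies the evaluation rules to strings rather than parse trees.
class NumeralRules {
    // D.value for a single digit character.
    static long digit(char c) {
        if (c < '0' || c > '9') throw new IllegalArgumentException("not a digit: " + c);
        return c - '0';
    }

    // W -> W D : W0.value = W1.value * 10 + D.value;  W -> D : W.value = D.value
    static long whole(String digits) {
        int last = digits.length() - 1;
        if (last == 0) return digit(digits.charAt(0));
        return whole(digits.substring(0, last)) * 10 + digit(digits.charAt(last));
    }

    // Assumed rules: F -> D F : F0.value = (D.value + F1.value) / 10;  F -> D : F.value = D.value / 10
    static double fraction(String digits) {
        double d = digit(digits.charAt(0));
        if (digits.length() == 1) return d / 10;
        return (d + fraction(digits.substring(1))) / 10;
    }

    // N -> W | W . | W . F | . F
    static double numeral(String text) {
        if (text.isEmpty() || text.equals(".")) {
            throw new IllegalArgumentException("not a numeral: \"" + text + "\"");
        }
        int dot = text.indexOf('.');
        if (dot < 0) return whole(text);                                   // N -> W
        double w = (dot == 0) ? 0 : whole(text.substring(0, dot));         // absent in N -> . F
        double f = (dot == text.length() - 1) ? 0 : fraction(text.substring(dot + 1)); // absent in N -> W .
        return w + f;
    }
}
```

For the numeral 12.5, whole("12") evaluates to 1 * 10 + 2 = 12, fraction("5") evaluates to 5 / 10 = 0.5, and numeral("12.5") returns their sum. The whole part is evaluated from left to right because W is left recursive, while the fraction part is evaluated from right to left because F is right recursive.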
// ************************ Helpers **********************************
Helpers
    all = [0..0xffff];
    space = ' ';
    lf = 0x000a; // line feed
    cr = 0x000d; // carriage return
    ff = 0x000c; // form feed
    ht = 0x0009; // tab
    new_line = lf | cr | cr lf;
    not_cr_lf = [all - [cr + lf]];

// ************************ Tokens ***********************************
Tokens
    white_space = (space | ht | ff | new_line);
    digit = ['0' .. '9'];
    decimal_point = '.';

// Ignored Tokens

// *********************** Productions ******************************
Productions
    goal
        = white_space* numeral [ws]:white_space*
        ;
    numeral
        = {whole} whole_part decimal_point?
        | {fraction} whole_part? decimal_point fraction_part
        ;
    whole_part
        = whole_part digit
        | {digit} digit
        ;
    fraction_part
        = digit fraction_part
        | {digit} digit
        ;
Figure B: The parser specification for the attribute grammar given in Figure A.
_____________________________________________________________________________________
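import java.io.*;
import lexer.*;
import parser.*;
import node.*;

// The class header and the statements marked (1) to (6) are missing from this copy.
// The class name Main and the lines commented "assumed" are a reconstruction.
public class Main {
    public static void main(String[] args) {
        try {
            // assumed: the lexer reads the numeral from standard input
            Lexer lex = new Lexer(new PushbackReader(new InputStreamReader(System.in), 1024));
            System.out.println("Starting Parser");
            Parser p = new Parser(lex);
            // assumed: parse the input; the tree decorator would then be applied to
            // the resulting tree with tree.apply(...)
            Start tree = p.parse();
        } catch (ParserException e) {
            System.out.println(e.getMessage());
        } catch (Exception e) {
            System.out.println(e.getMessage());
            e.printStackTrace();
        }
    }
}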
Figure D: The implementation of the attribute grammar given in Figure A based on the parser
specification of Figure B.
_____________________________________________________________________________________
Exercises
1. (a) The implementation given in Figures B and C defines digit (D) as a single token rather than as
10 separate tokens and production rules. Define the grammar specification for digit as it is shown in
Figure A, that is, define the ten SableCC production rules and ten tokens so the grammar specification
matches the one for the non-terminal D as shown in Figure A.
1. (b) Implement the attribute grammar for digit based on the SableCC parser specification you
defined in part (a). That is, define the class DigitTranslator that calculates the value of a digit. The
type of the attributes should be Long objects. Hint: Do NOT use method translateDigit; the
implementation can be done much more simply without this method.
1. (c) If the specification of the grammar given in Figure B used the ten rules and tokens you defined
in part (a), how would Figure C change?
2. (a) Define an attribute grammar (CFG and evaluation rules) that translates binary numerals into
decimal numbers (i.e., create an attribute grammar like the one in Figure A, except it should be for
binary numerals). Binary numerals are strings over the alphabet { 0, 1, . }. Binary numerals can have
a binary point that separates the binary whole number from the binary fraction. The value of a binary
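numeral is computed using the usual base 2 arithmetic. For example, 101.011 has the value 5.375 as a
decimal number, i.e., 1*2^2 + 0*2^1 + 1*2^0 + 0*2^-1 + 1*2^-2 + 1*2^-3 = 4 + 1 + 0.25 + 0.125 = 5.375.
Binary numerals have the form
(0 | 1)+
| (0 | 1)+ .
| (0 | 1)* . (0 | 1)+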
2. (b) Implement the attribute grammar defined in part (a) (SableCC parser specification and tree
decorator class) that computes the decimal value of binary numerals (i.e., compute the value
attribute for the production rules in the CFG defined in part (a)).