IMPLEMENTING ATTRIBUTE GRAMMARS

Using the SableCC Parser Generator


Attribute Grammars
An attribute grammar is a context free grammar augmented with attributes and evaluation rules. The
semantic and non-context-free aspects of a language can be defined by associating one or more attributes
with the symbols of a grammar. An attribute is a semantic property of a grammar symbol, and both
terminal and non-terminal symbols can have semantic attributes. The evaluation rules specify how to
calculate the attributes associated with each grammar symbol. Evaluation rules are sometimes called
semantic actions.
An attribute grammar is a high-level specification of the translation rules for a language; it hides many
implementation details such as the order of evaluation and the programming language specific types of its
attributes. Parser generating tools, like SableCC, make it easy to implement an attribute grammar.
There are two kinds of attributes, inherited and synthesized. Intuitively, a synthesized attribute is
determined from information arising from the internal constituents of a construct (i.e., the right-hand side
of a production rule), whereas an inherited attribute is determined by information coming from the external
context of a construct.
Another way to think about synthesized versus inherited attributes is in terms of the parse tree (i.e., syntax
tree). Production rules define the parse tree structure, that is, the symbol on the left-hand side of a
production rule is the parent node and the symbols on the right hand side are its children. A synthesized
attribute is calculated using the attributes of the node's children. For example, the type of an expression is
calculated using the types of its sub-expressions. Specifically, in Java, the type of the expression 1 + 0.01
(double) is calculated from the types of its sub-expressions 1 (int) and 0.01 (double), its internal constituents.
On the other hand, an inherited attribute is an attribute of a child node that is calculated from the attributes
of its parent and possibly its siblings. Intuitively, calculating the value of inherited attributes involves, at
least in part, attribute information flowing from parent to child in the parse tree; however information flows
only from child to parent in the calculation of synthesized attributes.
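The following is a minimal sketch of a synthesized attribute, independent of SableCC. The names Expr, Literal, Sum, and Type are hypothetical and exist only for this illustration. The type of a sum is computed purely from the types of its two children, mirroring Java's rule that int + double yields double.

```java
final class SynthesizedTypeSketch {
    enum Type { INT, DOUBLE }

    // Hypothetical expression node: every node can report its type attribute.
    interface Expr { Type type(); }

    // A literal knows its own type; it has no children.
    static final class Literal implements Expr {
        private final Type type;
        Literal(Type type) { this.type = type; }
        public Type type() { return type; }
    }

    // A sum has two children; its type is synthesized from theirs.
    static final class Sum implements Expr {
        private final Expr left;
        private final Expr right;
        Sum(Expr left, Expr right) { this.left = left; this.right = right; }
        public Type type() {
            // Information flows only from the children up to the parent.
            boolean anyDouble = left.type() == Type.DOUBLE || right.type() == Type.DOUBLE;
            return anyDouble ? Type.DOUBLE : Type.INT;
        }
    }

    public static void main(String[] args) {
        // Models the expression 1 + 0.01.
        Expr e = new Sum(new Literal(Type.INT), new Literal(Type.DOUBLE));
        System.out.println(e.type()); // prints DOUBLE
    }
}
```

An inherited attribute would instead be passed down, for example as an extra argument supplied by the parent when it asks a child for its result.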
The process of computing the attribute values of the nodes of a parse tree is called annotating or decorating
the tree. For example, running your lab 5 type checker decorates the expression nodes of the syntax tree
with their types, that is, it annotates the sub-expression nodes with their types and then uses these types to
determine and annotate the parent expression node with its type. Running the ParserDriver.main method
shown in Figure D also decorates a parse tree. The details will be explained in the following sections.
Implementing Attribute Grammars Using SableCC
When using SableCC, we create attribute grammars as follows. First we specify a parser for a context free
grammar. The parser generates a syntax tree that can be walked by classes defined in the analysis
package. We add evaluation rules by overriding methods of one of these analysis (tree traversal) classes,
e.g., DepthFirstAdapter. We associate attributes with nodes using the setOut method provided by the
tree traversal classes. We retrieve these attributes using the method getOut. Methods setOut and getOut
are used to set and retrieve synthesized attributes.
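To make the behaviour of setOut and getOut concrete, here is a small stand-alone sketch. It assumes that attributes are kept in a table keyed by node, one Object per node; the classes Node and Walker below are hypothetical stand-ins, not the classes SableCC generates.

```java
import java.util.HashMap;
import java.util.Map;

final class AttributeTableSketch {
    // Stand-in for a syntax tree node (hypothetical, not a generated class).
    static final class Node {
        final String label;
        Node(String label) { this.label = label; }
    }

    // Stand-in for a tree traversal class that stores one attribute per node.
    static class Walker {
        private final Map<Node, Object> out = new HashMap<>();

        public void setOut(Node node, Object value) { out.put(node, value); }

        // Returns null when no attribute was ever set for this node.
        public Object getOut(Node node) { return out.get(node); }
    }

    public static void main(String[] args) {
        Walker walker = new Walker();
        Node digit = new Node("digit");

        walker.setOut(digit, Long.valueOf(7));
        Long value = (Long) walker.getOut(digit); // downcast: getOut returns Object
        System.out.println(value);                // prints 7

        // A node whose attribute was never set yields null.
        System.out.println(walker.getOut(new Node("other"))); // prints null
    }
}
```

Because the table stores plain Object values, the caller is responsible for remembering which type each node's attribute has.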
To implement an attribute grammar, all we need to know is which class to extend and which methods to
override. The class to be extended is the DepthFirstAdapter class generated by SableCC and located in
the analysis package (subdirectory); this class does a depth first traversal of the parse tree although it does
not do anything during this traversal. To implement an attribute grammar, we will have to add evaluation
rules by overriding some of the methods of DepthFirstAdapter, that is, we will need to create a subclass
that overrides these methods. The names of the methods we override depend on the names of the classes
generated by SableCC. Recall that each grammar rule is represented by a class in the node package; the
names of these classes are determined by the grammar specification. (Figure C gives many examples of
how we can override the methods of DepthFirstAdapter.)
For example, consider the following SableCC grammar rules.

whole_part = whole_part digit
           | {digit} digit
           ;
The names of the node classes for these rules are AWholePart and ADigitWholePart. The names come
from the non-terminal on the left hand side of the two production rules (whole_part) prefixed by A and
the name inside the braces (when there is such a name as in the second rule above). The name of the
method you have to override and the type of its parameter is based on these class names.
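The naming convention just described can be sketched as a small helper. This only illustrates the rule as stated in the text (A, then the name in braces if any, then the non-terminal, each in capitalized form); it is not code taken from SableCC, and the method names camel and nodeClassName are hypothetical.

```java
final class NodeClassNameSketch {
    // Converts a grammar name such as "whole_part" to "WholePart".
    static String camel(String name) {
        StringBuilder sb = new StringBuilder();
        for (String part : name.split("_")) {
            if (part.isEmpty()) {
                continue; // skip empty pieces from doubled underscores
            }
            sb.append(Character.toUpperCase(part.charAt(0)));
            sb.append(part.substring(1));
        }
        return sb.toString();
    }

    // alternativeName is null when the alternative has no {name} in braces.
    static String nodeClassName(String production, String alternativeName) {
        String alt = (alternativeName == null) ? "" : camel(alternativeName);
        return "A" + alt + camel(production);
    }

    public static void main(String[] args) {
        System.out.println(nodeClassName("whole_part", null));    // prints AWholePart
        System.out.println(nodeClassName("whole_part", "digit")); // prints ADigitWholePart
    }
}
```

The methods to override are then named by prefixing in or out to the class name, for example outADigitWholePart.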
Attribute Evaluation
Attribute evaluation rules are implemented and the order of evaluation is determined through method
overrides. For example, in order to insert an evaluation rule before traversing an AWholePart node we
would have to override method inAWholePart. Similarly, to insert an evaluation rule after traversing an
AWholePart node, we have to override outAWholePart. Both of these methods take a single argument of
type AWholePart (See Figure 1).
We only have to override a method when we need to insert some action either before or after the traversal
of the parent node's children. The following shows a small fragment of such a class. The superclass
DepthFirstAdapter does a depth first traversal of the parse tree generated by SableCC. This example
shows the method overrides required if actions have to be inserted both before and after the traversal of an
AWholePart node and an ADigitWholePart node.
________________________________________________________________________________
class YourTreeDecorator extends DepthFirstAdapter {
    // The overrides must be public because the overridden methods are public.
    public void inAWholePart(AWholePart node) {
        // your code for execution prior to traversing an AWholePart node.
    }
    public void outAWholePart(AWholePart node) {
        // your code for execution after traversing an AWholePart node.
    }
    public void inADigitWholePart(ADigitWholePart node) {
        // your code for execution prior to traversing an ADigitWholePart node.
    }
    public void outADigitWholePart(ADigitWholePart node) {
        // your code for execution after traversing an ADigitWholePart node.
    }
}
Figure 1: A program fragment of a class to traverse parse trees.
____________________________________________________________
Usually "in" methods have to be overridden to implement inherited attributes, and "out" methods must be
overridden to implement synthesized attributes. For example, by definition, a synthesized attribute of a
parent node is computed from the attributes of its children. Therefore, the child nodes must have been
traversed (so the attributes of the children have been computed) prior to computing the attribute of the
parent. Hence, the evaluation rules for computing a synthesized attribute for a parent must be evaluated
after evaluating the rules for (i.e. traversing) the child nodes. To do this the "out" methods would have to
be overridden. Figure C gives an example of the computation of a synthesized attribute.
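The order in which "in" and "out" actions run can be seen in the following stand-alone sketch. The Node and Walker classes are hypothetical and only imitate a depth-first traversal with hooks before and after the children; they are not the generated SableCC classes.

```java
import java.util.ArrayList;
import java.util.List;

final class InOutOrderSketch {
    // Hypothetical tree node with a name and any number of children.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name, Node... kids) {
            this.name = name;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    // Hypothetical depth-first walker that records when each hook runs.
    static class Walker {
        final List<String> log = new ArrayList<>();

        void in(Node node)  { log.add("in " + node.name); }
        void out(Node node) { log.add("out " + node.name); }

        final void visit(Node node) {
            in(node);                      // before the children
            for (Node child : node.children) {
                visit(child);
            }
            out(node);                     // after the children
        }
    }

    public static void main(String[] args) {
        // Shape of the tree for a two-digit whole part.
        Node tree = new Node("AWholePart", new Node("ADigitWholePart"));
        Walker walker = new Walker();
        walker.visit(tree);
        System.out.println(walker.log);
        // prints [in AWholePart, in ADigitWholePart, out ADigitWholePart, out AWholePart]
    }
}
```

The child's "out" action runs before the parent's "out" action, which is why a synthesized attribute of the parent can safely read the child's attribute there.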
Setting and Retrieving Attributes
Before we can implement an attribute grammar, it is important that we understand how to set and retrieve
attributes; this will be important for labs 4 and 5 and for the final exam. The calls of methods setOut and
getOut allow us to set an attribute for a particular node and then to later retrieve that attribute. In the
examples that follow (and in labs 4 and 5), only one attribute will be associated with any particular node.
Consider for example the following production rule and its associated evaluation rule. The subscripts are so
the two occurrences of whole_part in the production rule (left and right-hand side) can be distinguished
(the subscript 0 denotes the occurrence on the left-hand side).
SableCC Syntax Specification        Evaluation Rule
whole_part = whole_part digit       whole_part0.value = whole_part1.value * 10 + translateDigit( digit )

public void outAWholePart(AWholePart node) {
    Long wholeValue = (Long) getOut(node.getWholePart());          // (1)
    long wValue = wholeValue.longValue();                          // (2)
    long digitValue = translateDigit(node.getDigit().getText());   // (3)
    Long newValue = new Long(wValue * 10 + digitValue);            // (4)
    setOut(node, newValue);                                        // (5)
}

Figure 2: An example of a production rule and its associated action; the method above shows how the
action would be implemented.
_____________________________________________________________________________________
Figure 2 illustrates the way that attributes are set and then later retrieved. The node parameter in the
method above represents the syntax tree for the parent node, the left-hand side of the above rule. The
SableCC tool generates methods so the children of a syntax tree node can easily be retrieved. For example,
the above rule has two children (whole_part and digit); they can be retrieved through calls to
getWholePart() and getDigit() (see lines 1 and 3 above).
Line 1 of the method in Figure 2 shows how an attribute is retrieved. The method getOut retrieves the
attribute of its argument, in this case, the attribute of node.getWholePart() (i.e., whole_part1.value).
The attribute for the node representing whole_part1 must have been set previously by another rule;
otherwise the value attribute will be null. In this example, the attribute has type Long (the
built-in Java class). An attribute can be any user-defined type or built-in class (i.e., a subclass of Object).
Usually attributes will have to be downcast as in line 1 since method getOut returns an Object.
Line 3 of the above method shows how the string value of a token can be retrieved. Retrieving the string
value of a token, through method getText(), will be needed for tokens such as identifiers in labs 4 and 5.
Method getText() is only defined for token classes not for classes representing non-terminals. Also, there
is no method named translateDigit generated by the SableCC tool, so it would have to be implemented
elsewhere (see Figure C below for the complete example).
Line 4 of the method shows the action needed to compute the value of the new attribute (a new Long
object).
Line 5 shows how the attribute for the parent node is set. In this case, the attribute for the non-terminal on
the left-hand side is set to the value computed in line 4 which must have type Long.
Figures A, B, C, and D give a complete example of an attribute grammar (Figure A) and how it would be
implemented using a SableCC parser specification (Figure B) and a Java program (Figures C and D) that
decorates the tree.
Important Note on Exceptions
When using the framework generated by SableCC, we always have to be very careful about setting the
attributes. That means we have to make sure that we know what the type of the attribute should be so that
we set it properly. This is because attributes can only have type Object, so there are no error messages if
we make a mistake when setting an attribute (or when failing to set an attribute).

If the parent expects the child node to have an attribute, then the action associated with the child node must
properly set that attribute using setOut. Note carefully the attribute types in Figure C below. PWholePart
always has an attribute with type Long because its two subclasses, AWholePart and ADigitWholePart
(the nodes representing the two alternatives), set the attribute to a Long object. However, PFractionPart,
i.e., AFractionPart and ADigitFractionPart, always has an attribute with type Double. Also, the type of
the attribute for PNumeral is always Double.
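Because the compiler cannot check attribute types, mistakes only show up at run time. The sketch below illustrates the two typical failures under the assumption that attributes live in an Object-valued table; the table out and the helper tryRead are hypothetical names used only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class AttributeErrorSketch {
    // Hypothetical attribute table; node names stand in for node objects.
    static final Map<String, Object> out = new HashMap<>();

    // Reads an attribute the way a parent rule would, expecting a Long.
    static String tryRead(String node) {
        try {
            Long value = (Long) out.get(node);   // downcast, as after getOut
            return "value " + value.longValue(); // fails if value is null
        } catch (NullPointerException e) {
            return "NullPointerException";       // attribute was never set
        } catch (ClassCastException e) {
            return "ClassCastException";         // attribute has the wrong type
        }
    }

    public static void main(String[] args) {
        out.put("whole", Long.valueOf(12));       // correct type
        out.put("fraction", Double.valueOf(0.5)); // a Double, not a Long

        System.out.println(tryRead("whole"));    // prints value 12
        System.out.println(tryRead("fraction")); // prints ClassCastException
        System.out.println(tryRead("missing"));  // prints NullPointerException
    }
}
```

In a real tree decorator these exceptions would normally not be caught; they are caught here only to show which one occurs in each case.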
A Simple Example
Figures A, B, and C give a very simple example of an attribute grammar and how it would be implemented
using the SableCC tool and the Java programming language. This attribute grammar translates a numeral
into its base 10 decimal equivalent. Notice that the syntax directed definition (attribute grammar) in Figure
A is more mathematical and abstract than the implementation given in Figure C; it is more abstract in the
sense that the type of the value attribute is not specified (we just know that it is a decimal number).
However, in the implementation in Figure C, the type of the attribute is very important. If the type of the
attribute has not been properly set by any of the children nodes, then the parent will get a null pointer
exception or class cast exception. Carefully study this example before starting lab 4. Lab 4 assumes that
this example is well understood.
______________________________________________________________________________________

Production Rule      Evaluation Rule

N -> W               N.value = W.value
N -> W .             N.value = W.value
N -> W . F           N.value = W.value + F.value
N -> . F             N.value = F.value

W -> W D             W0.value = W1.value * 10 + D.value
W -> D               W.value = D.value

F -> D F             F0.value = (D.value + F1.value) * 0.1
F -> D               F.value = D.value * 0.1

D -> 0               D.value = 0
D -> 1               D.value = 1
D -> 2               D.value = 2
D -> 3               D.value = 3
D -> 4               D.value = 4
D -> 5               D.value = 5
D -> 6               D.value = 6
D -> 7               D.value = 7
D -> 8               D.value = 8
D -> 9               D.value = 9


Figure A: An example of an attribute grammar.
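As a quick check of the evaluation rules in Figure A, the sketch below applies them directly to strings of digits, without building a parse tree. It assumes well-formed input (at least one digit, at most one decimal point), and the method names digit, whole, fraction, and numeral are hypothetical names for the attributes of D, W, F, and N.

```java
final class FigureARulesSketch {
    // D.value: the numeric value of a single digit character.
    static long digit(char c) {
        return c - '0';
    }

    // W.value: the last character is D, everything before it is the inner W.
    static long whole(String w) {
        long d = digit(w.charAt(w.length() - 1));
        if (w.length() == 1) {
            return d;                                          // W -> D
        }
        return whole(w.substring(0, w.length() - 1)) * 10 + d; // W -> W D
    }

    // F.value: the first character is D, everything after it is the inner F.
    static double fraction(String f) {
        long d = digit(f.charAt(0));
        if (f.length() == 1) {
            return d * 0.1;                                    // F -> D
        }
        return (d + fraction(f.substring(1))) * 0.1;           // F -> D F
    }

    // N.value: combine the whole part and the fraction part, if present.
    static double numeral(String s) {
        int dot = s.indexOf('.');
        if (dot < 0) {
            return whole(s);                                   // N -> W
        }
        String w = s.substring(0, dot);
        String f = s.substring(dot + 1);
        double value = 0.0;
        if (!w.isEmpty()) {
            value += whole(w);                                 // N -> W . or N -> W . F
        }
        if (!f.isEmpty()) {
            value += fraction(f);                              // N -> W . F or N -> . F
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(numeral("42.75"));
    }
}
```

For the whole part 42, the rules give W.value = 4 for the inner W and then 4 * 10 + 2 = 42. For the fraction part 75, the innermost F has value 5 * 0.1 = 0.5 and the outer F has value (7 + 0.5) * 0.1 = 0.75, up to floating-point rounding.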


_____________________________________________________________________________________

// ************************ Helpers **********************************
Helpers
  all   = [0..0xffff];
  space = ' ';
  lf    = 0x000a; // line feed
  cr    = 0x000d; // carriage return
  ff    = 0x000c; // form feed
  ht    = 0x0009; // tab

  new_line  = lf | cr | cr lf ;
  not_cr_lf = [all - [cr + lf]];

// ************************ Tokens ***********************************
Tokens
  white_space   = (space | ht | ff | new_line);

// ********************** literals **********************************
  digit         = ['0' .. '9'] ;
  decimal_point = '.' ;

// Ignored Tokens

// *********************** Productions ******************************
Productions
  goal
    = white_space* numeral [ws]:white_space*
    ;
  numeral
    = {whole} whole_part decimal_point?
    | {fraction} whole_part? decimal_point fraction_part
    ;
  whole_part
    = whole_part digit
    | {digit} digit
    ;
  fraction_part
    = digit fraction_part
    | {digit} digit
    ;
Figure B: The parser specification for the attribute grammar given in Figure A.
_____________________________________________________________________________________

import analysis.*;
import node.*;

public class NumeralTranslator extends DepthFirstAdapter {

    // Start and goal simply pass the numeral's value up the tree.
    public void outStart(Start node) {
        Double numValue = (Double) getOut(node.getPGoal());
        setOut(node, numValue);
    }
    public void outAGoal(AGoal node) {
        Double numValue = (Double) getOut(node.getNumeral());
        setOut(node, numValue);
    }

    // numeral: the attribute is always a Double.
    public void outAWholeNumeral(AWholeNumeral node) {
        Long wholeValue = (Long) getOut(node.getWholePart());
        long wValue = wholeValue.longValue();
        setOut(node, new Double(wValue));
    }
    public void outAFractionNumeral(AFractionNumeral node) {
        Double fractionValue = (Double) getOut(node.getFractionPart());
        double fValue = fractionValue.doubleValue();
        double wValue = 0.0;
        if (node.getWholePart() != null) { // this node is optional
            Long wholeValue = (Long) getOut(node.getWholePart());
            wValue = wholeValue.longValue();
        }
        setOut(node, new Double(wValue + fValue));
    }

    // whole_part: the attribute is always a Long.
    public void outAWholePart(AWholePart node) {
        Long wholeValue = (Long) getOut(node.getWholePart());
        long wValue = wholeValue.longValue();
        long digitValue = translateDigit(node.getDigit().getText());
        Long newValue = new Long(wValue * 10 + digitValue);
        setOut(node, newValue);
    }
    public void outADigitWholePart(ADigitWholePart node) {
        long digitValue = translateDigit(node.getDigit().getText());
        setOut(node, new Long(digitValue));
    }

    // fraction_part: the attribute is always a Double.
    public void outAFractionPart(AFractionPart node) {
        long digitValue = translateDigit(node.getDigit().getText());
        Double fractionValue = (Double) getOut(node.getFractionPart());
        double newValue = (digitValue + fractionValue.doubleValue()) * 0.1;
        setOut(node, new Double(newValue));
    }
    public void outADigitFractionPart(ADigitFractionPart node) {
        long digitValue = translateDigit(node.getDigit().getText());
        double newValue = digitValue * 0.1;
        setOut(node, new Double(newValue));
    }

    // Converts the text of a digit token to its numeric value.
    public long translateDigit(String digit) {
        long d = Integer.parseInt(digit);
        return d;
    }
}
Figure C: The implementation of the attribute grammar given in Figure A based on the parser
specification of Figure B.
_____________________________________________________________________________________

Decorating the Syntax Tree


The following Java program shows how to instantiate the parser (line (1)), create a syntax tree representing
the input program (line (2)), and then decorate the tree with its attributes. Line (3) below shows the
instantiation of the tree traversal class object. Line (4) shows how to decorate the tree, i.e., compute the
value attributes for each tree node. Line (5) shows how to get the attribute assigned to the tree node. Line
(6) shows the printing of the attribute value.
The framework generated by SableCC uses the visitor pattern to walk and decorate the syntax tree. The
visitor pattern is described in "Design Patterns: Abstraction and Reuse of Object-Oriented Design" by E.
Gamma, R. Helm, R. E. Johnson, and J. Vlissides. This pattern is also introduced in the Advanced
Software Development course that some of you may have already taken. However, it is NOT necessary to
know how the visitor pattern works in order to implement an attribute grammar.
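For readers who are curious anyway, the core idea of the visitor pattern can be sketched in a few lines. The classes below are hypothetical and only loosely modeled on the tree.apply(interpreter) call used in Figure D; they are not the classes SableCC generates. Each node's apply method calls back the visitor method that matches the node's own class.

```java
final class VisitorSketch {
    // The visitor has one method per node class.
    interface Visitor {
        void caseLeaf(Leaf leaf);
        void casePair(Pair pair);
    }

    // Every node knows how to hand itself to a visitor.
    interface Node {
        void apply(Visitor visitor);
    }

    static final class Leaf implements Node {
        final long value;
        Leaf(long value) { this.value = value; }
        public void apply(Visitor visitor) { visitor.caseLeaf(this); }
    }

    static final class Pair implements Node {
        final Node left;
        final Node right;
        Pair(Node left, Node right) { this.left = left; this.right = right; }
        public void apply(Visitor visitor) { visitor.casePair(this); }
    }

    // A visitor that walks the tree depth first and sums all leaf values.
    static final class Summer implements Visitor {
        long total = 0;
        public void caseLeaf(Leaf leaf) { total += leaf.value; }
        public void casePair(Pair pair) {
            pair.left.apply(this);
            pair.right.apply(this);
        }
    }

    public static void main(String[] args) {
        Node tree = new Pair(new Leaf(4), new Pair(new Leaf(2), new Leaf(1)));
        Summer summer = new Summer();
        tree.apply(summer);
        System.out.println(summer.total); // prints 7
    }
}
```

The benefit of this arrangement is that new tree computations can be added as new visitor classes without changing the node classes.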
Before you begin labs 4 and 5, make sure you understand this example thoroughly; this will save you
a lot of time and effort.
_____________________________________________________________________________________
import java.io.*;
import lexer.*;
import parser.*;
import node.*;

public class ParserDriver {

    public static void main(String[] args)
    {
        try {
            System.out.println("\nStarting Lexer");
            Lexer lex = new Lexer(
                new PushbackReader( new InputStreamReader(
                    new FileInputStream(args[0])),
                    1024));

            System.out.println("Starting Parser");
            Parser p = new Parser(lex);                                 // (1)

            // Parse the input.
            Start tree = p.parse();                                     // (2)

            System.out.println("\nInterpreting decimal numerals\n");

            NumeralTranslator interpreter = new NumeralTranslator();    // (3)
            tree.apply(interpreter);                                    // (4)
            Double val = (Double)interpreter.getOut(tree);              // (5)
            System.out.println(tree.toString() + "--> " + val);         // (6)

        } catch (ParserException e) {
            System.out.println(e.getMessage());
        } catch (Exception e) {
            System.out.println(e.getMessage());
            e.printStackTrace();
        }
    }
}
Figure D: A driver program that parses the input and decorates the resulting syntax tree using the
NumeralTranslator class of Figure C.
_____________________________________________________________________________________

Exercises
1. (a) The implementation given in Figures B and C defines digit (D) as a single token rather than as
10 separate tokens and production rules. Define the grammar specification for digit as it is shown in
Figure A, that is, define the ten SableCC production rules and ten tokens so the grammar specification
matches the one for the non-terminal D as shown in Figure A.
1. (b) Implement the attribute grammar for digit based on the SableCC parser specification you
defined in part (a). That is, define the class DigitTranslator that calculates the value of a digit. The
type of the attributes should be Long objects. Hint: Do NOT use method translateDigit; the
implementation can be done much more simply without this method.
1. (c) If the specification of the grammar given in Figure B used the ten rules and tokens you defined
in part (a), how would Figure C change?

2. (a) Define an attribute grammar (CFG and evaluation rules) that translates binary numerals into
decimal numbers (i.e., create an attribute grammar like the one in Figure A, except it should be for
binary numerals). Binary numerals are strings over the alphabet { 0, 1, . }. Binary numerals can have
a binary point that separates the binary whole number from the binary fraction. The value of a binary
numeral is computed using the usual base 2 arithmetic. For example, 101.011 has the value 5.375 as a
decimal number, i.e.,

1*2^2 + 0*2^1 + 1*2^0 + 0*2^-1 + 1*2^-2 + 1*2^-3
= 4 + 0 + 1 + 0.0 + 0.25 + 0.125 = 5.375
To make sure your context free grammar (CFG) includes all strings in the language of binary
numerals, the following regular expression defines the set of binary numerals (recall that all
languages defined by a regular expression can be generated by a CFG).
(0 | 1)+  |  (0 | 1)+ .  |  (0 | 1)* . (0 | 1)+

2. (b) Implement the attribute grammar defined in part (a) (SableCC parser specification and tree
decorator class) that computes the decimal value of binary numerals (i.e., compute the value
attribute for the production rules in the CFG defined in part (a)).
