Undergraduate Topics in Computer Science

Des Watson

A Practical
Approach
to Compiler
Construction
Undergraduate Topics in Computer Science
Undergraduate Topics in Computer Science (UTiCS) delivers high-quality
instructional content for undergraduates studying in all areas of computing and
information science. From core foundational and theoretical material to final-year
topics and applications, UTiCS books take a fresh, concise, and modern approach
and are ideal for self-study or for a one- or two-semester course. The texts are all
authored by established experts in their fields, reviewed by an international advisory
board, and contain numerous examples and problems. Many include fully worked
solutions.

More information about this series at http://www.springer.com/series/7592


Des Watson

A Practical Approach
to Compiler Construction

Des Watson
Department of Informatics
Sussex University
Brighton, East Sussex
UK

Series editor
Ian Mackie

Advisory Board
Samson Abramsky, University of Oxford, Oxford, UK
Karin Breitman, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil
Chris Hankin, Imperial College London, London, UK
Dexter Kozen, Cornell University, Ithaca, USA
Andrew Pitts, University of Cambridge, Cambridge, UK
Hanne Riis Nielson, Technical University of Denmark, Kongens Lyngby, Denmark
Steven Skiena, Stony Brook University, Stony Brook, USA
Iain Stewart, University of Durham, Durham, UK

ISSN 1863-7310 ISSN 2197-1781 (electronic)


Undergraduate Topics in Computer Science
ISBN 978-3-319-52787-1 ISBN 978-3-319-52789-5 (eBook)
DOI 10.1007/978-3-319-52789-5
Library of Congress Control Number: 2017932112

© Springer International Publishing AG 2017


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

The study of programming languages and their implementations is a central theme
of computer science. The design of a compiler—a program translating programs
written in a high-level language into semantically equivalent programs in another
language, typically machine code—is influenced by many aspects of computer
science. The compiler allows us to program in high-level languages, and it provides
a layer of abstraction so that, when programming, we do not have to worry about
the complex details of the underlying hardware.
The designer of any compiler obviously needs to know the details of both the
source language being translated and the language being generated, usually some
form of machine code for the target machine. For non-trivial languages, the
compiler is itself a non-trivial program and should really be designed by following a
standard structure. Describing this standard structure is one of the key aims of this
book.
The design of compilers is influenced by the characteristics of formal language
specifications, by automata theory, by parsing algorithms, by processor design, by
data structure and algorithm design, by operating system services, by the target
machine instruction set and other hardware features, by implementation language
characteristics and so on, as well as by the needs of the compilers’ users. Coding a
compiler can be a daunting software task, but the process is greatly simplified by
making use of the approaches, experiences, recommendations and algorithms of
other compiler writers.

Why Study Compiler Design?

Why should compiler design be studied? Why is this subject considered to be an
important component of the education of a computer scientist? After all, only a
small proportion of software engineers are employed on large-scale, traditional
compiler projects. But knowing something about what happens within a compiler
can have many benefits. Understanding the technology and limitations of a


compiler is important knowledge for any user of a compiler. Compilers are complex
pieces of code, and an awareness of how they work can be very helpful. The algorithms
used in a compiler are relevant to many other application areas such as aspects of
text decoding and analysis and the development of command-driven interfaces. The
need for simple domain-specific languages occurs frequently and the knowledge of
compiler design can facilitate their rapid implementation.
Writing a simple compiler is an excellent educational project and enhances skills
in programming language understanding and design, data structure and algorithm
design and a wide range of programming techniques. Understanding how a
high-level language program is translated into a form that can be executed by the
hardware gives a good insight into how a program will behave when it runs, where
the performance bottlenecks will be, the costs of executing individual high-level
language statements and so on. Studying compiler design makes you a better
programmer.

Why Another Book?

Why is there now yet another book on compiler design? Many detailed and
comprehensive textbooks in this field have already been published. This book is a
little different from most of the others. Hopefully, it presents key aspects of the
subject in an accessible way, using a practical approach. The algorithms shown
are all capable of straightforward implementation in almost any programming
language, and the reader is strongly encouraged to read the text and in parallel
produce code for implementations of the compiler modules being described. These
practical examples are concentrated in areas of compiler design that have general
applicability. For example, the algorithms shown for performing lexical and syntax
analysis are not restricted for use in compilers alone. They can be applied to the
analysis required in a wide range of text-based software.
The field of programming language implementation is huge and this book covers
only a small part of it. Just the basic principles, potentially applicable to all
compilers, are explained in a practical way.

What’s in this Book?

This book introduces the topic of compiler construction using many programmed
examples, showing code that could be used in a range of compiler and
compiler-related projects. The code examples are nearly all written in C, a mature
language and still in widespread use. Translating them into another programming
language should not cause any real difficulty. Many existing compiler projects are
written in C, many new compiler projects are being written in C and there are many
compiler construction tools and utilities designed to support compiler
implementations in C. Character handling and dynamic data structure management
are well-handled by C. It is a good language for compiler construction. Therefore, it
may have seemed appropriate to choose the construction of a C compiler as a
central project for this textbook. However, this would not have been sensible
because it is a huge project, and the key algorithms of language analysis and
translation would be overwhelmed by the detail necessary to deal with the
numerous complexities of a “real” programming language, even one regarded as
being simpler than many.
This book is primarily about compiler construction, and it is not specifically
about the use of compiler-related algorithms in other application areas. Hopefully,
though, there is enough information in the analysis chapters to show how these
standard grammar-based techniques can be applied very much more widely.
Although many examples in this book are taken from code that may appear in a
complete C compiler, the emphasis is on the development of a compiler for the DL
language. This is a very simple language developed for the needs of this book from
languages used in a series of practical exercises from various undergraduate and
postgraduate compiler construction courses presented by the author. The syntax of
DL is loosely based on a subset of C, but with many restrictions. In particular, there
is just one data type (the integer), and although functions are supported, their
functionality is rather restricted. Nevertheless, DL is sufficiently powerful to be
usable for real problems. The syntax of DL is presented in the appendix.
The widely-available flex and bison tools are introduced, and their use in
practical implementations is discussed, especially in the context of generating a
compiler for DL. These particular packages provide a good insight into the benefits
offered by the many powerful compiler generation tools now available.
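As a taste of what flex provides, a lexical analyser recognising integer constants and simple identifiers can be specified in just a few rules. This fragment is our own illustration rather than an example from the book, and the token names printed are invented.

```lex
%{
#include <stdio.h>
%}
%%
[0-9]+          { printf("CONST_INTEGER %s\n", yytext); }
[a-z][a-z0-9]*  { printf("IDENTIFIER %s\n", yytext); }
[ \t\n]+        { /* skip whitespace */ }
.               { printf("UNKNOWN %s\n", yytext); }
%%
int main(void) { yylex(); return 0; }
int yywrap(void) { return 1; }
```

Running flex on this file generates a C function `yylex()` which, driven by the `main` above, reads standard input and prints one line per token recognised.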
The software examples in this book were developed and tested on systems
running Fedora Linux on an x86-64 architecture. The C compiler used was GCC.
Machine and operating system dependencies are probably inevitable, but any
changes needed to move this code to a different computer or operating system
should be comparatively minor.
The code examples are concentrated on the compiler’s front-end. Code for
intermediate code optimisation, target machine code generation and optimisation
tends to be long, complex and often overwhelmed by target machine detail. Hence,
code examples from the back-end are largely avoided in this book so that no
introduction to or detailed discussion of assembly language programming is
included. Instead, the text presents the basic principles of back-end design from
which code generators for diverse target architectures can be developed. References
are given to sources providing further algorithm examples.
The source code of a complete DL compiler is not presented in this book. The
real reason for this is that there is an underlying assumption that one of the most
important practical exercises of the book is to produce a complete compiler for DL.
A large number of code examples taken from a compiler are included in the text to
illustrate the principles being described so that the reader will not be coding from
scratch.

How Should this Book be Used?

This book can be used to accompany taught courses in programming language
implementation and compiler design, and it can also be used for self-study. There is
an assumption that students using this book will have some programming skills but
not necessarily significant experience of writing large software systems. A working
understanding of basic data structures such as trees is essential. The examples in the
book are coded in C, but a comprehensive knowledge of C is really not required to
understand these examples and the accompanying text. A basic knowledge of
computer hardware is also assumed, including just the rudiments of the principles of
assembly-level programming.
Each chapter ends with a few exercises. They vary a great deal in complexity.
Some involve just a few minutes of thought, whereas others are major programming
projects. Many of the exercises are appropriate for group discussion and some may
form the basis of group projects involving code implementation as well as research.
It is especially important to make the most of the practical aspects of this subject
by coding parts of a compiler as the book is being read. This will help greatly to
alleviate boredom and will greatly aid the process of understanding. For
example, for the newcomer to recursive descent parsing, the power and elegance
of the technique can only be fully appreciated when a working implementation has
been written.
The obvious project work associated with this book is to write a complete
compiler for the DL language. Assuming that a simple target machine is chosen, the
project is of a reasonable size and can fit well into an average size university or
college course. Extensions to such a compiler by including optimisation and
register allocation can follow in a more advanced course. The project can be taken
even further by developing the DL compiler into a complete C compiler, but the
time required to do this should not be underestimated. Writing a simple compiler
following the steps described in this book is not a huge task. But it is important not
to abandon the standard techniques. I have seen some students getting into major
difficulties with the implementation of their compilers, coded using a “much better
algorithm” of their own devising! The correct approach is reliable and really does
involve a careful and systematic implementation with extensive testing of each
module before it is combined with others.
Although the DL language is used as an example in most of the chapters, this
book is not intended to be a tutorial guide for writing DL compilers. Its aims are
much broader than this—it tries to present the principles of compiler design and the
implementation of certain types of programming language, and where appropriate,
DL-targeted examples are presented. Should the reader want to accept the challenge
of writing a complete DL compiler (and I would certainly recommend this), then the
key practical information about lexical and syntax analysis is easy to find in
Chaps. 3 and 5 and semantic analysis in Chap. 6. There is then some information
about DL-specific issues of code generation in Chap. 8.

Turning the compiler construction project into a group project worked well.
Programming teams can be made responsible for the construction of a complete
compiler. The development can be done entirely by members of the team or it may
be possible for teams to trade with other teams. This is a good test of
well-documented interfaces. Producing a set of good test programs to help verify
that a compiler works is an important part of the set of software modules produced
by each team.
Generating standard-format object code files for real machines in an introductory
compilers course may be trying to go a little too far. Generating assembly code for a
simple processor or for a simple subset of a processor’s features is probably a better
idea. Coding an emulator for a simple target machine is not difficult—just use the
techniques described in this book, of course. Alternatively, there are many virtual
target architecture descriptions with corresponding emulator software freely
available. The MIPS architecture, with the associated SPIM software [1], despite its age,
is still very relevant today and is a good target for code generation. The pleasure of
writing a compiler that produces code that actually runs is considerable!

Acknowledgement This book is loosely based on material presented in several
undergraduate and postgraduate lecture courses at the University of Sussex.
I should like to thank all the students who took these courses and who shared my
enthusiasm for the subject. Over the years, I watched thousands of compilers being
developed and discovered which parts of the process they usually found difficult.
I hope that I have addressed those issues properly in this book.
Thanks also go to my colleagues at the University of Sussex—in particular to all
the staff and students in the Foundations of Software Systems research group who
provided such a supportive and stimulating work environment. Particular thanks go
to Bernhard Reus for all his suggestions and corrections.
I’m really grateful to Ian Mackie, the UTiCS series editor, and to Helen
Desmond at Springer for their constant enthusiasm for the book. They always
provided advice and support just when it was needed.
Finally, and most important, I should like to thank Wendy, Helen and Jonathan
for tolerating my disappearing to write and providing unfailing encouragement.

Sussex, UK Des Watson

Reference

1. Larus JR (1990) SPIM S20: a MIPS R2000 simulator. Technical Report 966. University of
Wisconsin-Madison, Madison, WI, Sept 1990
Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 High-Level Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Advantages of High-Level Languages . . . . . . . . . . . . . . . . 2
1.1.2 Disadvantages of High-Level Languages . . . . . . . . . . . . . . 3
1.2 High-Level Language Implementation . . . . . . . . . . . . . . . . . . . . . . 5
1.2.1 Compilers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.2 Compiler Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.3 Interpreters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Why Study Compilers? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Present and Future . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.5 Conclusions and Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . 11
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2 Compilers and Interpreters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1 Approaches to Programming Language Implementation . . . . . . . . 13
2.1.1 Compile or Interpret? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2 Defining a Programming Language . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2.1 BNF and Variants. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.2 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.3 Analysis of Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3.1 Grammars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3.2 Chomsky Hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3.3 Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.4 Compiler and Interpreter Structure . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4.1 Lexical Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.2 Syntax Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4.3 Semantic Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.4.4 Machine-Independent Optimisation. . . . . . . . . . . . . . . . . . . 31
2.4.5 Code Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.4.6 Machine-Dependent Optimisation . . . . . . . . . . . . . . . . . . . . 32


2.4.7 Symbol Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.4.8 Implementation Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.5 Conclusions and Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . 34
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3 Lexical Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.1 Lexical Tokens. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.1.1 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.1.2 Choosing the List of Tokens . . . . . . . . . . . . . . . . . . . . . . . 39
3.1.3 Issues with Particular Tokens . . . . . . . . . . . . . . . . . . . . . . . 41
3.1.4 Internal Representation of Tokens . . . . . . . . . . . . . . . . . . . 44
3.2 Direct Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2.1 Planning a Lexical Analyser . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2.2 Recognising Individual Tokens. . . . . . . . . . . . . . . . . . . . . . 47
3.2.3 More General Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.3 Regular Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.3.1 Specifying and Using Regular Expressions . . . . . . . . . . . . 57
3.3.2 Recognising Instances of Regular Expressions . . . . . . . . . . 58
3.3.3 Finite-State Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.4 Tool-Based Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.4.1 Towards a Lexical Analyser for C . . . . . . . . . . . . . . . . . . . 62
3.4.2 Comparison with a Direct Implementation . . . . . . . . . . . . . 70
3.5 Conclusions and Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . 72
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4 Approaches to Syntax Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.1 Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.1.1 Leftmost and Rightmost Derivations . . . . . . . . . . . . . . . . . 76
4.2 Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.2.1 Top–Down Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.2.2 Parse Trees and the Leftmost Derivation . . . . . . . . . . . . . . 78
4.2.3 A Top–Down Parsing Algorithm . . . . . . . . . . . . . . . . . . . . 82
4.2.4 Classifying Grammars and Parsers . . . . . . . . . . . . . . . . . . . 86
4.2.5 Bottom-Up Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.2.6 Handling Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.3 Tree Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.4 Conclusions and Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . 91
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5 Practicalities of Syntax Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.1 Top-Down Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.1.1 A Simple Top-Down Parsing Example . . . . . . . . . . . . . . . . 97
5.1.2 Grammar Transformation for Top-Down Parsing . . . . . . . . 100
5.2 Bottom-Up Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.2.1 Shift-Reduce Parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.2.2 Bison—A Parser Generator . . . . . . . . . . . . . . . . . . . . . . . . 103
5.3 Tree Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.4 Syntax Analysis for DL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.4.1 A Top-Down Syntax Analyser for DL . . . . . . . . . . . . . . . . 113
5.4.2 A Bottom-Up Syntax Analyser for DL . . . . . . . . . . . . . . . . 124
5.4.3 Top-Down or Bottom-Up? . . . . . . . . . . . . . . . . . . . . . . . . . 131
5.5 Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
5.6 Declarations and Symbol Tables . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5.7 What Can Go Wrong? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.8 Conclusions and Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . 137
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
6 Semantic Analysis and Intermediate Code . . . . . . . . . . . . . . . . . . . . . 141
6.1 Types and Type Checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.1.1 Storing Type Information . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.1.2 Type Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.2 Storage Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.2.1 Access to Simple Variables . . . . . . . . . . . . . . . . . . . . . . . . 147
6.2.2 Dealing with Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
6.2.3 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
6.2.4 Arrays and Other Structures . . . . . . . . . . . . . . . . . . . . . . . . 150
6.3 Syntax-Directed Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.3.1 Attribute Grammars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.4 Intermediate Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
6.4.1 Linear IRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
6.4.2 Graph-Based IRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
6.5 Practical Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.5.1 A Three-Address Code IR . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.5.2 Translation to the IR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.5.3 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
6.6 Conclusions and Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . 173
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
7 Optimisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
7.1 Approaches to Optimisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
7.1.1 Design Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
7.2 Local Optimisation and Basic Blocks . . . . . . . . . . . . . . . . . . . . . . 180
7.2.1 Constant Folding and Constant Propagation . . . . . . . . . . . . 181
7.2.2 Common Subexpressions . . . . . . . . . . . . . . . . . . . . . . . . . . 182
7.2.3 Elimination of Redundant Code . . . . . . . . . . . . . . . . . . . . . 186
7.3 Control and Data Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
7.3.1 Non-local Optimisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188

7.3.2 Removing Redundant Variables . . . . . . . . . . . . . . . . . . . . . 190
7.3.3 Loop Optimisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
7.4 Parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
7.4.1 Parallel Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
7.4.2 Detecting Opportunities for Parallelism . . . . . . . . . . . . . . . 197
7.4.3 Arrays and Parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
7.5 Conclusions and Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . 201
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
8 Code Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
8.1 Target Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
8.1.1 Real Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
8.1.2 Virtual Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
8.2 Instruction Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
8.3 Register Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
8.3.1 Live Ranges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
8.3.2 Graph Colouring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
8.3.3 Complications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
8.3.4 Application to DL’s Intermediate Representation . . . . . . . . 219
8.4 Function Call and Stack Management . . . . . . . . . . . . . . . . . . . . . . 219
8.4.1 DL Implementation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
8.4.2 Call and Return Implementation . . . . . . . . . . . . . . . . . . . . . 221
8.5 Optimisation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
8.5.1 Instruction-Level Parallelism . . . . . . . . . . . . . . . . . . . . . . . 223
8.5.2 Other Hardware Features . . . . . . . . . . . . . . . . . . . . . . . . . . 226
8.5.3 Peephole Optimisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
8.5.4 Superoptimisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
8.6 Automating Code Generator Construction . . . . . . . . . . . . . . . . . . . 230
8.7 Conclusions and Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . 231
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
9 Implementation Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
9.1 Implementation Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
9.1.1 Cross-Compilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
9.1.2 Implementation Languages . . . . . . . . . . . . . . . . . . . . . . . . . 237
9.1.3 Portability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
9.2 Additional Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
9.3 Particular Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
9.4 The Future . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
9.5 Conclusions and Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . 243
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Appendix A: The DL Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
List of Figures

Figure 2.1 A simple view of programming language implementation . . . . 14


Figure 2.2 A trivial language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Figure 2.3 BNF for simple arithmetic expressions . . . . . . . . . . . . . . . . . . . 18
Figure 2.4 Syntactic structure of the expression 1 + 2 * 3 . . . . . . . . . . . . 26
Figure 2.5 The analysis/synthesis view of compilation . . . . . . . . . . . . . . . 28
Figure 2.6 Phases of compilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Figure 3.1 Directed graph representation of (ab|c)*d . . . . . . . . . . . . . . 58
Figure 3.2 Transition diagram for the regular expression (ab|c)*d . . . . . 59
Figure 4.1 BNF for a trivial arithmetic language . . . . . . . . . . . . . . . . . . . . 75
Figure 4.2 Tree from the derivation of x+y*z . . . . . . . . . . . . . . . . . . . . . 77
Figure 4.3 Two parse trees for x+y+z . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Figure 5.1 A very simple DL program . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Figure 5.2 Tree from the program of Fig. 5.1 . . . . . . . . . . . . . . . . . . . . . . 121
Figure 6.1 Structural equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Figure 6.2 Annotated tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Figure 6.3 Tree for a*a/(a*a + b*b) . . . . . . . . . . . . . . . . . . . . . . . . . 159
Figure 6.4 Common subexpression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Figure 6.5 Basic blocks with control flow . . . . . . . . . . . . . . . . . . . . . . . . . 160
Figure 6.6 Translation of DL to IR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
Figure 6.7 A generalised tree node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
Figure 7.1 Basic blocks of factorial main program (see appendix) . . . . . . . 181
Figure 7.2 Flow between basic blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Figure 8.1 Trees representing machine instructions . . . . . . . . . . . . . . . . . . 211
Figure 8.2 Live ranges and register interference graph . . . . . . . . . . . . . . . 216
Figure 8.3 Graph colouring algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
Figure 8.4 Graph colouring algorithm—register allocation . . . . . . . . . . . . 218

Chapter 1
Introduction

The high-level language is the central tool for the development of today’s software.
The techniques used for the implementation of these languages are therefore very
important. This book introduces some of the practicalities of producing implemen-
tations of high-level programming languages on today’s computers. The idea of a
compiler, traditionally translating from the high-level language source program to
machine code for some real hardware processor, is well known but there are other
routes for language implementation. Many programmers regard compilers as being
deeply mysterious pieces of software—black boxes which generate runnable code—
but some insight into the internal workings of this process may help towards their
effective use.
Programming language implementation has been studied for many years and it is
one of the most successful areas of computer science. Today’s compilers can generate
highly optimised code from complex high-level language programs. These compilers
are large and extremely complex pieces of software. Understanding what they do and
how they do it requires some background in programming language theory as well
as processor design together with a knowledge of how best to structure the processes
required to translate from one computer language to another.

1.1 High-Level Languages

Even in the earliest days of electronic computing in the 1940s it was clear that there
was a need for software tools to support the programming process. Programming was
done in machine code; it required considerable skill and was hard work, slow and
error-prone. Assembly languages were developed, relieving the programmer from
having to deal with much of the low-level detail, but requiring an assembler, a piece
of software to translate from assembly code to machine code. Giving symbolic names
to instructions, values, storage locations, registers and so on allows the programmer
© Springer International Publishing AG 2017
D. Watson, A Practical Approach to Compiler Construction, Undergraduate
Topics in Computer Science, DOI 10.1007/978-3-319-52789-5_1

to concentrate on the coding of the algorithms rather than on the details of the binary
representation required by the hardware and hence to become more productive. The
abstraction provided by the assembly language allows the programmer to ignore the
fine detail required to interact directly with the hardware.
The development of high-level languages gathered speed in the 1950s and beyond.
In parallel there was a need for compilers and other tools for the implementation
of these languages. The importance of formal language specifications was recog-
nised and the correspondence between particular grammar types and straightforward
implementation was understood. The extensive use of high-level languages prompted
the rapid development of a wide range of new languages, some designed for particular
application areas such as COBOL for business applications [1] and FORTRAN for
numerical computation [2]. Others such as PL/I (then called NPL) [3] tried to be very
much more general-purpose. Large teams developed compilers for these languages
in an environment where target machine architectures were changing fast too.

1.1.1 Advantages of High-Level Languages

The difficulties of programming in low-level languages are easy to see and the need
for more user-friendly languages is obvious. A programming notation much closer
to the problem specification is required. Higher level abstractions are needed so that
the programmer can concentrate more on the problem rather than the details of the
implementation of the solution.
High-level languages can offer such abstraction. They offer many potential advan-
tages over low-level languages including:

• Problem solving is significantly faster. Moving from the problem specification to
code is simpler using a high-level language. Debugging high-level language code
is much easier. Some high-level languages are suited to rapid prototyping, making
it particularly easy to try out new ideas and add debugging code.
• High-level language programs are generally easier to read, understand and hence
maintain. Maintenance of code is now a huge industry where programmers are
modifying code unlikely to have been written by themselves. High-level language
programs can be made, at least to some extent, self-documenting, reducing the
need for profuse comments and separate documentation. The reader of the code is
not overwhelmed by the detail necessary in low-level language programs.
• High-level languages are easier to learn.
• High-level language programs can be structured more easily to reflect the structure
of the original problem. Most current high-level languages support a wide range
of program and data structuring features such as object orientation, support for
asynchronous processes and parallelism.
• High-level languages can offer software portability. This demands some degree of
language standardisation. Most high-level languages are now fairly tightly defined
so that, for example, moving a Java program from one machine to another with
different architectures and operating systems should be an easy task.
• Compile-time checking can remove many bugs at an early stage, before the pro-
gram actually runs. Checking variable declarations, type checking, ensuring that
variables are properly initialised, checking for compatibility in function arguments
and so on are often supported by high-level languages. Furthermore, the compiler
can insert runtime code such as array bound checking. The small additional runtime
cost may be a small price to pay for early removal of errors.

1.1.2 Disadvantages of High-Level Languages

Despite these significant advantages, there may be circumstances where the use of a
low-level language (typically an assembly language) is more appropriate. We
can identify possible advantages of the low-level language approach.

• The program may need to perform some low-level, hardware-specific operations
which do not correspond to a high-level language feature. For example, the hard-
ware may store device status information in a particular storage location—in most
high-level languages there is no way to express direct machine addressing. There
may be a need to perform low-level i/o, or make use of a specific machine instruc-
tion, again probably difficult to express in a high-level language.
• The use of low-level languages is often justified on the grounds of efficiency in
terms of execution speed or runtime storage requirements. This is an important
issue and is discussed later in this section.

These disadvantages of high-level languages look potentially serious. They need
further consideration.

1.1.2.1 Access to the Hardware

A program running on a computer system needs to have access to its environment.
It may input data from the user, it may need to output results, create a file, find the
time of day and so on. These tasks are hardware and operating system specific and
to achieve some portability in coding the high-level language program performing
these actions, the low-level details have to be hidden. This is conventionally done by
providing the programmer with a library acting as an interface between the program
and the operating system and/or hardware. So if the program wants to write to a
file, it makes a call to a library routine and the library code makes the appropriate
operating system calls and performs the requested action.
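This layering can be sketched concretely. The fragment below is a hypothetical Python illustration, not part of any real library: `write_text` stands in for a library routine, and the `os.open`/`os.write`/`os.close` calls stand in for the lower-level operating system interface that the routine hides from the program.

```python
import os
import tempfile

def write_text(path, text):
    """A hypothetical 'library routine'.

    The program calls this portable interface; the routine makes the
    lower-level operating system calls (open/write/close) on its behalf,
    keeping the platform-specific detail out of the program itself.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, text.encode())
    finally:
        os.close(fd)

path = os.path.join(tempfile.gettempdir(), "hll_demo.txt")
write_text(path, "hello")
with open(path) as f:
    print(f.read())  # -> hello
```

In a C implementation the same layering appears when, for example, the standard library's `fwrite` ultimately issues the operating system's `write` call on the program's behalf.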
To address the stated advantage of low-level languages mentioned above there is
nothing to stop the construction of operating system and machine-specific libraries
to perform the special-purpose tasks such as providing access to a particular storage
location or executing a particular machine instruction. There may be machine-specific
problems concerned with the mechanism used to call and return from this library
code with, for example, the use of registers, or with the execution time cost of the
call and return, but such difficulties can usually be overcome.
A few programming languages provide an alternative solution by supporting inline
assembly code. This code is output unchanged by the compiler, providing the high-
level language program direct access to the hardware. This is a messy solution,
fraught with danger, and reliable means have to be set up to allow data to be passed
into and returned from this code. Such facilities are rarely seen today.

1.1.2.2 Efficiency

There are many programming applications where efficiency is a primary concern.
These could be large-scale computations requiring days or weeks of processor time
or even really short computations with severe real-time constraints. Efficiency is
usually concerned with the minimisation of computation time, but other constraints
such as memory usage or power consumption could be more important.
In the early development of language implementations, the issue of efficiency
strongly influenced the design of compilers. The key disadvantage of high-level
languages was seen as being one of poor efficiency. It was assumed that machine-
generated code could never be as efficient as hand-written code. Despite some
remarkable optimisations performed by some of the early compilers (particularly
for FORTRAN), this remained largely true for many years. But as compiler tech-
nology steadily improved, as processors became faster and as their architectures
became more suited to running compiler-generated code from high-level language
programs, the efficiency argument became much less significant. Today, compiler-
generated code for a wide range of programming languages and target machines is
likely to be just as efficient, if not more so, than hand-written code.
Does this imply that justifying the use of low-level languages on the grounds
of producing efficient code is now wrong? The reality is that there may be some
circumstances where coding in machine or assembly code, very carefully, by hand,
will lead to better results. This is not really feasible where the amount of code is
large, particularly where the programmer loses sight of the large-scale algorithmic
issues while concentrating on the small-scale detail. But it may be a feasible approach
where, for example, a small function needs to run particularly quickly and the skills
of a competent low-level programmer with a good knowledge of the target machine
are available. Mixing high-level language programming with low-level language
programming is perfectly reasonable in this way. But if the amount of code to be
optimised is very small, other automated methods may be available (for example [4]).
When developing software, a valuable rule to remember is that there is no need
to optimise if the code is already fast enough. Modern processors are fast and a huge
amount can be achieved during the time taken for a human to react to the computer’s
output. However, this does not imply that compilers need never concern themselves
with code optimisation—there will always be some applications genuinely needing
the best out of the hardware.
The case for using high-level languages for almost all applications is now very
strong. In order to run programs written in high-level languages, we need to consider
how they can be implemented.

1.2 High-Level Language Implementation

A simplistic but not inaccurate view of the language implementation process suggests
that some sort of translator program is required (a compiler) to transform the high-
level language program into a semantically equivalent machine code program that
can run on the target machine. Other software, such as libraries, will probably also
be required. As the complexity of the source language increases as the language
becomes “higher and higher-level”, closer to human expression, one would expect
the complexity of the translator to increase too.
Many programming languages have been and are implemented in this way. And
this book concentrates on this implementation route. But other routes are possible,
and it may be the characteristics of the high-level language that forces different
approaches. For example, the traditional way of implementing Java makes use of
the Java Virtual Machine (JVM) [5], where the compiler translates from Java source
code into JVM code and a separate program (an interpreter) reads these virtual
machine instructions, emulating the actions of the virtual machine, effectively run-
ning the Java program. This seemingly contrary implementation method does have
significant benefits. In particular it supports Java’s feature of dynamic class loading.
Without such an architecture-neutral virtual machine code, implementing dynamic
class loading would be very much more difficult. More generally, it allows the support
of reflection, where a Java program can examine or modify at runtime the internal
properties of the executing program.
Interpreted approaches are very appropriate for the implementation of some pro-
gramming languages. Compilation overheads are reduced at the cost of longer run-
times. The programming language implementation field is full of tradeoffs. These
issues of compilers versus interpreters are investigated further in Chap. 2.
To make effective use of a high-level language, it is essential to know something
about its implementation. In some demanding application areas such as embedded
systems where a computer system with a fixed function is controlling some elec-
tronic or mechanical device, there may be severe demands placed on the embedded
controller and the executing code. There may be real-time constraints (for example,
when controlling the ignition timing in a car engine where a predefined set of opera-
tions has to complete in the duration of a spark), memory constraints (can the whole
program fit in the 64 k bytes available on the cheap version of the microcontroller
chip?) or power consumption constraints (how often do I have to charge the batter-
ies in my mobile phone?). These constraints make demands on the performance of
the hardware but also on the way in which the high-level language implementing
the system’s functionality is actually implemented. The designers need to have an
in-depth knowledge of the implementation to be able to deal with these issues.

1.2.1 Compilers

The compiler is a program translating from a source language to a target language,
implemented in some implementation language. As we have seen, the traditional view
of a compiler is to take some high-level language as input and generate machine code
for some target machine. Choice of implementation language is an interesting issue
and we will see later in Chap. 9 why the choice of this language may be particularly
important.
The field of compilation is not restricted to the generation of low-level language
code. Compiler technology can be developed to translate from one high-level lan-
guage to another. For example, some of the early C++ compilers generated C code
rather than target machine code. Existing C compilers were available to perform the
final step.
The complexity of a compiler is not only influenced by the complexities of the
source and target languages, but also by the requirement for optimised target code.
There is a real danger in compiler development of being overwhelmed by the com-
plexities and the details of the task. A well-structured approach to compiler devel-
opment is essential.

1.2.2 Compiler Complexity

The tools and processes of programming language implementation cannot be considered
in isolation. Collaboration between the compiler writer, the language designer
and the hardware architect is vital. The needs of the end-users must be incorporated
too. Compilers have responded to the trend towards increased high-level language
complexity and the desire for aggressive optimisation by becoming significantly more
complex themselves. In the early days of compilers the key concern was the gener-
ation of good code, rivalling that of hand coders, for machines with very irregular
architectures. Machine architectures gradually became more regular, hence mak-
ing it easier for the compiler writer. Subsequent changes (from the 1980s) towards
the much simpler instruction sets of the reduced instruction set computers (RISC)
helped simplify the generation of good code. Attention was also being focused on
the design of the high-level languages themselves and the support for structures and
methodologies to help the programmers directly. Today, the pressures caused by new
languages and other tools to support the software development process are still there
and also the rapid move towards distributed computing has placed great demands on
program analysis and code generation. Parallelism, in its various forms, offers higher
performance processing but at a cost of increased implementation complexity.

The language implementation does not stop at the compiler. The support of col-
lections of library routines is always required, providing the environment in which
code generated by the compiler can run. Other tools such as debuggers, linkers, doc-
umentation aids and interactive development environments are needed too. This is
no easy task.
Dealing with this complexity requires a strict approach to design in the structuring
of the compiler construction project. Traditional techniques of software engineering
are well applied in compiler projects, ensuring appropriate modularisation, testing,
interface design and so on. Extensive stage-by-stage testing is vital for a compiler.
A compiler may produce highly optimised code, but if that code produces the wrong
answers when it runs, the compiler is not of much use. To ease the task of producing
a programming language implementation, many software tools have been developed
to help generate parts of a compiler or interpreter automatically. For example, lexi-
cal analysers and syntax analysers (two early stages of the compilation process) are
often built with the help of tools taking the formal specification of the syntax of the
programming language as input and generating code to be incorporated in the com-
piler as output. The modularisation of compilers has also helped to reduce workload.
For example, many compilers have been built using a target machine independent
front-end and a source language-independent back-end using a standard intermediate
representation as the interface between them. Then front-ends and back-ends can be
mixed and matched to produce a variety of complete compilers. Compiler projects
rarely start from scratch today.
Fortunately, in order to learn about the principles of language implementation,
compiler construction can be greatly simplified. If we start off with a simple pro-
gramming language and generate code for a simple, maybe virtual, machine, not
worrying too much about high-quality code, then the compiler construction project
should not be too painful or protracted.

1.2.3 Interpreters

Running a high-level language program using a compiler is a two-stage process. In the
first stage, the source program is translated into target machine code and in the second
stage, the hardware executes or a virtual machine interprets this code to produce
results. Another popular approach to language implementation generates no target
code. Instead, an interpreter reads the source program and “executes” it directly.
So if the interpreter encounters the source statement a = b + 1, it analyses the
source characters to determine that the input is an assignment statement, it extracts
from its own data structures the value of b, adds one to this value and stores the result
in its own record of a.
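The cycle just described can be sketched in a few lines. The following Python fragment is a minimal, hypothetical source-level interpreter restricted to statements of the exact form a = b + 1; it is only an illustration of the idea, not a fragment of any real implementation. Note that the statement text is re-analysed on every call.

```python
import re

def interpret(stmt, env):
    """Analyse and execute one assignment of the form 'name = name + number'.

    The text is scanned afresh on every call, recognised as an assignment,
    and its effect emulated on the interpreter's own data structure (the
    env dictionary holds the interpreter's record of each variable).
    """
    m = re.fullmatch(r"\s*(\w+)\s*=\s*([A-Za-z_]\w*)\s*\+\s*(\d+)\s*", stmt)
    if m is None:
        raise SyntaxError(f"cannot analyse: {stmt!r}")
    target, source_var, literal = m.groups()
    # Emulate the statement: fetch b's value, add the literal, store as a.
    env[target] = env[source_var] + int(literal)

env = {"b": 41}
interpret("a = b + 1", env)
print(env["a"])  # -> 42
```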
This process of source-level interpretation sounds attractive because there is no
need for the potentially complex implementation of code generation. But there are
practical problems. The first problem concerns performance. If a source statement
is executed repeatedly it is analysed each time, before each execution. The cost of
possibly multiple statement analysis followed by the interpreter emulating the action
of the statement will be many times greater than the cost of executing a few machine
instructions obtained from a compilation of a = b + 1. However this cost can
be reduced fairly easily by only doing the analysis of the program once, translating
it into an intermediate form that is subsequently interpreted. Many languages have
been implemented in this way, using an interpreted intermediate form, despite the
overhead of interpretation.
The second problem concerns the need for the presence of an interpreter at runtime.
When the program is “executing” it is located in the memory of the target system in
source or in a post-analysis intermediate form, together with the interpreter program.
It is likely that the total memory footprint is much larger than that of equivalent
compiled code. For small, embedded systems with very limited memory this may be
a decisive disadvantage.
All programming language implementations are in some sense interpreted. With
source code interpretation, the interpreter is complex because it has to analyse the
source language statements and then emulate their execution. With intermediate code
interpretation, the interpreter is simpler because the source code analysis has been
done in advance. With the traditional compiled approach with the generation of
target machine code, the interpretation is done entirely by the target hardware, there
is no software interpretation and hence no overhead. Looking at these three levels of
interpretation in greater detail, one can easily identify tradeoffs:

Source-level interpretation—interpreter complexity is high, the runtime efficiency is
low (repeated analysis and emulation of the source code statements), the initial
compilation cost is zero because there is no separate compiler and hence the delay
in starting the execution of the program is also zero.
Intermediate code interpretation—interpreter complexity is lower, the runtime effi-
ciency is improved (the analysis and emulation of the intermediate code statements
is comparatively simple), there is an initial compilation cost and hence there is a
delay in starting the program.
Target code interpretation—full compilation—there is no need for interpreter soft-
ware so interpreter complexity is zero, the runtime efficiency is high (the inter-
pretation of the code is done directly by the hardware), there is a potentially large
initial compilation cost and hence there may be a significant delay in starting the
program.
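These tradeoffs can be made tangible with a small, hypothetical experiment. Here CPython's built-in `eval` and `compile` merely stand in for the first two schemes: `eval` applied to a string re-analyses the source text before every execution, while `compile` performs the analysis once, producing a bytecode object (playing the role of an intermediate representation) that is then interpreted repeatedly.

```python
import timeit

stmt = "(1 + 2) * 3 - 4"

# Source-level interpretation: the text is analysed before every execution.
source_time = timeit.timeit(lambda: eval(stmt), number=10_000)

# Intermediate code interpretation: analyse once, then repeatedly
# interpret the resulting code object.
code = compile(stmt, "<expr>", "eval")
inter_time = timeit.timeit(lambda: eval(code), number=10_000)

print(eval(code))  # -> 5
# inter_time is typically well below source_time: the analysis cost has
# been paid once, up front, as a (small) compilation delay.
print(f"source: {source_time:.4f}s  intermediate: {inter_time:.4f}s")
```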

The different memory requirements of the three approaches are somewhat harder
to quantify and depend on implementation details. In the source-level interpretation
case, a simplistic implementation would require both the text of the source code and
the (complex) interpreter to be in main memory. The intermediate code interpretation
case would require the intermediate code version of the program and the (simpler)
interpreter to be in main memory. And in the full compilation case, just the compiled
target code would need to be in main memory. This, of course, takes no account of the
memory requirements of the running program—space for variables, data structures,
buffers, library code, etc.

There are other tradeoffs. For example, when the source code is modified, there is
no additional compilation overhead in the source-level interpretation case, whereas in
the full compilation case, it is usual for the entire program or module to be recompiled.
In the intermediate code interpretation case, it may be possible to just recompile the
source statements that have changed, avoiding a full recompilation to intermediate
code.
Finally, it should be emphasised that this issue of lower efficiency of interpreted
implementations is rarely a reason to dismiss the use of an interpreter. The interpreting
overhead in time and space may well be irrelevant, particularly in larger computer
systems, and the benefits offered may well overwhelm any efficiency issues.

1.3 Why Study Compilers?

It is important to ask why the topic of compiler construction is taught to computer
science students. After all, the number of people who actually spend their time
writing compilers is small. Although the need for compilers is clear, there is not
really a raging demand for the construction of new compilers for general-purpose
programming languages.
One of the key motivations for studying this technology is that compiler-related
algorithms have relevance to application areas outside the compiler field. For exam-
ple, transforming data from one syntactic form into another can be approached by
considering the grammar of the structure of the source data and using traditional
parsing techniques to read this data and then output it in the form of the new gram-
mar. Furthermore, it may be appropriate to develop a simple language to act as a user
interface to a program. The simplicity and elegance of some parsing algorithms and
basing the parsing on a formally specified grammar helps produce uncomplicated
and reliable software tools.
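As a small, hypothetical example of this reuse, the function below reads its input under one grammar using a hand-written recursive descent parser and re-emits it in a different form. Both the grammar (list -> pair (',' pair)* and pair -> NAME '=' NAME) and the output layout are invented purely for illustration.

```python
def transform(text):
    """Re-shape 'k1=v1,k2=v2' into 'k1: v1' / 'k2: v2' lines by parsing
    the input under a tiny grammar and re-emitting it under another."""
    pos = 0

    def name():
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos].isalnum():
            pos += 1
        if pos == start:
            raise SyntaxError(f"name expected at position {start}")
        return text[start:pos]

    def expect(ch):
        nonlocal pos
        if pos >= len(text) or text[pos] != ch:
            raise SyntaxError(f"{ch!r} expected at position {pos}")
        pos += 1

    def pair():                    # pair -> NAME '=' NAME
        key = name()
        expect("=")
        return key, name()

    pairs = [pair()]               # list -> pair (',' pair)*
    while pos < len(text):
        expect(",")
        pairs.append(pair())
    return "\n".join(f"{k}: {v}" for k, v in pairs)

print(transform("host=alpha,port=80"))
```

Because the reader is based on an explicit grammar, malformed input fails cleanly with a reported position instead of being silently mangled.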
Studying compiler construction offers the computer science student many insights.
It gives a practical application area for many fundamental data structures and algo-
rithms, it allows the construction of a large-scale and inherently modular piece of
software, ideally suited for construction by a team of programmers. It gives an insight
into programming languages, helping to show why some programming languages
are the way they are, making it easier to learn new languages. Indeed, one of the best
ways of learning a programming language is to write a compiler for that language,
preferably writing it in its own language. Writing a compiler also gives some insight
into the design of target machines, both real and virtual.
There is still a great deal of work to be done in developing compilers. As new
programming languages are designed, new compilers are required, and new pro-
gramming paradigms may need novel approaches to compilation. As new hardware
architectures are developed, there is a need for new compilation and code generation
strategies to make effective use of the machine’s features.
Although there is a steady demand for new compilers and related software, there
is also a real need for the development of new methodologies for the construction
of high-quality compilers generating efficient code. Implementing code to analyse
the high-level language program is not likely to be a major challenge. The area has
been well researched and there are good algorithms and software tools to help. But
the software needed to generate target code, particularly high-quality code, is much
harder to design and write. There are few standard approaches for this part of the
compiler. And this is where there is an enormous amount of work still to be done.
For example, generating code to make the best use of parallel architectures is hard,
and we are a long way from a general, practical solution.
Is there really a need for heavily optimised target code? Surely, today’s processors
are fast enough and are paired with sufficiently large memories? This may be
true for some applications, but there are many computations where there are, for
example, essential time or space constraints. These are seen particularly in embedded
applications where processor power or memory sizes may be constrained because of
cost or where there are severe real-time constraints. There will always be a need to
get the most from the combination of hardware and software. The compiler specialist
still has a great deal of work to do.

1.4 Present and Future

Today’s compilers and language tools can deal with complex (both syntactically and
semantically) programming languages, generating code for a huge range of computer
architectures, both real and virtual. The quality of generated code from many of
today’s compilers is astonishingly good, often far better than that generated by a
competent assembly/machine code programmer. The compiler can cope well with
the complex interacting features of computer architectures. But there are practical
limits. For example, the generation of truly optimal code (optimising for speed, size,
power consumption, etc.) may in practice be at best time-consuming or, more likely,
impossible. Where we need to make the best use of parallel architectures, today’s
compilers can usually make a good attempt, but not universally. There are many
unsolved optimisation-related problems. Also, surely there must be better processor
architectures for today’s and tomorrow’s programming languages?
Compilers are not just about generating target code from high-level language
programs. Programmers need software tools, probably built on compiler technology,
to support the generation of high-quality and reliable software. Such tools have
been available for many years to perform very specific tasks. For example, consider
the lint tool [6] to highlight potential trouble spots in C programs. Although huge
advances have been made, much more is required and this work clearly interacts with
programming language design as well as with diverse areas of software engineering.

1.5 Conclusions and Further Reading

A study of the history of programming languages provides some good background
to the development of implementations. Detailed information about early languages
appears in [7] and more recent articles are easily found on the web. Several pictorial
timelines have been produced, showing the design connections between program-
ming languages.
Similarly, the history of processor design shows how the compiler writer has had
to deal with a rapidly changing target. Web searches reveal a huge literature, and it is
easy to see how the development of hardware architectures in more recent years has
been influenced by the compiler writer and indirectly by the needs of the high-level
language programmer. A full and wide-ranging coverage of computer hardware is
contained in [8]. A comprehensive coverage of modern architecture together with
historical perspectives is found in [9].
To help put the material appearing in the rest of this book into context, it is worth
looking at some existing compiler projects. But it is important not to be put off
by the scale of some of these projects. Many have been developed over decades,
with huge programming teams. Perhaps most famous is GCC (the GNU Compiler
Collection), documented at https://gcc.gnu.org/ which “… includes front ends for
C, C++, Objective-C, Fortran,1 Java, Ada, and Go”. This website includes links to
numerous documents describing the project from the point of view of the user, the
maintainer, the compiler writer and so on. The LLVM Compiler Infrastructure is
documented at http://llvm.org — another collection of compilers and related tools,
now in widespread use. The comp.compilers newsgroup and website
(http://compilers.iecc.com/) is an invaluable resource for compiler writers.

Exercises
1.1 Try to find a compiler that has an option to allow you to look at the generated
code (for example, use the -S option in GCC). Make sure that any optimisation
options are turned off. Look at the code generated from a simple program (only
a few source lines) and try to match up blocks of generated code with source
statements. Details are not important, and you do not need an in-depth knowledge
of the architecture or instruction set of the target machine. By using various
source programs, try to get some vague idea of how different source language
constructs are translated.
Now, turn on optimisation. It should be very much harder to match the input with
the output. Can you identify any specific optimisations that are being applied?
1.2 Find the documentation for a “big” compiler. Spend some time looking at the
options and features supported by the package.
1.3 Find the instruction set of the main processor contained in the computer you
use most often. How much memory can it address? How many registers does

1 Note the capitalisation. After FORTRAN 77, the language became known as Fortran.
it have? What can it do in parallel? How many bits can it use for integer and
for floating point arithmetic? Approximately how long does it take to add two
integers contained in registers?
1.4 Look back at the design of a much older processor (for example, the DEC PDP-
11 architecture is well documented on the web). Answer the questions listed
above for this processor.
1.5 Try to find out something about the optimisations performed by the early
FORTRAN compilers. The work done by IBM on some early FORTRAN com-
pilers is well documented and shows the efforts made by the company to move
programmers away from assembly language programming. Try also to find out
about more recent attempts with Fortran (note the capitalisation!) to make the
most of parallel architectures.
1.6 Do some research to find out the key trends in processor design over the last few
decades. Do the same for high-level language design. How have these trends
affected the design of compilers and interpreters?
1.7 To prepare for the practical programming tasks ahead, write a simple program
to read a text file, make some trivial transformation character by character such
as swapping the case of all letters, and write the result to another text file. Keep
this program safe. It will be useful later when it can be augmented to produce a
complete lexical analyser.
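A minimal sketch of the Exercise 1.7 program, written here in Python for brevity (the book does not prescribe an implementation language, and the file names used are placeholders):

```python
# A minimal sketch of the Exercise 1.7 program: read a text file one
# character at a time, swap the case of each letter, and write the
# result to another file. The character-by-character reading loop is
# the part that can later grow into a lexical analyser.

def swap_case_char(ch: str) -> str:
    """Return ch with its case swapped; non-letters pass through."""
    if ch.isupper():
        return ch.lower()
    if ch.islower():
        return ch.upper()
    return ch

def transform_file(in_path: str, out_path: str) -> None:
    """Copy in_path to out_path, swapping the case of every letter."""
    with open(in_path, "r", encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        while True:
            ch = src.read(1)   # one character at a time
            if ch == "":       # empty string signals end of file
                break
            dst.write(swap_case_char(ch))
```

Calling `transform_file("input.txt", "output.txt")` (both names illustrative) performs the transformation; only the body of the loop needs changing for other character-level transformations.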

References

1. American National Standards Institute, New York (1974) USA Standard COBOL, X3.23-1974
2. United States of America Standards Institute, New York (1966) USA Standard FORTRAN –
USAS X3.9-1966
3. Radin G, Paul Rogoway H (1965) NPL: highlights of a new programming language. Commun
ACM 8(1):9–17
4. Massalin H (1987) Superoptimizer – a look at the smallest program. In: Proceedings of the second
international conference on architectural support for programming languages and operating
systems (ASPLOS-II). Palo Alto, California. Published as ACM SIGPLAN Notices 22:10, pp
122–126
5. Lindholm T, Yellin F (1997) The Java virtual machine specification. The Java series. Addison-Wesley, Reading
6. Johnson SC (1978) Lint, a C program checker. Technical report. Bell Laboratories, Murray Hill,
07974
7. Sammet JE (1969) Programming languages: history and fundamentals. Prentice-Hall, Englewood Cliffs
8. Tanenbaum AS, Austin T (2013) Structured computer organization. Pearson, Upper Saddle River
9. Hennessy JL, Patterson DA (2012) Computer architecture – a quantitative approach, 5th edn.
Morgan Kaufmann, San Francisco
Chapter 2
Compilers and Interpreters

Before looking at the details of programming language implementation, we need to
examine some of the characteristics of programming languages to find out how they
are structured and defined. A compiler, or any other language implementation, is a
large and complex software system, and it is vital to have some clear and preferably
formal structure to support its construction.
This chapter examines some of the approaches that can be used for high-level
programming language implementation on today’s computer hardware and provides
some of the background to enable high-level to low-level language translation soft-
ware to be designed in a structured and standard way.

2.1 Approaches to Programming Language Implementation

The traditional approach for the implementation of a programming language is to
write a program that translates programs written in that language into equivalent
programs in the machine code of the target processor. To make the description of this
process a little easier, we shall assume that the source program is written in a language
called mylanguage and we are producing a program to run on mymachine. This
view is shown in Fig. 2.1.
This trivial diagram is important because it forces us to consider some important
issues. First, what precisely is the nature of the source program? It is obviously
a program written in mylanguage—the language we are trying to implement.
But before we can contemplate an implementation of the translator software, we
have to have a precise definition of the rules and structure of programs written in
mylanguage. In Sect. 2.2, we consider the nature of such a definition.

© Springer International Publishing AG 2017
D. Watson, A Practical Approach to Compiler Construction, Undergraduate Topics in Computer Science, DOI 10.1007/978-3-319-52789-5_2

[Figure: source program written in mylanguage → translator → program to run on mymachine]
Fig. 2.1 A simple view of programming language implementation

Second, what sort of programming language is mylanguage? At this stage it
may not really matter, but the nature of the language will of course affect the design of
the translator software. For example, if mylanguage is a high-level programming
language and mymachine is a hardware-implemented processor, then the translator
program is usually called a compiler. This book concentrates on this particular con-
figuration. If, however, mylanguage is an assembly language (for mymachine)
then the translator is usually called an assembler and would be significantly easier
to implement than a compiler.
Can programs in mylanguage be passed directly into the translator or does it
make more sense to preprocess the programs first? For example, programs written in
C can include language features best dealt with by a preprocessor. This stage could
of course be regarded as being an integral part of the translation process.
Third, what sort of language is used to express the program to run on mymachine?
Again if mymachine is a hardware-implemented processor, then we are probably
looking at the generation of machine code programs encoded in some object file for-
mat dependent on the design of mymachine and probably on the operating system
running on mymachine. Generating assembly language may be the right thing to
do, requiring the existence of a separate assembler program to produce code directly
runnable on mymachine. And there are other possibilities too. A popular approach
to language implementation assumes that mymachine is a virtual machine. This is
a machine for which there is no corresponding hardware; it exists only as a
consequence of a piece of software which emulates the virtual machine instructions.
This approach is examined in Sect. 2.1.1. Furthermore, the translator could have a
somewhat different role. It could be used to translate from mylanguage to a high-level
language rather than to a low-level language for mymachine. This would result
in a software tool which could be used to translate programs from one high-level
language to another, for example from C++ to C. In this case, aspects of the internal
design of the translator may be rather different from those of a conventional compiler,
and there is an assumption that mymachine somehow runs programs written in the
target high-level language. There are many common principles and algorithms that can
be used in all these language translation tasks. Whatever the form of mymachine
and the generated code, a precise specification is required, just as for mylanguage.
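As a concrete, deliberately tiny illustration of the virtual machine configuration, the sketch below invents a trivial mylanguage of flat integer expressions and a three-instruction stack machine as mymachine; the instruction names (PUSH, ADD, MUL) are purely illustrative and do not come from any real system. The translator emits virtual machine instructions, and the emulator is the piece of software that gives the virtual machine its existence:

```python
# Toy illustration of mymachine as a virtual machine: mylanguage is
# just flat integer expressions such as "2 * 3 + 4" (evaluated left to
# right, with no precedence), and the target is a stack machine whose
# invented instructions are emulated entirely in software.

def translate(tokens):
    """Translate a tokenised flat expression into a list of
    stack-machine instructions."""
    code = [("PUSH", int(tokens[0]))]
    i = 1
    while i < len(tokens):
        op, operand = tokens[i], int(tokens[i + 1])
        code.append(("PUSH", operand))
        code.append(("ADD",) if op == "+" else ("MUL",))
        i += 2
    return code

def emulate(code):
    """The software emulator: execute the instructions on an operand
    stack and return the final result."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        elif instr[0] == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif instr[0] == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()
```

For example, `emulate(translate("2 * 3 + 4".split()))` yields 10, since this toy translator applies operators strictly left to right; a real compiler targeting a hardware mymachine would instead emit machine code in some object file format.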
At the beginning of this section is a statement that the translator generates “equiv-
alent programs” for mymachine. This is an important issue. The translator should
preserve the semantics of the mylanguage program in the running of the generated
code on mymachine. The semantics of mylanguage may be specified formally
or informally and the user of mylanguage should have a clear idea of what each
valid program should “mean”. For example, translating the statement a = a + 2
effectually destroying their identity. In their original condition they
would practically be worth nothing to the illegal possessors,
inasmuch as no man dare offer them for sale; but by taking out the
gems and melting the gold the materials could thus be converted into
cash. I ascertained that when the Museum was closed in the evening
previous to the robbery being discovered, everything was safe.’
It appeared that it was the duty of the chief subordinate, one
Maximoff, to go round the hall the last thing, after it had been closed
to the public for the day, and see that everything was safe. He then
reported to General Kuntzler. This had been done with great
regularity. It so happened, however, that the day preceding the
discovery that the jewels had been stolen was an official holiday. At
stated periods in Russia there is an official holiday, when all public
Government departments are closed. This holiday had favoured the
work of the thieves, and some time during the forty hours that
elapsed between the closing of the hall in the evening before the
holiday, and the discovery of the robbery on the morning after the
holiday, the jewels had been carried off.
The holiday was on a Wednesday; on Tuesday evening Maximoff
made his round of inspection as usual, and duly presented his official
report to his chief, General Kuntzler. According to that report,
everything was safe; the place was carefully locked up, and all the
keys deposited in the custody of the General, who kept them in an
iron safe in his office. It was pretty conclusively proved that those
keys never left the safe from the time they were deposited there on
Tuesday night until Maximoff went for them on Thursday morning.
During the whole of Wednesday Maximoff and the attendants were
away. Maximoff was a married man, with three children, and he had
taken his family into the country. Kuntzler remained, and there was
the usual military guard at the Treasury. The guard consisted of six
sentinels, who did duty night and day, being relieved every four
hours.
‘The whole affair was very complicated,’ proceeds Danevitch, ‘and I
found myself confronted with a problem of no ordinary difficulty. I was
satisfied, however, that General Kuntzler was entirely innocent of
any complicity in the affair; and, so far as I could determine then,
there was not the slightest ground for suspecting Maximoff. There
were twelve other subordinates. They were charged with the duty of
dusting the various glass cases in which the jewels were deposited,
and of keeping the people in order on public days, and I set to work
in my own way to endeavour to find out what likelihood there was of
any of these men being confederates. It seemed to me that one or
more of them had been corrupted, and proved false to his charge.
Without an enemy in the camp it was difficult to understand how the
thieves had effected an entrance.’
The Treasury was a large white stone building, with an inner
courtyard, around which were grouped numerous Government
offices. The entrance to this yard was by a noble archway, closed by
a massive and ornamental iron gate. In this gateway a sentry was
constantly posted. The Museum was situated in about the centre of
the left wing of the main block of buildings. The entrance was from
the courtyard, and the hall, being in an upper story, was reached by
a flight of marble steps. To gain admission to the hall, the public were
necessarily compelled to pass under the archway, and so into the
courtyard. Of course there were other ways of reaching the hall of
jewels, but they were only used by the employés and officials.
General Kuntzler, his lieutenant, Maximoff, and four of the
subordinates, resided on the premises. They had rooms in various
parts of the building.
A careful study of the building, its approaches and its exits, led
Danevitch to the conclusion that the thief or thieves must have
reached the hall from one of the numerous Government offices on
the ground-floor of the block, or from the direction of Kuntzler’s
apartments, and he set to work to try and determine that point. He
found that one of the offices referred to was used as a depository for
documents relating to Treasury business, and beneath it, in the
basement, was an arched cellar, also used for storing documents.
This cellar was one of many others, all connected with a concreted
subway, which in turn was connected with the upper stories by a
narrow staircase, considered strictly private, and used, or supposed
to be used, by the employés only. The office was officially known as
Bureau 7. Exit from it could be had by a door, which opened into a
cul-de-sac, and was not a public thoroughfare. It was, in fact, a
narrow alley, formed by the Treasury buildings and a church.
Danevitch was not slow to perceive that Bureau 7 and the cul-de-sac
offered the best, if not the only, means of egress to anyone who,
being on the premises illegally, wished to escape without being seen.
It was true that one of the sentries always on duty patrolled the cul-
de-sac at intervals; but that, to the mind of Danevitch, was not an
insuperable obstacle to the escape of anyone from the building. Of
course, up to this point it was all conjecture, all theory; but the astute
detective brought all his faculties to bear to prove that his theory was
a reasonable one.
He ascertained that the door into the cul-de-sac was very rarely used
indeed, and had not been opened for a long time, as the office itself
was only a store-room for documents, and days often passed without
anyone going into it. Critical examination, however, revealed to
Danevitch that the outer door had been very recently opened. This
was determined by many minute signs, which revealed themselves
to the quick and practised eyes of the detective. But something more
was forthcoming to confirm him in his theory. On the floor of Bureau
7 he found two or three diamonds, and in the passage of the cul-de-
sac he picked up some more. Here, then, at once was fairly positive
proof that the thief or thieves had made their exit that way. Owing to
rough handling, or to the jarring together of the stolen things, some
of the precious stones had become detached, and by some
carelessness or other a number of them had fallen unperceived to
the ground; these as surely pointed the way taken by the robbers as
the lion in the desert betrays his track by the spoor. This important
discovery Danevitch kept to himself. He was fond of likening his
profession to a game at whist, and he used to say that the cautious
and skilful player should never allow his opponent to know what
cards he holds.
Having determined so much, his next step was to discover, if
possible, the guilty persons. It was tolerably certain that, whoever
they were, they must have been well acquainted with the premises.
Of course it went without saying that no one could have undertaken
and carried out such an extraordinary robbery without first of all
making a very careful study of every detail, as well as of every
means of reaching the booty, and of conveying it away when
secured. The fact of the robbery having been committed on the
Wednesday, which was a Government holiday, showed that it had
been well planned, and it was equally evident that somebody
concerned in it was intimately acquainted with the premises and all
their ramifications. The importance of the discovery of the way by
which the criminals had effected their escape could not be overrated,
and yet it was of still greater importance that the way by which they
entered should be determined. To do that, however, was not an easy
matter. The probability—a strong probability—was that those
concerned had lain perdu in the building from the closing-time on
Tuesday night until the business was completed, which must have
been during the hours of darkness from Tuesday night to
Wednesday morning, or Wednesday night and Thursday morning. In
the latter case, however, the enterprising ‘exploiters’ must have
remained on the premises the whole of Wednesday, and that was
hardly likely. They certainly could not have entered on Wednesday,
because as it was a non-business day a stranger or strangers
seeking admission would have been challenged by the sentries, and
not allowed to pass without a special permit. At night a password
was always sent round to the people residing in the building, and if
they went out they could not gain entrance again without giving the
password. These precautions were, in an ordinary way, no doubt,
effective enough; but the fact that on this occasion they had proved
of no avail pointed to one thing certain, which was that the intruders
had gained admission on the Tuesday with the general public, but
did not leave when the Museum was closed for the night, and to
another thing, not so certain, but probable, that they had been
assisted by somebody living on the premises.
Altogether something like sixty persons had lodgings in the Treasury
buildings, but only fourteen of these persons, including Kuntzler
himself, were attached to the Museum portion. The General’s
apartments were just above the hall in which the Crown jewels were
kept. He had a suite of six rooms, including a kitchen and a servant’s
sleeping-place. He was a widower, but his sister lived with him as his
housekeeper. She was a widow; her name was Anna Ivanovna. The
General also had an adopted daughter, a pretty girl, about twenty
years of age: she was called Lydia. It appeared she was the natural
child of one of the General’s comrades, who had been killed during
an émeute in Siberia, where he was stationed on duty. On the death
of his friend, and being childless himself, Kuntzler took the girl, then
between six and seven years of age, and brought her up. For
obvious reasons, of course, Danevitch made a study of the
General’s household, and so learned the foregoing particulars.
As may be imagined, the General’s death was a terrible blow to his
family, and Lydia suffered such anguish that she fell very ill.
Necessarily it became the duty of Danevitch to endeavour to
ascertain by every means in his power if Kuntzler’s suicide had
resulted from any guilty knowledge of the robbery. But not a scrap of
evidence was forthcoming to justify suspicion, though the outside
public suspected him. That, perhaps, was only natural. As a matter
of fact, however, he bore a very high reputation. He had held many
important positions of trust, and had been elected to the post of
Crown Jewel Keeper, on the death of his predecessor, on account of
the confidence reposed in him by the Government, and during the
time he had held the office he had given the utmost satisfaction. An
examination of his books—he had to keep an account of all the
expenses in connection with his department—his papers and private
letters, did not bring to light a single item that was calculated to
arouse suspicion, and not a soul in the Government service breathed
a word against him, while he was highly respected and esteemed by
a very large circle of friends.
It was admitted on all sides that General Kuntzler was a very
conscientious and sensitive man. The knowledge of the robbery
came upon him with a suddenness that overwhelmed him, and, half
stunned by the shock, his mind gave way, and he adopted the weak
man’s method to relieve himself of a terrible responsibility. That was
the worst that anyone who knew him ventured to say; he was
accorded a public and a military funeral, and was carried to his last
resting-place amidst the genuine sorrow of great numbers of people.
‘I confess that at this stage of the proceedings,’ writes Danevitch in
his notes of the case, ‘I did not feel very sanguine of success in the
task imposed upon me; and when Colonel Andreyeff, Chief of the
Moscow Police, sent for me, and asked my views, I frankly told him
what I thought, keeping back, however, for the time being, the
discovery I had made, that the culprits had departed from the
building by Bureau 7, and had scattered some diamonds on the way.
The Colonel became very grave when he learnt my opinion, and paid
me the compliment of saying that great hopes had been placed on
me, that the reputation of his department was at stake, and if the
jewels were not recovered, and the culprits brought to justice, it
might cost him his position. I pointed out that I was quite incapable of
performing miracles; that while I could modestly claim to have been
more successful in my career than any other man following the same
calling, it was not within my power to see through stone walls, or
divine the innermost secrets of men’s hearts.
‘“But you are capable of reading signs which other men have no
eyes for,” exclaimed the Colonel.
‘“Possibly,” I answered, as I bowed my thanks for the good opinion
he held of me; “but in this instance I see no sign.”
‘“But you are searching for one?” said the Colonel anxiously.
‘“Oh, certainly I am,” I responded.
‘The anxious expression faded from the Colonel’s face, and he
smiled as, fixing his keen gray eyes on me, he remarked:
‘“As long as you are still searching for a sign, Danevitch, there is
hope. There must be a sign somewhere, and unless you have grown
blind and mentally dull, it will not escape you for long.”
‘This was very flattering to my amour propre, and I admit that it had a
tendency to stimulate me to renewed exertion, if stimulus was really
needed. But, as a matter of fact, I was not just then very hopeful.
Nevertheless, as I took my leave, I said that, if the problem was
solvable by mortal man, I would solve it. This was pledging myself to
a good deal; but I was vain enough to think that, if I failed by
methods which I had made a lifelong study, to say nothing of a
natural gift for my work, no one else was likely to succeed, except by
some accident which would give him the advantage.’
Like most men of exceptional ability, Danevitch was conscious of his
strength, but he rarely allowed this self-consciousness to assert
itself, and when he did he was justified. His methods were certainly
his own, and he never liked to own defeat. That meant that where he
failed it was hardly likely anyone else would have succeeded. Not
only had he a tongue cunning to question, an eye quick to observe,
but, as I have said elsewhere, a sort of eighth sense, which enabled
him to discern what other men could not discern.
After that interview with Colonel Andreyeff, he fell to pondering on
the case, and bringing all the logic he was capable of to bear. He
saw no reason whatever to change his first opinion, that there had
been an enemy in the camp. By that is meant that the robbery could
never have been effected unless with the aid of someone connected
with the place, and knowing it well. Following his course of
reasoning, he came to the decision that the stolen property was still
within the Kremlin. His reason for this was, as he states:
‘The thieves could not have passed out during the night, as they
would have been questioned by the guards at the gates. Nor could
they have conveyed out such a bulky packet on Wednesday, as they
would have been called upon for a permit. On the other hand, if the
property had been divided up into small parcels, the risk would have
been great, and suspicion aroused. But assuming that the thieves
had been stupid enough to carry off the things in bulk, they must
have known that they were not likely to get far before attracting
attention, while any attempt to dispose of the articles as they were
would have been fatal. To have been blind to these tremendous risks
was to argue a denseness on the part of the culprits hardly
conceivable of men who had been clever enough to abstract from a
sentry-guarded Government building property of such enormous
value. They would know well enough that melted gold and loose
gems could always find a market; but, having regard to the hue and
cry, that market was hardly likely to be sought for in any part of
Russia. Therefore, when reduced to an unrecognisable state, and
when vigilance had been relaxed, the gold and the jewels would be
carried abroad to some of the centres of Europe, where the infamous
receiver flourishes and waxes fat on the sins of his fellow-men.
‘In accordance with my custom in such cases,’ continues Danevitch
in his notes, ‘I lost not a moment when I took up the case in
telegraphing to every outlet from Russia, including the frontier posts.
I knew, therefore, that at every frontier station and every outlet
luggage would be subjected to very critical examination, and the
thieves would experience great difficulty indeed in getting clear. But
there was another aspect of the case that could not be overlooked,
and it caused me considerable anxiety; it was this—the gems could
be carried away a few at a time. A woman, for instance, could
conceal about her person small packets of them, and excite no
suspicion. To examine everyone personally at the frontiers was next
to impossible. There was another side, however, to this view, and it
afforded me some consolation. To get the gems out of the country in
the way suggested would necessitate a good many journeys on the
part of the culprits, and one person making the same journey several
times would excite suspicion. If several people were employed in the
work, they would be certain to get at loggerheads sooner or later,
and the whole business would be exposed. I always made it a sort of
axiom that “when thieves fall out honest men come by their due,”
and experience had taught me that thieves invariably fall out when it
comes to a division of plunder. Of course, I was perfectly alive to the
fact that it would not do to rely upon that; something more was
wanted: it was of the highest importance to prevent the stolen
property being carried far away, and all my energies were
concentrated to that end.
‘I have already given my reasons for thinking that at this stage the
stolen jewels had not been removed from the Kremlin. Although
there are no regular streets, as understood, in the Kremlin, there are
numerous shops and private residences, the latter being inhabited
for the most part by the officials and other employés of the numerous
Government establishments. The result is that within the Kremlin
itself there is a very large population.’
It will be seen from these particulars that the whole affair bristled with
difficulties, and, given that the thieves were sharp, shrewd, and
cautious, they might succeed in defeating Danevitch’s efforts. One of
the first things he did was to request that every sentry at the Kremlin
gates should be extra vigilant, and subject passers to and fro to
more than ordinary observation, while if they had reason to suspect
any particular person, that person should be instantly arrested. The
precautions which were thus taken reduced the matter to a game of
chance. If the thieves betrayed themselves by an incautious or
careless act they would lose. On the other hand, if they were skilful
and vigilant the detective would be defeated; and as the stakes were
very large, and to lose meant death to them (that being the penalty in
Russia for such a crime), it was presumable that they would not
easily sacrifice themselves. At this stage Danevitch himself
confessed that he would not have ventured to give an opinion as to
which of the two sides would win.
The more Danevitch studied the subject, the more he became
convinced that the thieves must have been in league with someone
connected with the Treasury Department. In face of the fact that
false keys had been used, the theory of collusion could not be
ignored; the difficulty was to determine who was the most likely
person to have proved traitor to his trust. Maximoff bore a high
character; General Kuntzler had reposed full confidence in him. The
subordinates were also men of good repute. That, however, was not
a guarantee that they were proof against temptation. Nevertheless,
Danevitch could not get hold of anything that was calculated to
arouse his suspicion against any particular individual. If there was a
guilty man amongst them, he would, of course, be particularly careful
not to commit any act, or utter any word, calculated to betray him,
knowing as he did that Danevitch was on the alert.
When several days had passed, and General Kuntzler had been
consigned to his tomb, Danevitch had an interview with his sister,
Anna Ivanovna. She was in a state of great mental excitement and
nervous prostration; and Lydia, the General’s adopted daughter, was
also very ill. Anna was a somewhat remarkable woman. She was a
tall, big-boned, determined-looking individual, with a soured
expression of face and restless gray eyes. Her manner of speaking,
her expression of face, and a certain cynicism, which made itself
apparent in her talk, gave one the notion that she was a
disappointed woman.
‘This is a sad business,’ began Danevitch, after some preliminary
remarks.
‘Very sad,’ she answered. ‘It has cost my brother his life.’
‘He evidently felt it very keenly,’ said Danevitch.
‘A man must feel a thing keenly to commit suicide, unless he is a
weak-brained fool, incapable of any endurance,’ she replied with a
warmth that amounted almost to fierceness. After a pause, she
added: ‘My brother was far from being a fool. He was a strong man
—a clever man.’
‘So I understand. Did he make any observation to you before he
committed the rash act?’
‘No.’
‘Yes, he did, Anna,’ cried out Lydia from the couch on which she was
lying, wrapped in rugs.
Anna turned upon her angrily, and exclaimed:
‘How do you know? Hold your tongue. He made no observation, I
say.’