CompilerTalk 2019

21 compilers and

3 orders of magnitude
in 60 minutes
a wander through a weird landscape to the heart of compilation

Spring 2019

!1
Hello!
• I am someone who has worked (for pay!) on some compilers: rustc, swiftc, gcc, clang, llvm, tracemonkey, etc.

• Ron asked if I could talk about compiler stuff I know, give perspective on the field a bit.

• This is for students who already know roughly how to write compilers, not going to cover that!

!2
the speaker, in 1979
I like compilers!

• Relationship akin to "child with many toy dinosaurs".

• Some are bigger and scarier. We will look at them first.

• Some are weird and wonderful. We will visit them along the way.

• Some are really tiny!

!3
Borrowsaur fighting a Thunkasaur
Goal for talk
• I expect gap between class projects and industrial compilers is overwhelming.

• Want to explore space between, demystify and make more design choices clear.

• Reduce terror, spark curiosity, encourage trying it as career!

• If I can compiler, so can you!

!4
Plan of talk

• Describe a few of the giants.

• Talk a bit about what makes them so huge & complex.

• Wander through the wilderness (including history) looking for ways compilers can vary, and examining specimens.

• Also just point out stuff I think is cool / underappreciated.

!5
Caveats
• I'm not a teacher or very good at giving talks.

• Lots of material, not ideal to stop for questions unless you're absolutely lost. Gotta keep pace!

• But: time at end for questions and/or email followup. Happy to return to things you're curious about. Slides are numbered! Jot down any you want to ask about.

• Apologies: not as much industry-talk as I promised. Will try for some. But too many dinosaurs for show and tell!

!6
Part 1: some giants

!7
Specimen #1

Clang
• ~2m lines of C++: 800k lines clang plus 1.2m LLVM. Self-hosting, bootstrapped from GCC.

• C-language family (C, C++, ObjC), multi-target (23).

• Single AST + LLVM IR.

• 2007-now, large multi-org team.

• Good diagnostics, fast code.

• Originally Apple, more permissively licensed than GCC.

(code sample on slide: clang's CodeGenFunction::EmitLValue)

!8
Specimen #2

Swiftc
• ~530k lines of C++ plus 2m lines clang and LLVM. Many same authors. Not self-hosting.

• Newer app-dev language.

• Tightly integrated with clang, interop with C/ObjC libraries.

• Extra SIL IR for optimizations.

• Multi-target, via LLVM.

• 2014-now, mostly Apple.

(code sample on slide: swiftc's RValueEmitter::visitIfExpr)

!9
Specimen #3

Rustc
• ~360k lines of Rust, plus 1.2m lines LLVM. Self-hosting, bootstrapped from OCaml.

• Newer systems language.

• Two extra IRs (HIR, MIR).

• Multi-target, via LLVM.

• 2010-now, large multi-org team.

• Originally mostly Mozilla. And yes I did a lot of the initial bring-up so my name is attached to it forever; glad it worked out!

(code sample on slide: rustc's expr_as_rvalue)

!10
Aside: what is this "LLVM"?
• Notice the last 3 languages all end in LLVM. "Low Level Virtual Machine" https://github.com/llvm/llvm-project

• Strongly typed IR, serialization format, library of optimizations, lowerings to many target architectures.

• "One-stop-shop" for compiler backends.

• 2003-now, UIUC at first, many industrial contributors now.

• Longstanding dream of compiler engineering world, possibly most successful attempt at it yet.

• Here is a funny diagram of modern compilers from Andi McClure (https://runhello.com/)

!11
Specimen #4

GNU Compiler Collection (GCC)

• ~2.2m lines of mostly C, C++. 600k lines Ada. Self-hosting, bootstrapped from other C compilers.

• Multi-language (C, C++, ObjC, Ada, D, Go, Fortran), multi-target (21).

• Language & target-agnostic TREE AST and RTL IR. Challenging to work on.

• 1987-present, large multi-org team.

• Generates quite fast code.

• Originally political project to free software from proprietary vendors. Licensed somewhat protectively.

(code sample on slide: GCC's find_reusable_reload)

!12
Part 2: why so big?

!13
Size and economics
• Compilers get big because the development costs are seen as justified by the benefits, at least to the people paying the bills.

• Developer productivity: highly expressive languages, extensive diagnostics, IDE integration, legacy interop.

• Every drop of runtime performance: shipping on billions of devices or gigantic multi-warehouse fleets.

• Covering & exploiting all the hardware: someone makes a new chip, they pay for an industrial compiler to make use of it.

• Writing compilers in verbose languages: for all the usual reasons (compatibility, performance, familiarity).

!14
Tradeoffs and balance
• This is ok!

• The costs and benefits are context dependent.

• Different contexts, weightings: different compilers.

• Remainder of talk will be exploring those differences.

• Always remember: balancing cost tradeoffs by context.

• Totally biased subset of systems: stuff I think is interesting and worth knowing, might give hope / inspire curiosity.

!15
Part 3: variations
(this part is much longer)

!16
Variation #1

Fewer optimizations

• In some contexts, "all the optimizations" is too much.

• Too slow to compile, too much memory, too much development / maintenance effort, too inflexible.

• Common in JITs, or languages with lots of indirection anyways (dynamic dispatch, pointer chasing): optimizer can't do too well anyways.

!17
Proebsting's law
• "Compiler Advances Double
Computing Power Every 18 Years"

• Sarcastic joke / real comparison to Moore's law: hardware doubles power every 18 months. Swamps compilers.

• Empirical observation though! Optimizations seem to only win ~3-5x, after 60+ years of work.

• Less-true as language gains more abstractions to eliminate (i.e. specialize / de-virtualize). More true if lower-level.

Scott, Kevin. On Proebsting's Law. 2001

!18
Frances Allen
Got All The Good Ones
• 1971: "A Catalogue of Optimizing Transformations".

• The ~8 passes to write if you're going to bother.

• Inline, Unroll (& Vectorize), CSE, DCE, Code Motion, Constant Fold, Peephole.

• That's it. You're welcome.

• Many compilers just do those, get ~80% best-case perf.

https://commons.wikimedia.org/wiki/File:Allen_mg_2528-3750K-b.jpg - CC BY-SA 2.0

!19
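To make the catalogue concrete, here is a toy sketch (mine, not from the talk) of two of Allen's passes, constant folding and dead-code elimination, over a tiny tuple-based expression IR. All names and the IR shape are made up for illustration.

```python
# Toy IR: ("const", n), ("var", name), ("add", l, r), ("mul", l, r).

def const_fold(e):
    """Bottom-up constant folding: fold children first, then this node."""
    if e[0] in ("add", "mul"):
        l, r = const_fold(e[1]), const_fold(e[2])
        if l[0] == "const" and r[0] == "const":
            return ("const", l[1] + r[1] if e[0] == "add" else l[1] * r[1])
        return (e[0], l, r)
    return e

def free_vars(e):
    """Variables an expression reads."""
    if e[0] == "var":
        return {e[1]}
    if e[0] in ("add", "mul"):
        return free_vars(e[1]) | free_vars(e[2])
    return set()

def dce(stmts, live):
    """Walk (target, expr) assignments backwards, dropping any whose target
    is never read later (assumes expressions have no side effects)."""
    out = []
    for target, expr in reversed(stmts):
        if target in live:
            out.append((target, expr))
            live |= free_vars(expr)
    return list(reversed(out))
```

Used together on `2 + 3*4` and a dead assignment, the folder collapses the tree to a single constant and DCE drops the unused binding.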
Specimen #5

V8
• 660k lines C++ including backends. Not self-hosting.

• JavaScript compiler in Chrome, Node.

• Multi-target (7), multi-tier JIT. Optimizations mix of classical stuff and dynamic language stuff from Smalltalk.

• Multiple generations of optimization and IRs. Always adjusting for sweet spot of runtime perf vs. compile time, memory, maintenance cost, etc.

• Recently added slower (non-JIT) interpreter tier, removed others.

• 2008-present, mostly Google, open source.

(code sample on slide: V8's InstructionSelector::VisitWordCompareZero)

!20
Variation #2

Compiler-friendly implementation
(and input) languages
• Note: your textbook has 3 implementation flavours. Java, C, ML. No coincidence.

• ML designed as implementation language for symbolic logic (expression-tree wrangling) system: LCF (1972).

• LCF written in Lisp. Lisp also designed as implementation language for symbolic logic system: Advice Taker (1959).

• Various family members: Haskell, OCaml, Scheme, Racket.

• All really good at defining and manipulating trees. ASTs, types, IRs, etc. Usually make for much smaller/simpler compilers.

!21
Specimen #6

Glasgow Haskell Compiler (GHC)

• 180k lines Haskell, self-hosting, bootstrapped from Chalmers Lazy ML compiler.

• Pure-functional language, very advanced type-system.

• Several tidy IRs after AST: Core, STG, CMM. Custom backends.

• 1991-now, initially academic researchers, lately Microsoft after they were hired there.

(code sample on slide: GHC's stmtToInstrs)

!22
Specimen #7

Chez Scheme
• 87k lines Scheme (a Lisp), self-hosting, bootstrapped from C-Scheme.

• 4 targets, good performance, incremental compilation.

• Written on "nanopass framework" for compilers with many similar IRs. Chez has 27 different IRs!

• 1984-now, academic-industrial, mostly single developer. Getting down to the size-range where a compiler is small enough to be that.

(code sample on slide: Chez's asm-size and asm-move definitions)

!23
Specimen #8

Poly/ML
• 44k lines SML, self-hosting.

• Single machine target (plus byte-code), AST + IR, classical optimizations. Textbook style.

• Standard platform for symbolic logic packages in Isabelle and HOL.

• 1986-now, academic, mostly single developer.

(code sample on slide: Poly/ML's cgOp code emitter)

!24
Specimen #9

CakeML
• 58k lines SML, 5 targets, self-hosting.

• 9 IRs, many simplifying passes.

• 160k lines HOL proofs: verified!

• Language semantics proven to be preserved through compilation!!!

• Cannot emphasize enough. This was science fiction when I was young.

• CompCert first serious one, now several.

• 2012-now, deeply academic.

(code sample on slide: CakeML's WordOp64_on_32 definition)

!25
Variation #3

Meta-languages
• Notice Lisp / ML code looks a bit like grammar productions: recursive branching tree-shaped type definitions, pattern matching.

• There's a language lineage that took that idea ("programs as grammars") to its logical conclusion: metacompilers (a.k.a. "compiler-compilers"). Ultimate in "compiler-friendly" implementation languages.

• More or less: parser glued to an "un-parser".

• Many times half a metacompiler lurks in more-normal compilers:

• YACCs ("yet another compiler-compiler"): parser-generators

• BURGs ("bottom-up rewrite generators"): code-emitter-generators

• See also: GCC ".md" files, LLVM TableGen. Common pattern!

!26
Aside: SRI-ARC
• Stanford Research Institute - Augmentation Research Lab. US Air
Force R&D project. Very famous for its NLS ("oNLine System").

• History of that project too big to tell here. Highly influential in forms of
computer-human interaction, hypertext, collaboration, visualization.

• Less well-known is their language tech: TREE-META and MPS/MPL.

!27
Specimen #10

TREE-META
• 184 lines of TREE-META. Bootstrapped from META-II.

• In the Schorre metacompiler family (META, META-II).

• SRI-ARC, 1967. Made to support language tools in the NLS project.

• "Syntax-directed translation": parse input to trees, un-parse to machine code. Only guided by grammars.

• Hard to provide diagnostics, type-checking, optimization, really anything other than straight translations.

• But: extremely small, simple compilers. Couple pages. Ideal for bootstrap phase.

(code sample on slide: an excerpt of TREE-META's own grammar and output rules)

!28
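The "parse to trees, un-parse to code" shape can be sketched in a few lines. This is a minimal illustration in Python (not TREE-META itself): a recursive-descent parser builds a tree from the grammar, and an "un-parser" walks it emitting postfix stack-machine instructions. The instruction names are invented.

```python
import re

def parse(src):
    """Tiny recursive descent: expr := term ('+' term)*, term := NUM ('*' NUM)*."""
    toks = re.findall(r"\d+|[+*]", src)
    pos = 0
    def peek():
        return toks[pos] if pos < len(toks) else None
    def take():
        nonlocal pos
        t = toks[pos]; pos += 1
        return t
    def num():
        return ("num", int(take()))
    def term():
        node = num()
        while peek() == "*":
            take(); node = ("mul", node, num())
        return node
    def expr():
        node = term()
        while peek() == "+":
            take(); node = ("add", node, term())
        return node
    return expr()

def unparse(node):
    """'Un-parse' the tree into postfix stack-machine code."""
    if node[0] == "num":
        return [("PUSH", node[1])]
    op = {"add": "ADD", "mul": "MUL"}[node[0]]
    return unparse(node[1]) + unparse(node[2]) + [(op,)]
```

As on the slide, the whole translator is driven by the grammar: there is nowhere for type-checking or optimization to live, but the result is tiny.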
Specimen #11 (Segue)

Mesa
• 42k lines of Mesa (bootstrapped
from MPL, itself from TREE-META).

• One of my favourite languages!

• Strongly typed, modules with separate compilation and type checked linking. Highly influential (Modula, Java).

• Co-designed language, OS, and byte-code VM implemented in CPU microcode, adapted to compiler.

• Xerox PARC, 1976-1981, small team left SRI-ARC, took MPL.

!29 https://www.flickr.com/photos/microsoftpdc/4119070676/ - CC BY 2.0


Variations #4, #5, and #6

leverage interpreters

• Mesa and Xerox PARC is a nice segue into next few points: all involve compilers interacting with interpreters.

• Interpreters & compilers actually have a long relationship!

• In fact interpreters predate compilers.

• Let us travel back in time to the beginning, to illustrate!

!30
Origins of "computer"
• 1940s: First digital
computers.

• Before: fixed-function
machines and/or humans
(largely women) doing job
called "computer".

• Computing power literally


measured in "kilo-girls"
and "kilo-girl-hours".

!31
ENIAC: general hardware
• 1945: ENIAC built for US
Army, Ordnance Corps.
Artillery calculations in
WWII.

• "Programmers" drawn
from "computer" staff, all
women.

• "Programming" meant
physically rewiring per-
task.

!32
Stored Programs
• 1948: Jean Bartik leads
team to convert ENIAC to
"stored programs",
instructions (called
"orders") held in memory.

• Interpreted by hardware.
Faster to reconfigure than
rewiring; but ran slower.

• Subroutine concept
developed for factoring
stored programs.

!33
First software pseudo codes:
interpreters on ENIAC, BINAC, UNIVAC

• 1949: "Short Code" software interpreters for higher level "pseudo-code" instructions (non-HW-interpreted) that denote subroutine calls and expressions. ~50x slower than HW-interpreted.

!34
Specimen #12

A-0: the first compiler


• Reads interpreter-like pseudo-
codes, then emits "compilation"
program with all codes resolved
to their subroutines.

• Result runs almost as fast as


manually coded; but as easy to
write-for as interpreter. An
interpreter "fast mode".

• Rationale all about balancing time


tradeoffs (coding-time, compiler-
execution-time, run-time).

• 1951, Grace Hopper, Univac

http://commons.wikimedia.org/wiki/File:Grace_Murray_Hopper,_in_her_office_in_Washington_DC,_1978,_©Lynn_Gilbert.jpg
!35 - CC BY-SA 4.0
Balance between
interpretation and compilation
is context dependent too!

!36
Variation #4

Only compile from frontend to IR, interpret residual VM code
• Can stop before real machine code. Emit IR == "virtual machine" code.

• Can further compile or just interpret that VM code.

• Residual VM interpreter has several real advantages:

• Easier to port to new hardware, or bootstrap compiler. "Just get something running".

• Fast compilation & program startup, keeps interactive user engaged.

• Simply easier to write, less labor. Focus your time on frontend semantics.

https://xavierleroy.org/talks/zam-kazam05.pdf

!37
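The residual VM interpreter in this variation can be very small. Here is a hypothetical sketch: a loop that interprets the kind of tuple-encoded stack-machine code a frontend might emit (instruction names `PUSH`/`ADD`/`MUL` are invented for the example).

```python
def run(code):
    """Interpret residual stack-VM code; each instruction is a tuple."""
    stack = []
    for ins in code:
        if ins[0] == "PUSH":
            stack.append(ins[1])
        elif ins[0] == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif ins[0] == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack[-1]
```

A dozen lines gets a program running on any host, which is exactly the porting/bootstrap advantage the slide describes; compiling the same code to machine instructions can come later.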
Specimen #13

Roslyn
• 350k lines C#, 320k lines VB. Self-hosting, bootstrapped off previous gen.

• Multi-language framework (C#, VB.NET). Rich semantics, good diagnostics, IDE integration.

• Lowers from AST to CIL IR. Separate CLR project interprets or compiles IR.

• 2011-now, Microsoft, OSS.

(code sample on slide: Roslyn's EmitBinaryOperatorInstruction)

!38
Specimen #14

Eclipse Compiler for Java (ECJ)

• 146k lines Java, self-hosting, bootstrapped off Javac.

• In Eclipse! Also in many Java products (eg. IntelliJ IDEA). Rich semantics, good diagnostics, IDE integration.

• Lowers from AST to JVM IR. Separate JVM projects interpret or compile IR.

• 2001-now, IBM, OSS.

(code sample on slide: ECJ's generateCode for the conditional operator ?:)

!39
Variation #5

Only compile some functions, interpret the rest
• Cost of interpreter only bad at inner loops or fine-grain. Outer loops or coarse-grain (eg. function calls) similar to virtual dispatch.

• Design option: interpret by default, selectively compile hot functions ("fast mode") at coarse grain. Best of both worlds!

• Keep interpreter-speed immediate feedback to user.

• Interpreter may be low-effort, portable, can bootstrap.

• Defer effort on compiler until needed.

• Anything hard to compile, just call back to interpreter.

!40
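A minimal sketch of this design, with invented names: a slow instruction-dispatch interpreter, a one-shot "JIT" (here just translating bytecode to a Python expression and using Python's own `compile` as a stand-in for a machine-code backend), and a wrapper that counts calls and swaps tiers once a function gets hot.

```python
HOT_THRESHOLD = 3  # calls before we bother compiling

def interp(code, x):
    """Slow tier: dispatch on every instruction, every call."""
    stack = []
    for op, *args in code:
        if op == "push":
            stack.append(args[0])
        elif op == "load":
            stack.append(x)
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "mul":
            stack.append(stack.pop() * stack.pop())
    return stack[-1]

def jit(code):
    """Fast tier: translate the bytecode to one Python expression, once."""
    expr = []
    for op, *args in code:
        if op == "push":
            expr.append(str(args[0]))
        elif op == "load":
            expr.append("x")
        else:
            b, a = expr.pop(), expr.pop()
            expr.append("(%s %s %s)" % (a, "+" if op == "add" else "*", b))
    return eval(compile("lambda x: " + expr[-1], "<jit>", "eval"))

def make_tiered(code):
    """Interpret by default; after HOT_THRESHOLD calls, swap in the compiled tier."""
    state = {"calls": 0, "fast": None}
    def call(x):
        state["calls"] += 1
        if state["fast"] is None and state["calls"] > HOT_THRESHOLD:
            state["fast"] = jit(code)
        return state["fast"](x) if state["fast"] else interp(code, x)
    return call
```

Both tiers compute the same answers; the only design question, as the slide says, is when paying the compile cost beats paying dispatch cost on every instruction.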
Specimen #15

Pharo/Cog
• 54k line VM interpreter and 18k line JIT: C
code generated from Smalltalk
metaprograms. Bootstrapped from Squeak.

• Smalltalk is what you'll actually hear people mention coming from Xerox PARC.

• Very simple language. "Syntax fits on a postcard". Easy to interpret.

• Complete GUI, IDE, powerful tools.

• Standard Smalltalk style: interpret by default, JIT for "fast mode". Compiler bootstraps-from and calls-into VM whenever convenient.

• Targets ARM, x86, x64, MIPS.

• 2008-now, academic-industrial consortium.

!41
Specimen #16

Franz Lisp
• 20k line C interpreter, 7,752 line Lisp compiler.

• Older command-line system, standard Unix Lisp for years.

• Like Smalltalk: very simple language. Actually an AST/IR that escaped from the lab. Easy to interpret.

• Frequent Lisp style: interpret by default; compile for "fast mode". Compiler bootstraps-from and calls-into interpreter whenever convenient.

• Targets m68k and VAX.

• 1978-1988, UC Berkeley.

(code sample on slide: Franz Lisp's e-move and d-move emitters)

!42
Variation #6

Partial Evaluation Tricks

• Consider program in terms of parts that are static (will not change anymore) or dynamic (may change).

• Partial evaluator (a.k.a. "specializer") runs the parts that depend only on static info, emits residual program that only depends on dynamic info.

• Note: interpreter takes two inputs: program to interpret, and program's own input. First is static, but redundantly treated as dynamic.

• So: compiling is like partially evaluating an interpreter, eliminating the redundant dynamic treatment in its first input.

!43
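The textbook example of the static/dynamic split is `power(x, n)`: `n` is static, `x` is dynamic. Here is a hand-written specializer sketch in Python (names invented): it runs everything that depends only on `n` at specialization time and emits a residual program over `x` alone.

```python
def power(x, n):
    """Ordinary two-input version: recursion on n happens on every call."""
    return 1 if n == 0 else x * power(x, n - 1)

def specialize_power(n):
    """Partial evaluation by hand: the loop over the static input n runs now,
    emitting a residual program that depends only on the dynamic input x."""
    body = " * ".join(["x"] * n) if n > 0 else "1"
    return eval(compile("lambda x: " + body, "<residual>", "eval"))
```

`specialize_power(3)` yields the residual `lambda x: x * x * x`: all the recursion on `n` is gone, which is exactly the "redundant dynamic treatment" the slide says compilation eliminates from an interpreter.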
Futamura Projections
• Famous work relating programs P, interpreters I, partial evaluators E, and
compilers C. The so-called "Futamura Projections":

• 1: E(I,P) → partially evaluate I(P) → emit C(P), a compiled program

• 2: E(E,I) → partially evaluate λP.I(P) → emit C, a compiler!

• 3: E(E,E) → partially evaluate λI.λP.I(P) → emit a compiler-compiler!

• Futamura, Yoshihiko, 1971. Partial Evaluation of Computation Process—An Approach to a Compiler-Compiler. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.10.2747

• Formal strategy for building compilers from interpreters and specializers.

!44
Specimen #17

Truffle/Graal
• 240k lines of Java for Graal (VM); 90k lines for Truffle (interpreter-writing framework).

• Actual real system based on first Futamura Projection.

• Seriously competitive! Potential future Oracle JVM.

• Multi-language (JavaScript, Python, Ruby, R, JVM byte code, LLVM bitcode), multi-target (3).

• "Write an interpreter with some machinery to help the partial evaluator, get a compiler for free"

• Originally academic, now Oracle

(code sample on slide: Graal's emitConditional)

!45
Variation #7

Forget IR and/or AST!

• In some contexts, even building an AST or IR is overkill.

• Small hardware, tight budget, one target, bootstrapping.

• Avoiding AST tricky, languages can be designed to help. So-called "single-pass" compilation, emit code line-at-a-time, while reading.

• Likely means no optimization aside from peephole.

!46
Specimen #18

Turbo Pascal
• 14k instructions including editor. x86 assembly. 39kb on disk.

• Famous early personal-micro compiler. Single-pass, no AST or IR. Single target.

• Proprietary ($65) so I don't have source. Here's an ad!

• 1983-1992; lineage continues into modern Delphi compiler.

!47
Specimen #19

Manx Aztec C
• 21k instructions, 50kb on disk.

• Contemporary to Turbo
Pascal, one of many
competitors.

• Unclear if AST or not, no source. Probably no IR.

• Multi-target, Z80 and 8080.

• 1980-1990s, small team.

!48
Specimen #20

Not just the past: 8cc

• 6,740 lines of C, self-hosting, compiles to ~110kb via clang, 220kb via self.

• Don't have to use assembly to get this small! Quite readable and simple. Works.

• Single target, AST but no IR, few diagnostics.

• 2012-2016, mostly one developer.

(code sample on slide: 8cc's emit_binop_int_arith)

!49
Grand Finale

!50
Specimen #21

JonesForth
• 692 instruction VM, 1,490 lines Forth for compiler, REPL, debugger, etc.

• Educational implementation of Forth.

• Forth, like Lisp, is nearly VM code at input (postfix not prefix).

• Minimal partial-compiler turns user "words" into chains of indirect-jumps. Machine-code primitive words.

• Interactive system with quote, eval, control flow, exceptions, debug inspector. Pretty high expressivity!

• 2009, one developer.

(code sample on slide: JonesForth's IF / THEN / ELSE, defined in Forth itself)

!51
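The "chains of indirect-jumps" idea can be sketched in a few lines of Python (a loose analogy, not JonesForth's actual threading model): primitive words are machine-level callables, and a colon definition is just a list of other words that executing the word walks in order.

```python
# Indirect-threaded sketch: all words share one parameter stack.
stack = []

def lit(n):
    """Primitive: push a literal (returns a word closed over n)."""
    return lambda: stack.append(n)

def add():
    """Primitive word: +"""
    stack.append(stack.pop() + stack.pop())

def dup():
    """Primitive word: DUP"""
    stack.append(stack[-1])

def define(*parts):
    """A colon definition: the compiled word is just a chain of other words;
    executing it walks the chain, jumping indirectly through each entry."""
    def word():
        for w in parts:
            w()
    return word

# : DOUBLE DUP + ;
double = define(dup, add)
```

Note how little "compiler" there is: compiling a definition is only recording the chain; all real work lives in the primitive words, which in JonesForth are short runs of machine code.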
Coda

!52
There have been a lot of languages
http://hopl.info catalogues 8,945 programming languages from the 18th century to the present
!53
Go study them: past and present!
Many compilers possible!
Pick a future you like!

!54
The End!

https://en.wikipedia.org/wiki/Dinosaur#/media/File:Ornithopods_jconway.jpg

(I also probably ought to mention that due to using some CC BY-SA pictures, this talk is licensed CC BY-SA 4.0 international)

!55
