CompilerTalk 2019
3 orders of magnitude
in 60 minutes
a wander through a weird landscape to the heart of compilation
Spring 2019
!1
Hello!
• I am someone who has
worked (for pay!) on some
compilers: rustc, swiftc, gcc,
clang, llvm, tracemonkey, etc.
!2
the speaker, in 1979
I like compilers!
!3
Borrowsaur fighting a Thunkasaur
Goal for talk
• I expect the gap between class
projects and industrial
compilers is overwhelming.
!4
Plan of talk
!5
Caveats
• I'm not a teacher or very good at giving talks.
!6
Part 1: some giants
!7
Specimen #1
Clang
• ~2m lines of C++: 800k lines
• C-language family (C, C++,
    LValue CodeGenFunction::EmitLValue(const Expr *E) {
      ApplyDebugLocation DL(*this, E);
      case Expr::ObjCSelectorExprClass:
        return EmitObjCSelectorLValue(cast<ObjCSelectorExpr>(E));
      case Expr::ObjCIsaExprClass:
        return EmitObjCIsaExpr(cast<ObjCIsaExpr>(E));
      case Expr::BinaryOperatorClass:
        return EmitBinaryOperatorLValue(cast<BinaryOperator>(E));
      case Expr::CompoundAssignOperatorClass: {
        QualType Ty = E->getType();
        if (const AtomicType *AT = Ty->getAs<AtomicType>())
          Ty = AT->getValueType();
        if (!Ty->isAnyComplexType())
      case Expr::CallExprClass:
      case Expr::CXXMemberCallExprClass:
      case Expr::CXXOperatorCallExprClass:
      case Expr::UserDefinedLiteralClass:
        return EmitCallExprLValue(cast<CallExpr>(E));
!8
Specimen #2
Swiftc
• ~530k lines of C++ plus 2m
    RValue RValueEmitter::visitIfExpr(IfExpr *E, SGFContext C) {
      auto &lowering = SGF.getTypeLowering(E->getType());
      SILValue trueValue;
      {
        auto TE = E->getThenExpr();
        FullExpr trueScope(SGF.Cleanups, CleanupLocation(TE));
        trueValue = visit(TE).forwardAsSingleValue(SGF, TE);
      }
      cond.enterFalse(SGF);
      SILValue falseValue;
      {
        auto EE = E->getElseExpr();
        FullExpr falseScope(SGF.Cleanups, CleanupLocation(EE));
        falseValue = visit(EE).forwardAsSingleValue(SGF, EE);
      }
!9
Specimen #3
Rustc
• ~360k lines of Rust, plus 1.2m
• Two extra IRs (HIR, MIR).
    fn expr_as_rvalue(
        &mut self,
    match expr.kind {
        ExprKind::Scope {
            region_scope,
            lint_level,
            value,
        } => {
        })
        this.as_rvalue(block, scope, value)
!10
Aside: what is this "LLVM"?
• Notice the last 3 compilers all end in
LLVM. "Low Level Virtual Machine"
https://github.com/llvm/llvm-project
!11
Specimen #4
• 1987-present, large multi-org team.
    rtx in = *p_in;
    int i;
    if (earlyclobber_operand_p (out))
      return n_reloads;
    || targetm.small_register_classes_for_mode_p (VOIDmode))
    && MERGABLE_RELOADS (type, rld[i].when_needed, opnum, rld[i].opnum))
    return i;
!12
Part 2: why so big?
!13
Size and economics
• Compilers get big because the development costs are seen as
justified by the benefits, at least to the people paying the bills.
!14
Tradeoffs and balance
• This is ok!
!15
Part 3: variations
(this part is much longer)
!16
Variation #1
Fewer optimizations
!17
Proebsting's law
• "Compiler Advances Double
Computing Power Every 18 Years"
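A rough worked figure, not from the slides: doubling every 18 years works out to roughly 4% per year. The sketch below does the arithmetic. The 18-month doubling period used for comparison is the commonly quoted form of Moore's law and is an assumption here; `annual_growth` is a hypothetical helper name.

```python
def annual_growth(doubling_period_years):
    """Yearly growth rate implied by a given doubling period."""
    return 2 ** (1.0 / doubling_period_years) - 1


# Proebsting's law: compiler advances double performance every 18 years.
compiler_rate = annual_growth(18.0)

# Assumed comparison point: hardware doubling every 18 months (1.5 years).
hardware_rate = annual_growth(1.5)

print(f"compilers: {compiler_rate:.1%} per year")
print(f"hardware:  {hardware_rate:.1%} per year")
```

Under those assumptions the optimizer contributes a few percent a year while hardware contributes tens of percent, which is the point of the "law".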
!18
Frances Allen
Got All The Good Ones
• 1971: "A Catalogue of
Optimizing Transformations".
!19
Specimen #5
V8
• 660k lines C++ including backends. Not self-hosting.
• Optimizations mix of classical stuff and dynamic language stuff from Smalltalk.
    // Shared routine for word comparison against zero.
    void InstructionSelector::VisitWordCompareZero(Node* user, Node* value,
                                                   FlagsContinuation* cont) {
      // Try to combine with comparisons against 0 by simply inverting the branch.
      while (value->opcode() == IrOpcode::kWord32Equal && CanCover(user, value)) {
        Int32BinopMatcher m(value);
        if (!m.right().Is(0)) break;
        user = value;
        value = m.left().node();
        cont->Negate();
      if (CanCover(user, value)) {
        switch (value->opcode()) {
          case IrOpcode::kWord32Equal:
            cont->OverwriteAndNegateIfEqual(kEqual);
            return VisitWordCompare(this, value, kX64Cmp32, cont);
          case IrOpcode::kInt32LessThan:
            cont->OverwriteAndNegateIfEqual(kSignedLessThan);
!20
Variation #2
Compiler-friendly implementation
(and input) languages
• Note: your textbook has 3 implementation flavours. Java, C,
ML. No coincidence.
!21
Specimen #6
• Pure-functional language, very advanced type-system.
• Several tidy IRs after AST: Core, STG, CMM. Custom backends.
    tbl | M.null tbl -> return nilOL
        | otherwise -> do
            lbl <- mkAsmTempLabel <$> getUniqueM
            return $ unitOL $ UNWIND lbl tbl
    CmmAssign reg src
      | isFloatType ty -> assignReg_FltCode format reg src
      | is32Bit && isWord64 ty -> assignReg_I64Code reg src
      | otherwise -> assignReg_IntCode format reg src
        where ty = cmmRegType dflags reg
              format = cmmTypeFormat ty
      | isFloatType ty -> assignMem_FltCode format addr src
      | is32Bit && isWord64 ty -> assignMem_I64Code addr src
      | otherwise -> assignMem_IntCode format addr src
        where ty = cmmExprType dflags src
              format = cmmTypeFormat ty
!22
Specimen #7
Chez Scheme
• 87k lines Scheme (a Lisp), self-
Scheme.
incremental compilation.
• 1984-now, academic-industrial,
mostly single developer. Getting
down to the size-range where a
compiler is small enough to be
that.
    (define asm-size
      (lambda (x)
        [(word) 2]
        [else 4])))
    (define asm-move
      (lambda (code* dest src)
        (if (and (eqv? n 0) (record-case dest [(reg) r #t] [else #f]))
            (emit xor dest dest code*)
            (emit movi src dest code*))]
        [(literal) stuff (emit movi src dest code*)]
        [else (emit mov src dest code*)]))))
!23
Specimen #8
Poly/ML
• 44k lines SML, self-hosting.
• Single machine target (plus byte-code), AST + IR, classical
single developer.
    | cgOp(PushToStack(RegisterArg reg)) =
        let
            val (rc, rx) = getReg reg
        in
            opCodeBytes(PUSH_R rc, if rx then SOME{w=false, b = true,
                                                  x=false, r = false }
                                   else NONE)
        end
    | cgOp(PushToStack(MemoryArg{base, offset, index})) =
        opAddressPlus2(Group5, LargeInt.fromInt offset, base, index, 0w6)
        then opCodeBytes(PUSH_32, NONE) @ int32Signed constnt
        let
    |
        end
        opb @ [mdrm] @ int32Signed(tag 0)
!24
Specimen #9
CakeML
• 58k lines SML, 5 targets, self-hosting.
• 9 IRs, many simplifying passes.
• 160k lines HOL proofs: verified!
    val WordOp64_on_32_def = Define `
      WordOp64_on_32 (opw:opw) =
        dtcase opw of
        | Andw => list_Seq [Assign 29 (Const 0w);
                            Assign 27 (Const 0w);
                            Assign 33 (Op And [Var 13; Var 23]);
                            Assign 31 (Op And [Var 11; Var 21])]
        | Orw => list_Seq [Assign 29 (Const 0w);
                           Assign 27 (Const 0w);
                           Assign 33 (Op Or [Var 13; Var 23]);
                           Assign 31 (Op Or [Var 11; Var 21])]
        | Xor => list_Seq [Assign 29 (Const 0w);
                           Assign 27 (Const 0w);
                           Assign 33 (Op Xor [Var 13; Var 23]);
                           Assign 31 (Op Xor [Var 11; Var 21])]
        | Add => list_Seq [Assign 29 (Const 0w);
                           Assign 27 (Const 0w);
!25
Variation #3
Meta-languages
• Notice Lisp / ML code looks a bit like grammar productions: recursive
branching tree-shaped type definitions, pattern matching.
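To make the resemblance concrete, here is a minimal sketch in Python, assuming a toy grammar `expr ::= NUM | expr '+' expr | expr '*' expr`. The class names (`Num`, `Add`, `Mul`) and the `emit` function are hypothetical, invented for this illustration. Each production becomes one tree-shaped type, and the recursive function has one case per production; the `isinstance` checks stand in for the pattern matching that Lisp and ML provide directly.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Num:          # expr ::= NUM
    value: int


@dataclass
class Add:          # expr ::= expr '+' expr
    left: "Expr"
    right: "Expr"


@dataclass
class Mul:          # expr ::= expr '*' expr
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add, Mul]


def emit(e: Expr) -> List[str]:
    """Translate a tree into instructions for a toy stack machine."""
    if isinstance(e, Num):
        return [f"push {e.value}"]
    if isinstance(e, Add):
        return emit(e.left) + emit(e.right) + ["add"]
    if isinstance(e, Mul):
        return emit(e.left) + emit(e.right) + ["mul"]
    raise TypeError(f"not an expression: {e!r}")


tree = Add(Num(1), Mul(Num(2), Num(3)))   # 1 + 2 * 3
code = emit(tree)
```

The translator is barely longer than the grammar itself, which is what makes these languages attractive for writing compilers.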
!26
Aside: SRI-ARC
• Stanford Research Institute - Augmentation Research Center. US Air
Force R&D project. Very famous for its NLS ("oNLine System").
• History of that project too big to tell here. Highly influential in forms of
computer-human interaction, hypertext, collaboration, visualization.
!27
Specimen #10
TREE-META
• 184 lines of TREE-META. Bootstrapped from META-II.
(META, META-II)
    .META PROGM
    OUTPT[-,-] => % *1 ':' % '%PUSHJ;' % *2 '%POPJ;' % ;
    => '%BT;DATA(@' ;
    F/ => '%BF;DATA(@' ;
    DOO[-,-] => *1 *2 ;
!28
Specimen #11 (Segue)
Mesa
• 42k lines of Mesa (bootstrapped
from MPL, itself from TREE-META).
leverage interpreters
!30
Origins of "computer"
• 1940s: First digital
computers.
• Before: fixed-function
machines and/or humans
(largely women) doing job
called "computer".
!31
ENIAC: general hardware
• 1945: ENIAC built for US
Army, Ordnance Corps.
Artillery calculations in
WWII.
• "Programmers" drawn
from "computer" staff, all
women.
• "Programming" meant
physically rewiring per-
task.
!32
Stored Programs
• 1948: Jean Bartik leads
team to convert ENIAC to
"stored programs",
instructions (called
"orders") held in memory.
• Interpreted by hardware.
Faster to reconfigure than
rewiring; but ran slower.
• Subroutine concept
developed for factoring
stored programs.
!33
First software pseudo codes:
interpreters on ENIAC, BINAC, UNIVAC
!34
Specimen #12
http://commons.wikimedia.org/wiki/File:Grace_Murray_Hopper,_in_her_office_in_Washington_DC,_1978,_©Lynn_Gilbert.jpg
!35 - CC BY-SA 4.0
Balance between
interpretation and compilation
is context dependent too!
!36
Variation #4
• Easier to port to new hardware, or bootstrap compiler. "Just get something running".
• Simply easier to write, less labor. Focus your time on frontend semantics.
https://xavierleroy.org/talks/zam-kazam05.pdf
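A minimal sketch of why this route is less work, assuming the variation is about emitting code for a simple virtual machine instead of a real CPU. The instruction set (`push`, `add`, `mul`) and the `run` function are hypothetical, made up for illustration; they are not taken from any system named in the talk.

```python
def run(code):
    """Execute a list of (opcode, argument) pairs on a tiny stack machine."""
    stack = []
    for op, arg in code:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


# Byte code for 1 + 2 * 3, as a frontend might emit it.
program = [("push", 1), ("push", 2), ("push", 3), ("mul", None), ("add", None)]
result = run(program)
```

Porting such a system means porting `run`, a few dozen lines, while the frontend stays untouched.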
!37
Specimen #13
Roslyn
• 350k lines C#, 320k lines VB.
• Self-hosting, bootstrapped off previous gen.
integration.
    private void EmitBinaryOperatorInstruction(BoundBinaryOperator expression)
    {
        switch (expression.OperatorKind.Operator())
        {
            case BinaryOperatorKind.Multiplication:
                _builder.EmitOpCode(ILOpCode.Mul);
                break;
            case BinaryOperatorKind.Addition:
                _builder.EmitOpCode(ILOpCode.Add);
                break;
    }
    _builder.EmitOpCode(ILOpCode.Div_un);
    else
    {
        _builder.EmitOpCode(ILOpCode.Div);
    }
    break;
!38
Specimen #14
• In Eclipse! Also in many Java
    public void generateCode(
        BlockScope currentScope,
        CodeStream codeStream,
        boolean valueRequired) {
    !(cst != Constant.NotAConstant && cst.booleanValue() == false);
!39
Variation #5
!40
Specimen #15
Pharo/Cog
• 54k line VM interpreter and 18k line JIT: C
code generated from Smalltalk
metaprograms. Bootstrapped from Squeak.
!41
Specimen #16
Franz Lisp
• 20k line C interpreter, 7,752 line Lisp compiler.
• Older command-line system, standard
the lab. Easy to interpret.
• Frequent Lisp style: interpret by default; compile for "fast mode".
• Compiler bootstraps-from and calls-into interpreter whenever convenient.
• 1978-1988, UC Berkeley.
    ;--- e-move :: move value from one place to another
    ; this corresponds to d-move except the args are EIADRS
    ;
    (defun e-move (from to)
      (if (and (dtpr from)
               (eq '$ (car from))
               (eq 0 (cadr from)))
          then (e-write2 'clrl to)
          else (e-write3 'movl from to)))
    #+for-68k
      (let ((froma (e-cvt from))
            (toa (e-cvt to)))
        (if (and (dtpr froma)
                 (eq '$ (car froma))
                 (and (>& (cadr froma) -1) (<& (cadr froma) 65))
                 (atom toa)
                 (eq 'd (nthchar toa 1)))
            then ;it's a mov #immed,Dn, where 0 <= immed <= 64
                 ; i.e., it's a quick move
                 (e-write3 'moveq froma toa)
            else (cond ((eq 'Nil froma) (e-write3 'movl '#.nil-reg toa))
                       (t (e-write3 'movl froma toa))))))
!42
Variation #6
!43
Futamura Projections
• Famous work relating programs P, interpreters I, partial evaluators E, and
compilers C. The so-called "Futamura Projections":
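The three projections are usually stated as: E(I, P) is a compiled program, E(E, I) is a compiler, and E(E, E) is a compiler generator. The sketch below only shows how the pieces plug together. It assumes a toy interpreter over `("add", k)` and `("mul", k)` steps and uses `functools.partial` as a stand-in for E. That stand-in is a loud simplification: a real partial evaluator would simplify the interpreter's body against the known program, while `partial` merely closes over it, so nothing gets faster here. All names are hypothetical.

```python
from functools import partial


def interpret(program, x):
    """Toy interpreter I: apply each (op, constant) step of program to x."""
    for op, k in program:
        if op == "add":
            x = x + k
        elif op == "mul":
            x = x * k
        else:
            raise ValueError(f"unknown op: {op}")
    return x


def specialize(f, static_arg):
    """Stand-in for a partial evaluator E: fix the first argument of f."""
    return partial(f, static_arg)


program = [("add", 2), ("mul", 10)]          # P: computes (x + 2) * 10

compiled = specialize(interpret, program)    # 1st projection: E(I, P)
compiler = specialize(specialize, interpret) # 2nd projection: E(E, I)
cogen = specialize(specialize, specialize)   # 3rd projection: E(E, E)

# All three routes run the same program on the same input.
results = [
    compiled(5),
    compiler(program)(5),
    cogen(interpret)(program)(5),
]
```

The shapes line up: `compiler` takes a program and returns something runnable, and `cogen` takes an interpreter and returns a compiler.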
!44
Specimen #17
Truffle/Graal
• 240k lines of Java for Graal (VM); 90k lines for Truffle (interpreter-writing framework)
• Actual real system based on first Futamura Projection.
• Seriously competitive! Potential future Oracle JVM.
• Multi-language (JavaScript, Python, Ruby, R, JVM byte code, LLVM bitcode) multi-target (3)
• "Write an interpreter with some
    public Variable emitConditional(LogicNode node, Value trueValue, Value falseValue) {
        if (node instanceof IsNullNode) {
            IsNullNode isNullNode = (IsNullNode) node;
            LIRKind kind =
                gen.getLIRKind(isNullNode.getValue().stamp(NodeView.DEFAULT));
            Value nullValue = gen.emitConstant(kind, isNullNode.nullConstant());
            return gen.emitConditionalMove(kind.getPlatformKind(),
                operand(isNullNode.getValue()),
                nullValue, Condition.EQ, false,
                trueValue, falseValue);
        } else if (node instanceof CompareNode) {
            CompareNode compare = (CompareNode) node;
            PlatformKind kind =
                gen.getLIRKind(compare.getX().stamp(NodeView.DEFAULT))
                    .getPlatformKind();
            return gen.emitConditionalMove(kind, operand(compare.getX()),
                operand(compare.getY()),
                compare.condition().asCondition(),
                compare.unorderedIsTrue(),
                trueValue, falseValue);
        } else if (node instanceof LogicConstantNode) {
            return gen.emitMove(((LogicConstantNode) node).getValue() ?
                trueValue : falseValue);
        } else if (node instanceof IntegerTestNode) {
            IntegerTestNode test = (IntegerTestNode) node;
            return gen.emitIntegerTestMove(operand(test.getX()),
                operand(test.getY()),
                trueValue, falseValue);
        } else {
            throw GraalError.unimplemented(node.toString());
        }
    }
!45
Variation #7
!46
Specimen #18
Turbo Pascal
• 14k instructions including
editor. x86 assembly. 39kb on
disk.
!47
Specimen #19
Manx Aztec C
• 21k instructions, 50kb on disk.
• Contemporary to Turbo
Pascal, one of many
competitors.
!48
Specimen #20
few diagnostics.
developer.
    case '/': case '%': break;
    } else {
      if (node->kind == '%')
        emit("mov #edx, #eax");
    } else if (node->kind == OP_SAL || node->kind == OP_SAR ||
               node->kind == OP_SHR) {
      emit("%s #cl, #%s", op, get_int_reg(node->left->ty, 'a'));
    } else {
!49
Grand Finale
!50
Specimen #21
JonesForth
• 692 instruction VM, 1,490 lines Forth for compiler, REPL, debugger, etc.
• Educational implementation of Forth.
• Forth, like Lisp, is nearly VM code at
    \ IF is an IMMEDIATE word which compiles 0BRANCH followed by a dummy offset, and places
    \ the address of the 0BRANCH on the stack. Later when we see THEN, we pop that address
    \ off the stack, calculate the offset, and back-fill the offset.
    : IF IMMEDIATE
        ' 0BRANCH ,     \ compile 0BRANCH
        HERE @          \ save location of the offset on the stack
        0 ,             \ compile a dummy offset
    ;
    : THEN IMMEDIATE
        DUP
        HERE @ SWAP -   \ calculate the offset from the address saved on the stack
        SWAP !          \ store the offset in the back-filled location
    ;
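The back-filling trick described in the Forth comments can be sketched outside Forth as well. The version below is a minimal illustration in Python, assuming compiled code is a flat list of cells and offsets are counted in cells (JonesForth itself counts bytes). The function name `compile_words` and the sample word names are hypothetical.

```python
def compile_words(tokens):
    """Compile a token stream, resolving IF ... THEN by back-patching."""
    code = []      # compiled cells, in order
    pending = []   # stack of addresses whose offsets are not yet known
    for tok in tokens:
        if tok == "IF":
            code.append("0BRANCH")     # compile the conditional branch
            pending.append(len(code))  # remember where its offset lives
            code.append(0)             # compile a dummy offset
        elif tok == "THEN":
            if not pending:
                raise SyntaxError("THEN without matching IF")
            addr = pending.pop()
            code[addr] = len(code) - addr   # back-fill the real offset
        else:
            code.append(tok)           # ordinary word: compile as-is
    if pending:
        raise SyntaxError("IF without matching THEN")
    return code


compiled = compile_words(["DUP", "IF", "DROP", "ONE", "THEN", "EXIT"])
```

Because pending addresses sit on a stack, nested IF ... THEN pairs resolve in the right order without any extra bookkeeping.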
!51
Coda
!52
There have been a lot of languages
http://hopl.info catalogues 8,945 programming
languages from the 18th century to the present
!53
Go study them: past and present!
Many compilers possible!
Pick a future you like!
!54
The End!
https://en.wikipedia.org/wiki/Dinosaur#/media/File:Ornithopods_jconway.jpg
(I also probably ought to mention that due to using some CC BY-SA pictures,
!55