PLDI Week 04: LLVM
Ilya Sergey
[email protected]
ilyasergey.net/CS4212/
Last Week: Directly Translating AST to Assembly
• Key Challenges:
– storing intermediate values needed to compute complex expressions
– some instructions use specific registers (e.g. shift)
One Simple Strategy
• Invariants:
– Compilation of an expression yields its result in %rax
– Argument (Xi) is stored in a dedicated operand register
– Intermediate values are pushed onto the stack
– Stack slot is popped after use (so the space is reclaimed)
• Resulting code is wrapped (e.g., with retq) to comply with cdecl calling conventions
• Alternative strategy: using a stack-machine language as an IR; see compile2 in compile.ml
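• For example, under these invariants e1 + e2 compiles to the following shape (a sketch; %rcx is an arbitrary choice of scratch register):
<code for e1>       # result in %rax
pushq %rax          # push the intermediate value onto the stack
<code for e2>       # result in %rax
popq %rcx           # pop e1's value; the stack slot is reclaimed
addq %rcx, %rax     # e1 + e2 now in %rax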
Intermediate Representations
Why do something else?
• We have seen a simple syntax-directed translation
– Input syntax uniquely determines the output; no complex analysis or code transformation is done.
– It works fine for simple languages.
But…
• The resulting code quality is poor.
• Richer source language features are hard to encode
– Structured data types, objects, first-class functions, etc.
• It’s hard to optimize the resulting assembly code.
– The representation is too concrete – e.g. it has committed to using certain registers and the stack
– Only a fixed number of registers
– Some instructions have restrictions on where the operands are located
• Control-flow is not structured:
– Arbitrary jumps from one code block to another
– Implicit fall-through makes sequences of code non-modular
(i.e. you can’t rearrange sequences of code easily)
• Retargeting the compiler to a new architecture is hard.
– Target assembly code is hard-wired into the translation
Intermediate Representations (IR’s)
• Abstract machine code: hides details of the target architecture
• AST → IR → target code (x86, Arm, Java Bytecode, …)
• Optimization happens at the IR level, independently of source and target
Multiple IR’s
• Goal: get the program closer to machine code without losing the information needed to do analysis and optimizations
• AST → HIR → MIR → Java Bytecode, with optimizations applied at each IR level
What makes a good IR?
• Easy translation target (from the level above)
• Easy to translate (to the level below)
• Narrow interface
– Fewer constructs means simpler phases/optimizations
• Example: Source language might have “while”, “for”, and “foreach” loops
(and maybe more variants)
– IR might have only “while” loops and sequencing
– Translation eliminates “for” and “foreach”
– Here the notation ⟦cmd⟧ denotes the “translation” or “compilation” of the command cmd.
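• For example, a sketch of the elimination of “for” in that notation:
⟦for (init; cond; step) body⟧ = ⟦init⟧ ; while (⟦cond⟧) { ⟦body⟧ ; ⟦step⟧ }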
IR’s at the extreme
• High-level IR’s
– Abstract syntax + new node types not generated by the parser
• e.g. Type checking information or disambiguated syntax nodes
– Typically preserves the high-level language constructs
• Structured control flow, variable names, methods, functions, etc.
• May do some simplification (e.g. convert for to while)
– Allows high-level optimizations based on program structure
• e.g. inlining “small” functions, reuse of constants, etc.
– Useful for semantic analyses like type checking
GHC Compilation Pipeline
A number of Intermediate Languages
• Haskell Source
• Core
• C--
data AltCon
= DataAlt DataCon
| LitAlt Literal
| DEFAULT
How to Get Core
• Try with:
module Mysum where
• More at:
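• One standard way to inspect Core (a GHC flag noted here for convenience, not from the original slides): compile with ghc -O2 -ddump-simpl Mysum.hs, which prints the Core produced by the simplifier.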
IR’s at the extreme
• High-level IR’s: close to the source language, as described above
• Low-level IR’s
– Machine dependent assembly code + extra pseudo-instructions
• e.g. a pseudo-instruction for interfacing with the garbage collector or memory allocator (parts of the language runtime system)
• e.g. (on x86) an imulq instruction that doesn’t restrict register usage
– Source structure of the program is lost:
• Translation to assembly code is straightforward
– Allows low-level optimizations based on target architecture
• e.g. register allocation, instruction selection, memory layout, etc.
• What’s in between?
Mid-level IR’s: Many Varieties
• Many examples:
– Triples: OP a b
• Useful for instruction selection on x86 via “graph tiling” (a way to better utilise registers)
– Quadruples: a = b OP c (RISC-like “three address form”)
– SSA: variant of quadruples where each variable is assigned exactly once
• Easy dataflow analysis for optimization
• e.g. LLVM: industrial-strength IR, based on SSA
– Stack-based:
• Easy to generate
• e.g. Java Bytecode, UCODE
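• For instance, a = (b + c) * d in SSA-style quadruples might look like (a sketch):
%t1 = add %b, %c
%a = mul %t1, %d
Each name is assigned exactly once, so every use can be traced to a unique definition.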
Growing an IR
• Idea: name intermediate values, make order of evaluation explicit.
– No nested operations.
Translation to SLL
• Given this:
Add(Add(Const 1, Var X4), Add(Const 3, Mul(Var X1, Const 5)))
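• Naming each intermediate value in order of evaluation yields (a sketch of the result):
%t1 = add 1, %x4
%t2 = mul %x1, 5
%t3 = add 3, %t2
%t4 = add %t1, %t3      ; the value of the whole expression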
• IR1: Expressions
– simple arithmetic expressions, immutable global variables
• IR2: Commands
– global mutable variables
– commands for update and sequencing
• https://github.com/cs4212/week-03-intermediate-2023
• Definitions: ir3.ml
LLVM
Low-Level Virtual Machine (LLVM)
• Frontends (like 'clang') translate source programs into a typed, SSA-like IR
• Optimisations, transformations, and analyses operate on that IR
• Backends (llc) do code generation, either ahead-of-time or via a JIT
Example LLVM Code
• LLVM offers a textual representation of its IR
– files ending in .ll
• factorial-pretty.ll (a simplified rendering of the IR for factorial64.c):
define @factorial(%n) {
  %1 = alloca
  %acc = alloca
  store %n, %1
  store 1, %acc
  br label %start
start:
  %3 = load %1
  %4 = icmp sgt %3, 0
  br %4, label %then, label %else
then:
  %6 = load %acc
  %7 = load %1
  %8 = mul %6, %7
  store %8, %acc
  %9 = load %1
  %10 = sub %9, 1
  store %10, %1
  br label %start
else:
  %12 = load %acc
  ret %12
}
Real LLVM
factorial.ll
• Decorates values with type information (e.g. i64, i64*)
• Has alignment annotations (padding for some specified number of bytes)
• Keeps track of the entry edges for each block, e.g.: preds = %5, %0

; Function Attrs: nounwind ssp
define i64 @factorial(i64 %n) #0 {
  %1 = alloca i64, align 8
  %acc = alloca i64, align 8
  store i64 %n, i64* %1, align 8
  store i64 1, i64* %acc, align 8
  br label %2

; <label>:2                      ; preds = %5, %0
  %3 = load i64* %1, align 8
  %4 = icmp sgt i64 %3, 0
  br i1 %4, label %5, label %11

; <label>:5                      ; preds = %2
  %6 = load i64* %acc, align 8
  %7 = load i64* %1, align 8
  %8 = mul nsw i64 %6, %7
  store i64 %8, i64* %acc, align 8
  %9 = load i64* %1, align 8
  %10 = sub nsw i64 %9, 1
  store i64 %10, i64* %1, align 8
  br label %2

; <label>:11                     ; preds = %2
  %12 = load i64* %acc, align 8
  ret i64 %12
}
Example Control-flow Graph
define @factorial(%n) {
entry:
  %1 = alloca
  %acc = alloca
  store %n, %1
  store 1, %acc
  br label %start
start:
  %3 = load %1
  %4 = icmp sgt %3, 0
  br %4, label %then, label %else
then:
  %6 = load %acc
  %7 = load %1
  %8 = mul %6, %7
  store %8, %acc
  %9 = load %1
  %10 = sub %9, 1
  store %10, %1
  br label %start
else:
  %12 = load %acc
  ret %12
}
LL Basic Blocks and Control-Flow Graphs
type block = {
  insns : (uid * insn) list;   (* the instructions of the block, in order *)
  term  : (uid * terminator)   (* every block ends in exactly one terminator *)
}
• A control flow graph is represented as a list of labeled basic blocks with these invariants:
– No two blocks have the same label
– All terminators mention only labels that are defined among the set of basic blocks
– There is a distinguished, unlabelled, entry block.
• Local variables:
– Defined by the instructions of the form %uid = …
– Must satisfy the static single assignment (SSA) invariant
• Each %uid appears on the left-hand side of an assignment only once in the entire control flow graph.
– The value of a %uid remains unchanged throughout its lifetime
– Analogous to “let %uid = e in …” in OCaml
• Full SSA allows richer use of local variables by taking control flow into account:
– phi functions (https://en.wikipedia.org/wiki/Static_single-assignment_form)
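• For example, a minimal sketch (names invented for illustration): a block reached from %then and %else can merge two incoming values with a phi:
merge:
  %x = phi i64 [ %a, %then ], [ %b, %else ]   ; %x is %a if control came from %then, %b if from %else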
LL Storage Model: alloca
• The alloca instruction allocates stack space and returns a reference to it.
– The returned reference is stored in a local:
%ptr = alloca typ
– The amount of space allocated is determined by the type
• The contents of the slot are accessed via the load and store instructions:
[Diagram: a pointer p referring to a record with fields x and y]
• Compiler needs to know the size of the struct at compile time to allocate the needed storage space.
• Compiler needs to know the shape of the struct at compile time to index into the structure.
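• For example, a sketch that allocates a slot, writes it, and reads it back (in the typed style of factorial.ll):
%ptr = alloca i64          ; one 8-byte slot; the size comes from the type
store i64 42, i64* %ptr    ; write into the slot
%v = load i64* %ptr        ; read it back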
Assembly-level Member Access
[Layout diagrams: square occupies consecutive slots ll.x, ll.y, lr.x, lr.y, ul.x, ul.y, ur.x, ur.y; a struct with fields x, a, b, y is shown in several layouts, with padding inserted to satisfy alignment.]
Copy-in/Copy-out
When we do a struct assignment in C, as in (a representative example) r1 = r2;, then we copy all of the elements out of the source and put them in the target, the same as doing a sequence of word-level operations.
• For really large copies, the compiler uses something like memcpy
(which is implemented using a loop in assembly).
C Procedure Calls
• Similarly, when we call a procedure, we copy arguments in, and copy results out.
– Caller sets aside extra space in its frame to store results that are bigger than will fit in %rax.
– We do the same with scalar values such as integers or doubles.
• Benefit: locality
• Problem: expensive for large records…
• Languages like Java and OCaml always pass non-word-sized objects by reference.
Call-by-Reference
void mkSquare(struct Point *ll, int elen,
              struct Rect *res) {
  res->lr = res->ul = res->ur = res->ll = *ll;
  res->lr.x += elen;
  res->ur.x += elen;
  res->ur.y += elen;
  res->ul.y += elen;
}

void foo() {
  struct Point origin = {0, 0};
  struct Rect unit_sq;   /* a Rect, matching mkSquare's result parameter */
  mkSquare(&origin, 1, &unit_sq);
}
• The caller passes in the address of the point and the address of the result (1 word each).
• Note that returning references to stack-allocated data can cause problems.
– This space might be reclaimed when foo() is done
– Need to allocate storage in the heap…
Arrays
void foo() {
  char buf[27];
  ...
}
• [Diagram: arr points to an array whose length is stored with its elements: Size=7, A[0], A[1], A[2], A[3], A[4], A[5], A[6]]
• Other possibilities:
– Pascal: only permit statically known array sizes (very unwieldy in practice)
– What about multi-dimensional arrays?
Array Bounds Checks (Implementation)
• Example: Assume %rax holds the base pointer (arr) and %ecx holds the array index i.
To read a value from the array arr[i]:
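• A sketch of the emitted check, written in the LLVM-like IR used in these slides (assuming the layout above, where the length is stored at the start of the array; the x86 version using %rax and %ecx follows the same pattern):
%len = load i64* %arr              ; length stored at the start of the array
%ok = icmp ult i64 %i, %len        ; unsigned < also rejects negative indices
br i1 %ok, label %inbounds, label %bounds_error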
• C-style strings: char *p = "foo"; p[0] = 'b'; (writing through p mutates a string literal, which is undefined behaviour in C)
• In C: enum Day {sun, mon, tue, wed, thu, fri, sat} today;
• In OCaml: type day = Sun | Mon | Tue | Wed | Thu | Fri | Sat
• OCaml datatypes can also carry data: type foo = Bar of int | Baz of int * foo
switch (e) {
case sun: s1; break;
case mon: s2; break;
…
case sat: s3; break;
}
• Each $tag1…$tagN is just a constant int tag value.
• Note: ⟦break;⟧ (within the switch branches) is: br %merge
l1: %cmp1 = icmp eq %tag, $tag1
    br %cmp1 label %b1, label %l2
b1: ⟦s1⟧
    br label %l2
l2: %cmp2 = icmp eq %tag, $tag2
    br %cmp2 label %b2, label %l3
b2: ⟦s2⟧
    br label %l3
…
lN: %cmpN = icmp eq %tag, $tagN
    br %cmpN label %bN, label %merge
bN: ⟦sN⟧
    br label %merge
merge:
Alternatives for Switch Compilation
• Nested if-then-else works OK in practice if # of branches is small
– (e.g. < 16 or so).
• For more branches, use better data structures to organise the jumps:
– Create a table of pairs (v1, branch_label) and loop through
– Or, do binary search rather than linear search
– Or, use a hash table rather than binary search
• Compilation strategy:
– “Flatten” nested patterns into matches against one constructor at a time.
– Compile the match against the tags of the datatype as for C-style switches.
– Code for each branch additionally must copy data from ⟦e⟧ to the variables bound in the patterns.
match e with
| Bar(z) -> e1
| Baz(y, tmp) ->
  (match tmp with
   | Bar(w) -> e2
   | Baz(_, _) -> e3)
• There are many opportunities for optimisations, many papers about “pattern-match compilation”
– Many of these transformations can be done at the AST level
Datatypes in LLVM IR
Structured Data in LLVM
• LLVM’s IR uses types to describe the structure of data.
t ::=
void
i1 | i8 | i64 N-bit integers
[<#elts> x t] arrays
fty function types
{t1, t2, … , tn} structures
t* pointers
%Tident named (identified) type
insn ::= …
| getelementptr t* %val, t1 idx1, t2 idx2, …
• The first index, i32 0, “steps through” the pointer to, e.g., %square, with offset 0.
• To index into a deeply nested structure, one has to “follow the pointer” by
loading from the computed pointer
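• For example (hypothetical type definitions, for illustration only):
%Point = type { i64, i64 }
%Rect = type { %Point, %Point }
%p = getelementptr %Rect* %r, i32 0, i32 1, i32 0   ; i64*: field 0 of the second %Point
%x = load i64* %p                                   ; read that field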
Compiling Data Structures via LLVM
define @foo() {
%1 = alloca %rect3 ; allocate a three-field record
%2 = bitcast %rect3* %1 to %rect2* ; safe cast
%3 = getelementptr %rect2* %2, i32 0, i32 1 ; allowed
…
}
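• The bitcast above is safe when %rect2’s fields are a prefix of %rect3’s, e.g. (hypothetical definitions, not from the original slides):
%rect2 = type { i64, i64 }
%rect3 = type { i64, i64, i64 }
• Under that assumption, the first two fields of a %rect3 have exactly the layout a %rect2 expects, so the getelementptr at index 1 is allowed.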
Demo: Compiling to LLVM
• Clone https://github.com/cs4212/week-04-llvm-demo
• LLVMLite Specification
• Overview of HW3
• Lexical Analysis