# 'llvm' Dialect

This dialect maps [LLVM IR](https://llvm.org/docs/LangRef.html) into MLIR by
defining the corresponding operations and types. LLVM IR metadata is usually
represented as MLIR attributes, which offer additional structure verification.

We use "LLVM IR" to designate the
[intermediate representation of LLVM](https://llvm.org/docs/LangRef.html) and
"LLVM _dialect_" or "LLVM IR _dialect_" to refer to this MLIR dialect.

Unless explicitly stated otherwise, the semantics of the LLVM dialect operations
must correspond to the semantics of LLVM IR instructions and any divergence is
considered a bug. The dialect also contains auxiliary operations that smooth
over the differences in the IR structure, e.g., MLIR does not have `phi`
operations and LLVM IR does not have a `constant` operation. These auxiliary
operations are systematically prefixed with `mlir`, e.g. `llvm.mlir.constant`
where `llvm.` is the dialect namespace prefix.

[TOC]

## Dependency on LLVM IR

The LLVM dialect is not expected to depend on any object that requires an
`LLVMContext`, such as an LLVM IR instruction or type. Instead, MLIR provides
thread-safe alternatives compatible with the rest of the infrastructure. The
dialect is allowed to depend on the LLVM IR objects that don't require a
context, such as data layout and triple description.

## Module Structure

IR modules use the built-in MLIR `ModuleOp` and support all its features. In
particular, modules can be named, nested and are subject to symbol visibility.
Modules can contain any operations, including LLVM functions and globals.
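
As a minimal sketch of the features listed above, the following shows a named
module nested inside another named module, each holding LLVM dialect symbols.
The symbol names (`@outer`, `@inner`, `@f`, `@g`) are illustrative only.

```mlir
module @outer {
  // A nested, named module containing an LLVM function.
  module @inner {
    llvm.func @f() {
      llvm.return
    }
  }
  // An LLVM global living directly in the outer module.
  llvm.mlir.global internal @g(0 : i32) : i32
}
```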

### Data Layout and Triple

An IR module may have an optional data layout and triple information attached
using MLIR attributes `llvm.data_layout` and `llvm.target_triple`, respectively.
Both are string attributes with the
[same syntax](https://llvm.org/docs/LangRef.html#data-layout) as in LLVM IR and
are verified to be correct. They can be defined as follows.

```mlir
module attributes {llvm.data_layout = "e",
                   llvm.target_triple = "aarch64-linux-android"} {
  // module contents
}
```

### Functions

LLVM functions are represented by a special operation, `llvm.func`, that has
syntax similar to that of the built-in function operation but supports
LLVM-related features such as linkage and variadic argument lists. See the
detailed description in the operation list [below](#llvmfunc-mlirllvmllvmfuncop).
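
For illustration, a sketch of a variadic declaration and of a definition with
an explicit linkage keyword might look as follows. It uses the typed-pointer
syntax found elsewhere in this document; `@add_one` is a hypothetical name.

```mlir
// Declaration of a variadic function; `...` marks the variadic argument list.
llvm.func @printf(!llvm.ptr<i8>, ...) -> i32

// Definition with `internal` linkage, one argument and one result.
llvm.func internal @add_one(%arg0: i32) -> i32 {
  %0 = llvm.mlir.constant(1 : i32) : i32
  %1 = llvm.add %arg0, %0 : i32
  llvm.return %1 : i32
}
```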

### PHI Nodes and Block Arguments

MLIR uses block arguments instead of PHI nodes to communicate values between
blocks. Therefore, the LLVM dialect has no operation directly equivalent to
`phi` in LLVM IR. Instead, all terminators can pass values as successor
operands; these values are forwarded as block arguments when the control flow
is transferred.

For example:

```mlir
^bb1:
  %0 = llvm.add %arg0, %cst : i32
  llvm.br ^bb2(%0 : i32)

// If the control flow comes from ^bb1, %arg1 == %0.
^bb2(%arg1: i32):
  // ...
```

is equivalent to LLVM IR

```llvm
bb1:
  %0 = add i32 %arg0, %cst
  br label %bb2

bb2:
  %arg1 = phi i32 [ %0, %bb1 ] ; plus one incoming pair per other predecessor
```

Since there is no need to use the block identifier to differentiate the source
of different values, the LLVM dialect supports terminators that transfer the
control flow to the same block with different arguments. For example:

```mlir
^bb1:
  llvm.cond_br %cond, ^bb2(%0 : i32), ^bb2(%1 : i32)

^bb2(%arg0: i32):
  // ...
```

### Context-Level Values

Some value kinds in LLVM IR, such as constants and undefs, are uniqued in
context and used directly in relevant operations. MLIR does not support such
values for thread-safety and concept parsimony reasons. Instead, regular values
are produced by dedicated operations that have the corresponding semantics:
[`llvm.mlir.constant`](#llvmmlirconstant-mlirllvmconstantop),
[`llvm.mlir.undef`](#llvmmlirundef-mlirllvmundefop),
[`llvm.mlir.null`](#llvmmlirnull-mlirllvmnullop). Note how these operations are
prefixed with `mlir.` to indicate that they don't belong to LLVM IR but are only
necessary to model it in MLIR. The values produced by these operations are
usable just like any other value.

Examples:

```mlir
// Create an undefined value of structure type with a 32-bit integer followed
// by a float.
%0 = llvm.mlir.undef : !llvm.struct<(i32, f32)>

// Null pointer to i8.
%1 = llvm.mlir.null : !llvm.ptr<i8>

// Null pointer to a function with signature void().
%2 = llvm.mlir.null : !llvm.ptr<func<void ()>>

// Constant 42 as i32.
%3 = llvm.mlir.constant(42 : i32) : i32

// Splat dense vector constant.
%4 = llvm.mlir.constant(dense<1.0> : vector<4xf32>) : vector<4xf32>
```

Note that constants list the type twice. This is an artifact of the LLVM dialect
not using built-in types, which are used for typed MLIR attributes. The syntax
will be reevaluated after considering composite constants.

### Globals

Global variables are also defined using a special operation,
[`llvm.mlir.global`](#llvmmlirglobal-mlirllvmglobalop), located at the module
level. Globals are MLIR symbols and are identified by their name.

Since functions need to be isolated-from-above, i.e. values defined outside the
function cannot be directly used inside the function, an additional operation,
[`llvm.mlir.addressof`](#llvmmliraddressof-mlirllvmaddressofop), is provided to
locally define a value containing the _address_ of a global. The actual value
can then be loaded from that pointer, or a new value can be stored into it if
the global is not declared constant. This is similar to LLVM IR where globals
are accessed through name and have a pointer type.
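
The access pattern described above can be sketched as follows, assuming the
typed-pointer syntax used in this document. The names `@counter` and
`@increment` are hypothetical.

```mlir
// A mutable (non-constant) global of type i32, initialized to 0.
llvm.mlir.global internal @counter(0 : i32) : i32

llvm.func @increment() -> i32 {
  // Materialize the address of the global as a local value.
  %0 = llvm.mlir.addressof @counter : !llvm.ptr<i32>
  // Load the current value through the pointer.
  %1 = llvm.load %0 : !llvm.ptr<i32>
  %2 = llvm.mlir.constant(1 : i32) : i32
  %3 = llvm.add %1, %2 : i32
  // Store the updated value back; allowed because the global is not constant.
  llvm.store %3, %0 : !llvm.ptr<i32>
  llvm.return %3 : i32
}
```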

### Linkage

Module-level named objects in the LLVM dialect, namely functions and globals,
have an optional _linkage_ attribute derived from LLVM IR
[linkage types](https://llvm.org/docs/LangRef.html#linkage-types). Linkage is
specified by the same keyword as in LLVM IR and is located between the operation
name (`llvm.func` or `llvm.mlir.global`) and the symbol name. If no linkage
keyword is present, `external` linkage is assumed by default. Linkage is
_distinct_ from MLIR symbol visibility.
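
A short sketch of where the linkage keyword goes; the symbol names are
illustrative only.

```mlir
// `internal` linkage: keyword placed between the operation name and the symbol.
llvm.func internal @helper() {
  llvm.return
}

// No keyword: `external` linkage is assumed.
llvm.func @entry() {
  llvm.call @helper() : () -> ()
  llvm.return
}

// `private` linkage on a constant global.
llvm.mlir.global private constant @answer(42 : i32) : i32
```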

### Attribute Pass-Through

The LLVM dialect provides a mechanism to forward function-level attributes to
LLVM IR using the `passthrough` attribute. This is an array attribute containing
either string attributes or array attributes. In the former case, the value of
the string is interpreted as the name of an LLVM IR function attribute. In the
latter case, the array is expected to contain exactly two string attributes, the
first corresponding to the name of an LLVM IR function attribute, and the second
corresponding to its value. Note that even integer LLVM IR function attributes
have their value represented in the string form.

Example:

```mlir
llvm.func @func() attributes {
  passthrough = ["noinline",           // value-less attribute
                 ["alignstack", "4"],  // integer attribute with value
                 ["other", "attr"]]    // attribute unknown to LLVM
} {
  llvm.return
}
```

If the attribute is not known to LLVM IR, it will be attached as a string
attribute.

## Types

The LLVM dialect uses built-in types whenever possible and defines a set of
complementary types, which correspond to the LLVM IR types that cannot be
directly represented with built-in types. Similarly to other MLIR context-owned
objects, the creation and manipulation of LLVM dialect types is thread-safe.

MLIR does not support module-scoped named type declarations, e.g. `%s = type
{i32, i32}` in LLVM IR. Instead, types must be fully specified at each use,
except for recursive types where only the first reference to a named type needs
to be fully specified. MLIR [type aliases](../LangRef.md/#type-aliases) can be used
to achieve more compact syntax.

The general syntax of LLVM dialect types is `!llvm.`, followed by a type kind
identifier (e.g., `ptr` for pointer or `struct` for structure) and by an
optional list of type parameters in angle brackets. The dialect follows MLIR
style for types with nested angle brackets and keyword specifiers rather than
using different bracket styles to differentiate types. Types inside the angle
brackets may omit the `!llvm.` prefix for brevity: the parser first attempts to
find a type (starting with `!` or a built-in type) and falls back to accepting a
keyword. For example, `!llvm.ptr<!llvm.ptr<i32>>` and `!llvm.ptr<ptr<i32>>` are
equivalent, with the latter being the canonical form, and denote a pointer to a
pointer to a 32-bit integer.

### Built-in Type Compatibility

The LLVM dialect accepts a subset of built-in types that are referred to as
_LLVM dialect-compatible types_. The following types are compatible:

- Signless integers - `iN` (`IntegerType`).
- Floating point types - `bf16`, `f16`, `f32`, `f64`, `f80`, `f128`
  (`FloatType`).
- 1D vectors of signless integers or floating point types - `vector<NxT>`
  (`VectorType`).

Note that only a subset of types that can be represented by a given class is
compatible. For example, signed and unsigned integers are not compatible. The
LLVM dialect provides a function, `bool LLVM::isCompatibleType(Type)`, that can
be used as a compatibility check.

Each LLVM IR type corresponds to *exactly one* MLIR type, either built-in or
LLVM dialect type. For example, because `i32` is LLVM-compatible, there is no
`!llvm.i32` type. However, `!llvm.ptr<T>` is defined in the LLVM dialect as
there is no corresponding built-in type.
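
To make the distinction concrete, the following sketch uses compatible built-in
types directly in LLVM dialect operations. The last line is commented out
because, under the rules above, a signed integer type is expected to be
rejected.

```mlir
// Signless integer: compatible built-in type.
%0 = llvm.mlir.constant(1 : i32) : i32
// 1D vector of a floating point type: compatible built-in type.
%1 = llvm.mlir.undef : vector<4xf32>
// Pointer: LLVM dialect type, since no built-in type corresponds to it.
%2 = llvm.mlir.null : !llvm.ptr<i32>
// Signed integer `si32`: NOT compatible, so this would not verify.
// %3 = llvm.mlir.undef : si32
```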

### Additional Simple Types

The following non-parametric types derived from the LLVM IR are available in the
LLVM dialect:

- `!llvm.x86_mmx` (`LLVMX86MMXType`) - value held in an MMX register on x86
  machine.
- `!llvm.ppc_fp128` (`LLVMPPCFP128Type`) - 128-bit floating-point value (two
  64-bit values).
- `!llvm.token` (`LLVMTokenType`) - a non-inspectable value associated with an
  operation.
- `!llvm.metadata` (`LLVMMetadataType`) - LLVM IR metadata, to be used only if
  the metadata cannot be represented as structured MLIR attributes.
- `!llvm.void` (`LLVMVoidType`) - does not represent any value; can only
  appear in function results.

These types represent a single value (or an absence thereof in case of `void`)
and correspond to their LLVM IR counterparts.

### Additional Parametric Types

These types are parameterized by the types they contain, e.g., the pointee or
the element type, which can be either compatible built-in or LLVM dialect types.

#### Pointer Types

Pointer types specify an address in memory.

Pointer types are parametric types parameterized by the element type and the
address space. The address space is an integer, but this choice may be
reconsidered if MLIR implements named address spaces. Their syntax is as
follows:

```
  llvm-ptr-type ::= `!llvm.ptr<` type (`,` integer-literal)? `>`
```

where the optional integer literal corresponds to the address space. Both cases
are represented by `LLVMPointerType` internally.
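
A few instances of this grammar, for illustration:

```mlir
!llvm.ptr<i32>        // Pointer to a 32-bit integer, default address space.
!llvm.ptr<i32, 3>     // Pointer to a 32-bit integer in address space 3.
!llvm.ptr<ptr<f32>>   // Pointer to a pointer to a 32-bit float.
```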

#### Array Types

Array types represent sequences of elements in memory. Array elements can be
addressed with a value unknown at compile time, and can be nested. Only 1D
arrays are allowed though.

Array types are parameterized by the fixed size and the element type.
Syntactically, their representation is the following:

```
  llvm-array-type ::= `!llvm.array<` integer-literal `x` type `>`
```

and they are internally represented as `LLVMArrayType`.
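
A few instances of this grammar, for illustration. Multi-dimensional data is
expressed by nesting 1D arrays:

```mlir
!llvm.array<4 x i32>              // Array of 4 32-bit integers.
!llvm.array<2 x array<3 x f32>>   // Array of 2 arrays of 3 32-bit floats.
!llvm.array<8 x ptr<i8>>          // Array of 8 pointers to 8-bit integers.
```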

#### Function Types

Function types represent the type of a function, i.e. its signature.

Function types are parameterized by the result type, the list of argument types
and by an optional "variadic" flag. Unlike built-in `FunctionType`, LLVM dialect
functions (`LLVMFunctionType`) always have a single result, which may be
`!llvm.void` if the function does not return anything. The syntax is as follows:

```
  llvm-func-type ::= `!llvm.func<` type `(` type-list (`,` `...`)? `)` `>`
```

For example,

```mlir
!llvm.func<void ()>           // a function with no arguments;
!llvm.func<i32 (f32, i32)>    // a function with two arguments and a result;
!llvm.func<void (i32, ...)>   // a variadic function with at least one argument.
```

In the LLVM dialect, functions are not first-class objects and one cannot have a
value of function type. Instead, one can take the address of a function and
operate on pointers to functions.
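
As a sketch of the last point, a function address can be obtained with
`llvm.mlir.addressof` and then called indirectly through the pointer. The
example assumes the typed-pointer syntax used in this document; `@callee` and
`@caller` are hypothetical names.

```mlir
llvm.func @callee()

llvm.func @caller() {
  // Take the address of the function; the result is a pointer to a function.
  %0 = llvm.mlir.addressof @callee : !llvm.ptr<func<void ()>>
  // Indirect call through the function pointer.
  llvm.call %0() : () -> ()
  llvm.return
}
```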

### Vector Types

Vector types represent sequences of elements, typically when multiple data
elements are processed by a single instruction (SIMD). Vectors are thought of as
stored in registers and therefore vector elements can only be addressed through
constant indices.

Vector types are parameterized by the size, which may be either _fixed_ or a
multiple of some fixed size in case of _scalable_ vectors, and the element type.
Vectors cannot be nested and only 1D vectors are supported. Scalable vectors are
still considered 1D.

The LLVM dialect uses built-in vector types for _fixed_-size vectors of built-in
types, and provides additional types for fixed-size vectors of LLVM dialect
types (`LLVMFixedVectorType`) and scalable vectors of any types
(`LLVMScalableVectorType`). These two additional types share the following
syntax:

```
  llvm-vec-type ::= `!llvm.vec<` (`?` `x`)? integer-literal `x` type `>`
```

Note that the sets of element types supported by built-in and LLVM dialect
vector types are mutually exclusive, e.g., the built-in vector type does not
accept `!llvm.ptr<i32>` and the LLVM dialect fixed-width vector type does not
accept `i32`.

The following functions are provided to operate on any kind of the vector types
compatible with the LLVM dialect:

- `bool LLVM::isCompatibleVectorType(Type)` - checks whether a type is a
  vector type compatible with the LLVM dialect;
- `Type LLVM::getVectorElementType(Type)` - returns the element type of any
  vector type compatible with the LLVM dialect;
- `llvm::ElementCount LLVM::getVectorNumElements(Type)` - returns the number
  of elements in any vector type compatible with the LLVM dialect;
- `Type LLVM::getFixedVectorType(Type, unsigned)` - gets a fixed vector type
  with the given element type and size; the resulting type is either a
  built-in or an LLVM dialect vector type depending on which one supports the
  given element type.

#### Examples of Compatible Vector Types

```mlir
vector<42 x i32>                   // Vector of 42 32-bit integers.
!llvm.vec<42 x ptr<i32>>           // Vector of 42 pointers to 32-bit integers.
!llvm.vec<? x 4 x i32>             // Scalable vector of 32-bit integers with
                                   // size divisible by 4.
!llvm.array<2 x vector<2 x i32>>   // Array of 2 vectors of 2 32-bit integers.
!llvm.array<2 x vec<2 x ptr<i32>>> // Array of 2 vectors of 2 pointers to 32-bit
                                   // integers.
```

### Structure Types

The structure type is used to represent a collection of data members together in
memory. The elements of a structure may be any type that has a size.

Structure types are represented in a single dedicated class
`mlir::LLVM::LLVMStructType`. Internally, the struct type stores a (potentially
empty) name, a (potentially empty) list of contained types and a bitmask
indicating whether the struct is named, opaque, packed or uninitialized.
Structure types that don't have a name are referred to as _literal_ structs.
Such structures are uniquely identified by their contents. _Identified_ structs
on the other hand are uniquely identified by the name.

#### Identified Structure Types

Identified structure types are uniqued using their name in a given context.
Attempting to construct an identified structure with the same name as a
structure that already exists in the context *will result in the existing
structure being returned*. **MLIR does not auto-rename identified structs in
case of name conflicts** because there is no naming scope equivalent to a module
in LLVM IR since MLIR modules can be arbitrarily nested.

Programmatically, identified structures can be constructed in an _uninitialized_
state. In this case, they are given a name but the body must be set up by a
later call, using MLIR's type mutation mechanism. Such uninitialized types can
be used in type construction, but must be eventually initialized for IR to be
valid. This mechanism allows for constructing _recursive_ or mutually referring
structure types: an uninitialized type can be used in its own initialization.

Once the type is initialized, its body cannot be changed anymore. Any further
attempts to modify the body will fail and return failure to the caller _unless
the type is initialized with the exact same body_. Type initialization is
thread-safe; however, if a concurrent thread initializes the type before the
current thread, the initialization may return failure.

The syntax for identified structure types is as follows.

```
llvm-ident-struct-type ::= `!llvm.struct<` string-literal `,` `opaque` `>`
                         | `!llvm.struct<` string-literal `,` `packed`?
                            `(` type-or-ref-list `)` `>`
type-or-ref-list ::= <maybe empty comma-separated list of type-or-ref>
type-or-ref ::= <any compatible type with optional !llvm.>
              | `!llvm.`? `struct<` string-literal `>`
```

The body of the identified struct is printed in full unless it is
transitively contained in the same struct. In the latter case, only the
identifier is printed. For example, the structure containing the pointer to
itself is represented as `!llvm.struct<"A", (ptr<struct<"A">>)>`, and the
structure `A` containing two pointers to the structure `B` containing a pointer
to the structure `A` is represented as `!llvm.struct<"A", (ptr<struct<"B",
(ptr<struct<"A">>)>>, ptr<struct<"B", (ptr<struct<"A">>)>>)>`. Note that the
structure `B` is "unrolled" for both elements. _A structure with the same name
but different body is a syntax error._ **The user must ensure structure name
uniqueness across all modules processed in a given MLIR context.** Structure
names are arbitrary string literals and may include, e.g., spaces and keywords.

Identified structs may be _opaque_. In this case, the body is unknown but the
structure type is considered _initialized_ and is valid in the IR.
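
As an illustration of the grammar above, a recursive type and an opaque type
might be written as follows. The struct names `"node"` and `"handle"` are
hypothetical.

```mlir
// Singly linked list node: a payload followed by a pointer to the same struct.
// The inner reference only names the struct instead of repeating its body.
!llvm.struct<"node", (i32, ptr<struct<"node">>)>

// Pointer to an opaque struct whose body is unknown.
!llvm.ptr<struct<"handle", opaque>>
```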

#### Literal Structure Types

Literal structures are uniqued according to the list of elements they contain,
and can optionally be packed. The syntax for such structs is as follows.

```
llvm-literal-struct-type ::= `!llvm.struct<` `packed`? `(` type-list `)` `>`
type-list ::= <maybe empty comma-separated list of types with optional !llvm.>
```

Literal structs cannot be recursive, but can contain other structs. Therefore,
they must be constructed in a single step with the entire list of contained
elements provided.

#### Examples of Structure Types

```mlir
!llvm.struct<>                         // NOT allowed
!llvm.struct<()>                       // empty, literal
!llvm.struct<(i32)>                    // literal
!llvm.struct<(struct<(i32)>)>          // struct containing a struct
!llvm.struct<packed (i8, i32)>         // packed struct
!llvm.struct<"a">                      // recursive reference, only allowed within
                                       // another struct, NOT allowed at top level
!llvm.struct<"a", (ptr<struct<"a">>)>  // supported example of recursive reference
!llvm.struct<"a", ()>                  // empty, named (necessary to differentiate from
                                       // recursive reference)
!llvm.struct<"a", opaque>              // opaque, named
!llvm.struct<"a", (i32)>               // named
!llvm.struct<"a", packed (i8, i32)>    // named, packed
```

### Unsupported Types

LLVM IR `label` type does not have a counterpart in the LLVM dialect since, in
MLIR, blocks are not values and don't need a type.

## Operations

All operations in the LLVM IR dialect have a custom form in MLIR. The mnemonic
of an operation is that used in LLVM IR prefixed with "`llvm.`".

[include "Dialects/LLVMOps.md"]
Alex Zinenkoeb4917d2020-12-17 13:09:11468[include "Dialects/LLVMOps.md"]