# LLVM IR Target

This document describes the mechanisms of producing LLVM IR from MLIR. The
overall flow is two-stage:

1.  **conversion** of the IR to a set of dialects translatable to LLVM IR, for
    example [LLVM Dialect](Dialects/LLVM.md) or one of the hardware-specific
    dialects derived from LLVM IR intrinsics such as [AMX](Dialects/AMX.md),
    [X86Vector](Dialects/X86Vector.md) or [ArmNeon](Dialects/ArmNeon.md);
2.  **translation** of MLIR dialects to LLVM IR.

This flow allows non-trivial transformations to be performed within MLIR using
MLIR APIs and makes the translation between MLIR and LLVM IR *simple* and
potentially bidirectional. As a corollary, dialect ops translatable to LLVM IR
are expected to closely match the corresponding LLVM IR instructions and
intrinsics. This minimizes the dependency on LLVM IR libraries in MLIR and
reduces churn when LLVM IR changes.

SPIR-V to LLVM dialect conversion has a
[dedicated document](SPIRVToLLVMDialectConversion.md).

[TOC]
## Conversion to the LLVM Dialect

Conversion to the LLVM dialect from other dialects is the first step to produce
LLVM IR. All non-trivial IR modifications are expected to happen at this stage
or before. The conversion is *progressive*: most passes convert one dialect to
the LLVM dialect and keep operations from other dialects intact. For example,
the `-convert-memref-to-llvm` pass will only convert operations from the
`memref` dialect but will not convert operations from other dialects even if
they use or produce `memref`-typed values.

The process relies on the [Dialect Conversion](DialectConversion.md)
infrastructure and, in particular, on the
[materialization](DialectConversion.md#type-conversion) hooks of `TypeConverter`
to support progressive lowering by injecting `unrealized_conversion_cast`
operations between converted and unconverted operations. After multiple partial
conversions to the LLVM dialect have been performed, the cast operations that
became no-ops can be removed by the `-reconcile-unrealized-casts` pass. The
latter pass is not specific to the LLVM dialect and can remove any no-op casts.

### Conversion of Built-in Types

Built-in types have a default conversion to LLVM dialect types provided by the
`LLVMTypeConverter` class. Users targeting the LLVM dialect can reuse and extend
this type converter to support other types. Extra care must be taken if the
conversion rules for built-in types are overridden: all conversions must use the
same type converter.

#### LLVM Dialect-compatible Types

The types [compatible](Dialects/LLVM.md#built-in-type-compatibility) with the
LLVM dialect are kept as is.

#### Complex Type

Complex type is converted into an LLVM dialect literal structure type with two
elements:

-   real part;
-   imaginary part.

The elemental type is converted recursively using these rules.

Example:

```mlir
complex<f32>
// ->
!llvm.struct<(f32, f32)>
```

#### Index Type

Index type is converted into an LLVM dialect integer type with the bitwidth
specified by the [data layout](DataLayout.md) of the closest module. For
example, on x86-64 CPUs it converts to `i64`. This behavior can be overridden
by the type converter configuration, which is often exposed as a pass option by
conversion passes.

Example:

```mlir
index
// -> on x86_64
i64
```

#### Ranked MemRef Types

Ranked memref types are converted into an LLVM dialect literal structure type
that contains the dynamic information associated with the memref object,
referred to as *descriptor*. Only memrefs in the
**[strided form](Dialects/Builtin.md/#strided-memref)** can be converted to the
LLVM dialect with the default descriptor format. Memrefs with other, less
trivial layouts should be converted into the strided form first, e.g., by
materializing the non-trivial address remapping due to layout as `affine.apply`
operations.

The default memref descriptor is a struct with the following fields:

1.  The pointer to the data buffer as allocated, referred to as "allocated
    pointer". This is only useful for deallocating the memref.
2.  The pointer to the properly aligned data pointer that the memref indexes,
    referred to as "aligned pointer".
3.  A converted `index`-typed integer containing the distance in number of
    elements between the beginning of the (aligned) buffer and the first
    element to be accessed through the memref, referred to as "offset".
4.  An array containing as many converted `index`-typed integers as the rank of
    the memref: the array represents the size, in number of elements, of the
    memref along the given dimension.
5.  A second array containing as many converted `index`-typed integers as the
    rank of the memref: the second array represents the "stride" (in tensor
    abstraction sense), i.e. the number of consecutive elements of the
    underlying buffer one needs to jump over to get to the next logically
    indexed element.

For constant memref dimensions, the corresponding size entry is a constant whose
runtime value matches the static value. This normalization serves as an ABI for
the memref type to interoperate with externally linked functions. In the
particular case of rank-`0` memrefs, the size and stride arrays are omitted,
resulting in a struct containing the two pointers and the offset.

Examples:

```mlir
// Assuming index is converted to i64.

memref<f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
memref<1 x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                 array<1 x i64>, array<1 x i64>)>
memref<? x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                 array<1 x i64>, array<1 x i64>)>
memref<10x42x42x43x123 x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                 array<5 x i64>, array<5 x i64>)>
memref<10x?x42x?x123 x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                 array<5 x i64>, array<5 x i64>)>

// Memref types can have vectors as element types.
memref<1x? x vector<4xf32>> -> !llvm.struct<(ptr<vector<4 x f32>>,
                                             ptr<vector<4 x f32>>, i64,
                                             array<2 x i64>, array<2 x i64>)>
```

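For interoperation purposes, the descriptor layout above can be mirrored in C.
The sketch below is illustrative rather than normative: the struct and helper
names (`MemRef2DF32`, `memref_addr`) are hypothetical, and it assumes `index`
lowers to a 64-bit integer, written as `intptr_t`. It shows the address
computation implied by the strided form: an element at logical indices `(i, j)`
lives at `aligned + offset + i*strides[0] + j*strides[1]`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C view of the descriptor for memref<?x?xf32>,
   assuming `index` lowers to a 64-bit integer (intptr_t here). */
typedef struct {
  float *allocated;    /* pointer as returned by the allocator */
  float *aligned;      /* properly aligned pointer used for indexing */
  intptr_t offset;     /* distance, in elements, to the first element */
  intptr_t sizes[2];   /* size of each dimension, in elements */
  intptr_t strides[2]; /* elements to skip to advance one index in a dim */
} MemRef2DF32;

/* Address computation implied by the strided form. */
static float *memref_addr(const MemRef2DF32 *m, intptr_t i, intptr_t j) {
  return m->aligned + m->offset + i * m->strides[0] + j * m->strides[1];
}
```

For a contiguous row-major `memref<2x3xf32>`, the sizes would be `{2, 3}` and
the strides `{3, 1}` with a zero offset.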
#### Unranked MemRef Types

Unranked memref types are converted to an LLVM dialect literal structure type
that contains the dynamic information associated with the memref object,
referred to as *unranked descriptor*. It contains:

1.  a converted `index`-typed integer representing the dynamic rank of the
    memref;
2.  a type-erased pointer (`!llvm.ptr<i8>`) to a ranked memref descriptor with
    the contents listed above.

This descriptor is primarily intended for interfacing with rank-polymorphic
library functions. The pointer to the ranked memref descriptor points to some
*allocated* memory, which may reside on the stack of the current function or on
the heap. Conversion patterns for operations producing unranked memrefs are
expected to manage the allocation. Note that this may lead to stack allocations
(`llvm.alloca`) being performed in a loop and not reclaimed until the end of the
current function.

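The unranked descriptor can likewise be viewed from C. The following sketch is
hypothetical (the type names are ours, and `index` is assumed to lower to
`intptr_t`); it shows how a rank-polymorphic function can recover the sizes of
a memref of `f32` elements through the type-erased pointer, relying on the
ranked descriptor starting with two pointers followed by the offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C view of the unranked descriptor. */
typedef struct {
  intptr_t rank;    /* dynamic rank of the memref */
  void *descriptor; /* type-erased pointer to a ranked descriptor */
} UnrankedMemRef;

/* Hypothetical ranked descriptor for memref<?x?xf32>, for testing. */
typedef struct {
  float *allocated, *aligned;
  intptr_t offset;
  intptr_t sizes[2];
  intptr_t strides[2];
} Ranked2DF32;

/* Rank-polymorphic helper: the sizes array of any ranked f32 descriptor
   starts after two pointers and one index-typed offset. */
static intptr_t num_elements(UnrankedMemRef u) {
  char *d = (char *)u.descriptor;
  intptr_t *sizes = (intptr_t *)(d + 2 * sizeof(void *) + sizeof(intptr_t));
  intptr_t n = 1;
  for (intptr_t i = 0; i < u.rank; ++i)
    n *= sizes[i];
  return n;
}
```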
#### Function Types

Function types are converted to LLVM dialect function types as follows:

-   function argument and result types are converted recursively using these
    rules;
-   if a function type has multiple results, they are wrapped into an LLVM
    dialect literal structure type since LLVM function types must have exactly
    one result;
-   if a function type has no results, the corresponding LLVM dialect function
    type will have one `!llvm.void` result since LLVM function types must have
    a result;
-   function types used in arguments of another function type are wrapped in an
    LLVM dialect pointer type to comply with LLVM IR expectations;
-   the structs corresponding to `memref` types, both ranked and unranked,
    appearing as function arguments are unbundled into individual function
    arguments to allow for specifying metadata such as aliasing information on
    individual pointers;
-   the conversion of `memref`-typed arguments is subject to
    [calling conventions](TargetLLVMIR.md#calling-conventions).

Examples:

```mlir
// Zero-ary function type with no results:
() -> ()
// is converted to a zero-ary function with `void` result.
!llvm.func<void ()>

// Unary function with one result:
(i32) -> (i64)
// has its argument and result type converted, before creating the LLVM dialect
// function type.
!llvm.func<i64 (i32)>

// Binary function with one result:
(i32, f32) -> (i64)
// has its arguments handled separately
!llvm.func<i64 (i32, f32)>

// Binary function with two results:
(i32, f32) -> (i64, f64)
// has its results aggregated into a structure type.
!llvm.func<struct<(i64, f64)> (i32, f32)>

// Function-typed arguments or results in higher-order functions:
(() -> ()) -> (() -> ())
// are converted into pointers to functions.
!llvm.func<ptr<func<void ()>> (ptr<func<void ()>>)>

// These rules apply recursively: a function type taking a function that takes
// another function
( ( (i32) -> (i64) ) -> () ) -> ()
// is converted into a function type taking a pointer-to-function that takes
// another pointer-to-function.
!llvm.func<void (ptr<func<void (ptr<func<i64 (i32)>>)>>)>

// A memref descriptor appearing as function argument:
(memref<f32>) -> ()
// gets converted into a list of individual scalar components of a descriptor.
!llvm.func<void (ptr<f32>, ptr<f32>, i64)>

// The list of arguments is linearized and one can freely mix memref and other
// types in this list:
(memref<f32>, f32) -> ()
// which gets converted into a flat list.
!llvm.func<void (ptr<f32>, ptr<f32>, i64, f32)>

// For nD ranked memref descriptors:
(memref<?x?xf32>) -> ()
// the converted signature will contain 2n+1 `index`-typed integer arguments,
// offset, n sizes and n strides, per memref argument type.
!llvm.func<void (ptr<f32>, ptr<f32>, i64, i64, i64, i64, i64)>

// Same rules apply to unranked descriptors:
(memref<*xf32>) -> ()
// which get converted into their components.
!llvm.func<void (i64, ptr<i8>)>

// However, returning a memref from a function is not affected:
() -> (memref<?xf32>)
// gets converted to a function returning a descriptor structure.
!llvm.func<struct<(ptr<f32>, ptr<f32>, i64, array<1xi64>, array<1xi64>)> ()>

// If multiple memref-typed results are returned:
() -> (memref<f32>, memref<f64>)
// their descriptor structures are additionally packed into another structure,
// potentially with other non-memref typed results.
!llvm.func<struct<(struct<(ptr<f32>, ptr<f32>, i64)>,
                   struct<(ptr<f64>, ptr<f64>, i64)>)> ()>
```

Conversion patterns are available to convert built-in function operations and
standard call operations targeting those functions using these conversion rules.

#### Multi-dimensional Vector Types

LLVM IR only supports *one-dimensional* vectors, unlike MLIR where vectors can
be multi-dimensional. Vector types cannot be nested in either IR. In the
one-dimensional case, MLIR vectors are converted to LLVM IR vectors of the same
size with element type converted using these conversion rules. In the
n-dimensional case, MLIR vectors are converted to (n-1)-dimensional array types
of one-dimensional vectors.

Examples:

```mlir
vector<4x8 x f32>
// ->
!llvm.array<4 x vector<8 x f32>>

memref<2 x vector<4x8 x f32>>
// ->
!llvm.struct<(ptr<array<4 x vector<8xf32>>>, ptr<array<4 x vector<8xf32>>>,
              i64, array<1 x i64>, array<1 x i64>)>
```

#### Tensor Types

Tensor types cannot be converted to the LLVM dialect. Operations on tensors must
be [bufferized](Bufferization.md) before being converted.

### Calling Conventions

Calling conventions provide a mechanism to customize the conversion of function
and function call operations without changing how individual types are handled
elsewhere. They are implemented simultaneously by the default type converter and
by the conversion patterns for the relevant operations.

#### Function Result Packing

In the case of multi-result functions, the returned values are inserted into a
structure-typed value before being returned and extracted from it at the call
site. This transformation is part of the conversion and is transparent to the
definitions and uses of the values being returned.

Example:

```mlir
func @foo(%arg0: i32, %arg1: i64) -> (i32, i64) {
  return %arg0, %arg1 : i32, i64
}
func @bar() {
  %0 = arith.constant 42 : i32
  %1 = arith.constant 17 : i64
  %2:2 = call @foo(%0, %1) : (i32, i64) -> (i32, i64)
  "use_i32"(%2#0) : (i32) -> ()
  "use_i64"(%2#1) : (i64) -> ()
}

// is transformed into

llvm.func @foo(%arg0: i32, %arg1: i64) -> !llvm.struct<(i32, i64)> {
  // insert the values into a structure
  %0 = llvm.mlir.undef : !llvm.struct<(i32, i64)>
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.struct<(i32, i64)>
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.struct<(i32, i64)>

  // return the structure value
  llvm.return %2 : !llvm.struct<(i32, i64)>
}
llvm.func @bar() {
  %0 = llvm.mlir.constant(42 : i32) : i32
  %1 = llvm.mlir.constant(17 : i64) : i64

  // call and extract the values from the structure
  %2 = llvm.call @foo(%0, %1)
     : (i32, i64) -> !llvm.struct<(i32, i64)>
  %3 = llvm.extractvalue %2[0] : !llvm.struct<(i32, i64)>
  %4 = llvm.extractvalue %2[1] : !llvm.struct<(i32, i64)>

  // use as before
  "use_i32"(%3) : (i32) -> ()
  "use_i64"(%4) : (i64) -> ()
}
```

#### Default Calling Convention for Ranked MemRef

The default calling convention converts `memref`-typed function arguments to
LLVM dialect literal structs
[defined above](TargetLLVMIR.md#ranked-memref-types) before unbundling them into
individual scalar arguments.

This convention is implemented in the conversion of `builtin.func` and
`func.call` to the LLVM dialect, with the former unpacking the descriptor into a
set of individual values and the latter packing those values back into a
descriptor so as to make it transparently usable by other operations.
Conversions from other dialects should take this convention into account.

This specific convention is motivated by the necessity to specify alignment and
aliasing attributes on the raw pointers underpinning the memref.

Examples:

```mlir
func @foo(%arg0: memref<?xf32>) -> () {
  "use"(%arg0) : (memref<?xf32>) -> ()
  return
}

// Gets converted to the following
// (using type alias for brevity):
!llvm.memref_1d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                     array<1xi64>, array<1xi64>)>

llvm.func @foo(%arg0: !llvm.ptr<f32>,  // Allocated pointer.
               %arg1: !llvm.ptr<f32>,  // Aligned pointer.
               %arg2: i64,             // Offset.
               %arg3: i64,             // Size in dim 0.
               %arg4: i64) {           // Stride in dim 0.
  // Populate memref descriptor structure.
  %0 = llvm.mlir.undef : !llvm.memref_1d
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_1d
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_1d
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_1d
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_1d
  %5 = llvm.insertvalue %arg4, %4[4, 0] : !llvm.memref_1d

  // Descriptor is now usable as a single value.
  "use"(%5) : (!llvm.memref_1d) -> ()
  llvm.return
}
```

```mlir
func @bar() {
  %0 = "get"() : () -> (memref<?xf32>)
  call @foo(%0) : (memref<?xf32>) -> ()
  return
}

// Gets converted to the following
// (using type alias for brevity):
!llvm.memref_1d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                     array<1xi64>, array<1xi64>)>

llvm.func @bar() {
  %0 = "get"() : () -> !llvm.memref_1d

  // Unpack the memref descriptor.
  %1 = llvm.extractvalue %0[0] : !llvm.memref_1d
  %2 = llvm.extractvalue %0[1] : !llvm.memref_1d
  %3 = llvm.extractvalue %0[2] : !llvm.memref_1d
  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_1d
  %5 = llvm.extractvalue %0[4, 0] : !llvm.memref_1d

  // Pass individual values to the callee.
  llvm.call @foo(%1, %2, %3, %4, %5)
    : (!llvm.ptr<f32>, !llvm.ptr<f32>, i64, i64, i64) -> ()
  llvm.return
}
```

#### Default Calling Convention for Unranked MemRef

For unranked memrefs, the list of function arguments always contains two
elements, same as the unranked memref descriptor: an integer rank, and a
type-erased (`!llvm.ptr<i8>`) pointer to the ranked memref descriptor. Note that
while the *calling convention* does not require allocation, *casting* to
unranked memref does since one cannot take an address of an SSA value containing
the ranked memref, which must be stored in some memory instead. The caller is in
charge of ensuring the thread safety and management of the allocated memory, in
particular the deallocation.

Example:

```mlir
func @foo(%arg0: memref<*xf32>) -> () {
  "use"(%arg0) : (memref<*xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @foo(%arg0: i64,             // Rank.
               %arg1: !llvm.ptr<i8>) { // Type-erased pointer to descriptor.
  // Pack the unranked memref descriptor.
  %0 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)>
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.struct<(i64, ptr<i8>)>
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.struct<(i64, ptr<i8>)>

  "use"(%2) : (!llvm.struct<(i64, ptr<i8>)>) -> ()
  llvm.return
}
```

```mlir
func @bar() {
  %0 = "get"() : () -> (memref<*xf32>)
  call @foo(%0) : (memref<*xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @bar() {
  %0 = "get"() : () -> (!llvm.struct<(i64, ptr<i8>)>)

  // Unpack the memref descriptor.
  %1 = llvm.extractvalue %0[0] : !llvm.struct<(i64, ptr<i8>)>
  %2 = llvm.extractvalue %0[1] : !llvm.struct<(i64, ptr<i8>)>

  // Pass individual values to the callee.
  llvm.call @foo(%1, %2) : (i64, !llvm.ptr<i8>) -> ()
  llvm.return
}
```

**Lifetime.** The second element of the unranked memref descriptor points to
some memory in which the ranked memref descriptor is stored. By convention, this
memory is allocated on stack and has the lifetime of the function. (*Note:* due
to function-length lifetime, creation of multiple unranked memref descriptors,
e.g., in a loop, may lead to stack overflows.) If an unranked descriptor has to
be returned from a function, the ranked descriptor it points to is copied into
dynamically allocated memory, and the pointer in the unranked descriptor is
updated accordingly. The allocation happens immediately before returning. It is
the responsibility of the caller to free the dynamically allocated memory. The
default conversion of `func.call` and `func.call_indirect` copies the ranked
descriptor to newly allocated memory on the caller's stack. Thus, the convention
of the ranked memref descriptor pointed to by an unranked memref descriptor
being stored on stack is respected.

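The ownership rule for returned unranked descriptors can be mimicked in plain C.
The sketch below uses hypothetical names and assumes `index` lowers to
`intptr_t`: the callee copies the ranked descriptor, which so far lived on its
stack, to the heap immediately before returning, so the pointer stays valid in
the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical ranked and unranked descriptors for memref<?xf32>. */
typedef struct {
  float *allocated, *aligned;
  intptr_t offset, sizes[1], strides[1];
} Ranked1DF32;

typedef struct {
  intptr_t rank;
  void *descriptor;
} UnrankedMemRef;

/* Mimics returning an unranked memref: the ranked descriptor is copied
   into dynamically allocated memory immediately before returning. */
static UnrankedMemRef make_unranked(Ranked1DF32 stack_desc) {
  Ranked1DF32 *heap_desc = malloc(sizeof *heap_desc);
  memcpy(heap_desc, &stack_desc, sizeof *heap_desc);
  UnrankedMemRef u = {1, heap_desc};
  return u;
}
```

The caller owns `u.descriptor` and must eventually `free` it, mirroring the
responsibility described above.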
#### Bare Pointer Calling Convention for Ranked MemRef

The "bare pointer" calling convention converts `memref`-typed function arguments
to a *single* pointer to the aligned data. Note that this does *not* apply to
uses of `memref` outside of function signatures; the default descriptor
structures are still used. This convention further restricts the supported cases
to the following.

-   `memref` types with default layout.
-   `memref` types with all dimensions statically known.
-   `memref` values allocated in such a way that the allocated and aligned
    pointers match. Alternatively, the same function must handle allocation and
    deallocation since only one pointer is passed to any callee.

Examples:

```mlir
func @callee(memref<2x4xf32>)

func @caller(%0 : memref<2x4xf32>) {
  call @callee(%0) : (memref<2x4xf32>) -> ()
}

// ->

!descriptor = !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                            array<2xi64>, array<2xi64>)>

llvm.func @callee(!llvm.ptr<f32>)

llvm.func @caller(%arg0: !llvm.ptr<f32>) {
  // A descriptor value is defined at the function entry point.
  %0 = llvm.mlir.undef : !descriptor

  // Both the allocated and aligned pointers are set up to the same value.
  %1 = llvm.insertvalue %arg0, %0[0] : !descriptor
  %2 = llvm.insertvalue %arg0, %1[1] : !descriptor

  // The offset is set up to zero.
  %3 = llvm.mlir.constant(0 : index) : i64
  %4 = llvm.insertvalue %3, %2[2] : !descriptor

  // The sizes and strides are derived from the statically known values.
  %5 = llvm.mlir.constant(2 : index) : i64
  %6 = llvm.mlir.constant(4 : index) : i64
  %7 = llvm.insertvalue %5, %4[3, 0] : !descriptor
  %8 = llvm.insertvalue %6, %7[3, 1] : !descriptor
  %9 = llvm.mlir.constant(1 : index) : i64
  %10 = llvm.insertvalue %6, %8[4, 0] : !descriptor
  %11 = llvm.insertvalue %9, %10[4, 1] : !descriptor

  // The function call corresponds to extracting the aligned data pointer.
  %12 = llvm.extractvalue %11[1] : !descriptor
  llvm.call @callee(%12) : (!llvm.ptr<f32>) -> ()
}
```

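Seen from C, the difference between the two conventions is only the function
signature. The sketch below is illustrative, with hypothetical names: the
default convention expands a rank-1 view of the descriptor into individual
scalar arguments, while the bare-pointer convention passes only the aligned
pointer and derives the offset, sizes and strides from the static type
`memref<2x4xf32>`.

```c
#include <assert.h>
#include <stdint.h>

/* Default convention for memref<2x4xf32>: descriptor fields as scalars. */
static float default_conv_load(float *allocated, float *aligned,
                               intptr_t offset, intptr_t size0, intptr_t size1,
                               intptr_t stride0, intptr_t stride1,
                               intptr_t i, intptr_t j) {
  (void)allocated; (void)size0; (void)size1; /* unused by a load */
  return aligned[offset + i * stride0 + j * stride1];
}

/* Bare-pointer convention: one pointer; the shape 2x4, strides {4, 1} and
   zero offset are all implied by the static type. */
static float bare_conv_load(float *aligned, intptr_t i, intptr_t j) {
  return aligned[i * 4 + j];
}
```

Both loads address the same element; the bare-pointer form simply cannot
express dynamic shapes, non-trivial layouts, or a distinct allocated pointer.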
#### Bare Pointer Calling Convention for Unranked MemRef

The "bare pointer" calling convention does not support unranked memrefs as their
shape cannot be known at compile time.

### C-compatible wrapper emission

In practical cases, it may be desirable to have externally-facing functions with
a single argument corresponding to a MemRef argument. When interfacing with
LLVM IR produced from C, the code needs to respect the corresponding calling
convention. The conversion to the LLVM dialect provides an option to generate
wrapper functions that take memref descriptors as pointers-to-struct compatible
with data types produced by Clang when compiling C sources. The generation of
such wrapper functions can additionally be controlled at a function granularity
by setting the `llvm.emit_c_interface` unit attribute.

More specifically, a memref argument is converted into a pointer-to-struct
argument of type `{T*, T*, i64, i64[N], i64[N]}*` in the wrapper function, where
`T` is the converted element type and `N` is the memref rank. This type is
compatible with that produced by Clang for the following C++ structure template
instantiations or their equivalents in C.

```cpp
template<typename T, size_t N>
struct MemRefDescriptor {
  T *allocated;
  T *aligned;
  intptr_t offset;
  intptr_t sizes[N];
  intptr_t strides[N];
};
```

Furthermore, we also rewrite function results to pointer parameters if the
rewritten function result has a struct type. The special result parameter is
added as the first parameter and is of pointer-to-struct type.

If enabled, the option will do the following for *external* functions declared
in the MLIR module:

1.  Declare a new function `_mlir_ciface_<original name>` where memref arguments
    are converted to pointer-to-struct and the remaining arguments are converted
    as usual. Results are converted to a special argument if they are of struct
    type.
2.  Add a body to the original function (making it non-external) that
    1.  allocates memref descriptors,
    2.  populates them,
    3.  potentially allocates space for the result struct, and
    4.  passes the pointers to these into the newly declared interface function,
        then
    5.  collects the result of the call (potentially from the result struct),
        and
    6.  returns it to the caller.

For (non-external) functions defined in the MLIR module:

1.  Define a new function `_mlir_ciface_<original name>` where memref arguments
    are converted to pointer-to-struct and the remaining arguments are converted
    as usual. Results are converted to a special argument if they are of struct
    type.
2.  Populate the body of the newly defined function with IR that
    1.  loads descriptors from pointers;
    2.  unpacks descriptors into individual non-aggregate values;
    3.  passes these values into the original function;
    4.  collects the results of the call, and
    5.  either copies the results into the result struct or returns them to the
        caller.

Examples:

```mlir
func @qux(%arg0: memref<?x?xf32>)

// Gets converted into the following
// (using type alias for brevity):
!llvm.memref_2d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                     array<2xi64>, array<2xi64>)>

// Function with unpacked arguments.
llvm.func @qux(%arg0: !llvm.ptr<f32>, %arg1: !llvm.ptr<f32>,
               %arg2: i64, %arg3: i64, %arg4: i64,
               %arg5: i64, %arg6: i64) {
  // Populate memref descriptor (as per calling convention).
  %0 = llvm.mlir.undef : !llvm.memref_2d
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_2d
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_2d
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_2d
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_2d
  %5 = llvm.insertvalue %arg5, %4[4, 0] : !llvm.memref_2d
  %6 = llvm.insertvalue %arg4, %5[3, 1] : !llvm.memref_2d
  %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm.memref_2d

  // Store the descriptor in a stack-allocated space.
  %8 = llvm.mlir.constant(1 : index) : i64
  %9 = llvm.alloca %8 x !llvm.memref_2d
     : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
                                  array<2xi64>, array<2xi64>)>>
  llvm.store %7, %9 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
                                        array<2xi64>, array<2xi64>)>>

  // Call the interface function.
  llvm.call @_mlir_ciface_qux(%9)
    : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
                         array<2xi64>, array<2xi64>)>>) -> ()

  // The stored descriptor will be freed on return.
  llvm.return
}

// Interface function.
llvm.func @_mlir_ciface_qux(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
                                              array<2xi64>, array<2xi64>)>>)
```

```mlir
func @foo(%arg0: memref<?x?xf32>) {
  return
}

// Gets converted into the following
// (using type alias for brevity):
!llvm.memref_2d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                     array<2xi64>, array<2xi64>)>
!llvm.memref_2d_ptr = type !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
                                             array<2xi64>, array<2xi64>)>>

// Function with unpacked arguments.
llvm.func @foo(%arg0: !llvm.ptr<f32>, %arg1: !llvm.ptr<f32>,
               %arg2: i64, %arg3: i64, %arg4: i64,
               %arg5: i64, %arg6: i64) {
  llvm.return
}

// Interface function callable from C.
llvm.func @_mlir_ciface_foo(%arg0: !llvm.memref_2d_ptr) {
  // Load the descriptor.
  %0 = llvm.load %arg0 : !llvm.memref_2d_ptr

  // Unpack the descriptor as per calling convention.
  %1 = llvm.extractvalue %0[0] : !llvm.memref_2d
  %2 = llvm.extractvalue %0[1] : !llvm.memref_2d
  %3 = llvm.extractvalue %0[2] : !llvm.memref_2d
  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_2d
  %5 = llvm.extractvalue %0[3, 1] : !llvm.memref_2d
  %6 = llvm.extractvalue %0[4, 0] : !llvm.memref_2d
  %7 = llvm.extractvalue %0[4, 1] : !llvm.memref_2d
  llvm.call @foo(%1, %2, %3, %4, %5, %6, %7)
    : (!llvm.ptr<f32>, !llvm.ptr<f32>, i64, i64, i64,
       i64, i64) -> ()
  llvm.return
}
```

```mlir
func @foo(%arg0: memref<?x?xf32>) -> memref<?x?xf32> {
  return %arg0 : memref<?x?xf32>
}

// Gets converted into the following
// (using type alias for brevity):
!llvm.memref_2d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                     array<2xi64>, array<2xi64>)>
!llvm.memref_2d_ptr = type !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
                                             array<2xi64>, array<2xi64>)>>

// Function with unpacked arguments.
llvm.func @foo(%arg0: !llvm.ptr<f32>, %arg1: !llvm.ptr<f32>, %arg2: i64,
               %arg3: i64, %arg4: i64, %arg5: i64, %arg6: i64)
    -> !llvm.memref_2d {
  %0 = llvm.mlir.undef : !llvm.memref_2d
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_2d
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_2d
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_2d
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_2d
  %5 = llvm.insertvalue %arg5, %4[4, 0] : !llvm.memref_2d
  %6 = llvm.insertvalue %arg4, %5[3, 1] : !llvm.memref_2d
  %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm.memref_2d
  llvm.return %7 : !llvm.memref_2d
}

// Interface function callable from C.
llvm.func @_mlir_ciface_foo(%arg0: !llvm.memref_2d_ptr,
                            %arg1: !llvm.memref_2d_ptr) {
  %0 = llvm.load %arg1 : !llvm.memref_2d_ptr
  %1 = llvm.extractvalue %0[0] : !llvm.memref_2d
  %2 = llvm.extractvalue %0[1] : !llvm.memref_2d
  %3 = llvm.extractvalue %0[2] : !llvm.memref_2d
  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_2d
  %5 = llvm.extractvalue %0[3, 1] : !llvm.memref_2d
  %6 = llvm.extractvalue %0[4, 0] : !llvm.memref_2d
  %7 = llvm.extractvalue %0[4, 1] : !llvm.memref_2d
  %8 = llvm.call @foo(%1, %2, %3, %4, %5, %6, %7)
      : (!llvm.ptr<f32>, !llvm.ptr<f32>, i64, i64, i64, i64, i64)
      -> !llvm.memref_2d
  llvm.store %8, %arg0 : !llvm.memref_2d_ptr
  llvm.return
}
```

Rationale: Introducing auxiliary functions for C-compatible interfaces is
preferred to modifying the calling convention since it minimizes the effect of
C compatibility on intra-module calls and on calls between MLIR-generated
functions. In particular, when an external function is called from within a
(parallel) loop in an MLIR module, storing a memref descriptor on the stack can
lead to stack exhaustion and/or to concurrent accesses to the same address. The
auxiliary interface function serves as an allocation scope in this case.
Furthermore, when targeting accelerators with separate memory spaces, such as
GPUs, stack-allocated descriptors passed by pointer would have to be
transferred to the device memory, which introduces significant overhead. In
such situations, auxiliary interface functions are executed on the host and
only pass the values through the device function invocation mechanism.

### Address Computation

An access to a memref element is transformed into an access to an element of
the buffer pointed to by the descriptor. The position of the element in the
buffer is calculated by linearizing the memref indices in row-major order (the
lexically first index is the slowest varying, similar to C, but accounting for
strides). The computation of the linear address is emitted as arithmetic
operations in the LLVM IR dialect. Strides are extracted from the memref
descriptor.

Examples:

An access to a memref with indices:

```mlir
%0 = memref.load %m[%1,%2,%3,%4] : memref<?x?x4x8xf32, offset: ?>
```

is transformed into the equivalent of the following code:

```mlir
// Compute the linearized index from strides.
// When strides or, in absence of explicit strides, the corresponding sizes are
// dynamic, extract the stride value from the descriptor.
%stride1 = llvm.extractvalue[4, 0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                                   array<4xi64>, array<4xi64>)>
%addr1 = arith.muli %stride1, %1 : i64

// When the stride or, in absence of explicit strides, the trailing sizes are
// known statically, this value is used as a constant. The natural value of
// strides is the product of all sizes following the current dimension.
%stride2 = llvm.mlir.constant(32 : index) : i64
%addr2 = arith.muli %stride2, %2 : i64
%addr3 = arith.addi %addr1, %addr2 : i64

%stride3 = llvm.mlir.constant(8 : index) : i64
%addr4 = arith.muli %stride3, %3 : i64
%addr5 = arith.addi %addr3, %addr4 : i64

// Multiplication with the known unit stride can be omitted.
%addr6 = arith.addi %addr5, %4 : i64

// If the linear offset is known to be zero, it can also be omitted. If it is
// dynamic, it is extracted from the descriptor.
%offset = llvm.extractvalue[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                               array<4xi64>, array<4xi64>)>
%addr7 = arith.addi %addr6, %offset : i64

// All accesses are based on the aligned pointer.
%aligned = llvm.extractvalue[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                                array<4xi64>, array<4xi64>)>

// Get the address of the data pointer.
%ptr = llvm.getelementptr %aligned[%addr7]
    : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>

// Perform the actual load.
%0 = llvm.load %ptr : !llvm.ptr<f32>
```

For stores, the address computation code is identical and only the actual store
operation is different.

Note: the conversion does not perform any sort of common subexpression
elimination when emitting memref accesses.

### Utility Classes

Utility classes common to many conversions to the LLVM dialect can be found
under `lib/Conversion/LLVMCommon`. They include the following.

- `LLVMConversionTarget` specifies all LLVM dialect operations as legal.
- `LLVMTypeConverter` implements the default type conversion as described
  above.
- `ConvertOpToLLVMPattern` extends the conversion pattern class with LLVM
  dialect-specific functionality.
- `VectorConvertOpToLLVMPattern` extends the previous class to automatically
  unroll operations on higher-dimensional vectors into lists of operations on
  one-dimensional vectors.
- `StructBuilder` provides a convenient API for building IR that creates or
  accesses values of LLVM dialect structure types; it is subclassed by
  `MemRefDescriptor`, `UnrankedMemRefDescriptor` and `ComplexStructBuilder`
  for the built-in types convertible to LLVM dialect structure types.

## Translation to LLVM IR

MLIR modules containing `llvm.func`, `llvm.mlir.global` and `llvm.metadata`
operations can be translated to LLVM IR modules using the following scheme.

- Module-level globals are translated to LLVM IR global values.
- Module-level metadata are translated to LLVM IR metadata, which can be later
  augmented with additional metadata defined on specific ops.
- All functions are declared in the module so that they can be referenced.
- Each function is then translated separately and has access to the complete
  mappings between MLIR and LLVM IR globals, metadata, and functions.
- Within a function, blocks are traversed in topological order and translated
  to LLVM IR basic blocks. In each basic block, PHI nodes are created for each
  of the block arguments, but are not yet connected to their source blocks.
- Within each block, operations are translated in order. Each operation has
  access to the same mappings as the function and additionally to the mapping
  of values between MLIR and LLVM IR, including PHI nodes. Operations with
  regions are responsible for translating the regions they contain.
- After operations in a function are translated, the PHI nodes of blocks in
  this function are connected to their source values, which are now available.

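The deferred PHI wiring in the scheme above can be sketched as a toy two-pass
algorithm. All types and names below are illustrative inventions, not the
actual MLIR API; the real logic lives in the translation infrastructure.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// A placeholder PHI: created in pass 1, wired in pass 2.
struct Phi {
  std::vector<std::pair<std::string, int>> incoming; // (predecessor, value)
};

// A toy block: a name, a number of block arguments, and, for each
// successor, the values forwarded to that successor's block arguments.
struct Block {
  std::string name;
  int numArgs;
  std::map<std::string, std::vector<int>> branchOperands;
};

std::map<std::string, std::vector<Phi>> translateBlocks(
    const std::vector<Block> &blocks) {
  std::map<std::string, std::vector<Phi>> phis;
  // Pass 1: create one disconnected placeholder PHI per block argument.
  for (const Block &b : blocks)
    phis[b.name].resize(b.numArgs);
  // Pass 2: every block now exists, so PHIs can be connected to the values
  // forwarded by each predecessor.
  for (const Block &b : blocks)
    for (const auto &edge : b.branchOperands)
      for (size_t i = 0; i < edge.second.size(); ++i)
        phis[edge.first][i].incoming.emplace_back(b.name, edge.second[i]);
  return phis;
}
```

The two passes are necessary because a branch may target a block that has not
been created yet; splitting creation from wiring removes any ordering
constraint between predecessors and successors.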
The translation mechanism provides extension hooks for translating custom
operations to LLVM IR via a dialect interface `LLVMTranslationDialectInterface`:

- `convertOperation` translates an operation that belongs to the current
  dialect to LLVM IR given an `IRBuilderBase` and various mappings;
- `amendOperation` performs additional actions on an operation if it contains
  a dialect attribute that belongs to the current dialect, for example sets up
  instruction-level metadata.

Dialects containing operations or attributes that want to be translated to LLVM
IR must provide an implementation of this interface and register it with the
system. Note that registration may happen without creating the dialect, for
example, in a separate library, to avoid the need for the "main" dialect
library to depend on LLVM IR libraries. The implementations of these methods
may use the
[`ModuleTranslation`](https://mlir.llvm.org/doxygen/classmlir_1_1LLVM_1_1ModuleTranslation.html)
object provided to them, which holds the state of the translation and contains
numerous utilities.

Note that this extension mechanism is *intentionally restrictive*. LLVM IR has a
small, relatively stable set of instructions and types that MLIR intends to
model fully. Therefore, the extension mechanism is provided only for LLVM IR
constructs that are more often extended -- intrinsics and metadata. The primary
goal of the extension mechanism is to support sets of intrinsics, for example
those representing a particular instruction set. The extension mechanism does
not allow for customizing type or block translation, nor does it support custom
module-level operations. Such transformations should be performed within MLIR
and target the corresponding MLIR constructs.

## Translation from LLVM IR

An experimental flow allows one to import a substantially limited subset of LLVM
IR into MLIR, producing LLVM dialect operations.

```
mlir-translate -import-llvm filename.ll
```