# LLVM IR Target

This document describes the mechanisms of producing LLVM IR from MLIR. The
overall flow is two-stage:

1.  **conversion** of the IR to a set of dialects translatable to LLVM IR, for
    example the [LLVM Dialect](Dialects/LLVM.md) or one of the
    hardware-specific dialects derived from LLVM IR intrinsics such as
    [AMX](Dialects/AMX.md), [X86Vector](Dialects/X86Vector.md) or
    [ArmNeon](Dialects/ArmNeon.md);
2.  **translation** of MLIR dialects to LLVM IR.

This flow allows the non-trivial transformations to be performed within MLIR
using MLIR APIs and makes the translation between MLIR and LLVM IR *simple* and
potentially bidirectional. As a corollary, dialect ops translatable to LLVM IR
are expected to closely match the corresponding LLVM IR instructions and
intrinsics. This minimizes the dependency of MLIR on LLVM IR libraries and
reduces churn when LLVM IR changes.

SPIR-V to LLVM dialect conversion has a
[dedicated document](SPIRVToLLVMDialectConversion.md).

[TOC]

## Conversion to the LLVM Dialect

Conversion to the LLVM dialect from other dialects is the first step to produce
LLVM IR. All non-trivial IR modifications are expected to happen at this stage
or before. The conversion is *progressive*: most passes convert one dialect to
the LLVM dialect and keep operations from other dialects intact. For example,
the `-convert-memref-to-llvm` pass will only convert operations from the
`memref` dialect but will not convert operations from other dialects even if
they use or produce `memref`-typed values.
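
For example, a module mixing `memref`, `arith`, and `func` operations can be
lowered by running several such passes in sequence. A hedged sketch (pass names
are current as of this writing and may change across versions):

```
mlir-opt input.mlir \
  -convert-memref-to-llvm \
  -convert-arith-to-llvm \
  -convert-func-to-llvm
```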

The process relies on the [Dialect Conversion](DialectConversion.md)
infrastructure and, in particular, on the
[materialization](DialectConversion.md#type-conversion) hooks of `TypeConverter`
to support progressive lowering by injecting `unrealized_conversion_cast`
operations between converted and unconverted operations. After multiple partial
conversions to the LLVM dialect are performed, the cast operations that became
no-ops can be removed by the `-reconcile-unrealized-casts` pass. The latter pass
is not specific to the LLVM dialect and can remove any no-op casts.
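
For illustration, below is a hedged sketch of the intermediate state after the
function signature has been converted but before the dialect of
`"consumer.op"` (an illustrative unconverted operation) has been lowered:

```mlir
// The converted function receives a descriptor, but the unconverted consumer
// still expects a memref value, so a cast is materialized between them.
llvm.func @f(%arg0: !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                  array<1 x i64>, array<1 x i64>)>) {
  %m = builtin.unrealized_conversion_cast %arg0
     : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
       to memref<?xf32>
  "consumer.op"(%m) : (memref<?xf32>) -> ()
  llvm.return
}
// Once the consumer is converted as well, this cast is paired with a cast in
// the opposite direction; the pair becomes a no-op and is deleted by
// -reconcile-unrealized-casts.
```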

### Conversion of Built-in Types

Built-in types have a default conversion to LLVM dialect types provided by the
`LLVMTypeConverter` class. Users targeting the LLVM dialect can reuse and
extend this type converter to support other types. Extra care must be taken if
the conversion rules for built-in types are overridden: all conversions must
use the same type converter.

#### LLVM Dialect-compatible Types

The types [compatible](Dialects/LLVM.md#built-in-type-compatibility) with the
LLVM dialect are kept as is.

#### Complex Type

Complex type is converted into an LLVM dialect literal structure type with two
elements:

-   real part;
-   imaginary part.

The elemental type is converted recursively using these rules.

Example:

```mlir
complex<f32>
// ->
!llvm.struct<(f32, f32)>
```

#### Index Type

Index type is converted into an LLVM dialect integer type with the bitwidth
specified by the [data layout](DataLayout.md) of the closest module. For
example, on x86-64 CPUs it converts to i64. This behavior can be overridden by
the type converter configuration, which is often exposed as a pass option by
conversion passes.

Example:

```mlir
index
// -> on x86_64
i64
```

#### Ranked MemRef Types

Ranked memref types are converted into an LLVM dialect literal structure type
that contains the dynamic information associated with the memref object,
referred to as *descriptor*. Only memrefs in the
**[strided form](Dialects/Builtin.md/#strided-memref)** can be converted to the
LLVM dialect with the default descriptor format. Memrefs with other, less
trivial layouts should be converted into the strided form first, e.g., by
materializing the non-trivial address remapping due to layout as `affine.apply`
operations.

The default memref descriptor is a struct with the following fields:

1.  The pointer to the data buffer as allocated, referred to as "allocated
    pointer". This is only useful for deallocating the memref.
2.  The pointer to the properly aligned data pointer that the memref indexes,
    referred to as "aligned pointer".
3.  A converted `index`-type integer containing the distance in number of
    elements between the beginning of the (aligned) buffer and the first
    element to be accessed through the memref, referred to as "offset".
4.  An array containing as many converted `index`-type integers as the rank of
    the memref: the array represents the size, in number of elements, of the
    memref along the given dimension.
5.  A second array containing as many converted `index`-type integers as the
    rank of the memref: the second array represents the "stride" (in the
    tensor abstraction sense), i.e. the number of consecutive elements of the
    underlying buffer one needs to jump over to get to the next logically
    indexed element.

For constant memref dimensions, the corresponding size entry is a constant
whose runtime value matches the static value. This normalization serves as an
ABI for the memref type to interoperate with externally linked functions. In
the particular case of rank `0` memrefs, the size and stride arrays are
omitted, resulting in a struct containing the two pointers and the offset.

Examples:

```mlir
// Assuming index is converted to i64.

memref<f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
memref<1 x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                 array<1 x i64>, array<1 x i64>)>
memref<? x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                 array<1 x i64>, array<1 x i64>)>
memref<10x42x42x43x123 x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                 array<5 x i64>, array<5 x i64>)>
memref<10x?x42x?x123 x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                 array<5 x i64>, array<5 x i64>)>

// Memref types can have vectors as element types.
memref<1x? x vector<4xf32>> -> !llvm.struct<(ptr<vector<4 x f32>>,
                                             ptr<vector<4 x f32>>, i64,
                                             array<2 x i64>, array<2 x i64>)>
```

#### Unranked MemRef Types

Unranked memref types are converted to an LLVM dialect literal structure type
that contains the dynamic information associated with the memref object,
referred to as *unranked descriptor*. It contains:

1.  a converted `index`-typed integer representing the dynamic rank of the
    memref;
2.  a type-erased pointer (`!llvm.ptr<i8>`) to a ranked memref descriptor with
    the contents listed above.

This descriptor is primarily intended for interfacing with rank-polymorphic
library functions. The pointer to the ranked memref descriptor points to some
*allocated* memory, which may reside on the stack of the current function or
on the heap. Conversion patterns for operations producing unranked memrefs are
expected to manage the allocation. Note that this may lead to stack
allocations (`llvm.alloca`) being performed in a loop and not reclaimed until
the end of the current function.
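
Following these rules, and assuming `index` is converted to i64, an unranked
memref type converts as follows:

```mlir
memref<*xf32>
// ->
!llvm.struct<(i64, ptr<i8>)>
```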

#### Function Types

Function types are converted to LLVM dialect function types as follows:

-   function argument and result types are converted recursively using these
    rules;
-   if a function type has multiple results, they are wrapped into an LLVM
    dialect literal structure type since LLVM function types must have exactly
    one result;
-   if a function type has no results, the corresponding LLVM dialect function
    type will have one `!llvm.void` result since LLVM function types must have
    a result;
-   function types used in arguments of another function type are wrapped in
    an LLVM dialect pointer type to comply with LLVM IR expectations;
-   the structs corresponding to `memref` types, both ranked and unranked,
    appearing as function arguments are unbundled into individual function
    arguments to allow for specifying metadata such as aliasing information on
    individual pointers;
-   the conversion of `memref`-typed arguments is subject to
    [calling conventions](TargetLLVMIR.md#calling-conventions).

Examples:

```mlir
// Zero-ary function type with no results:
() -> ()
// is converted to a zero-ary function with `void` result.
!llvm.func<void ()>

// Unary function with one result:
(i32) -> (i64)
// has its argument and result type converted, before creating the LLVM dialect
// function type.
!llvm.func<i64 (i32)>

// Binary function with one result:
(i32, f32) -> (i64)
// has its arguments handled separately.
!llvm.func<i64 (i32, f32)>

// Binary function with two results:
(i32, f32) -> (i64, f64)
// has its results aggregated into a structure type.
!llvm.func<struct<(i64, f64)> (i32, f32)>

// Function-typed arguments or results in higher-order functions:
(() -> ()) -> (() -> ())
// are converted into pointers to functions.
!llvm.func<ptr<func<void ()>> (ptr<func<void ()>>)>

// These rules apply recursively: a function type taking a function that takes
// another function
( ( (i32) -> (i64) ) -> () ) -> ()
// is converted into a function type taking a pointer-to-function that takes
// another pointer-to-function.
!llvm.func<void (ptr<func<void (ptr<func<i64 (i32)>>)>>)>

// A memref descriptor appearing as function argument:
(memref<f32>) -> ()
// gets converted into a list of individual scalar components of a descriptor.
!llvm.func<void (ptr<f32>, ptr<f32>, i64)>

// The list of arguments is linearized and one can freely mix memref and other
// types in this list:
(memref<f32>, f32) -> ()
// which gets converted into a flat list.
!llvm.func<void (ptr<f32>, ptr<f32>, i64, f32)>

// For nD ranked memref descriptors:
(memref<?x?xf32>) -> ()
// the converted signature will contain 2n+1 `index`-typed integer arguments,
// offset, n sizes and n strides, per memref argument type.
!llvm.func<void (ptr<f32>, ptr<f32>, i64, i64, i64, i64, i64)>

// Same rules apply to unranked descriptors:
(memref<*xf32>) -> ()
// which get converted into their components.
!llvm.func<void (i64, ptr<i8>)>

// However, returning a memref from a function is not affected:
() -> (memref<?xf32>)
// gets converted to a function returning a descriptor structure.
!llvm.func<struct<(ptr<f32>, ptr<f32>, i64, array<1xi64>, array<1xi64>)> ()>

// If multiple memref-typed results are returned:
() -> (memref<f32>, memref<f64>)
// their descriptor structures are additionally packed into another structure,
// potentially with other non-memref typed results.
!llvm.func<struct<(struct<(ptr<f32>, ptr<f32>, i64)>,
                   struct<(ptr<f64>, ptr<f64>, i64)>)> ()>
```

Conversion patterns are available to convert built-in function operations and
standard call operations targeting those functions using these conversion
rules.

#### Multi-dimensional Vector Types

LLVM IR only supports *one-dimensional* vectors, unlike MLIR where vectors can
be multi-dimensional. Vector types cannot be nested in either IR. In the
one-dimensional case, MLIR vectors are converted to LLVM IR vectors of the
same size with element type converted using these conversion rules. In the
n-dimensional case, MLIR vectors are converted to (n-1)-dimensional array
types of one-dimensional vectors.

Examples:

```mlir
vector<4x8 x f32>
// ->
!llvm.array<4 x vector<8 x f32>>

memref<2 x vector<4x8 x f32>>
// ->
!llvm.struct<(ptr<array<4 x vector<8xf32>>>, ptr<array<4 x vector<8xf32>>>,
              i64, array<1 x i64>, array<1 x i64>)>
```

#### Tensor Types

Tensor types cannot be converted to the LLVM dialect. Operations on tensors
must be [bufferized](Bufferization.md) before being converted.
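
For illustration, a hedged sketch of this step (the exact ops depend on the
bufferization pipeline in use):

```mlir
// Before: operates on tensor values; not convertible to the LLVM dialect.
%0 = tensor.extract %t[%i] : tensor<4xf32>

// After bufferization: operates on memrefs and is convertible as described
// above.
%m = bufferization.to_memref %t : memref<4xf32>
%1 = memref.load %m[%i] : memref<4xf32>
```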

### Calling Conventions

Calling conventions provide a mechanism to customize the conversion of
function and function call operations without changing how individual types
are handled elsewhere. They are implemented simultaneously by the default type
converter and by the conversion patterns for the relevant operations.

#### Function Result Packing

In case of multi-result functions, the returned values are inserted into a
structure-typed value before being returned and extracted from it at the call
site. This transformation is a part of the conversion and is transparent to
the definitions and uses of the values being returned.

Example:

```mlir
func @foo(%arg0: i32, %arg1: i64) -> (i32, i64) {
  return %arg0, %arg1 : i32, i64
}
func @bar() {
  %0 = arith.constant 42 : i32
  %1 = arith.constant 17 : i64
  %2:2 = call @foo(%0, %1) : (i32, i64) -> (i32, i64)
  "use_i32"(%2#0) : (i32) -> ()
  "use_i64"(%2#1) : (i64) -> ()
  return
}

// is transformed into

llvm.func @foo(%arg0: i32, %arg1: i64) -> !llvm.struct<(i32, i64)> {
  // insert the values into a structure
  %0 = llvm.mlir.undef : !llvm.struct<(i32, i64)>
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.struct<(i32, i64)>
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.struct<(i32, i64)>

  // return the structure value
  llvm.return %2 : !llvm.struct<(i32, i64)>
}
llvm.func @bar() {
  %0 = llvm.mlir.constant(42 : i32) : i32
  %1 = llvm.mlir.constant(17 : i64) : i64

  // call and extract the values from the structure
  %2 = llvm.call @foo(%0, %1)
     : (i32, i64) -> !llvm.struct<(i32, i64)>
  %3 = llvm.extractvalue %2[0] : !llvm.struct<(i32, i64)>
  %4 = llvm.extractvalue %2[1] : !llvm.struct<(i32, i64)>

  // use as before
  "use_i32"(%3) : (i32) -> ()
  "use_i64"(%4) : (i64) -> ()
  llvm.return
}
```

#### Default Calling Convention for Ranked MemRef

The default calling convention converts `memref`-typed function arguments to
LLVM dialect literal structs
[defined above](TargetLLVMIR.md#ranked-memref-types) before unbundling them
into individual scalar arguments.

This convention is implemented in the conversion of `builtin.func` and
`func.call` to the LLVM dialect, with the former unpacking the descriptor into
a set of individual values and the latter packing those values back into a
descriptor so as to make it transparently usable by other operations.
Conversions from other dialects should take this convention into account.

This specific convention is motivated by the necessity to specify alignment
and aliasing attributes on the raw pointers underpinning the memref.

Examples:

```mlir
func @foo(%arg0: memref<?xf32>) -> () {
  "use"(%arg0) : (memref<?xf32>) -> ()
  return
}

// Gets converted to the following
// (using type alias for brevity):
!llvm.memref_1d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                     array<1xi64>, array<1xi64>)>

llvm.func @foo(%arg0: !llvm.ptr<f32>,  // Allocated pointer.
               %arg1: !llvm.ptr<f32>,  // Aligned pointer.
               %arg2: i64,             // Offset.
               %arg3: i64,             // Size in dim 0.
               %arg4: i64) {           // Stride in dim 0.
  // Populate memref descriptor structure.
  %0 = llvm.mlir.undef : !llvm.memref_1d
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_1d
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_1d
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_1d
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_1d
  %5 = llvm.insertvalue %arg4, %4[4, 0] : !llvm.memref_1d

  // Descriptor is now usable as a single value.
  "use"(%5) : (!llvm.memref_1d) -> ()
  llvm.return
}
```

```mlir
func @bar() {
  %0 = "get"() : () -> (memref<?xf32>)
  call @foo(%0) : (memref<?xf32>) -> ()
  return
}

// Gets converted to the following
// (using type alias for brevity):
!llvm.memref_1d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                     array<1xi64>, array<1xi64>)>

llvm.func @bar() {
  %0 = "get"() : () -> !llvm.memref_1d

  // Unpack the memref descriptor.
  %1 = llvm.extractvalue %0[0] : !llvm.memref_1d
  %2 = llvm.extractvalue %0[1] : !llvm.memref_1d
  %3 = llvm.extractvalue %0[2] : !llvm.memref_1d
  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_1d
  %5 = llvm.extractvalue %0[4, 0] : !llvm.memref_1d

  // Pass individual values to the callee.
  llvm.call @foo(%1, %2, %3, %4, %5)
     : (!llvm.ptr<f32>, !llvm.ptr<f32>, i64, i64, i64) -> ()
  llvm.return
}
```

#### Default Calling Convention for Unranked MemRef

For unranked memrefs, the list of function arguments always contains two
elements, same as the unranked memref descriptor: an integer rank, and a
type-erased (`!llvm.ptr<i8>`) pointer to the ranked memref descriptor. Note
that while the *calling convention* does not require allocation, *casting* to
unranked memref does since one cannot take an address of an SSA value
containing the ranked memref, which must be stored in some memory instead. The
caller is in charge of ensuring the thread safety and management of the
allocated memory, in particular the deallocation.

Example:

```mlir
func @foo(%arg0: memref<*xf32>) -> () {
  "use"(%arg0) : (memref<*xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @foo(%arg0: i64,              // Rank.
               %arg1: !llvm.ptr<i8>) {  // Type-erased pointer to descriptor.
  // Pack the unranked memref descriptor.
  %0 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)>
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.struct<(i64, ptr<i8>)>
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.struct<(i64, ptr<i8>)>

  "use"(%2) : (!llvm.struct<(i64, ptr<i8>)>) -> ()
  llvm.return
}
```

```mlir
func @bar() {
  %0 = "get"() : () -> (memref<*xf32>)
  call @foo(%0) : (memref<*xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @bar() {
  %0 = "get"() : () -> (!llvm.struct<(i64, ptr<i8>)>)

  // Unpack the memref descriptor.
  %1 = llvm.extractvalue %0[0] : !llvm.struct<(i64, ptr<i8>)>
  %2 = llvm.extractvalue %0[1] : !llvm.struct<(i64, ptr<i8>)>

  // Pass individual values to the callee.
  llvm.call @foo(%1, %2) : (i64, !llvm.ptr<i8>) -> ()
  llvm.return
}
```

**Lifetime.** The second element of the unranked memref descriptor points to
some memory in which the ranked memref descriptor is stored. By convention,
this memory is allocated on the stack and has the lifetime of the function.
(*Note:* due to function-length lifetime, creation of multiple unranked memref
descriptors, e.g., in a loop, may lead to stack overflows.) If an unranked
descriptor has to be returned from a function, the ranked descriptor it points
to is copied into dynamically allocated memory, and the pointer in the
unranked descriptor is updated accordingly. The allocation happens immediately
before returning. It is the responsibility of the caller to free the
dynamically allocated memory. The default conversion of `func.call` and
`func.call_indirect` copies the ranked descriptor to newly allocated memory on
the caller's stack. Thus, the convention of the ranked memref descriptor
pointed to by an unranked memref descriptor being stored on the stack is
respected.
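
A hedged sketch of the code emitted before such a return, assuming a declared
`llvm.func @malloc(i64) -> !llvm.ptr<i8>` and with `%stack_desc` standing for
the type-erased pointer to the stack-allocated ranked descriptor (the size
constant is illustrative):

```mlir
// Copy the ranked descriptor to the heap immediately before returning.
%size = llvm.mlir.constant(48 : i64) : i64
%heap = llvm.call @malloc(%size) : (i64) -> !llvm.ptr<i8>
%false = llvm.mlir.constant(false) : i1
"llvm.intr.memcpy"(%heap, %stack_desc, %size, %false)
    : (!llvm.ptr<i8>, !llvm.ptr<i8>, i64, i1) -> ()
// The pointer field of the returned unranked descriptor is set to %heap; the
// caller is responsible for freeing it.
```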

#### Bare Pointer Calling Convention for Ranked MemRef

The "bare pointer" calling convention converts `memref`-typed function
arguments to a *single* pointer to the aligned data. Note that this does *not*
apply to uses of `memref` outside of function signatures; the default
descriptor structures are still used there. This convention further restricts
the supported cases to the following.

-   `memref` types with default layout.
-   `memref` types with all dimensions statically known.
-   `memref` values allocated in such a way that the allocated and aligned
    pointers match. Alternatively, the same function must handle allocation
    and deallocation since only one pointer is passed to any callee.

Examples:

```mlir
func @callee(memref<2x4xf32>)

func @caller(%0 : memref<2x4xf32>) {
  call @callee(%0) : (memref<2x4xf32>) -> ()
  return
}

// ->

!descriptor = !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                            array<2xi64>, array<2xi64>)>

llvm.func @callee(!llvm.ptr<f32>)

llvm.func @caller(%arg0: !llvm.ptr<f32>) {
  // A descriptor value is defined at the function entry point.
  %0 = llvm.mlir.undef : !descriptor

  // Both the allocated and aligned pointers are set up to the same value.
  %1 = llvm.insertvalue %arg0, %0[0] : !descriptor
  %2 = llvm.insertvalue %arg0, %1[1] : !descriptor

  // The offset is set up to zero.
  %3 = llvm.mlir.constant(0 : index) : i64
  %4 = llvm.insertvalue %3, %2[2] : !descriptor

  // The sizes and strides are derived from the statically known values; a
  // contiguous 2x4 memref has sizes [2, 4] and strides [4, 1].
  %5 = llvm.mlir.constant(2 : index) : i64
  %6 = llvm.mlir.constant(4 : index) : i64
  %7 = llvm.insertvalue %5, %4[3, 0] : !descriptor
  %8 = llvm.insertvalue %6, %7[3, 1] : !descriptor
  %9 = llvm.mlir.constant(1 : index) : i64
  %10 = llvm.insertvalue %6, %8[4, 0] : !descriptor
  %11 = llvm.insertvalue %9, %10[4, 1] : !descriptor

  // The function call corresponds to extracting the aligned data pointer.
  %12 = llvm.extractvalue %11[1] : !descriptor
  llvm.call @callee(%12) : (!llvm.ptr<f32>) -> ()
  llvm.return
}
```

#### Bare Pointer Calling Convention For Unranked MemRef

The "bare pointer" calling convention does not support unranked memrefs as
their shape cannot be known at compile time.

### C-compatible wrapper emission

In practical cases, it may be desirable to have externally-facing functions
with a single argument corresponding to a memref. When interfacing with LLVM
IR produced from C, the code needs to respect the corresponding calling
convention. The conversion to the LLVM dialect provides an option to generate
wrapper functions that take memref descriptors as pointers-to-struct
compatible with the data types produced by Clang when compiling C sources. The
generation of such wrapper functions can additionally be controlled at a
function granularity by setting the `llvm.emit_c_interface` unit attribute.
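
For example, attaching the attribute to a function requests the wrapper for
that function only:

```mlir
func @foo(%arg0: memref<?xf32>) attributes { llvm.emit_c_interface } {
  return
}
```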

More specifically, a memref argument is converted into a pointer-to-struct
argument of type `{T*, T*, i64, i64[N], i64[N]}*` in the wrapper function,
where `T` is the converted element type and `N` is the memref rank. This type
is compatible with that produced by Clang for the following C++ structure
template instantiations or their equivalents in C.

```cpp
template<typename T, size_t N>
struct MemRefDescriptor {
  T *allocated;
  T *aligned;
  intptr_t offset;
  intptr_t sizes[N];
  intptr_t strides[N];
};
```

Furthermore, we also rewrite function results to pointer parameters if the
rewritten function result has a struct type. The special result parameter is
added as the first parameter and is of pointer-to-struct type.

If enabled, the option does the following for *external* functions declared in
the MLIR module.

1.  Declare a new function `_mlir_ciface_<original name>` where memref
    arguments are converted to pointer-to-struct and the remaining arguments
    are converted as usual. Results are converted to a special argument if
    they are of struct type.
2.  Add a body to the original function (making it non-external) that
    1.  allocates memref descriptors,
    2.  populates them,
    3.  potentially allocates space for the result struct, and
    4.  passes the pointers to these into the newly declared interface
        function, then
    5.  collects the result of the call (potentially from the result struct),
        and
    6.  returns it to the caller.

For (non-external) functions defined in the MLIR module.

1.  Define a new function `_mlir_ciface_<original name>` where memref
    arguments are converted to pointer-to-struct and the remaining arguments
    are converted as usual. Results are converted to a special argument if
    they are of struct type.
2.  Populate the body of the newly defined function with IR that
    1.  loads descriptors from pointers;
    2.  unpacks the descriptors into individual non-aggregate values;
    3.  passes these values into the original function;
    4.  collects the results of the call, and
    5.  either copies the results into the result struct or returns them to
        the caller.

Examples:

```mlir
func @qux(%arg0: memref<?x?xf32>)

// Gets converted into the following
// (using type alias for brevity):
!llvm.memref_2d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                     array<2xi64>, array<2xi64>)>

// Function with unpacked arguments.
llvm.func @qux(%arg0: !llvm.ptr<f32>, %arg1: !llvm.ptr<f32>,
               %arg2: i64, %arg3: i64, %arg4: i64,
               %arg5: i64, %arg6: i64) {
  // Populate memref descriptor (as per calling convention).
  %0 = llvm.mlir.undef : !llvm.memref_2d
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_2d
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_2d
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_2d
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_2d
  %5 = llvm.insertvalue %arg5, %4[4, 0] : !llvm.memref_2d
  %6 = llvm.insertvalue %arg4, %5[3, 1] : !llvm.memref_2d
  %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm.memref_2d

  // Store the descriptor in a stack-allocated space.
  %8 = llvm.mlir.constant(1 : index) : i64
  %9 = llvm.alloca %8 x !llvm.memref_2d
     : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
                                  array<2xi64>, array<2xi64>)>>
  llvm.store %7, %9 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
                                        array<2xi64>, array<2xi64>)>>

  // Call the interface function.
  llvm.call @_mlir_ciface_qux(%9)
     : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
                          array<2xi64>, array<2xi64>)>>) -> ()

  // The stored descriptor will be freed on return.
  llvm.return
}

// Interface function.
llvm.func @_mlir_ciface_qux(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
                                              array<2xi64>, array<2xi64>)>>)
```

```mlir
func @foo(%arg0: memref<?x?xf32>) {
  return
}

// Gets converted into the following
// (using type alias for brevity):
!llvm.memref_2d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                     array<2xi64>, array<2xi64>)>
!llvm.memref_2d_ptr = type !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
                                             array<2xi64>, array<2xi64>)>>

// Function with unpacked arguments.
llvm.func @foo(%arg0: !llvm.ptr<f32>, %arg1: !llvm.ptr<f32>,
               %arg2: i64, %arg3: i64, %arg4: i64,
               %arg5: i64, %arg6: i64) {
  llvm.return
}

// Interface function callable from C.
llvm.func @_mlir_ciface_foo(%arg0: !llvm.memref_2d_ptr) {
  // Load the descriptor.
  %0 = llvm.load %arg0 : !llvm.memref_2d_ptr

  // Unpack the descriptor as per calling convention.
  %1 = llvm.extractvalue %0[0] : !llvm.memref_2d
  %2 = llvm.extractvalue %0[1] : !llvm.memref_2d
  %3 = llvm.extractvalue %0[2] : !llvm.memref_2d
  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_2d
  %5 = llvm.extractvalue %0[3, 1] : !llvm.memref_2d
  %6 = llvm.extractvalue %0[4, 0] : !llvm.memref_2d
  %7 = llvm.extractvalue %0[4, 1] : !llvm.memref_2d
  llvm.call @foo(%1, %2, %3, %4, %5, %6, %7)
     : (!llvm.ptr<f32>, !llvm.ptr<f32>, i64, i64, i64,
        i64, i64) -> ()
  llvm.return
}
```

```mlir
func @foo(%arg0: memref<?x?xf32>) -> memref<?x?xf32> {
  return %arg0 : memref<?x?xf32>
}

// Gets converted into the following
// (using type alias for brevity):
!llvm.memref_2d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
                                     array<2xi64>, array<2xi64>)>
!llvm.memref_2d_ptr = type !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
                                             array<2xi64>, array<2xi64>)>>

// Function with unpacked arguments.
llvm.func @foo(%arg0: !llvm.ptr<f32>, %arg1: !llvm.ptr<f32>, %arg2: i64,
               %arg3: i64, %arg4: i64, %arg5: i64, %arg6: i64)
    -> !llvm.memref_2d {
  %0 = llvm.mlir.undef : !llvm.memref_2d
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_2d
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_2d
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_2d
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_2d
  %5 = llvm.insertvalue %arg5, %4[4, 0] : !llvm.memref_2d
  %6 = llvm.insertvalue %arg4, %5[3, 1] : !llvm.memref_2d
  %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm.memref_2d
  llvm.return %7 : !llvm.memref_2d
}

// Interface function callable from C.
llvm.func @_mlir_ciface_foo(%arg0: !llvm.memref_2d_ptr,
                            %arg1: !llvm.memref_2d_ptr) {
  %0 = llvm.load %arg1 : !llvm.memref_2d_ptr
  %1 = llvm.extractvalue %0[0] : !llvm.memref_2d
  %2 = llvm.extractvalue %0[1] : !llvm.memref_2d
  %3 = llvm.extractvalue %0[2] : !llvm.memref_2d
  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_2d
  %5 = llvm.extractvalue %0[3, 1] : !llvm.memref_2d
  %6 = llvm.extractvalue %0[4, 0] : !llvm.memref_2d
  %7 = llvm.extractvalue %0[4, 1] : !llvm.memref_2d
  %8 = llvm.call @foo(%1, %2, %3, %4, %5, %6, %7)
     : (!llvm.ptr<f32>, !llvm.ptr<f32>, i64, i64, i64, i64, i64)
       -> !llvm.memref_2d
  llvm.store %8, %arg0 : !llvm.memref_2d_ptr
  llvm.return
}
```

Rationale: Introducing auxiliary functions for C-compatible interfaces is
preferred to modifying the calling convention since it will minimize the
effect of C compatibility on intra-module calls or calls between
MLIR-generated functions. In particular, when calling external functions from
an MLIR module in a (parallel) loop, storing a memref descriptor on the stack
can lead to stack exhaustion and/or concurrent access to the same address. The
auxiliary interface function serves as an allocation scope in this case.
Furthermore, when targeting accelerators with separate memory spaces such as
GPUs, stack-allocated descriptors passed by pointer would have to be
transferred to the device memory, which introduces significant overhead. In
such situations, auxiliary interface functions are executed on the host and
only pass the values through the device function invocation mechanism.

### Address Computation

Accesses to a memref element are transformed into an access to an element of
the buffer pointed to by the descriptor. The position of the element in the
buffer is calculated by linearizing memref indices in row-major order (the
lexically first index is the slowest varying, similar to C, but accounting for
strides). The computation of the linear address is emitted as arithmetic
operations in the LLVM IR dialect. Strides are extracted from the memref
descriptor.

Examples:

An access to a memref with indices:

```mlir
%0 = memref.load %m[%1, %2, %3, %4] : memref<?x?x4x8xf32, offset: ?>
```

is transformed into the equivalent of the following code (with `%m` standing
for the descriptor value corresponding to the memref after conversion):

```mlir
// Compute the linearized index from strides.
// When strides or, in absence of explicit strides, the corresponding sizes
// are dynamic, extract the stride value from the descriptor.
%stride1 = llvm.extractvalue %m[4, 0]
    : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<4xi64>, array<4xi64>)>
%addr1 = arith.muli %stride1, %1 : i64

// When the stride or, in absence of explicit strides, the trailing sizes are
// known statically, this value is used as a constant. The natural value of
// strides is the product of all sizes following the current dimension.
%stride2 = llvm.mlir.constant(32 : index) : i64
%addr2 = arith.muli %stride2, %2 : i64
%addr3 = arith.addi %addr1, %addr2 : i64

%stride3 = llvm.mlir.constant(8 : index) : i64
%addr4 = arith.muli %stride3, %3 : i64
%addr5 = arith.addi %addr3, %addr4 : i64

// Multiplication with the known unit stride can be omitted.
%addr6 = arith.addi %addr5, %4 : i64

// If the linear offset is known to be zero, it can also be omitted. If it is
// dynamic, it is extracted from the descriptor.
%offset = llvm.extractvalue %m[2]
    : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<4xi64>, array<4xi64>)>
%addr7 = arith.addi %addr6, %offset : i64

// All accesses are based on the aligned pointer.
%aligned = llvm.extractvalue %m[1]
    : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<4xi64>, array<4xi64>)>

// Get the address of the data pointer.
%ptr = llvm.getelementptr %aligned[%addr7]
    : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>

// Perform the actual load.
%0 = llvm.load %ptr : !llvm.ptr<f32>
```

For stores, the address computation code is identical and only the actual
store operation is different.

Note: the conversion does not perform any sort of common subexpression
elimination when emitting memref accesses.
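
For instance, two loads from the same location each emit their own copy of the
address arithmetic; a separate pass such as `-cse` has to be run if this is
undesirable:

```mlir
%0 = memref.load %m[%i] : memref<?xf32>
%1 = memref.load %m[%i] : memref<?xf32>
// -> two identical extractvalue/muli/addi/getelementptr chains after
// conversion, unless cleaned up by a later pass.
```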

### Utility Classes

Utility classes common to many conversions to the LLVM dialect can be found
under `lib/Conversion/LLVMCommon`. They include the following.

-   `LLVMConversionTarget` specifies all LLVM dialect operations as legal.
-   `LLVMTypeConverter` implements the default type conversion as described
    above.
-   `ConvertOpToLLVMPattern` extends the conversion pattern class with LLVM
    dialect-specific functionality.
-   `VectorConvertOpToLLVMPattern` extends the previous class to automatically
    unroll operations on higher-dimensional vectors into lists of operations
    on one-dimensional vectors.
-   `StructBuilder` provides a convenient API for building IR that creates or
    accesses values of LLVM dialect structure types; it is specialized by
    `MemRefDescriptor`, `UnrankedMemRefDescriptor` and `ComplexStructBuilder`
    for the built-in types convertible to LLVM dialect structure types.

## Translation to LLVM IR

MLIR modules containing `llvm.func`, `llvm.mlir.global` and `llvm.metadata`
operations can be translated to LLVM IR modules using the following scheme.

-   Module-level globals are translated to LLVM IR global values.
-   Module-level metadata are translated to LLVM IR metadata, which can be
    later augmented with additional metadata defined on specific ops.
-   All functions are declared in the module so that they can be referenced.
-   Each function is then translated separately and has access to the complete
    mappings between MLIR and LLVM IR globals, metadata, and functions.
-   Within a function, blocks are traversed in topological order and
    translated to LLVM IR basic blocks. In each basic block, PHI nodes are
    created for each of the block arguments, but not connected to their source
    blocks.
-   Within each block, operations are translated in their order. Each
    operation has access to the same mappings as the function and additionally
    to the mapping of values between MLIR and LLVM IR, including PHI nodes.
    Operations with regions are responsible for translating the regions they
    contain.
-   After operations in a function are translated, the PHI nodes of blocks in
    this function are connected to their source values, which are now
    available (a small example follows this list).
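
For instance, translating the following function requires a PHI node for the
block argument `%m` (the LLVM IR details described below are a sketch):

```mlir
llvm.func @max(%a: i64, %b: i64) -> i64 {
  %0 = llvm.icmp "sgt" %a, %b : i64
  llvm.cond_br %0, ^exit(%a : i64), ^other
^other:
  llvm.br ^exit(%b : i64)
^exit(%m: i64):
  llvm.return %m : i64
}
```

During translation, `^exit` becomes an LLVM IR basic block with a `phi i64`
node for `%m`; the PHI is created when the block is first visited, and its
incoming values (`%a` and `%b` from the respective predecessors) are connected
only after all operations in the function have been translated.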

The translation mechanism provides extension hooks for translating custom
operations to LLVM IR via a dialect interface
`LLVMTranslationDialectInterface`:

-   `convertOperation` translates an operation that belongs to the current
    dialect to LLVM IR given an `IRBuilderBase` and various mappings;
-   `amendOperation` performs additional actions on an operation if it
    contains a dialect attribute that belongs to the current dialect, for
    example sets up instruction-level metadata.

Dialects containing operations or attributes that want to be translated to
LLVM IR must provide an implementation of this interface and register it with
the system. Note that registration may happen without creating the dialect,
for example, in a separate library to avoid the need for the "main" dialect
library to depend on LLVM IR libraries. The implementations of these methods
may use the
[`ModuleTranslation`](https://mlir.llvm.org/doxygen/classmlir_1_1LLVM_1_1ModuleTranslation.html)
object provided to them which holds the state of the translation and contains
numerous utilities.

Note that this extension mechanism is *intentionally restrictive*. LLVM IR has
a small, relatively stable set of instructions and types that MLIR intends to
model fully. Therefore, the extension mechanism is provided only for LLVM IR
constructs that are more often extended -- intrinsics and metadata. The
primary goal of the extension mechanism is to support sets of intrinsics, for
example those representing a particular instruction set. The extension
mechanism does not allow for customizing type or block translation, nor does
it support custom module-level operations. Such transformations should be
performed within MLIR and target the corresponding MLIR constructs.

## Translation from LLVM IR

An experimental flow allows one to import a substantially limited subset of
LLVM IR into MLIR, producing LLVM dialect operations.

```
mlir-translate -import-llvm filename.ll
```
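
For instance, importing an LLVM IR function `define i64 @add(i64 %a, i64 %b)`
that returns the sum of its arguments yields roughly the following (a sketch;
the importer may attach additional attributes):

```mlir
llvm.func @add(%arg0: i64, %arg1: i64) -> i64 {
  %0 = llvm.add %arg0, %arg1 : i64
  llvm.return %0 : i64
}
```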