# MLIR C API

**Current status: Under development, API unstable, built by default.**

## Design

Many languages can interoperate with C but have a harder time with C++ due to
name mangling and memory model differences. Although the C API for MLIR can be
used directly from C, it is primarily intended to be wrapped in higher-level
language- or library-specific constructs. Therefore the API tends towards
simplicity and feature minimalism.

**Note:** while the C API is expected to be more stable than the C++ API, it
currently offers no stability guarantees.

### Scope

The API is provided for core IR components (attributes, blocks, operations,
regions, types, values), passes, and some fundamental type and attribute kinds.
The core IR API is intentionally low-level: for example, it exposes plain lists
of an operation's operands and attributes without attempting to assign
"semantic" names to them. Users of specific dialects are expected to wrap the
core API in a dialect-specific way, for example, by implementing an ODS backend.

### Object Model

Core IR components are exposed as opaque _handles_ to an IR object existing in
C++. They are not intended to be inspected by the API users (and, in many cases,
cannot be meaningfully inspected). Instead, users are expected to pass handles
to the appropriate manipulation functions.

The handle _may or may not_ own the underlying object.

### Naming Convention and Ownership Model

All objects are prefixed with `Mlir`. They are typedefs and should be used
without `struct`.

All functions are prefixed with `mlir`.

Functions primarily operating on an instance of `MlirX` are prefixed with
`mlirX`. They take the instance being acted upon as their first argument (except
for creation functions). For example, `mlirOperationGetNumOperands` inspects an
`MlirOperation`, which it takes as its first argument.

The *ownership* model is encoded in the naming convention as follows.

- By default, the ownership is not transferred.
- Functions that transfer the ownership of the result to the caller can be in
  one of two forms:
    * functions that create a new object have the name `mlirXCreate<...>`, for
      example, `mlirOperationCreate`;
    * functions that detach an object from a parent object have the name
      `mlirYTake<...>`, for example `mlirOperationStateTakeRegion`.
- Functions that take ownership of some of their arguments have the form
  `mlirY<...>OwnedX<...>` where `X` can refer to the type or any other
  sufficiently unique description of the argument, the ownership of which will
  be taken by the callee, for example `mlirRegionAppendOwnedBlock`.
- Functions that create an object without transferring its ownership to the
  caller (one of the other objects passed in as an argument retains the
  ownership) have the form `mlirX<...>Get`, for example `mlirTypeParseGet`.
- Functions that destroy an object owned by the caller are of the form
  `mlirXDestroy`.

If the code owns an object, it is responsible for destroying the object when it
is no longer necessary. If an object that owns other objects is destroyed, any
handles to those objects become invalid. Note that types and attributes are
owned by the `MlirContext` in which they were created.

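The naming rules above can be illustrated with a small self-contained sketch.
Every `Demo`/`demo` name below is a hypothetical stand-in written for this
example and is not part of the MLIR C API; the struct layouts are likewise an
assumption. Only the way ownership follows the function names (`Create`,
`Owned`, `Destroy`) mirrors the convention.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical storage that would normally live on the C++ side. */
typedef struct DemoBlockStorage {
  struct DemoBlockStorage *next;
} DemoBlockStorage;

typedef struct DemoRegionStorage {
  DemoBlockStorage *first;
  int numBlocks;
} DemoRegionStorage;

/* Handles: small structs that callers pass around without inspecting. */
typedef struct { DemoBlockStorage *ptr; } DemoBlock;
typedef struct { DemoRegionStorage *ptr; } DemoRegion;

/* "Create" in the name: the caller owns the result.
   Allocation failure is not handled in this sketch. */
DemoRegion demoRegionCreate(void) {
  DemoRegion region = {calloc(1, sizeof(DemoRegionStorage))};
  return region;
}

DemoBlock demoBlockCreate(void) {
  DemoBlock block = {calloc(1, sizeof(DemoBlockStorage))};
  return block;
}

/* "OwnedBlock" in the name: the region takes ownership of the block. */
void demoRegionAppendOwnedBlock(DemoRegion region, DemoBlock block) {
  DemoBlockStorage **slot = &region.ptr->first;
  while (*slot)
    slot = &(*slot)->next;
  *slot = block.ptr;
  region.ptr->numBlocks++;
}

/* No ownership marker in the name: the returned handle is borrowed. */
DemoBlock demoRegionGetFirstBlock(DemoRegion region) {
  DemoBlock block = {region.ptr->first};
  return block;
}

/* "Destroy": frees the region and every block it owns. Returns the number
   of blocks freed, purely so the effect is observable in this example. */
int demoRegionDestroy(DemoRegion region) {
  int freed = 0;
  DemoBlockStorage *block = region.ptr->first;
  while (block) {
    DemoBlockStorage *next = block->next;
    free(block);
    block = next;
    freed++;
  }
  free(region.ptr);
  return freed;
}
```

In this sketch a caller destroys only the region: blocks handed over through
`demoRegionAppendOwnedBlock` are released with it, and any block handle
obtained earlier becomes invalid at that point.
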
### Nullity

A handle may refer to a _null_ object. It is the responsibility of the caller to
check if an object is null by using `mlirXIsNull(MlirX)`. API functions do _not_
expect null objects as arguments unless explicitly stated otherwise. API
functions _may_ return null objects.

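A caller-side null check might look like the following minimal sketch. The
`DemoType` handle, the toy parser that only recognizes `"i32"`, and the width
accessor are all hypothetical stand-ins for this example, not MLIR functions;
only the check-before-use pattern reflects the rule above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handle: a null handle wraps a NULL pointer. */
typedef struct { const void *ptr; } DemoType;

int demoTypeIsNull(DemoType type) { return type.ptr == NULL; }

/* Toy parser that only knows "i32" and returns a null handle otherwise,
   modeling an API function that may return a null object. */
DemoType demoTypeParseGet(const char *text) {
  static const int i32Width = 32;
  DemoType type = {strcmp(text, "i32") == 0 ? &i32Width : NULL};
  return type;
}

/* Like most API functions, this one does not expect a null argument. */
int demoTypeGetWidth(DemoType type) { return *(const int *)type.ptr; }

/* Caller-side pattern: test for null before passing the handle on. */
int demoWidthOrDefault(const char *text, int fallback) {
  DemoType type = demoTypeParseGet(text);
  if (demoTypeIsNull(type))
    return fallback;
  return demoTypeGetWidth(type);
}
```
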
### Type Hierarchies

MLIR objects can form type hierarchies in C++. For example, all IR classes
representing types are derived from `mlir::Type`, and some of them may also be
derived from common base classes such as `mlir::ShapedType` or dialect-specific
base classes. Type hierarchies are exposed to the C API through naming
conventions as follows.

- Only the top-level class of each hierarchy is exposed, e.g. `MlirType` is
  defined as a type but `MlirShapedType` is not. This avoids the need for
  explicit upcasting when passing an object of a derived type to a function
  that expects a base type (this happens more often in core/standard APIs,
  while downcasting usually involves further checks anyway).
- A type `Y` that derives from `X` provides a function `int mlirXIsAY(MlirX)`
  that returns a non-zero value if the given dynamic instance of `X` is also
  an instance of `Y`. For example, `int mlirTypeIsAInteger(MlirType)`.
- A function that expects a derived type as its first argument takes the base
  type instead and documents the expectation by using `Y` in its name
  `mlirY<...>(MlirX, ...)`. This function asserts that the dynamic instance of
  its first argument is `Y`, and it is the responsibility of the caller to
  ensure it is indeed the case.

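The two naming rules for derived types can be sketched as follows. The
`Demo`/`demo` names, the kind enum and the storage layout are assumptions made
for this example only; the real API keeps this information on the C++ side.

```c
#include <assert.h>

/* Hypothetical storage carrying a dynamic kind tag. */
typedef enum { DEMO_KIND_INTEGER, DEMO_KIND_FLOAT } DemoTypeKind;
typedef struct { DemoTypeKind kind; unsigned width; } DemoTypeStorage;

/* Only the top-level handle exists; there is no DemoIntegerType handle. */
typedef struct { const DemoTypeStorage *ptr; } DemoType;

/* Shape of `mlirXIsAY`: non-zero if the dynamic type is an integer. */
int demoTypeIsAInteger(DemoType type) {
  return type.ptr->kind == DEMO_KIND_INTEGER;
}

/* Shape of `mlirY<...>(MlirX, ...)`: takes the base handle and asserts
   that it really is an integer type. */
unsigned demoIntegerTypeGetWidth(DemoType type) {
  assert(demoTypeIsAInteger(type) && "expected an integer type");
  return type.ptr->width;
}

/* Caller-side pattern: check with IsA before calling the derived API. */
unsigned demoIntegerWidthOrZero(DemoType type) {
  return demoTypeIsAInteger(type) ? demoIntegerTypeGetWidth(type) : 0;
}
```
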
### Auxiliary Types

#### `StringRef`

Numerous MLIR functions return instances of `StringRef` to refer to a non-owning
segment of a string. This segment may or may not be null-terminated. In the C
API, these are represented as instances of the `MlirStringRef` structure, which
contains a pointer to the first character of the string fragment (`str`) and the
fragment length (`length`). Because the fragment is _not necessarily_
null-terminated, the `length` field must be used to identify the last character.
`MlirStringRef` is a non-owning pointer: the caller is in charge of performing
the copy or ensuring that the pointee outlives all uses of `MlirStringRef`.

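When a null-terminated string is needed, the fragment has to be copied using
its length. The sketch below uses a hypothetical `DemoStringRef` with the two
fields described above; the `size_t` type of `length` and the helper name are
assumptions of this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of the structure described above. */
typedef struct {
  const char *str; /* first character of the fragment */
  size_t length;   /* number of characters; no terminator implied */
} DemoStringRef;

/* Copies the fragment into a freshly allocated, null-terminated buffer.
   The caller owns the result and must free() it. Returns NULL if the
   allocation fails. */
char *demoStringRefToOwnedCString(DemoStringRef ref) {
  char *copy = malloc(ref.length + 1);
  if (!copy)
    return NULL;
  if (ref.length > 0)
    memcpy(copy, ref.str, ref.length);
  copy[ref.length] = '\0';
  return copy;
}
```

The copy stays valid after the original string is gone, which is the
alternative to keeping the pointee alive for as long as the reference is used.
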
### Printing

IR objects can be printed using `mlirXPrint(MlirX, MlirStringCallback, void *)`
functions. These functions take as arguments a callback with signature `void
(*)(const char *, intptr_t, void *)` and a pointer to user-defined data. They
call the callback and supply it with chunks of the string representation,
provided as a pointer to the first character and a length, and forward the
user-defined data unmodified. It is up to the caller to allocate memory if the
string representation must be stored and perform the copy. There is no guarantee
that the pointer supplied to the callback points to a null-terminated string, so
the size argument should be used to find the end of the string. The callback may
be called multiple times with consecutive chunks of the string representation
(the printing itself is buffered).

*Rationale*: this approach allows the caller to have full control of the
allocation and avoid unnecessary allocation and copying inside the printer.

For convenience, `mlirXDump(MlirX)` functions are provided to print the given
object to the standard error stream.

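A callback that collects the printed chunks into a caller-owned buffer could
look like the sketch below. The callback signature is the one given above;
`DemoBuffer`, `demoAppendChunk` and the `demoTypePrint` function that stands in
for an `mlirXPrint` call are hypothetical names invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Callback signature as described above. */
typedef void (*DemoStringCallback)(const char *, intptr_t, void *);

/* Caller-owned, fixed-size accumulation buffer (hypothetical). */
typedef struct {
  char data[64];
  size_t size;
} DemoBuffer;

/* Appends one chunk to the DemoBuffer passed as user data. The chunk is
   not assumed to be null-terminated; output is truncated when full. */
void demoAppendChunk(const char *chunk, intptr_t length, void *userData) {
  DemoBuffer *buffer = userData;
  if (length <= 0)
    return;
  size_t room = sizeof(buffer->data) - 1 - buffer->size;
  size_t count = (size_t)length < room ? (size_t)length : room;
  memcpy(buffer->data + buffer->size, chunk, count);
  buffer->size += count;
  buffer->data[buffer->size] = '\0';
}

/* Stand-in for a printing function: emits the text in two consecutive
   chunks, the first of which is not followed by a terminator. */
void demoTypePrint(DemoStringCallback callback, void *userData) {
  static const char text[] = "tensor<4xf32>";
  callback(text, 6, userData);     /* "tensor" */
  callback(text + 6, 7, userData); /* "<4xf32>" */
}
```

Calling `demoTypePrint(demoAppendChunk, &buffer)` on a zero-initialized
`DemoBuffer` leaves the full text in `buffer.data`, assembled from both chunks.
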
## Common Patterns

The API adopts the following patterns for recurrent functionality in MLIR.

### Indexed Components

An object has an _indexed component_ if it has fields accessible using a
zero-based contiguous integer index, typically arrays. For example, an
`MlirBlock` has its arguments as an indexed component. An object may have
several such components. For example, an `MlirOperation` has attributes,
operands, regions, results and successors.

For indexed components, the following pair of functions is provided.

- `intptr_t mlirXGetNum<Y>s(MlirX)` returns the number of subobjects, i.e. the
  exclusive upper bound on the index.
- `MlirY mlirXGet<Y>(MlirX, intptr_t pos)` returns the `pos`-th subobject.

The sizes are accepted and returned as signed pointer-sized integers, i.e.
`intptr_t`. This typedef is available in C99 through `<stdint.h>`.

Note that the name of the subobject in the function does not necessarily match
the type of the subobject. For example, `mlirOperationGetOperand` returns an
`MlirValue`.

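The pair of functions leads to a plain counted loop. The sketch below models it
with hypothetical `Demo` handles backed by an array; the function names only
imitate the `GetNum<Y>s`/`Get<Y>` shape and are not MLIR functions.

```c
#include <stdint.h>

/* Hypothetical stand-ins: an operation with an array of operands. */
typedef struct { int id; } DemoValue;
typedef struct {
  const DemoValue *operands;
  intptr_t numOperands;
} DemoOperation;

/* Shape of `mlirXGetNum<Y>s`. */
intptr_t demoOperationGetNumOperands(DemoOperation op) {
  return op.numOperands;
}

/* Shape of `mlirXGet<Y>`; `pos` must be in [0, numOperands). */
DemoValue demoOperationGetOperand(DemoOperation op, intptr_t pos) {
  return op.operands[pos];
}

/* Visits every operand by index and sums the ids. */
int demoSumOperandIds(DemoOperation op) {
  int sum = 0;
  intptr_t numOperands = demoOperationGetNumOperands(op);
  for (intptr_t i = 0; i < numOperands; ++i)
    sum += demoOperationGetOperand(op, i).id;
  return sum;
}
```
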
### Iterable Components

An object has an _iterable component_ if it has iterators accessing its fields
in some order other than integer indexing, typically linked lists. For example,
an `MlirBlock` has an iterable list of operations it contains. An object may
have several iterable components.

For iterable components, the following triple of functions is provided.

- `MlirY mlirXGetFirst<Y>(MlirX)` returns the first subobject in the list.
- `MlirY mlirYGetNextIn<X>(MlirY)` returns the next subobject in the list that
  contains the given object, or a null object if the given object is the last
  in this list.
- `int mlirYIsNull(MlirY)` returns 1 if the given object is null.

Note that the name of the subobject in the function may or may not match its
type.

This approach enables one to iterate as follows.

```c++
MlirY iter;
for (iter = mlirXGetFirst<Y>(x); !mlirYIsNull(iter);
     iter = mlirYGetNextIn<X>(iter)) {
  /* Use 'iter'. */
}
```

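As a concrete instance of the loop above, the following sketch walks a linked
list of operations in a block. The `Demo` types and functions are hypothetical
stand-ins with the same shape as the triple of functions; they are not the
MLIR API.

```c
#include <stddef.h>

/* Hypothetical storage: operations chained in a singly linked list. */
typedef struct DemoOperationStorage {
  int id;
  struct DemoOperationStorage *next;
} DemoOperationStorage;

typedef struct { DemoOperationStorage *ptr; } DemoOperation;
typedef struct { DemoOperationStorage *first; } DemoBlock;

/* Shape of `mlirXGetFirst<Y>`. */
DemoOperation demoBlockGetFirstOperation(DemoBlock block) {
  DemoOperation op = {block.first};
  return op;
}

/* Shape of `mlirYGetNextIn<X>`: null handle after the last operation. */
DemoOperation demoOperationGetNextInBlock(DemoOperation op) {
  DemoOperation next = {op.ptr->next};
  return next;
}

/* Shape of `mlirYIsNull`. */
int demoOperationIsNull(DemoOperation op) { return op.ptr == NULL; }

/* Counts the operations in a block with the iteration pattern. */
int demoBlockCountOperations(DemoBlock block) {
  int count = 0;
  DemoOperation iter;
  for (iter = demoBlockGetFirstOperation(block); !demoOperationIsNull(iter);
       iter = demoOperationGetNextInBlock(iter)) {
    ++count;
  }
  return count;
}
```

An empty block yields a null first handle, so the loop body never runs and no
separate emptiness check is needed.
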
## Extending the API

### Extensions for Dialect Attributes and Types

Dialect attributes and types can follow the example of builtin attributes and
types, provided that implementations live in separate directories, i.e.
`include/mlir-c/<...>Dialect/` and `lib/CAPI/<...>Dialect/`. The core APIs
provide implementation-private headers in `include/mlir/CAPI/IR` that allow one
to convert between opaque C structures for core IR components and their C++
counterparts. `wrap` converts a C++ class into a C structure and `unwrap` does
the inverse conversion. Once the C++ object is available, the API implementation
should rely on `isa` to implement `mlirXIsAY` and is expected to use `cast`
inside other API calls.

### Extensions for Interfaces

Interfaces can follow the example of IR interfaces and should be placed in the
appropriate library (e.g., common interfaces in `mlir-c/Interfaces` and
dialect-specific interfaces in their dialect library). Similarly to other type
hierarchies, interfaces are not expected to have objects of their own type and
instead operate on top-level objects: `MlirAttribute`, `MlirOperation` and
`MlirType`. Static interface methods are expected to take as leading argument a
canonical identifier of the class, `MlirStringRef` with the name for operations
and `MlirTypeID` for attributes and types, followed by the `MlirContext` in
which the interfaces are registered.

Individual interfaces are expected to provide a `mlir<InterfaceName>TypeID()`
function that can be used to check whether an object or a class implements this
interface using `mlir<Attribute/Operation/Type>ImplementsInterface` or
`mlir<Attribute/Operation/Type>ImplementsInterfaceStatic` functions,
respectively. *Rationale*: C++ `isa` only works when an object exists and static
methods are usually dispatched using templates, whereas lookup by `TypeID` in
`MLIRContext` works even without an object.