.. raw:: html

  <style type="text/css">
    .none { background-color: #FFCCCC }
    .part { background-color: #FFFF99 }
    .good { background-color: #CCFF99 }
  </style>

.. role:: none
.. role:: part
.. role:: good

.. contents::
   :local:

==================
OpenCL Support
==================

Clang has complete support for OpenCL C versions 1.0 through 3.0.
Support for OpenCL 3.0 is in an experimental phase (:ref:`OpenCL 3.0 <opencl_300>`).

Clang also supports :ref:`the C++ for OpenCL kernel language <cxx_for_opencl_impl>`.

There are also other :ref:`new and experimental features <opencl_experimenal>`
available.

Details about the usage of clang for OpenCL can be found in :doc:`UsersManual`.

Missing features or features with limited support
=================================================

- For general issues and bugs with OpenCL in clang refer to `the GitHub issue
  list
  <https://github.com/llvm/llvm-project/issues?q=is%3Aopen+is%3Aissue+label%3Aopencl>`__.

- Command-line flag :option:`-cl-ext` (used to override extensions/features
  supported by a target) is missing support for some functionality, i.e.
  functionality that is implemented fully through libraries (see
  :ref:`library-based features and extensions <opencl_ext_libs>`).

Internals Manual
================

This section acts as internal documentation for the design of OpenCL features
as well as some important implementation aspects. It is primarily targeted
at advanced users and toolchain developers integrating frontend
functionality as a component.

OpenCL Metadata
---------------

Clang uses metadata to provide additional OpenCL semantics in IR needed for
backends and the OpenCL runtime.

Each kernel will have function metadata attached to it, specifying the arguments.
Kernel argument metadata is used to provide source level information for querying
at runtime, for example using the `clGetKernelArgInfo
<https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf#167>`_
call.

Note that ``-cl-kernel-arg-info`` enables more information about the original
kernel code to be added, e.g. kernel parameter names will appear in the OpenCL
metadata along with other information.

The IDs used to encode OpenCL's logical address spaces in the argument info
metadata follow the SPIR address space mapping as defined in the SPIR
specification `section 2.2
<https://www.khronos.org/registry/spir/specs/spir_spec-2.0.pdf#18>`_.

OpenCL Specific Options
-----------------------

In addition to the options described in :doc:`UsersManual` there are the
following options specific to the OpenCL frontend.

All the options in this section are frontend-only and therefore, if used
with the regular clang driver, they require frontend forwarding, e.g. ``-cc1``
or ``-Xclang``.

.. _opencl_finclude_default_header:

.. option:: -finclude-default-header

  Adds most of the builtin types and function declarations during compilation. By
  default the OpenCL headers are not loaded by the frontend and therefore certain
  builtin types and most of the builtin functions are not declared. To load them
  automatically this flag can be passed to the frontend (see also :ref:`the
  section on the OpenCL Header <opencl_header>`):

  .. code-block:: console

    $ clang -Xclang -finclude-default-header test.cl

  Alternatively the internal header `opencl-c.h` containing the declarations
  can be included manually using ``-include`` followed by the path to the
  header, or ``-I`` followed by the directory that contains it. The header can
  be found in the clang source tree or installation directory.

  .. code-block:: console

    $ clang -I<path to clang sources>/lib/Headers test.cl
    $ clang -I<path to clang installation>/lib/clang/<llvm version>/include test.cl

  In this example it is assumed that the kernel code contains
  ``#include <opencl-c.h>`` just as a regular C include.

  Because the header is very large and takes a long time to parse, PCH
  (:doc:`PCHInternals`) and modules (:doc:`Modules`) can be used internally to
  improve the compilation speed.

  To enable modules for OpenCL:

  .. code-block:: console

    $ clang -target spir-unknown-unknown -c -emit-llvm -Xclang -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=<path to the generated module> test.cl

  Another way to circumvent long parsing latency for the OpenCL builtin
  declarations is to use the mechanism enabled by the
  :ref:`-fdeclare-opencl-builtins <opencl_fdeclare_opencl_builtins>` flag,
  which is available as an alternative feature.

.. _opencl_fdeclare_opencl_builtins:

.. option:: -fdeclare-opencl-builtins

  In addition to regular header includes with builtin types and functions using
  :ref:`-finclude-default-header <opencl_finclude_default_header>`, clang
  supports a fast mechanism to declare builtin functions with
  ``-fdeclare-opencl-builtins``. This does not declare the builtin types and
  therefore it has to be used in combination with ``-finclude-default-header``
  if full functionality is required.

  **Example of Use**:

  .. code-block:: console

    $ clang -Xclang -fdeclare-opencl-builtins test.cl

.. _opencl_fake_address_space_map:

.. option:: -ffake-address-space-map

  Overrides the target address space map with a fake map.
  This allows adding explicit address space IDs to the bitcode for non-segmented
  memory architectures that do not have separate IDs for each of the OpenCL
  logical address spaces by default. Passing ``-ffake-address-space-map`` will
  add/override the address spaces of the compilation target with the following
  values: ``1-global``, ``2-constant``, ``3-local``, ``4-generic``. The private
  address space is represented by the absence of an address space attribute in
  the IR (see also :ref:`the section on the address space attribute
  <opencl_addrsp>`).

  .. code-block:: console

    $ clang -cc1 -ffake-address-space-map test.cl

.. _opencl_builtins:

OpenCL builtins
---------------

**Clang builtins**

There are some standard OpenCL functions that are implemented as Clang builtins:

- All pipe functions from `section 6.13.16.2/6.13.16.3
  <https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#160>`_ of
  the OpenCL v2.0 kernel language specification.

- Address space qualifier conversion functions ``to_global``/``to_local``/``to_private``
  from `section 6.13.9
  <https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#101>`_.

- All the ``enqueue_kernel`` functions from `section 6.13.17.1
  <https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#164>`_ and
  enqueue query functions from `section 6.13.17.5
  <https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#171>`_.

**Fast builtin function declarations**

The implementation of the fast builtin function declarations (available via the
:ref:`-fdeclare-opencl-builtins option <opencl_fdeclare_opencl_builtins>`) consists
of the following main components:

- A TableGen definitions file ``OpenCLBuiltins.td``. This contains a compact
  representation of the supported builtin functions. When adding new builtin
  function declarations, this is normally the only file that needs modifying.

- A Clang TableGen emitter defined in ``ClangOpenCLBuiltinEmitter.cpp``. During
  Clang build time, the emitter reads the TableGen definition file and
  generates ``OpenCLBuiltins.inc``. This generated file contains various tables
  and functions that capture the builtin function data from the TableGen
  definitions in a compact manner.

- OpenCL specific code in ``SemaLookup.cpp``. When ``Sema::LookupBuiltin``
  encounters a potential builtin function, it will check if the name corresponds
  to a valid OpenCL builtin function. If so, all overloads of the function are
  inserted using ``InsertOCLBuiltinDeclarationsFromTable`` and overload
  resolution takes place.

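As a rough illustration of the table-driven lookup described above, the
following self-contained C++ sketch models the idea with a small hand-written
table. It is not Clang's implementation: ``ExampleOverload``,
``ExampleBuiltinTable`` and ``lookupExampleBuiltin`` are hypothetical names
introduced here, and the generated tables in ``OpenCLBuiltins.inc`` are
encoded differently.

.. code-block:: c++

  #include <string>
  #include <vector>

  // Hypothetical row type: one overload of one builtin function.
  struct ExampleOverload {
    const char *Name;
    const char *ReturnType;
    const char *ParamType;
  };

  // Hypothetical stand-in for the generated tables.
  static const ExampleOverload ExampleBuiltinTable[] = {
      {"acos", "float", "float"},
      {"acos", "float2", "float2"},
      {"get_global_id", "size_t", "uint"},
  };

  // Collects every overload whose name matches. An empty result means the
  // name is not a builtin in this table.
  std::vector<ExampleOverload> lookupExampleBuiltin(const std::string &Name) {
    std::vector<ExampleOverload> Overloads;
    for (const ExampleOverload &Entry : ExampleBuiltinTable)
      if (Name == Entry.Name)
        Overloads.push_back(Entry);
    return Overloads;
  }

Returning all overloads at once mirrors the step in which every declaration
for a name is inserted before overload resolution picks one of them.
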
OpenCL Extensions and Features
------------------------------

Clang implements various extensions to OpenCL kernel languages.

New functionality is accepted as soon as the documentation is detailed to the
level sufficient to be implemented. There should be evidence that the
extension was designed with implementation feasibility in mind and that its
complexity for C/C++ based compilers has been assessed. Alternatively, the
documentation can be accepted in the form of a draft that can be further
refined during the implementation.

Implementation guidelines
^^^^^^^^^^^^^^^^^^^^^^^^^

This section explains how to extend clang with new functionality.

**Parsing functionality**

If an extension modifies the standard parsing it needs to be added to
the clang frontend source code. This also means that the associated macro
indicating the presence of the extension should be added to clang.

The default flow for adding a new extension into the frontend is to
modify `OpenCLExtensions.def
<https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Basic/OpenCLExtensions.def>`__,
containing the list of all extensions and optional features supported by
the frontend.

This will add the macro automatically and also add a field in the target
options ``clang::TargetOptions::OpenCLFeaturesMap`` to control the exposure
of the new extension during the compilation.

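How a single list entry can yield both a macro name and a map field can be
sketched with the X-macro pattern. The C++ below is only a model under stated
assumptions: the list macro ``EXAMPLE_OPENCL_EXTENSION_LIST``, its
single-argument entry format and both helper functions are hypothetical, and
the entries in ``OpenCLExtensions.def`` have their own format.

.. code-block:: c++

  #include <map>
  #include <string>
  #include <vector>

  // Hypothetical extension list standing in for a .def file. Each entry
  // names one extension.
  #define EXAMPLE_OPENCL_EXTENSION_LIST(ENTRY) \
    ENTRY(cl_khr_fp64)                         \
    ENTRY(cl_khr_fp16)                         \
    ENTRY(cl_khr_3d_image_writes)

  // Expansion 1: the names of the macros a frontend could predefine.
  std::vector<std::string> examplePredefinedMacros() {
    std::vector<std::string> Names;
  #define EXAMPLE_ADD_NAME(Ext) Names.push_back(#Ext);
    EXAMPLE_OPENCL_EXTENSION_LIST(EXAMPLE_ADD_NAME)
  #undef EXAMPLE_ADD_NAME
    return Names;
  }

  // Expansion 2: a per-target map with one field per entry, all initially
  // disabled so that a target has to opt in explicitly.
  std::map<std::string, bool> exampleFeaturesMap() {
    std::map<std::string, bool> Map;
  #define EXAMPLE_ADD_ENTRY(Ext) Map[#Ext] = false;
    EXAMPLE_OPENCL_EXTENSION_LIST(EXAMPLE_ADD_ENTRY)
  #undef EXAMPLE_ADD_ENTRY
    return Map;
  }

Keeping the list in one place means that adding an entry updates every
expansion at once, which is the property the default flow relies on.
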
Note that by default targets like `SPIR-V`, `SPIR` or `X86` expose all the OpenCL
extensions. For all other targets the configuration has to be made explicitly.

Note that the extension support that clang determines for a target can be
overridden with the :option:`-cl-ext` command-line flag.

.. _opencl_ext_libs:

**Library functionality**

If an extension adds functionality that does not modify standard language
parsing it should not require modifying anything other than header files and
``OpenCLBuiltins.td`` detailed in :ref:`OpenCL builtins <opencl_builtins>`.
Most commonly such extensions add functionality via libraries (by adding
non-native types or functions) that are parsed regularly. As in other
languages, this is the most common way to add new functionality.

Clang has standard headers where new types and functions are being added;
for more details refer to
:ref:`the section on the OpenCL Header <opencl_header>`. The macros indicating
the presence of such extensions can be added in the standard header files
conditioned on target specific predefined macros and/or language version
predefined macros (see `feature/extension preprocessor macros defined in
opencl-c-base.h
<https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/opencl-c-base.h>`__).

**Pragmas**

Some extensions alter standard parsing dynamically via pragmas.

Clang provides a mechanism to add the standard extension pragma
``OPENCL EXTENSION`` by setting a dedicated flag in the extension list entry of
``OpenCLExtensions.def``. Note that there is no default behavior for the
standard extension pragmas as it is not specified (for the standards up to and
including version 3.0) in a sufficient level of detail and, therefore,
there is no default functionality provided by clang.

Pragmas without detailed information about their behavior (e.g. an explanation
of the changes they trigger in the parsing) should not be added to clang.
Moreover, the pragmas should provide useful functionality to the user. For
example, such functionality should address a practical use case and not be
redundant, i.e. it should not already be achievable using existing features.

Note that some legacy extensions (published prior to OpenCL 3.0) still
provide some non-conformant functionality for pragmas, e.g. add diagnostics on
the use of types or functions. This functionality is not guaranteed to remain in
future releases. However, any future changes should not affect backward
compatibility.

.. _opencl_addrsp:

Address spaces attribute
------------------------

Clang has arbitrary address space support using the ``address_space(N)``
attribute, where ``N`` is an integer number in the range specified in the
Clang source code. These address spaces can be used along with the OpenCL
address spaces; however, when such address spaces are converted to/from OpenCL
address spaces the behavior is not governed by the OpenCL specification.

An OpenCL implementation provides a list of standard address spaces using
keywords: ``private``, ``local``, ``global``, and ``generic``. In the AST and
in the IR each of the address spaces will be represented by a unique number
provided in the Clang source code. The specific IDs for an address space do not
have to match between the AST and the IR. Typically in the AST address space
numbers represent logical segments while in the IR they represent physical
segments.
Therefore, machines with flat memory segments can map all AST address space
numbers to the same physical segment ID or skip the address space attribute
completely while generating the IR. However, if the address space information
is needed by the IR passes, e.g. to improve alias analysis, it is recommended
to keep it and only lower to reflect physical memory segments in the late
machine passes. The mapping between logical and target address spaces is
specified in Clang's source code.

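The split between logical and target address space numbers can be pictured as
a small lookup table per target. The following C++ sketch is a model only:
the enumerators, ``ExampleAddrSpaceMap`` and ``toTargetAddrSpace`` are
hypothetical names rather than Clang's, and ``0`` is used here to stand for
"no address space attribute". The second map reuses the values listed for
``-ffake-address-space-map``.

.. code-block:: c++

  #include <array>

  // Hypothetical logical address spaces, numbered as they might be in an AST.
  enum ExampleLogicalAS {
    ExamplePrivate = 0,
    ExampleGlobal,
    ExampleConstant,
    ExampleLocal,
    ExampleGeneric,
    ExampleNumLogicalAS
  };

  using ExampleAddrSpaceMap = std::array<unsigned, ExampleNumLogicalAS>;

  // Flat-memory target: every logical space lowers to the same segment.
  static const ExampleAddrSpaceMap ExampleFlatMap = {{0, 0, 0, 0, 0}};

  // Map with distinct IDs: 1-global, 2-constant, 3-local, 4-generic.
  static const ExampleAddrSpaceMap ExampleFakeMap = {{0, 1, 2, 3, 4}};

  // Translates a logical address space into the number used in the IR.
  inline unsigned toTargetAddrSpace(const ExampleAddrSpaceMap &Map,
                                    ExampleLogicalAS AS) {
    return Map[AS];
  }

With the flat map the distinction between ``global`` and ``local`` is lost
after lowering, which is why keeping distinct IDs until late passes helps
analyses that depend on it.
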
.. _cxx_for_opencl_impl:

C++ for OpenCL Implementation Status
====================================

Clang implements language versions 1.0 and 2021 published in `the official
release of C++ for OpenCL Documentation
<https://github.com/KhronosGroup/OpenCL-Docs/releases/tag/cxxforopencl-docrev2021.12>`_.

Limited support of experimental C++ libraries is described in the :ref:`experimental features <opencl_experimenal>`.

GitHub issues for this functionality are typically prefixed
with '[C++4OpenCL]' - click `here
<https://github.com/llvm/llvm-project/issues?q=is%3Aissue+is%3Aopen+%5BC%2B%2B4OpenCL%5D>`__
to view the full bug list.


Missing features or features with limited support
-------------------------------------------------

- Support of C++ for OpenCL 2021 is currently in an experimental phase. Refer to
  :ref:`OpenCL 3.0 status <opencl_300>` for details of common missing
  functionality from OpenCL 3.0.

- IR generation for non-trivial global destructors is incomplete (See:
  `PR48047 <https://llvm.org/PR48047>`_).

- Support of `destructors with non-default address spaces
  <https://www.khronos.org/opencl/assets/CXX_for_OpenCL.html#_construction_initialization_and_destruction>`_
  is incomplete (See: `D109609 <https://reviews.llvm.org/D109609>`_).

.. _opencl_300:

OpenCL C 3.0 Usage
==================

The OpenCL C 3.0 language standard makes most OpenCL C 2.0 features optional. Optional
functionality in OpenCL C 3.0 is indicated by the presence of feature-test macros
(the list of feature-test macros is `here <https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#features>`__).
Command-line flag :option:`-cl-ext` can be used to override features supported by a target.

For cases when there is an associated extension for a specific feature (fp64 and 3d image writes)
the user should specify both (extension and feature) in the command-line flag:

.. code-block:: console

  $ clang -cl-std=CL3.0 -cl-ext=+cl_khr_fp64,+__opencl_c_fp64 ...
  $ clang -cl-std=CL3.0 -cl-ext=-cl_khr_fp64,-__opencl_c_fp64 ...

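To make the shape of the ``-cl-ext`` value in the commands above concrete,
the following C++ sketch turns such a comma-separated list into a map from
name to enabled state. It is a simplified model and not Clang's option
handling: ``parseExampleClExt`` is a hypothetical helper, and entries without
a leading ``+`` or ``-`` are simply skipped here.

.. code-block:: c++

  #include <map>
  #include <sstream>
  #include <string>

  // Parses a list such as "+cl_khr_fp64,-__opencl_c_fp64". A leading '+'
  // marks the name as enabled and a leading '-' marks it as disabled.
  std::map<std::string, bool> parseExampleClExt(const std::string &Value) {
    std::map<std::string, bool> Result;
    std::stringstream Stream(Value);
    std::string Item;
    while (std::getline(Stream, Item, ',')) {
      // Skip empty or unprefixed entries in this sketch.
      if (Item.size() < 2 || (Item[0] != '+' && Item[0] != '-'))
        continue;
      Result[Item.substr(1)] = (Item[0] == '+');
    }
    return Result;
  }

Listing both the extension and the feature, as the commands above do, gives
two entries in the resulting map with the same state.
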
OpenCL C 3.0 Implementation Status
----------------------------------

The following table provides an overview of features in OpenCL C 3.0 and their
implementation status.

.. list-table::
   :header-rows: 1

   * - Category
     - Feature
     - Status
     - Reviews
   * - Command line interface
     - New value for ``-cl-std`` flag
     - :good:`done`
     - https://reviews.llvm.org/D88300
   * - Predefined macros
     - New version macro
     - :good:`done`
     - https://reviews.llvm.org/D88300
   * - Predefined macros
     - Feature macros
     - :good:`done`
     - https://reviews.llvm.org/D95776
   * - Feature optionality
     - Generic address space
     - :good:`done`
     - https://reviews.llvm.org/D95778 and https://reviews.llvm.org/D103401
   * - Feature optionality
     - Builtin function overloads with generic address space
     - :good:`done`
     - https://reviews.llvm.org/D105526, https://reviews.llvm.org/D107769
   * - Feature optionality
     - Program scope variables in global memory
     - :good:`done`
     - https://reviews.llvm.org/D103191
   * - Feature optionality
     - 3D image writes including builtin functions
     - :good:`done`
     - https://reviews.llvm.org/D106260 (frontend)
   * - Feature optionality
     - read_write images including builtin functions
     - :good:`done`
     - https://reviews.llvm.org/D104915 (frontend) and https://reviews.llvm.org/D107539, https://reviews.llvm.org/D117899 (functions)
   * - Feature optionality
     - C11 atomics memory scopes, ordering and builtin function
     - :good:`done`
     - https://reviews.llvm.org/D106111, https://reviews.llvm.org/D119420
   * - Feature optionality
     - Blocks and Device-side kernel enqueue including builtin functions
     - :good:`done`
     - https://reviews.llvm.org/D115640, https://reviews.llvm.org/D118605
   * - Feature optionality
     - Pipes including builtin functions
     - :good:`done`
     - https://reviews.llvm.org/D107154 (frontend) and https://reviews.llvm.org/D105858 (functions)
   * - Feature optionality
     - Work group collective builtin functions
     - :good:`done`
     - https://reviews.llvm.org/D105858
   * - Feature optionality
     - Image types and builtin functions
     - :good:`done`
     - https://reviews.llvm.org/D103911 (frontend) and https://reviews.llvm.org/D107539 (functions)
   * - Feature optionality
     - Double precision floating point type
     - :good:`done`
     - https://reviews.llvm.org/D96524
   * - New functionality
     - RGBA vector components
     - :good:`done`
     - https://reviews.llvm.org/D99969
   * - New functionality
     - Subgroup functions
     - :good:`done`
     - https://reviews.llvm.org/D105858, https://reviews.llvm.org/D118999
   * - New functionality
     - Atomic mem scopes: subgroup, all devices including functions
     - :good:`done`
     - https://reviews.llvm.org/D103241

.. _opencl_experimenal:

Experimental features
=====================

Clang provides the following work-in-progress features so that developers can
experiment with them, give early feedback, or contribute further improvements.
Feel free to contact us on `the Discourse forums (Clang Frontend category)
<https://discourse.llvm.org/c/clang/6>`_ or file `a GitHub issue
<https://github.com/llvm/llvm-project/issues/new>`_.

.. _opencl_experimental_cxxlibs:

C++ libraries for OpenCL
------------------------

There is ongoing work to support C++ standard libraries from `LLVM's libcxx
<https://libcxx.llvm.org/>`_ in OpenCL kernel code using C++ for OpenCL mode.

It is currently possible to include ``type_traits`` from C++17 in kernel
sources when the clang extensions ``__cl_clang_function_pointers`` and
``__cl_clang_variadic_functions`` are enabled; see :doc:`LanguageExtensions`
for more details. The non-conformant features enabled by these extensions do
not expose non-conformant behavior beyond compilation, i.e. they are not
generated in IR or binary. The extensions only appear in metaprogramming
mechanisms that identify or verify the properties of types. This makes it
possible to provide the full C++ functionality without a loss of portability.
To avoid unsafe use of the extensions, it is recommended to disable them
directly after the header include.

**Example of Use**:

The following kernel code illustrates the use of ``type_traits``.

.. code-block:: c++

  // Enable the extensions only while the libcxx header is parsed.
  #pragma OPENCL EXTENSION __cl_clang_function_pointers : enable
  #pragma OPENCL EXTENSION __cl_clang_variadic_functions : enable
  #include <type_traits>
  #pragma OPENCL EXTENSION __cl_clang_function_pointers : disable
  #pragma OPENCL EXTENSION __cl_clang_variadic_functions : disable

  using sint_type = std::make_signed<unsigned int>::type;

  __kernel void foo() {
    // Evaluated at compile time only.
    static_assert(!std::is_same<sint_type, unsigned int>::value);
  }

A possible clang invocation to compile the example is:

.. code-block:: console

  $ clang -I<path to libcxx checkout or installation>/include test.clcpp

Note that ``type_traits`` is a header-only library, so no extra linking step
against the standard libraries is required. See the full example in
`Compiler Explorer <https://godbolt.org/z/5WbnTfb65>`_.
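
As a further illustration of type properties being used purely at compile
time, the sketch below selects a return type with ``std::make_unsigned`` and
validates a template argument with ``std::is_integral``. It is not taken from
the clang sources: ``abs_diff`` and the ``diff`` kernel are hypothetical names,
and the ``__OPENCL_CPP_VERSION__`` guards are an assumption made so that the
same file can also be checked with a host C++17 compiler. The kernel part has
not been verified against a particular clang release.

.. code-block:: c++

  // Sketch only: abs_diff and diff are hypothetical names.
  #if defined(__OPENCL_CPP_VERSION__)
  // In C++ for OpenCL mode, enable the extensions just for the header include.
  #pragma OPENCL EXTENSION __cl_clang_function_pointers : enable
  #pragma OPENCL EXTENSION __cl_clang_variadic_functions : enable
  #endif
  #include <type_traits>
  #if defined(__OPENCL_CPP_VERSION__)
  #pragma OPENCL EXTENSION __cl_clang_function_pointers : disable
  #pragma OPENCL EXTENSION __cl_clang_variadic_functions : disable
  #endif

  // Absolute difference of two integers, returned in the unsigned
  // counterpart of T so that the result cannot overflow.
  template <typename T>
  constexpr typename std::make_unsigned<T>::type abs_diff(T a, T b) {
    static_assert(std::is_integral<T>::value,
                  "abs_diff expects an integer type");
    using U = typename std::make_unsigned<T>::type;
    return a > b ? static_cast<U>(a) - static_cast<U>(b)
                 : static_cast<U>(b) - static_cast<U>(a);
  }

  // The traits are resolved entirely during compilation.
  static_assert(std::is_same<decltype(abs_diff(1, 2)), unsigned int>::value,
                "int maps to unsigned int");
  static_assert(abs_diff(3, 7) == 4u, "difference is order independent");
  static_assert(abs_diff(-5, 3) == 8u, "negative inputs are handled");

  #if defined(__OPENCL_CPP_VERSION__)
  // Hypothetical kernel using the helper; compiled only in C++ for OpenCL mode.
  __kernel void diff(__global const int *a, __global const int *b,
                     __global unsigned int *out) {
    size_t i = get_global_id(0);
    out[i] = abs_diff<int>(a[i], b[i]);
  }
  #endif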

More OpenCL-specific C++ library implementations built on top of libcxx
are available in the `libclcxx <https://github.com/KhronosGroup/libclcxx>`_
project.