| 1 | .. raw:: html |
| 2 | |
| 3 | <style type="text/css"> |
| 4 | .none { background-color: #FFCCCC } |
| 5 | .part { background-color: #FFFF99 } |
| 6 | .good { background-color: #CCFF99 } |
| 7 | </style> |
| 8 | |
| 9 | .. role:: none |
| 10 | .. role:: part |
| 11 | .. role:: good |
| 12 | |
| 13 | .. contents:: |
| 14 | :local: |
| 15 | |
| 16 | ================== |
| 17 | OpenCL Support |
| 18 | ================== |
| 19 | |
| 20 | Clang has complete support of OpenCL C versions from 1.0 to 2.0. |
| 21 | |
| 22 | Clang also supports :ref:`the C++ for OpenCL kernel language <cxx_for_opencl_impl>`. |
| 23 | |
| 24 | There is ongoing work to support :ref:`OpenCL 3.0 <opencl_300>`. |
| 25 | |
| 26 | There are also other :ref:`new and experimental features <opencl_experimenal>` available. |
| 27 | |
| 28 | For general issues and bugs with OpenCL in Clang, refer to `Bugzilla |
| 29 | <https://bugs.llvm.org/buglist.cgi?component=OpenCL&list_id=172679&product=clang&resolution=--->`__. |
| 30 | |
| 31 | Internals Manual |
| 32 | ================ |
| 33 | |
| 34 | This section documents the design of OpenCL features as well as some |
| 35 | important implementation aspects. It is primarily targeted at advanced |
| 36 | users and toolchain developers who integrate frontend functionality as a |
| 37 | component. |
| 38 | |
| 39 | OpenCL Metadata |
| 40 | --------------- |
| 41 | |
| 42 | Clang uses metadata to provide additional OpenCL semantics in the IR that are |
| 43 | needed by backends and the OpenCL runtime. |
| 44 | |
| 45 | Each kernel has function metadata attached to it that specifies its arguments. |
| 46 | Kernel argument metadata provides source-level information for querying |
| 47 | at runtime, for example using the `clGetKernelArgInfo |
| 48 | <https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf#167>`_ |
| 49 | call. |
| 50 | |
| 51 | Note that ``-cl-kernel-arg-info`` adds more information about the original |
| 52 | kernel code, e.g. kernel parameter names will appear in the OpenCL |
| 53 | metadata along with other information. |
| 54 | |
| 55 | The IDs used to encode OpenCL's logical address spaces in the argument info |
| 56 | metadata follow the SPIR address space mapping as defined in the SPIR |
| 57 | specification `section 2.2 |
| 58 | <https://www.khronos.org/registry/spir/specs/spir_spec-2.0.pdf#18>`_. |
| 59 | |
| 60 | OpenCL Specific Options |
| 61 | ----------------------- |
| 62 | |
| 63 | In addition to the options described in :doc:`UsersManual`, the following |
| 64 | options are specific to the OpenCL frontend. |
| 65 | |
| 66 | All the options in this section are frontend-only; when used with the |
| 67 | regular clang driver they must be forwarded to the frontend, e.g. with ``-cc1`` |
| 68 | or ``-Xclang``. |
| 69 | |
| 70 | .. _opencl_cl_ext: |
| 71 | |
| 72 | .. option:: -cl-ext |
| 73 | |
| 74 | Enables or disables support of OpenCL extensions. All OpenCL targets provide a list |
| 75 | of extensions that they support. Clang allows this list to be amended using the ``-cl-ext`` |
| 76 | flag with a comma-separated list of extensions prefixed with ``'+'`` or ``'-'``. |
| 77 | The syntax is ``-cl-ext=<(['-'|'+']<extension>[,])+>``, where each extension |
| 78 | can be either one of `the OpenCL published extensions |
| 79 | <https://www.khronos.org/registry/OpenCL>`_ |
| 80 | or any vendor extension. Alternatively, ``'all'`` can be used to enable |
| 81 | or disable all known extensions. |
| 82 | |
| 83 | Example disabling double support for the 64-bit SPIR target: |
| 84 | |
| 85 | .. code-block:: console |
| 86 | |
| 87 | $ clang -cc1 -triple spir64-unknown-unknown -cl-ext=-cl_khr_fp64 test.cl |
| 88 | |
| 89 | Disabling all extensions except half-precision support (``cl_khr_fp16``) for the R600 AMD GPU target can be done using: |
| 90 | |
| 91 | .. code-block:: console |
| 92 | |
| 93 | $ clang -cc1 -triple r600-unknown-unknown -cl-ext=-all,+cl_khr_fp16 test.cl |
| 94 | |
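The way a ``-cl-ext`` list amends a target's extension list can be sketched in a few lines of C++. This is only an illustration of the syntax described above, not Clang's implementation: ``ExtensionMap`` and ``applyClExt`` are hypothetical names, and treating an entry without a prefix as ``'+'`` is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical model of a target's extension list: name -> supported?
using ExtensionMap = std::map<std::string, bool>;

// Applies a -cl-ext style list such as "-all,+cl_khr_fp16" to the map.
// Entries are processed left to right, so later entries win.
inline void applyClExt(ExtensionMap &exts, const std::string &list) {
  std::istringstream in(list);
  std::string entry;
  while (std::getline(in, entry, ',')) {
    bool enable = true; // assumption: no prefix behaves like '+'
    if (!entry.empty() && (entry[0] == '+' || entry[0] == '-')) {
      enable = entry[0] == '+';
      entry.erase(0, 1);
    }
    if (entry.empty())
      continue; // skip empty entries, e.g. from a trailing comma
    if (entry == "all") {
      // 'all' switches every extension already known to the map.
      for (auto &kv : exts)
        kv.second = enable;
    } else {
      exts[entry] = enable;
    }
  }
}
```

With a map that starts with ``cl_khr_fp16`` and ``cl_khr_fp64`` enabled, ``applyClExt(exts, "-all,+cl_khr_fp16")`` leaves only ``cl_khr_fp16`` enabled, which mirrors the R600 example above.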
| 95 | .. _opencl_finclude_default_header: |
| 96 | |
| 97 | .. option:: -finclude-default-header |
| 98 | |
| 99 | Adds most of the builtin types and function declarations during compilation. By |
| 100 | default the OpenCL headers are not loaded by the frontend, and therefore certain |
| 101 | builtin types and most builtin functions are not declared. To load them |
| 102 | automatically, this flag can be passed to the frontend (see also :ref:`the |
| 103 | section on the OpenCL Header <opencl_header>`): |
| 104 | |
| 105 | .. code-block:: console |
| 106 | |
| 107 | $ clang -Xclang -finclude-default-header test.cl |
| 108 | |
| 109 | Alternatively, the internal header ``opencl-c.h`` containing the declarations |
| 110 | can be included manually using ``-include`` followed by the path to the header, |
| 111 | or found through ``-I`` followed by the path to the directory that contains it. |
| 112 | The header can be found in the clang source tree or installation directory. |
| 113 | |
| 114 | .. code-block:: console |
| 115 | |
| 116 | $ clang -I<path to clang sources>/lib/Headers test.cl |
| 117 | $ clang -I<path to clang installation>/lib/clang/<llvm version>/include test.cl |
| 118 | |
| 119 | In this example it is assumed that the kernel code contains |
| 120 | ``#include <opencl-c.h>`` just as a regular C include. |
| 121 | |
| 122 | Because the header is very large and takes a long time to parse, PCH (:doc:`PCHInternals`) |
| 123 | and modules (:doc:`Modules`) can be used internally to improve the compilation |
| 124 | speed. |
| 125 | |
| 126 | To enable modules for OpenCL: |
| 127 | |
| 128 | .. code-block:: console |
| 129 | |
| 130 | $ clang -target spir-unknown-unknown -c -emit-llvm -Xclang -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=<path to the generated module> test.cl |
| 131 | |
| 132 | Another way to circumvent long parsing latency for the OpenCL builtin |
| 133 | declarations is to use the mechanism enabled by the :ref:`-fdeclare-opencl-builtins |
| 134 | <opencl_fdeclare_opencl_builtins>` flag, which is available as an alternative |
| 135 | feature. |
| 136 | |
| 137 | .. _opencl_fdeclare_opencl_builtins: |
| 138 | |
| 139 | .. option:: -fdeclare-opencl-builtins |
| 140 | |
| 141 | In addition to regular header includes with builtin types and functions using |
| 142 | :ref:`-finclude-default-header <opencl_finclude_default_header>`, clang |
| 143 | supports a fast mechanism to declare builtin functions with |
| 144 | ``-fdeclare-opencl-builtins``. This does not declare the builtin types and |
| 145 | therefore it has to be used in combination with ``-finclude-default-header`` |
| 146 | if full functionality is required. |
| 147 | |
| 148 | **Example of Use**: |
| 149 | |
| 150 | .. code-block:: console |
| 151 | |
| 152 | $ clang -Xclang -fdeclare-opencl-builtins test.cl |
| 153 | |
| 154 | .. _opencl_fake_address_space_map: |
| 155 | |
| 156 | .. option:: -ffake-address-space-map |
| 157 | |
| 158 | Overrides the target address space map with a fake map. |
| 159 | This allows adding explicit address space IDs to the bitcode for non-segmented |
| 160 | memory architectures that do not have separate IDs for each of the OpenCL |
| 161 | logical address spaces by default. Passing ``-ffake-address-space-map`` will |
| 162 | add or override the address spaces of the target being compiled for with the following values: |
| 163 | ``1-global``, ``2-constant``, ``3-local``, ``4-generic``. The private address |
| 164 | space is represented by the absence of an address space attribute in the IR (see |
| 165 | also :ref:`the section on the address space attribute <opencl_addrsp>`). |
| 166 | |
| 167 | .. code-block:: console |
| 168 | |
| 169 | $ clang -cc1 -ffake-address-space-map test.cl |
| 170 | |
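The effect of the flag can be pictured as swapping one lookup table for another. The following C++ sketch is only a model under stated assumptions: ``LogicalAS``, ``AddrSpaceMap`` and both functions are hypothetical names, and the private address space is written as ``0`` to stand for "no address space attribute".

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Logical OpenCL address spaces, used as indices into a map.
enum LogicalAS : std::size_t {
  Private, Global, Constant, Local, Generic, NumLogicalAS
};

// Maps each logical address space to the ID emitted in the IR.
using AddrSpaceMap = std::array<unsigned, NumLogicalAS>;

// The values documented for -ffake-address-space-map:
// 1-global, 2-constant, 3-local, 4-generic (private modelled as 0).
inline AddrSpaceMap fakeAddrSpaceMap() { return {{0, 1, 2, 3, 4}}; }

// Picks the map used for IR generation: the target's own map, or the
// fake map when overriding was requested.
inline AddrSpaceMap effectiveAddrSpaceMap(const AddrSpaceMap &targetMap,
                                          bool useFakeMap) {
  return useFakeMap ? fakeAddrSpaceMap() : targetMap;
}
```

For a hypothetical flat target whose own map is all zeros, ``effectiveAddrSpaceMap(flat, true)[Global]`` yields ``1`` while ``effectiveAddrSpaceMap(flat, false)[Global]`` stays ``0``, which is the distinction the flag makes visible in the bitcode.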
| 171 | .. _opencl_builtins: |
| 172 | |
| 173 | OpenCL builtins |
| 174 | --------------- |
| 175 | |
| 176 | **Clang builtins** |
| 177 | |
| 178 | There are some standard OpenCL functions that are implemented as Clang builtins: |
| 179 | |
| 180 | - All pipe functions from `section 6.13.16.2/6.13.16.3 |
| 181 | <https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#160>`_ of |
| 182 | the OpenCL v2.0 kernel language specification. |
| 183 | |
| 184 | - Address space qualifier conversion functions ``to_global``/``to_local``/``to_private`` |
| 185 | from `section 6.13.9 |
| 186 | <https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#101>`_. |
| 187 | |
| 188 | - All the ``enqueue_kernel`` functions from `section 6.13.17.1 |
| 189 | <https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#164>`_ and |
| 190 | enqueue query functions from `section 6.13.17.5 |
| 191 | <https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#171>`_. |
| 192 | |
| 193 | **Fast builtin function declarations** |
| 194 | |
| 195 | The implementation of the fast builtin function declarations (available via the |
| 196 | :ref:`-fdeclare-opencl-builtins option <opencl_fdeclare_opencl_builtins>`) consists |
| 197 | of the following main components: |
| 198 | |
| 199 | - A TableGen definitions file ``OpenCLBuiltins.td``. This contains a compact |
| 200 | representation of the supported builtin functions. When adding new builtin |
| 201 | function declarations, this is normally the only file that needs modifying. |
| 202 | |
| 203 | - A Clang TableGen emitter defined in ``ClangOpenCLBuiltinEmitter.cpp``. During |
| 204 | Clang build time, the emitter reads the TableGen definition file and |
| 205 | generates ``OpenCLBuiltins.inc``. This generated file contains various tables |
| 206 | and functions that capture the builtin function data from the TableGen |
| 207 | definitions in a compact manner. |
| 208 | |
| 209 | - OpenCL specific code in ``SemaLookup.cpp``. When ``Sema::LookupBuiltin`` |
| 210 | encounters a potential builtin function, it will check if the name corresponds |
| 211 | to a valid OpenCL builtin function. If so, all overloads of the function are |
| 212 | inserted using ``InsertOCLBuiltinDeclarationsFromTable`` and overload |
| 213 | resolution takes place. |
| 214 | |
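The general idea of a compact builtin table that is consulted on name lookup can be sketched as follows. This is a simplified illustration, not the layout of the generated ``OpenCLBuiltins.inc``: ``BuiltinEntry``, ``kSignatures``, ``kBuiltins`` and ``lookupOverloads`` are hypothetical names, and signatures are plain strings here purely for readability.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>
#include <string>
#include <vector>

// One row per builtin name; overloads are a slice of a shared table.
struct BuiltinEntry {
  const char *name;
  unsigned firstSig; // index of the first overload in kSignatures
  unsigned numSigs;  // number of consecutive overloads
};

// Shared signature table: several builtins can reuse the same slice.
static const char *const kSignatures[] = {
    "float(float)", "double(double)",
    "float(float, float)", "double(double, double)"};

// Must stay sorted by name so that binary search works.
static const BuiltinEntry kBuiltins[] = {
    {"cos", 0, 2}, {"fmax", 2, 2}, {"sin", 0, 2}};

// Returns every overload of `name`, or an empty list if it is not a builtin.
inline std::vector<std::string> lookupOverloads(const std::string &name) {
  const BuiltinEntry *end = std::end(kBuiltins);
  const BuiltinEntry *it = std::lower_bound(
      std::begin(kBuiltins), end, name,
      [](const BuiltinEntry &e, const std::string &n) {
        return std::strcmp(e.name, n.c_str()) < 0;
      });
  std::vector<std::string> overloads;
  if (it != end && name == it->name)
    for (unsigned i = 0; i < it->numSigs; ++i)
      overloads.push_back(kSignatures[it->firstSig + i]);
  return overloads;
}
```

In this model ``sin`` and ``cos`` share one slice of the signature table, which hints at why a table-driven representation can stay much smaller than a header that spells out every declaration.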
| 215 | OpenCL Extensions and Features |
| 216 | ------------------------------ |
| 217 | |
| 218 | Clang implements various extensions to OpenCL kernel languages. |
| 219 | |
| 220 | New functionality is accepted as soon as its documentation is detailed enough |
| 221 | to be implemented. There should be evidence that the extension is designed |
| 222 | with implementation feasibility in mind and that its complexity for C/C++ |
| 223 | based compilers has been assessed. Alternatively, the |
| 224 | documentation can be accepted in the form of a draft that can be further |
| 225 | refined during the implementation. |
| 226 | |
| 227 | Implementation guidelines |
| 228 | ^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 229 | |
| 230 | This section explains how to extend clang with new functionality. |
| 231 | |
| 232 | **Parsing functionality** |
| 233 | |
| 234 | If an extension modifies the standard parsing, it needs to be added to |
| 235 | the clang frontend source code. This also means that the associated macro |
| 236 | indicating the presence of the extension should be added to clang. |
| 237 | |
| 238 | The default flow for adding a new extension into the frontend is to |
| 239 | modify `OpenCLExtensions.def |
| 240 | <https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Basic/OpenCLExtensions.def>`_. |
| 241 | |
| 242 | This will add the macro automatically and also add a field in the target |
| 243 | options ``clang::TargetOptions::OpenCLFeaturesMap`` to control the exposure |
| 244 | of the new extension during the compilation. |
| 245 | |
| 246 | Note that by default targets like ``SPIR`` or ``X86`` expose all the OpenCL |
| 247 | extensions. For all other targets the configuration has to be made explicitly. |
| 248 | |
| 249 | Note that the target extension support set up by clang can be overridden |
| 250 | with the :ref:`-cl-ext <opencl_cl_ext>` command-line flag. |
| 251 | |
| 252 | **Library functionality** |
| 253 | |
| 254 | If an extension adds functionality that does not modify standard language |
| 255 | parsing, it should not require modifying anything other than header files and |
| 256 | ``OpenCLBuiltins.td`` detailed in :ref:`OpenCL builtins <opencl_builtins>`. |
| 257 | Most commonly such extensions add functionality via libraries (by adding |
| 258 | non-native types or functions) that are parsed regularly. As in other languages, this |
| 259 | is the most common way to add new functionality. |
| 260 | |
| 261 | Clang has standard headers where new types and functions are being added; |
| 262 | for more details refer to |
| 263 | :ref:`the section on the OpenCL Header <opencl_header>`. The macros indicating |
| 264 | the presence of such extensions can be added in the standard header files |
| 265 | conditioned on target specific predefined macros and/or language version |
| 266 | predefined macros. |
| 267 | |
| 268 | **Pragmas** |
| 269 | |
| 270 | Some extensions alter standard parsing dynamically via pragmas. |
| 271 | |
| 272 | Clang provides a mechanism to add the standard extension pragma |
| 273 | ``OPENCL EXTENSION`` by setting a dedicated flag in the extension list entry of |
| 274 | ``OpenCLExtensions.def``. Note that there is no default behavior for the |
| 275 | standard extension pragmas as it is not specified (for the standards up to and |
| 276 | including version 3.0) in a sufficient level of detail and, therefore, |
| 277 | there is no default functionality provided by clang. |
| 278 | |
| 279 | Pragmas without a detailed description of their behavior (e.g. an explanation of |
| 280 | the changes they trigger in parsing) should not be added to clang. Moreover, |
| 281 | pragmas should provide useful functionality to the user. For example, such |
| 282 | functionality should address a practical use case and not be redundant, i.e. |
| 283 | it should not be achievable using existing features. |
| 284 | |
| 285 | Note that some legacy extensions (published prior to OpenCL 3.0) still |
| 286 | provide some non-conformant functionality for pragmas, e.g. they add diagnostics on |
| 287 | the use of types or functions. This functionality is not guaranteed to remain in |
| 288 | future releases. However, any future changes should not affect backward |
| 289 | compatibility. |
| 290 | |
| 291 | .. _opencl_addrsp: |
| 292 | |
| 293 | Address spaces attribute |
| 294 | ------------------------ |
| 295 | |
| 296 | Clang has arbitrary address space support using the ``address_space(N)`` |
| 297 | attribute, where ``N`` is an integer number in the range specified in the |
| 298 | Clang source code. These address spaces can be used along with the OpenCL |
| 299 | address spaces; however, when such address spaces are converted to/from OpenCL |
| 300 | address spaces, the behavior is not governed by the OpenCL specification. |
| 301 | |
| 302 | An OpenCL implementation provides a list of standard address spaces using |
| 303 | keywords: ``private``, ``local``, ``global``, and ``generic``. In the AST and |
| 304 | in the IR each of the address spaces will be represented by a unique number |
| 305 | provided in the Clang source code. The specific IDs for an address space do not |
| 306 | have to match between the AST and the IR. Typically in the AST address space |
| 307 | numbers represent logical segments while in the IR they represent physical |
| 308 | segments. |
| 309 | Therefore, machines with flat memory segments can map all AST address space |
| 310 | numbers to the same physical segment ID or skip the address space attribute |
| 311 | completely while generating the IR. However, if the address space information |
| 312 | is needed by the IR passes, e.g. to improve alias analysis, it is recommended |
| 313 | to keep it and only lower it to reflect physical memory segments in the late |
| 314 | machine passes. The mapping between logical and target address spaces is |
| 315 | specified in Clang's source code. |
| 316 | |
| 317 | .. _cxx_for_opencl_impl: |
| 318 | |
| 319 | C++ for OpenCL Implementation Status |
| 320 | ==================================== |
| 321 | |
| 322 | Clang implements language version 1.0 published in `the official |
| 323 | release of C++ for OpenCL Documentation |
| 324 | <https://github.com/KhronosGroup/OpenCL-Docs/releases/tag/cxxforopencl-v1.0-r2>`_. |
| 325 | |
| 326 | Limited support of experimental C++ libraries is described in the :ref:`experimental features <opencl_experimenal>`. |
| 327 | |
| 328 | Bugzilla bugs for this functionality are typically prefixed |
| 329 | with '[C++4OpenCL]'. See `this query |
| 330 | <https://bugs.llvm.org/buglist.cgi?component=OpenCL&list_id=204139&product=clang&query_format=advanced&resolution=---&short_desc=%5BC%2B%2B4OpenCL%5D&short_desc_type=allwordssubstr>`__ |
| 331 | to view the full bug list. |
| 332 | |
| 333 | |
| 334 | Missing features or with limited support |
| 335 | ---------------------------------------- |
| 336 | |
| 337 | - IR generation for global destructors is incomplete (see |
| 338 | `PR48047 <https://llvm.org/PR48047>`_). |
| 339 | |
| 340 | .. _opencl_300: |
| 341 | |
| 342 | OpenCL C 3.0 Usage |
| 343 | ================== |
| 344 | |
| 345 | The OpenCL C 3.0 language standard makes most OpenCL C 2.0 features optional. Optional |
| 346 | functionality in OpenCL C 3.0 is indicated by the presence of feature-test macros |
| 347 | (the list of feature-test macros is `here <https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#features>`__). |
| 348 | The command-line flag :ref:`-cl-ext <opencl_cl_ext>` can be used to override features supported by a target. |
| 349 | |
| 350 | When a feature has an associated extension (fp64 and 3D image writes), |
| 351 | the user should specify both the extension and the feature in the command-line flag: |
| 352 | |
| 353 | .. code-block:: console |
| 354 | |
| 355 | $ clang -cc1 -cl-std=CL3.0 -cl-ext=+cl_khr_fp64,+__opencl_c_fp64 ... |
| 356 | $ clang -cc1 -cl-std=CL3.0 -cl-ext=-cl_khr_fp64,-__opencl_c_fp64 ... |
| 357 | |
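The requirement to keep an extension and its feature in step can be expressed as a small consistency check. The C++ sketch below is an illustration only and makes no claim about how Clang validates the options: ``OptionMap`` and ``mismatchedPairs`` are hypothetical names, the pairs are the two cases named in the text (fp64 and 3D image writes), and an option missing from the map is treated as disabled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical view of the options after -cl-ext has been applied.
using OptionMap = std::map<std::string, bool>;

// Returns the extension names whose state differs from the state of the
// associated OpenCL C 3.0 feature.
inline std::vector<std::string> mismatchedPairs(const OptionMap &opts) {
  static const std::pair<const char *, const char *> kPairs[] = {
      {"cl_khr_fp64", "__opencl_c_fp64"},
      {"cl_khr_3d_image_writes", "__opencl_c_3d_image_writes"}};
  std::vector<std::string> mismatched;
  for (const auto &p : kPairs) {
    auto ext = opts.find(p.first);
    auto feat = opts.find(p.second);
    bool extOn = ext != opts.end() && ext->second;
    bool featOn = feat != opts.end() && feat->second;
    if (extOn != featOn)
      mismatched.push_back(p.first);
  }
  return mismatched;
}
```

For the two invocations above the check returns an empty list, whereas enabling only ``cl_khr_fp64`` without ``__opencl_c_fp64`` would be reported as a mismatch.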
| 358 | |
| 359 | OpenCL C 3.0 Implementation Status |
| 360 | ---------------------------------- |
| 361 | |
| 362 | The following table provides an overview of features in OpenCL C 3.0 and their |
| 363 | implementation status. |
| 364 | |
| 365 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 366 | | Category               | Feature                                                           | Status            | Reviews                                                                                      | |
| 367 | +========================+===================================================================+===================+==============================================================================================+ |
| 368 | | Command line interface | New value for ``-cl-std`` flag                                    | :good:`done`      | https://reviews.llvm.org/D88300                                                              | |
| 369 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 370 | | Predefined macros      | New version macro                                                 | :good:`done`      | https://reviews.llvm.org/D88300                                                              | |
| 371 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 372 | | Predefined macros      | Feature macros                                                    | :good:`done`      | https://reviews.llvm.org/D95776                                                              | |
| 373 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 374 | | Feature optionality    | Generic address space                                             | :good:`done`      | https://reviews.llvm.org/D95778 and https://reviews.llvm.org/D103401                         | |
| 375 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 376 | | Feature optionality    | Builtin function overloads with generic address space             | :good:`done`      | https://reviews.llvm.org/D105526                                                             | |
| 377 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 378 | | Feature optionality    | Program scope variables in global memory                          | :good:`done`      | https://reviews.llvm.org/D103191                                                             | |
| 379 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 380 | | Feature optionality    | 3D image writes including builtin functions                       | :part:`worked on` | https://reviews.llvm.org/D106260 (frontend)                                                  | |
| 381 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 382 | | Feature optionality    | read_write images including builtin functions                     | :part:`worked on` | https://reviews.llvm.org/D104915 (frontend) and https://reviews.llvm.org/D107539 (functions) | |
| 383 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 384 | | Feature optionality    | C11 atomics memory scopes, ordering and builtin function          | :good:`done`      | https://reviews.llvm.org/D106111                                                             | |
| 385 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 386 | | Feature optionality    | Blocks and Device-side kernel enqueue including builtin functions | :none:`unclaimed` |                                                                                              | |
| 387 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 388 | | Feature optionality    | Pipes including builtin functions                                 | :part:`worked on` | https://reviews.llvm.org/D107154 (frontend) and https://reviews.llvm.org/D105858 (functions) | |
| 389 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 390 | | Feature optionality    | Work group collective builtin functions                           | :part:`worked on` | https://reviews.llvm.org/D105858                                                             | |
| 391 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 392 | | Feature optionality    | Image types and builtin functions                                 | :good:`done`      | https://reviews.llvm.org/D103911 (frontend) and https://reviews.llvm.org/D107539 (functions) | |
| 393 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 394 | | Feature optionality    | Double precision floating point type                              | :good:`done`      | https://reviews.llvm.org/D96524                                                              | |
| 395 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 396 | | New functionality      | RGBA vector components                                            | :good:`done`      | https://reviews.llvm.org/D99969                                                              | |
| 397 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 398 | | New functionality      | Subgroup functions                                                | :part:`worked on` | https://reviews.llvm.org/D105858                                                             | |
| 399 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 400 | | New functionality      | Atomic mem scopes: subgroup, all devices including functions      | :part:`worked on` | https://reviews.llvm.org/D103241                                                             | |
| 401 | +------------------------+-------------------------------------------------------------------+-------------------+----------------------------------------------------------------------------------------------+ |
| 402 | |
| 403 | .. _opencl_experimenal: |
| 404 | |
| 405 | Experimental features |
| 406 | ===================== |
| 407 | |
| 408 | Clang provides the following new work-in-progress features for developers to |
| 409 | experiment with, give early feedback on, or contribute further improvements to. |
| 410 | Feel free to contact us on `cfe-dev |
| 411 | <https://lists.llvm.org/mailman/listinfo/cfe-dev>`_ or via `Bugzilla |
| 412 | <https://bugs.llvm.org/>`__. |
| 413 | |
| 414 | .. _opencl_experimental_cxxlibs: |
| 415 | |
| 416 | C++ libraries for OpenCL |
| 417 | ------------------------ |
| 418 | |
| 419 | There is ongoing work to support C++ standard libraries from `LLVM's libcxx |
| 420 | <https://libcxx.llvm.org/>`_ in OpenCL kernel code using C++ for OpenCL mode. |
| 421 | |
| 422 | It is currently possible to include ``type_traits`` from C++17 in the kernel |
| 423 | sources when the following clang extensions are enabled: |
| 424 | ``__cl_clang_function_pointers`` and ``__cl_clang_variadic_functions``; |
| 425 | see :doc:`LanguageExtensions` for more details. The use of non-conformant |
| 426 | features enabled by the extensions does not expose non-conformant behavior |
| 427 | beyond the compilation, i.e. it does not get generated in IR or binary. |
| 428 | The extensions only appear in metaprogramming |
| 429 | mechanisms that identify or verify the properties of types. This makes it possible to provide |
| 430 | the full C++ functionality without a loss of portability. To avoid unsafe use |
| 431 | of the extensions, it is recommended that the extensions are disabled directly |
| 432 | after the header include. |
| 433 | |
| 434 | **Example of Use**: |
| 435 | |
| 436 | An example of kernel code using ``type_traits`` is shown below. |
| 437 | |
| 438 | .. code-block:: c++ |
| 439 | |
| 440 | #pragma OPENCL EXTENSION __cl_clang_function_pointers : enable |
| 441 | #pragma OPENCL EXTENSION __cl_clang_variadic_functions : enable |
| 442 | #include <type_traits> |
| 443 | #pragma OPENCL EXTENSION __cl_clang_function_pointers : disable |
| 444 | #pragma OPENCL EXTENSION __cl_clang_variadic_functions : disable |
| 445 | |
| 446 | using sint_type = std::make_signed<unsigned int>::type; |
| 447 | |
| 448 | __kernel void foo() { |
| 449 | static_assert(!std::is_same<sint_type, unsigned int>::value); |
| 450 | } |
| 451 | |
| 452 | A possible clang invocation to compile the example is as follows: |
| 453 | |
| 454 | .. code-block:: console |
| 455 | |
| 456 | $ clang -I<path to libcxx checkout or installation>/include test.clcpp |
| 457 | |
| 458 | Note that ``type_traits`` is a header-only library and therefore no extra |
| 459 | linking step against the standard libraries is required. See the full example |
| 460 | in `Compiler Explorer <https://godbolt.org/z/5WbnTfb65>`_. |
| 461 | |
| 462 | More OpenCL specific C++ library implementations built on top of libcxx |
| 463 | are available in the `libclcxx <https://github.com/KhronosGroup/libclcxx>`_ |
| 464 | project. |