OpenMP API Specification 5.2
Application Programming Interface
Copyright © 1997-2021 OpenMP Architecture Review Board.
Permission to copy without fee all or part of this material is granted, provided the OpenMP
Architecture Review Board copyright notice and the title of this document appear. Notice is
given that copying is by permission of the OpenMP Architecture Review Board.
Contents
2 Internal Control Variables
2.1 ICV Descriptions
2.2 ICV Initialization
2.3 Modifying and Retrieving ICV Values
2.4 How the Per-Data Environment ICVs Work
2.5 ICV Override Relationships
5 Data Environment
5.1 Data-Sharing Attribute Rules
5.1.1 Variables Referenced in a Construct
5.1.2 Variables Referenced in a Region but not in a Construct
5.2 threadprivate Directive
5.3 List Item Privatization
5.4 Data-Sharing Attribute Clauses
5.4.1 default Clause
5.4.2 shared Clause
5.4.3 private Clause
5.4.4 firstprivate Clause
5.4.5 lastprivate Clause
5.4.6 linear Clause
5.4.7 is_device_ptr Clause
5.4.8 use_device_ptr Clause
5.4.9 has_device_addr Clause
5.4.10 use_device_addr Clause
5.5 Reduction Clauses and Directives
5.5.1 OpenMP Reduction Identifiers
5.5.2 OpenMP Reduction Expressions
5.5.3 Implicitly Declared OpenMP Reduction Identifiers
5.5.4 initializer Clause
5.5.5 Properties Common to All Reduction Clauses
5.5.6 Reduction Scoping Clauses
5.5.7 Reduction Participating Clauses
5.5.8 reduction Clause
5.5.9 task_reduction Clause
5.5.10 in_reduction Clause
5.5.11 declare reduction Directive
5.6 scan Directive
5.6.1 inclusive Clause
5.6.2 exclusive Clause
5.7 Data Copying Clauses
5.7.1 copyin Clause
5.7.2 copyprivate Clause
5.8 Data-Mapping Control
5.8.1 Implicit Data-Mapping Attribute Rules
5.8.2 Mapper Identifiers and mapper Modifiers
5.8.3 map Clause
5.8.4 enter Clause
5.8.5 link Clause
5.8.6 Pointer Initialization for Device Data Environments
5.8.7 defaultmap Clause
5.8.8 declare mapper Directive
5.9 Data-Motion Clauses
5.9.1 to Clause
5.9.2 from Clause
5.10 uniform Clause
5.11 aligned Clause
9 Loop Transformation Constructs
9.1 tile Construct
9.1.1 sizes Clause
9.2 unroll Construct
9.2.1 full Clause
9.2.2 partial Clause
14 Interoperability
14.1 interop Construct
14.1.1 OpenMP Foreign Runtime Identifiers
14.1.2 init Clause
14.1.3 use Clause
14.2 Interoperability Requirement Set
18.3 Thread Affinity Routines
18.3.1 omp_get_proc_bind
18.3.2 omp_get_num_places
18.3.3 omp_get_place_num_procs
18.3.4 omp_get_place_proc_ids
18.3.5 omp_get_place_num
18.3.6 omp_get_partition_num_places
18.3.7 omp_get_partition_place_nums
18.3.8 omp_set_affinity_format
18.3.9 omp_get_affinity_format
18.3.10 omp_display_affinity
18.3.11 omp_capture_affinity
18.4 Teams Region Routines
18.4.1 omp_get_num_teams
18.4.2 omp_get_team_num
18.4.3 omp_set_num_teams
18.4.4 omp_get_max_teams
18.4.5 omp_set_teams_thread_limit
18.4.6 omp_get_teams_thread_limit
18.5 Tasking Routines
18.5.1 omp_get_max_task_priority
18.5.2 omp_in_explicit_task
18.5.3 omp_in_final
18.6 Resource Relinquishing Routines
18.6.1 omp_pause_resource
18.6.2 omp_pause_resource_all
18.7 Device Information Routines
18.7.1 omp_get_num_procs
18.7.2 omp_set_default_device
18.7.3 omp_get_default_device
18.7.4 omp_get_num_devices
18.7.5 omp_get_device_num
18.7.6 omp_is_initial_device
18.12.7 omp_get_interop_rc_desc
18.13 Memory Management Routines
18.13.1 Memory Management Types
18.13.2 omp_init_allocator
18.13.3 omp_destroy_allocator
18.13.4 omp_set_default_allocator
18.13.5 omp_get_default_allocator
18.13.6 omp_alloc and omp_aligned_alloc
18.13.7 omp_free
18.13.8 omp_calloc and omp_aligned_calloc
18.13.9 omp_realloc
18.14 Tool Control Routine
18.15 Environment Display Routine
20.5.5 Thread Handles
20.5.6 Parallel Region Handles
20.5.7 Task Handles
20.5.8 Querying Thread States
20.5.9 Display Control Variables
20.5.10 Accessing Scope-Specific Information
20.6 Runtime Entry Points for OMPD
20.6.1 Beginning Parallel Regions
20.6.2 Ending Parallel Regions
20.6.3 Beginning Task Regions
20.6.4 Ending Task Regions
20.6.5 Beginning OpenMP Threads
20.6.6 Ending OpenMP Threads
20.6.7 Initializing OpenMP Devices
20.6.8 Finalizing OpenMP Devices
Index
List of Figures
19.1 First-Party Tool Activation Flow Chart
List of Tables
2.1 ICV Scopes and Descriptions
2.2 ICV Initial Values
2.3 Ways to Modify and to Retrieve ICV Values
2.4 ICV Override Relationships
19.1 OMPT Callback Interface Runtime Entry Point Names and Their Type Signatures
19.2 Callbacks for which ompt_set_callback Must Return ompt_set_always
19.3 Callbacks for which ompt_set_callback May Return Any Non-Error Code
19.4 OMPT Tracing Interface Runtime Entry Point Names and Their Type Signatures
1 Overview of the OpenMP API
The compiler directives, library routines, and environment variables that this document describes collectively define the specification of the OpenMP Application Program Interface (OpenMP API) for parallelism in C, C++ and Fortran programs.
This specification provides a model for parallel programming that is portable across architectures from different vendors. Compilers from numerous vendors support the OpenMP API. More information about the OpenMP API can be found at the following web site:
http://www.openmp.org
The directives, library routines, environment variables, and tool support that this document defines allow users to create, to manage, to debug and to analyze parallel programs while permitting portability. The directives extend the C, C++ and Fortran base languages with single program multiple data (SPMD) constructs, tasking constructs, device constructs, work-distribution constructs, and synchronization constructs, and they provide support for sharing, mapping and privatizing data. The functionality to control the runtime environment is provided by library routines and environment variables. Compilers that support the OpenMP API often include command line options to enable or to disable interpretation of some or all OpenMP directives.
1.1 Scope
The OpenMP API covers only user-directed parallelization, wherein the programmer explicitly specifies the actions to be taken by the compiler and runtime system in order to execute the program in parallel. OpenMP-compliant implementations are not required to check for data dependences, data conflicts, race conditions, or deadlocks. Compliant implementations also are not required to check for any code sequences that cause a program to be classified as non-conforming. Application developers are responsible for correctly using the OpenMP API to produce a conforming program. The OpenMP API does not cover compiler-generated automatic parallelization.
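As a minimal sketch of user-directed parallelization, the function below marks a loop as parallel with an explicit directive and names the reduction itself. The function name sum_array is an assumption made for illustration. If OpenMP support is disabled, the pragma is ignored and the loop runs sequentially with the same result.

```c
#include <stddef.h>

/* Sums n integers. The programmer, not the compiler, states that the
 * iterations may run in parallel and that `total` is a sum reduction. */
long sum_array(const int *values, size_t n)
{
    long total = 0;

    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)n; ++i) {
        total += values[i];
    }
    return total;
}
```

Without the reduction clause, the updates to total from different threads would race, and a compliant implementation would not be required to diagnose that.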
1.2 Glossary
1.2.1 Threading Concepts
inactive parallel region
    A parallel region that is executed by a team of only one thread.
active target region
    A target region that is executed on a device other than the device that encountered the target construct.
inactive target region
    A target region that is executed on the same device that encountered the target construct.
sequential part
    All code encountered during the execution of an initial task region that is not part of a parallel region corresponding to a parallel construct or a task region corresponding to a task construct.
    COMMENTS:
    A sequential part is enclosed by an implicit parallel region.
    Executable statements in called routines may be in both a sequential part and any number of explicit parallel regions at different points in the program execution.
primary thread
    An OpenMP thread that has thread number 0. A primary thread may be an initial thread or the thread that encounters a parallel construct, creates a team, generates a set of implicit tasks, and then executes one of those tasks as thread number 0.
target task
    A mergeable and untied task that is generated by a device construct or a call to a device memory routine and that coordinates activity between the current device and the target device.
taskgroup set
    A set of tasks that are logically grouped by a taskgroup region, such that a task is a member of the taskgroup set if and only if its task region is nested in the taskgroup region and it binds to the same parallel region as the taskgroup region.
variable
    A named data storage block, for which the value can be defined and redefined during the execution of a program.
    COMMENT: An array element or structure element is a variable that is part of another variable.
scalar variable
    For C/C++, a scalar variable, as defined by the base language.
    For Fortran, a scalar variable with intrinsic type, as defined by the base language, excluding character type.
aggregate variable
    A variable, such as an array or structure, composed of other variables. For Fortran, a variable of character type is considered an aggregate variable.
array section
    A designated subset of the elements of an array that is specified using a subscript notation that can select more than one element.
array item
    An array, an array section, or an array element.
shape-operator
    For C/C++, an array shaping operator that reinterprets a pointer expression as an array with one or more specified dimensions.
implicit array
    For C/C++, the set of array elements of non-array type T that may be accessed by applying a sequence of [] operators to a given pointer that is either a pointer to type T or a pointer to a multidimensional array of elements of type T.
    For Fortran, the set of array elements for a given array pointer.
    COMMENT: For C/C++, the implicit array for pointer p with type T (*)[10] consists of all accessible elements p[i][j], for all i and j=0,1,...,9.
base pointer
    For C/C++, an lvalue pointer expression that is used by a given lvalue expression or array section to refer indirectly to its storage, where the lvalue expression or array section is part of the implicit array for that lvalue pointer expression.
    For Fortran, a data pointer that appears last in the designator for a given variable or array section, where the variable or array section is part of the pointer target for that data pointer.
    COMMENT: For the array section (*p0).x0[k1].p1->p2[k2].x1[k3].x2[4][0:n], where identifiers p0, p1 and p2 have a pointer type declaration and identifiers x0, x1 and x2 have an array type declaration, the base pointer is: (*p0).x0[k1].p1->p2.
named pointer
    For C/C++, the base pointer of a given lvalue expression or array section, or the base pointer of one of its named pointers.
simply contiguous array section
    An array section that statically can be determined to have contiguous storage or that, in Fortran, has the CONTIGUOUS attribute.
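The implicit array comment in the glossary can be illustrated with a small C sketch. The function name and the rows parameter are assumptions made for illustration; the caller supplies how many rows are actually accessible through the pointer, since the type alone does not say.

```c
#include <stddef.h>

/* p has type int (*)[10]. Its implicit array is every accessible element
 * p[i][j] with j = 0..9; `rows` tells us which values of i are accessible. */
long sum_implicit_array(int (*p)[10], size_t rows)
{
    long total = 0;

    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < 10; ++j) {
            total += p[i][j];
        }
    }
    return total;
}
```

In an expression such as p[1][3], the pointer p refers indirectly to the storage of the element, which makes p the base pointer of that lvalue expression.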
tool
    Code that can observe and/or modify the execution of an application.
first-party tool
    A tool that executes in the address space of the program that it is monitoring.
third-party tool
    A tool that executes as a separate process from the process that it is monitoring and potentially controlling.
activated tool
    A first-party tool that successfully completed its initialization.
event
    A point of interest in the execution of a thread.
native thread
    A thread defined by an underlying thread implementation.
tool callback
    A function that a tool provides to an OpenMP implementation to invoke when an associated event occurs.
registering a callback
    Providing a tool callback to an OpenMP implementation.
dispatching a callback at an event
    Processing a callback when an associated event occurs in a manner consistent with the return code provided when a first-party tool registered the callback.
thread state
    An enumeration type that describes the current OpenMP activity of a thread. A thread can be in only one state at any time.
wait identifier
    A unique opaque handle associated with each data object (for example, a lock) that the OpenMP runtime uses to enforce mutual exclusion and potentially to cause a thread to wait actively or passively.
frame
    A storage area on a thread’s stack associated with a procedure invocation. A frame includes space for one or more saved registers and often also includes space for saved arguments, local variables, and padding for alignment.
canonical frame address
    An address associated with a procedure frame on a call stack that was the value of the stack pointer immediately prior to calling the procedure for which the frame represents the invocation.
runtime entry point
    A function interface provided by an OpenMP runtime for use by a tool. A runtime entry point is typically not associated with a global function symbol.
trace record
    A data structure in which to store information associated with an occurrence of an event.
native trace record
    A trace record for an OpenMP device that is in a device-specific format.
signal
    A software interrupt delivered to a thread.
signal handler
    A function called asynchronously when a signal is delivered to a thread.
Note – OpenMP synchronization operations, described in Chapter 15 and in Section 18.9, are recommended for enforcing this order. Synchronization through variables is possible but is not recommended because the proper timing of flushes is difficult.

The flush properties that define whether a flush operation is a strong flush, a release flush, or an acquire flush are not mutually disjoint. A flush operation may be a strong flush and a release flush; it may be a strong flush and an acquire flush; it may be a release flush and an acquire flush; or it may be all three.
Note – Since flush operations by themselves cannot prevent data races, explicit flush operations are only useful in combination with non-sequentially consistent atomic directives.
1.5.1 OMPT
The OMPT interface, which is intended for first-party tools, provides the following:
• A mechanism to initialize a first-party tool;
• Routines that enable a tool to determine the capabilities of an OpenMP implementation;
• Routines that enable a tool to examine OpenMP state information associated with a thread;
• Mechanisms that enable a tool to map implementation-level calling contexts back to their source-level representations;
• A callback interface that enables a tool to receive notification of OpenMP events;
• A tracing interface that enables a tool to trace activity on OpenMP target devices; and
• A runtime library routine that an application can use to control a tool.
OpenMP implementations may differ with respect to the thread states that they support, the mutual exclusion implementations that they employ, and the OpenMP events for which tool callbacks are invoked. For some OpenMP events, OpenMP implementations must guarantee that a registered callback will be invoked for each occurrence of the event. For other OpenMP events, OpenMP implementations are permitted to invoke a registered callback for some or no occurrences of the event; for such OpenMP events, however, OpenMP implementations are encouraged to invoke tool callbacks on as many occurrences of the event as is practical. Section 19.2.4 specifies the subset of OMPT callbacks that an OpenMP implementation must support for a minimal implementation of the OMPT interface.
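The general pattern of registering a callback and dispatching it at an event can be sketched with a toy model in plain C. This is not the OMPT interface: every name below (toy_event_t, toy_register_callback, toy_dispatch, toy_count_event) is hypothetical and exists only to show the pattern.

```c
#include <stddef.h>

/* Toy model only: none of these names belong to the OMPT interface. */
typedef enum {
    TOY_EVENT_THREAD_BEGIN,
    TOY_EVENT_TASK_CREATE,
    TOY_EVENT_COUNT
} toy_event_t;

typedef void (*toy_callback_t)(toy_event_t event, void *tool_data);

static toy_callback_t toy_callbacks[TOY_EVENT_COUNT];

/* Registering a callback: the tool hands a function to the runtime.
 * Returns 1 on success and 0 for an unknown event. */
int toy_register_callback(toy_event_t event, toy_callback_t callback)
{
    if ((int)event < 0 || (int)event >= TOY_EVENT_COUNT)
        return 0;
    toy_callbacks[event] = callback;
    return 1;
}

/* Dispatching a callback: the runtime invokes the registered function,
 * if any, when the event occurs. Returns 1 if a callback ran. */
int toy_dispatch(toy_event_t event, void *tool_data)
{
    if ((int)event < 0 || (int)event >= TOY_EVENT_COUNT)
        return 0;
    if (toy_callbacks[event] == NULL)
        return 0;
    toy_callbacks[event](event, tool_data);
    return 1;
}

/* Example tool callback: counts occurrences in an int owned by the tool. */
void toy_count_event(toy_event_t event, void *tool_data)
{
    (void)event;
    ++*(int *)tool_data;
}
```

In this model a dispatch for an event with no registered callback does nothing, which loosely mirrors the fact that a tool is only notified of events for which it registered a callback.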
With the exception of the omp_control_tool runtime library routine for tool control, all other routines in the OMPT interface are intended for use only by tools and are not visible to
1.5.2 OMPD
The OMPD interface is intended for third-party tools, which run as separate processes. An OpenMP implementation must provide an OMPD library that can be dynamically loaded and used by a third-party tool. A third-party tool, such as a debugger, uses the OMPD library to access OpenMP state of a program that has begun execution. OMPD defines the following:
• An interface that an OMPD library exports, which a tool can use to access OpenMP state of a program that has begun execution;
• A callback interface that a tool provides to the OMPD library so that the library can use it to access the OpenMP state of a program that has begun execution; and
• A small number of symbols that must be defined by an OpenMP implementation to help the tool find the correct OMPD library to use for that OpenMP implementation and to facilitate notification of events.
Chapter 20 describes OMPD in detail.
Some text is for information only, and is not part of the normative specification. Such text is designated as a note or comment, like this:

Note – Non-normative text...

COMMENT: Non-normative text...
ICV                     Scope             Description
debug-var               global            Controls whether an OpenMP implementation will collect information that an OMPD library can access to satisfy requests from a tool
def-allocator-var       implicit task     Controls the memory allocator used by memory allocation routines, directives and clauses that do not specify one explicitly
default-device-var      data environment  Controls the default target device
display-affinity-var    global            Controls the display of thread affinity
dyn-var                 data environment  Enables dynamic adjustment of the number of threads used for encountered parallel regions
explicit-task-var       data environment  Whether a given task is an explicit task
final-task-var          data environment  Whether a given task is a final task
levels-var              data environment  Number of nested parallel regions such that all parallel regions are enclosed by the outermost initial task region on the device
max-active-levels-var   data environment  Controls the maximum number of nested active parallel regions when the innermost parallel region is generated by a given task
max-task-priority-var   global            Controls the maximum value that can be specified in the priority clause
nteams-var              device            Controls the number of teams requested for encountered teams regions
nthreads-var            data environment  Controls the number of threads requested for encountered parallel regions
num-procs-var           device            The number of processors available on the device
place-partition-var     implicit task     Controls the place partition available for encountered parallel regions
run-sched-var           data environment  Controls the schedule used for worksharing-loop regions that specify the runtime schedule kind
stacksize-var           device            Controls the stack size for threads that the OpenMP implementation creates
target-offload-var      global            Controls the offloading behavior
team-size-var           data environment  Size of the current team
teams-thread-limit-var  device            Controls the maximum number of threads in each contention group that a teams construct creates
thread-limit-var        data environment  Controls the maximum number of threads that participate in the contention group
If an ICV has an associated environment variable and that ICV does not have global scope then the ICV has a set of associated device-specific environment variables that extend the associated environment variable with the following syntax:
<ENVIRONMENT VARIABLE>_DEV[_<device>]
where <ENVIRONMENT VARIABLE> is the associated environment variable and <device> is the device number as specified in the device clause (see Section 13.2).
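The naming syntax above can be sketched as a small string helper. The function name device_env_name and the convention that a negative device argument means "no device number" are assumptions made for this illustration; they are not part of the OpenMP API.

```c
#include <stdio.h>

/* Builds <ENVIRONMENT VARIABLE>_DEV[_<device>] into out.
 * A negative `device` is this sketch's way of saying "no device number".
 * Returns the number of characters written, or -1 if out is too small. */
int device_env_name(char *out, size_t out_size, const char *base, int device)
{
    int written = (device < 0)
        ? snprintf(out, out_size, "%s_DEV", base)
        : snprintf(out, out_size, "%s_DEV_%d", base, device);

    if (written < 0 || (size_t)written >= out_size)
        return -1;
    return written;
}
```

For example, calling the helper with the base name OMP_NUM_THREADS and device 2 produces OMP_NUM_THREADS_DEV_2, and calling it with a negative device produces OMP_NUM_THREADS_DEV.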
Semantics
• The initial value of dyn-var is implementation defined if the implementation supports dynamic adjustment of the number of threads; otherwise, the initial value is false.
• If target-offload-var is mandatory and the number of non-host devices is zero then the default-device-var is initialized to omp_invalid_device. Otherwise, the initial value is an implementation-defined non-negative integer that is less than or, if target-offload-var is not mandatory, equal to omp_get_initial_device().
• The value of the nthreads-var ICV is a list.
• The value of the bind-var ICV is a list.
The host and non-host device ICVs are initialized before any OpenMP API construct or OpenMP API routine executes. After the initial values are assigned, the values of any OpenMP environment variables that were set by the user are read and the associated ICVs are modified accordingly. If no <device> number is specified on the device-specific environment variable then the value is applied to all non-host devices.
Cross References
• OMP_AFFINITY_FORMAT, see Section 21.2.5
• OMP_ALLOCATOR, see Section 21.5.1
• OMP_CANCELLATION, see Section 21.2.6
Semantics
• The value of the bind-var ICV is a list. The runtime call omp_get_proc_bind retrieves the value of the first element of this list.
• The value of the nthreads-var ICV is a list. The runtime call omp_set_num_threads sets the value of the first element of this list, and omp_get_max_threads retrieves the value of the first element of this list.
• Detailed values in the place-partition-var ICV are retrieved using the listed runtime calls.
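The nthreads-var semantics above can be sketched with the two runtime calls they name. The wrapper name set_and_get_max_threads is an assumption made for illustration, and the _OPENMP guard is there only so that the sketch still compiles when OpenMP support is disabled.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Sets the first element of nthreads-var, then reads it back.
 * Without OpenMP support there is no ICV, so the sketch reports 1. */
int set_and_get_max_threads(int requested)
{
#ifdef _OPENMP
    omp_set_num_threads(requested);   /* writes the first list element */
    return omp_get_max_threads();     /* reads the first list element  */
#else
    (void)requested;
    return 1;
#endif
}
```

Because both calls operate on the first element of the list, the value read back is expected to be the value that was just set when the code is compiled with OpenMP enabled.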
Cross References
• omp_get_active_level, see Section 18.2.20
• omp_get_affinity_format, see Section 18.3.9
• omp_get_cancellation, see Section 18.2.8
• omp_get_default_allocator, see Section 18.13.5
• omp_get_default_device, see Section 18.7.3
• omp_get_dynamic, see Section 18.2.7
• omp_get_level, see Section 18.2.17
• omp_get_max_active_levels, see Section 18.2.16
• omp_get_max_task_priority, see Section 18.5.1
• omp_get_max_teams, see Section 18.4.4
• omp_get_max_threads, see Section 18.2.3
• omp_get_num_procs, see Section 18.7.1
• omp_get_num_threads, see Section 18.2.2
• omp_get_partition_num_places, see Section 18.3.6
• omp_get_partition_place_nums, see Section 18.3.7
• omp_get_place_num_procs, see Section 18.3.3
• omp_get_place_proc_ids, see Section 18.3.4
• omp_get_proc_bind, see Section 18.3.1
• omp_get_schedule, see Section 18.2.12
• omp_get_supported_active_levels, see Section 18.2.14
• omp_get_teams_thread_limit, see Section 18.4.6
• omp_get_thread_limit, see Section 18.2.13
• omp_get_thread_num, see Section 18.2.4
• omp_in_final, see Section 18.5.3
• omp_set_affinity_format, see Section 18.3.8
• omp_set_default_allocator, see Section 18.13.4
• omp_set_default_device, see Section 18.7.2
Semantics
• The num_threads clause overrides the value of the first element of the nthreads-var ICV.
• If a schedule clause specifies a modifier then that modifier overrides any modifier that is specified in the run-sched-var ICV.
• If bind-var is not set to false then the proc_bind clause overrides the value of the first element of the bind-var ICV; otherwise, the proc_bind clause has no effect.
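The first override rule can be sketched as follows. The function name team_size_with_clause is an assumption made for illustration. Each thread of the team adds 1 to its private copy of count, so the reduced value equals the team size; the function uses no runtime routine and therefore also compiles, and returns 1, when OpenMP support is disabled.

```c
/* Returns the size of the team that executes the parallel region.
 * The num_threads clause, not nthreads-var, determines the request. */
int team_size_with_clause(void)
{
    int count = 0;

    #pragma omp parallel num_threads(2) reduction(+:count)
    {
        count += 1;   /* one contribution per thread in the team */
    }
    return count;
}
```

The result is at most 2 regardless of the value of nthreads-var, although an implementation may still provide fewer threads than requested.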
Cross References
• allocate clause, see Section 6.6
• allocator clause, see Section 6.4
• num_teams clause, see Section 10.2.1
Restrictions
The following restrictions apply to OpenMP directives:
• Unless otherwise specified, a program must not depend on any ordering of the evaluations of the expressions that appear in the clauses specified on a directive.
• Unless otherwise specified, a program must not depend on any side effects of the evaluations of the expressions that appear in the clauses specified on a directive.
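One way to stay within these two restrictions is to evaluate any side-effecting expression before the directive and pass only the resulting value to the clause. The sketch below assumes a hypothetical helper, choose_team_size, whose only purpose is to have an observable side effect.

```c
static int evaluations = 0;

/* Hypothetical helper with a side effect: it counts its own calls. */
static int choose_team_size(void)
{
    ++evaluations;
    return 2;
}

/* Evaluates the side-effecting expression exactly once, before the
 * directive, so the program does not rely on how or when the clause
 * expression is evaluated. Returns the number of helper calls so far. */
int run_with_hoisted_clause_value(void)
{
    int team_size = choose_team_size();
    int count = 0;

    #pragma omp parallel num_threads(team_size) reduction(+:count)
    {
        count += 1;
    }
    (void)count;
    return evaluations;
}
```

Writing num_threads(choose_team_size()) directly in the clause would make the value of evaluations depend on the evaluation of a clause expression, which is what the restrictions tell programs not to rely on.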
Restrictions on explicit OpenMP regions (that arise from executable directives) are as follows:
C++
• A throw executed inside a region that arises from a thread-limiting directive must cause execution to resume within the same region, and the same thread that threw the exception must catch it. If the directive is also exception-aborting then whether the exception is caught or the throw results in runtime error termination is implementation defined.
C++
Fortran
• A directive may not appear in a pure procedure unless it is pure.
• A directive may not appear in a WHERE, FORALL or DO CONCURRENT construct.
• If more than one image is executing the program, any image control statement, ERROR STOP statement, FAIL IMAGE statement, collective subroutine call or access to a coindexed object that appears in an explicit OpenMP region will result in unspecified behavior.
Fortran
C / C++
White space in a directive-name is not optional.
C / C++
Some OpenMP directives specify a paired end directive, where the directive-name of the paired end directive is:
• If directive-name starts with begin, the end-directive-name replaces begin with end;
• Otherwise it is end directive-name, unless otherwise specified.
The directive-specification of a paired end directive may include one or more optional end-clauses:
directive-specifier [[,] end-clause[ [,] end-clause]...]
where end-clause has the end-clause property, which explicitly allows it on a paired end directive.
C / C++
An OpenMP directive may be specified as a pragma directive:
#pragma omp directive-specification new-line
or a pragma operator:
_Pragma("omp directive-specification")
The use of omp as the first preprocessing token of a pragma directive is reserved for OpenMP
directives that are defined in this specification. The use of ompx as the first preprocessing token of
a pragma directive is reserved for implementation-defined extensions to the OpenMP directives.
White space can be used before and after the #. Preprocessing tokens in directive-specification of
#pragma and _Pragma pragmas are subject to macro expansion.
C / C++
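As a hedged illustration of the two C/C++ forms, the following sketch writes the same loop once with a pragma directive and once with the pragma operator wrapped in a macro. The names OMP_PARALLEL_FOR, scale_pragma and scale_operator are invented for this example and are not part of the specification. If OpenMP compilation is not enabled, the directives are ignored and both functions run serially with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper macro: the pragma operator may appear in a macro
   body, where a #pragma line cannot. */
#define OMP_PARALLEL_FOR _Pragma("omp parallel for")

/* Directive written as a pragma directive. */
void scale_pragma(double *a, int n, double s) {
    assert(a != NULL || n == 0);
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        a[i] *= s;
}

/* The same directive written with the pragma operator. */
void scale_operator(double *a, int n, double s) {
    assert(a != NULL || n == 0);
    OMP_PARALLEL_FOR
    for (int i = 0; i < n; i++)
        a[i] *= s;
}
```

Calling either function on the same input is expected to produce the same array contents.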
C++
In C++11 and higher, an OpenMP directive may be specified as a C++ attribute specifier:
[[ omp :: directive-attr ]]
or
[[ using omp : directive-attr ]]
where directive-attr is
directive( directive-specification )
or
sequence( [omp::]directive-attr [[, [omp::]directive-attr] ... ] )
Multiple attributes on the same statement are allowed. Attribute directives that apply to the same
statement are unordered unless the sequence attribute is specified, in which case the right-to-left
ordering applies. The omp:: namespace qualifier within a sequence attribute is optional. The
application of multiple attributes in a sequence attribute is ordered as if each directive had been
specified as a pragma directive on subsequent lines.
The use of omp as the attribute namespace of an attribute specifier, or as the optional namespace
qualifier within a sequence attribute, is reserved for OpenMP directives that are defined in this
specification. The use of ompx as the attribute namespace of an attribute specifier, or as the
optional namespace qualifier within a sequence attribute, is reserved for implementation-defined
extensions to the OpenMP directives.
All OpenMP compiler directives must begin with a directive sentinel. The format of a sentinel
differs between fixed form and free form source files, as described in Section 3.1.1 and
Section 3.1.2. In order to simplify the presentation, free form is used for the syntax of OpenMP
directives for Fortran throughout this document, except as noted.
Directives are case insensitive. Directives cannot be embedded within continued statements, and
statements cannot be embedded within directives. Each expression used in the OpenMP syntax
inside of a clause must be a valid expression of the base language unless otherwise specified.
Fortran
A directive may be categorized as one of the following:
• meta
• declarative
• executable
• informational
• utility
• subsidiary
Formations that result from a block-associated directive have the following syntax:
C / C++
directive
structured-block
C / C++
Fortran
directive
structured-block
[end-directive]
or:
directive
declaration-associated-specification
wrapped in a single compound statement for C/C++ or optionally wrapped in a single BLOCK
construct for Fortran.
The sentinels that end with omp are reserved for OpenMP directives that are defined in this
specification. The sentinels that end with omx are reserved for implementation-defined extensions
to the OpenMP directives.
Note – In the following example, the three formats for specifying the directive are equivalent (the
first line represents the position of the first 9 columns):
c23456789
!$omp parallel do shared(a,b,c)

c$omp parallel do
c$omp+shared(a,b,c)

c$omp paralleldoshared(a,b,c)
Fortran
Fortran
The !$omp sentinel is reserved for OpenMP directives that are defined in this specification. The
!$ompx sentinel is reserved for implementation-defined extensions to the OpenMP directives.
The sentinel can appear in any column as long as it is preceded only by white space. It must appear
as a single word with no intervening white space. Fortran free form line length and white space
rules apply to the directive line. Initial directive lines must have a space after the sentinel. The
initial line of a directive must not be a continuation line for a base language statement. Fortran free
form continuation rules apply. Thus, continued directive lines must have an ampersand (&) as the
last non-blank character on the line, prior to any comment placed inside the directive; continuation
directive lines can have an ampersand after the directive sentinel with optional white space before
and after the ampersand.
Comments may appear on the same line as a directive. The exclamation point (!) initiates a
comment. The comment extends to the end of the source line and is ignored. If the first non-blank
character after the directive sentinel is an exclamation point, the line is ignored.
Fortran
Inarguable clauses often form natural groupings that have similar semantic effect and so are
frequently specified as a clause grouping. For argument-modified clauses, clause-specification is:
clause-name[(clause-argument-specification [; clause-argument-specification [;...]])]
C / C++
White space in a clause-name is prohibited. White space within a clause-argument-specification
and between another clause-argument-specification is optional.
C / C++
An implementation may allow clauses with clause names that start with the ompx_ prefix for use
on any OpenMP directive, and the format and semantics of any such clause is implementation
defined. All other clause names are reserved.
For argument-modified clauses, the first clause-argument-specification is required unless otherwise
explicitly stated while additional ones are only permitted on clauses that explicitly allow them.
When the first one is omitted, the syntax is identical to an inarguable clause. Clause arguments may
be unmodified or modified. For an unmodified argument, clause-argument-specification is:
Unless otherwise specified, modified arguments are pre-modified, for which the format is:
[modifier-specification [[, modifier-specification] ,... ] :]clause-argument-list
A few modified arguments are explicitly specified as post-modified, for which the format is:
clause-argument-list[: modifier-specification [[, modifier-specification] ,... ]]
For all other OpenMP clauses, clause-argument-list is a comma-separated list of arguments so the
format is:
argument-name [, argument-name [,... ]]
In most of these cases, the list only has a single item so the format of clause-argument-list is again:
argument-name
The clauses that a directive accepts may form sets. These sets may imply restrictions on their use
on that directive or may otherwise capture properties for the clauses on the directive. While specific
properties may be defined for a clause set on a particular directive, the following clause-set
properties have general meanings and implications as indicated by the restrictions below: required,
unique, and exclusive.
Restrictions
Restrictions to clauses and clause sets are as follows:
• A required clause for a directive must appear on the directive.
• A unique clause for a directive may appear at most once on the directive.
• An exclusive clause for a directive must not appear if a clause with a different clause-name also
appears on the directive.
• An ultimate clause for a directive must be the lexically last clause to appear on the directive.
• If a clause set has the required property, at least one clause in the set must be present on the
directive for which the clause set is specified.
• If a clause is a member of a set that has the unique property for a directive then the clause has the
unique property for that directive regardless of whether it has the unique property when it is not
part of such a set.
• If one clause of a clause set with the exclusive property appears on a directive, no other clauses
with a different clause-name in that set may appear on the directive.
• A required argument must appear in the clause-specification.
• A unique argument may appear at most once in a clause-argument-specification.
• An exclusive argument must not appear if an argument with a different argument-name appears
in the clause-argument-specification.
• A required modifier must appear in the clause-argument-specification.
• A unique modifier may appear at most once in a clause-argument-specification.
• An exclusive modifier must not appear if a modifier with a different modifier-name also appears
in the clause-argument-specification.
• If a clause is pre-modified, an ultimate modifier must be the last modifier in a
clause-argument-specification in which any modifier appears.
• If a clause is post-modified, an ultimate modifier must be the first modifier in a
clause-argument-specification in which any modifier appears.
• A modifier that is an expression must neither lexically match the name of a simple modifier
defined for the clause that is an OpenMP keyword nor modifier-name parenthesized-tokens,
where modifier-name is the modifier-name of a complex modifier defined for the clause and
parenthesized-tokens is a token sequence that starts with ( and ends with ).
Cross References
• Directive Format, see Section 3.1
• OpenMP Argument Lists, see Section 3.2.1
• OpenMP Stylized Expressions, see Section 4.2
• OpenMP Types and Identifiers, see Section 4.1
The reserved locator omp_all_memory is a reserved identifier that denotes a list item treated as
having storage that corresponds to the storage of all other objects in memory.
Restrictions
Restrictions to the shape-operator are as follows:
• The type T must be a complete type.
• The shape-operator can appear only in clauses for which it is explicitly allowed.
• The result of a shape-operator must be a named array of a list item.
• The type of the expression upon which a shape-operator is applied must be a pointer type.
C++
• If the type T is a reference to a type T’, then the type will be considered to be T’ for all purposes
of the designated array.
C++
C / C++
The precedence of a subscript operator that uses the array section syntax is the same as the
precedence of a subscript operator that does not use the array section syntax.

Note – The following are examples of array sections:
a[0:6]
a[0:6:1]
a[1:10]
a[1:]
a[:10:2]
b[10][:][:]
b[10][:][:0]
c[42][0:6][:]
c[42][0:6:2][:]
c[1:10][42][0:6]
S.c[:100]
p->y[:10]
this->a[:N]
(p+10)[:N]
Assume a is declared to be a 1-dimensional array with dimension size 11. The first two examples
are equivalent, and the third and fourth examples are equivalent. The fifth example specifies a stride
of 2 and therefore is not contiguous.
Assume b is declared to be a pointer to a 2-dimensional array with dimension sizes 10 and 10. The
sixth example refers to all elements of the 2-dimensional array given by b[10]. The seventh
example is a zero-length array section.
Assume c is declared to be a 3-dimensional array with dimension sizes 50, 50, and 50. The eighth
example is contiguous, while the ninth and tenth examples are not contiguous.
The final four examples show array sections that are formed from more general base expressions.
The following are examples that are non-conforming array sections:
s[:10].x
p[:10]->y
*(xp[:10])
For all three examples, a base language operator is applied in an undefined manner to an array
Clauses
affinity, depend, from, map, to
An iterator modifier is a unique, complex modifier that defines a set of iterators, each of which is an
iterator-identifier and an associated set of values. An iterator-identifier expands to those values in
the clause argument for which it is specified. Each member of the modifier-parameter-specification
list of an iterator modifier is an iterator-specifier with this format:
C / C++
[ iterator-type ] iterator-identifier = range-specification
C / C++
Fortran
[ iterator-type :: ] iterator-identifier = range-specification
Fortran
where:
• iterator-identifier is a base-language identifier.
• iterator-type is a type that is permitted in a type-name list.
• range-specification is of the form begin:end[:step], where begin and end are expressions for
which their types can be converted to iterator-type and step is an integral expression.
C / C++
In an iterator-specifier, if the iterator-type is not specified then that iterator is of int type.
C / C++
Fortran
In an iterator-specifier, if the iterator-type is not specified then that iterator has default integer type.
Fortran
In a range-specification, if the step is not specified its value is implicitly defined to be 1.
An iterator only exists in the context of the clause argument that it modifies. An iterator also hides
all accessible symbols with the same name in the context of that clause argument.
The use of a variable in an expression that appears in the range-specification causes an implicit
reference to the variable in all enclosing constructs.
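The following C sketch illustrates, under stated assumptions, how an iterator modifier can expand one locator into several dependences on a depend clause. The function name sum_after_producers, the array part and the bound of 64 are invented for this example. It assumes a compiler that accepts the iterator modifier on depend, and it assumes that the range 0:n covers the values 0 through n-1. Without OpenMP the directives are ignored and the code runs serially with the same result.

```c
#include <assert.h>

/* Hypothetical helper: producer tasks fill part[0..n-1]; one consumer
   task that depends on every element then adds them up. */
long sum_after_producers(int n) {
    assert(n >= 1 && n <= 64);
    long part[64] = {0};
    long total = 0;
    #pragma omp parallel
    #pragma omp single
    {
        for (int k = 0; k < n; k++) {
            #pragma omp task depend(out: part[k]) shared(part) firstprivate(k)
            part[k] = (long)k * k;
        }
        /* iterator(i = 0:n) is assumed to expand part[i] to part[0] ... part[n-1];
           i has type int because no iterator-type is given. */
        #pragma omp task depend(iterator(i = 0:n), in: part[i]) shared(part, total)
        for (int k = 0; k < n; k++)
            total += part[k];
    }
    /* All tasks have completed at the barrier that ends the single region. */
    return total;
}
```

The variable n appears in the range-specification, so it is implicitly referenced in the enclosing parallel and single constructs as well.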
Cross References
• affinity clause, see Section 12.5.1
• depend clause, see Section 15.9.5
• from clause, see Section 5.9.2
• map clause, see Section 5.8.3
• to clause, see Section 5.9.1
To enable conditional compilation, a line with a conditional compilation sentinel must satisfy the
following criteria:
• The sentinel must start in column 1 and appear as a single word with no intervening white space;
• After the sentinel is replaced with two spaces, initial lines must have a space or zero in column 6
and only white space and numbers in columns 1 through 5; and
• After the sentinel is replaced with two spaces, continuation lines must have a character other than
a space or zero in column 6 and only white space in columns 1 through 5.
If these criteria are met, the sentinel is replaced by two spaces. If these criteria are not met, the line
is left unchanged.

Note – In the following example, the two forms for specifying conditional compilation in fixed
source form are equivalent (the first line represents the position of the first 9 columns):
c23456789
!$ 10 iam = omp_get_thread_num() +
!$   &          index

#ifdef _OPENMP
   10 iam = omp_get_thread_num() +
     &          index
#endif
Fortran
To enable conditional compilation, a line with a conditional compilation sentinel must satisfy the
following criteria:
• The sentinel can appear in any column but must be preceded only by white space;
• The sentinel must appear as a single word with no intervening white space;
• Initial lines must have a blank character after the sentinel; and
• Continued lines must have an ampersand as the last non-blank character on the line, prior to any
comment appearing on the conditionally compiled line.
Continuation lines can have an ampersand after the sentinel, with optional white space before and
after the ampersand. If these criteria are met, the sentinel is replaced by two spaces. If these criteria
are not met, the line is left unchanged.

Note – In the following example, the two forms for specifying conditional compilation in free
source form are equivalent (the first line represents the position of the first 9 columns):
c23456789
 !$ iam = omp_get_thread_num() + &
 !$&    index

#ifdef _OPENMP
    iam = omp_get_thread_num() + &
        index
#endif
Fortran
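The sentinel rules above are specific to Fortran. For C and C++, a comparable effect can be obtained with the _OPENMP macro that the notes use in their #ifdef form. The sketch below is an illustration only; the function name thread_id_plus is invented here, and the statement it guards mirrors the one in the notes.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>   /* included only when OpenMP compilation is enabled */
#endif

/* Hypothetical helper: returns the calling thread's number plus index,
   or just index when the program is compiled without OpenMP. */
int thread_id_plus(int index) {
    int iam = 0;
#ifdef _OPENMP
    iam = omp_get_thread_num();
#endif
    assert(iam >= 0);   /* thread numbers are never negative */
    return iam + index;
}
```

When called outside of any parallel region, the function returns index in both build configurations, because the thread number of the initial thread is zero.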
Arguments
Name             Type                          Properties
if-expression    expression of logical type    default
Modifiers
Name                       Modifies         Type                       Properties
directive-name-modifier    if-expression    Keyword: directive-name    unique
Directives
cancel, parallel, simd, target, target data, target enter data,
target exit data, target update, task, taskloop
Semantics
If no directive-name-modifier is specified then the effect is as if a directive-name-modifier was
specified with the directive-name of the directive on which the clause appears.
The effect of the if clause depends on the construct to which it is applied. If the construct is not a
combined or composite construct then the effect is described in the section that describes that
construct. For combined or composite constructs, the if clause only applies to the semantics of the
construct named in the directive-name-modifier. For a combined or composite construct, if no
directive-name-modifier is specified then the if clause applies to all constituent constructs to
which an if clause can apply.
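As a hedged C illustration of the directive-name-modifier on a combined construct, the sketch below attaches an if clause to the parallel constituent of parallel for. The function name sum_to and the constant PAR_THRESHOLD are assumptions made for this example. The comment about a one-thread team reflects the semantics of the parallel construct, which are described in that construct's own section rather than here.

```c
#include <assert.h>

/* Hypothetical cut-off below which starting a team is assumed not to pay off. */
enum { PAR_THRESHOLD = 1000 };

/* Sums the integers 1..n. */
long sum_to(int n) {
    assert(n >= 0);
    long total = 0;
    /* The modifier restricts the if clause to the parallel constituent.
       When the expression is false, the region is expected to execute
       with a team of one thread. */
    #pragma omp parallel for if(parallel: n > PAR_THRESHOLD) reduction(+:total)
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}
```

The result does not depend on whether the condition holds; only the number of threads that execute the loop does.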
Restrictions
Restrictions to the if clause are as follows:
• At most one if clause can be specified that applies to the semantics of any construct or
constituent construct of a directive-specification.
• The directive-name-modifier must specify the directive-name of the construct or of a constituent
construct of the directive-specification on which the if clause appears.
Cross References
• cancel directive, see Section 16.1
• parallel directive, see Section 10.1
• simd directive, see Section 10.4
• target data directive, see Section 13.5
• target directive, see Section 13.8
• target enter data directive, see Section 13.6
Arguments
Name           Type                               Properties
destroy-var    variable of OpenMP variable type   default
Directives
depobj, interop
Additional information
When the destroy clause appears on the depobj construct, the destroy-var argument may be
omitted. This syntax has been deprecated.
Semantics
If the destroy clause appears on a depobj construct and destroy-var is not specified, the effect
is as if destroy-var refers to the same OpenMP depend object as the depobj argument of the
construct. The syntax of the destroy clause on the depobj construct that does not specify
destroy-var has been deprecated. When the destroy clause appears on a depobj construct, the
state of destroy-var is set to uninitialized.
When the destroy clause appears on an interop construct, the interop-type is inferred based
on the interop-type used to initialize destroy-var, and destroy-var is set to the value of
omp_interop_none after resources associated with destroy-var are released. The object
referred to by destroy-var is unusable after destruction and the effect of using values associated
with it is unspecified until it is initialized again by another interop construct.
Restrictions
• destroy-var must be non-const.
• If the destroy clause appears on a depobj construct, destroy-var must refer to the same
depend object as the depobj argument of the construct.
• If the destroy clause appears on an interop construct, destroy-var must refer to a variable of
OpenMP interop type.
Cross References
• depobj directive, see Section 15.9.4
• interop directive, see Section 14.1
Restrictions
The following restrictions apply generally for base language code in an OpenMP program:
• Programs must not declare names that begin with the omp_ or ompx_ prefix, as these are
reserved for the OpenMP implementation.
C++
• Programs must not declare a namespace with the omp or ompx names, as these are reserved for
the OpenMP implementation.
C++
Cross References
• OpenMP Combiner Expressions, see Section 5.5.2.1
• OpenMP Initializer Expressions, see Section 5.5.2.2
or
target-call ( [expression-list] );
C / C++
Fortran
A function dispatch structured block is an expression statement with one of the following forms:
expression = target-call ( [arguments] )
or
CALL target-call [ ( [arguments] )]
For purposes of the dispatch construct, the expression statement is considered a strictly
structured block.
Fortran
Restrictions
Restrictions to the function dispatch structured blocks are as follows:
C++
• The target-call expression can only be a direct call.
C++
Fortran
• target-call must be a procedure name.
• target-call must not be a procedure pointer.
Fortran
Cross References
• dispatch directive, see Section 7.6
or cond-update-stmt, a conditional update statement that has one of the following forms:
if(expr ordop x) { x = expr; }
if(x ordop expr) { x = expr; }
if(x == e) { x = d; }
C / C++
or
if (x equalop e) x = d
or
capture-statement
statement
All capture-atomic structured blocks are considered loosely structured blocks for the purpose of the
atomic construct.
Fortran
Restrictions
Restrictions to OpenMP atomic structured blocks are as follows:
C / C++
• In forms where e is assigned it must be an lvalue.
• r must be of integral type.
• During the execution of an atomic region, multiple syntactic occurrences of x must designate
the same storage location.
• During the execution of an atomic region, multiple syntactic occurrences of r must designate
the same storage location.
Symbol Meaning
or
generated-canonical-loop
or
C / C++
{
[intervening-code]
loop-body
[intervening-code]
}
C / C++
or
generated-canonical-loop    A generated loop from a loop transformation construct that has canonical loop nest
    form and for which the loop body matches loop-body.
intervening-code    A non-empty structured block sequence that does not contain OpenMP directives or
    calls to the OpenMP runtime API in its corresponding region, referred to as
    intervening code. If intervening code is present, then a loop at the same depth within
    the loop nest is not a perfectly nested loop.
C / C++
    It must not contain iteration statements, continue statements or break statements
    that apply to the enclosing loop.
C / C++
Fortran
    It must not contain loops, array expressions, CYCLE statements or EXIT statements.
Fortran
final-loop-body    A structured block that terminates the scope of loops in the loop nest. If the loop nest
    is associated with a loop-associated directive, loops in this structured block cannot be
    associated with that directive.
C / C++
a1, a2, incr    Integer expressions that are loop invariant with respect to the outermost loop of the
    loop nest.
    If the loop is associated with a loop-associated directive, the expressions are
    evaluated before the construct formed from that directive.
var-outer    The loop iteration variable of a surrounding loop in the loop nest.
C++
range-decl    A declaration of a variable as defined by the base language for range-based for
    loops.
range-expr    An expression that is valid as defined by the base language for range-based for
    loops. It must be invariant with respect to the outermost loop of the loop nest and the
    iterator derived from it must be a random access iterator.
C++
Restrictions
Restrictions to canonical loop nests are as follows:
C / C++
• If test-expr is of the form var relational-op b and relational-op is < or <= then incr-expr must
cause var to increase on each iteration of the loop. If test-expr is of the form var relational-op b
and relational-op is > or >= then incr-expr must cause var to decrease on each iteration of the
loop. Increase and decrease are using the order induced by relational-op.
• If test-expr is of the form ub relational-op var and relational-op is < or <= then incr-expr must
cause var to decrease on each iteration of the loop. If test-expr is of the form ub relational-op
var and relational-op is > or >= then incr-expr must cause var to increase on each iteration of the
loop. Increase and decrease are using the order induced by relational-op.
Cross References
• Loop Transformation Constructs, see Chapter 9
• threadprivate directive, see Section 5.2
Arguments
Name    Type                          Properties
n       expression of integer type    default
Directives
distribute, do, for, loop, simd, taskloop
Semantics
The collapse clause associates one or more loops with the directive on which it appears for the
purpose of identifying the portion of the depth of the canonical loop nest to which to apply the
semantics of the directive. The argument n specifies the number of loops of the associated loop nest
to which to apply those semantics. On all directives on which the collapse clause may appear,
the effect is as if a value of one was specified for n if the collapse clause is not specified.
Restrictions
• n must not evaluate to a value greater than the depth of the associated loop nest.
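A minimal C sketch of the collapse clause follows. The function name fill_grid and the constants ROWS and COLS are assumptions made for this example. The loop nest has depth two and no intervening code, so a value of 2 for n satisfies the restriction above.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 4, COLS = 5 };

/* Fills a ROWS x COLS grid. collapse(2) associates both loops with the
   directive, so its semantics apply to the combined ROWS*COLS iterations
   rather than to the outer loop alone. */
void fill_grid(int grid[ROWS][COLS]) {
    assert(grid != NULL);
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            grid[i][j] = i * COLS + j;
}
```

Omitting the clause would have the effect of collapse(1), in which case only the i loop is associated with the directive.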
Arguments
threadprivate(list)
Name    Type                              Properties
list    list of variable list item type   default
Semantics
The threadprivate directive specifies that variables are replicated, with each thread having its
own copy. Unless otherwise specified, each copy of a threadprivate variable is initialized once, in
the manner specified by the program, but at an unspecified point in the program prior to the first
reference to that copy. The storage of all copies of a threadprivate variable is freed according to
how static variables are handled in the base language, but at an unspecified point in the program.
C++
Each copy of a block-scope threadprivate variable that has a dynamic initializer is initialized the
first time its thread encounters its definition; if its thread does not encounter its definition, its
initialization is unspecified.
C++
The content of a threadprivate variable can change across a task scheduling point if the executing
thread switches to another task that modifies the variable. For more details on task scheduling, see
Section 1.3 and Chapter 12.
In parallel regions, references by the primary thread are to the copy of the variable in the thread
that encountered the parallel region.
During a sequential part, references are to the initial thread’s copy of the variable. The values of
data in the initial thread’s copy of a threadprivate variable are guaranteed to persist between any
Restrictions
The following restrictions apply to any list item that is privatized unless otherwise stated for a given
data-sharing attribute clause:
C++
• A variable of class type (or array thereof) that is privatized requires an accessible, unambiguous
default constructor for the class type.
C++
Arguments
Name                      Type                                             Properties
data-sharing-attribute    Keyword: firstprivate, none, private, shared     default
Directives
parallel, task, taskloop, teams
Semantics
The default clause determines the implicit data-sharing attribute of certain variables that are
referenced in the construct, in accordance with the rules given in Section 5.1.1.
If data-sharing-attribute is not none, the data-sharing attribute of all variables referenced in the
construct that have implicitly determined data-sharing attributes will be data-sharing-attribute. If
data-sharing-attribute is none, the data-sharing attribute is not implicitly determined.
Restrictions
Restrictions to the default clause are as follows:
• If data-sharing-attribute is none, each variable that is referenced in the construct and does not
have a predetermined data-sharing attribute must have its data-sharing attribute explicitly
determined by being listed in a data-sharing attribute clause.
C / C++
• If data-sharing-attribute is firstprivate or private, each variable with static storage
duration that is declared in a namespace or global scope, is referenced in the construct, and does
not have a predetermined data-sharing attribute must have its data-sharing attribute explicitly
determined by being listed in a data-sharing attribute clause.
C / C++
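The first restriction can be illustrated with a short C sketch that uses default(none). The function name dot is an assumption made for this example. Every variable that is referenced in the construct and has no predetermined attribute is listed in a data-sharing attribute clause.

```c
#include <assert.h>

/* Hypothetical dot product that uses default(none). */
double dot(const double *x, const double *y, int n) {
    assert(n >= 0);
    double acc = 0.0;
    /* x, y and n are listed explicitly because default(none) turns off
       implicit determination. acc gets its attribute from the reduction
       clause, and the loop iteration variable i is predetermined private. */
    #pragma omp parallel for default(none) shared(x, y, n) reduction(+:acc)
    for (int i = 0; i < n; i++)
        acc += x[i] * y[i];
    return acc;
}
```

Removing n from the shared clause would make the program non-conforming under the restriction, and compilers typically report this as an error.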
Cross References
• parallel directive, see Section 10.1
• task directive, see Section 12.5
• taskloop directive, see Section 12.6
• teams directive, see Section 10.2
Arguments
Name    Type                              Properties
list    list of variable list item type   default
Directives
parallel, task, taskloop, teams
Semantics
The shared clause declares one or more list items to be shared by tasks generated by the construct
on which it appears. All references to a list item within a task refer to the storage area of the
original variable at the point the directive was encountered.
The programmer must ensure, by adding proper synchronization, that storage shared by an explicit
task region does not reach the end of its lifetime before the explicit task region completes its
execution.
Fortran
The association status of a shared pointer becomes undefined upon entry to and exit from the
construct if it is associated with a target or a subobject of a target that appears as a privatized list
item in a data-sharing attribute clause on the construct. A reference to the shared storage that is
associated with the dummy argument by any other task must be synchronized with the reference to
the procedure to avoid possible data races.
Fortran
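The following C sketch shows one way a shared list item might be used. The function name count_even is an assumption made for this example. Because every thread refers to the storage of the one original variable, the update is protected with an atomic construct here; that is one of several possible ways to avoid a data race.

```c
#include <assert.h>

/* Hypothetical helper: counts the even values in v using a shared counter. */
int count_even(const int *v, int n) {
    assert(n >= 0);
    int hits = 0;
    #pragma omp parallel for shared(hits)
    for (int i = 0; i < n; i++) {
        if (v[i] % 2 == 0) {
            /* All threads update the original variable hits, so the
               increment is made atomic. */
            #pragma omp atomic
            hits++;
        }
    }
    return hits;
}
```

A reduction clause would usually be the more idiomatic choice for a counter; the shared clause is used here only to make the sharing explicit.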
Cross References
• parallel directive, see Section 10.1
• task directive, see Section 12.5
• taskloop directive, see Section 12.6
• teams directive, see Section 10.2
Arguments
Name    Type                              Properties
list    list of variable list item type   default
Directives
distribute, do, for, loop, parallel, scope, sections, simd, single, target,
task, taskloop, teams
Semantics
The private clause specifies that its list items are to be privatized according to Section 5.3. Each
task or SIMD lane that references a list item in the construct receives only one new list item, unless
the construct has one or more associated loops and an order clause that specifies concurrent
is also present.
Restrictions
Restrictions to the private clause are as specified in Section 5.3.
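A short C sketch of the private clause follows. The function name square_all and the scratch variable tmp are assumptions made for this example. The sketch assumes, in line with the privatization rules of Section 5.3, that the new list item does not inherit the value of the original variable, so each thread assigns its copy before reading it.

```c
#include <assert.h>

/* Hypothetical helper: writes the square of each input element to out. */
void square_all(const int *in, int *out, int n) {
    assert(n >= 0);
    int tmp = -1;   /* original variable; its value is not copied into the private copies */
    #pragma omp parallel for private(tmp)
    for (int i = 0; i < n; i++) {
        tmp = in[i];          /* each thread writes its own copy before reading it */
        out[i] = tmp * tmp;
    }
    (void)tmp;      /* the example does not rely on the value of tmp after the loop */
}
```

Without the private clause tmp would be shared, and concurrent writes to it from different threads would race.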
Cross References
• List Item Privatization, see Section 5.3
• distribute directive, see Section 11.6
• do directive, see Section 11.5.2
• for directive, see Section 11.5.1
• loop directive, see Section 11.7
• parallel directive, see Section 10.1
• scope directive, see Section 11.2
• sections directive, see Section 11.3
• simd directive, see Section 10.4
• single directive, see Section 11.1
• target directive, see Section 13.8
• task directive, see Section 12.5
• taskloop directive, see Section 12.6
• teams directive, see Section 10.2
Arguments
Name    Type                              Properties
list    list of variable list item type   default
Directives
distribute, do, for, parallel, scope, sections, single, target, task,
taskloop, teams
Semantics
The firstprivate clause provides a superset of the functionality provided by the private
clause. A list item that appears in a firstprivate clause is subject to the private clause
semantics described in Section 5.4.3, except as noted. In addition, the new list item is initialized
from the original list item that exists before the construct. The initialization of the new list item is
done once for each task that references the list item in any statement in the construct. The
initialization is done prior to the execution of the construct.
For a firstprivate clause on a construct that is not a work-distribution construct, the initial
value of the new list item is the value of the original list item that exists immediately prior to the
construct in the task region where the construct is encountered unless otherwise specified. For a
firstprivate clause on a work-distribution construct, the initial value of the new list item for
each implicit task of the threads that execute the construct is the value of the original list item that
exists in the implicit task immediately prior to the point in time that the construct is encountered
unless otherwise specified.
To avoid data races, concurrent updates of the original list item must be synchronized with the read
of the original list item that occurs as a result of the firstprivate clause.
C / C++
24 For variables of non-array type, the initialization occurs by copy assignment. For an array of
25 elements of non-array type, each element is initialized as if by assignment from an element of the
26 original array to the corresponding element of the new array.
C / C++
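The element-wise initialization described above can be illustrated with a minimal C sketch. The function name weighted_sum and the array w are hypothetical names chosen for illustration; they do not come from the specification. The code gives the same result whether or not OpenMP is enabled at compile time.

```c
#include <assert.h>

/* Hypothetical helper: sums w[i % 3] over i in [0, n).
 * Each thread receives a private copy of w whose elements are
 * initialized from the original array before the construct runs. */
long weighted_sum(int n)
{
    assert(n >= 0);
    int w[3] = {1, 2, 3};   /* original list item */
    long total = 0;

    #pragma omp parallel for firstprivate(w) reduction(+: total)
    for (int i = 0; i < n; ++i) {
        total += w[i % 3];  /* reads the initialized private copy */
    }
    return total;
}
```

Calling weighted_sum(6) adds 1 + 2 + 3 twice. With a plain private clause the copies of w would be uninitialized, so the reads in the loop body would not be well defined.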
C++
For each variable of class type:
• If the firstprivate clause is not on a target construct then a copy constructor is invoked
to perform the initialization; and
• If the firstprivate clause is on a target construct then how many copy constructors, if
any, are invoked is unspecified.
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Modifiers
Name                    Modifies    Type                    Properties
lastprivate-modifier    list        Keyword: conditional    default
Directives
distribute, do, for, loop, sections, simd, taskloop
Semantics
The lastprivate clause provides a superset of the functionality provided by the private
clause. A list item that appears in a lastprivate clause is subject to the private clause
semantics described in Section 5.4.3. In addition, when a lastprivate clause without the
conditional modifier appears on a directive and the list item is not an iteration variable of any
associated loop, the value of each new list item from the sequentially last iteration of the associated
loops, or the lexically last structured block sequence associated with a sections construct, is
assigned to the original list item. When the conditional modifier appears on the clause or the
list item is an iteration variable of one of the associated loops, if sequential execution of the loop
nest would assign a value to the list item then the original list item is assigned the value that the list
item would have after sequential execution of the loop nest.
C++
For class types, the copy assignment operator is invoked. The order in which copy assignment
operators for different variables of the same class type are invoked is unspecified.
C++
C / C++
For an array of elements of non-array type, each element is assigned to the corresponding element
of the original array.
C / C++
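As a minimal C sketch of the lastprivate semantics without the conditional modifier, the following hypothetical function (last_square is an illustrative name, not part of the specification) lets every iteration write its private copy and relies on the clause to copy back the value from the sequentially last iteration.

```c
#include <assert.h>

/* Hypothetical helper: returns (n - 1) * (n - 1) for n > 0.
 * Every iteration assigns its private copy of `last`; only the value
 * from the sequentially last iteration reaches the original variable. */
int last_square(int n)
{
    assert(n > 0);
    int last = -1;  /* overwritten by the lastprivate copy-back */

    #pragma omp parallel for lastprivate(last)
    for (int i = 0; i < n; ++i) {
        last = i * i;
    }
    return last;
}
```

The private copies start uninitialized, as with the private clause, so the sketch assigns last in every iteration before the value is used.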
Fortran
If the original list item does not have the POINTER attribute, its update occurs as if by intrinsic
assignment unless it has a type bound procedure as a defined assignment.
If the original list item has the POINTER attribute, its update occurs as if by pointer assignment.
Fortran
Restrictions
Restrictions to the lastprivate clause are as follows:
• A list item must not appear in a lastprivate clause on a work-distribution construct if the
corresponding region binds to the region of a parallelism-generating construct in which the list
item is private.
• A list item that appears in a lastprivate clause with the conditional modifier must be a
scalar variable.
C++
• A variable of class type (or array thereof) that appears in a lastprivate clause requires an
accessible, unambiguous default constructor for the class type, unless the list item is also
specified in a firstprivate clause.
• A variable of class type (or array thereof) that appears in a lastprivate clause requires an
accessible, unambiguous copy assignment operator for the class type.
• If an original list item in a lastprivate clause on a work-distribution construct has a
reference type then it must bind to the same object for all threads in the binding thread set of the
work-distribution region.
C++
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Modifiers
Name                     Modifies    Type                                    Properties
step-simple-modifier     list        OpenMP integer expression               exclusive, region-invariant, unique
step-complex-modifier    list        Complex, name: step                     unique
                                     Arguments: linear-step, expression of
                                     integer type (region-invariant)
Additional information
list and linear-modifier may instead be specified as linear-modifier(list) for linear clauses that
appear on a declare simd directive. This syntax has been deprecated.
Semantics
The linear clause provides a superset of the functionality provided by the private clause. A
list item that appears in a linear clause is subject to the private clause semantics described in
Section 5.4.3, except as noted. If the step-simple-modifier is specified, the behavior is as if the
step-complex-modifier is instead specified with step-simple-modifier as its linear-step argument. If
linear-step is not specified, it is assumed to be 1.
When a linear clause is specified on a construct, the value of the new list item on each logical
iteration of the associated loops corresponds to the value of the original list item before entering the
construct plus the logical number of the iteration times linear-step. The value corresponding to the
sequentially last logical iteration of the associated loops is assigned to the original list item.
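The relationship between the logical iteration number and the value of a linear list item can be sketched in C as follows. The function pack_even_indices and its parameters are hypothetical names used only for illustration, and the sketch assumes that in and out do not overlap. It uses the step-simple-modifier form with a linear-step of 1.

```c
#include <assert.h>

/* Hypothetical helper: copies in[0], in[2], in[4], ... to out[0], out[1], ...
 * and returns the number of elements written.
 * j is linear with step 1: on logical iteration k its value is 0 + k * 1. */
int pack_even_indices(const int *in, int n, int *out)
{
    assert(n >= 0);
    int j = 0;

    #pragma omp simd linear(j: 1)
    for (int i = 0; i < n; i += 2) {
        out[j] = in[i];
        ++j;  /* changes j by exactly linear-step per logical iteration */
    }
    return j;
}
```

The increment inside the body keeps the difference between the value of j at the end and at the beginning of each logical iteration equal to linear-step, which the restrictions on loop-associated constructs require.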
When a linear clause is specified on a declare simd directive, the list items refer to
parameters of the procedure to which the directive applies. For a given call to the procedure, the
clause determines whether the SIMD version generated by the directive may be called. If the clause
does not specify the ref linear-modifier, the SIMD version requires that the value of the
corresponding argument at the callsite is equal to the value of the argument from the first lane plus
the logical number of the lane times the linear-step. If the clause specifies the ref linear-modifier,
the SIMD version requires that the storage locations of the corresponding arguments at the callsite
from each SIMD lane correspond to locations within a hypothetical array of elements of the same
type, indexed by the logical number of the lane times the linear-step.
Restrictions
Restrictions to the linear clause are as follows:
• Only a loop iteration variable of a loop that is associated with the construct may appear as a list
item in a linear clause if a reduction clause with the inscan modifier also appears on
the construct.
• A linear-modifier may be specified as ref or uval only on a declare simd directive.
• For a linear clause that appears on a loop-associated construct, the difference between the
value of a list item at the end of a logical iteration and its value at the beginning of the logical
iteration must be equal to linear-step.
• If linear-modifier is uval for a list item in a linear clause that is specified on a
declare simd directive and the list item is modified during a call to the SIMD version of the
procedure, the program must not depend on the value of the list item upon return from the
procedure.
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Directives
dispatch, target
Semantics
The is_device_ptr clause indicates that its list items are device pointers. Support for device
pointers created outside of OpenMP, specifically outside of any OpenMP mechanism that returns a
device pointer, is implementation defined.
If the is_device_ptr clause is specified on a target construct, each list item is privatized
inside the construct and the new list item is initialized to the device address to which the original
list item refers.
Fortran
If the is_device_ptr clause is specified on a target construct, if any list item is not of type
C_PTR, the behavior is as if the list item appeared in a has_device_addr clause. Support for
such list items in an is_device_ptr clause is deprecated.
Fortran
Restrictions
Restrictions to the is_device_ptr clause are as follows:
• Each list item must be a valid device pointer for the device data environment.
C
• Each list item must have a type of pointer or array.
C
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Directives
target data
Semantics
C / C++
If a list item that appears in a use_device_ptr clause is a pointer to an object that is mapped to
the device data environment, references to the list item in the structured block that is associated
with the construct on which the clause appears are converted into references to a device pointer that
is local to the structured block and that refers to the device address of the corresponding object. If
the list item does not point to a mapped object, it must contain a valid device address for the target
device, and the list item references are instead converted to references to a local device pointer that
refers to this device address.
C / C++
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Directives
target
Restrictions
Restrictions to the has_device_addr clause are as follows:
• Each list item must have a valid device address for the device data environment.
Cross References
• target directive, see Section 13.8
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Directives
target data
Semantics
If a list item has corresponding storage in the device data environment, references to the list item in
the structured block that is associated with the construct on which the use_device_addr clause
appears are converted into references to the corresponding list item. If the list item is not a mapped
list item, it is assumed to be accessible on the target device. Inside the structured block, the list item
has a device address and its storage may not be accessible from the host device. The list items that
appear in a use_device_addr clause may include array sections.
C / C++
If a list item in a use_device_addr clause is an array section that has a base pointer, the effect
of the clause is to convert the base pointer to a pointer that is local to the structured block and that
contains the device address. This conversion may be elided if the list item was not already mapped.
C / C++
Restrictions
Restrictions to reduction expressions are as follows:
• If execution of a reduction expression results in the execution of an OpenMP construct or an
OpenMP API call, the behavior is unspecified.
C / C++
• If a reduction expression corresponds to a reduction identifier that is used in a target region, a
declare target directive must be specified for any function that can be accessed through the
expression.
C / C++
Fortran
• Any subroutine or function used in a reduction expression must be an intrinsic function, or must
have an accessible interface.
• Any user-defined operator, defined assignment or extended operator used in a reduction
expression must have an accessible interface.
• If any subroutine, function, user-defined operator, defined assignment or extended operator is
used in a reduction expression, it must be accessible to the subprogram in which the
corresponding reduction clause is specified.
• Any subroutine used in a reduction expression must not have any alternate returns appear in the
argument list.
• If the list item in the corresponding reduction clause is an array or array section, any
procedure used in a reduction expression must either be elemental or have dummy arguments that
are scalar.
• Any procedure called in the region of a reduction expression must be pure and may not reference
any host-associated variables.
• If a reduction expression corresponds to a reduction identifier that is used in a target region, a
declare target directive must be specified for any function or subroutine that can be
accessed through the expression.
Fortran
or
subroutine-name(argument-list)
Fortran
In the definition of an initializer expression, the omp_priv special variable identifier refers to the
storage to be initialized. The special variable identifier omp_orig can be used in an initializer
expression to refer to the storage of the original variable to be reduced. The number of times that an
initializer expression is evaluated and the order of these evaluations are unspecified.
C / C++
If an initializer expression is a function name with an argument list, it is evaluated by calling the
function with the specified argument list. Otherwise, an initializer expression specifies how
omp_priv is declared and initialized.
C / C++
Fortran
If an initializer expression is a subroutine name with an argument list, the initializer-expr is
evaluated by calling the subroutine with the specified argument list. If an initializer expression is an
assignment statement, the initializer expression is evaluated by executing the assignment statement.
Fortran
C
The a priori initialization of private copies that are created for reductions follows the rules for
initialization of objects with static storage duration.
C
C / C++
Fortran
Table 5.2 lists each reduction identifier that is implicitly declared for numeric and logical types and
its semantic initializer value. The actual initializer value is that value as expressed in the data type
of the reduction list item.
Fortran
Arguments
Name                Type                              Properties
initializer-expr    expression of initializer type    default
Directives
declare reduction
Semantics
The initializer clause can be used to specify initializer-expr as the initializer expression for a
user-defined reduction.
Cross References
• declare reduction directive, see Section 5.5.11
Restrictions
Restrictions common to reduction clauses are as follows:
• Any array element must be specified at most once in all list items on a directive.
• For a reduction identifier declared in a declare reduction directive, the directive must
appear before its use in a reduction clause.
• If a list item is an array section, it must specify contiguous storage, it cannot be a zero-length
array section and its base expression must be a base language identifier.
• If a list item is an array section or an array element, accesses to the elements of the array outside
the specified array section or array element result in unspecified behavior.
C / C++
• The type of a list item that appears in a reduction clause must be valid for the reduction identifier.
For a max or min reduction in C, the type of the list item must be an allowed arithmetic data
type: char, int, float, double, or _Bool, possibly modified with long, short,
signed, or unsigned. For a max or min reduction in C++, the type of the list item must be
an allowed arithmetic data type: char, wchar_t, int, float, double, or bool, possibly
modified with long, short, signed, or unsigned.
• A list item that appears in a reduction clause must not be const-qualified.
• The reduction identifier for any list item must be unambiguous and accessible.
C / C++
Fortran
• The type, type parameters and rank of a list item that appears in a reduction clause must be valid
for the combiner expression and the initializer expression.
• A list item that appears in a reduction clause must be definable.
• A procedure pointer must not appear in a reduction clause.
• A pointer with the INTENT(IN) attribute must not appear in a reduction clause.
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Modifiers
Name                    Modifies    Type                              Properties
reduction-identifier    list        An OpenMP reduction identifier    required, ultimate
reduction-modifier      list        Keyword: default, inscan, task    default
Directives
do, for, loop, parallel, scope, sections, simd, taskloop, teams
Semantics
The reduction clause is a reduction scoping clause and a reduction participating clause, as
described in Section 5.5.6 and Section 5.5.7. For each list item, a private copy is created for each
implicit task or SIMD lane and is initialized with the initializer value of the reduction-identifier.
After the end of the region, the original list item is updated with the values of the private copies
using the combiner associated with the reduction-identifier.
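The private-copy-then-combine behavior described above can be sketched in C with two implicitly declared reduction identifiers, + and max. The function sum_and_max and its parameters are hypothetical names for illustration only.

```c
#include <assert.h>

/* Hypothetical helper: computes the sum and the maximum of a[0..n-1].
 * Each implicit task works on private copies of `sum` and `hi` that start
 * at the initializer value of the identifier (0 for +, the smallest
 * representable value for max); the copies are combined with the
 * original values after the region ends. */
void sum_and_max(const int *a, int n, long *sum_out, int *max_out)
{
    assert(n > 0);
    long sum = 0;
    int hi = a[0];

    #pragma omp parallel for reduction(+: sum) reduction(max: hi)
    for (int i = 0; i < n; ++i) {
        sum += a[i];
        if (a[i] > hi) {
            hi = a[i];
        }
    }
    *sum_out = sum;
    *max_out = hi;
}
```

Because the original values of sum and hi take part in the final combination, the sketch initializes them to values that are neutral for the intended result.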
If reduction-modifier is not present or the default reduction-modifier is present, the behavior is
as follows. For parallel and worksharing constructs, one or more private copies of each list
Restrictions
Restrictions to the reduction clause are as follows:
• All restrictions common to all reduction clauses, as listed in Section 5.5.5, apply to this clause.
• A list item that appears in a reduction clause on a worksharing construct must be shared in
the parallel region to which a corresponding worksharing region binds.
• If an array section or array element appears as a list item in a reduction clause on a
worksharing construct, all threads of the team must specify the same storage location.
• Each list item specified with the inscan reduction-modifier must appear as a list item in an
inclusive or exclusive clause on a scan directive enclosed by the construct.
• If the inscan reduction-modifier is specified, a reduction clause without the inscan
reduction-modifier must not appear on the same construct.
• A reduction clause with the task reduction-modifier may only appear on a parallel
construct, a worksharing construct or a combined or composite construct for which any of the
aforementioned constructs is a constituent construct and neither simd nor loop are constituent
constructs.
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Modifiers
Name                    Modifies    Type                              Properties
reduction-identifier    list        An OpenMP reduction identifier    required, ultimate
Directives
taskgroup
Semantics
The task_reduction clause is a reduction scoping clause, as described in Section 5.5.6, that
specifies a reduction among tasks. For each list item, the number of copies is unspecified. Any
copies associated with the reduction are initialized before they are accessed by the tasks that
participate in the reduction. After the end of the region, the original list item contains the result of
the reduction.
Restrictions
Restrictions to the task_reduction clause are as follows:
• All restrictions common to all reduction clauses, as listed in Section 5.5.5, apply to this clause.
Cross References
• taskgroup directive, see Section 15.4
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Modifiers
Name                    Modifies    Type                              Properties
reduction-identifier    list        An OpenMP reduction identifier    required, ultimate
Directives
target, task, taskloop
Semantics
The in_reduction clause is a reduction participating clause, as described in Section 5.5.7, that
specifies that a task participates in a reduction. For a given list item, the in_reduction clause
defines a task to be a participant in a task reduction that is defined by an enclosing region for a
matching list item that appears in a task_reduction clause or a reduction clause with
task as the reduction-modifier, where either:
1. The matching list item has the same storage location as the list item in the in_reduction
clause; or
2. A private copy, derived from the matching list item, that is used to perform the task reduction
has the same storage location as the list item in the in_reduction clause.
For the task construct, the generated task becomes the participating task. For each list item, a
private copy may be created as if the private clause had been used.
For the target construct, the target task becomes the participating task. For each list item, a
private copy may be created in the data environment of the target task as if the private clause
had been used. This private copy will be implicitly mapped into the device data environment of the
target device, if the target device is not the parent device.
At the end of the task region, if a private copy was created its value is combined with a copy created
by a reduction scoping clause or with the original list item.
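The pairing of a reduction scoping clause with a reduction participating clause can be sketched in C as follows. Here a taskgroup construct with task_reduction defines the reduction and each explicit task joins it through in_reduction. The function task_sum is a hypothetical name used only for illustration.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 + 2 + ... + n using explicit tasks.
 * The taskgroup scopes the reduction over `sum`; each generated task
 * participates through in_reduction and may update a private copy. */
long task_sum(int n)
{
    assert(n >= 0);
    long sum = 0;

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp taskgroup task_reduction(+: sum)
        {
            for (int i = 1; i <= n; ++i) {
                #pragma omp task in_reduction(+: sum) firstprivate(i)
                sum += i;
            }
        }
        /* After the taskgroup region, sum holds the reduction result. */
    }
    return sum;
}
```

The single construct ensures that only one thread generates the tasks, while any thread of the team may execute them.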
Restrictions
Restrictions to the in_reduction clause are as follows:
• All restrictions common to all reduction clauses, as listed in Section 5.5.5, apply to this clause.
Arguments
declare reduction(reduction-specifier)
Name                   Type                          Properties
reduction-specifier    OpenMP reduction specifier    default
Clauses
initializer
Semantics
The declare reduction directive declares a reduction-identifier that can be used in a
reduction clause as a user-defined reduction. The directive argument reduction-specifier uses the
following syntax:
reduction-identifier : typename-list : combiner
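A minimal C sketch of a user-defined reduction that follows this syntax is shown below. The type min_loc_t, the helper pick_min, the reduction-identifier minloc, and the function find_min are all hypothetical names introduced for illustration. The combiner uses the special identifiers omp_out and omp_in, and the initializer clause sets omp_priv from omp_orig.

```c
#include <assert.h>

/* Hypothetical value/index pair used as the reduction type. */
typedef struct {
    int value;
    int index;
} min_loc_t;

/* Returns the pair with the smaller value; ties go to the smaller index. */
static min_loc_t pick_min(min_loc_t a, min_loc_t b)
{
    if (b.value < a.value || (b.value == a.value && b.index < a.index)) {
        return b;
    }
    return a;
}

/* reduction-identifier : typename-list : combiner, plus an initializer. */
#pragma omp declare reduction(minloc : min_loc_t : omp_out = pick_min(omp_out, omp_in)) \
        initializer(omp_priv = omp_orig)

/* Hypothetical helper: finds the minimum of a[0..n-1] and its first index. */
min_loc_t find_min(const int *a, int n)
{
    assert(n > 0);
    min_loc_t best = { a[0], 0 };

    #pragma omp parallel for reduction(minloc : best)
    for (int i = 1; i < n; ++i) {
        min_loc_t cand = { a[i], i };
        best = pick_min(best, cand);
    }
    return best;
}
```

Initializing each private copy from omp_orig keeps every partial result a valid candidate, so the order in which the copies are combined does not affect the outcome.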
Separated directives
do, for, simd
Clauses
exclusive, inclusive
Clause set
Properties: unique, required, exclusive
Members: exclusive, inclusive
Semantics
The scan directive separates the final-loop-body of an enclosing simd construct or
worksharing-loop construct (or a composite construct that combines them) into a structured block
sequence that serves as an input phase and a structured block sequence that serves as a scan phase.
The input phase contains all computations that update the list item in the iteration, and the scan
phase ensures that any statement that reads the list item uses the result of the scan computation for
that iteration. Thus, it specifies that a scan computation updates each list item on each logical
iteration of the enclosing loop nest that is associated with the separated directive.
If the inclusive clause is specified, the input phase includes the preceding structured block
sequence and the scan phase includes the following structured block sequence and, thus, the
directive specifies that an inclusive scan computation is performed for each list item of list. If the
exclusive clause is specified, the input phase excludes the preceding structured block sequence
and instead includes the following structured block sequence, while the scan phase includes the
preceding structured block sequence and, thus, the directive specifies that an exclusive scan
computation is performed for each list item of list.
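The placement of the input phase and the scan phase for the two clauses can be sketched in C as follows. The function prefix_sums and its parameters are hypothetical names for illustration, and the sketch assumes a compiler that implements the scan directive on simd constructs.

```c
#include <assert.h>

/* Hypothetical helper: writes the inclusive and exclusive prefix sums of
 * a[0..n-1] into inclusive_out and exclusive_out. */
void prefix_sums(const int *a, int n, int *inclusive_out, int *exclusive_out)
{
    assert(n >= 0);
    int run = 0;

    /* Inclusive scan: the input phase precedes the directive. */
    #pragma omp simd reduction(inscan, +: run)
    for (int i = 0; i < n; ++i) {
        run += a[i];                /* input phase */
        #pragma omp scan inclusive(run)
        inclusive_out[i] = run;     /* scan phase */
    }

    run = 0;

    /* Exclusive scan: the input phase follows the directive. */
    #pragma omp simd reduction(inscan, +: run)
    for (int i = 0; i < n; ++i) {
        exclusive_out[i] = run;     /* scan phase */
        #pragma omp scan exclusive(run)
        run += a[i];                /* input phase */
    }
}
```

For the input 1, 2, 3, 4 the inclusive result is 1, 3, 6, 10 and the exclusive result is 0, 1, 3, 6. The list item must carry the inscan reduction-modifier on the enclosing construct.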
The result of a scan computation for a given iteration is calculated according to the last generalized
prefix sum ($\mathrm{PRESUM}_{\mathrm{last}}$) applied over the sequence of values given by the original value of the list
item prior to the loop and all preceding updates to the list item in the logical iteration space of the
loop. The operation $\mathrm{PRESUM}_{\mathrm{last}}(op, a_1, \ldots, a_N)$ is defined for a given binary operator $op$ and a
sequence of $N$ values $a_1, \ldots, a_N$ as follows:
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Directives
scan
Semantics
The inclusive clause is used on a separating directive that separates a structured block into two
structured block sequences. The clause determines the association of the structured block sequence
that precedes the directive on which the clause appears to a phase of that directive.
The list items that appear in an inclusive clause may include array sections.
Cross References
• scan directive, see Section 5.6
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Directives
scan
Cross References
• scan directive, see Section 5.6
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Directives
parallel
Semantics
The copyin clause provides a mechanism to copy the value of a threadprivate variable of the
primary thread to the threadprivate variable of each other member of the team that is executing the
parallel region.
C / C++
The copy is performed after the team is formed and prior to the execution of the associated
structured block. For variables of non-array type, the copy is by copy assignment. For an array of
elements of non-array type, each element is copied as if by assignment from an element of the array
of the primary thread to the corresponding element of the array of all other threads.
C / C++
C++
For class types, the copy assignment operator is invoked. The order in which copy assignment
operators for different variables of the same class type are invoked is unspecified.
C++
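The copyin behavior can be sketched in C with a threadprivate variable that is set by the encountering thread and then checked by every member of the team. The variable seed and the function all_threads_see_seed are hypothetical names for illustration, and the sketch assumes the function is called from outside any parallel region.

```c
#include <assert.h>

/* Hypothetical threadprivate variable: each thread has its own copy. */
static int seed = 0;
#pragma omp threadprivate(seed)

/* Returns 1 if every thread in the team observes `value` in its own
 * copy of `seed` after the copyin, and 0 otherwise. */
int all_threads_see_seed(int value)
{
    int ok = 1;
    seed = value;  /* updates only the encountering thread's copy */

    #pragma omp parallel copyin(seed) reduction(&&: ok)
    {
        ok = ok && (seed == value);
    }
    return ok;
}
```

Without the copyin clause, the copies of seed that belong to the other threads would keep whatever value they held before the region, so the check could fail.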
Arguments
Name    Type                               Properties
list    list of variable list item type    default
Directives
single
Semantics
The copyprivate clause provides a mechanism to use a private variable to broadcast a value
from the data environment of one implicit task to the data environments of the other implicit tasks
that belong to the parallel region. The effect of the copyprivate clause on the specified list
items occurs after the execution of the structured block associated with the associated construct,
and before any of the threads in the team have left the barrier at the end of the construct. To avoid
data races, concurrent reads or updates of the list item must be synchronized with the update of the
list item that occurs as a result of the copyprivate clause if, for example, the nowait clause is
used to remove the barrier.
C / C++
In all other implicit tasks that belong to the parallel region, each specified list item becomes defined
with the value of the corresponding list item in the implicit task associated with the thread that
executed the structured block. For variables of non-array type, the definition occurs by copy
assignment. For an array of elements of non-array type, each element is copied by copy assignment
from an element of the array in the data environment of the implicit task that is associated with the
thread that executed the structured block to the corresponding element of the array in the data
environment of the other implicit tasks.
C / C++
C++
For class types, a copy assignment operator is invoked. The order in which copy assignment
operators for different variables of class type are called is unspecified.
C++
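The broadcast performed by copyprivate can be sketched in C as follows. The function broadcast_mismatches and the variable token are hypothetical names for illustration. One thread assigns its private token inside the single construct, and every thread then checks its own copy.

```c
#include <assert.h>

/* Hypothetical helper: returns how many threads did NOT receive `value`
 * in their private `token` after the single construct completed. */
int broadcast_mismatches(int value)
{
    int mismatches = 0;

    #pragma omp parallel reduction(+: mismatches)
    {
        int token = 0;  /* private: declared inside the parallel region */

        #pragma omp single copyprivate(token)
        {
            token = value;  /* executed by exactly one thread */
        }

        /* The implicit barrier of single has passed; all copies are set. */
        if (token != value) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

The sketch keeps the barrier at the end of the single construct, so no additional synchronization is needed before the threads read token.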
Fortran
If a list item does not have the POINTER attribute, then in all other implicit tasks that belong to the
parallel region, the list item becomes defined as if by intrinsic assignment with the value of the
corresponding list item in the implicit task that is associated with the thread that executed the
structured block. If the list item has a type bound procedure as a defined assignment, the
assignment is performed by the defined assignment.
Clauses
from, map, to
Mapper identifiers can be used to uniquely identify the mapper used in a map or data-motion clause
through a mapper modifier, which is a unique, complex modifier. A declare mapper directive
defines a mapper identifier that can later be specified in a mapper modifier as its
modifier-parameter-specification. Each mapper identifier is a base-language identifier or default
where default is the default mapper for all types.
A non-structure type T has a predefined default mapper that is defined as if by the following
declare mapper directive:
C / C++
#pragma omp declare mapper(T v) map(tofrom: v)
C / C++
Cross References
• from clause, see Section 5.9.2
• map clause, see Section 5.8.3
• to clause, see Section 5.9.1
Arguments
Name            Type                              Properties
locator-list    list of locator list item type    default
Modifiers
Name                 Modifies        Type                                       Properties
map-type-modifier    locator-list    Keyword: always, close, present            default
mapper               locator-list    Complex, name: mapper                      unique
                                     Arguments: mapper-identifier, OpenMP
                                     identifier (default)
iterator             locator-list    Complex, name: iterator                    unique
                                     Arguments: iterator-specifier, OpenMP
                                     expression (repeatable)
Additional information
The commas that separate modifiers in a map clause are optional. The specification of modifiers
without comma separators for the map clause has been deprecated.
Semantics
The map clause specifies how an original list item is mapped from the current task’s data
environment to a corresponding list item in the device data environment of the device identified by
the construct. If a map-type is not specified, the map-type defaults to tofrom. The map clause is
map-entering if the map-type is to, tofrom or alloc. The map clause is map-exiting if the
map-type is from, tofrom, release or delete.
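A minimal C sketch of map-entering and map-exiting clauses on a target construct follows. The function scale_on_device and its parameters are hypothetical names for illustration. The sketch assumes that in and out do not overlap. When no offload device is available, the region is expected to execute on the host.

```c
#include <assert.h>

/* Hypothetical helper: out[i] = factor * in[i] for i in [0, n).
 * map(to: ...) is map-entering only: the input is copied to the device.
 * map(from: ...) is map-exiting only: the output is copied back on exit. */
void scale_on_device(const float *in, float *out, int n, float factor)
{
    assert(n >= 0);

    #pragma omp target teams distribute parallel for \
            map(to: in[0:n]) map(from: out[0:n])
    for (int i = 0; i < n; ++i) {
        out[i] = factor * in[i];
    }
}
```

Omitting the map-type on either clause would make it default to tofrom, which is both map-entering and map-exiting and would copy each array section in both directions.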
The list items that appear in a map clause may include array sections and structure elements. A list
item in a map clause may reference any iterator-identifier defined in its iterator modifier. A list
item may appear more than once in the map clauses that are specified on the same directive.
If a mapper modifier is not present, the behavior is as if a mapper modifier was specified with the
default parameter. The map behavior of a list item in a map clause is modified by a visible
user-defined mapper (see Section 5.8.8) if the mapper-identifier of the mapper modifier is defined
for a base-language type that matches the type of the list item. Otherwise, the predefined default
mapper for the type of the list item applies. The effect of the mapper is to remove the list item from
the map clause, if the present modifier does not also appear, and to apply the clauses specified in
the declared mapper to the construct on which the map clause appears. In the clauses applied by the
mapper, references to var are replaced with references to the list item and the map-type is replaced
with a final map type that is determined according to the rules of map-type decay (see
Section 5.8.8).
A list item that is an array or array section of a type for which a user-defined mapper exists is
mapped as if the map type decays to alloc, release, or delete, and then each array element
is mapped with the original map type, as if by a separate construct, according to the mapper.
Fortran
If a component of a derived type list item is a map clause list item that results from the predefined
default mapper for that derived type, and if the derived type component is not an explicit list item or
the base expression of an explicit list item in a map clause on the construct, then:
• If it has the POINTER attribute, the map clause treats its association status as if it is undefined;
and
• If it has the ALLOCATABLE attribute and an allocated allocation status, and it is present in the
device data environment when the construct is encountered, the map clause may treat its
allocation status as if it is unallocated if the corresponding component does not have allocated
storage.
Note – If the effect of the map clauses on a construct would assign the value of an original list
item to a corresponding list item more than once, then an implementation is allowed to ignore
additional assignments of the same value to the corresponding list item.

In all cases on entry to the region, concurrent reads or updates of any part of the corresponding list
item must be synchronized with any update of the corresponding list item that occurs as a result of
the map clause to avoid data races.
The original and corresponding list items may share storage such that writes to either item by one
task followed by a read or write of the other item by another task without intervening
synchronization can result in data races. They are guaranteed to share storage if the map clause
appears on a target construct that corresponds to an inactive target region, or if it appears on
a mapping-only construct that applies to the device data environment of the host device.
If corresponding storage for a mappable storage block derived from map clauses on a map-exiting
construct is not present in the device data environment on exit from the region, the mappable
storage block is ignored. For each mappable storage block that is determined by the map clauses on
a map-exiting construct, on exit from the region the following sequence of steps occurs as if
performed as a single atomic operation:
Tool Callbacks
A thread dispatches one or more registered ompt_callback_target_map or
ompt_callback_target_map_emi callbacks for each occurrence of a target-map event in
that thread. The callback occurs in the context of the target task and has type signature
ompt_callback_target_map_t or ompt_callback_target_map_emi_t,
respectively.
A thread dispatches a registered ompt_callback_target_data_op_emi callback with
ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
event in that thread. Similarly, a thread dispatches a registered
ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
type signature ompt_callback_target_data_op_emi_t.
4 Restrictions
5 Restrictions to the map clause are as follows:
6 • Two list items of the map clauses on the same construct must not share original storage unless
7 they are the same list item or unless one is the containing structure of the other.
8 • If the same list item appears more than once in map clauses on the same construct, the map
9 clauses must specify the same mapper modifier.
10 • If a list item is an array section, it must specify contiguous storage.
11 • If an expression that is used to form a list item in a map clause contains an iterator identifier, the
12 list item instances that would result from different values of the iterator must not have the same
13 containing array and must not have base pointers that share original storage.
14 • If multiple list items are explicitly mapped on the same construct and have the same containing
15 array or have base pointers that share original storage, and if any of the list items do not have
16 corresponding list items that are present in the device data environment prior to a task
17 encountering the construct, then the list items must refer to the same array elements of either the
18 containing array or the implicit array of the base pointers.
19 • If any part of the original storage of a list item with an explicit data-mapping attribute has
20 corresponding storage in the device data environment prior to a task encountering the construct
21 associated with the map clause, all of the original storage must have corresponding storage in the
22 device data environment prior to the task encountering the construct.
23 • If an array appears as a list item in a map clause, multiple parts of the array have corresponding
24 storage in the device data environment prior to a task encountering the construct associated with
25 the map clause, and the corresponding storage for those parts was created by maps from more
26 than one earlier construct, the behavior is unspecified.
27 • If a list item is an element of a structure, and a different element of the structure has a
28 corresponding list item in the device data environment prior to a task encountering the construct
29 associated with the map clause, then the list item must also have a corresponding list item in the
30 device data environment prior to the task encountering the construct.
31 • A list item must have a mappable type.
32 • If a mapper modifier appears in a map clause, the type on which the specified mapper operates
33 must match the type of the list items in the clause.
34 • Memory spaces and memory allocators must not appear as a list item in a map clause.
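As an illustration of the array-section restrictions above, the following C sketch maps two contiguous array sections that do not share original storage. The function name copy_scaled and its parameters are hypothetical and serve only this example; the caller is assumed to pass non-overlapping buffers with n greater than zero.

```c
#include <stddef.h>

/* Hypothetical helper: stores factor * in[i] into out[i] inside a target
 * region. in[0:n] and out[0:n] are contiguous array sections; the caller
 * is assumed to pass buffers that do not overlap, so the two list items
 * do not share original storage. */
void copy_scaled(const double *in, double *out, size_t n, double factor)
{
    #pragma omp target map(to: in[0:n]) map(from: out[0:n])
    for (size_t i = 0; i < n; ++i)
        out[i] = factor * in[i];
}
```

If the code is compiled without OpenMP support, the pragma is ignored and the loop runs on the host with the same result.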
17 Arguments
Name    Type                               Properties
list    list of extended list item type    default
19 Directives
20 declare target
21 Additional information
22 The clause-name to may be used as a synonym for the clause-name enter. This use has been
23 deprecated.
24 Semantics
25 The enter clause is a data-mapping clause.
C / C++
26 If a function appears in an enter clause in the same compilation unit in which the definition of the
27 function occurs then a device-specific version of the function is created for all devices to which the
28 directive of the clause applies.
15 Cross References
16 • declare target directive, see Section 7.8.1
19 Arguments
Name    Type                               Properties
list    list of variable list item type    default
21 Directives
22 declare target
23 Semantics
24 The link clause supports compilation of device routines that refer to variables with static storage
25 duration that appear as list items in the clause. The declare target directive on which the
26 clause appears does not map the list items. Instead, they are mapped according to the data-mapping
27 rules described in Section 5.8.
28 Cross References
29 • Data-Mapping Control, see Section 5.8
30 • declare target directive, see Section 7.8.1
7 Arguments
Name               Type                                      Properties
implicit-behavior  Keyword: alloc, default, firstprivate,    default
                   from, none, present, to, tofrom
9 Modifiers
Name               Modifies           Type                                   Properties
variable-category  implicit-behavior  Keyword: aggregate, all, allocatable,  default
                                      pointer, scalar
11 Directives
12 target
13 Semantics
14 The defaultmap clause determines the implicit data-mapping or data-sharing attribute of certain
15 variables that are referenced in a target construct, in accordance with the rules given in
16 Section 5.8.1. The variable-category specifies the variables for which the attribute may be set, and
17 the attribute is specified by implicit-behavior. If no variable-category is specified in the clause then
18 the effect is as if all was specified for the variable-category.
C / C++
19 The scalar variable-category specifies non-pointer variables of scalar type.
C / C++
Fortran
20 The scalar variable-category specifies non-pointer and non-allocatable variables of scalar type.
21 The allocatable variable-category specifies variables with the ALLOCATABLE attribute.
Fortran
12 Restrictions
13 Restrictions to the defaultmap clause are as follows:
14 • A given variable-category may be specified in at most one defaultmap clause on a construct.
15 • If a defaultmap clause specifies the all variable-category, no other defaultmap clause
16 may appear on the construct.
17 • If implicit-behavior is none, each variable that is specified by variable-category and is
18 referenced in the construct but does not have a predetermined data-sharing attribute and does
19 not appear in an enter or link clause on a declare target directive must be explicitly listed
20 in a data-environment attribute clause on the construct.
C / C++
21 • The specified variable-category must not be allocatable.
C / C++
22 Cross References
23 • Implicit Data-Mapping Attribute Rules, see Section 5.8.1
24 • target directive, see Section 13.8
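The effect of the defaultmap clause on scalar variables can be sketched in C as follows. The sketch assumes the usual implicit rule that a scalar referenced in a target construct is treated as firstprivate unless a defaultmap clause says otherwise, so the update to total would not be visible after the region without the clause. The function name sum_on_device is hypothetical, and n is assumed to be greater than zero.

```c
#include <stddef.h>

/* Hypothetical helper: sums n integers inside a target region.
 * defaultmap(tofrom: scalar) gives every implicitly handled scalar
 * (here total and n) a tofrom map type, so the final value of total
 * is copied back to the original variable when the region ends. */
long sum_on_device(const int *a, size_t n)
{
    long total = 0;
    #pragma omp target defaultmap(tofrom: scalar) map(to: a[0:n])
    {
        for (size_t i = 0; i < n; ++i)
            total += a[i];
    }
    return total;
}
```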
27 Arguments
28 declare mapper(mapper-specifier)
Name              Type                       Properties
mapper-specifier  OpenMP mapper specifier    default
30 Clauses
31 map
25 A list item in a map clause that appears on a declare mapper directive may include array
26 sections.
12 Restrictions
13 Restrictions to data-motion clauses are as follows:
14 • Each list item in a data-motion clause must have a mappable type.
15 Cross References
16 • Array Sections, see Section 3.2.5
17 • Array Shaping, see Section 3.2.4
18 • declare mapper directive, see Section 5.8.8
19 • device clause, see Section 13.2
20 • from clause, see Section 5.9.2
21 • iterator modifier, see Section 3.2.6
22 • target update directive, see Section 13.9
23 • to clause, see Section 5.9.1
24 5.9.1 to Clause
25 Name: to Properties: data-motion attribute
26 Arguments
Name          Type                              Properties
locator-list  list of locator list item type    default
3 Directives
4 target update
5 Semantics
6 The to clause is a data-motion clause that specifies movement to the targeted devices from the
7 encountering device, so the corresponding list items are the assigned list items and the
8 compatible map types are to and tofrom.
9 Cross References
10 • iterator modifier, see Section 3.2.6
11 • target update directive, see Section 13.9
14 Arguments
Name          Type                              Properties
locator-list  list of locator list item type    default
3 Directives
4 target update
5 Semantics
6 The from clause is a data-motion clause that specifies movement from the targeted devices to the
7 encountering device, so the original list items are the assigned list items and the compatible
8 map types are from and tofrom.
9 Cross References
10 • iterator modifier, see Section 3.2.6
11 • target update directive, see Section 13.9
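The direction of the to and from clauses can be sketched with a target update directive inside a target data region. The function name scale_bump_scale is hypothetical; the sketch computes (x[i] * f + 1) * f for each element and yields the same values whether or not the device shares storage with the host.

```c
/* Hypothetical helper: alternates device and host updates of x[0:n]. */
void scale_bump_scale(double *x, int n, double f)
{
    #pragma omp target data map(to: x[0:n])
    {
        /* x[0:n] is already present, so this map performs no copy. */
        #pragma omp target map(to: x[0:n])
        for (int i = 0; i < n; ++i)
            x[i] *= f;

        /* from: the original (host) list items are the assigned items. */
        #pragma omp target update from(x[0:n])

        for (int i = 0; i < n; ++i)
            x[i] += 1.0;

        /* to: the corresponding (device) list items are the assigned items. */
        #pragma omp target update to(x[0:n])

        #pragma omp target map(to: x[0:n])
        for (int i = 0; i < n; ++i)
            x[i] *= f;

        /* Fetch the final device values before the data region ends. */
        #pragma omp target update from(x[0:n])
    }
}
```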
14 Arguments
Name            Type                                Properties
parameter-list  list of parameter list item type    default
16 Directives
17 declare simd
18 Semantics
19 The uniform clause declares one or more arguments to have an invariant value for all concurrent
20 invocations of the function in the execution of a single SIMD loop.
21 Cross References
22 • declare simd directive, see Section 7.7
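A minimal C sketch of the uniform clause follows. The names scaled_at and sum_scaled are hypothetical. The arguments coeff and scale have the same value for every concurrent invocation in the SIMD loop, while i advances by one per iteration and is therefore declared linear.

```c
/* coeff and scale are invariant across the concurrent invocations that
 * arise from one SIMD loop; i increases by 1 from one lane to the next. */
#pragma omp declare simd uniform(coeff, scale) linear(i: 1)
float scaled_at(const float *coeff, float scale, int i)
{
    return coeff[i] * scale;
}

/* Hypothetical caller: sums scale * coeff[i] over a SIMD loop. */
float sum_scaled(const float *coeff, float scale, int n)
{
    float total = 0.0f;
    #pragma omp simd reduction(+: total)
    for (int i = 0; i < n; ++i)
        total += scaled_at(coeff, scale, i);
    return total;
}
```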
3 Arguments
Name    Type                               Properties
list    list of variable list item type    default
5 Modifiers
Name       Modifies  Type                       Properties
alignment  list      OpenMP integer expression  positive, region invariant,
                                                ultimate, unique
7 Directives
8 declare simd, simd
9 Semantics
C / C++
10 The aligned clause declares that the object to which each list item points is aligned to the
11 number of bytes expressed in alignment.
C / C++
Fortran
12 The aligned clause declares that the target of each list item is aligned to the number of bytes
13 expressed in alignment.
Fortran
14 The alignment modifier specifies the alignment that the program ensures related to the list items. If
15 the alignment modifier is not specified, implementation-defined default alignments for SIMD
16 instructions on the target platforms are assumed.
17 Restrictions
18 Restrictions to the aligned clause are as follows:
C
19 • The type of list items must be array or pointer.
C
C++
20 • The type of list items must be array, pointer, reference to array, or reference to pointer.
C++
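The aligned clause is a promise made by the program, so a sketch has to guarantee the alignment itself. The following C example does so with C11 _Alignas on two local arrays; the function name dot_aligned and the macro DOT_LEN are hypothetical.

```c
#define DOT_LEN 64

/* Hypothetical helper: dot product of two locally built arrays. Both
 * arrays are declared with 32-byte alignment, which makes the
 * aligned(a, b: 32) assertion on the simd construct true. */
float dot_aligned(void)
{
    _Alignas(32) float a[DOT_LEN];
    _Alignas(32) float b[DOT_LEN];
    for (int i = 0; i < DOT_LEN; ++i) {
        a[i] = (float)i;
        b[i] = 2.0f;
    }

    float sum = 0.0f;
    #pragma omp simd aligned(a, b: 32) reduction(+: sum)
    for (int i = 0; i < DOT_LEN; ++i)
        sum += a[i] * b[i];
    return sum;
}
```

Stating an alignment that the data does not actually have makes the program non-conforming, which is why the sketch controls the declarations rather than accepting arbitrary pointers.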
13 Restrictions
14 Restrictions to OpenMP memory spaces are as follows:
15 • Variables in the omp_const_mem_space memory space may not be written.
1 6.2 Memory Allocators
2 OpenMP memory allocators can be used by a program to make allocation requests. When a
3 memory allocator receives a request to allocate storage of a certain size, an allocation of logically
4 consecutive memory in the resources of its associated memory space of at least the size that was
5 requested will be returned if possible. This allocation will not overlap with any other existing
6 allocation from an OpenMP memory allocator.
7 The behavior of the allocation process can be affected by the allocator traits that the user specifies.
8 Table 6.2 shows the allowed allocator traits, their possible values and the default value of each trait.
14 Arguments
Name       Type                        Properties
alignment  expression of integer type  constant, positive
16 Directives
17 allocate
8 Restrictions
9 Restrictions to the align clause are as follows:
10 • alignment must evaluate to a power of two.
11 Cross References
12 • Memory Allocators, see Section 6.2
13 • allocate directive, see Section 6.5
16 Arguments
Name       Type                                 Properties
allocator  expression of allocator_handle type  default
18 Directives
19 allocate
20 Semantics
21 The allocator clause specifies the memory allocator to be used for allocations associated with
22 the construct on which the clause appears. Specifically, the allocator to which allocator evaluates is
23 used for the allocations. On constructs on which the clause may appear, if it is not specified then the
24 effect is as if it was specified with the value of the def-allocator-var ICV.
25 Cross References
26 • Memory Allocators, see Section 6.2
27 • allocate directive, see Section 6.5
28 • def-allocator-var ICV, see Table 2.1
3 Arguments
Name    Type                               Properties
list    list of variable list item type    default
5 Clauses
6 align, allocator
7 Semantics
8 The storage for each list item that appears in the allocate directive is provided an allocation
9 through the memory allocator as determined by the allocator clause with an alignment as
10 determined by the align clause. The scope of this allocation is that of the list item in the base
11 language. At the end of the scope for a given list item the memory allocator used to allocate that list
12 item deallocates the storage.
13 For allocations that arise from this directive the null_fb value of the fallback allocator trait
14 behaves as if the abort_fb had been specified.
15 Restrictions
16 Restrictions to the allocate directive are as follows:
17 • A variable that is part of another variable (as an array element or a structure element) cannot
18 appear in an allocate directive.
19 • An allocate directive must appear in the same scope as the declarations of each of its list
20 items and must follow all such declarations.
21 • A declared variable may appear as a list item in at most one allocate directive in a given
22 compilation unit.
23 • allocate directives that appear in a target region must specify an allocator clause
24 unless a requires directive with the dynamic_allocators clause is present in the same
25 compilation unit.
C / C++
26 • If a list item has static storage duration, the allocator clause must be specified and the
27 allocator expression in the clause must be a constant expression that evaluates to one of the
28 predefined memory allocator values.
29 • A variable that is declared in a namespace or global scope may only appear as a list item in an
30 allocate directive if an allocate directive that lists the variable follows a declaration that
31 defines the variable and if all allocate directives that list it specify the same allocator.
C / C++
3 Arguments
Name    Type                               Properties
list    list of variable list item type    default
5 Modifiers
Name                        Modifies  Type                                  Properties
allocator-simple-modifier   list      expression of OpenMP                  exclusive, unique
                                      allocator_handle type
allocator-complex-modifier  list      Complex, name: allocator              unique
                                      Arguments: allocator, expression of
                                      allocator_handle type (default)
7 Directives
8 allocators, distribute, do, for, parallel, scope, sections, single, target,
9 task, taskgroup, taskloop, teams
10 Semantics
11 The allocate clause specifies the memory allocator to be used to obtain storage for a list of
12 variables. If a list item in the clause also appears in a data-sharing attribute clause on the same
13 directive that privatizes the list item, allocations that arise from that list item in the clause will be
14 provided by the memory allocator. If the allocator-simple-modifier is specified, the behavior is as if
15 the allocator-complex-modifier is instead specified with allocator-simple-modifier as its allocator
16 argument. The allocator-complex-modifier and align-modifier have the same syntax and semantics
17 for the allocate clause as the allocator and align clauses have for the allocate
18 directive.
19 For allocations that arise from this clause the null_fb value of the fallback allocator trait behaves
20 as if the abort_fb had been specified.
11 Cross References
12 • Memory Allocators, see Section 6.2
13 • align clause, see Section 6.3
14 • allocator clause, see Section 6.4
15 • allocators directive, see Section 6.7
16 • distribute directive, see Section 11.6
17 • do directive, see Section 11.5.2
18 • for directive, see Section 11.5.1
19 • parallel directive, see Section 10.1
20 • scope directive, see Section 11.2
21 • sections directive, see Section 11.3
22 • single directive, see Section 11.1
23 • target directive, see Section 13.8
24 • task directive, see Section 12.5
25 • taskgroup directive, see Section 15.4
26 • taskloop directive, see Section 12.6
27 • teams directive, see Section 10.2
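The interaction between the allocate clause and a privatizing data-sharing clause can be sketched in C as follows. The function name sum_of_squares_alloc is hypothetical. The sketch uses the simple modifier form with the predefined allocator omp_default_mem_alloc, which omp.h declares; the include is guarded so that the code also builds when OpenMP is disabled and the pragma is ignored.

```c
#ifdef _OPENMP
#include <omp.h>   /* declares the predefined allocator omp_default_mem_alloc */
#endif

/* Hypothetical helper: scratch is privatized by the private clause, and
 * the allocate clause requests that the storage of each private copy be
 * obtained from omp_default_mem_alloc. */
long sum_of_squares_alloc(int n)
{
    long total = 0;
    long scratch = 0;
    #pragma omp parallel for private(scratch) \
        allocate(omp_default_mem_alloc: scratch) reduction(+: total)
    for (int i = 1; i <= n; ++i) {
        scratch = (long)i * i;
        total += scratch;
    }
    return total;
}
```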
3 Clauses
4 allocate
5 Additional information
6 The allocators construct may alternatively be expressed as one or more allocate directives
7 that precede the allocator structured block. The syntax of these directives is as described in
8 Section 6.5, except that the list directive argument is optional. If a list argument is not specified, the
9 effect is as if there is an implicit list consisting of the names of each variable to be allocated in the
10 associated allocate-stmt that is not explicitly listed in another allocate directive associated with
11 the statement. allocate directives are semantically equivalent to an allocators directive that
12 specifies OpenMP allocators and the variables to which they apply in one or more allocate
13 clauses, and restricted uses of the allocators directive imply that equivalent uses of
14 allocate directives are also restricted. If the allocate directive is used, an allocator will be
15 used to allocate all variables even if they are not explicitly listed. This alternate syntax has been
16 deprecated.
17 Semantics
18 The allocators construct specifies that OpenMP memory allocators are used for certain
19 variables that are allocated by the associated allocate-stmt. If a variable that is to be allocated
20 appears as a list item in an allocate clause on the directive, an OpenMP allocator is used to
21 allocate storage for the variable according to the semantics of the allocate clause. If a variable
22 that is to be allocated does not appear as a list item in an allocate clause, the allocation is
23 performed according to the base language implementation.
24 Restrictions
25 Restrictions to the allocators construct are as follows:
26 • A list item that appears in an allocate clause must appear as one of the variables that is
27 allocated by the allocate-stmt in the associated allocator structured block.
28 Additional restrictions to the (deprecated) allocate directive when it is associated with an
29 allocator structured block are as follows:
30 • If a list is specified, the directive must be preceded by an executable statement or OpenMP
31 construct.
32 • If multiple allocate directives are associated with an allocator structured block, at most one
33 directive may specify no list items.
8 Arguments
Name       Type                                 Properties
allocator  expression of allocator_handle type  default
10 Modifiers
Name       Modifies  Type                                      Properties
mem-space  Generic   Complex, name: memspace                   default
                     Arguments: memspace-handle, expression
                     of memspace_handle type (default)
12 Directives
13 target
14 Additional information
15 The comma-separated list syntax, in which each list item is a clause-argument-specification of the
16 form allocator[(traits)] may also be used for the uses_allocators clause arguments. With
17 this syntax, traits must be a constant array with constant values. This syntax has been deprecated.
12 Restrictions
13 • The allocator expression must be a base language identifier.
14 • If allocator is a predefined allocator, no modifiers may be specified.
15 • If allocator is not a predefined allocator, it must be a variable.
16 • The allocator argument must not appear in other data-sharing attribute clauses or data-mapping
17 attribute clauses on the same construct.
18 • The traits argument for the traits-array modifier must be a constant array, have constant values
19 and be defined in the same scope as the construct on which the clause appears.
20 • The memspace-handle argument for the mem-space modifier must be an identifier that matches
21 one of the predefined memory space names.
22 Cross References
23 • Memory Allocators, see Section 6.2
24 • Memory Spaces, see Section 6.1
25 • omp_destroy_allocator, see Section 18.13.3
26 • omp_init_allocator, see Section 18.13.2
27 • target directive, see Section 13.8
1 The device set includes traits that define the characteristics of the device being targeted by the
2 compiler at that point in the program. For each target device that the implementation supports, a
3 target_device set exists that defines the characteristics of that device. At least the following traits
4 must be defined for the device and all target_device sets:
5 • The kind(kind-name-list) trait specifies the general kind of the device. The following kind-name
6 values are defined:
7 – host, which specifies that the device is the host device;
8 – nohost, which specifies that the device is not the host device; and
9 – the values defined in the OpenMP Additional Definitions document.
10 • The isa(isa-name-list) trait specifies the Instruction Set Architectures supported by the device.
11 The accepted isa-name values are implementation defined.
12 • The arch(arch-name-list) trait specifies the architectures supported by the device. The accepted
13 arch-name values are implementation defined.
14 The kind, isa and arch traits in the device and target_device sets are name-list traits.
15 Additionally, the target_device set defines the following trait:
16 • The device_num trait specifies the device number of the device.
17 The implementation set includes traits that describe the functionality supported by the OpenMP
18 implementation at that point in the program. At least the following traits can be defined:
19 • The vendor(vendor-name-list) trait, which specifies the vendor identifiers of the implementation.
20 OpenMP defined values for vendor-name are defined in the OpenMP Additional Definitions
21 document.
22 • The extension(extension-name-list) trait, which specifies vendor specific extensions to the
23 OpenMP specification. The accepted extension-name values are implementation defined.
24 • A trait with a name that is identical to the name of any clause that was supplied to the requires
25 directive prior to the program point. Such traits other than the atomic_default_mem_order trait
26 are non-property traits. The presence of these traits has been deprecated.
27 • A requires(requires-clause-list) trait, which is a clause-list trait for which the properties are the
28 clauses that have been supplied to the requires directive prior to the program point as well as
29 implementation-defined implicit requirements.
30 The vendor and extension traits in the implementation set are name-list traits.
31 Implementations can define additional traits in the device, target_device and implementation sets;
32 these traits are extension traits.
33 The dynamic trait set includes traits that define the dynamic properties of a program at a point in its
34 execution. The data state trait in the dynamic trait set refers to the complete data state of the
35 program that may be accessed at runtime.
40 For trait selectors that correspond to name-list traits, each trait-property should be
41 trait-property-name and for any value that is a valid identifier both the identifier and the
11 Restrictions
12 Restrictions to context selectors are as follows:
13 • Each trait-property can only be specified once in a trait-selector other than the construct
14 selector set.
15 • Each trait-set-selector-name can only be specified once.
16 • Each trait-selector-name can only be specified once.
17 • A trait-score cannot be specified in traits from the construct, device or
18 target_device trait-selector-sets.
19 • A score-expression must be a non-negative constant integer expression.
20 • The expression of a device_num trait must evaluate to a non-negative integer value that is less
21 than or equal to the value of omp_get_num_devices().
22 • A variable or procedure that is referenced in an expression that appears in a context selector must
23 be visible at the location of the directive on which the selector appears unless the directive is a
24 declare variant directive and the variable is an argument of the associated base function.
25 • If trait-property any is specified in the kind trait-selector of the device or
26 target_device selector set, no other trait-property may be specified in the same selector.
27 • For a trait-selector that corresponds to a name-list trait, at least one trait-property must be
28 specified.
29 • For a trait-selector that corresponds to a non-property trait, no trait-property may be specified.
30 • For the requires selector of the implementation selector set, at least one trait-property
31 must be specified.
3 7.4 Metadirectives
4 A metadirective is a directive that can specify multiple directive variants of which one may be
5 conditionally selected to replace the metadirective based on the enclosing OpenMP context. A
6 metadirective is replaced by a nothing directive or one of the directive variants specified by the
7 when clauses or the otherwise clause. If no otherwise clause is specified the effect is as if
8 one was specified without an associated directive variant.
9 The OpenMP context for a given metadirective is defined according to Section 7.1. The order of
10 clauses that appear on a metadirective is significant and otherwise must be the last clause
11 specified on a metadirective.
12 Replacement candidates are ordered according to the following rules in decreasing precedence:
13 • A candidate is before another one if the score associated with the context selector of the
14 corresponding when clause is higher.
15 • A candidate that was explicitly specified is before one that was implicitly specified.
16 • Candidates are ordered according to the order in which they lexically appear on the metadirective.
17 The list of dynamic replacement candidates is the prefix of the sorted list of replacement candidates
18 up to and including the first candidate for which the corresponding when clause has a static context
19 selector. The first dynamic replacement candidate for which the corresponding when clause has a
20 compatible context selector, according to the matching rules defined in Section 7.3, replaces the
21 metadirective.
22 Restrictions
23 Restrictions to metadirectives are as follows:
24 • Replacement of the metadirective with the directive variant associated with any of the dynamic
25 replacement candidates must result in a conforming OpenMP program.
26 • Insertion of user code at the location of a metadirective must be allowed if the first dynamic
27 replacement candidate does not have a static context selector.
28 • All items must be executable directives if the first dynamic replacement candidate does not have
29 a static context selector.
Fortran
30 • A metadirective that appears in the specification part of a subprogram must follow all
31 variant-generating declarative directives that appear in the same specification part.
32 • All directive variants of a metadirective must be pure; otherwise, the metadirective is not pure.
Fortran
3 Arguments
Name               Type                     Properties
directive-variant  directive-specification  optional, unique
5 Modifiers
Name              Modifies           Type                       Properties
context-selector  directive-variant  An OpenMP context-         required, unique
                                     selector-specification
7 Directives
8 begin metadirective, metadirective
9 Semantics
10 The directive variant specified by a when clause is a candidate to replace the metadirective on
11 which the clause is specified if the static part of the corresponding context selector is compatible
12 with the OpenMP context according to the matching rules defined in Section 7.3. If a when clause
13 does not explicitly specify a directive variant it implicitly specifies a nothing directive as the
14 directive variant.
15 Expressions that appear in the context selector of a when clause are evaluated if no prior dynamic
16 replacement candidate has a compatible context selector, and the number of times each expression
17 is evaluated is implementation defined. All variables referenced by these expressions are
18 considered to be referenced by the metadirective.
19 A directive variant that is associated with a when clause can only affect the program if the directive
20 variant is a dynamic replacement candidate.
21 Restrictions
22 Restrictions to the when clause are as follows:
23 • directive-variant must not specify a metadirective.
24 • context-selector must not specify any properties for the simd selector.
C / C++
25 • directive-variant must not specify a begin declare variant directive.
C / C++
8 Arguments
Name               Type                     Properties
directive-variant  directive-specification  optional, unique
10 Directives
11 begin metadirective, metadirective
12 Additional information
13 The clause-name default may be used as a synonym for the clause-name otherwise. This use
14 has been deprecated.
15 Semantics
16 The otherwise clause is treated as a when clause with the specified directive variant, if any, and
17 an always compatible static context selector that has a score lower than the scores associated with
18 any other clause.
19 Restrictions
20 Restrictions to the otherwise clause are as follows:
21 • directive-variant must not specify a metadirective.
C / C++
22 • directive-variant must not specify a begin declare variant directive.
C / C++
23 Cross References
24 • begin metadirective directive, see Section 7.4.4
25 • metadirective directive, see Section 7.4.3
26 • when clause, see Section 7.4.1
3 Clauses
4 otherwise, when
5 Semantics
6 The metadirective specifies metadirective semantics.
7 Cross References
8 • Metadirectives, see Section 7.4
9 • otherwise clause, see Section 7.4.2
10 • when clause, see Section 7.4.1
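A metadirective with one when clause and an otherwise clause can be sketched in C as follows. The function name saxpy_meta is hypothetical. When the code is compiled for the host device, the static selector device={kind(host)} is compatible and the parallel for variant replaces the metadirective; in any other context the otherwise variant, simd, is used. Either replacement computes the same values. Compiler support for metadirective and for the otherwise clause name varies, so this is a sketch under the assumption of an implementation that accepts both.

```c
/* Hypothetical helper: y[i] = a * x[i] + y[i], with the loop directive
 * chosen by the metadirective according to the OpenMP context. */
void saxpy_meta(int n, float a, const float *x, float *y)
{
    #pragma omp metadirective \
        when(device={kind(host)}: parallel for) \
        otherwise(simd)
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```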
13 Clauses
14 otherwise, when
15 Semantics
16 The begin metadirective is a metadirective for which the specified directive variants other
17 than the nothing directive must accept a paired end directive. For any directive variant that is
18 selected to replace the begin metadirective directive, the end metadirective directive
19 is implicitly replaced by its paired end directive to demarcate the statements that are affected by or
20 are associated with the directive variant. If the nothing directive is selected to replace the
21 begin metadirective directive, the paired end metadirective is ignored.
22 Restrictions
23 The restrictions to begin metadirective are as follows:
24 • Any directive-variant that is specified by a when or otherwise clause must be an OpenMP
25 directive that has a paired end directive or must be the nothing directive.
26 Cross References
27 • Metadirectives, see Section 7.4
28 • nothing directive, see Section 8.4
29 • otherwise clause, see Section 7.4.2
30 • when clause, see Section 7.4.1
28 Restrictions
29 Restrictions to declare variant directives are as follows:
30 • Calling functions that a declare variant directive determined to be a function variant directly in
31 an OpenMP context that is different from the one that the construct selector set of the context
32 selector specifies is non-conforming.
33 • If a function is determined to be a function variant through more than one declare variant
34 directive then the construct selector set of their context selectors must be the same.
19 Arguments
Name              Type                                       Properties
context-selector  An OpenMP context-selector-specification   default
21 Directives
22 begin declare variant, declare variant
23 Semantics
24 The match clause specifies the context-selector to use to determine if a specified variant function
25 is a replacement candidate for the specified base function in a given context.
5 Cross References
6 • Context Selectors, see Section 7.2
7 • begin declare variant directive, see Section 7.5.5
8 • declare variant directive, see Section 7.5.4
11 Arguments
Name            Type                                Properties
parameter-list  list of parameter list item type    default
13 Modifiers
Name       Modifies        Type                               Properties
adjust-op  parameter-list  Keyword: need_device_ptr, nothing  required
15 Directives
16 declare variant
17 Semantics
18 The adjust_args clause specifies how to adjust the arguments of the base function when a
19 specified variant function is selected for replacement. For each adjust_args clause that is
20 present on the selected variant the adjustment operation specified by adjust-op is applied to each
21 argument specified in the clause before being passed to the selected variant. If the adjust-op
22 modifier is nothing, the argument is passed to the selected variant without being modified.
23 If the adjust-op modifier is need_device_ptr, the arguments are converted to corresponding
24 device pointers of the default device. If an argument has the is_device_ptr property in its
25 interoperability requirement set then the argument is not adjusted. Otherwise, the argument is
26 converted in the same manner that a use_device_ptr clause on a target data construct
27 converts its pointer list items into device pointers. If the argument cannot be converted into a device
28 pointer then NULL is passed as the argument.
8 Arguments
Name Type Properties
9
append-op-list list of OpenMP operation list item type default
10 Directives
11 declare variant
12 Semantics
13 The append_args clause specifies additional arguments to pass in the call when a specified
14 variant function is selected for replacement. The arguments are constructed according to each
15 specified list item in append-op-list and are passed in the same order in which they are specified in
16 the list.
17 The supported OpenMP operations in append-op-list are:
18 interop
8 Arguments
9 declare variant([base-name:]variant-name)
Name Type Properties
10 base-name identifier of function type optional
variant-name identifier of function type default
11 Clauses
12 adjust_args, append_args, match
13 Semantics
14 The declare variant directive specifies declare variant semantics for a single replacement candidate.
15 variant-name identifies the function variant while base-name identifies the base function.
C
16 Any expressions in the match clause are interpreted as if they appeared in the scope of arguments
17 of the base function.
C
C++
18 variant-name and any expressions in the match clause are interpreted as if they appeared at the
19 scope of the trailing return type of the base function.
20 The function variant is determined by base language standard name lookup rules ([basic.lookup])
21 of variant-name using the argument types at the call site after implementation-defined changes have
22 been made according to the OpenMP context.
C++
Fortran
23 The procedure to which base-name refers is resolved at the location of the directive according to the
24 establishment rules for procedure names in the base language.
Fortran
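As an illustration of the declare variant semantics above, the following is a minimal C sketch, not a normative example. The function names dot, dot_in_parallel, dot_outside_parallel and dot_inside_parallel are hypothetical. Both implementations compute the same value, so the result does not depend on whether the implementation performs the substitution or whether OpenMP is enabled at all.

```c
/* Hypothetical variant: it must be declared before the directive that names it. */
static double dot_in_parallel(const double *x, const double *y, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Base function: the directive makes dot_in_parallel a replacement candidate
 * for calls to dot() whose OpenMP context includes a parallel construct. */
#pragma omp declare variant(dot_in_parallel) match(construct={parallel})
static double dot(const double *x, const double *y, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* No parallel construct in the context: the base function is called. */
double dot_outside_parallel(const double *x, const double *y, int n)
{
    return dot(x, y, n);
}

/* Inside a parallel construct: the variant is a replacement candidate. */
double dot_inside_parallel(const double *x, const double *y, int n)
{
    double result = 0.0;
    #pragma omp parallel num_threads(2)
    {
        double local = dot(x, y, n);
        #pragma omp single
        result = local;
    }
    return result;
}
```

Keeping the variant semantically equivalent to the base function means callers never need to know which one was selected.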
C / C++
21 Clauses
22 match
23 Semantics
24 The begin declare variant directive associates the context selector in the match clause
25 with each function definition in declaration-definition-seq. For the purpose of call resolution, each
26 function definition that appears between a begin declare variant directive and its paired
27 end directive is a function variant for an assumed base function, with the same name and a
28 compatible prototype, that is declared elsewhere without an associated declare variant directive.
18 Restrictions
19 The restrictions to the begin declare variant directive are as follows:
20 • The match clause must not contain a simd trait-selector-name.
21 • Two begin declare variant directives and their paired end directives must either
22 encompass disjoint source ranges or be perfectly nested.
23 • The match clause must not contain a dynamic context selector that references the this pointer.
24 • If an expression in the context selector that appears in the match clause references the this
25 pointer, the base function must be a non-static member function.
26 Cross References
27 • Declare Variant Directives, see Section 7.5
28 • match clause, see Section 7.5.1
C / C++
3 Clauses
4 depend, device, is_device_ptr, nocontext, novariants, nowait
5 Binding
6 The binding task set for a dispatch region is the generating task. The dispatch region binds
7 to the region of the generating task.
8 Semantics
9 The dispatch construct controls whether variant substitution occurs for target-call in the
10 associated function dispatch structured block.
11 Properties added to the interoperability requirement set can be removed by the effect of other
12 directives (see Section 14.2) before the dispatch region is executed. If one or more depend
13 clauses are present on the dispatch construct, they are added as depend properties of the
14 interoperability requirement set. If a nowait clause is present on the dispatch construct the
15 nowait property is added to the interoperability requirement set. For each list item specified in an
16 is_device_ptr clause, an is_device_ptr property for that list item is added to the
17 interoperability requirement set.
18 If the interoperability requirement set contains one or more depend properties, the behavior is as if
19 those properties were applied as depend clauses to a taskwait construct that is executed before
20 the dispatch region is executed.
21 The presence of the nowait property in the interoperability requirement set has no effect on the
22 dispatch construct.
23 If the device clause is present, the value of the default-device-var ICV is set to the value of the
24 expression in the clause on entry to the dispatch region and is restored to its previous value at
25 the end of the region.
26 Cross References
27 • Interoperability Requirement Set, see Section 14.2
28 • OpenMP Function Dispatch Structured Blocks, see Section 4.3.1.2
29 • depend clause, see Section 15.9.5
30 • device clause, see Section 13.2
31 • is_device_ptr clause, see Section 5.4.7
32 • nocontext clause, see Section 7.6.2
5 Arguments
Name Type Properties
6
do-not-use-variant expression of logical type default
7 Directives
8 dispatch
9 Semantics
10 If do-not-use-variant evaluates to true, no function variant is selected for the target-call of the
11 dispatch region associated with the novariants clause even if one would be selected
12 normally. The use of a variable in do-not-use-variant causes an implicit reference to the variable in
13 all enclosing constructs. do-not-use-variant is evaluated in the enclosing context.
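The effect of novariants can be sketched in C as follows. This is an illustrative sketch under stated assumptions: the names cube, cube_host_variant and cube_dispatch are hypothetical, the context selector device={kind(host)} is chosen only so that the variant would normally be selected on the host, and the compiler is assumed either to support the dispatch construct or to ignore it. The two implementations return the same value, so the observable result is identical either way.

```c
/* Hypothetical variant of cube(), declared before the directive that names it. */
static int cube_host_variant(int x)
{
    return x * x * x;
}

/* Base function; the variant would normally be selected for host calls. */
#pragma omp declare variant(cube_host_variant) match(device={kind(host)})
static int cube(int x)
{
    return x * x * x;
}

/* When do_not_use_variant is nonzero, no function variant is selected for the
 * target-call cube(x), even though one would be selected normally. */
int cube_dispatch(int x, int do_not_use_variant)
{
    int result = 0;
    #pragma omp dispatch novariants(do_not_use_variant)
    result = cube(x);
    return result;
}
```

Because do_not_use_variant is an ordinary runtime expression, the same call site can switch between base and variant behavior without recompilation.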
14 Cross References
15 • dispatch directive, see Section 7.6
18 Arguments
Name Type Properties
19
do-not-update-context expression of logical type default
20 Directives
21 dispatch
22 Semantics
23 If do-not-update-context evaluates to true, the construct on which the nocontext clause appears
24 is not added to the construct set of the OpenMP context. The use of a variable in
25 do-not-update-context causes an implicit reference to the variable in all enclosing constructs.
26 do-not-update-context is evaluated in the enclosing context.
27 Cross References
28 • dispatch directive, see Section 7.6
3 Arguments
4 declare simd[(proc-name)]
Name Type Properties
5
proc-name identifier of function type optional
6 Clause groups
7 branch
8 Clauses
9 aligned, linear, simdlen, uniform
10 Semantics
11 The association of one or more declare simd directives with a function declaration or definition
12 enables the creation of corresponding SIMD versions of the associated function that can be used to
13 process multiple arguments from a single invocation in a SIMD loop concurrently.
14 If a SIMD version is created and the simdlen clause is not specified, the number of concurrent
15 arguments for the function is implementation defined.
16 For purposes of the linear clause, any integer-typed parameter that is specified in a uniform
17 clause on the directive is considered to be constant and so may be used in linear-step.
C / C++
18 The expressions that appear in the clauses of each directive are evaluated in the scope of the
19 arguments of the function declaration or definition.
C / C++
C++
20 The special this pointer can be used as if it was one of the arguments to the function in any of the
21 linear, aligned, or uniform clauses.
C++
22 Restrictions
23 Restrictions to the declare simd directive are as follows:
24 • The function or subroutine body must be a structured block.
25 • The execution of the function or subroutine, when called from a SIMD loop, cannot result in the
26 execution of an OpenMP construct except for an ordered construct with the simd clause or an
27 atomic construct.
28 • The execution of the function or subroutine cannot have any side effects that would alter its
29 execution for concurrent iterations of a SIMD chunk.
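A minimal C sketch of a declare simd function called from a SIMD loop is given here for illustration. The names add_at and vec_add are hypothetical. The uniform clause marks the pointer arguments as identical across concurrent invocations, and the linear clause states that i advances by one between them. The function body has no side effects, which keeps it within the restrictions listed above.

```c
/* a and b are the same for all concurrent invocations (uniform);
 * i increases by 1 from one invocation to the next (linear). */
#pragma omp declare simd uniform(a, b) linear(i : 1)
static float add_at(const float *a, const float *b, int i)
{
    return a[i] + b[i];
}

/* SIMD loop that may call a SIMD version of add_at, if one was created. */
void vec_add(const float *a, const float *b, float *c, int n)
{
    #pragma omp simd
    for (int i = 0; i < n; i++)
        c[i] = add_at(a, b, i);
}
```

Whether a SIMD version is actually created, and with how many concurrent arguments, remains implementation defined when simdlen is not specified.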
4 Directives
5 declare simd
6 Semantics
7 The branch clause grouping defines a set of clauses that indicate if a function can be assumed to be
8 or not to be encountered in a branch. The inbranch clause specifies that the function will always
9 be called from inside a conditional statement of the calling context. The notinbranch clause
10 specifies that the function will never be called from inside a conditional statement of the calling
11 context. If neither clause is specified, then the function may or may not be called from inside a
12 conditional statement of the calling context.
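The following C sketch illustrates the inbranch case; the names reciprocal and invert_nonzero are hypothetical. The function is only ever called from inside a conditional statement of the SIMD loop, which is the situation that the inbranch clause asserts.

```c
/* inbranch: every call is made from inside a conditional statement. */
#pragma omp declare simd inbranch
static float reciprocal(float x)
{
    return 1.0f / x;
}

/* Replaces each nonzero element by its reciprocal; zeros are left alone. */
void invert_nonzero(float *v, int n)
{
    #pragma omp simd
    for (int i = 0; i < n; i++) {
        if (v[i] != 0.0f)
            v[i] = reciprocal(v[i]);
    }
}
```

If the same function were also called unconditionally elsewhere, omitting both clauses would be the conservative choice.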
13 Cross References
14 • declare simd directive, see Section 7.7
3 Arguments
4 declare target(extended-list)
Name Type Properties
5
extended-list list of extended list item type optional
6 Clauses
7 device_type, enter, indirect, link
8 Semantics
9 The declare target directive is a declare target directive. If the extended-list argument is
10 specified, the effect is as if an enter clause was specified with the extended-list as its argument.
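As a hedged C illustration of the extended-list form, the sketch below names a variable and a function in the directive, which has the same effect as an enter clause with those items. The names scale_factor, scaled and scaled_on_device are hypothetical. The sketch assumes that the target region falls back to the host when no device is available or when OpenMP is not enabled.

```c
int scale_factor = 3;   /* hypothetical variable with static storage duration */
int scaled(int x);      /* declared before it is named in the directive */

/* Same effect as: #pragma omp declare target enter(scale_factor, scaled) */
#pragma omp declare target(scale_factor, scaled)

int scaled(int x)
{
    return scale_factor * x;
}

/* Calls scaled() from a target region and copies the result back. */
int scaled_on_device(int x)
{
    int result = 0;
    #pragma omp target map(from: result)
    {
        result = scaled(x);
    }
    return result;
}
```

The variable is only read in this sketch; updating it on the host after the device copy exists would require additional data-motion directives.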
Fortran
11 If a declare target directive does not have any clauses and does not have an extended-list then
12 an implicit enter clause with one item is formed from the name of the enclosing subroutine
13 subprogram, function subprogram or interface body to which it applies.
Fortran
14 Restrictions
15 Restrictions to the declare target directive are as follows:
16 • If the extended-list argument is specified, no clauses may be specified.
17 • If the directive has a clause, it must contain at least one enter clause or at least one link
18 clause.
19 • A variable for which nohost is specified may not appear in a link clause.
Fortran
20 • If a list item is a procedure name, it must not be a generic name, procedure pointer, entry name,
21 or statement function name.
22 • If no clauses are specified or if a device_type clause is specified, the directive must appear in
23 a specification part of a subroutine subprogram, function subprogram or interface body.
24 • If a list item is a procedure name, the directive must be in the specification part of that subroutine
25 or function subprogram or in the specification part of that subroutine or function in an interface
26 body.
27 • If an extended list item is a variable name, the directive must appear in the specification part of a
28 subroutine subprogram, function subprogram, program or module.
29 Clauses
30 device_type, indirect
4 Semantics
5 The begin declare target directive is a declare target directive. The directive and its paired
6 end directive form a delimited code region that defines an implicit extended-list. The implicit
7 extended-list consists of the variable names of any variable declarations at file or namespace scope
8 that appear in the delimited code region and of the function names of any function declarations at
9 file, namespace or class scope that appear in the delimited code region. The implicit extended-list is
10 converted to an implicit enter clause.
11 The delimited code region may contain declare target directives. If a device_type clause is
12 present on the contained declare target directive, then its argument determines which versions are
13 made available. If a list item appears both in an implicit and explicit list, the explicit list determines
14 which versions are made available.
15 Restrictions
16 Restrictions to the begin declare target directive are as follows:
C++
17 • The function names of overloaded functions or template functions may only be specified within
18 an implicit extended-list.
19 • If a lambda declaration and definition appears between a begin declare target directive
20 and the paired end directive, all variables that are captured by the lambda expression must also
21 appear in an enter clause.
22 • A module export or import statement cannot appear between a declare target directive and the
23 paired end directive.
C++
24 Cross References
25 • Declare Target Directives, see Section 7.8
26 • device_type clause, see Section 13.1
27 • enter clause, see Section 5.8.4
28 • indirect clause, see Section 7.8.3
C / C++
3 Arguments
Name Type Properties
4
invoked-by-fptr expression of logical type constant, optional
5 Directives
6 begin declare target, declare target
7 Semantics
8 If invoked-by-fptr evaluates to true, any procedures that appear in an enter clause on the directive
9 on which the indirect clause is specified may be called with an indirect device invocation. If the
10 invoked-by-fptr does not evaluate to true, any procedures that appear in an enter clause on the
11 directive may not be called with an indirect device invocation. Unless otherwise specified by an
12 indirect clause, procedures may not be called with an indirect device invocation. If the
13 indirect clause is specified and invoked-by-fptr is not specified, the effect of the clause is as if
14 invoked-by-fptr evaluates to true.
C / C++
15 If a function appears in the implicit enter clause of a begin declare target directive and in
16 the enter clause of a declare target directive that is contained in the delimited code region of the
17 begin declare target directive, and if an indirect clause appears on both directives, then
18 the indirect clause on the begin declare target directive has no effect for that function.
C / C++
19 Restrictions
20 Restrictions to the indirect clause are as follows:
21 • If invoked-by-fptr evaluates to true, a device_type clause must not appear on the same
22 directive unless it specifies any for its device-type-description.
23 Cross References
24 • begin declare target directive, see Section 7.8.2
25 • declare target directive, see Section 7.8.1
5 8.1 at Clause
6 Name: at Properties: unique
7 Arguments
Name           Type                             Properties
8 action-time  Keyword: compilation, execution  default
9 Directives
10 error
11 Semantics
12 The at clause determines when the implementation performs an action that is associated with a
13 utility directive. If action-time is compilation, the action is performed during compilation if the
14 directive appears in a declarative context or in an executable context that is reachable at runtime. If
15 action-time is compilation and the directive appears in an executable context that is not
16 reachable at runtime, the action may or may not be performed. If action-time is execution, the
17 action is performed during program execution when a thread encounters the directive and the
18 directive is considered to be an executable directive. If the at clause is not specified, the effect is as
19 if action-time is compilation.
20 Cross References
21 • error directive, see Section 8.5
1 Clause groups
2 requirement
3 Semantics
4 The requires directive specifies features that an implementation must support for correct
5 execution and requirements for the execution of all code in the current compilation unit. The
6 behavior that a requirement clause specifies may override the normal behavior specified elsewhere
7 in this document. Whether an implementation supports the feature that a given requirement clause
8 specifies is implementation defined.
9 The clauses of a requires directive are added to the requires trait in the OpenMP context for all
10 program points that follow the directive.
11 Restrictions
12 The restrictions to the requires directive are as follows:
13 • All requires directives in the same compilation unit that specify the
14 atomic_default_mem_order requirement must specify the same argument.
15 • Any requires directive that specifies a reverse_offload, unified_address, or
16 unified_shared_memory requirement must appear lexically before any device constructs
17 or device routines.
18 • A requires directive may not appear lexically after a context selector in which any clause of
19 the requires directive is used.
20 • Either all compilation units of a program that contain declare target directives, device constructs
21 or device routines or none of them must specify a requires directive that specifies the
22 reverse_offload, unified_address or unified_shared_memory requirement.
23 • A requires directive that specifies the atomic_default_mem_order requirement must
24 not appear lexically after any atomic construct on which memory-order-clause is not specified.
C
25 • The requires directive may only appear at file scope.
C
C++
26 • The requires directive may only appear at file or namespace scope.
C++
Fortran
27 • The requires directive must appear in the specification part of a program unit, after any USE
28 statement, any IMPORT statement, and any IMPLICIT statement, unless the directive appears
29 by referencing a module and each clause already appeared with the same arguments in the
30 specification part of the program unit.
Fortran
4 Directives
5 requires
6 Semantics
7 The requirement clause grouping defines a set of clauses that indicate the requirement that a
8 program requires the implementation to support. Other than atomic_default_mem_order,
9 the members of the set are inarguable.
10 If an implementation supports a given requirement clause then the use of that clause on a
11 requires directive will cause the implementation to ensure the enforcement of a guarantee
12 represented by the specific member of the clause grouping. If the implementation does not support
13 the requirement then it must perform compile-time error termination.
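For illustration, a small C sketch of the one requirement clause that takes an argument, atomic_default_mem_order, follows. The function name count_even is hypothetical. The directive appears at file scope and lexically before the atomic construct that omits a memory-order clause, as the restrictions on the requires directive demand.

```c
/* File scope, before any atomic construct without a memory-order clause. */
#pragma omp requires atomic_default_mem_order(seq_cst)

/* Counts the even entries of values; the atomic update has no explicit
 * memory-order clause, so it takes the default set by the directive above. */
int count_even(const int *values, int n)
{
    int evens = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        if (values[i] % 2 == 0) {
            #pragma omp atomic update
            evens++;
        }
    }
    return evens;
}
```

A reduction clause would usually be the more efficient way to count; the atomic construct is used here only to show where the default memory order applies.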
14 The reverse_offload clause requires an implementation to guarantee that if a target
15 construct specifies a device clause in which the ancestor modifier appears, the target
16 region can execute on the parent device of an enclosing target region.
17 The unified_address clause requires an implementation to guarantee that all devices
18 accessible through OpenMP API routines and directives use a unified address space. In this address
19 space, a pointer will always refer to the same location in memory from all devices accessible
20 through OpenMP. Any OpenMP mechanism that returns a device pointer is guaranteed to return a
21 device address that supports pointer arithmetic, and the is_device_ptr clause is not necessary
22 to obtain device addresses from device pointers for use inside target regions. Host pointers may
23 be passed as device pointer arguments to device memory routines and device pointers may be
24 passed as host pointer arguments to device memory routines. Non-host devices may still have
25 discrete memories and dereferencing a device pointer on the host device or a host pointer on a
26 non-host device remains unspecified behavior. Memory local to a specific execution context may be
27 exempt from the unified_address requirement, following the restrictions of locality to a given
28 execution context, thread or contention group.
29 The unified_shared_memory clause implies the unified_address requirement,
30 inheriting all of its behaviors. The implementation must also guarantee that storage locations in
31 memory are accessible to threads on all available devices that the implementation supports, except
32 for memory that is local to a specific execution context as defined in the description of
33 unified_address above. Every device address that refers to storage allocated through
34 OpenMP device memory routines is a valid host pointer that may be dereferenced.
35 The unified_shared_memory clause makes map clauses optional on target constructs and
36 declare target directives optional for variables with static storage duration that are accessed inside
14 Cross References
15 • requires directive, see Section 8.2
26 Directives
27 assume, assumes, begin assumes
28 Semantics
29 The assumption clause grouping defines a set of clauses that indicate the assumptions that a
30 program ensures the implementation can exploit. Other than absent, contains and holds,
31 the members of the set are inarguable and unique.
18 Restrictions
19 The restrictions to assumption clauses are as follows:
20 • A directive-name list member must not specify a combined or composite directive.
21 • A directive-name list member must not specify a directive that is a declarative directive, an
22 informational directive other than the error directive, or a metadirective.
23 Cross References
24 • assume directive, see Section 8.3.3
25 • assumes directive, see Section 8.3.2
26 • begin assumes directive, see Section 8.3.4
29 Clause groups
30 assumption
4 Restrictions
5 The restrictions to the assumes directive are as follows:
C
6 • The assumes directive may only appear at file scope.
C
C++
7 • The assumes directive may only appear at file or namespace scope.
C++
Fortran
8 • The assumes directive may only appear in the specification part of a module or subprogram,
9 after any USE statement, any IMPORT statement, and any IMPLICIT statement.
Fortran
12 Clause groups
13 assumption
14 Semantics
15 The assumption scope of the assume directive is the code executed in the corresponding region or
16 in any region that is nested in the corresponding region.
C / C++
19 Clause groups
20 assumption
6 Semantics
7 The nothing directive has no effect on the execution of the OpenMP program.
8 Cross References
9 • Metadirectives, see Section 7.4
12 Clauses
13 at, message, severity
14 Semantics
15 The error directive instructs the compiler or runtime to perform an error action. The error action
16 displays an implementation-defined message. The severity clause determines whether the error
17 action is abortive following the display of the message. If sev-level is fatal and action-time is
18 compilation, the message is displayed and compilation of the current compilation unit is
19 aborted. If sev-level is fatal and action-time is execution, the message is displayed and
20 program execution is aborted.
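A minimal C sketch of a non-abortive runtime error action is shown here. The names half_of, odd_inputs and odd_inputs_seen are hypothetical, and the exact text that is displayed is implementation defined. The sketch assumes a compiler that either supports the error directive or ignores it; with severity(warning) execution continues in both cases.

```c
static int odd_inputs_seen = 0;   /* hypothetical counter, for illustration only */

/* Returns n / 2 and reports odd inputs without aborting the program. */
int half_of(int n)
{
    if (n % 2 != 0) {
        /* action-time execution + sev-level warning: display and continue. */
        #pragma omp error at(execution) severity(warning) message("half_of: odd input is rounded toward zero")
        odd_inputs_seen++;
    }
    return n / 2;
}

int odd_inputs(void)
{
    return odd_inputs_seen;
}
```

Replacing severity(warning) with severity(fatal) would abort program execution after the message is displayed.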
24 Tool Callbacks
25 A thread dispatches a registered ompt_callback_error callback for each occurrence of a
26 runtime-error event in the context of the encountering task. This callback has the type signature
27 ompt_callback_error_t.
4 Cross References
5 • ompt_callback_error_t, see Section 19.5.2.30
6 • at clause, see Section 8.1
7 • message clause, see Section 8.5.2
8 • severity clause, see Section 8.5.1
11 Arguments
Name Type Properties
12
sev-level Keyword: fatal, warning default
13 Directives
14 error
15 Semantics
16 The severity clause determines the action that the implementation performs. If sev-level is
17 warning, the implementation takes no action besides displaying the message that is associated
18 with the directive. If sev-level is fatal, the implementation performs the abortive action
19 associated with the directive on which the clause appears. If no severity clause is specified then
20 the effect is as if sev-level is fatal.
21 Cross References
22 • error directive, see Section 8.5
25 Arguments
Name Type Properties
26
msg-string expression of string type default
27 Directives
28 error
4 Restrictions
C / C++
5 • If the action-time is compilation, msg-string must be a constant string literal.
C / C++
Fortran
6 • If the action-time is compilation, msg-string must be a constant character expression.
Fortran
7 Cross References
8 • error directive, see Section 8.5
12 Cross References
13 • Canonical Loop Nest Form, see Section 4.4.1
16 Clauses
17 sizes
18 Semantics
19 The tile construct tiles the outer $n$ loops of the associated loop nest, where $n$ is the number of
20 items in the sizes clause, which consists of items $s_1, \dots, s_n$. Let $\ell_1, \dots, \ell_n$ be the associated
21 loops, from outermost to innermost, which the construct replaces with a loop nest that consists of
22 $2n$ perfectly nested loops. Let $f_1, \dots, f_n, t_1, \dots, t_n$ be the generated loops, from outermost to
23 innermost. The loops $f_1, \dots, f_n$ are the floor loops and the loops $t_1, \dots, t_n$ are the tile loops. The
24 tile loops do not have canonical loop nest form.
25 Let $\Omega$ be the logical iteration vector space of the associated loops. For any $(\alpha_1, \dots, \alpha_n) \in \mathbb{N}^n$,
26 define the set of iterations $\{(i_1, \dots, i_n) \in \Omega \mid \forall k \in \{1, \dots, n\} : s_k \alpha_k \le i_k < s_k \alpha_k + s_k\}$ to be
27 tile $T_{\alpha_1,\dots,\alpha_n}$ and $F = \{T_{\alpha_1,\dots,\alpha_n} \mid T_{\alpha_1,\dots,\alpha_n} \neq \emptyset\}$ to be the set of tiles with at least one iteration.
28 Tiles that contain $\prod_{k=1}^{n} s_k$ iterations are complete tiles. Otherwise, they are partial tiles.
1 The floor loops iterate over all tiles $\{T_{\alpha_1,\dots,\alpha_n} \in F\}$ in lexicographic order with respect to their
2 indices $(\alpha_1, \dots, \alpha_n)$ and the tile loops iterate over the iterations in $T_{\alpha_1,\dots,\alpha_n}$ in the lexicographic
3 order of the corresponding iteration vectors. An implementation may reorder the sequential
4 execution of two iterations if at least one is from a partial tile and if their respective logical iteration
5 vectors in loop-nest do not have a product order relation.
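The floor and tile loops described above can be sketched in plain C for the two-dimensional case. This is an illustrative hand-written expansion, not the code that any particular implementation generates. The names tiled_order, sum_tiled and min_int are hypothetical. tiled_order records the order in which a 2-deep nest is visited by floor loops f1, f2 and tile loops t1, t2, including partial tiles at the edges. sum_tiled applies the tile construct itself to a loop whose result does not depend on iteration order.

```c
static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Writes the linearized index i*nj + j of each iteration into order[], in the
 * sequence produced by the floor loops (over tiles) and tile loops (within a
 * tile). Returns the number of iterations visited. */
int tiled_order(int ni, int nj, int si, int sj, int *order)
{
    int pos = 0;
    for (int fi = 0; fi < ni; fi += si)                        /* floor loop f1 */
        for (int fj = 0; fj < nj; fj += sj)                    /* floor loop f2 */
            for (int i = fi; i < min_int(fi + si, ni); i++)    /* tile loop t1 */
                for (int j = fj; j < min_int(fj + sj, nj); j++) /* tile loop t2 */
                    order[pos++] = i * nj + j;
    return pos;
}

/* The same iteration space with the construct; n = 2 sizes, so both loops
 * are associated with it. The sum is independent of the visiting order. */
long sum_tiled(const int *a, int ni, int nj)
{
    long sum = 0;
    #pragma omp tile sizes(2, 2)
    for (int i = 0; i < ni; i++)
        for (int j = 0; j < nj; j++)
            sum += a[i * nj + j];
    return sum;
}
```

For a 5 x 4 space with sizes 2 and 2, the first tile covers linearized indices 0, 1, 4, 5, and the two tiles in the last row are partial because they hold 2 rather than 4 iterations.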
6 Restrictions
7 Restrictions to the tile construct are as follows:
8 • The depth of the associated loop nest must be greater than or equal to n.
9 • All loops that are associated with the construct must be perfectly nested.
10 • No loop that is associated with the construct may be a non-rectangular loop.
11 Cross References
12 • sizes clause, see Section 9.1.1
15 Arguments
Name Type Properties
16
size-list list of expression of integer type constant, positive
17 Directives
18 tile
19 Semantics
20 The sizes clause specifies a list of n compile-time constant, positive OpenMP integer expressions.
21 Cross References
22 • tile directive, see Section 9.1
25 Clauses
26 full, partial
27 Clause set
28 Properties: exclusive Members: full, partial
6 Cross References
7 • full clause, see Section 9.2.1
8 • partial clause, see Section 9.2.2
11 Directives
12 unroll
13 Semantics
14 The full clause specifies that the associated loop is fully unrolled. The construct is replaced by a
15 structured block that only contains n instances of its loop body, one for each of the n logical
16 iterations of the associated loop and in their logical iteration order.
17 Restrictions
18 Restrictions to the full clause are as follows:
19 • The iteration count of the associated loop must be a compile-time constant.
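A short C sketch of both unroll clauses follows, for illustration only. The names TAPS, dot_taps and sum_array are hypothetical. The full clause is applied to a loop whose iteration count is a compile-time constant, as the restriction requires, while the partial clause is applied to a loop with a runtime bound.

```c
enum { TAPS = 4 };   /* hypothetical compile-time constant trip count */

/* Fully unrolled: the loop is replaced by TAPS copies of its body. */
int dot_taps(const int *a, const int *b)
{
    int sum = 0;
    #pragma omp unroll full
    for (int k = 0; k < TAPS; k++)
        sum += a[k] * b[k];
    return sum;
}

/* Partially unrolled by a factor of 4; n need not be a constant here. */
long sum_array(const int *v, int n)
{
    long sum = 0;
    #pragma omp unroll partial(4)
    for (int i = 0; i < n; i++)
        sum += v[i];
    return sum;
}
```

Neither transformation changes the result; both only change how the loop body is laid out.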
20 Cross References
21 • unroll directive, see Section 9.2
24 Arguments
Name              Type                        Properties
25 unroll-factor  expression of integer type  optional, constant, positive
26 Directives
27 unroll
5 Cross References
6 • unroll directive, see Section 9.2
5 Clauses
6 allocate, copyin, default, firstprivate, if, num_threads, private,
7 proc_bind, reduction, shared
8 Binding
9 The binding thread set for a parallel region is the encountering thread. The encountering thread
10 becomes the primary thread of the new team.
11 Semantics
12 When a thread encounters a parallel construct, a team of threads is created to execute the
13 parallel region (see Section 10.1.1 for more information about how the number of threads in
14 the team is determined, including the evaluation of the if and num_threads clauses). The
15 thread that encountered the parallel construct becomes the primary thread of the new team,
16 with a thread number of zero for the duration of the new parallel region. All threads in the new
17 team, including the primary thread, execute the region. Once the team is created, the number of
18 threads in the team remains constant for the duration of that parallel region.
19 Within a parallel region, thread numbers uniquely identify each thread. Thread numbers are
20 consecutive whole numbers ranging from zero for the primary thread up to one less than the
21 number of threads in the team. A thread may obtain its own thread number by a call to the
22 omp_get_thread_num library routine.
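The thread numbering described above can be observed with a small C sketch. The function name record_thread_numbers and its parameters are hypothetical. The serial stand-ins for omp_get_thread_num and omp_get_num_threads are an assumption made so that the sketch also builds when OpenMP is not enabled; they are not part of the OpenMP API.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): a team of one thread, numbered zero. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Marks seen[t] = 1 for every thread number t that executed the region and
 * returns the team size as observed by the primary thread (thread 0). */
int record_thread_numbers(int requested, int *seen, int capacity)
{
    int team_size = 0;
    for (int i = 0; i < capacity; i++)
        seen[i] = 0;

    #pragma omp parallel num_threads(requested)
    {
        int me = omp_get_thread_num();
        if (me < capacity)
            seen[me] = 1;             /* each thread writes a distinct slot */
        if (me == 0)
            team_size = omp_get_num_threads();
    }
    return team_size;
}
```

After the call, the marked entries are exactly the consecutive thread numbers from zero up to one less than the team size, provided capacity is at least the team size.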
23 A set of implicit tasks, equal in number to the number of threads in the team, is generated by the
24 encountering thread. The structured block of the parallel construct determines the code that
25 will be executed in each implicit task. Each task is assigned to a different thread in the team and
26 becomes tied. The task region of the task that the encountering thread is executing is suspended and
27 each thread in the team executes its implicit task. Each thread can execute a path of statements that
28 is different from that of the other threads.
1 The implementation may cause any thread to suspend execution of its implicit task at a task
2 scheduling point, and to switch to execution of any explicit task generated by any of the threads in
3 the team, before eventually resuming execution of the implicit task (for more details see
4 Chapter 12).
5 An implicit barrier occurs at the end of a parallel region. After the end of a parallel region,
6 only the primary thread of the team resumes execution of the enclosing task region.
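The implicit barrier at the end of the region is what makes the following C sketch correct: the encountering thread reads the per-thread partial sums only after every thread in the team has finished writing them. The names strided_sum and MAX_TEAM are hypothetical, and the serial stand-ins are an assumption so that the sketch also builds without OpenMP support.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): a team of one thread, numbered zero. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

enum { MAX_TEAM = 4 };   /* hypothetical upper bound on the team size */

/* Each thread sums a strided slice of data into its own slot. */
long strided_sum(const int *data, int n)
{
    long partial[MAX_TEAM] = { 0 };

    #pragma omp parallel num_threads(MAX_TEAM)
    {
        int me = omp_get_thread_num();
        int team = omp_get_num_threads();
        for (int i = me; i < n; i += team)
            partial[me] += data[i];
    }
    /* Implicit barrier passed: all partial sums are complete, and only the
     * primary thread continues here. */
    long total = 0;
    for (int t = 0; t < MAX_TEAM; t++)
        total += partial[t];
    return total;
}
```

A reduction clause expresses the same computation more directly; the explicit array is used here only to make the role of the barrier visible.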
7 If a thread in a team that is executing a parallel region encounters another parallel
8 directive, it creates a new team, according to the rules in Section 10.1.1, and it becomes the primary
9 thread of that new team.
10 If execution of a thread terminates while inside a parallel region, execution of all threads in all
11 teams terminates. The order of termination of threads is unspecified. All work done by a team prior
12 to any barrier that the team has passed in the program is guaranteed to be complete. The amount of
13 work done by each thread after the last barrier that it passed and before it terminates is unspecified.
31 Tool Callbacks
32 A thread dispatches a registered ompt_callback_parallel_begin callback for each
33 occurrence of a parallel-begin event in that thread. The callback occurs in the task that encounters
34 the parallel construct. This callback has the type signature
35 ompt_callback_parallel_begin_t. In the dispatched callback,
36 (flags & ompt_parallel_team) evaluates to true.
18 Cross References
19 • Determining the Number of Threads for a parallel Region, see Section 10.1.1
20 • omp_get_thread_num, see Section 18.2.4
21 • ompt_callback_implicit_task_t, see Section 19.5.2.11
22 • ompt_callback_parallel_begin_t, see Section 19.5.2.3
23 • ompt_callback_parallel_end_t, see Section 19.5.2.4
24 • ompt_callback_thread_begin_t, see Section 19.5.2.1
25 • ompt_callback_thread_end_t, see Section 19.5.2.2
26 • ompt_scope_endpoint_t, see Section 19.4.4.11
27 • allocate clause, see Section 6.6
28 • copyin clause, see Section 5.7.1
29 • default clause, see Section 5.4.1
30 • firstprivate clause, see Section 5.4.4
31 • if clause, see Section 3.4
32 • num_threads clause, see Section 10.1.2
33 • private clause, see Section 5.4.3
34 • proc_bind clause, see Section 10.1.4
Algorithm 2.1
19 let ThreadsBusy be the number of OpenMP threads currently executing in this contention group;
20 if an if clause exists
21 then let IfClauseValue be the value of the if clause expression;
22 else let IfClauseValue = true;
23 if a num_threads clause exists
24 then let ThreadsRequested be the value of the num_threads clause expression;
25 else let ThreadsRequested = value of the first element of nthreads-var;
26 let ThreadsAvailable = (thread-limit-var - ThreadsBusy + 1);
27 if (IfClauseValue = false)
28 then number of threads = 1;
29 else if (active-levels-var ≥ max-active-levels-var)
30 then number of threads = 1;
31 else if (dyn-var = true) and (ThreadsRequested ≤ ThreadsAvailable)
32 then 1 ≤ number of threads ≤ ThreadsRequested;
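The branches of the algorithm listed above can be transcribed into C as a sketch. The type team_size_bounds and the function team_size_for_parallel are hypothetical names, and the ICV values are passed in as plain parameters rather than read from a runtime. Only the branches shown above are encoded; any other case is reported as undecided instead of being guessed.

```c
#include <stdbool.h>

typedef struct {
    bool decided;      /* false if the case is not covered by the branches above */
    int min_threads;   /* lower bound on the number of threads */
    int max_threads;   /* upper bound on the number of threads */
} team_size_bounds;

team_size_bounds team_size_for_parallel(bool if_clause_value,
                                        int threads_requested,
                                        int thread_limit,
                                        int threads_busy,
                                        int active_levels,
                                        int max_active_levels,
                                        bool dyn)
{
    team_size_bounds bounds = { true, 1, 1 };
    int threads_available = thread_limit - threads_busy + 1;

    if (!if_clause_value)
        return bounds;                           /* number of threads = 1 */
    if (active_levels >= max_active_levels)
        return bounds;                           /* number of threads = 1 */
    if (dyn && threads_requested <= threads_available) {
        bounds.max_threads = threads_requested;  /* 1 <= threads <= requested */
        return bounds;
    }
    bounds.decided = false;                      /* outside the branches shown */
    return bounds;
}
```

For example, with thread-limit-var of 16, one busy thread and 8 threads requested under dynamic adjustment, the team may have between 1 and 8 threads.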
8 Cross References
9 • dyn-var ICV, see Table 2.1
10 • if clause, see Section 3.4
11 • max-active-levels-var ICV, see Table 2.1
12 • nthreads-var ICV, see Table 2.1
13 • num_threads clause, see Section 10.1.2
14 • parallel directive, see Section 10.1
15 • thread-limit-var ICV, see Table 2.1
18 Arguments
Name Type Properties
19
nthreads expression of integer type positive
20 Directives
21 parallel
22 Semantics
23 The num_threads clause specifies the desired number of threads to execute a parallel region.
24 Cross References
25 • parallel directive, see Section 10.1
11 Note – Wrap around is needed if the end of a place partition is reached before all thread
12 assignments are done. For example, wrap around may be needed in the case of close and T ≤ P ,
13 if the primary thread is assigned to a place other than the first place in the place partition. In this
14 case, thread 1 is assigned to the place after the place of the primary thread, thread 2 is assigned to
15 the place after that, and so on. The end of the place partition may be reached before all threads are
16 assigned. In this case, assignment of threads is resumed with the first place in the place partition.
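The wrap-around in this note can be expressed as modular arithmetic. The C sketch below covers only the case discussed in the note, the close policy with T ≤ P, and uses a hypothetical function name and zero-based place indices within the place partition.

```c
/* Place index (0-based, within the place partition) for a thread under the
 * close policy when the team has no more threads than there are places.
 * The primary thread (thread_num 0) stays on primary_place; each following
 * thread takes the next place, wrapping to the first place at the end. */
int close_place_for_thread(int primary_place, int thread_num, int num_places)
{
    return (primary_place + thread_num) % num_places;
}
```

With four places and the primary thread on place 2, threads 0 to 3 are assigned places 2, 3, 0 and 1.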
18 Cross References
19 • bind-var ICV, see Table 2.1
20 • parallel directive, see Section 10.1
21 • place-partition-var ICV, see Table 2.1
22 • proc_bind clause, see Section 10.1.4
25 Arguments
Name               Type                                                       Properties
affinity-policy    Keyword: close, master (deprecated), primary, spread      default
27 Directives
28 parallel
29 Semantics
The proc_bind clause specifies the mapping of OpenMP threads to places within the current
place partition, that is, within the places listed in the place-partition-var ICV for the implicit task of
the encountering thread. The effects of the possible values of affinity-policy are described in
Section 10.1.3.
14 Tool Callbacks
15 A thread dispatches a registered ompt_callback_parallel_begin callback for each
16 occurrence of a teams-begin event in that thread. The callback occurs in the task that encounters the
17 teams construct. This callback has the type signature
18 ompt_callback_parallel_begin_t. In the dispatched callback,
19 (flags & ompt_parallel_league) evaluates to true.
A thread dispatches a registered ompt_callback_implicit_task callback with
ompt_scope_begin as its endpoint argument for each occurrence of an initial-task-begin event in
that thread. Similarly, a thread dispatches a registered ompt_callback_implicit_task
callback with ompt_scope_end as its endpoint argument for each occurrence of an
initial-task-end event in that thread. The callbacks occur in the context of the initial task and have
type signature ompt_callback_implicit_task_t. In the dispatched callback,
(flags & ompt_task_initial) evaluates to true.
27 A thread dispatches a registered ompt_callback_parallel_end callback for each
28 occurrence of a teams-end event in that thread. The callback occurs in the task that encounters the
29 teams construct. This callback has the type signature ompt_callback_parallel_end_t.
30 A thread dispatches a registered ompt_callback_thread_begin callback for the
31 native-thread-begin event in that thread. The callback occurs in the context of the thread. The
32 callback has type signature ompt_callback_thread_begin_t.
33 A thread dispatches a registered ompt_callback_thread_end callback for the
34 native-thread-end event in that thread. The callback occurs in the context of the thread. The
35 callback has type signature ompt_callback_thread_end_t.
13 Cross References
14 • omp_get_num_teams, see Section 18.4.1
15 • omp_get_team_num, see Section 18.4.2
16 • ompt_callback_implicit_task_t, see Section 19.5.2.11
17 • ompt_callback_parallel_begin_t, see Section 19.5.2.3
18 • ompt_callback_parallel_end_t, see Section 19.5.2.4
19 • ompt_callback_thread_begin_t, see Section 19.5.2.1
20 • ompt_callback_thread_end_t, see Section 19.5.2.2
21 • allocate clause, see Section 6.6
22 • default clause, see Section 5.4.1
23 • distribute directive, see Section 11.6
24 • firstprivate clause, see Section 5.4.4
25 • num_teams clause, see Section 10.2.1
26 • parallel directive, see Section 10.1
27 • private clause, see Section 5.4.3
28 • reduction clause, see Section 5.5.8
29 • shared clause, see Section 5.4.2
30 • target directive, see Section 13.8
31 • thread_limit clause, see Section 13.3
3 Arguments
Name           Type                          Properties
upper-bound    expression of integer type    positive
5 Modifiers
Name           Modifies    Type                         Properties
lower-bound    Generic     OpenMP integer expression    positive, ultimate, unique
7 Directives
8 teams
9 Semantics
10 The num_teams clause specifies the bounds on the number of teams created by the construct on
11 which it appears. lower-bound specifies the lower bound and upper-bound specifies the upper
12 bound on the number of teams requested. If lower-bound is not specified, the effect is as if
13 lower-bound is specified as equal to upper-bound. The number of teams created is implementation
14 defined, but it will be greater than or equal to the lower bound and less than or equal to the upper
15 bound.
If the num_teams clause is not specified on a construct, the effect is as if upper-bound was
specified as follows. If the value of the nteams-var ICV is greater than zero, the effect is as if
upper-bound was specified as an implementation-defined value greater than zero but less than or
equal to the value of the nteams-var ICV. Otherwise, the effect is as if upper-bound was specified as
an implementation-defined value greater than or equal to one.
21 Restrictions
22 • lower-bound must be less than or equal to upper-bound.
23 Cross References
24 • teams directive, see Section 10.2
27 Arguments
Name        Type                   Properties
ordering    Keyword: concurrent    default
3 Directives
4 distribute, do, for, loop, simd
5 Semantics
6 The order clause specifies an ordering of execution for the iterations of the associated loops of a
7 loop-associated directive. If ordering is concurrent, the logical iterations of the associated
8 loops may execute in any order, including concurrently.
9 The order-modifier on the order clause affects the schedule specification for the purpose of
10 determining its consistency with other schedules (see Section 4.4.5). If order-modifier is
11 reproducible, the loop schedule for the construct on which the clause appears is reproducible,
12 whereas if order-modifier is unconstrained, the loop schedule is not reproducible.
13 Restrictions
14 Restrictions to the order clause are as follows:
15 • The only constructs that may be encountered inside a region that corresponds to a construct with
16 an order clause that specifies concurrent are the loop construct, the parallel
17 construct, the simd construct, and combined constructs for which the first construct is a
18 parallel construct.
19 • A region that corresponds to a construct with an order clause that specifies concurrent may
20 not contain calls to procedures that contain OpenMP directives.
21 • A region that corresponds to a construct with an order clause that specifies concurrent may
22 not contain OpenMP runtime API calls.
23 • If a threadprivate variable is referenced inside a region that corresponds to a construct with an
24 order clause that specifies concurrent, the behavior is unspecified.
25 Cross References
26 • distribute directive, see Section 11.6
27 • do directive, see Section 11.5.2
28 • for directive, see Section 11.5.1
29 • loop directive, see Section 11.7
30 • simd directive, see Section 10.4
16 Arguments
Name    Type                             Properties
list    list of variable list item type  default
18 Directives
19 simd
20 Semantics
21 The nontemporal clause specifies that accesses to the storage locations to which the list items
22 refer have low temporal locality across the iterations in which those storage locations are accessed.
23 The list items of the nontemporal clause may also appear as list items of data-environment
24 attribute clauses.
25 Cross References
26 • simd directive, see Section 10.4
3 Arguments
Name      Type                          Properties
length    expression of integer type    positive, constant
5 Directives
6 simd
7 Semantics
8 The safelen clause specifies that no two concurrent iterations within a SIMD chunk can have a
9 distance in the logical iteration space that is greater than or equal to the value given in the clause.
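As an illustration of the distance rule stated above, consider a loop with a loop-carried dependence of distance 4. The sketch below is hypothetical (the function name and the data are not from the specification): with safelen(4), no two concurrent iterations are 4 or more apart, so the read of a[i - 4] is never concurrent with the iteration that writes it.

```c
#include <assert.h>

/* Each element depends on the element four positions earlier, so the
   loop carries a dependence of distance 4 in the logical iteration space. */
void add_with_stride4(float *a, const float *b, int n)
{
    assert(a != 0 && b != 0);
    /* safelen(4): concurrent iterations are fewer than 4 apart, which
       keeps iteration i from running concurrently with iteration i - 4. */
    #pragma omp simd safelen(4)
    for (int i = 4; i < n; ++i) {
        a[i] = a[i - 4] + b[i];
    }
}
```

The result matches sequential execution whether or not the compiler vectorizes the loop, because the directive only permits concurrency that respects the dependence.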
10 Cross References
11 • simd directive, see Section 10.4
14 Arguments
Name      Type                          Properties
length    expression of integer type    positive, constant
16 Directives
17 declare simd, simd
18 Semantics
19 When the simdlen clause appears on a simd construct, length is treated as a hint that specifies
20 the preferred number of iterations to be executed concurrently. When the simdlen clause appears
21 on a declare simd construct, if a SIMD version of the associated function is created, length
22 corresponds to the number of concurrent arguments of the function.
23 Cross References
24 • declare simd directive, see Section 7.7
25 • simd directive, see Section 10.4
3 Clauses
4 filter
5 Additional information
6 The directive-name master may be used as a synonym to masked if no clauses are specified.
7 This syntax has been deprecated.
8 Binding
9 The binding thread set for a masked region is the current team. A masked region binds to the
10 innermost enclosing parallel region.
11 Semantics
12 The masked construct specifies a structured block that is executed by a subset of the threads of the
13 current team. The filter clause selects a subset of the threads of the team that executes the
14 binding parallel region to execute the structured block of the masked region. Other threads in the
15 team do not execute the associated structured block. No implied barrier occurs either on entry to or
16 exit from the masked construct. The result of evaluating the thread_num parameter of the
17 filter clause may vary across threads.
18 If more than one thread in the team executes the structured block of a masked region, the
19 structured block must include any synchronization required to ensure that data races do not occur.
25 Tool Callbacks
26 A thread dispatches a registered ompt_callback_masked callback with
27 ompt_scope_begin as its endpoint argument for each occurrence of a masked-begin event in
28 that thread. Similarly, a thread dispatches a registered ompt_callback_masked callback with
29 ompt_scope_end as its endpoint argument for each occurrence of a masked-end event in that
30 thread. These callbacks occur in the context of the task executed by the current thread and have the
31 type signature ompt_callback_masked_t.
32 Cross References
33 • ompt_callback_masked_t, see Section 19.5.2.12
34 • ompt_scope_endpoint_t, see Section 19.4.4.11
35 • filter clause, see Section 10.5.1
3 Arguments
Name          Type                          Properties
thread_num    expression of integer type    default
5 Directives
6 masked
7 Semantics
8 If thread_num specifies the thread number of the current thread in the current team then the
9 filter clause selects the current thread. If the filter clause is not specified, the effect is as if
10 the clause is specified with thread_num equal to zero, so that the filter clause selects the
11 primary thread. The use of a variable in a thread_num clause expression causes an implicit
12 reference to the variable in all enclosing constructs.
13 Cross References
14 • masked directive, see Section 10.5
12 Restrictions
13 The following restrictions apply to work-distribution constructs:
14 • Each work-distribution region must be encountered by all threads in the binding thread set or by
15 none at all unless cancellation has been requested for the innermost enclosing parallel region.
16 • The sequence of encountered work-distribution regions that have the same binding thread set
17 must be the same for every thread in the binding thread set.
18 • The sequence of encountered worksharing regions and barrier regions that bind to the same
19 thread team must be the same for every thread in the team.
22 Clauses
23 allocate, copyprivate, firstprivate, nowait, private
24 Binding
25 The binding thread set for a single region is the current team. A single region binds to the
26 innermost enclosing parallel region. Only the threads of the team that executes the binding
27 parallel region participate in the execution of the structured block and the implied barrier of the
28 single region if the barrier is not eliminated by a nowait clause.
1 Semantics
2 The single construct specifies that the associated structured block is executed by only one of the
3 threads in the team (not necessarily the primary thread), in the context of its implicit task. The
4 method of choosing a thread to execute the structured block each time the team encounters the
5 construct is implementation defined. An implicit barrier occurs at the end of a single region if
6 the nowait clause is not specified.
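A minimal sketch of these semantics follows; the function name and the counter are illustrative assumptions. Each time the team encounters the single construct, exactly one thread runs the block, so the shared counter is incremented once regardless of the team size.

```c
#include <assert.h>

/* Returns how many times the structured block of the single construct
   ran for one encounter of the construct by the team. */
int count_single_executions(void)
{
    int executions = 0;   /* shared among the threads of the team */

    #pragma omp parallel shared(executions)
    {
        #pragma omp single
        {
            executions += 1;   /* executed by only one thread of the team */
        }
        /* Implicit barrier here, since no nowait clause is specified. */
    }

    assert(executions == 1);
    return executions;
}
```

Which thread performs the increment is implementation defined; the sketch only relies on the block running once per encounter.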
12 Tool Callbacks
13 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
14 as its endpoint argument for each occurrence of a single-begin event in that thread. Similarly, a
15 thread dispatches a registered ompt_callback_work callback with ompt_scope_end as its
16 endpoint argument for each occurrence of a single-end event in that thread. For each of these
17 callbacks, the wstype argument is ompt_work_single_executor if the thread executes the
18 structured block associated with the single region; otherwise, the wstype argument is
19 ompt_work_single_other. The callback has type signature ompt_callback_work_t.
20 Restrictions
21 Restrictions to the single construct are as follows:
22 • The copyprivate clause must not be used with the nowait clause.
23 Cross References
24 • ompt_callback_work_t, see Section 19.5.2.5
25 • ompt_scope_endpoint_t, see Section 19.4.4.11
26 • ompt_work_t, see Section 19.4.4.16
27 • allocate clause, see Section 6.6
28 • copyprivate clause, see Section 5.7.2
29 • firstprivate clause, see Section 5.4.4
30 • nowait clause, see Section 15.6
31 • private clause, see Section 5.4.3
3 Clauses
4 allocate, firstprivate, nowait, private, reduction
5 Binding
6 The binding thread set for a scope region is the current team. A scope region binds to the
7 innermost enclosing parallel region. Only the threads of the team that executes the binding parallel
8 region participate in the execution of the structured block and the implied barrier of the scope
9 region if the barrier is not eliminated by a nowait clause.
10 Semantics
11 The scope construct specifies that all threads in a team execute the associated structured block and
12 any additionally specified OpenMP operations. An implicit barrier occurs at the end of a scope
13 region if the nowait clause is not specified.
19 Tool Callbacks
20 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
21 as its endpoint argument and ompt_work_scope as its work_type argument for each occurrence
22 of a scope-begin event in that thread. Similarly, a thread dispatches a registered
23 ompt_callback_work callback with ompt_scope_end as its endpoint argument and
24 ompt_work_scope as its work_type argument for each occurrence of a scope-end event in that
25 thread. The callbacks occur in the context of the implicit task. The callbacks have type signature
26 ompt_callback_work_t.
27 Cross References
28 • ompt_callback_work_t, see Section 19.5.2.5
29 • ompt_scope_endpoint_t, see Section 19.4.4.11
30 • ompt_work_t, see Section 19.4.4.16
31 • allocate clause, see Section 6.6
32 • firstprivate clause, see Section 5.4.4
33 • nowait clause, see Section 15.6
5 Separating directives
6 section
7 Clauses
8 allocate, firstprivate, lastprivate, nowait, private, reduction
9 Binding
10 The binding thread set for a sections region is the current team. A sections region binds to
11 the innermost enclosing parallel region. Only the threads of the team that executes the binding
12 parallel region participate in the execution of the structured block sequences and the implied
13 barrier of the sections region if the barrier is not eliminated by a nowait clause.
14 Semantics
15 The sections construct is a non-iterative worksharing construct that contains a structured block
16 that consists of a set of structured block sequences that are to be distributed among and executed by
17 the threads in a team. Each structured block sequence is executed by one of the threads in the team
18 in the context of its implicit task. An implicit barrier occurs at the end of a sections region if
19 the nowait clause is not specified.
20 Each structured block sequence in the sections construct is preceded by a section directive
21 except possibly the first sequence, for which a preceding section directive is optional. The
22 method of scheduling the structured block sequences among the threads in the team is
23 implementation defined.
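The following sketch shows two structured block sequences that can be handed to different threads. The `totals` type and the function name are hypothetical. The two sections write to different members, so no additional synchronization is needed, and the implicit barrier at the end of the sections region ensures both results are complete before they are read.

```c
#include <assert.h>

typedef struct {
    int sum;
    int product;
} totals;

/* Computes the sum and the product of v[0..n-1] in two separate sections. */
totals sum_and_product(const int *v, int n)
{
    totals t = { 0, 1 };
    assert(v != 0 && n >= 0);

    #pragma omp parallel shared(t)
    {
        #pragma omp sections
        {
            #pragma omp section   /* optional for the first sequence */
            {
                for (int i = 0; i < n; ++i)
                    t.sum += v[i];
            }
            #pragma omp section
            {
                for (int i = 0; i < n; ++i)
                    t.product *= v[i];
            }
        }
        /* Implicit barrier at the end of the sections region. */
    }
    return t;
}
```

How the two sequences are scheduled among the threads is implementation defined, so the code must not assume that they run on different threads or in a particular order.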
24 Execution Model Events
25 The sections-begin event occurs after an implicit task encounters a sections construct but before
26 the task executes any structured block sequences of the sections region.
27 The sections-end event occurs after an implicit task finishes execution of a sections region but
28 before it resumes execution of the enclosing context.
29 Tool Callbacks
30 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
31 as its endpoint argument and ompt_work_sections as its work_type argument for each
32 occurrence of a sections-begin event in that thread. Similarly, a thread dispatches a registered
33 ompt_callback_work callback with ompt_scope_end as its endpoint argument and
34 ompt_work_sections as its work_type argument for each occurrence of a sections-end event
35 in that thread. The callbacks occur in the context of the implicit task. The callbacks have type
36 signature ompt_callback_work_t.
15 Separated directives
16 sections
17 Semantics
18 The section directive may be used to separate the structured block that is associated with a
19 sections construct into multiple sections, each of which is a structured block sequence.
23 Tool Callbacks
24 A thread dispatches a registered ompt_callback_dispatch callback for each occurrence of a
25 section-begin event in that thread. The callback occurs in the context of the implicit task. The
26 callback has type signature ompt_callback_dispatch_t.
27 Cross References
28 • sections directive, see Section 11.3
3 Clauses
4 nowait
5 Binding
6 The binding thread set for a workshare region is the current team. A workshare region binds
7 to the innermost enclosing parallel region. Only the threads of the team that executes the
8 binding parallel region participate in the execution of the units of work and the implied barrier
9 of the workshare region if the barrier is not eliminated by a nowait clause.
10 Semantics
11 The workshare construct divides the execution of the associated structured block into separate
12 units of work and causes the threads of the team to share the work such that each unit is executed
13 only once by one thread, in the context of its implicit task. An implicit barrier occurs at the end of a
14 workshare region if a nowait clause is not specified.
15 An implementation of the workshare construct must insert any synchronization that is required
16 to maintain standard Fortran semantics. For example, the effects of one statement within the
17 structured block must appear to occur before the execution of succeeding statements, and the
18 evaluation of the right hand side of an assignment must appear to complete prior to the effects of
19 assigning to the left hand side.
20 The statements in the workshare construct are divided into units of work as follows:
21 • For array expressions within each statement, including transformational array intrinsic functions
22 that compute scalar values from arrays:
23 – Evaluation of each element of the array expression, including any references to elemental
24 functions, is a unit of work.
25 – Evaluation of transformational array intrinsic functions may be freely subdivided into any
26 number of units of work.
27 • For array assignment statements, assignment of each element is a unit of work.
28 • For scalar assignment statements, each assignment operation is a unit of work.
29 • For WHERE statements or constructs, evaluation of the mask expression and the masked
30 assignments are each a unit of work.
31 • For FORALL statements or constructs, evaluation of the mask expression, expressions occurring
32 in the specification of the iteration space, and the masked assignments are each a unit of work.
1 • For atomic constructs, critical constructs, and parallel constructs, the construct is a
2 unit of work. A new thread team executes the statements contained in a parallel construct.
3 • If none of the rules above apply to a portion of a statement in the structured block, then that
4 portion is a unit of work.
5 The transformational array intrinsic functions are MATMUL, DOT_PRODUCT, SUM, PRODUCT,
6 MAXVAL, MINVAL, COUNT, ANY, ALL, SPREAD, PACK, UNPACK, RESHAPE, TRANSPOSE,
7 EOSHIFT, CSHIFT, MINLOC, and MAXLOC.
8 How units of work are assigned to the threads that execute a workshare region is unspecified.
9 If an array expression in the block references the value, association status, or allocation status of
10 private variables, the value of the expression is undefined, unless the same value would be
11 computed by every thread.
12 If an array assignment, a scalar assignment, a masked array assignment, or a FORALL assignment
13 assigns to a private variable in the block, the result is unspecified.
14 The workshare directive causes the sharing of work to occur only in the workshare construct,
15 and not in the remainder of the workshare region.
21 Tool Callbacks
22 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
23 as its endpoint argument and ompt_work_workshare as its work_type argument for each
24 occurrence of a workshare-begin event in that thread. Similarly, a thread dispatches a registered
25 ompt_callback_work callback with ompt_scope_end as its endpoint argument and
26 ompt_work_workshare as its work_type argument for each occurrence of a workshare-end
27 event in that thread. The callbacks occur in the context of the implicit task. The callbacks have type
28 signature ompt_callback_work_t.
29 Restrictions
30 Restrictions to the workshare construct are as follows:
31 • The only OpenMP constructs that may be closely nested inside a workshare construct are the
32 atomic, critical, and parallel constructs.
33 • Base language statements that are encountered inside a workshare construct but that are not
34 enclosed within a parallel or atomic construct that is nested inside the workshare
35 construct must consist of only the following:
36 – array assignments;
1 Restrictions
2 Restrictions to the worksharing-loop construct are as follows:
3 • The logical iteration space of the loops associated with the worksharing-loop construct must be
4 the same for all threads in the team.
5 • The value of the run-sched-var ICV must be the same for all threads in the team.
6 Cross References
7 • Consistent Loop Schedules, see Section 4.4.5
8 • OMP_SCHEDULE, see Section 21.2.1
9 • ompt_callback_work_t, see Section 19.5.2.5
10 • ompt_scope_endpoint_t, see Section 19.4.4.11
11 • ompt_work_t, see Section 19.4.4.16
12 • do directive, see Section 11.5.2
13 • for directive, see Section 11.5.1
14 • nowait clause, see Section 15.6
15 • order clause, see Section 10.3
16 • schedule clause, see Section 11.5.3
3 Separating directives
4 scan
5 Clauses
6 allocate, collapse, firstprivate, lastprivate, linear, nowait, order,
7 ordered, private, reduction, schedule
8 Semantics
9 The for construct is a worksharing-loop construct.
10 Cross References
11 • Worksharing-Loop Constructs, see Section 11.5
12 • allocate clause, see Section 6.6
13 • collapse clause, see Section 4.4.3
14 • firstprivate clause, see Section 5.4.4
15 • lastprivate clause, see Section 5.4.5
16 • linear clause, see Section 5.4.6
17 • nowait clause, see Section 15.6
18 • order clause, see Section 10.3
19 • ordered clause, see Section 4.4.4
20 • private clause, see Section 5.4.3
21 • reduction clause, see Section 5.5.8
22 • scan directive, see Section 5.6
23 • schedule clause, see Section 11.5.3
C / C++
1 11.5.2 do Construct
Name: do                 Association: loop
Category: executable     Properties: work-distribution, worksharing, worksharing-loop, cancellable, context-matching
3 Separating directives
4 scan
5 Clauses
6 allocate, collapse, firstprivate, lastprivate, linear, nowait, order,
7 ordered, private, reduction, schedule
8 Semantics
9 The do construct is a worksharing-loop construct.
10 Cross References
11 • Worksharing-Loop Constructs, see Section 11.5
12 • allocate clause, see Section 6.6
13 • collapse clause, see Section 4.4.3
14 • firstprivate clause, see Section 5.4.4
15 • lastprivate clause, see Section 5.4.5
16 • linear clause, see Section 5.4.6
17 • nowait clause, see Section 15.6
18 • order clause, see Section 10.3
19 • ordered clause, see Section 4.4.4
20 • private clause, see Section 5.4.3
21 • reduction clause, see Section 5.5.8
22 • scan directive, see Section 5.6
23 • schedule clause, see Section 11.5.3
Fortran
3 Arguments
Name          Type                                               Properties
kind          Keyword: auto, dynamic, guided, runtime, static    default
chunk_size    expression of integer type                         ultimate, optional, positive, region-invariant
5 Modifiers
Name                 Modifies    Type                                 Properties
ordering-modifier    kind        Keyword: monotonic, nonmonotonic     unique
chunk-modifier       kind        Keyword: simd                        unique
7 Directives
8 do, for
9 Semantics
10 The schedule clause specifies how iterations of associated loops of a worksharing-loop construct
11 are divided into contiguous non-empty subsets, called chunks, and how these chunks are distributed
12 among threads of the team. The chunk_size expression is evaluated using the original list items of
13 any variables that are made private in the worksharing-loop construct. Whether, in what order, or
14 how many times, any side effects of the evaluation of this expression occur is unspecified. The use
15 of a variable in a schedule clause expression of a worksharing-loop construct causes an implicit
16 reference to the variable in all enclosing constructs.
17 If the kind argument is static, iterations are divided into chunks of size chunk_size, and the
18 chunks are assigned to the threads in the team in a round-robin fashion in the order of the thread
19 number. Each chunk contains chunk_size iterations, except for the chunk that contains the
20 sequentially last iteration, which may have fewer iterations. If chunk_size is not specified, the
21 logical iteration space is divided into chunks that are approximately equal in size, and at most one
22 chunk is distributed to each thread.
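For the case in which chunk_size is specified, the round-robin assignment described above can be sketched as a small helper that maps a logical iteration to a thread number. The function name and the 0-based iteration numbering are assumptions for illustration; the case without chunk_size is not covered because its exact split is left to the implementation.

```c
#include <assert.h>

/* Thread number that executes logical iteration `iter` (0-based) under
   schedule(static, chunk_size) with a team of `num_threads` threads. */
int static_owner(long iter, long chunk_size, int num_threads)
{
    assert(iter >= 0 && chunk_size > 0 && num_threads > 0);
    long chunk_index = iter / chunk_size;      /* chunks are contiguous     */
    return (int)(chunk_index % num_threads);   /* round-robin by thread no. */
}
```

With chunk_size 2 and three threads, iterations 0 and 1 go to thread 0, iterations 2 and 3 to thread 1, iterations 4 and 5 to thread 2, and iteration 6 returns to thread 0.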
23 If the kind argument is dynamic, each thread executes a chunk, then requests another chunk, until
24 no chunks remain to be assigned. Each chunk contains chunk_size iterations, except for the chunk
25 that contains the sequentially last iteration, which may have fewer iterations. If chunk_size is not
26 specified, it defaults to 1.
27 If the kind argument is guided, each thread executes a chunk, then requests another chunk, until
28 no chunks remain to be assigned. For a chunk_size of 1, the size of each chunk is proportional to
29 the number of unassigned iterations divided by the number of threads in the team, decreasing to 1.
30 For a chunk_size with value k > 1, the size of each chunk is determined in the same way, with the
Note – For a team of p threads and a loop of n iterations, let ⌈n/p⌉ be the integer q that satisfies
n = p ∗ q − r, with 0 ≤ r < p. One compliant implementation of the static schedule (with no
specified chunk_size) would behave as though chunk_size had been specified with value q. Another
compliant implementation would assign q iterations to the first p − r threads, and q − 1 iterations to
the remaining r threads. This illustrates why a conforming program must not rely on the details of a
particular implementation.
A compliant implementation of the guided schedule with a chunk_size value of k would assign
q = ⌈n/p⌉ iterations to the first available thread and set n to the larger of n − q and p ∗ k. It would
then repeat this process until q is greater than or equal to the number of remaining iterations, at
which time the remaining iterations form the final chunk. Another compliant implementation could
use the same method, except with q = ⌈n/(2p)⌉, and set n to the larger of n − q and 2 ∗ p ∗ k.
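The arithmetic in this note can be traced with a short simulation. The sketch below models the second static variant (q iterations for the first p − r threads, q − 1 for the rest) and the first guided variant; the function names and the output-array interface are hypothetical, and the code simulates chunk sizes only and does not reflect how any particular runtime is implemented.

```c
#include <assert.h>

/* Ceiling of a / b for a >= 0 and b > 0. */
static long ceil_div(long a, long b) { return (a + b - 1) / b; }

/* Second static variant from the note: with n = p*q - r and 0 <= r < p,
   the first p - r threads get q iterations and the remaining r get q - 1. */
long static_share(long n, int p, int thread_num)
{
    assert(n >= 0 && p > 0 && thread_num >= 0 && thread_num < p);
    long q = ceil_div(n, p);
    long r = (long)p * q - n;
    return thread_num < p - r ? q : q - 1;
}

/* First guided variant from the note. Writes successive chunk sizes to
   `chunks` (capacity `cap`) and returns the number of chunks produced. */
int guided_chunks(long n, int p, long k, long *chunks, int cap)
{
    assert(n > 0 && p > 0 && k > 0 && chunks != 0 && cap > 0);
    long remaining = n;   /* iterations not yet assigned */
    int count = 0;
    for (;;) {
        long q = ceil_div(n, p);
        assert(count < cap);
        if (q >= remaining) {             /* final chunk takes what is left */
            chunks[count++] = remaining;
            return count;
        }
        chunks[count++] = q;
        remaining -= q;
        n = (n - q > (long)p * k) ? n - q : (long)p * k;
    }
}
```

For n = 10 and p = 4 the static variant gives shares of 3, 3, 2, and 2, while treating q as chunk_size would instead give 3, 3, 3, and 1, which is the implementation difference the note warns about.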
28 If the monotonic ordering-modifier is specified then each thread executes the chunks that it is
29 assigned in increasing logical iteration order. When the nonmonotonic ordering-modifier is
30 specified then chunks may be assigned to threads in any order and the behavior of an application
31 that depends on any execution order of the chunks is unspecified. If an ordering-modifier is not
32 specified, the effect is as if the monotonic modifier is specified if the kind argument is static
33 or an ordered clause is specified on the construct; otherwise, the effect is as if the
34 nonmonotonic modifier is specified.
35 Restrictions
36 Restrictions to the schedule clause are as follows:
37 • The schedule clause cannot be specified if any of the associated loops are non-rectangular.
38 • The value of the chunk_size expression must be the same for all threads in the team.
4 Cross References
5 • do directive, see Section 11.5.2
6 • for directive, see Section 11.5.1
7 • ordered clause, see Section 4.4.4
8 • run-sched-var ICV, see Table 2.1
11 Clauses
12 allocate, collapse, dist_schedule, firstprivate, lastprivate, order,
13 private
14 Binding
15 The binding thread set for a distribute region is the set of initial threads executing an
16 enclosing teams region. A distribute region binds to this teams region.
17 Semantics
18 The distribute construct specifies that the iterations of one or more loops will be executed by
19 the initial teams in the context of their implicit tasks. The iterations are distributed across the initial
20 threads of all initial teams that execute the teams region to which the distribute region binds.
21 No implicit barrier occurs at the end of a distribute region. To avoid data races the original list
22 items that are modified due to lastprivate clauses should not be accessed between the end of
23 the distribute construct and the end of the teams region to which the distribute binds.
24 If the dist_schedule clause is not specified, the schedule is implementation defined.
25 At the beginning of each logical iteration, the loop iteration variable or the variable declared by
26 range-decl of each associated loop has the value that it would have if the set of the associated loops
27 was executed sequentially.
28 The schedule is reproducible if one of the following conditions is true:
29 • The order clause is specified with the reproducible modifier; or
30 • The dist_schedule clause is specified with static as the kind parameter and the order
31 clause is not specified with the unconstrained order-modifier.
8 Arguments
Name          Type                          Properties
kind          Keyword: static               default
chunk_size    expression of integer type    ultimate, optional, positive, region-invariant
10 Directives
11 distribute
12 Semantics
The dist_schedule clause specifies how iterations of associated loops of a distribute
construct are divided into contiguous non-empty subsets, called chunks, and how these chunks are
distributed among the teams of the league. If chunk_size is not specified, the iteration space is
divided into chunks that are approximately equal in size, and at most one chunk is distributed to
each initial team of the league.
18 If the chunk_size argument is specified, iterations are divided into chunks of size chunk_size. The
19 chunk_size expression is evaluated using the original list items of any variables that are made
20 private in the distribute construct. Whether, in what order, or how many times, any side
21 effects of the evaluation of this expression occur is unspecified. The use of a variable in a
22 dist_schedule clause expression of a distribute construct causes an implicit reference to
23 the variable in all enclosing constructs. These chunks are assigned to the initial teams of the league
24 in a round-robin fashion in the order of the initial team number.
25 Restrictions
26 Restrictions to the dist_schedule clause are as follows:
27 • The value of the chunk_size expression must be the same for all teams in the league.
28 • The dist_schedule clause cannot be specified if any of the associated loops are
29 non-rectangular.
30 Cross References
31 • distribute directive, see Section 11.6
3 Clauses
4 bind, collapse, lastprivate, order, private, reduction
5 Binding
6 The bind clause determines the binding region, which determines the binding thread set.
7 Semantics
8 A loop construct specifies that the logical iterations of the associated loops may execute
9 concurrently and permits the encountering threads to execute the loop accordingly. A loop
10 construct is a worksharing construct if its binding region is the innermost enclosing parallel region.
11 Otherwise it is not a worksharing region. The directive asserts that the iterations of the associated
12 loops may execute in any order, including concurrently. Each logical iteration is executed once per
13 instance of the loop region that is encountered by exactly one thread that is a member of the
14 binding thread set.
15 At the beginning of each logical iteration, the loop iteration variable or the variable declared by
16 range-decl of each associated loop has the value that it would have if the set of the associated loops
17 was executed sequentially.
18 If the order clause is not present, the behavior is as if an order clause that specifies
19 concurrent appeared on the construct. The loop schedule for a loop construct is reproducible
20 unless the order clause is present with the unconstrained order-modifier.
21 If the loop region binds to a teams region, the threads in the binding thread set may continue
22 execution after the loop region without waiting for all logical iterations of the associated loops to
23 complete. The iterations are guaranteed to complete before the end of the teams region. If the
24 loop region does not bind to a teams region, all logical iterations of the associated loops must
25 complete before the encountering threads continue execution after the loop region.
26 For the purpose of determining its consistency with other schedules, the schedule is defined by the
27 implicit order clause. The schedule is reproducible if the schedule specified through the implicit
28 order clause is reproducible.
29 Restrictions
30 Restrictions to the loop construct are as follows:
31 • A list item may not appear in a lastprivate clause unless it is the loop iteration variable of a
32 loop that is associated with the construct.
33 • If a reduction-modifier is specified in a reduction clause that appears on the directive then the
34 reduction modifier must be default.
16 Arguments
Name     Type                               Properties
binding  Keyword: parallel, teams, thread   default
18 Directives
19 loop
20 Semantics
21 The bind clause specifies the binding region of the construct on which it appears. Specifically, if
22 binding is teams and an innermost enclosing teams region exists then the binding region is that
23 teams region; if binding is parallel then the binding region is the innermost enclosing parallel
24 region, which may be an implicit parallel region; and if binding is thread then the binding region
25 is not defined. If the bind clause is not specified on a construct for which it may be specified and
26 the construct is closely nested inside a teams or parallel construct, the effect is as if binding is
27 teams or parallel. If none of those conditions hold, the binding region is not defined.
28 The specified binding region determines the binding thread set. Specifically, if the binding region is
29 a teams region, then the binding thread set is the set of initial threads that are executing that
30 region while if the binding region is a parallel region, then the binding thread set is the team of
31 threads that are executing that region. If the binding region is not defined, then the binding thread
32 set is the encountering thread.
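As an illustration of the binding rules above, the following sketch places a loop construct with bind(parallel) inside a parallel region, so that the binding thread set is the team executing that region. The function name sum_squares is hypothetical. The result does not depend on the order in which iterations execute, and the code also gives the same answer when the directives are ignored by a compiler without OpenMP support.

```c
#include <assert.h>

/* Hypothetical example: sum of i*i for 0 <= i < n. */
long sum_squares(int n)
{
    assert(n >= 0);
    long sum = 0;
    #pragma omp parallel
    {
        /* The loop region binds to the enclosing parallel region, so the
         * team's threads may execute the iterations concurrently and in
         * any order. The reduction combines the per-thread partial sums. */
        #pragma omp loop bind(parallel) reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += (long)i * i;
    }
    return sum;
}
```

Because the loop region here does not bind to a teams region, all logical iterations complete before the threads continue past the loop.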
10 Cross References
11 • loop directive, see Section 11.7
12 • parallel construct, see Section 10.1
• teams construct, see Section 10.2
5 Directives
6 task, taskloop
7 Semantics
8 The untied clause specifies that tasks generated by the construct on which it appears are untied,
9 which means that any thread in the team can resume the task region after a suspension. If the
10 untied clause is not specified on a construct on which it may appear, generated tasks are tied; if a
11 tied task is suspended, its task region can only be resumed by the thread that started its execution.
12 If a generated task is a final or an included task, the untied clause is ignored and the task is tied.
13 Cross References
14 • task directive, see Section 12.5
15 • taskloop directive, see Section 12.6
18 Directives
19 task, taskloop
20 Semantics
21 The mergeable clause specifies that tasks generated by the construct on which it appears are
22 mergeable tasks.
23 Cross References
24 • task directive, see Section 12.5
25 • taskloop directive, see Section 12.6
1 12.3 final Clause
2 Name: final Properties: unique
3 Arguments
Name      Type                         Properties
finalize  expression of logical type   default
5 Directives
6 task, taskloop
7 Semantics
8 The final clause specifies that tasks generated by the construct on which it appears are final tasks
9 if the finalize expression evaluates to true. All task constructs that are encountered during
10 execution of a final task generate final and included tasks. The use of a variable in a finalize
11 expression causes an implicit reference to the variable in all enclosing constructs. The finalize
expression is evaluated in the context outside of the construct on which the clause appears.
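A common use of the final clause is to stop generating deferred tasks below some problem size. The sketch below is one possible illustration, not a recommended implementation: the names fib, fib_tasks and SERIAL_CUTOFF are hypothetical, and the cutoff value is arbitrary. Once the finalize expression evaluates to true, the generated task is final, and every task construct encountered inside it generates a final and included task.

```c
#include <assert.h>

enum { SERIAL_CUTOFF = 10 }; /* hypothetical threshold, chosen arbitrarily */

static long fib(int n)
{
    if (n < 2)
        return n;
    long a = 0, b = 0;
    /* The finalize expression is evaluated outside the task construct.
     * When it is true, this task and all of its descendants are final. */
    #pragma omp task shared(a) final(n <= SERIAL_CUTOFF)
    a = fib(n - 1);
    #pragma omp task shared(b) final(n <= SERIAL_CUTOFF)
    b = fib(n - 2);
    #pragma omp taskwait
    return a + b;
}

long fib_tasks(int n)
{
    assert(n >= 0);
    long result = 0;
    #pragma omp parallel
    #pragma omp single
    result = fib(n);
    return result;
}
```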
13 Cross References
14 • task directive, see Section 12.5
15 • taskloop directive, see Section 12.6
18 Arguments
Name            Type                         Properties
priority-value  expression of integer type   constant, non-negative
20 Directives
21 task, taskloop
22 Semantics
23 The priority clause specifies a hint for the task execution order of tasks generated by the
24 construct on which it appears in the priority-value argument. Among all tasks ready to be executed,
25 higher priority tasks (those with a higher numerical priority-value) are recommended to execute
26 before lower priority ones. The default priority-value when no priority clause is specified is
27 zero (the lowest priority). If a specified priority-value is higher than the max-task-priority-var ICV
28 then the implementation will use the value of that ICV. A program that relies on the task execution
29 order being determined by the priority-value may have unspecified behavior.
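The following sketch shows the clause on two sibling tasks. The function name run_prioritized is hypothetical. In keeping with the last sentence above, the code does not rely on the execution order: each task writes a different variable and the result is read only after the taskwait. Note also that the hint is capped by max-task-priority-var, so it has no effect unless that ICV is raised, for example through the OMP_MAX_TASK_PRIORITY environment variable.

```c
#include <assert.h>

/* Hypothetical example: two tasks with different priority hints. */
int run_prioritized(void)
{
    int urgent = 0, background = 0;
    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task shared(background) priority(0)
        background = 1;
        /* A higher priority-value only recommends earlier execution. */
        #pragma omp task shared(urgent) priority(10)
        urgent = 2;
        #pragma omp taskwait
    }
    return urgent + background; /* the same for every execution order */
}
```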
7 Clauses
8 affinity, allocate, default, depend, detach, final, firstprivate, if,
9 in_reduction, mergeable, priority, private, shared, untied
10 Clause set
11 Properties: exclusive Members: detach, mergeable
12 Binding
13 The binding thread set of the task region is the current team. A task region binds to the
14 innermost enclosing parallel region.
15 Semantics
16 When a thread encounters a task construct, an explicit task is generated from the code for the
17 associated structured block. The data environment of the task is created according to the
18 data-sharing attribute clauses on the task construct, per-data environment ICVs, and any defaults
19 that apply. The data environment of the task is destroyed when the execution code of the associated
20 structured block is completed.
21 The encountering thread may immediately execute the task, or defer its execution. In the latter case,
22 any thread in the team may be assigned the task. Completion of the task can be guaranteed using
23 task synchronization constructs and clauses. If a task construct is encountered during execution
24 of an outer task, the generated task region that corresponds to this construct is not a part of the
25 outer task region unless the generated task is an included task.
26 A detachable task is completed when the execution of its associated structured block is completed
27 and the allow-completion event is fulfilled. If no detach clause is present on a task construct,
28 the generated task is completed when the execution of its associated structured block is completed.
29 A thread that encounters a task scheduling point within the task region may temporarily suspend
30 the task region.
31 The task construct includes a task scheduling point in the task region of its generating task,
32 immediately following the generation of the explicit task. Each explicit task region includes a
33 task scheduling point at the end of its associated structured block.
6 When an if clause is present on a task construct and the if clause expression evaluates to false,
7 an undeferred task is generated, and the encountering thread must suspend the current task region,
8 for which execution cannot be resumed until execution of the structured block that is associated
9 with the generated task is completed. The use of a variable in an if clause expression of a task
10 construct causes an implicit reference to the variable in all enclosing constructs. The if clause
11 expression is evaluated in the context outside of the task construct.
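The if clause is often used to avoid the cost of deferring very small tasks. The sketch below is an illustration under stated assumptions: range_sum, parallel_sum and the threshold of 1024 elements are hypothetical. When the if clause expression is false, the generated task is undeferred, so the encountering thread suspends the current task region until the generated task completes.

```c
#include <assert.h>

/* Hypothetical recursive sum of a[lo..hi). */
static long range_sum(const int *a, int lo, int hi)
{
    if (hi - lo <= 4) {
        long s = 0;
        for (int i = lo; i < hi; i++)
            s += a[i];
        return s;
    }
    int mid = lo + (hi - lo) / 2;
    long left = 0, right = 0;
    /* For small ranges the expression is false and the task is undeferred. */
    #pragma omp task shared(left) if(hi - lo > 1024)
    left = range_sum(a, lo, mid);
    right = range_sum(a, mid, hi);
    #pragma omp taskwait
    return left + right;
}

long parallel_sum(const int *a, int n)
{
    assert(n >= 0);
    long total = 0;
    #pragma omp parallel
    #pragma omp single
    total = range_sum(a, 0, n);
    return total;
}
```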
15 Tool Callbacks
16 A thread dispatches a registered ompt_callback_task_create callback for each occurrence
17 of a task-create event in the context of the encountering task. This callback has the type signature
18 ompt_callback_task_create_t and the flags argument indicates the task types shown in
19 Table 12.1.
20 Cross References
21 • Task Scheduling, see Section 12.9
22 • omp_fulfill_event, see Section 18.11.1
23 • ompt_callback_task_create_t, see Section 19.5.2.7
24 • affinity clause, see Section 12.5.1
25 • allocate clause, see Section 6.6
15 Arguments
Name          Type                             Properties
locator-list  list of locator list item type   default
17 Modifiers
Name      Modifies      Type                                      Properties
iterator  locator-list  Complex, name: iterator                   unique
                        Arguments: iterator-specifier
                        (OpenMP expression, repeatable)
19 Directives
20 task
21 Semantics
22 The affinity clause specifies a hint to indicate data affinity of tasks generated by the construct
23 on which it appears. The hint recommends to execute generated tasks close to the location of the
24 original list items. A program that relies on the task execution location being determined by this list
25 may have unspecified behavior.
10 Arguments
Name          Type                            Properties
event-handle  variable of event_handle type   default
12 Directives
13 task
14 Semantics
15 The detach clause specifies that the task generated by the construct on which it appears is a
16 detachable task. A new allow-completion event is created and connected to the completion of the
17 associated task region. The original event-handle is updated to represent that allow-completion
18 event before the task data environment is created. The event-handle is considered as if it was
19 specified on a firstprivate clause. The use of a variable in a detach clause expression of a
20 task construct causes an implicit reference to the variable in all enclosing constructs.
21 Restrictions
22 Restrictions to the detach clause are as follows:
23 • If a detach clause appears on a directive, then the encountering task must not be a final task.
24 • A variable that appears in a detach clause cannot appear as a list item on a data-environment
25 attribute clause on the same construct.
26 • A variable that is part of another variable (as an array element or a structure element) cannot
27 appear in a detach clause.
10 Clauses
11 allocate, collapse, default, final, firstprivate, grainsize, if,
12 in_reduction, lastprivate, mergeable, nogroup, num_tasks, priority,
13 private, reduction, shared, untied
18 Binding
19 The binding thread set of the taskloop region is the current team. A taskloop region binds to
20 the innermost enclosing parallel region.
21 Semantics
22 When a thread encounters a taskloop construct, the construct partitions the iterations of the
23 associated loops into chunks, each of which is assigned to an explicit task for parallel execution.
24 The iteration count for each associated loop is computed before entry to the outermost loop. The
25 data environment of each generated task is created according to the data-sharing attribute clauses
26 on the taskloop construct, per-data environment ICVs, and any defaults that apply. The order of
27 the creation of the loop tasks is unspecified. Programs that rely on any execution order of the
28 logical iterations are non-conforming.
12 Arguments
Name        Type                         Properties
grain-size  expression of integer type   positive
14 Modifiers
Name              Modifies    Type              Properties
prescriptiveness  grain-size  Keyword: strict   unique
16 Directives
17 taskloop
18 Semantics
The grainsize clause specifies the number of logical iterations, L_t, that are assigned to each
generated task t. If prescriptiveness is not specified as strict, other than possibly for the
generated task that contains the sequentially last iteration, L_t is greater than or equal to the
minimum of the value of the grain-size expression and the number of logical iterations, but less
than two times the value of the grain-size expression. If prescriptiveness is specified as strict,
other than possibly for the generated task that contains the sequentially last iteration, L_t is equal to
the value of the grain-size expression. In both cases, the generated task that contains the
sequentially last iteration may have fewer iterations than the value of the grain-size expression.
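A minimal usage sketch follows. The function add_offset and the grain size of 4 are hypothetical choices for illustration. Without the strict modifier, each generated task other than possibly the last receives at least 4 and fewer than 8 logical iterations when the loop has at least 4 iterations.

```c
#include <assert.h>

/* Hypothetical example: add offset to every element of data[0..n). */
void add_offset(int *data, int n, int offset)
{
    assert(n >= 0);
    #pragma omp parallel
    #pragma omp single
    /* One thread generates the loop tasks; any thread in the team may
     * execute them. Iterations are independent, so order does not matter. */
    #pragma omp taskloop grainsize(4)
    for (int i = 0; i < n; i++)
        data[i] += offset;
}
```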
27 Restrictions
28 Restrictions to the grainsize clause are as follows:
29 • None of the associated loops may be non-rectangular loops.
30 Cross References
31 • taskloop directive, see Section 12.6
3 Arguments
Name       Type                         Properties
num-tasks  expression of integer type   positive
5 Modifiers
Name              Modifies   Type              Properties
prescriptiveness  num-tasks  Keyword: strict   unique
7 Directives
8 taskloop
9 Semantics
10 The num_tasks clause specifies that the taskloop construct create as many tasks as the
11 minimum of the num-tasks expression and the number of logical iterations. Each task must have at
12 least one logical iteration. If prescriptiveness is specified as strict for a task loop with N logical
13 iterations, the logical iterations are partitioned in a balanced manner and each partition is assigned,
in order, to a generated task. The partition size is ⌈N/num-tasks⌉ until the number of remaining
iterations is evenly divisible by the number of remaining tasks, at which point the partition size becomes
⌊N/num-tasks⌋.
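The balanced partitioning for the strict case can be worked through with integer arithmetic. The function below is a hypothetical model of the rule, not code that any implementation is known to use. It assumes that the divisor in the ceiling and floor expressions is the number of tasks actually created, that is, the minimum of num-tasks and N.

```c
#include <assert.h>

/* Hypothetical model of num_tasks(strict: num_tasks) for n logical
 * iterations. Writes the size of each partition, in order, to sizes[]
 * and returns the number of generated tasks. sizes must have room for
 * min(num_tasks, n) entries. */
int strict_partition(int n, int num_tasks, int *sizes)
{
    assert(n > 0 && num_tasks > 0);
    int tasks = num_tasks < n ? num_tasks : n;   /* at least one iteration each */
    int floor_size = n / tasks;
    int ceil_size = (n + tasks - 1) / tasks;
    int remaining = n;
    for (int t = 0; t < tasks; t++) {
        int tasks_left = tasks - t;
        /* Use the ceiling until the remaining iterations split evenly. */
        int size = (remaining % tasks_left == 0) ? floor_size : ceil_size;
        sizes[t] = size;
        remaining -= size;
    }
    return tasks;
}
```

For N = 10 and num-tasks = 4, the ceiling is 3 and the floor is 2. The first two partitions take 3 iterations each, which leaves 4 iterations for 2 tasks, so the last two partitions take 2 iterations each.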
17 Restrictions
18 Restrictions to the num_tasks clause are as follows:
19 • None of the associated loops may be non-rectangular loops.
20 Cross References
21 • taskloop directive, see Section 12.6
24 Binding
25 A taskyield region binds to the current task region. The binding thread set of the taskyield
26 region is the current team.
27 Semantics
28 The taskyield region includes an explicit task scheduling point in the current task region.
29 Cross References
30 • Task Scheduling, see Section 12.9
14 Note – Task scheduling points dynamically divide task regions into parts. Each part is executed
15 uninterrupted from start to end. Different parts of the same task region are executed in the order in
16 which they are encountered. In the absence of task synchronization constructs, the order in which a
17 thread executes parts of different schedulable tasks is unspecified.
18 A program must behave correctly and consistently with all conceivable scheduling sequences that
19 are compatible with the rules above.
20 For example, if threadprivate storage is accessed (explicitly in the source code or implicitly
21 in calls to library routines) in one part of a task region, its value cannot be assumed to be preserved
22 into the next part of the same task region if another schedulable task exists that modifies it.
23 As another example, if a lock acquire and release happen in different parts of a task region, no
24 attempt should be made to acquire the same lock in any part of another task that the executing
25 thread may schedule. Otherwise, a deadlock is possible. A similar situation can occur when a
26 critical region spans multiple parts of a task and another schedulable task contains a
27 critical region with the same name.
28 The use of threadprivate variables and the use of locks or critical sections in an explicit task with an
29 if clause must take into account that when the if clause evaluates to false, the task is executed
30 immediately, without regard to Task Scheduling Constraint 2.
8 Cross References
9 • ompt_callback_task_schedule_t, see Section 19.5.2.10
5 Arguments
Name                     Type                         Properties
device-type-description  Keyword: any, host, nohost   default
7 Directives
8 begin declare target, declare target
9 Semantics
10 The device_type clause specifies if a version of the procedure or variable should be made
11 available on the host device, non-host devices or both the host device and non-host devices. If
12 host is specified then only a host device version of the procedure or variable is made available. If
13 any is specified then both host device and non-host device versions of the procedure or variable are
14 made available. If nohost is specified for a procedure then only non-host device versions of the
15 procedure are made available. If nohost is specified for a variable then that variable is not
16 available on the host device. If the device_type clause is not specified, the behavior is as if the
17 device_type clause appears with any specified.
18 Cross References
19 • begin declare target directive, see Section 7.8.2
20 • declare target directive, see Section 7.8.1
1 13.2 device Clause
2 Name: device Properties: unique
3 Arguments
Name                Type                         Properties
device-description  expression of integer type   default
5 Modifiers
Name             Modifies            Type                            Properties
device-modifier  device-description  Keyword: ancestor, device_num   default
7 Directives
8 dispatch, interop, target, target data, target enter data, target exit
9 data, target update
10 Semantics
11 The device clause identifies the target device that is associated with a device construct.
12 If device_num is specified as the device-modifier, the device-description specifies the device
13 number of the target device. If device-modifier does not appear in the clause, the behavior of the
14 clause is as if device-modifier is device_num. If the device-description evaluates to
15 omp_invalid_device, runtime error termination is performed.
If ancestor is specified as the device-modifier, the device-description specifies the number of
target nesting levels of the target device. Specifically, if the device-description evaluates to 1, the
18 target device is the parent device of the enclosing target region. If the construct on which the
19 device clause appears is not encountered in a target region, the current device is treated as the
20 parent device.
21 Unless otherwise specified, for directives that accept the device clause, if no device clause is
22 present, the behavior is as if the device clause appears without a device-modifier and with a
23 device-description that evaluates to the value of the default-device-var ICV.
24 Restrictions
25 • The ancestor device-modifier must not appear on the device clause on any directive other
26 than the target construct.
• If the ancestor device-modifier is specified, the device-description must evaluate to 1
and a requires directive with the reverse_offload clause must be specified.
29 • If the device_num device-modifier is specified and target-offload-var is not mandatory,
30 device-description must evaluate to a conforming device number.
12 Arguments
Name       Type                         Properties
threadlim  expression of integer type   positive
14 Directives
15 target, teams
16 Semantics
17 As described in Section 2.4, some constructs limit the number of threads that may participate in a
18 contention group initiated by each team by setting the value of the thread-limit-var ICV for the
19 initial task to an implementation-defined value greater than zero. If the thread_limit clause is
20 specified, the number of threads will be less than or equal to threadlim. Otherwise, if the
21 teams-thread-limit-var ICV is greater than zero, the effect is as if the thread_limit clause was
specified with a threadlim that evaluates to an implementation-defined value less than or equal to
23 the teams-thread-limit-var ICV.
24 Cross References
25 • target directive, see Section 13.8
26 • teams directive, see Section 10.2
12 Tool Callbacks
13 A thread dispatches a registered ompt_callback_device_initialize callback for each
14 occurrence of a device-initialize event in that thread. This callback has type signature
15 ompt_callback_device_initialize_t.
16 A thread dispatches a registered ompt_callback_device_load callback for each occurrence
17 of a device-load event in that thread. This callback has type signature
18 ompt_callback_device_load_t.
19 A thread dispatches a registered ompt_callback_device_unload callback for each
20 occurrence of a device-unload event in that thread. This callback has type signature
21 ompt_callback_device_unload_t.
22 A thread dispatches a registered ompt_callback_device_finalize callback for each
23 occurrence of a device-finalize event in that thread. This callback has type signature
24 ompt_callback_device_finalize_t.
25 Restrictions
26 Restrictions to OpenMP device initialization are as follows:
27 • No thread may offload execution of an OpenMP construct to a device until a dispatched
28 ompt_callback_device_initialize callback completes.
29 • No thread may offload execution of an OpenMP construct to a device after a dispatched
30 ompt_callback_device_finalize callback occurs.
31 Cross References
32 • ompt_callback_device_finalize_t, see Section 19.5.2.20
33 • ompt_callback_device_initialize_t, see Section 19.5.2.19
34 • ompt_callback_device_load_t, see Section 19.5.2.21
35 • ompt_callback_device_unload_t, see Section 19.5.2.22
3 Clauses
4 device, if, map, use_device_addr, use_device_ptr
7 Binding
8 The binding task set for a target data region is the generating task. The target data region
9 binds to the region of the generating task.
10 Semantics
11 The target data construct maps variables to a device data environment. When a
12 target data construct is encountered, the encountering task executes the region. When an if
13 clause is present and the if clause expression evaluates to false, the target device is the host.
14 Variables are mapped for the extent of the region, according to any data-mapping attribute clauses,
15 from the data environment of the encountering task to the device data environment.
16 A list item that appears in a map clause may also appear in a use_device_ptr clause or a
17 use_device_addr clause. If one or more map clauses are present, the list item conversions that
18 are performed for any use_device_ptr or use_device_addr clause occur after all
19 variables are mapped on entry to the region according to those map clauses.
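The following sketch shows a structured mapping with target data. The function scaled_add is hypothetical. The arrays stay mapped for the extent of the target data region, so the enclosed target region finds them already present in the device data environment. If no non-host device is available, or if the directives are ignored, the loop runs on the host with the same result.

```c
#include <assert.h>

/* Hypothetical example: y[i] += a * x[i] for 0 <= i < n. */
void scaled_add(int a, const int *x, int *y, int n)
{
    assert(n >= 0);
    /* x is copied to the device on entry; y is copied on entry and
     * copied back on exit from the target data region. */
    #pragma omp target data map(to: x[0:n]) map(tofrom: y[0:n])
    {
        #pragma omp target
        for (int i = 0; i < n; i++)
            y[i] += a * x[i];
    }
}
```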
25 Tool Callbacks
26 The tool callbacks dispatched when entering a target data region are the same as the tool
27 callbacks dispatched when encountering a target enter data construct, as described in
28 Section 13.6.
29 The tool callbacks dispatched when exiting a target data region are the same as the tool
30 callbacks dispatched when encountering a target exit data construct, as described in
31 Section 13.7.
4 Cross References
5 • device clause, see Section 13.2
6 • if clause, see Section 3.4
7 • map clause, see Section 5.8.3
8 • use_device_addr clause, see Section 5.4.10
9 • use_device_ptr clause, see Section 5.4.8
12 Clauses
13 depend, device, if, map, nowait
14 Binding
15 The binding task set for a target enter data region is the generating task, which is the target
16 task generated by the target enter data construct. The target enter data region binds
17 to the corresponding target task region.
18 Semantics
19 When a target enter data construct is encountered, the list items are mapped to the device
20 data environment according to the map clause semantics. The target enter data construct
21 generates a target task. The generated task region encloses the target enter data region. If a
22 depend clause is present, it is associated with the target task. If the nowait clause is present,
23 execution of the target task may be deferred. If the nowait clause is not present, the target task is
24 an included task.
25 All clauses are evaluated when the target enter data construct is encountered. The data
26 environment of the target task is created according to the data-mapping attribute clauses on the
27 target enter data construct, per-data environment ICVs, and any default data-sharing
28 attribute rules that apply to the target enter data construct. If a variable or part of a variable
29 is mapped by the target enter data construct, the variable has a default data-sharing attribute
30 of shared in the data environment of the target task.
12 Tool Callbacks
13 Callbacks associated with events for target tasks are the same as for the task construct defined in
14 Section 12.5; (flags & ompt_task_target) always evaluates to true in the dispatched callback.
15 A thread dispatches a registered ompt_callback_target or
16 ompt_callback_target_emi callback with ompt_scope_begin as its endpoint
17 argument and ompt_target_enter_data or ompt_target_enter_data_nowait if
18 the nowait clause is present as its kind argument for each occurrence of a target-enter-data-begin
19 event in that thread in the context of the target task on the host. Similarly, a thread dispatches a
20 registered ompt_callback_target or ompt_callback_target_emi callback with
21 ompt_scope_end as its endpoint argument and ompt_target_enter_data or
22 ompt_target_enter_data_nowait if the nowait clause is present as its kind argument
23 for each occurrence of a target-enter-data-end event in that thread in the context of the target task
24 on the host. These callbacks have type signature ompt_callback_target_t or
25 ompt_callback_target_emi_t, respectively.
26 Restrictions
27 Restrictions to the target enter data construct are as follows:
28 • At least one map clause must appear on the directive.
29 • All map clauses must be map-entering.
30 Cross References
31 • ompt_callback_target_emi_t and ompt_callback_target_t, see
32 Section 19.5.2.26
33 • depend clause, see Section 15.9.5
34 • device clause, see Section 13.2
35 • if clause, see Section 3.4
6 Clauses
7 depend, device, if, map, nowait
8 Binding
9 The binding task set for a target exit data region is the generating task, which is the target
10 task generated by the target exit data construct. The target exit data region binds to
11 the corresponding target task region.
12 Semantics
13 When a target exit data construct is encountered, the list items in the map clauses are
14 unmapped from the device data environment according to the map clause semantics. The
15 target exit data construct generates a target task. The generated task region encloses the
16 target exit data region. If a depend clause is present, it is associated with the target task. If
17 the nowait clause is present, execution of the target task may be deferred. If the nowait clause
18 is not present, the target task is an included task.
19 All clauses are evaluated when the target exit data construct is encountered. The data
20 environment of the target task is created according to the data-mapping attribute clauses on the
21 target exit data construct, per-data environment ICVs, and any default data-sharing attribute
22 rules that apply to the target exit data construct. If a variable or part of a variable is mapped
23 by the target exit data construct, the variable has a default data-sharing attribute of shared in
24 the data environment of the target task.
25 Assignment operations associated with mapping a variable (see Section 5.8.3) occur when the
26 target task executes.
27 When an if clause is present and the if clause expression evaluates to false, the target device is
28 the host.
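The unstructured counterpart of a target data region pairs a target enter data construct with a later target exit data construct. The sketch below is illustrative only, and increment_on_device is a hypothetical name. The to map type is map-entering and the from map type is map-exiting, so each directive satisfies the restriction on its map clauses. Since no nowait clause is present, each generated target task is an included task.

```c
#include <assert.h>

/* Hypothetical example: add 1 to every element of v[0..n). */
void increment_on_device(int *v, int n)
{
    assert(n >= 0);
    /* Map the array to the device data environment. */
    #pragma omp target enter data map(to: v[0:n])

    /* The array is already present, so this region works on the device copy. */
    #pragma omp target
    for (int i = 0; i < n; i++)
        v[i] += 1;

    /* Copy the result back and unmap the array. */
    #pragma omp target exit data map(from: v[0:n])
}
```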
8 Tool Callbacks
9 Callbacks associated with events for target tasks are the same as for the task construct defined in
10 Section 12.5; (flags & ompt_task_target) always evaluates to true in the dispatched callback.
11 A thread dispatches a registered ompt_callback_target or
12 ompt_callback_target_emi callback with ompt_scope_begin as its endpoint
13 argument and ompt_target_exit_data or ompt_target_exit_data_nowait if the
14 nowait clause is present as its kind argument for each occurrence of a target-exit-data-begin
15 event in that thread in the context of the target task on the host. Similarly, a thread dispatches a
16 registered ompt_callback_target or ompt_callback_target_emi callback with
17 ompt_scope_end as its endpoint argument and ompt_target_exit_data or
18 ompt_target_exit_data_nowait if the nowait clause is present as its kind argument for
19 each occurrence of a target-exit-data-end event in that thread in the context of the target task on the
20 host. These callbacks have type signature ompt_callback_target_t or
21 ompt_callback_target_emi_t, respectively.
22 Restrictions
23 Restrictions to the target exit data construct are as follows:
24 • At least one map clause must appear on the directive.
• All map clauses must be map-exiting.
26 Cross References
27 • ompt_callback_target_emi_t and ompt_callback_target_t, see
28 Section 19.5.2.26
29 • depend clause, see Section 15.9.5
30 • device clause, see Section 13.2
31 • if clause, see Section 3.4
32 • map clause, see Section 5.8.3
33 • nowait clause, see Section 15.6
34 • task directive, see Section 12.5
3 Clauses
4 allocate, defaultmap, depend, device, firstprivate, has_device_addr, if,
5 in_reduction, is_device_ptr, map, nowait, private, thread_limit,
6 uses_allocators
7 Binding
8 The binding task set for a target region is the generating task, which is the target task generated
9 by the target construct. The target region binds to the corresponding target task region.
10 Semantics
11 The target construct provides a superset of the functionality provided by the target data
12 directive, except for the use_device_ptr and use_device_addr clauses. The functionality
13 added to the target directive is the inclusion of an executable region to be executed on a device.
14 The target construct generates a target task. The generated task region encloses the target
15 region. If a depend clause is present, it is associated with the target task. The device clause
16 determines the device on which the target region executes. If the nowait clause is present,
17 execution of the target task may be deferred. If the nowait clause is not present, the target task is
18 an included task.
19 All clauses are evaluated when the target construct is encountered. The data environment of the
20 target task is created according to the data-sharing and data-mapping attribute clauses on the
21 target construct, per-data environment ICVs, and any default data-sharing attribute rules that
22 apply to the target construct. If a variable or part of a variable is mapped by the target
23 construct and does not appear as a list item in an in_reduction clause on the construct, the
24 variable has a default data-sharing attribute of shared in the data environment of the target task.
25 Assignment operations associated with mapping a variable (see Section 5.8.3) occur when the
26 target task executes.
27 If the device clause is specified with the ancestor device-modifier, the encountering thread
28 waits for completion of the target region on the parent device before resuming. For any list item
29 that appears in a map clause on the same construct, if the corresponding list item exists in the device
30 data environment of the parent device, it is treated as if it has a reference count of positive infinity.
31 When an if clause is present and the if clause expression evaluates to false, the effect is as if a
32 device clause that specifies omp_initial_device as the device number is present,
33 regardless of any other device clause on the directive.
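The effect of the if clause on a target construct can be sketched as follows. The function dot and its offload parameter are hypothetical. When offload is zero, the if clause expression is false and the region executes as if the device clause named omp_initial_device, that is, on the host. The result is the same in both cases.

```c
#include <assert.h>

/* Hypothetical example: dot product of a[0..n) and b[0..n). */
long dot(const int *a, const int *b, int n, int offload)
{
    assert(n >= 0);
    long s = 0;
    /* The scalar s is mapped tofrom so that the result computed in the
     * target region is visible to the host after the region completes. */
    #pragma omp target if(offload) map(to: a[0:n], b[0:n]) map(tofrom: s)
    for (int i = 0; i < n; i++)
        s += (long)a[i] * b[i];
    return s;
}
```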
27 Tool Callbacks
28 Callbacks associated with events for target tasks are the same as for the task construct defined in
29 Section 12.5; (flags & ompt_task_target) always evaluates to true in the dispatched callback.
24 Cross References
25 • ompt_callback_target_emi_t and ompt_callback_target_t, see
26 Section 19.5.2.26
27 • ompt_callback_task_create_t, see Section 19.5.2.7
28 • depend clause, see Section 15.9.5
29 • device clause, see Section 13.2
30 • from clause, see Section 5.9.2
31 • if clause, see Section 3.4
32 • nowait clause, see Section 15.6
33 • task directive, see Section 12.5
34 • to clause, see Section 5.9.1
17 Cross References
18 • Interoperability Routines, see Section 18.12
21 Clauses
22 depend, destroy, device, init, nowait, use
4 Semantics
5 The interop construct retrieves interoperability properties from the OpenMP implementation to
6 enable interoperability with foreign execution contexts. When an interop construct is
7 encountered, the encountering task executes the region.
8 For each action-clause, the interop-type set is the set of interop-type modifiers specified for the
9 clause if the clause is init or for the init clause that initialized the interop-var that is specified for
10 the clause if the clause is not init.
11 If the interop-type set includes targetsync, an empty mergeable task is generated. If the
12 nowait clause is not present on the construct then the task is also an included task. Any depend
13 clauses that are present on the construct apply to the generated task.
14 The interop construct ensures an ordered execution of the generated task relative to foreign tasks
15 executed in the foreign execution context through the foreign synchronization object that is
16 accessible through the targetsync property. When the creation of the foreign task precedes the
17 encountering of an interop construct in happens before order (see Section 1.4.5), the foreign
18 task must complete execution before the generated task begins execution. Similarly, when the
19 creation of a foreign task follows the encountering of an interop construct in happens before
20 order, the foreign task must not begin execution until the generated task completes execution. No
21 ordering is imposed between the encountering thread and either foreign tasks or OpenMP tasks by
22 the interop construct.
23 If the interop-type set does not include targetsync, the nowait clause has no effect.
24 Restrictions
25 Restrictions to the interop construct are as follows:
26 • A depend clause can only appear on the directive if the interop-type includes targetsync.
27 • Each interop-var may be specified for at most one action-clause of each interop construct.
28 Cross References
29 • Interoperability Routines, see Section 18.12
30 • depend clause, see Section 15.9.5
31 • destroy clause, see Section 3.5
32 • device clause, see Section 13.2
33 • init clause, see Section 14.1.2
34 • nowait clause, see Section 15.6
35 • use clause, see Section 14.1.3
Arguments
Name          Type                             Properties
interop-var   variable of omp_interop_t type   default

Modifiers
Name                 Modifies   Type                              Properties
interop-preference   Generic    Complex, name: prefer_type        complex, unique
                                Arguments:
                                  preference_list OpenMP foreign
                                  runtime preference list (default)
13 Directives
14 interop
15 Semantics
16 The init clause specifies that interop-var is initialized to refer to the list of properties associated
17 with any interop-type. For any interop-type, the properties type, type_name, vendor,
18 vendor_name and device_num will be available. If the implementation cannot initialize
19 interop-var, it is initialized to the value of omp_interop_none, which is defined to be zero.
20 The targetsync interop-type will additionally provide the targetsync property, which is the
21 handle to a foreign synchronization object for enabling synchronization between OpenMP tasks and
22 foreign tasks that execute in the foreign execution context.
23 The target interop-type will additionally provide the following properties:
24 • device, which will be a foreign device handle;
25 • device_context, which will be a foreign device context handle; and
26 • platform, which will be a handle to a foreign platform of the device.
4 Restrictions
5 Restrictions to the init clause are as follows:
6 • Each interop-type may be specified at most once.
7 • interop-var must be non-const.
8 Cross References
9 • OpenMP Foreign Runtime Identifiers, see Section 14.1.1
10 • interop directive, see Section 14.1
Arguments
Name          Type                             Properties
interop-var   variable of omp_interop_t type   default
15 Directives
16 interop
17 Semantics
18 The use clause specifies the interop-var that is used for the effects of the directive on which the
19 clause appears. However, interop-var is not initialized, destroyed or otherwise modified. The
20 interop-type is inferred based on the interop-type used to initialize interop-var.
21 Cross References
22 • interop directive, see Section 14.1
7 Cross References
8 • Declare Variant Directives, see Section 7.5
9 • dispatch directive, see Section 7.6
16 Cross References
17 • hint clause, see Section 15.1.2
18 • omp_init_lock_with_hint and omp_init_nest_lock_with_hint, see
19 Section 18.9.2
15 Note – Future OpenMP specifications may add additional hints to the sync_hint type.
16 Implementers are advised to add implementation-defined hints starting from the most significant bit
17 of the type and to include the name of the implementation in the name of the added hint to avoid
18 name conflicts with other OpenMP implementations.
20 The OpenMP sync_hint and lock_hint types are synonyms for each other. The OpenMP
21 lock_hint type has been deprecated.
22 Restrictions
23 Restrictions to the synchronization hints are as follows:
24 • The hints omp_sync_hint_uncontended and omp_sync_hint_contended cannot
25 be combined.
26 • The hints omp_sync_hint_nonspeculative and omp_sync_hint_speculative
27 cannot be combined.
28 The restrictions for combining multiple values of the OpenMP sync_hint type apply equally to
29 the corresponding values of the OpenMP lock_hint type, and expressions that mix the two
30 types.
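The two combination restrictions above can be sketched as a small C check. This is an illustration only: the HINT_* constants and the function hint_combination_ok are hypothetical names, and their bit values are assumed to mirror those of the corresponding omp_sync_hint_* enumerators.

```c
/* Hypothetical stand-ins for the omp_sync_hint_* values (assumed bit layout). */
enum {
    HINT_NONE           = 0x0,
    HINT_UNCONTENDED    = 0x1,
    HINT_CONTENDED      = 0x2,
    HINT_NONSPECULATIVE = 0x4,
    HINT_SPECULATIVE    = 0x8
};

/* Returns 1 if the combined hint respects both restrictions, 0 otherwise. */
int hint_combination_ok(unsigned hint)
{
    /* uncontended and contended cannot be combined */
    if ((hint & HINT_UNCONTENDED) && (hint & HINT_CONTENDED)) {
        return 0;
    }
    /* nonspeculative and speculative cannot be combined */
    if ((hint & HINT_NONSPECULATIVE) && (hint & HINT_SPECULATIVE)) {
        return 0;
    }
    return 1;
}
```

For instance, combining a contention hint with a speculation hint is accepted, while combining the two contention hints is rejected.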
Arguments
Name        Type                           Properties
hint-expr   expression of sync_hint type   default
5 Directives
6 atomic, critical
7 Semantics
8 The hint clause gives the implementation additional information about the expected runtime
9 properties of the region that corresponds to the construct on which it appears and that can
10 optionally be used to optimize the implementation. The presence of a hint clause does not affect
11 the semantics of the construct. If no hint clause is specified for a construct that accepts it, the
12 effect is as if hint(omp_sync_hint_none) had been specified.
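The following C sketch shows a hint clause on a named critical construct. It is an illustration under assumptions: the function name sum_contended and the critical name total_lock are invented for the example, and the hint only tells the implementation that high contention is expected, so the computed result is the same with or without it.

```c
#ifdef _OPENMP
#include <omp.h>   /* provides omp_sync_hint_contended */
#endif

/* Hypothetical helper: sums 1..n with every update inside a critical region. */
long sum_contended(int n)
{
    long total = 0;
    #pragma omp parallel for shared(total)
    for (int i = 1; i <= n; i++) {
        /* The hint does not change the semantics of the critical region. */
        #pragma omp critical(total_lock) hint(omp_sync_hint_contended)
        total += i;
    }
    return total;
}
```

A reduction clause would normally be a better fit for this loop; the critical region is used here only to give the hint something to describe.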
13 Restrictions
14 • hint-expr must evaluate to a valid synchronization hint.
15 Cross References
16 • Synchronization Hint Type, see Section 15.1.1
17 • atomic directive, see Section 15.8.4
18 • critical directive, see Section 15.2
Arguments
critical(name)
Name   Type                       Properties
name   base language identifier   optional
24 Clauses
25 hint
26 Binding
27 The binding thread set for a critical region is all threads in the contention group.
18 Tool Callbacks
19 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
20 occurrence of a critical-acquiring event in that thread. This callback has the type signature
21 ompt_callback_mutex_acquire_t.
22 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
23 occurrence of a critical-acquired event in that thread. This callback has the type signature
24 ompt_callback_mutex_t.
25 A thread dispatches a registered ompt_callback_mutex_released callback for each
26 occurrence of a critical-released event in that thread. This callback has the type signature
27 ompt_callback_mutex_t.
28 The callbacks occur in the task that encounters the critical construct. The callbacks should receive
29 ompt_mutex_critical as their kind argument if practical, but a less specific kind is
30 acceptable.
15 15.3 Barriers
16 15.3.1 barrier Construct
Name: barrier          Association: none
Category: executable   Properties: default
18 Binding
19 The binding thread set for a barrier region is the current team. A barrier region binds to the
20 innermost enclosing parallel region.
21 Semantics
22 The barrier construct specifies an explicit barrier at the point at which the construct appears.
23 Unless the binding region is canceled, all threads of the team that executes that binding region must
24 enter the barrier region and complete execution of all explicit tasks bound to that binding region
25 before any of the threads continue execution beyond the barrier.
26 The barrier region includes an implicit task scheduling point in the current task region.
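A minimal C sketch of the explicit barrier follows. Each thread records its arrival, waits at the barrier, and then checks that all arrivals are visible. The function name barrier_demo and the serial fallback for omp_get_num_threads are assumptions made so that the example also builds without OpenMP support.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch also builds without OpenMP support. */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Returns 1 if every thread observed all arrivals after the barrier. */
int barrier_demo(void)
{
    int arrived = 0;   /* shared: number of threads that finished phase 1 */
    int ok = 1;        /* shared: cleared if any thread sees a short count */

    #pragma omp parallel shared(arrived, ok)
    {
        #pragma omp atomic
        arrived++;

        /* No thread continues past this point until all threads of the
           team have entered the barrier region. */
        #pragma omp barrier

        int seen;
        #pragma omp atomic read
        seen = arrived;
        if (seen != omp_get_num_threads()) {
            #pragma omp atomic write
            ok = 0;
        }
    }
    return ok;
}
```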
12 Tool Callbacks
13 A thread dispatches a registered ompt_callback_sync_region callback with
14 ompt_sync_region_barrier_explicit as its kind argument and ompt_scope_begin
15 as its endpoint argument for each occurrence of an explicit-barrier-begin event. Similarly, a thread
16 dispatches a registered ompt_callback_sync_region callback with
17 ompt_sync_region_barrier_explicit as its kind argument and ompt_scope_end as
18 its endpoint argument for each occurrence of an explicit-barrier-end event. These callbacks occur
19 in the context of the task that encountered the barrier construct and have type signature
20 ompt_callback_sync_region_t.
21 A thread dispatches a registered ompt_callback_sync_region_wait callback with
22 ompt_sync_region_barrier_explicit as its kind argument and ompt_scope_begin
23 as its endpoint argument for each occurrence of an explicit-barrier-wait-begin event. Similarly, a
24 thread dispatches a registered ompt_callback_sync_region_wait callback with
25 ompt_sync_region_barrier_explicit as its kind argument and ompt_scope_end as
26 its endpoint argument for each occurrence of an explicit-barrier-wait-end event. These callbacks
27 occur in the context of the task that encountered the barrier construct and have type signature
28 ompt_callback_sync_region_t.
29 A thread dispatches a registered ompt_callback_cancel callback with
30 ompt_cancel_detected as its flags argument for each occurrence of a cancellation event in
31 that thread. The callback occurs in the context of the encountering task. The callback has type
32 signature ompt_callback_cancel_t.
33 Restrictions
34 Restrictions to the barrier construct are as follows:
35 • Each barrier region must be encountered by all threads in a team or by none at all, unless
36 cancellation has been requested for the innermost enclosing parallel region.
37 • The sequence of worksharing regions and barrier regions encountered must be the same for
38 every thread in a team.
22 Tool Callbacks
23 A thread dispatches a registered ompt_callback_sync_region callback for each implicit
24 barrier begin and end event. Similarly, a thread dispatches a registered
25 ompt_callback_sync_region_wait callback for each implicit barrier wait-begin and
26 wait-end event. All callbacks for implicit barrier events execute in the context of the encountering
27 task and have type signature ompt_callback_sync_region_t.
28 For the implicit barrier at the end of a worksharing construct, the kind argument is
29 ompt_sync_region_barrier_implicit_workshare. For the implicit barrier at the end
30 of a parallel region, the kind argument is
31 ompt_sync_region_barrier_implicit_parallel. For an extra barrier added by an
32 OpenMP implementation, the kind argument is
33 ompt_sync_region_barrier_implementation. For a barrier at the end of a teams
34 region, the kind argument is ompt_sync_region_barrier_teams.
5 Restrictions
6 Restrictions to implicit barriers are as follows:
• If a thread is in the state ompt_state_wait_barrier_implicit_parallel, a call to
ompt_get_parallel_info may return a pointer to a copy of the data object associated
with the parallel region rather than a pointer to the associated data object itself. Writing to the
data object returned by ompt_get_parallel_info when a thread is in the
ompt_state_wait_barrier_implicit_parallel state results in unspecified behavior.
12 Cross References
13 • ompt_callback_cancel_t, see Section 19.5.2.18
14 • ompt_callback_sync_region_t, see Section 19.5.2.13
15 • ompt_cancel_flag_t, see Section 19.4.4.26
16 • ompt_scope_endpoint_t, see Section 19.4.4.11
17 • ompt_sync_region_t, see Section 19.4.4.14
27 Clauses
28 allocate, task_reduction
29 Binding
30 The binding task set of a taskgroup region is all tasks of the current team that are generated in
31 the region. A taskgroup region binds to the innermost enclosing parallel region.
16 Tool Callbacks
17 A thread dispatches a registered ompt_callback_sync_region callback with
18 ompt_sync_region_taskgroup as its kind argument and ompt_scope_begin as its
19 endpoint argument for each occurrence of a taskgroup-begin event in the task that encounters the
20 taskgroup construct. Similarly, a thread dispatches a registered
21 ompt_callback_sync_region callback with ompt_sync_region_taskgroup as its
22 kind argument and ompt_scope_end as its endpoint argument for each occurrence of a
23 taskgroup-end event in the task that encounters the taskgroup construct. These callbacks occur
24 in the task that encounters the taskgroup construct and have the type signature
25 ompt_callback_sync_region_t.
26 A thread dispatches a registered ompt_callback_sync_region_wait callback with
27 ompt_sync_region_taskgroup as its kind argument and ompt_scope_begin as its
28 endpoint argument for each occurrence of a taskgroup-wait-begin event. Similarly, a thread
29 dispatches a registered ompt_callback_sync_region_wait callback with
30 ompt_sync_region_taskgroup as its kind argument and ompt_scope_end as its
31 endpoint argument for each occurrence of a taskgroup-wait-end event. These callbacks occur in the
32 context of the task that encounters the taskgroup construct and have type signature
33 ompt_callback_sync_region_t.
34 Cross References
35 • Task Scheduling, see Section 12.9
36 • ompt_callback_sync_region_t, see Section 19.5.2.13
37 • ompt_scope_endpoint_t, see Section 19.4.4.11
6 Clauses
7 depend, nowait
8 Binding
9 The taskwait region binds to the current task region. The binding thread set of the taskwait
10 region is the current team.
11 Semantics
12 The taskwait construct specifies a wait on the completion of child tasks of the current task.
13 If no depend clause is present on the taskwait construct, the current task region is suspended
14 at an implicit task scheduling point associated with the construct. The current task region remains
15 suspended until all child tasks that it generated before the taskwait region complete execution.
16 If one or more depend clauses are present on the taskwait construct and the nowait clause is
17 not also present, the behavior is as if these clauses were applied to a task construct with an empty
18 associated structured block that generates a mergeable and included task. Thus, the current task
19 region is suspended until the predecessor tasks of this task complete execution.
20 If one or more depend clauses are present on the taskwait construct and the nowait clause is
21 also present, the behavior is as if these clauses were applied to a task construct with an empty
22 associated structured block that generates a task for which execution may be deferred. Thus, all
23 predecessor tasks of this task must complete execution before any subsequently generated task that
24 depends on this task starts its execution.
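The case without a depend clause can be sketched in C with the familiar recursive Fibonacci computation. The function names fib and fib_parallel are chosen for this example; the taskwait region suspends the current task until both child tasks generated before it have completed, so x and y are safe to read afterwards.

```c
/* Computes the n-th Fibonacci number with one child task per recursive call. */
long fib(int n)
{
    if (n < 2) {
        return n;
    }
    long x, y;
    #pragma omp task shared(x)
    x = fib(n - 1);
    #pragma omp task shared(y)
    y = fib(n - 2);
    /* Wait for the two child tasks generated above before reading x and y. */
    #pragma omp taskwait
    return x + y;
}

/* Hypothetical driver: one thread generates the task tree, the team runs it. */
long fib_parallel(int n)
{
    long result = 0;
    #pragma omp parallel shared(result)
    #pragma omp single
    result = fib(n);
    return result;
}
```

Note that taskwait waits only for child tasks of the current task, not for all descendants; the recursion is correct here because every level waits for its own children.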
7 Tool Callbacks
8 A thread dispatches a registered ompt_callback_sync_region callback with
9 ompt_sync_region_taskwait as its kind argument and ompt_scope_begin as its
10 endpoint argument for each occurrence of a taskwait-begin event in the task that encounters the
11 taskwait construct. Similarly, a thread dispatches a registered
12 ompt_callback_sync_region callback with ompt_sync_region_taskwait as its
13 kind argument and ompt_scope_end as its endpoint argument for each occurrence of a
14 taskwait-end event in the task that encounters the taskwait construct. These callbacks occur in
15 the task that encounters the taskwait construct and have the type signature
16 ompt_callback_sync_region_t.
17 A thread dispatches a registered ompt_callback_sync_region_wait callback with
18 ompt_sync_region_taskwait as its kind argument and ompt_scope_begin as its
19 endpoint argument for each occurrence of a taskwait-wait-begin event. Similarly, a thread
20 dispatches a registered ompt_callback_sync_region_wait callback with
21 ompt_sync_region_taskwait as its kind argument and ompt_scope_end as its endpoint
22 argument for each occurrence of a taskwait-wait-end event. These callbacks occur in the context of
23 the task that encounters the taskwait construct and have type signature
24 ompt_callback_sync_region_t.
25 A thread dispatches a registered ompt_callback_task_create callback for each occurrence
26 of a taskwait-init event in the context of the encountering task. This callback has the type signature
27 ompt_callback_task_create_t. In the dispatched callback, (flags &
28 ompt_task_taskwait) always evaluates to true. If the nowait clause is not present,
29 (flags & ompt_task_undeferred) also evaluates to true.
30 A thread dispatches a registered ompt_callback_task_schedule callback for each
31 occurrence of a taskwait-complete event. This callback has the type signature
32 ompt_callback_task_schedule_t with ompt_taskwait_complete as its
33 prior_task_status argument.
34 Restrictions
35 Restrictions to the taskwait construct are as follows:
36 • The mutexinoutset dependence-type may not appear in a depend clause on a taskwait
37 construct.
38 • If the dependence-type of a depend clause is depobj then the dependence objects cannot
39 represent dependences of the mutexinoutset dependence type.
2 Cross References
3 • ompt_callback_sync_region_t, see Section 19.5.2.13
4 • ompt_scope_endpoint_t, see Section 19.4.4.11
5 • ompt_sync_region_t, see Section 19.4.4.14
6 • depend clause, see Section 15.9.5
7 • nowait clause, see Section 15.6
8 • task directive, see Section 12.5
11 Directives
12 dispatch, do, for, interop, scope, sections, single, target, target enter
13 data, target exit data, target update, taskwait, workshare
14 Semantics
15 The nowait clause overrides any synchronization that would otherwise occur at the end of a
16 construct. It can also specify that an interoperability requirement set includes the nowait property.
17 If the construct includes an implicit barrier, the nowait clause specifies that the barrier will not
18 occur. For constructs that generate a task, the nowait clause specifies that the generated task may
19 be deferred. If the nowait clause is not present on the directive then the generated task is an
20 included task (so it executes synchronously in the context of the encountering task). For constructs
21 that generate an interoperability requirement set, the nowait clause adds the nowait property to
22 the set.
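For a construct with an implicit barrier, the effect of nowait can be sketched in C as follows. The function name fill_two is an assumption for this example. Because the second loop does not read anything the first loop writes, removing the barrier between them is safe.

```c
/* Hypothetical helper: fills two independent arrays in one parallel region. */
void fill_two(int n, int *a, int *b)
{
    #pragma omp parallel
    {
        /* nowait: threads that finish their share do not wait here. */
        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            a[i] = i;
        }

        /* Independent of a, so it may start while other threads still
           work on the first loop. The implicit barrier of this loop remains. */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            b[i] = 2 * i;
        }
    }
}
```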
23 Cross References
24 • dispatch directive, see Section 7.6
25 • do directive, see Section 11.5.2
26 • for directive, see Section 11.5.1
27 • interop directive, see Section 14.1
28 • scope directive, see Section 11.2
29 • sections directive, see Section 11.3
30 • single directive, see Section 11.1
31 • target directive, see Section 13.8
8 Directives
9 taskloop
10 Semantics
11 The nogroup clause overrides any implicit taskgroup that would otherwise enclose the
12 construct.
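A short C sketch of nogroup follows; the function name scale_tasks is an assumption for this example. Without the implicit taskgroup, the encountering task may continue before the loop tasks finish, so the example synchronizes explicitly with taskwait before the results are used.

```c
/* Hypothetical helper: scales v[0..n-1] by f using taskloop tasks. */
void scale_tasks(int n, double *v, double f)
{
    #pragma omp parallel
    #pragma omp single
    {
        /* nogroup removes the implicit taskgroup around the taskloop. */
        #pragma omp taskloop nogroup
        for (int i = 0; i < n; i++) {
            v[i] *= f;
        }

        /* The generated tasks are child tasks of the current task, so an
           explicit taskwait is enough to wait for them here. */
        #pragma omp taskwait
    }
}
```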
13 Cross References
14 • taskloop directive, see Section 12.6
21 Directives
22 atomic, flush
23 Semantics
24 The memory-order clause grouping defines a set of clauses that indicate the memory ordering
25 requirements for the visibility of the effects of the constructs on which they may be specified.
7 Directives
8 atomic
9 Semantics
10 The atomic clause grouping defines a set of clauses that defines the semantics for which a directive
11 enforces atomicity. If a construct accepts the atomic clause grouping and no member of the
12 grouping is specified, the effect is as if the update clause is specified.
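The default described above can be illustrated with a C sketch in which no member of the atomic clause grouping is written, so the construct behaves as an atomic update. The function name count_even is an assumption for this example.

```c
/* Hypothetical helper: counts the even values in v[0..n-1]. */
int count_even(int n, const int *v)
{
    int count = 0;
    #pragma omp parallel for shared(count)
    for (int i = 0; i < n; i++) {
        if (v[i] % 2 == 0) {
            /* No atomic clause is given; the effect is as if update
               had been specified. */
            #pragma omp atomic
            count++;
        }
    }
    return count;
}
```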
13 Cross References
14 • atomic directive, see Section 15.8.4
18 Directives
19 atomic
20 Semantics
21 The extended-atomic clause grouping defines a set of clauses that extend the atomicity semantics
22 specified by members of the atomic clause grouping. Other than the fail clause, they are
23 inarguable; the fail clause takes a member of the memory-order clause grouping as an argument.
24 The capture clause extends the semantics to capture the value of the variable being updated
25 atomically. The compare clause extends the semantics to perform the atomic update conditionally.
26 The fail clause extends the semantics to specify the memory ordering requirements for any
27 comparison performed by any atomic conditional update that fails. Its argument overrides any other
28 specified memory ordering. If the fail clause is not specified on an atomic conditional update the
29 effect is as if the fail clause is specified with a default argument that depends on the effective
30 memory ordering. If the effective memory ordering is acq_rel, the default argument is
31 acquire. If the effective memory ordering is release, the default argument is relaxed. For
32 any other effective memory ordering, the default argument is equal to that effective memory
33 ordering. The weak clause specifies that the comparison performed by a conditional atomic update
34 may spuriously fail, evaluating to not equal even when the values are equal.
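The rule for the default argument of the fail clause can be written out as a small C function. This is only a restatement of the paragraph above: the mem_order type, its enumerators and the function default_fail_order are hypothetical names and are not part of the OpenMP API.

```c
/* Hypothetical enumeration of the memory-order clause grouping. */
typedef enum {
    MO_RELAXED,
    MO_ACQUIRE,
    MO_RELEASE,
    MO_ACQ_REL,
    MO_SEQ_CST
} mem_order;

/* Returns the ordering that applies to a failed comparison when no fail
   clause is specified on an atomic conditional update. */
mem_order default_fail_order(mem_order effective)
{
    switch (effective) {
    case MO_ACQ_REL:
        return MO_ACQUIRE;   /* acq_rel defaults to acquire */
    case MO_RELEASE:
        return MO_RELAXED;   /* release defaults to relaxed */
    default:
        return effective;    /* otherwise the effective ordering itself */
    }
}
```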
19 Tool Callbacks
20 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
21 occurrence of an atomic-acquiring event in that thread. This callback has the type signature
22 ompt_callback_mutex_acquire_t.
23 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
24 occurrence of an atomic-acquired event in that thread. This callback has the type signature
25 ompt_callback_mutex_t.
26 A thread dispatches a registered ompt_callback_mutex_released callback with
27 ompt_mutex_atomic as the kind argument if practical, although a less specific kind may be
28 used, for each occurrence of an atomic-released event in that thread. This callback has the type
29 signature ompt_callback_mutex_t and occurs in the task that encounters the atomic
30 construct.
31 Restrictions
32 Restrictions to the atomic construct are as follows:
33 • OpenMP constructs may not be encountered during execution of an atomic region.
34 • If a capture or compare clause is specified, the atomic clause must be update.
35 • If a capture clause is specified but the compare clause is not specified, an
36 update-capture-atomic structured block must be associated with the construct.
Arguments
flush(list)
Name   Type                              Properties
list   list of variable list item type   optional
10 Clause groups
11 memory-order
12 Binding
13 The binding thread set for a flush region is all threads in the device-set of its flush operation.
14 Semantics
15 The flush construct executes the OpenMP flush operation. This operation makes a thread’s
16 temporary view of memory consistent with memory and enforces an order on the memory
operations of the variables explicitly specified or implied. Execution of a flush region affects
memory and the temporary view of memory of the encountering thread. It does not affect
the temporary view of other threads. Other threads on devices in the device-set must themselves
20 execute a flush operation in order to be guaranteed to observe the effects of the flush operation of
21 the encountering thread. See the memory model description in Section 1.4 for more details.
22 If neither a memory-order clause nor a list argument appears on a flush construct then the
23 behavior is as if the memory-order clause is seq_cst.
24 A flush construct with the seq_cst clause, executed on a given thread, operates as if all data
25 storage blocks that are accessible to the thread are flushed by a strong flush operation. A flush
26 construct with a list applies a strong flush operation to the items in the list, and the flush operation
27 does not complete until the operation is complete for all specified list items. An implementation
28 may implement a flush construct with a list by ignoring the list and treating it the same as a
29 flush construct with the seq_cst clause.
30 If no list items are specified, the flush operation has the release and/or acquire flush properties:
31 • If the memory-order clause is seq_cst or acq_rel, the flush operation is both a release flush
32 and an acquire flush.
21 Tool Callbacks
22 A thread dispatches a registered ompt_callback_flush callback for each occurrence of a
23 flush event in that thread. This callback has the type signature ompt_callback_flush_t.
24 Restrictions
25 Restrictions to the flush construct are as follows:
26 • If a memory-order clause is specified, the list argument must not be specified.
27 • The memory-order clause must not be relaxed.
28 Cross References
29 • ompt_callback_flush_t, see Section 19.5.2.17
4 Clauses
5 depend, update
6 Semantics
7 OpenMP clauses that are related to task dependences use the task-dependence-type modifier to
8 identify the type of dependence relevant to that clause. The effect of the type of dependence is
9 associated with locator list items as described with the depend clause, see Section 15.9.5.
10 Cross References
11 • depend clause, see Section 15.9.5
12 • update clause, see Section 15.9.3
Arguments
Name                   Type                           Properties
task-dependence-type   Keyword: depobj, in, inout,    default
                       inoutset, mutexinoutset, out
23 Directives
24 depobj
25 Semantics
26 The update clause sets the dependence type of an OpenMP depend object to
27 task-dependence-type.
4 Cross References
5 • depobj directive, see Section 15.9.4
6 • task-dependence-type modifier, see Section 15.9.1
Arguments
depobj(depend-object)
Name            Type                       Properties
depend-object   variable of depend type    default
12 Clauses
13 depend, destroy, update
14 Clause set
15 Properties: unique, required, exclusive Members: depend, destroy, update
16 Binding
17 The binding thread set for a depobj region is the encountering thread.
18 Semantics
19 The depobj construct initializes, updates or destroys an OpenMP depend object. If a depend
20 clause is specified, the state of depend-object is set to initialized and depend-object is set to
21 represent the dependence that the depend clause specifies. If an update clause is specified,
22 depend-object is updated to represent the new dependence type. If a destroy clause is specified,
23 the state of depend-object is set to uninitialized.
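The three actions of the depobj construct can be sketched in C as shown below. The function name depobj_demo is an assumption for this example, and the typedef in the non-OpenMP branch is only a placeholder so that the sketch builds when the directives are ignored. The destroy clause is written without an argument here on the assumption that the compiler accepts that spelling.

```c
#ifdef _OPENMP
#include <omp.h>              /* provides omp_depend_t */
#else
typedef int omp_depend_t;     /* placeholder when OpenMP is disabled */
#endif

/* Hypothetical helper: orders three tasks through one depend object. */
int depobj_demo(void)
{
    int x = 0;
    int seen = -1;
    omp_depend_t dep;

    #pragma omp parallel shared(x, seen, dep)
    #pragma omp single
    {
        /* Initialize: dep now represents an inout dependence on x. */
        #pragma omp depobj(dep) depend(inout: x)

        #pragma omp task depend(depobj: dep) shared(x)
        x = 41;
        #pragma omp task depend(depobj: dep) shared(x)
        x += 1;

        /* Update: dep now represents an in dependence on x. */
        #pragma omp depobj(dep) update(in)

        #pragma omp task depend(depobj: dep) shared(x, seen)
        seen = x;

        #pragma omp taskwait

        /* Destroy: dep returns to the uninitialized state. */
        #pragma omp depobj(dep) destroy
    }
    (void)dep;   /* avoids an unused-variable warning in serial builds */
    return seen;
}
```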
24 Restrictions
25 Restrictions to the depobj construct are as follows:
26 • A depend clause on a depobj construct must only specify one locator.
27 • The state of depend-object must be uninitialized if a depend clause is specified.
28 • The state of depend-object must be initialized if a destroy clause or update clause is
29 specified.
Arguments
Name           Type                             Properties
locator-list   list of locator list item type   default

Modifiers
Name                   Modifies       Type                           Properties
task-dependence-type   locator-list   Keyword: depobj, in, inout,    required, ultimate
                                      inoutset, mutexinoutset, out
iterator               locator-list   Complex, name: iterator        unique
                                      Arguments:
                                        iterator-specifier OpenMP
                                        expression (repeatable)
12 Directives
13 depobj, interop, target, target enter data, target exit data, target
14 update, task, taskwait
15 Semantics
16 The depend clause enforces additional constraints on the scheduling of tasks. These constraints
17 establish dependences only between sibling tasks. Task dependences are derived from the
18 task-dependence-type and the list items.
19 The storage location of a list item matches the storage location of another list item if they have the
20 same storage location, or if any of the list items is omp_all_memory.
21 For the in task-dependence-type, if the storage location of at least one of the list items matches the
22 storage location of a list item appearing in a depend clause with an out, inout,
23 mutexinoutset, or inoutset task-dependence-type on a construct from which a sibling task
24 was previously generated, then the generated task will be a dependent task of that sibling task.
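The in case above can be illustrated with two sibling tasks in C. The function name pipeline is an assumption for this example. The second task names x with the in task-dependence-type after a sibling task named x with out, so it becomes a dependent task of the first and cannot start before the first completes.

```c
/* Hypothetical helper: a producer task followed by a dependent consumer. */
int pipeline(void)
{
    int x = 0;
    int y = 0;

    #pragma omp parallel shared(x, y)
    #pragma omp single
    {
        /* Producer: writes x. */
        #pragma omp task depend(out: x) shared(x)
        x = 10;

        /* Consumer: the in dependence on x matches the earlier out
           dependence of its sibling, so it runs after the producer. */
        #pragma omp task depend(in: x) depend(out: y) shared(x, y)
        y = x + 1;

        #pragma omp taskwait
    }
    return y;
}
```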
7 Tool Callbacks
8 A thread dispatches the ompt_callback_dependences callback for each occurrence of the
9 task-dependences event to announce its dependences with respect to the list items in the depend
10 clause. This callback has type signature ompt_callback_dependences_t.
11 A thread dispatches the ompt_callback_task_dependence callback for a task-dependence
12 event to report a dependence between a predecessor task (src_task_data) and a dependent task
13 (sink_task_data). This callback has type signature ompt_callback_task_dependence_t.
14 Restrictions
15 Restrictions to the depend clause are as follows:
16 • List items, other than reserved locators, used in depend clauses of the same task or sibling tasks
17 must indicate identical storage locations or disjoint storage locations.
18 • List items used in depend clauses cannot be zero-length array sections.
19 • The omp_all_memory reserved locator can only be used in a depend clause with an out or
20 inout task-dependence-type.
21 • Array sections cannot be specified in depend clauses with the depobj task-dependence-type.
22 • List items used in depend clauses with the depobj task-dependence-type must be expressions
23 of the OpenMP depend type that correspond to depend objects in the initialized state.
24 • List items that are expressions of the OpenMP depend type can only be used in depend
25 clauses with the depobj task-dependence-type.
• (Fortran) A common block name cannot appear in a depend clause.
• (C/C++) A bit-field cannot appear in a depend clause.
Arguments
Name     Type                    Properties
vector   loop-iteration vector   default

Modifiers
Name              Modifies   Type                    Properties
dependence-type   vector     Keyword: sink, source   required
22 Directives
23 ordered
24 Additional information
25 The clause-name depend may be used as a synonym for the clause-name doacross. This use
26 has been deprecated.
13 Note – If the sink dependence-type is specified for a vector that does not indicate an earlier
14 iteration of the logical iteration space, deadlock may occur.
16 Restrictions
17 Restrictions to the doacross clause are as follows:
18 • If vector is specified without the omp_cur_iteration keyword and it has n dimensions, the
19 innermost loop-associated construct that encloses the construct on which the clause appears must
20 specify an ordered clause for which the parameter value equals n.
21 • If vector is specified with the omp_cur_iteration keyword and with sink as the
22 dependence-type then it must be omp_cur_iteration - 1.
23 • If vector is specified with source as the dependence-type then it must be
24 omp_cur_iteration.
• For each element of vector for which the sink dependence-type is specified, if the loop iteration
variable var_i has an integral or pointer type, the i-th expression of vector must be computable
without overflow in that type for any value of var_i that can encounter the construct on which the
doacross clause appears.
• (C++) For each element of vector for which the sink dependence-type is specified, if the loop
iteration variable var_i is of a random access iterator type other than pointer type, the i-th
expression of vector must be computable without overflow in the type that would be used by
std::distance applied to variables of the type of var_i for any value of var_i that can
encounter the construct on which the doacross clause appears.
16 Tool Callbacks
17 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
18 occurrence of an ordered-acquiring event in that thread. This callback has the type signature
19 ompt_callback_mutex_acquire_t.
20 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
21 occurrence of an ordered-acquired event in that thread. This callback has the type signature
22 ompt_callback_mutex_t.
23 A thread dispatches a registered ompt_callback_mutex_released callback with
24 ompt_mutex_ordered as the kind argument if practical, although a less specific kind may be
25 used, for each occurrence of an ordered-released event in that thread. This callback has the type
26 signature ompt_callback_mutex_t and occurs in the task that encounters the construct.
27 Restrictions
28 • The construct that corresponds to the binding region of an ordered region must specify an
29 ordered clause.
30 • The construct that corresponds to the binding region of an ordered region must not specify a
31 reduction clause with the inscan modifier.
32 • The regions of a stand-alone ordered construct and a block-associated ordered construct
33 must not have the same binding region.
6 Clauses
7 doacross
8 Binding
9 The binding thread set for a stand-alone ordered region is the current team. A stand-alone
10 ordered region binds to the innermost enclosing worksharing-loop region.
11 Semantics
The stand-alone ordered construct specifies that execution must not violate cross-iteration
dependences as specified in the doacross clauses that appear on the construct. When a thread
that is executing an iteration encounters an ordered construct with one or more doacross
clauses for which the sink dependence-type is specified, the thread waits until its dependences on
all valid iterations specified by the doacross clauses are satisfied before it continues execution. A
specific dependence is satisfied when a thread that is executing the corresponding iteration
encounters an ordered construct with a doacross clause for which the source
dependence-type is specified.
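The sink/source pairing described above can be sketched in C as follows. This is a minimal illustration, not normative text: the helper name running_sum is hypothetical, and the fallback to the older depend(sink: ...) and depend(source) spelling for compilers that report an OpenMP version earlier than 5.2 is an assumption about what such compilers accept. When OpenMP is not enabled, the pragmas are ignored and the loop simply runs serially.

```c
/* Hypothetical helper: in-place running (prefix) sum of a[0..n-1]. */
void running_sum(int *a, int n)
{
    /* ordered(1) marks one loop level as carrying cross-iteration dependences. */
    #pragma omp parallel for ordered(1)
    for (int i = 1; i < n; i++) {
        /* Wait until iteration i-1 has reached its source point. For i == 1 the
           sink names iteration 0, which is not a valid iteration of this loop
           and is therefore ignored. */
#if defined(_OPENMP) && _OPENMP >= 202111
        #pragma omp ordered doacross(sink: i - 1)
#elif defined(_OPENMP)
        #pragma omp ordered depend(sink: i - 1)
#endif
        a[i] += a[i - 1];
        /* Signal that this iteration's result is now available. */
#if defined(_OPENMP) && _OPENMP >= 202111
        #pragma omp ordered doacross(source: omp_cur_iteration)
#elif defined(_OPENMP)
        #pragma omp ordered depend(source)
#endif
    }
}
```

Calling running_sum on the array {1, 2, 3, 4, 5} leaves {1, 3, 6, 10, 15} in place, the same result as the serial loop, because each iteration waits for its predecessor before reading a[i - 1].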
27 Tool Callbacks
28 A thread dispatches a registered ompt_callback_dependences callback with all vector
29 entries listed as ompt_dependence_type_sink in the deps argument for each occurrence of a
30 doacross-sink event in that thread. A thread dispatches a registered
31 ompt_callback_dependences callback with all vector entries listed as
32 ompt_dependence_type_source in the deps argument for each occurrence of a
33 doacross-source event in that thread. These callbacks have the type signature
34 ompt_callback_dependences_t.
7 Cross References
8 • Worksharing-Loop Constructs, see Section 11.5
9 • ompt_callback_dependences_t, see Section 19.5.2.8
10 • doacross clause, see Section 15.9.6
13 Clause groups
14 parallelization-level
15 Binding
16 The binding thread set for a block-associated ordered region is the current team. A
17 block-associated ordered region binds to the innermost enclosing worksharing-loop, simd or
18 worksharing-loop SIMD region.
19 Semantics
20 If no clauses are specified, the effect is as if the threads parallelization-level clause was
21 specified. If the threads clause is specified, the threads in the team that is executing the
22 worksharing-loop region execute ordered regions sequentially in the order of the loop iterations.
23 If the simd parallelization-level clause is specified, the ordered regions encountered by any
24 thread will execute one at a time in the order of the loop iterations. With either
25 parallelization-level, execution of code outside the region for different iterations can run in parallel;
26 execution of that code within the same iteration must observe any constraints imposed by the
27 base-language semantics.
28 When the thread that is executing the first iteration of the loop encounters an ordered construct,
29 it can enter the ordered region without waiting. When a thread that is executing any subsequent
30 iteration encounters a block-associated ordered construct, it waits at the beginning of the
31 ordered region until execution of all ordered regions that belong to all previous iterations has
32 completed. ordered regions that bind to different regions execute independently of each other.
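As an illustration of these semantics, the following C sketch computes values concurrently but appends them to an output array in loop-iteration order. The helper name squares_in_order and the choice of schedule(dynamic, 1) are assumptions made for the example only; without OpenMP the pragmas are ignored and the loop runs serially with the same result.

```c
/* Hypothetical helper: squares in[0..n-1] and appends the results to out[]
   in iteration order. Returns the number of values written. */
int squares_in_order(const int *in, int *out, int n)
{
    int count = 0;
    /* The ordered clause is required on the loop construct to which the
       block-associated ordered region binds. */
    #pragma omp parallel for ordered schedule(dynamic, 1) shared(count)
    for (int i = 0; i < n; i++) {
        /* Outside the ordered region: different iterations may overlap. */
        int sq = in[i] * in[i];

        /* Inside the ordered region: one iteration at a time, in loop order. */
        #pragma omp ordered
        {
            out[count++] = sq;
        }
    }
    return count;
}
```

Because the ordered regions execute sequentially in iteration order, the unsynchronized update of count inside the region does not race, and out receives the squares in the same order as in.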
18 Cross References
19 • Worksharing-Loop Constructs, see Section 11.5
20 • ordered clause, see Section 4.4.4
21 • parallelization-level Clauses, see Section 15.10.3
22 • simd directive, see Section 10.4
26 Directives
27 ordered
28 Semantics
29 The parallelization-level clause grouping defines a set of clauses that indicate the level of
30 parallelization with which to associate a construct.
31 Cross References
32 • ordered directive, see Section 15.10.2
5 Clauses
6 if, do, for, parallel, sections, taskgroup
7 Additional information
8 The cancel-directive-name clause set consists of the directive-name of each directive that has the
9 cancellable property (i.e., directive-name for the worksharing-loop construct, parallel,
10 sections and taskgroup). This clause set has the required, unique and exclusive properties.
11 Binding
12 The binding thread set of the cancel region is the current team. The binding region of the
13 cancel region is the innermost enclosing region of the type that corresponds to
14 cancel-directive-name.
15 Semantics
16 The cancel construct activates cancellation of the innermost enclosing region of the type
17 specified by cancel-directive-name, which must be the directive-name of a cancellable construct.
18 Cancellation of the binding region is activated only if the cancel-var ICV is true, in which case the
19 cancel construct causes the encountering task to continue execution at the end of the binding
20 region if cancel-directive-name is not taskgroup. If the cancel-var ICV is true and
21 cancel-directive-name is taskgroup, the encountering task continues execution at the end of the
22 current task region. If the cancel-var ICV is false, the cancel construct is ignored.
23 Threads check for active cancellation only at cancellation points that are implied at the following
24 locations:
25 • cancel regions;
26 • cancellation point regions;
27 • barrier regions;
1 • at the end of a worksharing-loop construct with a nowait clause and for which the same list
2 item appears in both firstprivate and lastprivate clauses; and
3 • implicit barrier regions.
4 When a thread reaches one of the above cancellation points and if the cancel-var ICV is true, then:
5 • If the thread is at a cancel or cancellation point region and cancel-directive-name is
6 not taskgroup, the thread continues execution at the end of the canceled region if cancellation
7 has been activated for the innermost enclosing region of the type specified.
8 • If the thread is at a cancel or cancellation point region and cancel-directive-name is
9 taskgroup, the encountering task checks for active cancellation of all of the taskgroup sets to
10 which the encountering task belongs, and continues execution at the end of the current task
11 region if cancellation has been activated for any of the taskgroup sets.
12 • If the encountering task is at a barrier region or at the end of a worksharing-loop construct with a
13 nowait clause and for which the same list item appears in both firstprivate and
14 lastprivate clauses, the encountering task checks for active cancellation of the innermost
15 enclosing parallel region. If cancellation has been activated, then the encountering task
16 continues execution at the end of the canceled region.
17 When cancellation of tasks is activated through a cancel construct with taskgroup for
18 cancel-directive-name, the tasks that belong to the taskgroup set of the innermost enclosing
19 taskgroup region will be canceled. The task that encountered that construct continues execution
20 at the end of its task region, which implies completion of that task. Any task that belongs to the
21 innermost enclosing taskgroup and has already begun execution must run to completion or until
22 a cancellation point is reached. Upon reaching a cancellation point and if cancellation is active, the
23 task continues execution at the end of its task region, which implies the completion of the task. Any
24 task that belongs to the innermost enclosing taskgroup and that has not begun execution may be
25 discarded, which implies its completion.
When cancellation is activated through a cancel construct with a cancel-directive-name
other than taskgroup, each thread of the binding thread set resumes execution at the end of the
canceled region if a cancellation point is encountered. If the canceled region is a parallel region,
any tasks that have been created by a task or a taskloop construct and their descendent tasks
are canceled according to the above taskgroup cancellation semantics. If the canceled region is
not a parallel region, no task cancellation occurs.
C++
32 The usual C++ rules for object destruction are followed when cancellation is performed.
C++
Fortran
33 All private objects or subobjects with ALLOCATABLE attribute that are allocated inside the
34 canceled construct are deallocated.
Fortran
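A minimal C sketch of worksharing-loop cancellation follows. The helper name find_match is hypothetical, and the example assumes that the key occurs at most once in the data, so that the result does not depend on which thread finds it. Cancellation takes effect only when the cancel-var ICV is true (in practice usually enabled through the OMP_CANCELLATION environment variable before the program starts); otherwise the cancel construct is ignored and every iteration runs, which yields the same answer more slowly.

```c
/* Hypothetical helper: returns the index of key in data[0..n-1], or -1.
   Assumes key occurs at most once. */
int find_match(const int *data, int n, int key)
{
    int found = -1;
    #pragma omp parallel shared(found)
    {
        #pragma omp for
        for (int i = 0; i < n; i++) {
            if (data[i] == key) {
                #pragma omp atomic write
                found = i;
                /* Request cancellation of the innermost enclosing loop region.
                   The encountering task continues at the end of that region. */
                #pragma omp cancel for
            }
            /* Other threads notice the request only at cancellation points. */
            #pragma omp cancellation point for
        }
    }
    return found;
}
```

The explicit cancellation point lets threads that did not find the key leave the loop early; without it they would check for cancellation only at the barrier that ends the loop.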
16 Tool Callbacks
17 A thread dispatches a registered ompt_callback_cancel callback for each occurrence of a
18 cancel event in the context of the encountering task. This callback has type signature
19 ompt_callback_cancel_t; (flags & ompt_cancel_activated) always evaluates to
20 true in the dispatched callback; (flags & ompt_cancel_parallel) evaluates to true in the
21 dispatched callback if cancel-directive-name is parallel;
22 (flags & ompt_cancel_sections) evaluates to true in the dispatched callback if
23 cancel-directive-name is sections; (flags & ompt_cancel_loop) evaluates to true in the
24 dispatched callback if cancel-directive-name is for or do; and
25 (flags & ompt_cancel_taskgroup) evaluates to true in the dispatched callback if
26 cancel-directive-name is taskgroup.
27 A thread dispatches a registered ompt_callback_cancel callback with the ompt_data_t
28 associated with the discarded task as its task_data argument and
29 ompt_cancel_discarded_task as its flags argument for each occurrence of a
30 discarded-task event. The callback occurs in the context of the task that discards the task and has
31 type signature ompt_callback_cancel_t.
32 Restrictions
33 Restrictions to the cancel construct are as follows:
34 • The behavior for concurrent cancellation of a region and a region nested within it is unspecified.
35 • If cancel-directive-name is taskgroup, the cancel construct must be closely nested inside a
36 task or a taskloop construct and the cancel region must be closely nested inside a
37 taskgroup region.
20 Cross References
21 • omp_get_cancellation, see Section 18.2.8
22 • ompt_callback_cancel_t, see Section 19.5.2.18
23 • ompt_cancel_flag_t, see Section 19.4.4.26
24 • barrier directive, see Section 15.3.1
25 • cancel-var ICV, see Table 2.1
26 • cancellation point directive, see Section 16.2
27 • declare reduction directive, see Section 5.5.11
28 • do directive, see Section 11.5.2
29 • firstprivate clause, see Section 5.4.4
30 • for directive, see Section 11.5.1
31 • if clause, see Section 3.4
32 • nowait clause, see Section 15.6
33 • ordered clause, see Section 4.4.4
34 • parallel directive, see Section 10.1
8 Clauses
9 do, for, parallel, sections, taskgroup
10 Additional information
11 The cancel-directive-name clause set consists of the directive-name of each directive that has the
12 cancellable property (i.e., directive-name for the worksharing-loop construct, parallel,
13 sections and taskgroup). This clause set has the required, unique and exclusive properties.
14 Binding
15 The binding thread set of the cancellation point construct is the current team. The binding
16 region of the cancellation point region is the innermost enclosing region of the type that
17 corresponds to cancel-directive-name.
18 Semantics
19 The cancellation point construct introduces a user-defined cancellation point at which an
20 implicit or explicit task must check if cancellation of the innermost enclosing region of the type
21 specified by cancel-directive-name, which must be the directive-name of a cancellable construct,
22 has been activated. This construct does not implement any synchronization between threads or
23 tasks. When an implicit or explicit task reaches a user-defined cancellation point and if the
24 cancel-var ICV is true, then:
25 • If the cancel-directive-name of the encountered cancellation point construct is not
26 taskgroup, the thread continues execution at the end of the canceled region if cancellation has
27 been activated for the innermost enclosing region of the type specified.
28 • If the cancel-directive-name of the encountered cancellation point construct is
29 taskgroup, the encountering task checks for active cancellation of all taskgroup sets to which
30 the encountering task belongs and continues execution at the end of the current task region if
31 cancellation has been activated for any of them.
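The taskgroup case can be sketched in C as follows. The helper name task_search is hypothetical, and the example assumes that the key occurs at most once so that the result is deterministic. Each task first checks a user-defined cancellation point, so tasks that start after cancellation has been activated return immediately; tasks that have not started may be discarded by the implementation. If the cancel-var ICV is false, both directives are ignored and all tasks run to completion.

```c
/* Hypothetical helper: task-based search for key in data[0..n-1].
   Returns its index or -1. Assumes key occurs at most once. */
int task_search(const int *data, int n, int key)
{
    int found = -1;
    #pragma omp parallel shared(found)
    #pragma omp single
    #pragma omp taskgroup
    {
        for (int i = 0; i < n; i++) {
            #pragma omp task firstprivate(i) shared(found)
            {
                /* Continue at the end of this task region if cancellation of
                   the enclosing taskgroup has already been activated. */
                #pragma omp cancellation point taskgroup
                if (data[i] == key) {
                    #pragma omp atomic write
                    found = i;
                    #pragma omp cancel taskgroup
                }
            }
        }
    }
    return found;
}
```

Both directives are closely nested inside a task construct, and the resulting regions are closely nested inside the taskgroup region, as the restrictions require.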
4 Tool Callbacks
5 A thread dispatches a registered ompt_callback_cancel callback for each occurrence of a
6 cancel event in the context of the encountering task. This callback has type signature
7 ompt_callback_cancel_t; (flags & ompt_cancel_detected) always evaluates to true
8 in the dispatched callback; (flags & ompt_cancel_parallel) evaluates to true in the
9 dispatched callback if cancel-directive-name of the encountered cancellation point
10 construct is parallel; (flags & ompt_cancel_sections) evaluates to true in the
11 dispatched callback if cancel-directive-name of the encountered cancellation point
12 construct is sections; (flags & ompt_cancel_loop) evaluates to true in the dispatched
13 callback if cancel-directive-name of the encountered cancellation point construct is for
14 or do; and (flags & ompt_cancel_taskgroup) evaluates to true in the dispatched callback if
15 cancel-directive-name of the encountered cancellation point construct is taskgroup.
16 Restrictions
17 Restrictions to the cancellation point construct are as follows:
18 • A cancellation point construct for which cancel-directive-name is taskgroup must be
19 closely nested inside a task or taskloop construct, and the cancellation point region
20 must be closely nested inside a taskgroup region.
21 • A cancellation point construct for which cancel-directive-name is not taskgroup must
22 be closely nested inside an OpenMP construct that matches cancel-directive-name.
23 Cross References
24 • omp_get_cancellation, see Section 18.2.8
25 • ompt_callback_cancel_t, see Section 19.5.2.18
26 • cancel-var ICV, see Table 2.1
27 • do directive, see Section 11.5.2
28 • for directive, see Section 11.5.1
29 • parallel directive, see Section 10.1
30 • sections directive, see Section 11.3
31 • taskgroup directive, see Section 15.4
1 • If a target construct is encountered during execution of a target region and a device
2 clause in which the ancestor device-modifier appears is not present on the construct, the
3 behavior is unspecified.
4 • A teams region must be strictly nested either within the implicit parallel region that surrounds
5 the whole OpenMP program or within a target region. If a teams construct is nested within
6 a target construct, that target construct must contain no statements, declarations or
7 directives outside of the teams construct.
8 • distribute regions, including any distribute regions arising from composite constructs,
9 parallel regions, including any parallel regions arising from combined constructs, loop
10 regions, omp_get_num_teams() regions, and omp_get_team_num() regions are the
11 only OpenMP regions that may be strictly nested inside the teams region.
12 • A loop region that binds to a teams region must be strictly nested inside a teams region.
13 • A distribute region must be strictly nested inside a teams region.
14 • If cancel-directive-name is taskgroup, the cancel construct must be closely nested inside a
15 task construct and the cancel region must be closely nested inside a taskgroup region.
16 Otherwise, the cancel construct must be closely nested inside an OpenMP construct for which
17 directive-name is cancel-directive-name.
18 • A cancellation point construct for which cancel-directive-name is taskgroup must be
19 closely nested inside a task construct, and the cancellation point region must be closely
20 nested inside a taskgroup region. Otherwise, a cancellation point construct must be
21 closely nested inside an OpenMP construct for which directive-name is cancel-directive-name.
22 • The only constructs that may be encountered inside a region that corresponds to a construct with
23 an order clause that specifies concurrent are the loop, parallel and simd constructs,
24 and combined constructs for which directive-name-A is parallel.
25 • A region that corresponds to a construct with an order clause that specifies concurrent may
26 not contain calls to the OpenMP Runtime API or to procedures that contain OpenMP directives.
34 Restrictions
35 Restrictions to clauses on combined and composite constructs are as follows:
36 • A clause that appears on a combined or composite construct must apply to at least one of the leaf
37 constructs per the rules defined in this section.
29 Cross References
30 • distribute directive, see Section 11.6
31 • do directive, see Section 11.5.2
32 • for directive, see Section 11.5.1
33 • loop directive, see Section 11.7
34 • masked directive, see Section 10.5
35 • parallel directive, see Section 10.1
12 Restrictions
13 Restrictions to combined constructs are as follows:
14 • The restrictions of directive-name-A and directive-name-B apply.
15 • If directive-name-A is parallel, the nowait and in_reduction clauses must not be
16 specified.
17 • If directive-name-A is target, the copyin clause must not be specified.
18 Cross References
19 • copyin clause, see Section 5.7.1
20 • in_reduction clause, see Section 5.5.10
21 • nowait clause, see Section 15.6
22 • parallel directive, see Section 10.1
23 • target directive, see Section 13.8
7 Restrictions
8 Restrictions to composite constructs are as follows:
9 • The restrictions of directive-name-A and directive-name-B apply.
10 • If directive-name-A is distribute, the linear clause may only be specified for loop
11 iteration variables of loops that are associated with the construct.
12 • If directive-name-A is distribute, the ordered clause must not be specified.
13 Cross References
14 • distribute directive, see Section 11.6
15 • in_reduction clause, see Section 5.5.10
16 • linear clause, see Section 5.4.6
17 • ordered clause, see Section 4.4.4
18 • reduction clause, see Section 5.5.8
19 • simd directive, see Section 10.4
20 • taskloop directive, see Section 12.6
Fortran
7 true means a logical value of .TRUE. and false means a logical value of .FALSE..
Fortran
Fortran
8 Restrictions
9 The following restrictions apply to all OpenMP runtime library routines:
10 • OpenMP runtime library routines may not be called from PURE or ELEMENTAL procedures.
11 • OpenMP runtime library routines may not be called in DO CONCURRENT constructs.
Fortran
18.2.1 omp_set_num_threads
23 Summary
24 The omp_set_num_threads routine affects the number of threads to be used for subsequent
25 parallel regions that do not specify a num_threads clause, by setting the value of the first
26 element of the nthreads-var ICV of the current task.
27 Format
C / C++
void omp_set_num_threads(int num_threads);
C / C++
6 Binding
7 The binding task set for an omp_set_num_threads region is the generating task.
8 Effect
9 The effect of this routine is to set the value of the first element of the nthreads-var ICV of the
10 current task to the value specified in the argument.
11 Cross References
12 • Determining the Number of Threads for a parallel Region, see Section 10.1.1
13 • nthreads-var ICV, see Table 2.1
14 • num_threads clause, see Section 10.1.2
15 • parallel directive, see Section 10.1
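A short C sketch of the routine in use follows. The helper name team_size_after_request is hypothetical, and the serial stand-in functions in the #else branch are an assumption added so that the sketch also builds when OpenMP is not enabled. The request is an upper bound rather than a guarantee: the implementation may provide fewer threads than requested.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): used only when OpenMP is not enabled. */
static void omp_set_num_threads(int n) { (void)n; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: request a team size for subsequent parallel regions
   and report the size of the team that was actually formed. */
int team_size_after_request(int requested)
{
    int actual = 0;
    omp_set_num_threads(requested);   /* sets the first element of nthreads-var */
    #pragma omp parallel              /* no num_threads clause: the ICV applies */
    {
        #pragma omp single
        actual = omp_get_num_threads();
    }
    return actual;
}
```

For a request of 3, the returned value lies between 1 and 3 inclusive.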
18.2.2 omp_get_num_threads
17 Summary
18 The omp_get_num_threads routine returns the number of threads in the current team.
19 Format
C / C++
int omp_get_num_threads(void);
C / C++
Fortran
integer function omp_get_num_threads()
Fortran
22 Binding
23 The binding region for an omp_get_num_threads region is the innermost enclosing parallel
24 region.
25 Effect
26 The omp_get_num_threads routine returns the number of threads in the team that is executing
27 the parallel region to which the routine region binds.
18.2.4 omp_get_thread_num
22 Summary
23 The omp_get_thread_num routine returns the thread number, within the current team, of the
24 calling thread.
25 Format
C / C++
int omp_get_thread_num(void);
C / C++
Fortran
integer function omp_get_thread_num()
Fortran
4 Effect
5 The omp_get_thread_num routine returns the thread number of the calling thread, within the
6 team that is executing the parallel region to which the routine region binds. The thread number is
7 an integer between 0 and one less than the value returned by omp_get_num_threads,
8 inclusive. The thread number of the primary thread of the team is 0.
9 Cross References
10 • omp_get_num_threads, see Section 18.2.2
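The range property stated above can be checked with a small C sketch. The helper name thread_numbers_in_range is hypothetical, and the serial stand-ins in the #else branch are an assumption so that the sketch also builds when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): used only when OpenMP is not enabled. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: returns 1 if every thread of a team reports a number
   in [0, team size - 1] and exactly one thread reports 0. */
int thread_numbers_in_range(void)
{
    int ok = 1;
    int primaries = 0;
    #pragma omp parallel shared(ok, primaries)
    {
        int tid = omp_get_thread_num();
        int team = omp_get_num_threads();
        if (tid < 0 || tid > team - 1) {
            #pragma omp atomic write
            ok = 0;
        }
        if (tid == 0) {
            /* Only the primary thread of the team has thread number 0. */
            #pragma omp atomic update
            primaries++;
        }
    }
    return ok && primaries == 1;
}
```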
18.2.5 omp_in_parallel
12 Summary
13 The omp_in_parallel routine returns true if the active-levels-var ICV is greater than zero;
14 otherwise, it returns false.
15 Format
C / C++
int omp_in_parallel(void);
C / C++
Fortran
logical function omp_in_parallel()
Fortran
18 Binding
19 The binding task set for an omp_in_parallel region is the generating task.
20 Effect
21 The effect of the omp_in_parallel routine is to return true if the current task is enclosed by an
22 active parallel region, and the parallel region is enclosed by the outermost initial task
23 region on the device; otherwise it returns false.
24 Cross References
25 • active-levels-var ICV, see Table 2.1
26 • parallel directive, see Section 10.1
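The following C sketch contrasts the result outside and inside a parallel region. The names in_parallel_probe and probe_in_parallel are hypothetical, and the serial stand-ins in the #else branch are an assumption so that the sketch also builds when OpenMP is not enabled. Because a parallel region executed by a single thread is inactive, the sketch records the team size as well: inside a non-nested region, the routine is expected to return true exactly when the team has more than one thread.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): used only when OpenMP is not enabled. */
static int omp_in_parallel(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical record of what omp_in_parallel reported. */
typedef struct {
    int outside;  /* result in the sequential part */
    int inside;   /* result inside the parallel region */
    int team;     /* team size of that region */
} in_parallel_probe;

in_parallel_probe probe_in_parallel(void)
{
    in_parallel_probe p;
    p.outside = omp_in_parallel();
    p.inside = 0;
    p.team = 1;
    #pragma omp parallel num_threads(2) shared(p)
    {
        #pragma omp single
        {
            p.inside = omp_in_parallel();
            p.team = omp_get_num_threads();
        }
    }
    return p;
}
```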
6 Format
C / C++
void omp_set_dynamic(int dynamic_threads);
C / C++
Fortran
subroutine omp_set_dynamic(dynamic_threads)
logical dynamic_threads
Fortran
10 Binding
11 The binding task set for an omp_set_dynamic region is the generating task.
12 Effect
13 For implementations that support dynamic adjustment of the number of threads, if the argument to
14 omp_set_dynamic evaluates to true, dynamic adjustment is enabled for the current task;
15 otherwise, dynamic adjustment is disabled for the current task. For implementations that do not
16 support dynamic adjustment of the number of threads, this routine has no effect: the value of
17 dyn-var remains false.
18 Cross References
19 • dyn-var ICV, see Table 2.1
18.2.7 omp_get_dynamic
21 Summary
22 The omp_get_dynamic routine returns the value of the dyn-var ICV, which determines whether
23 dynamic adjustment of the number of threads is enabled or disabled.
24 Format
C / C++
int omp_get_dynamic(void);
C / C++
Fortran
logical function omp_get_dynamic()
Fortran
3 Effect
4 This routine returns true if dynamic adjustment of the number of threads is enabled for the current
5 task; otherwise, it returns false. If an implementation does not support dynamic adjustment of the
6 number of threads, then this routine always returns false.
7 Cross References
8 • dyn-var ICV, see Table 2.1
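A minimal C sketch of reading and changing the dyn-var ICV follows. The helper name disable_dynamic_adjustment is hypothetical, and the serial stand-ins in the #else branch are an assumption that models an implementation without dynamic adjustment, for which dyn-var remains false. Disabling is the portable direction: after omp_set_dynamic(0), omp_get_dynamic returns false on any implementation, whereas the result after omp_set_dynamic(1) depends on whether the implementation supports dynamic adjustment.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): model an implementation that does not
   support dynamic adjustment, so dyn-var is always false. */
static void omp_set_dynamic(int flag) { (void)flag; }
static int omp_get_dynamic(void) { return 0; }
#endif

/* Hypothetical helper: disable dynamic adjustment of the number of threads
   for the current task and return the previous setting so that the caller
   can restore it later with omp_set_dynamic. */
int disable_dynamic_adjustment(void)
{
    int previous = omp_get_dynamic();
    omp_set_dynamic(0);
    return previous;
}
```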
18.2.8 omp_get_cancellation
10 Summary
11 The omp_get_cancellation routine returns the value of the cancel-var ICV, which
12 determines if cancellation is enabled or disabled.
13 Format
C / C++
int omp_get_cancellation(void);
C / C++
Fortran
logical function omp_get_cancellation()
Fortran
16 Binding
17 The binding task set for an omp_get_cancellation region is the whole program.
18 Effect
19 This routine returns true if cancellation is enabled. It returns false otherwise.
20 Cross References
21 • cancel-var ICV, see Table 2.1
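One possible use of this routine, sketched in C, is to tell the user when cancel constructs will have no effect. The helper name warn_if_cancellation_disabled and the message text are hypothetical, and the serial stand-in in the #else branch is an assumption so that the sketch also builds when OpenMP is not enabled.

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-in (assumption): without OpenMP nothing can be cancelled. */
static int omp_get_cancellation(void) { return 0; }
#endif

/* Hypothetical helper: returns 1 if cancellation is enabled. Otherwise it
   writes a note to out and returns 0. */
int warn_if_cancellation_disabled(FILE *out)
{
    if (omp_get_cancellation()) {
        return 1;
    }
    fprintf(out, "note: cancel-var is false; cancel constructs are ignored\n");
    return 0;
}
```

A program might call this once at startup, for example as warn_if_cancellation_disabled(stderr), before relying on early exit from a cancellable region.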
26 Format
C / C++
void omp_set_nested(int nested);
C / C++
5 Effect
6 If the argument to omp_set_nested evaluates to true, the value of the max-active-levels-var
7 ICV is set to the number of active levels of parallelism that the implementation supports; otherwise,
8 if the value of max-active-levels-var is greater than 1 then it is set to 1. This routine has been
9 deprecated.
10 Cross References
11 • max-active-levels-var ICV, see Table 2.1
16 Format
C / C++
int omp_get_nested(void);
C / C++
Fortran
logical function omp_get_nested()
Fortran
19 Binding
20 The binding task set for an omp_get_nested region is the generating task.
21 Effect
22 This routine returns true if max-active-levels-var is greater than 1 and greater than active-levels-var
23 for the current task; it returns false otherwise. If an implementation does not support nested
24 parallelism, this routine always returns false. This routine has been deprecated.
25 Cross References
26 • max-active-levels-var ICV, see Table 2.1
5 Format
C / C++
void omp_set_schedule(omp_sched_t kind, int chunk_size);
C / C++
Fortran
subroutine omp_set_schedule(kind, chunk_size)
integer (kind=omp_sched_kind) kind
integer chunk_size
Fortran
10 Constraints on Arguments
11 The first argument passed to this routine can be one of the valid OpenMP schedule kinds (except for
12 runtime) or any implementation-specific schedule. The C/C++ header file (omp.h) and the
13 Fortran include file (omp_lib.h) and/or Fortran module file (omp_lib) define the valid
14 constants. The valid constants must include the following, which can be extended with
15 implementation-specific values:
C / C++
typedef enum omp_sched_t {
    // schedule kinds
    omp_sched_static = 0x1,
    omp_sched_dynamic = 0x2,
    omp_sched_guided = 0x3,
    omp_sched_auto = 0x4,

    // schedule modifier
    omp_sched_monotonic = 0x80000000u
} omp_sched_t;
C / C++
Fortran
! schedule kinds
integer(kind=omp_sched_kind), &
  parameter :: omp_sched_static = &
  int(Z'1', kind=omp_sched_kind)
integer(kind=omp_sched_kind), &
  parameter :: omp_sched_dynamic = &
  int(Z'2', kind=omp_sched_kind)
14 Effect
15 The effect of this routine is to set the value of the run-sched-var ICV of the current task to the
16 values specified in the two arguments. The schedule is set to the schedule kind that is specified by
17 the first argument kind. It can be any of the standard schedule kinds or any other
18 implementation-specific one. For the schedule kinds static, dynamic, and guided, the
19 chunk_size is set to the value of the second argument, or to the default chunk_size if the value of the
20 second argument is less than 1; for the schedule kind auto, the second argument has no meaning;
21 for implementation-specific schedule kinds, the values and associated meanings of the second
22 argument are implementation defined.
23 Each of the schedule kinds can be combined with the omp_sched_monotonic modifier by
24 using the + or | operators in C/C++ or the + operator in Fortran. If the schedule kind is combined
25 with the omp_sched_monotonic modifier, the schedule is modified as if the monotonic
26 schedule modifier was specified. Otherwise, the schedule modifier is nonmonotonic.
27 Cross References
28 • run-sched-var ICV, see Table 2.1
18.2.12 omp_get_schedule
30 Summary
31 The omp_get_schedule routine returns the schedule that is applied when the runtime schedule
32 is used.
33 Format
C / C++
void omp_get_schedule(omp_sched_t *kind, int *chunk_size);
C / C++
6 Effect
7 This routine returns the run-sched-var ICV in the task to which the routine binds. The first
8 argument kind returns the schedule to be used. It can be any of the standard schedule kinds as
9 defined in Section 18.2.11, or any implementation-specific schedule kind. If the returned schedule
10 kind is static, dynamic, or guided, the second argument chunk_size returns the chunk size to
11 be used, or a value less than 1 if the default chunk size is to be used. The value returned by the
12 second argument is implementation defined for any other schedule kinds.
13 Cross References
14 • run-sched-var ICV, see Table 2.1
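The interplay of omp_set_schedule and omp_get_schedule can be sketched in C as follows. The helper name runtime_schedule_round_trip and the macro SCHED_MONOTONIC_BIT are hypothetical; the macro value is the omp_sched_monotonic constant from the enumeration in Section 18.2.11. The stand-ins in the #else branch are an assumption so that the sketch also builds when OpenMP is not enabled. The modifier bit is masked off before the kind is compared, since the returned kind may carry a modifier.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): a tiny model of the run-sched-var ICV. */
typedef enum omp_sched_t {
    omp_sched_static = 0x1,
    omp_sched_dynamic = 0x2,
    omp_sched_guided = 0x3,
    omp_sched_auto = 0x4
} omp_sched_t;
static omp_sched_t stub_kind = omp_sched_static;
static int stub_chunk = 0;
static void omp_set_schedule(omp_sched_t k, int c) { stub_kind = k; stub_chunk = c; }
static void omp_get_schedule(omp_sched_t *k, int *c) { *k = stub_kind; *c = stub_chunk; }
#endif

/* Value of omp_sched_monotonic as listed in the enumeration. */
#define SCHED_MONOTONIC_BIT 0x80000000u

/* Hypothetical helper: select dynamic scheduling with chunk size 4 for loops
   that use schedule(runtime), then read the setting back.
   Returns 1 if the values match. */
int runtime_schedule_round_trip(void)
{
    omp_sched_t kind;
    int chunk = 0;
    unsigned base;

    omp_set_schedule(omp_sched_dynamic, 4);
    omp_get_schedule(&kind, &chunk);

    /* Ignore the schedule modifier when comparing the schedule kind. */
    base = (unsigned)kind & ~SCHED_MONOTONIC_BIT;
    return base == (unsigned)omp_sched_dynamic && chunk == 4;
}
```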
18.2.13 omp_get_thread_limit
16 Summary
17 The omp_get_thread_limit routine returns the maximum number of OpenMP threads
18 available to participate in the current contention group.
19 Format
C / C++
int omp_get_thread_limit(void);
C / C++
Fortran
integer function omp_get_thread_limit()
Fortran
22 Binding
23 The binding task set for an omp_get_thread_limit region is the generating task.
24 Effect
25 The omp_get_thread_limit routine returns the value of the thread-limit-var ICV.
26 Cross References
27 • thread-limit-var ICV, see Table 2.1
5 Format
C / C++
int omp_get_supported_active_levels(void);
C / C++
Fortran
integer function omp_get_supported_active_levels()
Fortran
8 Binding
9 The binding task set for an omp_get_supported_active_levels region is the generating
10 task.
11 Effect
12 The omp_get_supported_active_levels routine returns the number of active levels of
13 parallelism supported by the implementation. The max-active-levels-var ICV cannot have a value
14 that is greater than this number. The value that the omp_get_supported_active_levels
15 routine returns is implementation defined, but it must be greater than 0.
16 Cross References
17 • max-active-levels-var ICV, see Table 2.1
18.2.15 omp_set_max_active_levels
19 Summary
20 The omp_set_max_active_levels routine limits the number of nested active parallel
21 regions when a new nested parallel region is generated by the current task by setting the
22 max-active-levels-var ICV.
23 Format
C / C++
void omp_set_max_active_levels(int max_levels);
C / C++
Fortran
subroutine omp_set_max_active_levels(max_levels)
integer max_levels
Fortran
18.2.16 omp_get_max_active_levels
17 Summary
18 The omp_get_max_active_levels routine returns the value of the max-active-levels-var
19 ICV, which determines the maximum number of nested active parallel regions when the innermost
20 parallel region is generated by the current task.
21 Format
C / C++
int omp_get_max_active_levels(void);
C / C++
Fortran
integer function omp_get_max_active_levels()
Fortran
24 Binding
25 The binding task set for an omp_get_max_active_levels region is the generating task.
26 Effect
27 The omp_get_max_active_levels routine returns the value of the max-active-levels-var
28 ICV. The current task may only generate an active parallel region if the returned value is greater
29 than the value of the active-levels-var ICV.
30 Cross References
31 • max-active-levels-var ICV, see Table 2.1
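A small C sketch of setting and reading back the max-active-levels-var ICV follows. The helper name limit_nesting is hypothetical, and the stand-ins in the #else branch are an assumption that models an implementation supporting a single active level. Since the ICV cannot exceed the number of supported active levels, and that number is greater than 0, a request for 2 levels is expected to read back as either 1 or 2.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): model an implementation that supports
   exactly one active level of parallelism. */
static int stub_max_levels = 1;
static void omp_set_max_active_levels(int n)
{
    if (n >= 0) {
        stub_max_levels = (n > 1) ? 1 : n;
    }
}
static int omp_get_max_active_levels(void) { return stub_max_levels; }
#endif

/* Hypothetical helper: request a limit on nested active parallel regions
   and return the limit that is actually in effect. */
int limit_nesting(int requested)
{
    omp_set_max_active_levels(requested);
    return omp_get_max_active_levels();
}
```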
4 Format
C / C++
int omp_get_level(void);
C / C++
Fortran
integer function omp_get_level()
Fortran
7 Binding
8 The binding task set for an omp_get_level region is the generating task.
9 Effect
10 The effect of the omp_get_level routine is to return the number of nested parallel regions
11 (whether active or inactive) that enclose the current task such that all of the parallel regions are
12 enclosed by the outermost initial task region on the current device.
13 Cross References
14 • levels-var ICV, see Table 2.1
15 • parallel directive, see Section 10.1
18.2.18 omp_get_ancestor_thread_num
17 Summary
18 The omp_get_ancestor_thread_num routine returns, for a given nested level of the current
19 thread, the thread number of the ancestor of the current thread.
20 Format
C / C++
int omp_get_ancestor_thread_num(int level);
C / C++
Fortran
integer function omp_get_ancestor_thread_num(level)
integer level
Fortran
5 Effect
6 The omp_get_ancestor_thread_num routine returns the thread number of the ancestor at a
7 given nest level of the current thread or the thread number of the current thread. If the requested
8 nest level is outside the range of 0 and the nest level of the current thread, as returned by the
9 omp_get_level routine, the routine returns -1.
15 Cross References
16 • omp_get_level, see Section 18.2.17
17 • omp_get_thread_num, see Section 18.2.4
18 • parallel directive, see Section 10.1
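The boundary behavior of the routine can be sketched in C as follows. The helper name ancestor_checks is hypothetical, and the serial stand-ins in the #else branch are an assumption so that the sketch also builds when OpenMP is not enabled. The sketch checks that an out-of-range level yields -1, that level 0 yields thread number 0, and that the current nest level yields the same value as omp_get_thread_num.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): a single thread at nest level 0. */
static int omp_get_level(void) { return 0; }
static int omp_get_thread_num(void) { return 0; }
static int omp_get_ancestor_thread_num(int level) { return level == 0 ? 0 : -1; }
#endif

/* Hypothetical helper: returns 1 if all checks on
   omp_get_ancestor_thread_num hold. */
int ancestor_checks(void)
{
    int ok = 1;

    /* Levels outside [0, omp_get_level()] yield -1. */
    if (omp_get_ancestor_thread_num(-1) != -1) ok = 0;
    if (omp_get_ancestor_thread_num(omp_get_level() + 1) != -1) ok = 0;

    #pragma omp parallel shared(ok)
    {
        int level = omp_get_level();
        /* The ancestor at the current level is the calling thread itself;
           the ancestor at level 0 is the initial thread, numbered 0. */
        int bad = omp_get_ancestor_thread_num(level) != omp_get_thread_num()
               || omp_get_ancestor_thread_num(0) != 0;
        if (bad) {
            #pragma omp atomic write
            ok = 0;
        }
    }
    return ok;
}
```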
18.2.19 omp_get_team_size
20 Summary
21 The omp_get_team_size routine returns, for a given nested level of the current thread, the size
22 of the thread team to which the ancestor or the current thread belongs.
23 Format
C / C++
int omp_get_team_size(int level);
C / C++
Fortran
integer function omp_get_team_size(level)
integer level
Fortran
27 Binding
28 The binding thread set for an omp_get_team_size region is the encountering thread. The
29 binding region for an omp_get_team_size region is the innermost enclosing parallel
30 region.
7 Note – When the omp_get_team_size routine is called with a value of level=0, the routine
8 always returns 1. If level=omp_get_level(), the routine has the same effect as the
9 omp_get_num_threads routine.
11 Cross References
12 • omp_get_level, see Section 18.2.17
13 • omp_get_num_threads, see Section 18.2.2
14 • parallel directive, see Section 10.1
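The two cases named in the note can be checked with a short C sketch. The helper name team_size_checks is hypothetical, and the serial stand-ins in the #else branch are an assumption so that the sketch also builds when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): a single thread at nest level 0. */
static int omp_get_level(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
static int omp_get_team_size(int level) { return level == 0 ? 1 : -1; }
#endif

/* Hypothetical helper: returns 1 if omp_get_team_size agrees with the note:
   level 0 gives 1, and the current level gives omp_get_num_threads(). */
int team_size_checks(void)
{
    int ok = 1;

    if (omp_get_team_size(0) != 1) ok = 0;

    #pragma omp parallel shared(ok)
    {
        if (omp_get_team_size(omp_get_level()) != omp_get_num_threads()) {
            #pragma omp atomic write
            ok = 0;
        }
    }
    return ok;
}
```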
18.2.20 omp_get_active_level
16 Summary
17 The omp_get_active_level routine returns the value of the active-levels-var ICV.
18 Format
C / C++
int omp_get_active_level(void);
C / C++
Fortran
integer function omp_get_active_level()
Fortran
21 Binding
The binding task set for an omp_get_active_level region is the generating task.
23 Effect
24 The effect of the omp_get_active_level routine is to return the number of nested active
25 parallel regions enclosing the current task such that all of the parallel regions are enclosed
26 by the outermost initial task region on the current device.
27 Cross References
28 • active-levels-var ICV, see Table 2.1
29 • parallel directive, see Section 10.1
3 18.3.1 omp_get_proc_bind
4 Summary
5 The omp_get_proc_bind routine returns the thread affinity policy to be used for the
6 subsequent nested parallel regions that do not specify a proc_bind clause.
7 Format
C / C++
omp_proc_bind_t omp_get_proc_bind(void);
C / C++
Fortran
integer (kind=omp_proc_bind_kind) function omp_get_proc_bind()
Fortran
10 Constraints on Arguments
11 The value returned by this routine must be one of the valid affinity policy kinds. The C/C++ header
12 file (omp.h) and the Fortran include file (omp_lib.h) and/or Fortran module file (omp_lib)
13 define the valid constants. The valid constants must include the following:
C / C++
typedef enum omp_proc_bind_t {
  omp_proc_bind_false = 0,
  omp_proc_bind_true = 1,
  omp_proc_bind_primary = 2,
  omp_proc_bind_master = omp_proc_bind_primary, // (deprecated)
  omp_proc_bind_close = 3,
  omp_proc_bind_spread = 4
} omp_proc_bind_t;
C / C++
Fortran
integer (kind=omp_proc_bind_kind), &
  parameter :: omp_proc_bind_false = 0
integer (kind=omp_proc_bind_kind), &
  parameter :: omp_proc_bind_true = 1
integer (kind=omp_proc_bind_kind), &
  parameter :: omp_proc_bind_primary = 2
10 Effect
11 The effect of this routine is to return the value of the first element of the bind-var ICV of the current
12 task. See Section 10.1.3 for the rules that govern the thread affinity policy.
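A program that wants to report the policy in effect might translate the returned value into a name, as in the following sketch. The helpers proc_bind_name and current_proc_bind_name are hypothetical. The lookup relies only on the numeric values 0 through 4 of the constants listed under Constraints on Arguments, and the serial stand-ins in the #else branch are assumptions that let the sketch build without an OpenMP compiler.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Hypothetical serial stand-ins so the sketch also builds without OpenMP. */
typedef int omp_proc_bind_t;
static omp_proc_bind_t omp_get_proc_bind(void) { return 0; }
#endif

/* Maps a policy value to its name, using the numeric values of the
 * omp_proc_bind_t constants (false=0, true=1, primary=2, close=3, spread=4). */
static const char *proc_bind_name(omp_proc_bind_t policy)
{
    static const char *const names[] = {
        "false", "true", "primary", "close", "spread"
    };
    int value = (int)policy;
    return (value >= 0 && value <= 4) ? names[value] : "unknown";
}

/* Name of the policy that applies to the next nested parallel region
 * that does not specify a proc_bind clause. */
static const char *current_proc_bind_name(void)
{
    return proc_bind_name(omp_get_proc_bind());
}
```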
13 Cross References
14 • Controlling OpenMP Thread Affinity, see Section 10.1.3
15 • bind-var ICV, see Table 2.1
16 • parallel directive, see Section 10.1
17 18.3.2 omp_get_num_places
18 Summary
19 The omp_get_num_places routine returns the number of places available to the execution
20 environment in the place list.
21 Format
C / C++
int omp_get_num_places(void);
C / C++
Fortran
integer function omp_get_num_places()
Fortran
24 Binding
25 The binding thread set for an omp_get_num_places region is all threads on a device. The
26 effect of executing this routine is not related to any specific region corresponding to any construct
27 or API routine.
5 Cross References
6 • place-partition-var ICV, see Table 2.1
7 18.3.3 omp_get_place_num_procs
8 Summary
9 The omp_get_place_num_procs routine returns the number of processors available to the
10 execution environment in the specified place.
11 Format
C / C++
int omp_get_place_num_procs(int place_num);
C / C++
Fortran
integer function omp_get_place_num_procs(place_num)
integer place_num
Fortran
15 Binding
16 The binding thread set for an omp_get_place_num_procs region is all threads on a device.
17 The effect of executing this routine is not related to any specific region corresponding to any
18 construct or API routine.
19 Effect
20 The omp_get_place_num_procs routine returns the number of processors associated with
21 the place numbered place_num. The routine returns zero when place_num is negative or is greater
22 than or equal to the value returned by omp_get_num_places().
23 Cross References
24 • omp_get_num_places, see Section 18.3.2
25 18.3.4 omp_get_place_proc_ids
26 Summary
27 The omp_get_place_proc_ids routine returns the numerical identifiers of the processors
28 available to the execution environment in the specified place.
10 Effect
11 The omp_get_place_proc_ids routine returns the numerical identifiers of each processor
12 associated with the place numbered place_num. The numerical identifiers are non-negative and
13 their meaning is implementation defined. The numerical identifiers are returned in the array ids and
14 their order in the array is implementation defined. The array must be sufficiently large to contain
15 omp_get_place_num_procs(place_num) integers; otherwise, the behavior is unspecified.
16 The routine has no effect when place_num has a negative value or a value greater than or equal to
17 omp_get_num_places().
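The sizing requirement on the ids array suggests a query-then-fill pattern, sketched below. The helper place_proc_ids is hypothetical. The call assumes the C prototype void omp_get_place_proc_ids(int place_num, int *ids) as declared in omp.h, and the serial stand-ins in the #else branch are assumptions that report an empty place list when no OpenMP compiler is used.

```c
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Hypothetical serial stand-ins: an empty place list. */
static int omp_get_num_places(void) { return 0; }
static int omp_get_place_num_procs(int place_num) { (void)place_num; return 0; }
static void omp_get_place_proc_ids(int place_num, int *ids)
{
    (void)place_num;
    (void)ids;
}
#endif

/* Returns a malloc'ed array with the processor identifiers of place
 * place_num and stores its length in *count. Returns NULL with *count == 0
 * for an invalid place, an empty place, or an allocation failure.
 * The caller frees the result. */
static int *place_proc_ids(int place_num, int *count)
{
    *count = 0;
    if (place_num < 0 || place_num >= omp_get_num_places())
        return NULL;                 /* the routine would have no effect */
    int n = omp_get_place_num_procs(place_num);
    if (n <= 0)
        return NULL;
    int *ids = malloc((size_t)n * sizeof *ids);
    if (ids == NULL)
        return NULL;
    omp_get_place_proc_ids(place_num, ids);   /* array holds n integers */
    *count = n;
    return ids;
}
```

Checking place_num before the call avoids reading an array that the routine never filled, since the routine has no effect for an out-of-range place number.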
18 Cross References
19 • OMP_PLACES, see Section 21.1.6
20 • omp_get_num_places, see Section 18.3.2
21 • omp_get_place_num_procs, see Section 18.3.3
22 18.3.5 omp_get_place_num
23 Summary
24 The omp_get_place_num routine returns the place number of the place to which the
25 encountering thread is bound.
26 Format
C / C++
int omp_get_place_num(void);
C / C++
Fortran
integer function omp_get_place_num()
Fortran
3 Effect
4 When the encountering thread is bound to a place, the omp_get_place_num routine returns the
5 place number associated with the thread. The returned value is between 0 and one less than the
6 value returned by omp_get_num_places(), inclusive. When the encountering thread is not
7 bound to a place, the routine returns -1.
8 Cross References
9 • omp_get_num_places, see Section 18.3.2
10 18.3.6 omp_get_partition_num_places
11 Summary
12 The omp_get_partition_num_places routine returns the number of places in the place
13 partition of the innermost implicit task.
14 Format
C / C++
int omp_get_partition_num_places(void);
C / C++
Fortran
integer function omp_get_partition_num_places()
Fortran
17 Binding
18 The binding task set for an omp_get_partition_num_places region is the encountering
19 implicit task.
20 Effect
21 The omp_get_partition_num_places routine returns the number of places in the
22 place-partition-var ICV.
23 Cross References
24 • place-partition-var ICV, see Table 2.1
5 Format
C / C++
void omp_get_partition_place_nums(int *place_nums);
C / C++
Fortran
subroutine omp_get_partition_place_nums(place_nums)
integer place_nums(*)
Fortran
9 Binding
10 The binding task set for an omp_get_partition_place_nums region is the encountering
11 implicit task.
12 Effect
13 The omp_get_partition_place_nums routine returns the list of place numbers that
14 correspond to the places in the place-partition-var ICV of the innermost implicit task. The array
15 must be sufficiently large to contain omp_get_partition_num_places() integers;
16 otherwise, the behavior is unspecified.
17 Cross References
18 • omp_get_partition_num_places, see Section 18.3.6
19 • place-partition-var ICV, see Table 2.1
20 18.3.8 omp_set_affinity_format
21 Summary
22 The omp_set_affinity_format routine sets the affinity format to be used on the device by
23 setting the value of the affinity-format-var ICV.
24 Format
C / C++
void omp_set_affinity_format(const char *format);
C / C++
Fortran
subroutine omp_set_affinity_format(format)
character(len=*),intent(in) :: format
Fortran
6 Effect
The effect of the omp_set_affinity_format routine is to copy the character string specified by
the format argument into the affinity-format-var ICV on the current device.
9 This routine has the described effect only when called from a sequential part of the program. When
10 called from within a parallel or teams region, the effect of this routine is implementation
11 defined.
12 Cross References
13 • Controlling OpenMP Thread Affinity, see Section 10.1.3
14 • OMP_AFFINITY_FORMAT, see Section 21.2.5
15 • OMP_DISPLAY_AFFINITY, see Section 21.2.4
16 • omp_capture_affinity, see Section 18.3.11
17 • omp_display_affinity, see Section 18.3.10
18 • omp_get_affinity_format, see Section 18.3.9
19 18.3.9 omp_get_affinity_format
20 Summary
21 The omp_get_affinity_format routine returns the value of the affinity-format-var ICV on
22 the device.
23 Format
C / C++
size_t omp_get_affinity_format(char *buffer, size_t size);
C / C++
Fortran
integer function omp_get_affinity_format(buffer)
character(len=*),intent(out) :: buffer
Fortran
27 Binding
28 When called from a sequential part of the program, the binding thread set for an
29 omp_get_affinity_format region is the encountering thread. When called from within any
30 parallel or teams region, the binding thread set (and binding region, if required) for the
31 omp_get_affinity_format region is implementation defined.
14 Cross References
15 • affinity-format-var ICV, see Table 2.1
16 • parallel directive, see Section 10.1
17 • teams directive, see Section 10.2
18 18.3.10 omp_display_affinity
19 Summary
20 The omp_display_affinity routine prints the OpenMP thread affinity information using the
21 format specification provided.
22 Format
C / C++
void omp_display_affinity(const char *format);
C / C++
Fortran
subroutine omp_display_affinity(format)
character(len=*),intent(in) :: format
Fortran
26 Binding
27 The binding thread set for an omp_display_affinity region is the encountering thread.
7 Cross References
8 • affinity-format-var ICV, see Table 2.1
9 18.3.11 omp_capture_affinity
10 Summary
11 The omp_capture_affinity routine prints the OpenMP thread affinity information into a
12 buffer using the format specification provided.
13 Format
C / C++
size_t omp_capture_affinity(
  char *buffer,
  size_t size,
  const char *format
);
C / C++
Fortran
integer function omp_capture_affinity(buffer,format)
character(len=*),intent(out) :: buffer
character(len=*),intent(in) :: format
Fortran
22 Binding
23 The binding thread set for an omp_capture_affinity region is the encountering thread.
24 Effect
C / C++
The omp_capture_affinity routine returns the number of characters in the entire thread
affinity information string excluding the terminating null byte ('\0'). If size is non-zero, it writes
the thread affinity information of the current thread in the format specified by the format argument
into the character string buffer followed by a null byte. If the return value is greater than or equal to
size, the thread affinity information string is truncated, with the terminating null byte stored to
buffer[size-1]. If size is zero, nothing is stored and buffer may be NULL. If the format is NULL
or a zero-length string, the value of the affinity-format-var ICV is used.
C / C++
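Because the return value reports the full length even when nothing is stored, a caller can size its buffer with a first call that passes a size of zero. The following sketch shows that two-call pattern. The helper affinity_string is hypothetical, and the stand-in in the #else branch is an assumption that mimics the return-value convention with a fixed string so that the sketch builds without an OpenMP compiler.

```c
#include <stdio.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Hypothetical serial stand-in that follows the same return-value and
 * truncation convention, using a fixed string. */
static size_t omp_capture_affinity(char *buffer, size_t size, const char *format)
{
    (void)format;
    return (size_t)snprintf(buffer, size, "%s", "serial");
}
#endif

/* Returns a malloc'ed, null-terminated affinity string for the calling
 * thread, or NULL if allocation fails. The caller frees the result. */
static char *affinity_string(const char *format)
{
    /* A size of zero stores nothing and reports the full length. */
    size_t length = omp_capture_affinity(NULL, 0, format);
    char *buffer = malloc(length + 1);
    if (buffer == NULL)
        return NULL;
    /* The buffer now has room for every character plus the null byte. */
    omp_capture_affinity(buffer, length + 1, format);
    return buffer;
}
```

Passing NULL as the format selects the value of the affinity-format-var ICV, so the sketch does not need to assume any particular format field names.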
9 Cross References
10 • affinity-format-var ICV, see Table 2.1
14 18.4.1 omp_get_num_teams
15 Summary
16 The omp_get_num_teams routine returns the number of initial teams in the current teams
17 region.
18 Format
C / C++
int omp_get_num_teams(void);
C / C++
Fortran
integer function omp_get_num_teams()
Fortran
21 Binding
The binding task set for an omp_get_num_teams region is the generating task.
23 Effect
24 The effect of this routine is to return the number of initial teams in the current teams region. The
25 routine returns 1 if it is called from outside of a teams region.
3 18.4.2 omp_get_team_num
4 Summary
5 The omp_get_team_num routine returns the initial team number of the calling thread.
6 Format
C / C++
int omp_get_team_num(void);
C / C++
Fortran
integer function omp_get_team_num()
Fortran
9 Binding
10 The binding task set for an omp_get_team_num region is the generating task.
11 Effect
12 The omp_get_team_num routine returns the initial team number of the calling thread. The
13 initial team number is an integer between 0 and one less than the value returned by
14 omp_get_num_teams(), inclusive. The routine returns 0 if it is called outside of a teams
15 region.
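Since the initial team number lies between 0 and omp_get_num_teams() - 1, the two routines can be combined to split an index range among initial teams by hand. The sketch below is one such block partition. The helper team_range is hypothetical, and the serial stand-ins in the #else branch are assumptions that return the values specified for calls made outside of a teams region.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Hypothetical serial stand-ins: the values specified for calls made
 * outside of a teams region. */
static int omp_get_num_teams(void) { return 1; }
static int omp_get_team_num(void) { return 0; }
#endif

/* Computes the half-open range [*begin, *end) of n items assigned to the
 * initial team of the caller under a block partition. */
static void team_range(long n, long *begin, long *end)
{
    long teams = omp_get_num_teams();
    long me = omp_get_team_num();
    *begin = n * me / teams;
    *end = n * (me + 1) / teams;
}
```

Outside of a teams region the helper returns the whole range, because the routines return 1 and 0 respectively. In most programs a worksharing construct would perform this partitioning instead.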
16 Cross References
17 • omp_get_num_teams, see Section 18.4.1
18 • teams directive, see Section 10.2
19 18.4.3 omp_set_num_teams
20 Summary
The omp_set_num_teams routine affects the number of teams to be used for subsequent
teams regions that do not specify a num_teams clause, by setting the value of the nteams-var
ICV of the current device.
24 Format
C / C++
void omp_set_num_teams(int num_teams);
C / C++
Fortran
subroutine omp_set_num_teams(num_teams)
integer num_teams
Fortran
4 Binding
5 The binding task set for an omp_set_num_teams region is the generating task.
6 Effect
7 The effect of this routine is to set the value of the nteams-var ICV of the current device to the value
8 specified in the argument.
9 Restrictions
10 Restrictions to the omp_set_num_teams routine are as follows:
11 • The routine may not be called from within a parallel region that is not the implicit parallel region
12 that surrounds the whole OpenMP program.
13 Cross References
14 • nteams-var ICV, see Table 2.1
15 • num_teams clause, see Section 10.2.1
16 • teams directive, see Section 10.2
17 18.4.4 omp_get_max_teams
18 Summary
19 The omp_get_max_teams routine returns an upper bound on the number of teams that could be
20 created by a teams construct without a num_teams clause that is encountered after execution
21 returns from this routine.
22 Format
C / C++
int omp_get_max_teams(void);
C / C++
Fortran
integer function omp_get_max_teams()
Fortran
25 Binding
26 The binding task set for an omp_get_max_teams region is the generating task.
6 Cross References
7 • nteams-var ICV, see Table 2.1
8 • num_teams clause, see Section 10.2.1
9 • teams directive, see Section 10.2
10 18.4.5 omp_set_teams_thread_limit
11 Summary
12 The omp_set_teams_thread_limit routine defines the maximum number of OpenMP
13 threads that can participate in each contention group created by a teams construct.
14 Format
C / C++
void omp_set_teams_thread_limit(int thread_limit);
C / C++
Fortran
subroutine omp_set_teams_thread_limit(thread_limit)
integer thread_limit
Fortran
18 Constraints on Arguments
19 The value of the argument passed to this routine must evaluate to a positive integer, or else the
20 behavior of this routine is implementation defined.
21 Binding
22 The binding task set for an omp_set_teams_thread_limit region is the generating task.
23 Effect
24 The omp_set_teams_thread_limit routine sets the value of the teams-thread-limit-var
25 ICV to the value of the thread_limit argument. If the value of thread_limit exceeds the number of
26 OpenMP threads that an implementation supports for each contention group created by a teams
27 construct, the value of the teams-thread-limit-var ICV will be set to the number that is supported by
28 the implementation.
5 Cross References
6 • teams directive, see Section 10.2
7 • teams-thread-limit-var ICV, see Table 2.1
8 • thread_limit clause, see Section 13.3
9 18.4.6 omp_get_teams_thread_limit
10 Summary
11 The omp_get_teams_thread_limit routine returns the maximum number of OpenMP
12 threads available to participate in each contention group created by a teams construct.
13 Format
C / C++
int omp_get_teams_thread_limit(void);
C / C++
Fortran
integer function omp_get_teams_thread_limit()
Fortran
16 Binding
17 The binding task set for an omp_get_teams_thread_limit region is the generating task.
18 Effect
19 The omp_get_teams_thread_limit routine returns the value of the teams-thread-limit-var
20 ICV.
21 Cross References
22 • teams directive, see Section 10.2
23 • teams-thread-limit-var ICV, see Table 2.1
3 18.5.1 omp_get_max_task_priority
4 Summary
5 The omp_get_max_task_priority routine returns the maximum value that can be specified
6 in the priority clause.
7 Format
C / C++
int omp_get_max_task_priority(void);
C / C++
Fortran
integer function omp_get_max_task_priority()
Fortran
10 Binding
11 The binding thread set for an omp_get_max_task_priority region is all threads on the
12 device. The effect of executing this routine is not related to any specific region that corresponds to
13 any construct or API routine.
14 Effect
15 The omp_get_max_task_priority routine returns the value of the max-task-priority-var
16 ICV, which determines the maximum value that can be specified in the priority clause.
17 Cross References
18 • max-task-priority-var ICV, see Table 2.1
19 • priority clause, see Section 12.4
20 18.5.2 omp_in_explicit_task
21 Summary
22 The omp_in_explicit_task routine returns the value of the explicit-task-var ICV.
23 Format
C / C++
int omp_in_explicit_task(void);
C / C++
Fortran
logical function omp_in_explicit_task()
Fortran
9 18.5.3 omp_in_final
10 Summary
11 The omp_in_final routine returns true if the routine is executed in a final task region;
12 otherwise, it returns false.
13 Format
C / C++
int omp_in_final(void);
C / C++
Fortran
logical function omp_in_final()
Fortran
16 Binding
17 The binding task set for an omp_in_final region is the generating task.
18 Effect
19 omp_in_final returns true if the enclosing task region is final. Otherwise, it returns false.
22 18.6.1 omp_pause_resource
23 Summary
24 The omp_pause_resource routine allows the runtime to relinquish resources used by OpenMP
25 on the specified device.
26 Format
C / C++
int omp_pause_resource(omp_pause_resource_t kind, int device_num);
C / C++
20 Binding
21 The binding task set for an omp_pause_resource region is the whole program.
22 Effect
23 The omp_pause_resource routine allows the runtime to relinquish resources used by OpenMP
24 on the specified device.
25 If successful, the omp_pause_hard value results in a hard pause for which the OpenMP state is
26 not guaranteed to persist across the omp_pause_resource call. A hard pause may relinquish
27 any data allocated by OpenMP on a given device, including data allocated by memory routines for
28 that device as well as data present on the device as a result of a declare target directive or
29 target data construct. A hard pause may also relinquish any data associated with a
30 threadprivate directive. When relinquished and when applicable, base language appropriate
31 deallocation/finalization is performed. When relinquished and when applicable, mapped data on a
32 device will not be copied back from the device to the host.
Note – A hard pause may relinquish more resources, but may resume processing OpenMP regions
more slowly. A soft pause allows OpenMP regions to restart more quickly, but may relinquish fewer
resources. An OpenMP implementation will reclaim resources as needed for OpenMP regions
encountered after the omp_pause_resource region. Since a hard pause may unmap data on the
specified device, appropriate data mapping is required before using data on the specified device
after the omp_pause_resource region.
13 The routine returns zero in case of success, and non-zero otherwise.
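A caller should test the return value rather than assume that the pause took place. The following sketch requests a soft pause of the host device. The helper soft_pause_host is hypothetical. The constant omp_pause_soft is taken from omp.h, and the stand-ins in the #else branch (including the enumerator values) are assumptions that let the sketch build without an OpenMP compiler.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Hypothetical serial stand-ins so the sketch also builds without OpenMP. */
typedef enum omp_pause_resource_t {
    omp_pause_soft = 1,
    omp_pause_hard = 2
} omp_pause_resource_t;
static int omp_get_initial_device(void) { return 0; }
static int omp_pause_resource(omp_pause_resource_t kind, int device_num)
{
    (void)kind;
    (void)device_num;
    return 0;
}
#endif

/* Requests a soft pause of the host device. Returns 1 if the routine
 * reported success (a zero return value) and 0 otherwise. Intended to be
 * called outside of any explicit OpenMP region. */
static int soft_pause_host(void)
{
    int status = omp_pause_resource(omp_pause_soft, omp_get_initial_device());
    return status == 0;
}
```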
14 Tool Callbacks
15 If the tool is not allowed to interact with the specified device after encountering this call, then the
16 runtime must call the tool finalizer for that device.
17 Restrictions
18 Restrictions to the omp_pause_resource routine are as follows:
19 • The omp_pause_resource region may not be nested in any explicit OpenMP region.
20 • The routine may only be called when all explicit tasks have finalized execution.
21 Cross References
22 • Declare Target Directives, see Section 7.8
23 • target data directive, see Section 13.5
24 • threadprivate directive, see Section 5.2
25 18.6.2 omp_pause_resource_all
26 Summary
27 The omp_pause_resource_all routine allows the runtime to relinquish resources used by
28 OpenMP on all devices.
29 Format
C / C++
int omp_pause_resource_all(omp_pause_resource_t kind);
C / C++
Fortran
integer function omp_pause_resource_all(kind)
integer (kind=omp_pause_resource_kind) kind
Fortran
21 18.7.1 omp_get_num_procs
22 Summary
23 The omp_get_num_procs routine returns the number of processors available to the device.
24 Format
C / C++
int omp_get_num_procs(void);
C / C++
Fortran
integer function omp_get_num_procs()
Fortran
27 Binding
28 The binding thread set for an omp_get_num_procs region is all threads on a device. The effect
29 of executing this routine is not related to any specific region corresponding to any construct or API
30 routine.
6 18.7.2 omp_set_default_device
7 Summary
8 The omp_set_default_device routine controls the default target device by assigning the
9 value of the default-device-var ICV.
10 Format
C / C++
void omp_set_default_device(int device_num);
C / C++
Fortran
subroutine omp_set_default_device(device_num)
integer device_num
Fortran
14 Binding
15 The binding task set for an omp_set_default_device region is the generating task.
16 Effect
17 The effect of this routine is to set the value of the default-device-var ICV of the current task to the
18 value specified in the argument. When called from within a target region the effect of this
19 routine is unspecified.
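Because the routine changes the default-device-var ICV of the current task, code that selects a device temporarily may want to restore the earlier value afterwards. The sketch below shows one way to do so on the host. The helper swap_default_device is hypothetical, and the stand-ins in the #else branch are assumptions that model the ICV with a single static variable when no OpenMP compiler is used.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Hypothetical serial stand-ins: the ICV is modeled by one variable. */
static int stub_default_device = 0;
static int omp_get_default_device(void) { return stub_default_device; }
static void omp_set_default_device(int device_num)
{
    stub_default_device = device_num;
}
#endif

/* Makes device_num the default target device of the current task and
 * returns the previous default so that the caller can restore it. */
static int swap_default_device(int device_num)
{
    int previous = omp_get_default_device();
    omp_set_default_device(device_num);
    return previous;
}
```

A typical use is previous = swap_default_device(d), followed by the target regions of interest, followed by omp_set_default_device(previous).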
20 Cross References
21 • default-device-var ICV, see Table 2.1
22 • target directive, see Section 13.8
23 18.7.3 omp_get_default_device
24 Summary
25 The omp_get_default_device routine returns the default target device.
26 Format
C / C++
int omp_get_default_device(void);
C / C++
4 Effect
5 The omp_get_default_device routine returns the value of the default-device-var ICV of the
6 current task. When called from within a target region the effect of this routine is unspecified.
7 Cross References
8 • default-device-var ICV, see Table 2.1
9 • target directive, see Section 13.8
10 18.7.4 omp_get_num_devices
11 Summary
12 The omp_get_num_devices routine returns the number of non-host devices available for
13 offloading code or data.
14 Format
C / C++
int omp_get_num_devices(void);
C / C++
Fortran
integer function omp_get_num_devices()
Fortran
17 Binding
18 The binding task set for an omp_get_num_devices region is the generating task.
19 Effect
20 The omp_get_num_devices routine returns the number of available non-host devices onto
21 which code or data may be offloaded. When called from within a target region the effect of this
22 routine is unspecified.
23 Cross References
24 • target directive, see Section 13.8
5 Format
C / C++
int omp_get_device_num(void);
C / C++
Fortran
integer function omp_get_device_num()
Fortran
8 Binding
9 The binding task set for an omp_get_device_num region is the generating task.
10 Effect
11 The omp_get_device_num routine returns the device number of the device on which the
12 calling thread is executing. When called on the host device, it will return the same value as the
13 omp_get_initial_device routine.
14 18.7.6 omp_is_initial_device
15 Summary
16 The omp_is_initial_device routine returns true if the current task is executing on the host
17 device; otherwise, it returns false.
18 Format
C / C++
int omp_is_initial_device(void);
C / C++
Fortran
logical function omp_is_initial_device()
Fortran
21 Binding
22 The binding task set for an omp_is_initial_device region is the generating task.
23 Effect
24 The effect of this routine is to return true if the current task is executing on the host device;
25 otherwise, it returns false.
5 Format
C / C++
int omp_get_initial_device(void);
C / C++
Fortran
integer function omp_get_initial_device()
Fortran
8 Binding
9 The binding task set for an omp_get_initial_device region is the generating task.
10 Effect
11 The effect of this routine is to return the device number of the host device. The value of the device
12 number is the value returned by the omp_get_num_devices routine. When called from within
13 a target region the effect of this routine is unspecified.
14 Cross References
15 • target directive, see Section 13.8
21 18.8.1 omp_target_alloc
22 Summary
23 The omp_target_alloc routine allocates memory in a device data environment and returns a
24 device pointer to that memory.
25 Format
C / C++
void* omp_target_alloc(size_t size, int device_num);
C / C++
7 Binding
8 The binding task set for an omp_target_alloc region is the generating task, which is the target
9 task generated by the call to the omp_target_alloc routine.
10 Effect
11 The omp_target_alloc routine returns a device pointer that references the device address of a
12 storage location of size bytes. The storage location is dynamically allocated in the device data
13 environment of the device specified by device_num. The omp_target_alloc routine executes
14 as if part of a target task that is generated by the call to the routine and that is an included task. The
15 omp_target_alloc routine returns NULL if it cannot dynamically allocate the memory in the
16 device data environment. The device pointer returned by omp_target_alloc can be used in an
17 is_device_ptr clause (see Section 5.4.7).
Fortran
18 The omp_target_alloc routine requires an explicit interface and so might not be provided in
19 omp_lib.h.
Fortran
20 Execution Model Events
21 The target-data-allocation-begin event occurs before a thread initiates a data allocation on a target
22 device.
23 The target-data-allocation-end event occurs after a thread initiates a data allocation on a target
24 device.
25 Tool Callbacks
26 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
27 ompt_scope_begin as its endpoint argument for each occurrence of a
28 target-data-allocation-begin event in that thread. Similarly, a thread dispatches a registered
29 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
30 argument for each occurrence of a target-data-allocation-end event in that thread. These callbacks
31 have type signature ompt_callback_target_data_op_emi_t.
32 A thread dispatches a registered ompt_callback_target_data_op callback for each
33 occurrence of a target-data-allocation-end event in that thread. The callback occurs in the context
34 of the target task and has type signature ompt_callback_target_data_op_t.
15 18.8.2 omp_target_free
16 Summary
17 The omp_target_free routine frees the device memory allocated by the
18 omp_target_alloc routine.
19 Format
C / C++
void omp_target_free(void *device_ptr, int device_num);
C / C++
Fortran
subroutine omp_target_free(device_ptr, device_num) bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_int
type(c_ptr), value :: device_ptr
integer(c_int), value :: device_num
Fortran
25 Constraints on Arguments
26 A program that calls omp_target_free with a non-null pointer that does not have a value
27 returned from omp_target_alloc is non-conforming. The device_num argument must be a
28 conforming device number.
4 Effect
5 The omp_target_free routine frees the memory in the device data environment associated
6 with device_ptr. If device_ptr is NULL, the operation is ignored. The omp_target_free
7 routine executes as if part of a target task that is generated by the call to the routine and that is an
8 included task. Synchronization must be inserted to ensure that all accesses to device_ptr are
9 completed before the call to omp_target_free.
Fortran
10 The omp_target_free routine requires an explicit interface and so might not be provided in
11 omp_lib.h.
Fortran
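The allocation and free routines are typically paired around explicit copies. The following sketch allocates device memory, copies a host buffer into it and back with omp_target_memcpy, and then frees the device memory. The helper roundtrip_through_device is hypothetical. The call assumes the C prototype int omp_target_memcpy(void *dst, const void *src, size_t length, size_t dst_offset, size_t src_offset, int dst_device_num, int src_device_num) as declared in omp.h, and the host-only stand-ins in the #else branch are assumptions that let the sketch build without an OpenMP compiler.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Hypothetical host-only stand-ins so the sketch also builds without OpenMP. */
static int omp_get_initial_device(void) { return 0; }
static void *omp_target_alloc(size_t size, int device_num)
{
    (void)device_num;
    return malloc(size);
}
static void omp_target_free(void *device_ptr, int device_num)
{
    (void)device_num;
    free(device_ptr);
}
static int omp_target_memcpy(void *dst, const void *src, size_t length,
                             size_t dst_offset, size_t src_offset,
                             int dst_device_num, int src_device_num)
{
    (void)dst_device_num;
    (void)src_device_num;
    memcpy((char *)dst + dst_offset, (const char *)src + src_offset, length);
    return 0;
}
#endif

/* Copies bytes from the host buffer in to memory allocated on device_num
 * and back into the host buffer out. Returns 0 on success. */
static int roundtrip_through_device(const void *in, void *out, size_t bytes,
                                    int device_num)
{
    int host = omp_get_initial_device();
    void *device_buf = omp_target_alloc(bytes, device_num);
    if (device_buf == NULL)
        return -1;                       /* allocation failed */
    int status = omp_target_memcpy(device_buf, in, bytes, 0, 0,
                                   device_num, host);
    if (status == 0)
        status = omp_target_memcpy(out, device_buf, bytes, 0, 0,
                                   host, device_num);
    /* Both copies are synchronous, so no access is pending at this point. */
    omp_target_free(device_buf, device_num);
    return status;
}
```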
12 Execution Model Events
13 The target-data-free-begin event occurs before a thread initiates a data free on a target device.
14 The target-data-free-end event occurs after a thread initiates a data free on a target device.
15 Tool Callbacks
16 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
17 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-free-begin
18 event in that thread. Similarly, a thread dispatches a registered
19 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
20 argument for each occurrence of a target-data-free-end event in that thread. These callbacks have
21 type signature ompt_callback_target_data_op_emi_t.
22 A thread dispatches a registered ompt_callback_target_data_op callback for each
23 occurrence of a target-data-free-begin event in that thread. The callback occurs in the context of the
24 target task and has type signature ompt_callback_target_data_op_t.
25 Restrictions
26 Restrictions to the omp_target_free routine are as follows.
27 • When called from within a target region the effect is unspecified.
28 Cross References
29 • omp_target_alloc, see Section 18.8.1
30 • ompt_callback_target_data_op_emi_t and
31 ompt_callback_target_data_op_t, see Section 19.5.2.25
32 • target directive, see Section 13.8
5 Format
C / C++
int omp_target_is_present(const void *ptr, int device_num);
C / C++
Fortran
integer(c_int) function omp_target_is_present(ptr, device_num) &
  bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_int
type(c_ptr), value :: ptr
integer(c_int), value :: device_num
Fortran
12 Constraints on Arguments
13 The value of ptr must be a valid host pointer or NULL. The device_num argument must be a
14 conforming device number.
15 Binding
16 The binding task set for an omp_target_is_present region is the encountering task.
17 Effect
18 The omp_target_is_present routine returns true if device_num refers to the host device or
19 if ptr refers to storage that has corresponding storage in the device data environment of device
20 device_num. Otherwise, the routine returns false.
Fortran
21 The omp_target_is_present routine requires an explicit interface and so might not be
22 provided in omp_lib.h.
Fortran
23 Restrictions
24 Restrictions to the omp_target_is_present routine are as follows.
25 • When called from within a target region the effect is unspecified.
26 Cross References
27 • target directive, see Section 13.8
5 Format
C / C++
int omp_target_is_accessible(const void *ptr, size_t size,
                             int device_num);
C / C++
Fortran
integer(c_int) function omp_target_is_accessible( &
  ptr, size, device_num) bind(c)
use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
type(c_ptr), value :: ptr
integer(c_size_t), value :: size
integer(c_int), value :: device_num
Fortran
14 Constraints on Arguments
15 The value of ptr must be a valid host pointer or NULL. The device_num argument must be a
16 conforming device number.
17 Binding
18 The binding task set for an omp_target_is_accessible region is the encountering task.
19 Effect
20 This routine returns true if the storage of size bytes starting at the address given by ptr is accessible
21 from device device_num. Otherwise, it returns false.
Fortran
22 The omp_target_is_accessible routine requires an explicit interface and so might not be
23 provided in omp_lib.h.
Fortran
24 Restrictions
25 Restrictions to the omp_target_is_accessible routine are as follows.
26 • When called from within a target region the effect is unspecified.
27 Cross References
28 • target directive, see Section 13.8
6 Tool Callbacks
7 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
8 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
9 event in that thread. Similarly, a thread dispatches a registered
10 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
11 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
12 type signature ompt_callback_target_data_op_emi_t.
13 A thread dispatches a registered ompt_callback_target_data_op callback for each
14 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
15 target task and has type signature ompt_callback_target_data_op_t.
16 Restrictions
17 Restrictions to the omp_target_memcpy routine are as follows.
18 • When called from within a target region the effect is unspecified.
19 Cross References
20 • ompt_callback_target_data_op_emi_t and
21 ompt_callback_target_data_op_t, see Section 19.5.2.25
22 • target directive, see Section 13.8
23 18.8.6 omp_target_memcpy_rect
24 Summary
25 The omp_target_memcpy_rect routine copies a rectangular subvolume from a
26 multi-dimensional array to another multi-dimensional array. The omp_target_memcpy_rect
27 routine performs a copy between any combination of host and device pointers.
28 Format
C / C++
int omp_target_memcpy_rect(
  void *dst,
  const void *src,
  size_t element_size,
  int num_dims,
  const size_t *volume,
  const size_t *dst_offsets,
26 Effect
This routine copies a rectangular subvolume of src, in the device data environment of device
src_device_num, to dst, in the device data environment of device dst_device_num. The volume is
specified in terms of the size of an element, the number of dimensions, and constant arrays of
length num_dims. The maximum number of dimensions supported is at least three; support for higher
dimensionality is implementation defined. The volume array specifies the length, in number of
elements, to copy in each dimension from src to dst. The dst_offsets (src_offsets) parameter
specifies the offset, in number of elements, from the origin of dst (src) in each dimension. The
dst_dimensions (src_dimensions) parameter specifies the length of each dimension of dst (src).
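The roles of volume, the offset arrays and the dimension arrays can be sketched with a host-only model. The function rect_copy_model below is hypothetical and is not an OpenMP routine; it assumes row-major (C) layout, ignores device numbers, and only illustrates the index arithmetic described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-only model of a rectangular subvolume copy.
   Assumes row-major (C) layout, so the last dimension is contiguous. */
static void rect_copy_model(char *dst, const char *src, size_t element_size,
                            int num_dims, const size_t *volume,
                            const size_t *dst_offsets, const size_t *src_offsets,
                            const size_t *dst_dimensions,
                            const size_t *src_dimensions)
{
    assert(num_dims >= 1);
    if (num_dims == 1) {
        /* Innermost dimension: one contiguous run of volume[0] elements. */
        memcpy(dst + dst_offsets[0] * element_size,
               src + src_offsets[0] * element_size,
               volume[0] * element_size);
        return;
    }
    /* Size in bytes of one slice along the leading dimension. */
    size_t dst_slice = element_size, src_slice = element_size;
    for (int d = 1; d < num_dims; ++d) {
        dst_slice *= dst_dimensions[d];
        src_slice *= src_dimensions[d];
    }
    /* Copy volume[0] slices, each handled as a (num_dims - 1)-D problem. */
    for (size_t i = 0; i < volume[0]; ++i) {
        rect_copy_model(dst + (dst_offsets[0] + i) * dst_slice,
                        src + (src_offsets[0] + i) * src_slice,
                        element_size, num_dims - 1, volume + 1,
                        dst_offsets + 1, src_offsets + 1,
                        dst_dimensions + 1, src_dimensions + 1);
    }
}
```

For example, copying a 2 x 2 block that starts at row 1, column 1 of a 3 x 4 source into a 2 x 2 destination uses volume {2, 2}, src_offsets {1, 1}, dst_offsets {0, 0}, src_dimensions {3, 4} and dst_dimensions {2, 2}.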
31 18.8.7 omp_target_memcpy_async
32 Summary
33 The omp_target_memcpy_async routine asynchronously performs a copy between any
34 combination of host and device pointers.
24 Binding
25 The binding task set for an omp_target_memcpy_async region is the generating task, which
26 is the target task generated by the call to the omp_target_memcpy_async routine.
27 Effect
28 This routine performs an asynchronous memory copy where length bytes of memory at offset
29 src_offset from src in the device data environment of device src_device_num are copied to dst
30 starting at offset dst_offset in the device data environment of device dst_device_num. The
31 omp_target_memcpy_async routine executes as if part of a target task that is generated by the
32 call to the routine and for which execution may be deferred. Task dependences are expressed with
33 zero or more OpenMP depend objects. The dependences are specified by passing the number of
34 depend objects followed by an array of the objects. The generated target task is not a dependent task
35 if the program passes in a count of zero for depobj_count. depobj_list is ignored if the value of
36 depobj_count is zero.
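The byte-offset part of this description can be illustrated with a small host-only sketch. The helper model_copy_with_offsets is hypothetical; it ignores devices, task deferral and depend objects, and shows only how length, src_offset and dst_offset select the bytes that move.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical host-only model of the offset arithmetic: copies `length`
   bytes from src + src_offset to dst + dst_offset. Not an OpenMP routine. */
static void model_copy_with_offsets(void *dst, const void *src, size_t length,
                                    size_t dst_offset, size_t src_offset)
{
    memcpy((char *)dst + dst_offset, (const char *)src + src_offset, length);
}
```

In the real routine the copy may be deferred, so the destination should be treated as ready only once the generated target task has completed.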
10 Tool Callbacks
11 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
12 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
13 event in that thread. Similarly, a thread dispatches a registered
14 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
15 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
16 type signature ompt_callback_target_data_op_emi_t.
17 A thread dispatches a registered ompt_callback_target_data_op callback for each
18 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
19 target task and has type signature ompt_callback_target_data_op_t.
20 Restrictions
21 Restrictions to the omp_target_memcpy_async routine are as follows.
22 • When called from within a target region the effect is unspecified.
23 Cross References
24 • Depend Objects, see Section 15.9.2
25 • ompt_callback_target_data_op_emi_t and
26 ompt_callback_target_data_op_t, see Section 19.5.2.25
27 • target directive, see Section 13.8
28 18.8.8 omp_target_memcpy_rect_async
29 Summary
30 The omp_target_memcpy_rect_async routine asynchronously performs a copy between
31 any combination of host and device pointers.
4 Restrictions
5 Restrictions to the omp_target_memcpy_rect_async routine are as follows.
6 • When called from within a target region the effect is unspecified.
7 Cross References
8 • Depend Objects, see Section 15.9.2
9 • ompt_callback_target_data_op_emi_t and
10 ompt_callback_target_data_op_t, see Section 19.5.2.25
11 • target directive, see Section 13.8
12 18.8.9 omp_target_associate_ptr
13 Summary
14 The omp_target_associate_ptr routine maps a device pointer, which may be returned
15 from omp_target_alloc or implementation-defined runtime routines, to a host pointer.
16 Format
C / C++
int omp_target_associate_ptr(
    const void *host_ptr,
    const void *device_ptr,
    size_t size,
    size_t device_offset,
    int device_num
);
C / C++
Fortran
integer(c_int) function omp_target_associate_ptr(host_ptr, &
    device_ptr, size, device_offset, device_num) bind(c)
  use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
  type(c_ptr), value :: host_ptr, device_ptr
  integer(c_size_t), value :: size, device_offset
  integer(c_int), value :: device_num
Fortran
30 Constraints on Arguments
The value of device_ptr must be a valid pointer to device memory for the device denoted by
the value of device_num. The device_num argument must be a conforming device number.
4 Effect
The omp_target_associate_ptr routine associates a device pointer in the device data
environment of device device_num with a host pointer such that when the host pointer appears in a
subsequent map clause, the associated device pointer is used as the target for data motion
associated with that host pointer. The device_offset parameter specifies the offset into device_ptr
that is used as the base address for the device side of the mapping. The reference count of the
resulting mapping will be infinite. After being successfully associated, the buffer to which the
device pointer points is invalidated and accessing data directly through the device pointer results in
unspecified behavior. The pointer can be retrieved for other uses by using the
omp_target_disassociate_ptr routine to disassociate it.
14 The omp_target_associate_ptr routine executes as if part of a target task that is generated
15 by the call to the routine and that is an included task. The routine returns zero if successful.
16 Otherwise it returns a non-zero value.
17 Only one device buffer can be associated with a given host pointer value and device number pair.
18 Attempting to associate a second buffer will return non-zero. Associating the same pair of pointers
19 on the same device with the same offset has no effect and returns zero. Associating pointers that
20 share underlying storage will result in unspecified behavior. The omp_target_is_present
21 function can be used to test whether a given host pointer has a corresponding variable in the device
22 data environment.
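The return-value rules in the preceding paragraph can be sketched as a small table model. Everything below is hypothetical (model_table, model_associate, model_disassociate); it tracks associations in a fixed-size host array and does not touch any device. Returning non-zero from model_disassociate for a missing association is a choice made for this sketch only.

```c
#include <stddef.h>

/* Hypothetical model of the association rules; none of these names are
   OpenMP routines, and no device memory is involved. */
enum { MODEL_MAX_ASSOC = 8 };

struct model_assoc {
    const void *host_ptr;
    const void *device_ptr;
    size_t device_offset;
    int device_num;
    int in_use;
};

static struct model_assoc model_table[MODEL_MAX_ASSOC];

/* Returns zero on success and non-zero otherwise. */
static int model_associate(const void *host_ptr, const void *device_ptr,
                           size_t device_offset, int device_num)
{
    int free_slot = -1;
    for (int i = 0; i < MODEL_MAX_ASSOC; ++i) {
        struct model_assoc *a = &model_table[i];
        if (!a->in_use) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        if (a->host_ptr == host_ptr && a->device_num == device_num) {
            /* Same pointers and offset: no effect, success. Anything else
               is a second buffer for this (host pointer, device) pair. */
            return (a->device_ptr == device_ptr &&
                    a->device_offset == device_offset) ? 0 : 1;
        }
    }
    if (free_slot < 0)
        return 1; /* the model's table is full */
    model_table[free_slot] = (struct model_assoc){host_ptr, device_ptr,
                                                  device_offset, device_num, 1};
    return 0;
}

/* Removes the association for (host_ptr, device_num) in this model. */
static int model_disassociate(const void *host_ptr, int device_num)
{
    for (int i = 0; i < MODEL_MAX_ASSOC; ++i) {
        struct model_assoc *a = &model_table[i];
        if (a->in_use && a->host_ptr == host_ptr &&
            a->device_num == device_num) {
            a->in_use = 0;
            return 0;
        }
    }
    return 1;
}
```

In this model, a second buffer can be associated with the same host pointer and device only after the first association has been removed.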
Fortran
23 The omp_target_associate_ptr routine requires an explicit interface and so might not be
24 provided in omp_lib.h.
Fortran
25 Execution Model Events
26 The target-data-associate event occurs before a thread initiates a device pointer association on a
27 target device.
28 Tool Callbacks
29 A thread dispatches a registered ompt_callback_target_data_op callback, or a registered
30 ompt_callback_target_data_op_emi callback with ompt_scope_beginend as its
31 endpoint argument for each occurrence of a target-data-associate event in that thread. These
32 callbacks have type signature ompt_callback_target_data_op_t or
33 ompt_callback_target_data_op_emi_t, respectively.
34 Restrictions
35 Restrictions to the omp_target_associate_ptr routine are as follows.
36 • When called from within a target region the effect is unspecified.
8 18.8.10 omp_target_disassociate_ptr
9 Summary
The omp_target_disassociate_ptr routine removes the associated pointer for a given device
from a host pointer.
12 Format
C / C++
int omp_target_disassociate_ptr(const void *ptr, int device_num);
C / C++
Fortran
integer(c_int) function omp_target_disassociate_ptr(ptr, &
    device_num) bind(c)
  use, intrinsic :: iso_c_binding, only : c_ptr, c_int
  type(c_ptr), value :: ptr
  integer(c_int), value :: device_num
Fortran
19 Constraints on Arguments
20 The device_num argument must be a conforming device number.
21 Binding
22 The binding task set for an omp_target_disassociate_ptr region is the generating task,
23 which is the target task generated by the call to the omp_target_disassociate_ptr routine.
24 Effect
The omp_target_disassociate_ptr routine removes the associated device data on device
device_num from the presence table for host pointer ptr. A call to this routine on a pointer that is
not NULL and does not have associated data on the given device results in unspecified behavior.
The reference count of the mapping is reduced to zero, regardless of its current value. The
omp_target_disassociate_ptr routine executes as if part of a target task that is generated
by the call to the routine and that is an included task. The routine returns zero if successful.
Otherwise it returns a non-zero value. After a call to omp_target_disassociate_ptr, the
contents of the device buffer are invalidated.
6 Tool Callbacks
7 A thread dispatches a registered ompt_callback_target_data_op callback, or a registered
8 ompt_callback_target_data_op_emi callback with ompt_scope_beginend as its
9 endpoint argument for each occurrence of a target-data-disassociate event in that thread. These
10 callbacks have type signature ompt_callback_target_data_op_t or
11 ompt_callback_target_data_op_emi_t, respectively.
12 Restrictions
13 Restrictions to the omp_target_disassociate_ptr routine are as follows.
14 • When called from within a target region the effect is unspecified.
15 Cross References
16 • ompt_callback_target_data_op_emi_t and
17 ompt_callback_target_data_op_t, see Section 19.5.2.25
18 • target directive, see Section 13.8
19 18.8.11 omp_get_mapped_ptr
20 Summary
21 The omp_get_mapped_ptr routine returns the device pointer that is associated with a host
22 pointer for a given device.
23 Format
C / C++
void *omp_get_mapped_ptr(const void *ptr, int device_num);
C / C++
Fortran
type(c_ptr) function omp_get_mapped_ptr(ptr, &
    device_num) bind(c)
  use, intrinsic :: iso_c_binding, only : c_ptr, c_int
  type(c_ptr), value :: ptr
  integer(c_int), value :: device_num
Fortran
18 Cross References
19 • omp_get_initial_device, see Section 18.7.7
13 Binding
14 The binding thread set for all lock routine regions is all threads in the contention group. As a
15 consequence, for each OpenMP lock, the lock routine effects relate to all tasks that call the routines,
16 without regard to which teams in the contention group the threads that are executing the tasks
17 belong.
15 Restrictions
16 Restrictions to OpenMP lock routines are as follows:
17 • The use of the same OpenMP lock in different contention groups results in unspecified behavior.
21 Format
C / C++
void omp_init_lock(omp_lock_t *lock);
void omp_init_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
subroutine omp_init_lock(svar)
  integer (kind=omp_lock_kind) svar

subroutine omp_init_nest_lock(nvar)
  integer (kind=omp_nest_lock_kind) nvar
Fortran
4 Effect
5 The effect of these routines is to initialize the lock to the unlocked state; that is, no task owns the
6 lock. In addition, the nesting count for a nestable lock is set to zero.
11 Tool Callbacks
12 A thread dispatches a registered ompt_callback_lock_init callback with
13 omp_sync_hint_none as the hint argument and ompt_mutex_lock as the kind argument
14 for each occurrence of a lock-init event in that thread. Similarly, a thread dispatches a registered
15 ompt_callback_lock_init callback with omp_sync_hint_none as the hint argument
16 and ompt_mutex_nest_lock as the kind argument for each occurrence of a nest-lock-init
17 event in that thread. These callbacks have the type signature
18 ompt_callback_mutex_acquire_t and occur in the task that encounters the routine.
19 Cross References
20 • ompt_callback_mutex_acquire_t, see Section 19.5.2.14
27 Format
C / C++
void omp_init_lock_with_hint(
    omp_lock_t *lock,
    omp_sync_hint_t hint
);
void omp_init_nest_lock_with_hint(
    omp_nest_lock_t *lock,
    omp_sync_hint_t hint
);
C / C++
12 Effect
13 The effect of these routines is to initialize the lock to the unlocked state and, optionally, to choose a
14 specific lock implementation based on the hint. After initialization no task owns the lock. In
15 addition, the nesting count for a nestable lock is set to zero.
21 Tool Callbacks
22 A thread dispatches a registered ompt_callback_lock_init callback with the same value
23 for its hint argument as the hint argument of the call to omp_init_lock_with_hint and
24 ompt_mutex_lock as the kind argument for each occurrence of a lock-init-with-hint event in
25 that thread. Similarly, a thread dispatches a registered ompt_callback_lock_init callback
26 with the same value for its hint argument as the hint argument of the call to
27 omp_init_nest_lock_with_hint and ompt_mutex_nest_lock as the kind argument
28 for each occurrence of a nest-lock-init-with-hint event in that thread. These callbacks have the type
29 signature ompt_callback_mutex_acquire_t and occur in the task that encounters the
30 routine.
31 Cross References
32 • Synchronization Hints, see Section 15.1
33 • ompt_callback_mutex_acquire_t, see Section 19.5.2.14
4 Format
C / C++
void omp_destroy_lock(omp_lock_t *lock);
void omp_destroy_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
subroutine omp_destroy_lock(svar)
  integer (kind=omp_lock_kind) svar

subroutine omp_destroy_nest_lock(nvar)
  integer (kind=omp_nest_lock_kind) nvar
Fortran
12 Constraints on Arguments
13 A program that accesses a lock that is not in the unlocked state through either routine is
14 non-conforming.
15 Effect
16 The effect of these routines is to change the state of the lock to uninitialized.
21 Tool Callbacks
22 A thread dispatches a registered ompt_callback_lock_destroy callback with
23 ompt_mutex_lock as the kind argument for each occurrence of a lock-destroy event in that
24 thread. Similarly, a thread dispatches a registered ompt_callback_lock_destroy callback
25 with ompt_mutex_nest_lock as the kind argument for each occurrence of a nest-lock-destroy
26 event in that thread. These callbacks have the type signature ompt_callback_mutex_t and
27 occur in the task that encounters the routine.
28 Cross References
29 • ompt_callback_mutex_t, see Section 19.5.2.15
5 Format
C / C++
void omp_set_lock(omp_lock_t *lock);
void omp_set_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
subroutine omp_set_lock(svar)
  integer (kind=omp_lock_kind) svar

subroutine omp_set_nest_lock(nvar)
  integer (kind=omp_nest_lock_kind) nvar
Fortran
13 Constraints on Arguments
14 A program that accesses a lock that is in the uninitialized state through either routine is
15 non-conforming. A simple lock accessed by omp_set_lock that is in the locked state must not
16 be owned by the task that contains the call or deadlock will result.
17 Effect
18 Each of these routines has an effect equivalent to suspension of the task that is executing the routine
19 until the specified lock is available.
Note – The semantics of these routines are specified as if they serialize execution of the region
guarded by the lock. However, implementations may implement them in other ways provided that
the isolation properties are respected so that the actual execution delivers a result that could arise
from some serialization.
26 A simple lock is available if it is unlocked. Ownership of the lock is granted to the task that
27 executes the routine. A nestable lock is available if it is unlocked or if it is already owned by the
28 task that executes the routine. The task that executes the routine is granted, or retains, ownership of
29 the lock, and the nesting count for the lock is incremented.
9 Effect
10 For a simple lock, the omp_unset_lock routine causes the lock to become unlocked. For a
11 nestable lock, the omp_unset_nest_lock routine decrements the nesting count, and causes the
12 lock to become unlocked if the resulting nesting count is zero. For either routine, if the lock
13 becomes unlocked, and if one or more task regions were effectively suspended because the lock was
14 unavailable, the effect is that one task is chosen and given ownership of the lock.
23 Tool Callbacks
24 A thread dispatches a registered ompt_callback_mutex_released callback with
25 ompt_mutex_lock as the kind argument for each occurrence of a lock-release event in that
26 thread. Similarly, a thread dispatches a registered ompt_callback_mutex_released
27 callback with ompt_mutex_nest_lock as the kind argument for each occurrence of a
28 nest-lock-release event in that thread. These callbacks have the type signature
29 ompt_callback_mutex_t and occur in the task that encounters the routine.
30 A thread dispatches a registered ompt_callback_nest_lock callback with
31 ompt_scope_end as its endpoint argument for each occurrence of a nest-lock-held event in that
32 thread. This callback has the type signature ompt_callback_nest_lock_t.
33 Cross References
34 • ompt_callback_mutex_t, see Section 19.5.2.15
35 • ompt_callback_nest_lock_t, see Section 19.5.2.16
5 Format
C / C++
int omp_test_lock(omp_lock_t *lock);
int omp_test_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
logical function omp_test_lock(svar)
  integer (kind=omp_lock_kind) svar

integer function omp_test_nest_lock(nvar)
  integer (kind=omp_nest_lock_kind) nvar
Fortran
13 Constraints on Arguments
14 A program that accesses a lock that is in the uninitialized state through either routine is
15 non-conforming. The behavior is unspecified if a simple lock accessed by omp_test_lock is in
16 the locked state and is owned by the task that contains the call.
17 Effect
18 These routines attempt to set a lock in the same manner as omp_set_lock and
19 omp_set_nest_lock, except that they do not suspend execution of the task that executes the
20 routine. For a simple lock, the omp_test_lock routine returns true if the lock is successfully
21 set; otherwise, it returns false. For a nestable lock, the omp_test_nest_lock routine returns
22 the new nesting count if the lock is successfully set; otherwise, it returns zero.
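The nesting-count behavior of a nestable lock can be sketched with a single-threaded model. The type model_nest_lock and its functions are hypothetical and are not the OpenMP lock type or routines: task identity is a plain integer, nothing blocks, and the ownership check in model_unset_nest_lock is an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical single-threaded model of a nestable lock. */
struct model_nest_lock {
    int owner; /* owning task id, or -1 when unlocked */
    int count; /* nesting count */
};

static void model_init_nest_lock(struct model_nest_lock *l)
{
    l->owner = -1; /* unlocked: no task owns the lock */
    l->count = 0;  /* nesting count starts at zero */
}

/* Mirrors the test routine for nestable locks: returns the new nesting
   count if the lock is available to `task`, and zero otherwise. */
static int model_test_nest_lock(struct model_nest_lock *l, int task)
{
    if (l->owner != -1 && l->owner != task)
        return 0;    /* owned by another task: not available */
    l->owner = task; /* ownership granted, or retained */
    return ++l->count;
}

/* Mirrors the unset routine: decrements the count, unlocks at zero. */
static void model_unset_nest_lock(struct model_nest_lock *l, int task)
{
    assert(l->owner == task && l->count > 0);
    if (--l->count == 0)
        l->owner = -1;
}
```

A task that acquires the lock twice must release it twice before another task's test can succeed.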
15 Cross References
16 • ompt_callback_mutex_acquire_t, see Section 19.5.2.14
17 • ompt_callback_mutex_t, see Section 19.5.2.15
18 • ompt_callback_nest_lock_t, see Section 19.5.2.16
21 18.10.1 omp_get_wtime
22 Summary
23 The omp_get_wtime routine returns elapsed wall clock time in seconds.
24 Format
C / C++
double omp_get_wtime(void);
C / C++
Fortran
double precision function omp_get_wtime()
Fortran
27 Binding
28 The binding thread set for an omp_get_wtime region is the encountering thread. The routine’s
29 return value is not guaranteed to be consistent across any set of threads.
6 18.10.2 omp_get_wtick
7 Summary
8 The omp_get_wtick routine returns the precision of the timer used by omp_get_wtime.
9 Format
C / C++
double omp_get_wtick(void);
C / C++
Fortran
double precision function omp_get_wtick()
Fortran
12 Binding
13 The binding thread set for an omp_get_wtick region is the encountering thread. The routine’s
14 return value is not guaranteed to be consistent across any set of threads.
15 Effect
16 The omp_get_wtick routine returns a value equal to the number of seconds between successive
17 clock ticks of the timer used by omp_get_wtime.
20 Binding
21 The binding thread set for all event routine regions is the encountering thread.
22 18.11.1 omp_fulfill_event
23 Summary
24 This routine fulfills and destroys an OpenMP event.
9 Effect
10 The effect of this routine is to fulfill the event associated with the event handle argument. The effect
11 of fulfilling the event will depend on how the event was created. The event is destroyed and cannot
12 be accessed after calling this routine, and the event handle becomes unassociated with any event.
16 Tool Callbacks
17 A thread dispatches a registered ompt_callback_task_schedule callback with NULL as its
18 next_task_data argument while the argument prior_task_data binds to the detachable task for each
19 occurrence of a task-fulfill event. If the task-fulfill event occurs before the detachable task finished
20 the execution of the associated structured-block, the callback has
21 ompt_task_early_fulfill as its prior_task_status argument; otherwise the callback has
22 ompt_task_late_fulfill as its prior_task_status argument. This callback has type
23 signature ompt_callback_task_schedule_t.
24 Restrictions
25 Restrictions to the omp_fulfill_event routine are as follows:
• The event handle passed to the routine must have been created by a thread in the same device as
the thread that invoked the routine.
28 Cross References
29 • ompt_callback_task_schedule_t, see Section 19.5.2.10
30 • detach clause, see Section 12.5.2
C / C++
1 Table 18.2 lists the return codes used by routines that take an int* ret_code argument.
2 Binding
3 The binding task set for all interoperability routine regions is the generating task.
C / C++
C / C++
4 18.12.1 omp_get_num_interop_properties
5 Summary
6 The omp_get_num_interop_properties routine retrieves the number of
7 implementation-defined properties available for an omp_interop_t object.
8 Format
int omp_get_num_interop_properties(const omp_interop_t interop);
10 Effect
11 The omp_get_num_interop_properties routine returns the number of
12 implementation-defined properties available for interop. The total number of properties available
13 for interop is the returned value minus omp_ipr_first.
C / C++
C / C++
14 18.12.2 omp_get_interop_int
15 Summary
16 The omp_get_interop_int routine retrieves an integer property from an omp_interop_t
17 object.
5 Effect
6 The omp_get_interop_int routine returns the requested integer property, if available, and
7 zero if an error occurs or no value is available. If the interop is omp_interop_none, an empty
8 error occurs. If the property_id is less than omp_ipr_first or greater than or equal to
9 omp_get_num_interop_properties(interop), an out of range error occurs. If the
10 requested property value is not convertible into an integer value, a type error occurs.
11 If a non-null pointer is passed to ret_code, an omp_interop_rc_t value that indicates the
12 return code is stored in the object to which ret_code points. If an error occurred, the stored value
13 will be negative and it will match the error as defined in Table 18.2. On success, zero will be stored.
14 If no error occurred but no meaningful value can be returned, omp_irc_no_value, which is
15 one, will be stored.
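The range check and the return-code convention described above can be sketched as follows. All names are hypothetical stand-ins: first plays the role of omp_ipr_first and num_impl_props the role of the value returned by omp_get_num_interop_properties. The specific negative values are placeholders; the text above fixes only the convention that errors are negative, success is zero and "no value" is one. Checking for an empty interop before checking the range is also an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the return codes. */
enum model_rc {
    MODEL_RC_NO_VALUE = 1,
    MODEL_RC_SUCCESS = 0,
    MODEL_RC_EMPTY = -1,       /* placeholder negative value */
    MODEL_RC_OUT_OF_RANGE = -2 /* placeholder negative value */
};

/* Range rule: first <= property_id < num_impl_props. */
static int model_property_in_range(int property_id, int first,
                                   int num_impl_props)
{
    return property_id >= first && property_id < num_impl_props;
}

/* Total number of properties: the returned count minus `first`. */
static int model_total_properties(int first, int num_impl_props)
{
    return num_impl_props - first;
}

/* Classifies a lookup; stores the code only if ret_code is non-null. */
static int model_check(int interop_is_none, int property_id, int first,
                       int num_impl_props, int *ret_code)
{
    int rc = MODEL_RC_SUCCESS;
    if (interop_is_none)
        rc = MODEL_RC_EMPTY;
    else if (!model_property_in_range(property_id, first, num_impl_props))
        rc = MODEL_RC_OUT_OF_RANGE;
    if (ret_code != NULL)
        *ret_code = rc;
    return rc;
}
```

With an illustrative first of -9 and two implementation-defined properties, the valid identifiers run from -9 through 1, for a total of 11 properties.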
16 Restrictions
17 Restrictions to the omp_get_interop_int routine are as follows:
18 • The behavior of the routine is unspecified if an invalid omp_interop_t object is provided.
19 Cross References
20 • omp_get_num_interop_properties, see Section 18.12.1
C / C++
C / C++
21 18.12.3 omp_get_interop_ptr
22 Summary
23 The omp_get_interop_ptr routine retrieves a pointer property from an omp_interop_t
24 object.
25 Format
void *omp_get_interop_ptr(const omp_interop_t interop,
                          omp_interop_property_t property_id,
                          int *ret_code);
29 Effect
30 The omp_get_interop_ptr routine returns the requested pointer property, if available, and
31 NULL if an error occurs or no value is available. If the interop is omp_interop_none, an empty
32 error occurs. If the property_id is less than omp_ipr_first or greater than or equal to
33 omp_get_num_interop_properties(interop), an out of range error occurs. If the
34 requested property value is not convertible into a pointer value, a type error occurs.
6 Restrictions
7 Restrictions to the omp_get_interop_ptr routine are as follows:
8 • The behavior of the routine is unspecified if an invalid omp_interop_t object is provided.
9 • Memory referenced by the pointer returned by the omp_get_interop_ptr routine is
10 managed by the OpenMP implementation and should not be freed or modified.
11 Cross References
12 • omp_get_num_interop_properties, see Section 18.12.1
C / C++
C / C++
13 18.12.4 omp_get_interop_str
14 Summary
15 The omp_get_interop_str routine retrieves a string property from an omp_interop_t
16 object.
17 Format
const char *omp_get_interop_str(const omp_interop_t interop,
                                omp_interop_property_t property_id,
                                int *ret_code);
21 Effect
22 The omp_get_interop_str routine returns the requested string property as a C string, if
23 available, and NULL if an error occurs or no value is available. If the interop is
24 omp_interop_none, an empty error occurs. If the property_id is less than omp_ipr_first
25 or greater than or equal to omp_get_num_interop_properties(interop), an out of range
26 error occurs. If the requested property value is not convertible into a string value, a type error
27 occurs.
28 If a non-null pointer is passed to ret_code, an omp_interop_rc_t value that indicates the
29 return code is stored in the object to which the ret_code points. If an error occurred, the stored
30 value will be negative and it will match the error as defined in Table 18.2. On success, zero will be
31 stored. If no error occurred but no meaningful value can be returned, omp_irc_no_value,
32 which is one, will be stored.
6 Cross References
7 • omp_get_num_interop_properties, see Section 18.12.1
C / C++
C / C++
8 18.12.5 omp_get_interop_name
9 Summary
10 The omp_get_interop_name routine retrieves a property name from an omp_interop_t
11 object.
12 Format
const char *omp_get_interop_name(const omp_interop_t interop,
                                 omp_interop_property_t property_id);
16 Effect
17 The omp_get_interop_name routine returns the name of the property identified by
18 property_id as a C string. Property names for non-implementation defined properties are listed in
19 Table 18.1. If the property_id is less than omp_ipr_first or greater than or equal to
20 omp_get_num_interop_properties(interop), NULL is returned.
21 Restrictions
22 Restrictions to the omp_get_interop_name routine are as follows:
• The behavior of the routine is unspecified if an invalid omp_interop_t object is provided.
24 • Memory referenced by the pointer returned by the omp_get_interop_name routine is
25 managed by the OpenMP implementation and should not be freed or modified.
26 Cross References
27 • omp_get_num_interop_properties, see Section 18.12.1
C / C++
1 18.12.6 omp_get_interop_type_desc
2 Summary
3 The omp_get_interop_type_desc routine retrieves a description of the type of a property
4 associated with an omp_interop_t object.
5 Format
const char *omp_get_interop_type_desc(const omp_interop_t interop,
                                      omp_interop_property_t property_id);
9 Effect
The omp_get_interop_type_desc routine returns a C string that describes the type of the
property identified by property_id in human-readable form. The string may contain a valid C type
declaration, possibly followed by a description or name of the type. If interop has the value
omp_interop_none, NULL is returned. If the property_id is less than omp_ipr_first or
greater than or equal to omp_get_num_interop_properties(interop), NULL is returned.
15 Restrictions
16 Restrictions to the omp_get_interop_type_desc routine are as follows:
• The behavior of the routine is unspecified if an invalid omp_interop_t object is provided.
18 • Memory referenced by the pointer returned from the omp_get_interop_type_desc
19 routine is managed by the OpenMP implementation and should not be freed or modified.
20 Cross References
21 • omp_get_num_interop_properties, see Section 18.12.1
C / C++
C / C++
22 18.12.7 omp_get_interop_rc_desc
23 Summary
24 The omp_get_interop_rc_desc routine retrieves a description of the return code associated
25 with an omp_interop_t object.
26 Format
const char *omp_get_interop_rc_desc(const omp_interop_t interop,
                                    omp_interop_rc_t ret_code);
29 Effect
30 The omp_get_interop_rc_desc routine returns a C string that describes the return code
31 ret_code in human-readable form.
7 Cross References
8 • Memory Allocators, see Section 6.2
9 • Memory Spaces, see Section 6.1
10 • requires directive, see Section 8.2
11 • target directive, see Section 13.8
12 18.13.3 omp_destroy_allocator
13 Summary
14 The omp_destroy_allocator routine releases all resources used by the allocator handle.
15 Format
C / C++
void omp_destroy_allocator(omp_allocator_handle_t allocator);
C / C++
Fortran
subroutine omp_destroy_allocator(allocator)
  integer(kind=omp_allocator_handle_kind), intent(in) :: allocator
Fortran
19 Constraints on Arguments
20 The allocator argument must not represent a predefined memory allocator.
21 Binding
22 The binding thread set for an omp_destroy_allocator region is all threads on a device. The
23 effect of executing this routine is not related to any specific region that corresponds to any construct
24 or API routine.
25 Effect
26 The omp_destroy_allocator routine releases all resources used to implement the allocator
27 handle. If allocator is omp_null_allocator then this routine will have no effect.
6 Cross References
7 • Memory Allocators, see Section 6.2
8 • requires directive, see Section 8.2
9 • target directive, see Section 13.8
10 18.13.4 omp_set_default_allocator
11 Summary
12 The omp_set_default_allocator routine sets the default memory allocator to be used by
13 allocation calls, allocate clauses and allocate and allocators directives that do not
14 specify an allocator.
15 Format
C / C++
void omp_set_default_allocator(omp_allocator_handle_t allocator);
C / C++
Fortran
subroutine omp_set_default_allocator(allocator)
  integer(kind=omp_allocator_handle_kind), intent(in) :: allocator
Fortran
19 Constraints on Arguments
20 The allocator argument must be a valid memory allocator handle.
21 Binding
22 The binding task set for an omp_set_default_allocator region is the binding implicit task.
23 Effect
24 The effect of this routine is to set the value of the def-allocator-var ICV of the binding implicit task
25 to the value specified in the allocator argument.
26 Cross References
27 • Memory Allocators, see Section 6.2
28 • allocate clause, see Section 6.6
29 • allocate directive, see Section 6.5
30 • allocators directive, see Section 6.7
31 • def-allocator-var ICV, see Table 2.1
27 Binding
28 The binding task set for an omp_alloc or omp_aligned_alloc region is the generating task.
29 Effect
30 The omp_alloc and omp_aligned_alloc routines request a memory allocation of size bytes
31 from the specified memory allocator. If the allocator argument is omp_null_allocator the
32 memory allocator used by the routines will be the one specified by the def-allocator-var ICV of the
33 binding implicit task. Upon success they return a pointer to the allocated memory. Otherwise, the
34 behavior that the fallback trait of the allocator specifies will be followed. If size is 0,
35 omp_alloc and omp_aligned_alloc will return NULL.
12 18.13.7 omp_free
13 Summary
14 The omp_free routine deallocates previously allocated memory.
15 Format
C
void omp_free(void *ptr, omp_allocator_handle_t allocator);
C
C++
void omp_free(
    void *ptr,
    omp_allocator_handle_t allocator=omp_null_allocator
);
C++
Fortran
subroutine omp_free(ptr, allocator) bind(c)
  use, intrinsic :: iso_c_binding, only : c_ptr
  type(c_ptr), value :: ptr
  integer(omp_allocator_handle_kind), value :: allocator
Fortran
25 Binding
26 The binding task set for an omp_free region is the generating task.
29 Binding
30 The binding task set for an omp_calloc or omp_aligned_calloc region is the generating
31 task.
20 18.13.9 omp_realloc
21 Summary
22 The omp_realloc routine deallocates previously allocated memory and requests a memory
23 allocation from a memory allocator.
24 Format
C
void *omp_realloc(
    void *ptr,
    size_t size,
    omp_allocator_handle_t allocator,
    omp_allocator_handle_t free_allocator
);
C
18 Binding
19 The binding task set for an omp_realloc region is the generating task.
20 Effect
21 The omp_realloc routine deallocates the memory to which ptr points and requests a new
22 memory allocation of size bytes from the specified memory allocator. If the free_allocator
23 argument is specified, it must be the memory allocator to which the previous allocation request was
24 made. If the free_allocator argument is omp_null_allocator the implementation will
25 determine that value automatically. If the allocator argument is omp_null_allocator the
26 behavior is as if the memory allocator that allocated the memory to which ptr argument points is
27 passed to the allocator argument. Upon success it returns a (possibly moved) pointer to the
28 allocated memory and the contents of the new object shall be the same as that of the old object
29 prior to deallocation, up to the minimum size of old allocated size and size. Any bytes in the new
30 object beyond the old allocated size will have unspecified values. If the allocation failed, the
31 behavior that the fallback trait of the allocator specifies will be followed. If ptr is NULL,
32 omp_realloc will behave the same as omp_alloc with the same size and allocator arguments.
33 If size is 0, omp_realloc will return NULL and the old allocation will be deallocated. If size is
34 not 0, the old allocation will be deallocated if and only if the function returns a non-null value.
35 Memory allocated by omp_realloc will be byte-aligned to at least the maximum of the
36 alignment required by malloc and the alignment trait of the allocator.
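The deallocation rule for a nonzero size implies a caller-side pattern: keep the old pointer until the call is known to have succeeded. The following is a minimal sketch of that pattern, not part of the API. It uses the standard realloc as a stand-in because realloc follows the same rule for a nonzero size; the names grow_buffer, realloc_like_fn and always_fail are hypothetical. With OpenMP, the call would instead be omp_realloc(ptr, size, allocator, free_allocator).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reallocation routine. omp_realloc additionally
 * takes the allocator and free_allocator handles. */
typedef void *(*realloc_like_fn)(void *ptr, size_t size);

/* Resize *buf to new_size bytes, where new_size must be nonzero.
 * Returns 1 on success and 0 on failure. On failure *buf is left untouched
 * because the old allocation has not been deallocated. */
static int grow_buffer(void **buf, size_t new_size, realloc_like_fn fn)
{
    assert(buf != NULL && fn != NULL);
    if (new_size == 0)
        return 0;           /* size 0 means "deallocate"; not handled here */
    void *moved = fn(*buf, new_size);
    if (moved == NULL)
        return 0;           /* the old block is still valid and still owned */
    *buf = moved;           /* the old block is gone; use only the new pointer */
    return 1;
}

/* A reallocation routine that always fails, to exercise the error path. */
static void *always_fail(void *ptr, size_t size)
{
    (void)ptr;
    (void)size;
    return NULL;
}
```

Assigning the result directly to the original pointer variable would lose the only reference to the old allocation whenever the routine returns NULL for a nonzero size.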
9 Cross References
10 • Memory Allocators, see Section 6.2
11 • omp_alloc and omp_aligned_alloc, see Section 18.13.6
12 • omp_destroy_allocator, see Section 18.13.3
13 • requires directive, see Section 8.2
14 • target directive, see Section 13.8
18 Format
C / C++
19 int omp_control_tool(int command, int modifier, void *arg);
C / C++
Fortran
20 integer function omp_control_tool(command, modifier)
21 integer (kind=omp_control_tool_kind) command
22 integer modifier
Fortran
23 Constraints on Arguments
24 The following enumeration type defines four standard commands. Table 18.3 describes the actions
25 that these commands request from a tool.
18 Binding
19 The binding task set for an omp_control_tool region is the generating task.
11 Restrictions
12 Restrictions on access to the state of an OpenMP first-party tool are as follows:
13 • An application may access the tool state modified by an OMPT callback only by using
14 omp_control_tool.
15 Cross References
16 • OMPT Interface, see Chapter 19
17 • ompt_callback_control_tool_t, see Section 19.5.2.29
22 Format
C / C++
23 void omp_display_env(int verbose);
C / C++
Fortran
24 subroutine omp_display_env(verbose)
25 logical,intent(in) :: verbose
Fortran
26 Binding
27 The binding thread set for an omp_display_env region is the encountering thread.
32 Restrictions
33 Restrictions to the omp_display_env routine are as follows:
34 • When called from within a target region the effect is unspecified.
35 Cross References
36 • OMP_DISPLAY_ENV, see Section 21.7
21 19.2.1 ompt_start_tool
22 Summary
23 In order to use the OMPT interface provided by an OpenMP implementation, a tool must implement
24 the ompt_start_tool function, through which the OpenMP implementation initializes the tool.
17 Description of Arguments
18 The argument omp_version is the value of the _OPENMP version macro associated with the
19 OpenMP API implementation. This value identifies the OpenMP API version that an OpenMP
20 implementation supports, which specifies the version of the OMPT interface that it supports.
21 The argument runtime_version is a version string that unambiguously identifies the OpenMP
22 implementation.
23 Constraints on Arguments
24 The argument runtime_version must be an immutable string that is defined for the lifetime of a
25 program execution.
26 Effect
27 If a tool returns a non-null pointer to an ompt_start_tool_result_t structure, an OpenMP
28 implementation will call the tool initializer specified by the initialize field in this structure before
29 beginning execution of any OpenMP construct or completing execution of any environment routine
30 invocation; the OpenMP implementation will call the tool finalizer specified by the finalize field in
31 this structure when the OpenMP implementation shuts down.
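As an illustration of the effect described above, the following minimal sketch shows a tool that returns a non-null result structure from ompt_start_tool. The type declarations are local copies written for this sketch (a real tool would include the omp-tools.h header), and the names my_tool_initialize, my_tool_finalize and the two counters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the OMPT declarations used here; a real tool would
 * include <omp-tools.h> instead. */
typedef union ompt_data_t { uint64_t value; void *ptr; } ompt_data_t;
typedef void (*ompt_interface_fn_t)(void);
typedef ompt_interface_fn_t (*ompt_function_lookup_t)(const char *name);
typedef int (*ompt_initialize_t)(ompt_function_lookup_t lookup,
                                 int initial_device_num,
                                 ompt_data_t *tool_data);
typedef void (*ompt_finalize_t)(ompt_data_t *tool_data);
typedef struct ompt_start_tool_result_t {
    ompt_initialize_t initialize;
    ompt_finalize_t finalize;
    ompt_data_t tool_data;
} ompt_start_tool_result_t;

/* Hypothetical tool state: counts how often each hook ran. */
static int my_tool_initialized = 0;
static int my_tool_finalized = 0;

static int my_tool_initialize(ompt_function_lookup_t lookup,
                              int initial_device_num, ompt_data_t *tool_data)
{
    (void)lookup;
    (void)initial_device_num;
    assert(tool_data != NULL);
    tool_data->value = 42;  /* arbitrary tool-private value */
    my_tool_initialized++;
    return 1;               /* non-zero reports successful initialization */
}

static void my_tool_finalize(ompt_data_t *tool_data)
{
    (void)tool_data;
    my_tool_finalized++;
}

/* Entry point that the OpenMP implementation calls to initialize the tool. */
ompt_start_tool_result_t *ompt_start_tool(unsigned int omp_version,
                                          const char *runtime_version)
{
    static ompt_start_tool_result_t result = {
        my_tool_initialize, my_tool_finalize, {0}
    };
    (void)omp_version;
    (void)runtime_version;
    return &result;  /* non-null: the implementation will call initialize */
}
```

The result structure has static storage duration in this sketch so that the pointer remains valid after ompt_start_tool returns.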
32 Cross References
33 • Tool Initialization and Finalization, see Section 19.4.1
[Figure: flow chart of OMPT interface state transitions among the disabled, inactive and active states, driven by the value r returned by ompt_start_tool (NULL or non-null) and by the value (0 or 1) returned by r->initialize.]
21 Cross References
22 • Tool Initialization and Finalization, see Section 19.4.1
23 • ompt_start_tool, see Section 19.2.1
24 • tool-libraries-var ICV, see Table 2.1
25 • tool-var ICV, see Table 2.1
8 Cross References
9 • Tool Initialization and Finalization, see Section 19.4.1
10 • ompt_enumerate_mutex_impls_t, see Section 19.6.1.2
11 • ompt_enumerate_states_t, see Section 19.6.1.1
12 • ompt_set_callback_t, see Section 19.6.1.3
13 • ompt_start_tool, see Section 19.2.1
1 Cross References
2 • Lookup Entry Points: ompt_function_lookup_t, see Section 19.6.3
3 • ompt_enumerate_mutex_impls_t, see Section 19.6.1.2
4 • ompt_enumerate_states_t, see Section 19.6.1.1
5 • ompt_get_callback_t, see Section 19.6.1.4
6 • ompt_get_num_devices_t, see Section 19.6.1.17
7 • ompt_get_num_places_t, see Section 19.6.1.7
8 • ompt_get_num_procs_t, see Section 19.6.1.6
9 • ompt_get_parallel_info_t, see Section 19.6.1.13
10 • ompt_get_partition_place_nums_t, see Section 19.6.1.10
11 • ompt_get_place_num_t, see Section 19.6.1.9
12 • ompt_get_place_proc_ids_t, see Section 19.6.1.8
13 • ompt_get_proc_id_t, see Section 19.6.1.11
14 • ompt_get_state_t, see Section 19.6.1.12
35 Cross References
36 • ompt_get_callback_t, see Section 19.6.1.4
Callback Name
ompt_callback_thread_begin
ompt_callback_thread_end
ompt_callback_parallel_begin
ompt_callback_parallel_end
ompt_callback_task_create
ompt_callback_task_schedule
ompt_callback_implicit_task
ompt_callback_target
ompt_callback_target_emi
ompt_callback_target_data_op
ompt_callback_target_data_op_emi
ompt_callback_target_submit
ompt_callback_target_submit_emi
ompt_callback_control_tool
ompt_callback_device_initialize
ompt_callback_device_finalize
ompt_callback_device_load
ompt_callback_device_unload
Callback Name
ompt_callback_sync_region_wait
ompt_callback_mutex_released
ompt_callback_dependences
ompt_callback_task_dependence
ompt_callback_work
ompt_callback_master // (deprecated)
ompt_callback_masked
ompt_callback_target_map
ompt_callback_target_map_emi
ompt_callback_sync_region
ompt_callback_reduction
ompt_callback_lock_init
ompt_callback_lock_destroy
ompt_callback_mutex_acquire
ompt_callback_mutex_acquired
ompt_callback_nest_lock
ompt_callback_flush
ompt_callback_cancel
ompt_callback_dispatch
1 NULL to the device initializer of the tool for its lookup argument; otherwise, the OpenMP
2 implementation passes a pointer to a device-specific runtime entry point with type signature
3 ompt_function_lookup_t to the device initializer of the tool.
4 • If a non-null lookup pointer is provided to the device initializer of the tool, the tool may use it to
5 determine the runtime entry points in the tracing interface that are available for the device and
6 may bind the returned function pointers to tool variables. Table 19.4 indicates the names of
7 runtime entry points that may be available for a device; an implementation may provide
8 additional implementation-defined names and corresponding entry points. The driver for the
9 device provides the runtime entry points that enable a tool to control the trace collection interface
10 of the device. The native trace format that the interface uses may be device specific and the
11 available kinds of trace records are implementation defined. Some devices may allow a tool to
12 collect traces of records in a standard format known as OMPT trace records. Each OMPT trace
13 record serves as a substitute for an OMPT callback that cannot be made on the device. The fields
14 in each trace record type are defined in the description of the callback that the record represents.
15 If this type of record is provided then the lookup function returns values for the runtime entry
16 points ompt_set_trace_ompt and ompt_get_record_ompt, which support collecting
17 and decoding OMPT traces. If the native tracing format for a device is the OMPT format then
18 tracing can be controlled using the runtime entry points for native or OMPT tracing.
30 Restrictions
31 Restrictions on tracing activity on devices are as follows:
32 • Implementation-defined names must not start with the prefix ompt_, which is reserved for the
33 OpenMP specification.
34 Cross References
35 • ompt_advance_buffer_cursor_t, see Section 19.6.2.10
36 • ompt_callback_device_finalize_t, see Section 19.5.2.20
37 • ompt_callback_device_initialize_t, see Section 19.5.2.19
19 Cross References
20 • ompt_finalize_t, see Section 19.5.1.2
16 19.4.2 Callbacks
17 Summary
18 The ompt_callbacks_t enumeration type indicates the integer codes used to identify OpenMP
19 callbacks when registering or querying them.
20 Format
C / C++
21 typedef enum ompt_callbacks_t {
22 ompt_callback_thread_begin = 1,
23 ompt_callback_thread_end = 2,
24 ompt_callback_parallel_begin = 3,
25 ompt_callback_parallel_end = 4,
26 ompt_callback_task_create = 5,
27 ompt_callback_task_schedule = 6,
28 ompt_callback_implicit_task = 7,
29 ompt_callback_target = 8,
30 ompt_callback_target_data_op = 9,
31 ompt_callback_target_submit = 10,
32 ompt_callback_control_tool = 11,
33 ompt_callback_device_initialize = 12,
34 ompt_callback_device_finalize = 13,
35 ompt_callback_device_load = 14,
36 ompt_callback_device_unload = 15,
25 19.4.3 Tracing
26 OpenMP provides type definitions that support tracing with OMPT.
5 Format
C / C++
6 typedef enum ompt_record_native_t {
7 ompt_record_native_info = 1,
8 ompt_record_native_event = 2
9 } ompt_record_native_t;
C / C++
14 Format
C / C++
15 typedef struct ompt_record_abstract_t {
16 ompt_record_native_t rclass;
17 const char *type;
18 ompt_device_time_t start_time;
19 ompt_device_time_t end_time;
20 ompt_hwid_t hwid;
21 } ompt_record_abstract_t;
C / C++
22 Semantics
23 An ompt_record_abstract_t record contains information that a tool can use to process a
24 native record that it may not fully understand. The rclass field indicates that the record is
25 informational or that it represents an event; this information can help a tool determine how to
26 present the record. The record type field points to a statically-allocated, immutable character string
27 that provides a meaningful name that a tool can use to describe the event to a user. The start_time
28 and end_time fields are used to place an event in time. The times are relative to the device clock. If
29 an event does not have an associated start_time (end_time), the value of the start_time (end_time)
30 field is ompt_time_none. The hardware identifier field, hwid, indicates the location on the
31 device where the event occurred. A hwid may represent a hardware abstraction such as a core or a
32 hardware thread identifier. The meaning of a hwid value for a device is implementation defined. If
33 no hardware abstraction is associated with the record then the value of hwid is ompt_hwid_none.
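The following minimal sketch shows how a tool might interpret the fields of an abstract record, including the ompt_time_none and ompt_hwid_none sentinel values. The type declarations are local copies written for this sketch, the two sentinels are modeled as macros with the value 0, and the helper names record_duration, record_has_hwid and record_class_name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the declarations; a real tool would take them from
 * <omp-tools.h>. */
typedef uint64_t ompt_device_time_t;
typedef uint64_t ompt_hwid_t;
#define ompt_time_none ((ompt_device_time_t)0)
#define ompt_hwid_none ((ompt_hwid_t)0)
typedef enum ompt_record_native_t {
    ompt_record_native_info = 1,
    ompt_record_native_event = 2
} ompt_record_native_t;
typedef struct ompt_record_abstract_t {
    ompt_record_native_t rclass;
    const char *type;
    ompt_device_time_t start_time;
    ompt_device_time_t end_time;
    ompt_hwid_t hwid;
} ompt_record_abstract_t;

/* Returns 1 and stores end_time - start_time (in raw device clock units)
 * in *duration when both endpoints are known; returns 0 when either one
 * is ompt_time_none. */
static int record_duration(const ompt_record_abstract_t *rec,
                           ompt_device_time_t *duration)
{
    assert(rec != NULL && duration != NULL);
    if (rec->start_time == ompt_time_none || rec->end_time == ompt_time_none)
        return 0;
    *duration = rec->end_time - rec->start_time;
    return 1;
}

/* Returns 1 if a hardware abstraction is associated with the record. */
static int record_has_hwid(const ompt_record_abstract_t *rec)
{
    assert(rec != NULL);
    return rec->hwid != ompt_hwid_none;
}

/* Label that a tool could use when presenting the record class to a user. */
static const char *record_class_name(const ompt_record_abstract_t *rec)
{
    assert(rec != NULL);
    return rec->rclass == ompt_record_native_event ? "event" : "info";
}
```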
4 Format
C / C++
5 typedef struct ompt_record_ompt_t {
6 ompt_callbacks_t type;
7 ompt_device_time_t time;
8 ompt_id_t thread_id;
9 ompt_id_t target_id;
10 union {
11 ompt_record_thread_begin_t thread_begin;
12 ompt_record_parallel_begin_t parallel_begin;
13 ompt_record_parallel_end_t parallel_end;
14 ompt_record_work_t work;
15 ompt_record_dispatch_t dispatch;
16 ompt_record_task_create_t task_create;
17 ompt_record_dependences_t dependences;
18 ompt_record_task_dependence_t task_dependence;
19 ompt_record_task_schedule_t task_schedule;
20 ompt_record_implicit_task_t implicit_task;
21 ompt_record_masked_t masked;
22 ompt_record_sync_region_t sync_region;
23 ompt_record_mutex_acquire_t mutex_acquire;
24 ompt_record_mutex_t mutex;
25 ompt_record_nest_lock_t nest_lock;
26 ompt_record_flush_t flush;
27 ompt_record_cancel_t cancel;
28 ompt_record_target_t target;
29 ompt_record_target_data_op_t target_data_op;
30 ompt_record_target_map_t target_map;
31 ompt_record_target_kernel_t target_kernel;
32 ompt_record_control_tool_t control_tool;
33 ompt_record_error_t error;
34 } record;
35 } ompt_record_ompt_t;
C / C++
36 Semantics
37 The type field specifies the type of record provided by this structure. Depending on the type,
38 event-specific information is stored in the matching record entry.
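A tool that decodes standard trace records typically switches on the type field and then reads only the matching union member. The following minimal sketch uses a trimmed local model of ompt_record_ompt_t with two union members. The enumerator value 19 for ompt_callback_task_dependence and the ompt_thread_t values are assumptions taken from the full enumerations of the specification, and the names trace_summary and summarize_record are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t ompt_id_t;
typedef uint64_t ompt_device_time_t;

/* Trimmed local copies; a real tool would include <omp-tools.h>. */
typedef enum ompt_callbacks_t {
    ompt_callback_thread_begin = 1,
    ompt_callback_parallel_begin = 3,
    ompt_callback_task_dependence = 19  /* assumed from the full enumeration */
} ompt_callbacks_t;
typedef enum ompt_thread_t {            /* assumed from the full enumeration */
    ompt_thread_initial = 1,
    ompt_thread_worker = 2,
    ompt_thread_other = 3
} ompt_thread_t;
typedef struct ompt_record_thread_begin_t {
    ompt_thread_t thread_type;
} ompt_record_thread_begin_t;
typedef struct ompt_record_task_dependence_t {
    ompt_id_t src_task_id;
    ompt_id_t sink_task_id;
} ompt_record_task_dependence_t;

/* Trimmed model of ompt_record_ompt_t with only two union members. */
typedef struct ompt_record_ompt_t {
    ompt_callbacks_t type;
    ompt_device_time_t time;
    ompt_id_t thread_id;
    ompt_id_t target_id;
    union {
        ompt_record_thread_begin_t thread_begin;
        ompt_record_task_dependence_t task_dependence;
    } record;
} ompt_record_ompt_t;

/* Hypothetical running summary of a trace. */
typedef struct trace_summary {
    unsigned worker_threads;    /* thread_begin records of worker threads */
    unsigned task_dependences;  /* task_dependence records seen */
    ompt_id_t last_sink_task;   /* sink of the latest task_dependence */
    unsigned unhandled;         /* records that this sketch does not decode */
} trace_summary;

/* Read only the union member that matches rec->type. */
static void summarize_record(const ompt_record_ompt_t *rec, trace_summary *s)
{
    assert(rec != NULL && s != NULL);
    switch (rec->type) {
    case ompt_callback_thread_begin:
        if (rec->record.thread_begin.thread_type == ompt_thread_worker)
            s->worker_threads++;
        break;
    case ompt_callback_task_dependence:
        s->task_dependences++;
        s->last_sink_task = rec->record.task_dependence.sink_task_id;
        break;
    default:
        s->unhandled++;
        break;
    }
}
```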
6 19.4.4.1 ompt_callback_t
7 Summary
8 Pointers to tool callback functions with different type signatures are passed to the
9 ompt_set_callback runtime entry point and returned by the ompt_get_callback
10 runtime entry point. For convenience, these runtime entry points expect all type signatures to be
11 cast to a dummy type ompt_callback_t.
12 Format
C / C++
13 typedef void (*ompt_callback_t) (void);
C / C++
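The cast to and from the dummy type can be sketched as follows. This is only an illustration: the one-slot registry, register_thread_end and fire_thread_end are hypothetical stand-ins for the runtime side and do not correspond to actual runtime entry points. The point shown is that the pointer is cast back to its real signature before it is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies; a real tool would include <omp-tools.h>. */
typedef union ompt_data_t { uint64_t value; void *ptr; } ompt_data_t;
typedef void (*ompt_callback_t)(void);
typedef void (*ompt_callback_thread_end_t)(ompt_data_t *thread_data);

/* Hypothetical one-slot registry standing in for the runtime side. */
static ompt_callback_t registered_thread_end = NULL;

static void register_thread_end(ompt_callback_t callback)
{
    registered_thread_end = callback;
}

/* Tool side: a callback with its real type signature. */
static int thread_end_calls = 0;

static void on_thread_end(ompt_data_t *thread_data)
{
    assert(thread_data != NULL);
    thread_data->value = 0;
    thread_end_calls++;
}

/* Tool side: cast to the dummy type when handing the pointer over. */
static void install_callbacks(void)
{
    register_thread_end((ompt_callback_t)on_thread_end);
}

/* Runtime side: cast back to the real signature before the call. Calling
 * through the dummy type itself would be undefined behavior in C. */
static void fire_thread_end(ompt_data_t *thread_data)
{
    if (registered_thread_end != NULL)
        ((ompt_callback_thread_end_t)registered_thread_end)(thread_data);
}
```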
14 19.4.4.2 ompt_set_result_t
15 Summary
16 The ompt_set_result_t enumeration type corresponds to values that the
17 ompt_set_callback, ompt_set_trace_ompt and ompt_set_trace_native
18 runtime entry points return.
19 Format
C / C++
20 typedef enum ompt_set_result_t {
21 ompt_set_error = 0,
22 ompt_set_never = 1,
23 ompt_set_impossible = 2,
24 ompt_set_sometimes = 3,
25 ompt_set_sometimes_paired = 4,
26 ompt_set_always = 5
27 } ompt_set_result_t;
C / C++
28 Semantics
29 Values of ompt_set_result_t may indicate several possible outcomes. The
30 ompt_set_error value indicates that the associated call failed. Otherwise, the value indicates
31 when an event may occur and, when appropriate, dispatching a callback event leads to the
32 invocation of the callback. The ompt_set_never value indicates that the event will never occur
33 or that the callback will never be invoked at runtime. The ompt_set_impossible value
34 indicates that the event may occur but that tracing of it is not possible. The
35 ompt_set_sometimes value indicates that the event may occur and, for an
7 Cross References
8 • ompt_set_callback_t, see Section 19.6.1.3
9 • ompt_set_trace_native_t, see Section 19.6.2.5
10 • ompt_set_trace_ompt_t, see Section 19.6.2.4
11 19.4.4.3 ompt_id_t
12 Summary
13 The ompt_id_t type is used to provide various identifiers to tools.
14 Format
C / C++
15 typedef uint64_t ompt_id_t;
C / C++
16 Semantics
17 When tracing asynchronous activity on devices, identifiers enable tools to correlate target regions
18 and operations that the host initiates with associated activities on a target device. In addition,
19 OMPT provides identifiers to refer to parallel regions and tasks that execute on a device. These
20 various identifiers are of type ompt_id_t.
21 ompt_id_none is defined as an instance of type ompt_id_t with the value 0.
22 Restrictions
23 Restrictions to the ompt_id_t type are as follows:
24 • Identifiers created on each device must be unique from the time an OpenMP implementation is
25 initialized until it is shut down. Identifiers for each target region and target data operation
26 instance that the host device initiates must be unique over time on the host. Identifiers for parallel
27 and task region instances that execute on a device must be unique over time within that device.
28 19.4.4.4 ompt_data_t
29 Summary
30 The ompt_data_t type represents data associated with threads and with parallel and task regions.
12 19.4.4.5 ompt_device_t
13 Summary
14 The ompt_device_t opaque object type represents a device.
15 Format
C / C++
16 typedef void ompt_device_t;
C / C++
17 19.4.4.6 ompt_device_time_t
18 Summary
19 The ompt_device_time_t type represents raw device time values.
20 Format
C / C++
21 typedef uint64_t ompt_device_time_t;
C / C++
22 Semantics
23 The ompt_device_time_t opaque object type represents raw device time values.
24 ompt_time_none refers to an unknown or unspecified time and is defined as an instance of type
25 ompt_device_time_t with the value 0.
26 19.4.4.7 ompt_buffer_t
27 Summary
28 The ompt_buffer_t opaque object type is a handle for a target buffer.
3 19.4.4.8 ompt_buffer_cursor_t
4 Summary
5 The ompt_buffer_cursor_t opaque type is a handle for a position in a target buffer.
6 Format
C / C++
7 typedef uint64_t ompt_buffer_cursor_t;
C / C++
8 19.4.4.9 ompt_dependence_t
9 Summary
10 The ompt_dependence_t type represents a task dependence.
11 Format
C / C++
12 typedef struct ompt_dependence_t {
13 ompt_data_t variable;
14 ompt_dependence_type_t dependence_type;
15 } ompt_dependence_t;
C / C++
16 Semantics
17 The ompt_dependence_t type is a structure that holds information about a depend clause. For
18 task dependences, the variable field points to the storage location of the dependence. For doacross
19 dependences, the variable field contains the value of a vector element that describes the
20 dependence. The dependence_type field indicates the type of the dependence.
21 Cross References
22 • ompt_dependence_type_t, see Section 19.4.4.24
23 19.4.4.10 ompt_thread_t
24 Summary
25 The ompt_thread_t enumeration type defines the valid thread type values.
14 19.4.4.11 ompt_scope_endpoint_t
15 Summary
16 The ompt_scope_endpoint_t enumeration type defines valid scope endpoint values.
17 Format
C / C++
18 typedef enum ompt_scope_endpoint_t {
19 ompt_scope_begin = 1,
20 ompt_scope_end = 2,
21 ompt_scope_beginend = 3
22 } ompt_scope_endpoint_t;
C / C++
23 19.4.4.12 ompt_dispatch_t
24 Summary
25 The ompt_dispatch_t enumeration type defines the valid dispatch kind values.
26 Format
C / C++
27 typedef enum ompt_dispatch_t {
28 ompt_dispatch_iteration = 1,
29 ompt_dispatch_section = 2,
30 ompt_dispatch_ws_loop_chunk = 3,
31 ompt_dispatch_taskloop_chunk = 4,
32 ompt_dispatch_distribute_chunk = 5
33 } ompt_dispatch_t;
C / C++
4 Format
C / C++
5 typedef struct ompt_dispatch_chunk_t {
6 uint64_t start;
7 uint64_t iterations;
8 } ompt_dispatch_chunk_t;
C / C++
9 Semantics
10 The ompt_dispatch_chunk_t type is a structure that holds information about a chunk of
11 logical iterations of a loop nest. The start field specifies the first logical iteration of the chunk and
12 the iterations field specifies the number of iterations in the chunk. Whether the chunk of a taskloop
13 is contiguous is implementation defined.
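As a small illustration of the two fields, the following sketch tests whether a logical iteration belongs to a chunk. It assumes that the chunk is contiguous, that is, that it covers the half-open range from start to start + iterations, which the text above does not guarantee for a taskloop. The helper name chunk_contains is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the structure; a real tool would include <omp-tools.h>. */
typedef struct ompt_dispatch_chunk_t {
    uint64_t start;
    uint64_t iterations;
} ompt_dispatch_chunk_t;

/* Returns 1 if logical iteration it lies in [start, start + iterations),
 * assuming a contiguous chunk. The subtraction form avoids overflow in
 * start + iterations. */
static int chunk_contains(const ompt_dispatch_chunk_t *chunk, uint64_t it)
{
    assert(chunk != NULL);
    return it >= chunk->start && it - chunk->start < chunk->iterations;
}
```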
14 19.4.4.14 ompt_sync_region_t
15 Summary
16 The ompt_sync_region_t enumeration type defines the valid synchronization region kind
17 values.
18 Format
C / C++
19 typedef enum ompt_sync_region_t {
20 ompt_sync_region_barrier = 1, // deprecated
21 ompt_sync_region_barrier_implicit = 2, // deprecated
22 ompt_sync_region_barrier_explicit = 3,
23 ompt_sync_region_barrier_implementation = 4,
24 ompt_sync_region_taskwait = 5,
25 ompt_sync_region_taskgroup = 6,
26 ompt_sync_region_reduction = 7,
27 ompt_sync_region_barrier_implicit_workshare = 8,
28 ompt_sync_region_barrier_implicit_parallel = 9,
29 ompt_sync_region_barrier_teams = 10
30 } ompt_sync_region_t;
C / C++
31 19.4.4.15 ompt_target_data_op_t
32 Summary
33 The ompt_target_data_op_t enumeration type defines the valid target data operation values.
14 19.4.4.16 ompt_work_t
15 Summary
16 The ompt_work_t enumeration type defines the valid work type values.
17 Format
C / C++
18 typedef enum ompt_work_t {
19 ompt_work_loop = 1,
20 ompt_work_sections = 2,
21 ompt_work_single_executor = 3,
22 ompt_work_single_other = 4,
23 ompt_work_workshare = 5,
24 ompt_work_distribute = 6,
25 ompt_work_taskloop = 7,
26 ompt_work_scope = 8,
27 ompt_work_loop_static = 10,
28 ompt_work_loop_dynamic = 11,
29 ompt_work_loop_guided = 12,
30 ompt_work_loop_other = 13
31 } ompt_work_t;
C / C++
32 19.4.4.17 ompt_mutex_t
33 Summary
34 The ompt_mutex_t enumeration type defines the valid mutex kind values.
11 19.4.4.18 ompt_native_mon_flag_t
12 Summary
13 The ompt_native_mon_flag_t enumeration type defines the valid native monitoring flag
14 values.
15 Format
C / C++
16 typedef enum ompt_native_mon_flag_t {
17 ompt_native_data_motion_explicit = 0x01,
18 ompt_native_data_motion_implicit = 0x02,
19 ompt_native_kernel_invocation = 0x04,
20 ompt_native_kernel_execution = 0x08,
21 ompt_native_driver = 0x10,
22 ompt_native_runtime = 0x20,
23 ompt_native_overhead = 0x40,
24 ompt_native_idleness = 0x80
25 } ompt_native_mon_flag_t;
C / C++
26 19.4.4.19 ompt_task_flag_t
27 Summary
28 The ompt_task_flag_t enumeration type defines valid task types.
29 Format
C / C++
30 typedef enum ompt_task_flag_t {
31 ompt_task_initial = 0x00000001,
32 ompt_task_implicit = 0x00000002,
33 ompt_task_explicit = 0x00000004,
34 ompt_task_target = 0x00000008,
35 ompt_task_taskwait = 0x00000010,
11 19.4.4.20 ompt_task_status_t
12 Summary
13 The ompt_task_status_t enumeration type indicates the reason that a task was switched
14 when it reached a task scheduling point.
15 Format
C / C++
16 typedef enum ompt_task_status_t {
17 ompt_task_complete = 1,
18 ompt_task_yield = 2,
19 ompt_task_cancel = 3,
20 ompt_task_detach = 4,
21 ompt_task_early_fulfill = 5,
22 ompt_task_late_fulfill = 6,
23 ompt_task_switch = 7,
24 ompt_taskwait_complete = 8
25 } ompt_task_status_t;
C / C++
26 Semantics
27 The value ompt_task_complete of the ompt_task_status_t type indicates that the task
28 that encountered the task scheduling point completed execution of the associated structured block
29 and an associated allow-completion event was fulfilled. The value ompt_task_yield indicates
30 that the task encountered a taskyield construct. The value ompt_task_cancel indicates
31 that the task was canceled when it encountered an active cancellation point. The value
32 ompt_task_detach indicates that a task for which the detach clause was specified completed
33 execution of the associated structured block and is waiting for an allow-completion event to be
8 19.4.4.21 ompt_target_t
9 Summary
10 The ompt_target_t enumeration type defines the valid target type values.
11 Format
C / C++
12 typedef enum ompt_target_t {
13 ompt_target = 1,
14 ompt_target_enter_data = 2,
15 ompt_target_exit_data = 3,
16 ompt_target_update = 4,
17 ompt_target_nowait = 9,
18 ompt_target_enter_data_nowait = 10,
19 ompt_target_exit_data_nowait = 11,
20 ompt_target_update_nowait = 12
21 } ompt_target_t;
C / C++
22 19.4.4.22 ompt_parallel_flag_t
23 Summary
24 The ompt_parallel_flag_t enumeration type defines valid invoker values.
25 Format
C / C++
26 typedef enum ompt_parallel_flag_t {
27 ompt_parallel_invoker_program = 0x00000001,
28 ompt_parallel_invoker_runtime = 0x00000002,
29 ompt_parallel_league = 0x40000000,
30 ompt_parallel_team = 0x80000000
31 } ompt_parallel_flag_t;
C / C++
11 19.4.4.23 ompt_target_map_flag_t
12 Summary
13 The ompt_target_map_flag_t enumeration type defines the valid target map flag values.
14 Format
C / C++
15 typedef enum ompt_target_map_flag_t {
16 ompt_target_map_flag_to = 0x01,
17 ompt_target_map_flag_from = 0x02,
18 ompt_target_map_flag_alloc = 0x04,
19 ompt_target_map_flag_release = 0x08,
20 ompt_target_map_flag_delete = 0x10,
21 ompt_target_map_flag_implicit = 0x20,
22 ompt_target_map_flag_always = 0x40,
23 ompt_target_map_flag_present = 0x80,
24 ompt_target_map_flag_close = 0x100,
25 ompt_target_map_flag_shared = 0x200
26 } ompt_target_map_flag_t;
C / C++
27 Semantics
28 The ompt_target_map_flag_ map-type flag is set if the mapping operations have that
29 map-type. If the map-type for the mapping operations is tofrom, both the
30 ompt_target_map_flag_to and ompt_target_map_flag_from flags are set. The
31 ompt_target_map_flag_implicit flag is set if the mapping operations result from implicit
32 data-mapping rules. The ompt_target_map_flag_ map-type-modifier flag is set if the
33 mapping operations are specified with that map-type-modifier. The
34 ompt_target_map_flag_shared flag is set if the original and corresponding storage are
35 shared in the mapping operation.
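Because the flag values are distinct bits, a tool can test them with bitwise operations. The following minimal sketch checks for the tofrom combination and counts implicit mappings in an array of flag words. The helper names map_is_tofrom and count_implicit are hypothetical, and the array of flag words is an assumed input format for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the enumeration; a real tool would include <omp-tools.h>. */
typedef enum ompt_target_map_flag_t {
    ompt_target_map_flag_to = 0x01,
    ompt_target_map_flag_from = 0x02,
    ompt_target_map_flag_alloc = 0x04,
    ompt_target_map_flag_release = 0x08,
    ompt_target_map_flag_delete = 0x10,
    ompt_target_map_flag_implicit = 0x20,
    ompt_target_map_flag_always = 0x40,
    ompt_target_map_flag_present = 0x80,
    ompt_target_map_flag_close = 0x100,
    ompt_target_map_flag_shared = 0x200
} ompt_target_map_flag_t;

/* A tofrom mapping sets both the to flag and the from flag. */
static int map_is_tofrom(unsigned int flags)
{
    unsigned int both = ompt_target_map_flag_to | ompt_target_map_flag_from;
    return (flags & both) == both;
}

/* Count the entries that result from implicit data-mapping rules. */
static size_t count_implicit(const unsigned int *mapping_flags, size_t nitems)
{
    size_t count = 0;
    assert(mapping_flags != NULL || nitems == 0);
    for (size_t i = 0; i < nitems; i++) {
        if (mapping_flags[i] & ompt_target_map_flag_implicit)
            count++;
    }
    return count;
}
```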
5 Format
C / C++
6 typedef enum ompt_dependence_type_t {
7 ompt_dependence_type_in = 1,
8 ompt_dependence_type_out = 2,
9 ompt_dependence_type_inout = 3,
10 ompt_dependence_type_mutexinoutset = 4,
11 ompt_dependence_type_source = 5,
12 ompt_dependence_type_sink = 6,
13 ompt_dependence_type_inoutset = 7
14 } ompt_dependence_type_t;
C / C++
15 19.4.4.25 ompt_severity_t
16 Summary
17 The ompt_severity_t enumeration type defines the valid severity values.
18 Format
C / C++
19 typedef enum ompt_severity_t {
20 ompt_warning = 1,
21 ompt_fatal = 2
22 } ompt_severity_t;
C / C++
23 19.4.4.26 ompt_cancel_flag_t
24 Summary
25 The ompt_cancel_flag_t enumeration type defines the valid cancel flag values.
26 Format
C / C++
27 typedef enum ompt_cancel_flag_t {
28 ompt_cancel_parallel = 0x01,
29 ompt_cancel_sections = 0x02,
30 ompt_cancel_loop = 0x04,
31 ompt_cancel_taskgroup = 0x08,
5 19.4.4.27 ompt_hwid_t
6 Summary
7 The ompt_hwid_t opaque type is a handle for a hardware identifier for a target device.
8 Format
C / C++
9 typedef uint64_t ompt_hwid_t;
C / C++
10 Semantics
11 The ompt_hwid_t opaque type is a handle for a hardware identifier for a target device.
12 ompt_hwid_none is an instance of the type that refers to an unknown or unspecified hardware
13 identifier and that has the value 0. If no hwid is associated with an
14 ompt_record_abstract_t then the value of hwid is ompt_hwid_none.
15 Cross References
16 • Native Record Abstract Type, see Section 19.4.3.3
17 19.4.4.28 ompt_state_t
18 Summary
19 If the OMPT interface is in the active state then an OpenMP implementation must maintain thread
20 state information for each thread. The thread state maintained is an approximation of the
21 instantaneous state of a thread.
22 Format
C / C++
23 A thread state must be one of the values of the enumeration type ompt_state_t or an
24 implementation-defined state value of 512 or higher.
25 typedef enum ompt_state_t {
26 ompt_state_work_serial = 0x000,
27 ompt_state_work_parallel = 0x001,
28 ompt_state_work_reduction = 0x002,
29
30 ompt_state_wait_barrier = 0x010, // deprecated
32 ompt_state_wait_barrier_implicit_parallel = 0x011,
33 ompt_state_wait_barrier_implicit_workshare = 0x012,
23 19.4.4.29 ompt_frame_t
24 Summary
25 The ompt_frame_t type describes procedure frame information for an OpenMP task.
26 Format
C / C++
27 typedef struct ompt_frame_t {
28 ompt_data_t exit_frame;
29 ompt_data_t enter_frame;
30 int exit_frame_flags;
31 int enter_frame_flags;
32 } ompt_frame_t;
C / C++
30 19.4.4.30 ompt_frame_flag_t
31 Summary
32 The ompt_frame_flag_t enumeration type defines valid frame information flags.
19 19.4.4.31 ompt_wait_id_t
20 Summary
21 The ompt_wait_id_t type describes wait identifiers for an OpenMP thread.
22 Format
C / C++
23 typedef uint64_t ompt_wait_id_t;
C / C++
24 Semantics
25 Each thread maintains a wait identifier of type ompt_wait_id_t. When a task that a thread
26 executes is waiting for mutual exclusion, the wait identifier of the thread indicates the reason that
27 the thread is waiting. A wait identifier may represent a critical section name, a lock, a program
28 variable accessed in an atomic region, or a synchronization object that is internal to an OpenMP
29 implementation. When a thread is not in a wait state then the value of the wait identifier of the
30 thread is undefined. ompt_wait_id_none is defined as an instance of type
31 ompt_wait_id_t with the value 0.
13 Format
C / C++
14 typedef int (*ompt_initialize_t) (
15 ompt_function_lookup_t lookup,
16 int initial_device_num,
17 ompt_data_t *tool_data
18 );
C / C++
19 Semantics
20 To use the OMPT interface, an implementation of ompt_start_tool must return a non-null
21 pointer to an ompt_start_tool_result_t structure that contains a pointer to a tool
22 initializer function with type signature ompt_initialize_t. An OpenMP implementation will
23 call the initializer after fully initializing itself but before beginning execution of any OpenMP
24 construct or runtime library routine. The initializer returns a non-zero value if it succeeds;
25 otherwise, the OMPT interface state changes to inactive as described in Section 19.2.3.
26 Description of Arguments
27 The lookup argument is a callback to an OpenMP runtime routine that must be used to obtain a
28 pointer to each runtime entry point in the OMPT interface. The initial_device_num argument
29 provides the value of omp_get_initial_device(). The tool_data argument is a pointer to
30 the tool_data field in the ompt_start_tool_result_t structure that ompt_start_tool
31 returned.
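The use of the lookup argument can be sketched as follows: the initializer asks for the ompt_set_callback entry point by name, casts the result to its real signature and registers a callback. The declarations are trimmed local copies written for this sketch. The fake_set_callback, fake_lookup and empty_lookup functions are hypothetical stand-ins for the runtime that exist only so the sketch can be exercised, and treating any result other than ompt_set_error as success is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Trimmed local copies; a real tool would include <omp-tools.h>. */
typedef union ompt_data_t { uint64_t value; void *ptr; } ompt_data_t;
typedef void (*ompt_interface_fn_t)(void);
typedef ompt_interface_fn_t (*ompt_function_lookup_t)(const char *name);
typedef void (*ompt_callback_t)(void);
typedef enum ompt_callbacks_t { ompt_callback_thread_end = 2 } ompt_callbacks_t;
typedef enum ompt_set_result_t {
    ompt_set_error = 0,
    ompt_set_never = 1,
    ompt_set_impossible = 2,
    ompt_set_sometimes = 3,
    ompt_set_sometimes_paired = 4,
    ompt_set_always = 5
} ompt_set_result_t;
typedef ompt_set_result_t (*ompt_set_callback_t)(ompt_callbacks_t event,
                                                 ompt_callback_t callback);

static void on_thread_end(ompt_data_t *thread_data)
{
    (void)thread_data;
}

/* Tool initializer: returns non-zero on success, zero otherwise. */
static int my_tool_initialize(ompt_function_lookup_t lookup,
                              int initial_device_num, ompt_data_t *tool_data)
{
    (void)initial_device_num;
    (void)tool_data;
    assert(lookup != NULL);
    ompt_set_callback_t set_callback =
        (ompt_set_callback_t)lookup("ompt_set_callback");
    if (set_callback == NULL)
        return 0;  /* entry point unavailable: report failure */
    ompt_set_result_t result = set_callback(ompt_callback_thread_end,
                                            (ompt_callback_t)on_thread_end);
    return result != ompt_set_error;
}

/* Stand-in runtime, only for exercising the sketch. */
static ompt_callback_t fake_registered = NULL;

static ompt_set_result_t fake_set_callback(ompt_callbacks_t event,
                                           ompt_callback_t callback)
{
    if (event != ompt_callback_thread_end)
        return ompt_set_error;
    fake_registered = callback;
    return ompt_set_always;
}

static ompt_interface_fn_t fake_lookup(const char *name)
{
    if (strcmp(name, "ompt_set_callback") == 0)
        return (ompt_interface_fn_t)fake_set_callback;
    return NULL;
}

static ompt_interface_fn_t empty_lookup(const char *name)
{
    (void)name;
    return NULL;
}
```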
6 19.5.1.2 ompt_finalize_t
7 Summary
8 A tool implements a finalizer with the type signature ompt_finalize_t to finalize its use of the
9 OMPT interface.
10 Format
C / C++
11 typedef void (*ompt_finalize_t) (
12 ompt_data_t *tool_data
13 );
C / C++
14 Semantics
15 To use the OMPT interface, an implementation of ompt_start_tool must return a non-null
16 pointer to an ompt_start_tool_result_t structure that contains a non-null pointer to a tool
17 finalizer with type signature ompt_finalize_t. An OpenMP implementation must call the tool
18 finalizer after the last OMPT event as the OpenMP implementation shuts down.
19 Description of Arguments
20 The tool_data argument is a pointer to the tool_data field in the
21 ompt_start_tool_result_t structure returned by ompt_start_tool.
22 Cross References
23 • Tool Initialization and Finalization, see Section 19.4.1
24 • ompt_data_t, see Section 19.4.4.4
25 • ompt_start_tool, see Section 19.2.1
4 19.5.2.1 ompt_callback_thread_begin_t
5 Summary
6 The ompt_callback_thread_begin_t type is used for callbacks that are dispatched when
7 native threads are created.
8 Format
C / C++
9 typedef void (*ompt_callback_thread_begin_t) (
10 ompt_thread_t thread_type,
11 ompt_data_t *thread_data
12 );
C / C++
13 Trace Record
C / C++
14 typedef struct ompt_record_thread_begin_t {
15 ompt_thread_t thread_type;
16 } ompt_record_thread_begin_t;
C / C++
17 Description of Arguments
18 The thread_type argument indicates the type of the new thread: initial, worker, or other. The
19 binding of the thread_data argument is the new thread.
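A callback of this type might use thread_data to attach a tool-assigned identifier to the new thread, as in the following minimal sketch. The ompt_thread_t values are assumptions taken from the full enumeration of the specification, and the counters next_thread_id and worker_threads are hypothetical tool state. Atomic operations are used because several threads may begin concurrently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies; a real tool would include <omp-tools.h>. */
typedef union ompt_data_t { uint64_t value; void *ptr; } ompt_data_t;
typedef enum ompt_thread_t {  /* values assumed from the full enumeration */
    ompt_thread_initial = 1,
    ompt_thread_worker = 2,
    ompt_thread_other = 3,
    ompt_thread_unknown = 4
} ompt_thread_t;

/* Hypothetical tool state. */
static atomic_uint_fast64_t next_thread_id = 1;
static atomic_uint_fast64_t worker_threads = 0;

/* Matches the ompt_callback_thread_begin_t type signature. */
static void on_thread_begin(ompt_thread_t thread_type,
                            ompt_data_t *thread_data)
{
    assert(thread_data != NULL);
    /* Store a tool-assigned id in the data object bound to the new thread. */
    thread_data->value = atomic_fetch_add(&next_thread_id, 1);
    if (thread_type == ompt_thread_worker)
        atomic_fetch_add(&worker_threads, 1);
}
```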
20 Cross References
21 • Initial Task, see Section 12.8
22 • ompt_data_t, see Section 19.4.4.4
23 • ompt_thread_t, see Section 19.4.4.10
24 • parallel directive, see Section 10.1
25 • teams directive, see Section 10.2
5 Format
C / C++
6 typedef void (*ompt_callback_thread_end_t) (
7 ompt_data_t *thread_data
8 );
C / C++
9 Description of Arguments
10 The binding of the thread_data argument is the thread that will be destroyed.
11 Cross References
12 • Initial Task, see Section 12.8
13 • Standard Trace Record Type, see Section 19.4.3.4
14 • ompt_data_t, see Section 19.4.4.4
15 • parallel directive, see Section 10.1
16 • teams directive, see Section 10.2
17 19.5.2.3 ompt_callback_parallel_begin_t
18 Summary
19 The ompt_callback_parallel_begin_t type is used for callbacks that are dispatched
20 when a parallel or teams region starts.
21 Format
C / C++
22 typedef void (*ompt_callback_parallel_begin_t) (
23 ompt_data_t *encountering_task_data,
24 const ompt_frame_t *encountering_task_frame,
25 ompt_data_t *parallel_data,
26 unsigned int requested_parallelism,
27 int flags,
28 const void *codeptr_ra
29 );
C / C++
32 19.5.2.4 ompt_callback_parallel_end_t
33 Summary
34 The ompt_callback_parallel_end_t type is used for callbacks that are dispatched when a
35 parallel or teams region ends.
27 Cross References
28 • ompt_data_t, see Section 19.4.4.4
29 • ompt_parallel_flag_t, see Section 19.4.4.22
30 • parallel directive, see Section 10.1
31 • teams directive, see Section 10.2
5 Format
C / C++
6 typedef void (*ompt_callback_work_t) (
7 ompt_work_t work_type,
8 ompt_scope_endpoint_t endpoint,
9 ompt_data_t *parallel_data,
10 ompt_data_t *task_data,
11 uint64_t count,
12 const void *codeptr_ra
13 );
C / C++
14 Trace Record
C / C++
15 typedef struct ompt_record_work_t {
16 ompt_work_t work_type;
17 ompt_scope_endpoint_t endpoint;
18 ompt_id_t parallel_id;
19 ompt_id_t task_id;
20 uint64_t count;
21 const void *codeptr_ra;
22 } ompt_record_work_t;
C / C++
23 Description of Arguments
24 The work_type argument indicates the kind of region.
25 The endpoint argument indicates that the callback signals the beginning of a scope or the end of a
26 scope.
27 The binding of the parallel_data argument is the current parallel region.
28 The binding of the task_data argument is the current task.
29 The count argument is a measure of the quantity of work involved in the construct. For a
30 worksharing-loop or taskloop construct, count represents the number of iterations in the
31 iteration space, which may be the result of collapsing several associated loops. For a sections
32 construct, count represents the number of sections. For a workshare construct, count represents
33 the units of work, as defined by the workshare construct. For a single or scope construct,
9 Cross References
10 • Work-Distribution Constructs, see Chapter 11
11 • ompt_data_t, see Section 19.4.4.4
12 • ompt_scope_endpoint_t, see Section 19.4.4.11
13 • ompt_work_t, see Section 19.4.4.16
14 • taskloop directive, see Section 12.6
15 19.5.2.6 ompt_callback_dispatch_t
16 Summary
17 The ompt_callback_dispatch_t type is used for callbacks that are dispatched when a
18 thread begins to execute a section or loop iteration.
19 Format
C / C++
typedef void (*ompt_callback_dispatch_t) (
  ompt_data_t *parallel_data,
  ompt_data_t *task_data,
  ompt_dispatch_t kind,
  ompt_data_t instance
);
C / C++
26 Trace Record
C / C++
typedef struct ompt_record_dispatch_t {
  ompt_id_t parallel_id;
  ompt_id_t task_id;
  ompt_dispatch_t kind;
  ompt_data_t instance;
} ompt_record_dispatch_t;
C / C++
23 19.5.2.7 ompt_callback_task_create_t
24 Summary
25 The ompt_callback_task_create_t type is used for callbacks that are dispatched when
26 task regions are generated.
27 Format
C / C++
typedef void (*ompt_callback_task_create_t) (
  ompt_data_t *encountering_task_data,
  const ompt_frame_t *encountering_task_frame,
  ompt_data_t *new_task_data,
  int flags,
  int has_dependences,
  const void *codeptr_ra
);
C / C++
23 Cross References
24 • Initial Task, see Section 12.8
25 • ompt_data_t, see Section 19.4.4.4
26 • ompt_frame_t, see Section 19.4.4.29
27 • ompt_task_flag_t, see Section 19.4.4.19
28 • task directive, see Section 12.5
29 19.5.2.8 ompt_callback_dependences_t
30 Summary
31 The ompt_callback_dependences_t type is used for callbacks that are related to
32 dependences and that are dispatched when new tasks are generated and when ordered constructs
33 are encountered.
25 Cross References
26 • ompt_data_t, see Section 19.4.4.4
27 • ompt_dependence_t, see Section 19.4.4.9
28 • depend clause, see Section 15.9.5
29 • ordered directive, see Section 15.10.1
5 Format
C / C++
typedef void (*ompt_callback_task_dependence_t) (
  ompt_data_t *src_task_data,
  ompt_data_t *sink_task_data
);
C / C++
10 Trace Record
C / C++
typedef struct ompt_record_task_dependence_t {
  ompt_id_t src_task_id;
  ompt_id_t sink_task_id;
} ompt_record_task_dependence_t;
C / C++
15 Description of Arguments
16 The binding of the src_task_data argument is a running task with an outgoing dependence.
17 The binding of the sink_task_data argument is a task with an unsatisfied incoming dependence.
18 Cross References
19 • ompt_data_t, see Section 19.4.4.4
20 • depend clause, see Section 15.9.5
21 19.5.2.10 ompt_callback_task_schedule_t
22 Summary
23 The ompt_callback_task_schedule_t type is used for callbacks that are dispatched when
24 task scheduling decisions are made.
25 Format
C / C++
typedef void (*ompt_callback_task_schedule_t) (
  ompt_data_t *prior_task_data,
  ompt_task_status_t prior_task_status,
  ompt_data_t *next_task_data
);
C / C++
14 Cross References
15 • Task Scheduling, see Section 12.9
16 • ompt_data_t, see Section 19.4.4.4
17 • ompt_task_status_t, see Section 19.4.4.20
18 19.5.2.11 ompt_callback_implicit_task_t
19 Summary
20 The ompt_callback_implicit_task_t type is used for callbacks that are dispatched when
21 initial tasks and implicit tasks are generated and completed.
22 Format
C / C++
typedef void (*ompt_callback_implicit_task_t) (
  ompt_scope_endpoint_t endpoint,
  ompt_data_t *parallel_data,
  ompt_data_t *task_data,
  unsigned int actual_parallelism,
  unsigned int index,
  int flags
);
C / C++
25 Cross References
26 • ompt_data_t, see Section 19.4.4.4
27 • ompt_scope_endpoint_t, see Section 19.4.4.11
28 • parallel directive, see Section 10.1
29 • teams directive, see Section 10.2
30 19.5.2.12 ompt_callback_masked_t
31 Summary
32 The ompt_callback_masked_t type is used for callbacks that are dispatched when masked
33 regions start and end.
30 19.5.2.13 ompt_callback_sync_region_t
31 Summary
The ompt_callback_sync_region_t type is used for callbacks that are dispatched when
barrier regions, taskwait regions, and taskgroup regions begin and end, when waiting for them
begins and ends, and when reductions are performed.
10 19.5.2.14 ompt_callback_mutex_acquire_t
11 Summary
The ompt_callback_mutex_acquire_t type is used for callbacks that are dispatched when
locks are initialized, acquired, and tested and when critical regions, atomic regions, and
ordered regions are begun.
15 Format
C / C++
typedef void (*ompt_callback_mutex_acquire_t) (
  ompt_mutex_t kind,
  unsigned int hint,
  unsigned int impl,
  ompt_wait_id_t wait_id,
  const void *codeptr_ra
);
C / C++
23 Trace Record
C / C++
typedef struct ompt_record_mutex_acquire_t {
  ompt_mutex_t kind;
  unsigned int hint;
  unsigned int impl;
  ompt_wait_id_t wait_id;
  const void *codeptr_ra;
} ompt_record_mutex_acquire_t;
C / C++
15 Cross References
16 • omp_init_lock and omp_init_nest_lock, see Section 18.9.1
17 • ompt_mutex_t, see Section 19.4.4.17
18 • ordered Construct, see Section 15.10
19 • atomic directive, see Section 15.8.4
20 • critical directive, see Section 15.2
21 • ompt_wait_id_t, see Section 19.4.4.31
22 19.5.2.15 ompt_callback_mutex_t
23 Summary
24 The ompt_callback_mutex_t type is used for callbacks that indicate important
25 synchronization events.
26 Format
C / C++
typedef void (*ompt_callback_mutex_t) (
  ompt_mutex_t kind,
  ompt_wait_id_t wait_id,
  const void *codeptr_ra
);
C / C++
16 Cross References
17 • omp_set_lock and omp_set_nest_lock, see Section 18.9.4
18 • omp_test_lock and omp_test_nest_lock, see Section 18.9.6
19 • omp_unset_lock and omp_unset_nest_lock, see Section 18.9.5
20 • ompt_mutex_t, see Section 19.4.4.17
21 • ordered Construct, see Section 15.10
22 • atomic directive, see Section 15.8.4
23 • critical directive, see Section 15.2
24 • omp_destroy_lock and omp_destroy_nest_lock, see Section 18.9.3
25 • ompt_wait_id_t, see Section 19.4.4.31
26 19.5.2.16 ompt_callback_nest_lock_t
27 Summary
28 The ompt_callback_nest_lock_t type is used for callbacks that indicate that a thread that
29 owns a nested lock has performed an action related to the lock but has not relinquished ownership.
23 Cross References
24 • omp_set_lock and omp_set_nest_lock, see Section 18.9.4
25 • omp_test_lock and omp_test_nest_lock, see Section 18.9.6
26 • omp_unset_lock and omp_unset_nest_lock, see Section 18.9.5
27 • ompt_scope_endpoint_t, see Section 19.4.4.11
28 • ompt_wait_id_t, see Section 19.4.4.31
29 19.5.2.17 ompt_callback_flush_t
30 Summary
31 The ompt_callback_flush_t type is used for callbacks that are dispatched when flush
32 constructs are encountered.
21 19.5.2.18 ompt_callback_cancel_t
22 Summary
The ompt_callback_cancel_t type is used for callbacks that are dispatched for cancellation,
cancel, and discarded-task events.
25 Format
C / C++
typedef void (*ompt_callback_cancel_t) (
  ompt_data_t *task_data,
  int flags,
  const void *codeptr_ra
);
C / C++
23 19.5.2.19 ompt_callback_device_initialize_t
24 Summary
25 The ompt_callback_device_initialize_t type is used for callbacks that initialize
26 device tracing interfaces.
27 Format
C / C++
typedef void (*ompt_callback_device_initialize_t) (
  int device_num,
  const char *type,
  ompt_device_t *device,
  ompt_function_lookup_t lookup,
  const char *documentation
);
C / C++
6 Description of Arguments
7 The device_num argument identifies the logical device that is being initialized.
8 The type argument is a C string that indicates the type of the device. A device type string is a
9 semicolon-separated character string that includes, at a minimum, the vendor and model name of
10 the device. These names may be followed by a semicolon-separated sequence of properties that
11 describe the hardware or software of the device.
12 The device argument is a pointer to an opaque object that represents the target device instance.
13 Functions in the device tracing interface use this pointer to identify the device that is being
14 addressed.
15 The lookup argument points to a runtime callback that a tool must use to obtain pointers to runtime
16 entry points in the device’s OMPT tracing interface. If a device does not support tracing then
17 lookup is NULL.
18 The documentation argument is a C string that describes how to use any device-specific runtime
19 entry points that can be obtained through the lookup argument. This documentation string may be a
20 pointer to external documentation, or it may be inline descriptions that include names and type
21 signatures for any device-specific interfaces that are available through the lookup argument along
22 with descriptions of how to use these interface functions to control monitoring and analysis of
23 device traces.
24 Constraints on Arguments
25 The type and documentation arguments must be immutable strings that are defined for the lifetime
26 of program execution.
27 Effect
28 A device initializer must fulfill several duties. First, the type argument should be used to determine
29 if any special knowledge about the hardware and/or software of a device is employed. Second, the
30 lookup argument should be used to look up pointers to runtime entry points in the OMPT tracing
31 interface for the device. Finally, these runtime entry points should be used to set up tracing for the
32 device. Initialization of tracing for a target device is described in Section 19.2.5.
33 Cross References
34 • Lookup Entry Points: ompt_function_lookup_t, see Section 19.6.3
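As an illustration of the first duty of a device initializer, the following sketch extracts one field from a semicolon-separated device type string. The helper name device_type_field and the sample strings used with it are hypothetical and not part of OMPT. The sketch copies the field rather than modifying the string, since the type argument must be immutable, and it assumes that the vendor and model names are the first two fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of OMPT): copies the index-th
 * semicolon-separated field of a device type string into out.
 * Returns 1 if the field exists and fits in out, and 0 otherwise.
 * The type string itself is never modified.
 */
int device_type_field(const char *type, size_t index,
                      char *out, size_t out_size)
{
    assert(type != NULL && out != NULL && out_size > 0);
    const char *start = type;
    for (;;) {
        const char *end = strchr(start, ';');
        size_t len = end ? (size_t)(end - start) : strlen(start);
        if (index == 0) {
            if (len >= out_size)
                return 0;           /* field does not fit */
            memcpy(out, start, len);
            out[len] = '\0';
            return 1;
        }
        if (end == NULL)
            return 0;               /* ran out of fields */
        start = end + 1;            /* skip the separator */
        --index;
    }
}
```

A device initializer could call device_type_field(type, 0, ...) for the vendor name and device_type_field(type, 1, ...) for the model name before deciding which device-specific entry points to request through lookup.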
21 19.5.2.21 ompt_callback_device_load_t
22 Summary
23 The ompt_callback_device_load_t type is used for callbacks that the OpenMP runtime
24 invokes to indicate that it has just loaded code onto the specified device.
25 Format
C / C++
typedef void (*ompt_callback_device_load_t) (
  int device_num,
  const char *filename,
  int64_t offset_in_file,
  void *vma_in_file,
  size_t bytes,
  void *host_addr,
  void *device_addr,
  uint64_t module_id
);
C / C++
16 Cross References
17 • Device Directives and Clauses, see Chapter 13
18 19.5.2.22 ompt_callback_device_unload_t
19 Summary
20 The ompt_callback_device_unload_t type is used for callbacks that the OpenMP
21 runtime invokes to indicate that it is about to unload code from the specified device.
22 Format
C / C++
typedef void (*ompt_callback_device_unload_t) (
  int device_num,
  uint64_t module_id
);
C / C++
27 Description of Arguments
28 The device_num argument specifies the device.
29 The module_id argument is an identifier that is associated with the device code object.
30 Cross References
31 • Device Directives and Clauses, see Chapter 13
23 19.5.2.24 ompt_callback_buffer_complete_t
24 Summary
25 The ompt_callback_buffer_complete_t type is used for callbacks that are dispatched
26 when devices will not record any more trace records in an event buffer and all records written to the
27 buffer are valid.
28 Format
C / C++
typedef void (*ompt_callback_buffer_complete_t) (
  int device_num,
  ompt_buffer_t *buffer,
  size_t bytes,
  ompt_buffer_cursor_t begin,
  int buffer_owned
);
C / C++
7 Description of Arguments
8 The device_num argument indicates the device for which the buffer contains events.
9 The buffer argument is the address of a buffer that was previously allocated by a buffer request
10 callback.
11 The bytes argument indicates the full size of the buffer.
12 The begin argument is an opaque cursor that indicates the position of the beginning of the first
13 record in the buffer.
The buffer_owned argument is 1 if the data to which the buffer points can be deleted by the callback
and 0 otherwise. If multiple devices accumulate trace events into a single buffer, this callback may
be invoked with a pointer to one or more trace records in a shared buffer with buffer_owned = 0. In
this case, the callback must not delete the buffer.
18 Cross References
19 • ompt_buffer_cursor_t, see Section 19.4.4.8
20 • ompt_buffer_t, see Section 19.4.4.7
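The ownership rule for buffer_owned can be sketched as follows. The typedefs are local stand-ins for the declarations in omp-tools.h, and the counter buffers_released is a hypothetical name used only to make the behavior observable. The sketch assumes that the tool's buffer request callback allocated the buffer with malloc, so free is the matching release; a tool that allocates differently would release differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Local stand-ins for declarations that omp-tools.h normally provides. */
typedef void ompt_buffer_t;
typedef uint64_t ompt_buffer_cursor_t;

/* Hypothetical counter, used here only to observe the behavior. */
int buffers_released = 0;

/* Has the ompt_callback_buffer_complete_t signature. */
void on_buffer_complete(int device_num, ompt_buffer_t *buffer, size_t bytes,
                        ompt_buffer_cursor_t begin, int buffer_owned)
{
    (void)device_num;
    (void)bytes;
    (void)begin;

    /* A real tool would walk the trace records starting at begin here,
       using entry points from the device tracing interface. */

    if (buffer_owned) {
        /* Assumption: the buffer request callback used malloc. */
        free(buffer);
        ++buffers_released;
    }
    /* When buffer_owned is 0 the buffer is shared and is left alone. */
}
```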
27 Format
C / C++
typedef void (*ompt_callback_target_data_op_emi_t) (
  ompt_scope_endpoint_t endpoint,
  ompt_data_t *target_task_data,
  ompt_data_t *target_data,
  ompt_id_t *host_op_id,
  ompt_target_data_op_t optype,
  void *src_addr,
  int src_device_num,
17 Restrictions
18 Restrictions to the ompt_callback_target_emi and ompt_callback_target callbacks
19 are as follows:
20 • These callbacks must not be registered at the same time.
21 Cross References
22 • ompt_data_t, see Section 19.4.4.4
23 • ompt_id_t, see Section 19.4.4.3
24 • ompt_scope_endpoint_t, see Section 19.4.4.11
25 • ompt_target_t, see Section 19.4.4.21
26 • target data directive, see Section 13.5
27 • target directive, see Section 13.8
28 • target enter data directive, see Section 13.6
29 • target exit data directive, see Section 13.7
30 • target update directive, see Section 13.9
6 Format
C / C++
typedef void (*ompt_callback_target_map_emi_t) (
  ompt_data_t *target_data,
  unsigned int nitems,
  void **host_addr,
  void **device_addr,
  size_t *bytes,
  unsigned int *mapping_flags,
  const void *codeptr_ra
);
typedef void (*ompt_callback_target_map_t) (
  ompt_id_t target_id,
  unsigned int nitems,
  void **host_addr,
  void **device_addr,
  size_t *bytes,
  unsigned int *mapping_flags,
  const void *codeptr_ra
);
C / C++
25 Trace Record
C / C++
typedef struct ompt_record_target_map_t {
  ompt_id_t target_id;
  unsigned int nitems;
  void **host_addr;
  void **device_addr;
  size_t *bytes;
  unsigned int *mapping_flags;
  const void *codeptr_ra;
} ompt_record_target_map_t;
C / C++
29 Restrictions
Restrictions to the ompt_callback_target_map_emi and
ompt_callback_target_map callbacks are as follows:
32 • These callbacks must not be registered at the same time.
33 Cross References
34 • ompt_callback_target_data_op_emi_t and
35 ompt_callback_target_data_op_t, see Section 19.5.2.25
36 • ompt_data_t, see Section 19.4.4.4
37 • ompt_id_t, see Section 19.4.4.3
12 Format
C / C++
typedef void (*ompt_callback_target_submit_emi_t) (
  ompt_scope_endpoint_t endpoint,
  ompt_data_t *target_data,
  ompt_id_t *host_op_id,
  unsigned int requested_num_teams
);
typedef void (*ompt_callback_target_submit_t) (
  ompt_id_t target_id,
  ompt_id_t host_op_id,
  unsigned int requested_num_teams
);
C / C++
24 Trace Record
C / C++
typedef struct ompt_record_target_kernel_t {
  ompt_id_t host_op_id;
  unsigned int requested_num_teams;
  unsigned int granted_num_teams;
  ompt_device_time_t end_time;
} ompt_record_target_kernel_t;
C / C++
5 Format
C / C++
typedef int (*ompt_callback_control_tool_t) (
  uint64_t command,
  uint64_t modifier,
  void *arg,
  const void *codeptr_ra
);
C / C++
12 Trace Record
C / C++
typedef struct ompt_record_control_tool_t {
  uint64_t command;
  uint64_t modifier;
  const void *codeptr_ra;
} ompt_record_control_tool_t;
C / C++
18 Semantics
19 Callbacks with type signature ompt_callback_control_tool_t may return any
20 non-negative value, which will be returned to the application as the return value of the
21 omp_control_tool call that triggered the callback.
22 Description of Arguments
23 The command argument passes a command from an application to a tool. Standard values for
24 command are defined by omp_control_tool_t in Section 18.14.
25 The modifier argument passes a command modifier from an application to a tool.
26 The command and modifier arguments may have tool-specific values. Tools must ignore command
27 values that they are not designed to handle.
28 The arg argument is a void pointer that enables a tool and an application to exchange arbitrary state.
29 The arg argument may be NULL.
7 Constraints on Arguments
8 Tool-specific values for command must be ≥ 64.
9 Cross References
10 • Tool Control Routine, see Section 18.14
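A minimal sketch of a control-tool callback that follows the rules above: it handles the standard commands, accepts one tool-specific command at or above 64, and ignores everything else. The TOOL_* constants mirror the values of omp_control_tool_t, and the return values 0 and 1 mirror omp_control_tool_success and omp_control_tool_ignored. The command CMD_RESET_COUNTERS and the variables tool_collecting and tool_events_seen are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror the standard omp_control_tool_t command values. */
enum { TOOL_START = 1, TOOL_PAUSE = 2, TOOL_FLUSH = 3, TOOL_END = 4 };

/* Hypothetical tool-specific command; such values must be >= 64. */
enum { CMD_RESET_COUNTERS = 64 };

/* Hypothetical tool state. */
int tool_collecting = 0;
uint64_t tool_events_seen = 0;

/* Has the ompt_callback_control_tool_t signature. Returns 0 when the
   command was handled and 1 when it was ignored; both are non-negative. */
int on_control_tool(uint64_t command, uint64_t modifier, void *arg,
                    const void *codeptr_ra)
{
    (void)modifier;
    (void)arg;          /* may be NULL, so it is never dereferenced here */
    (void)codeptr_ra;

    switch (command) {
    case TOOL_START:
        tool_collecting = 1;
        return 0;
    case TOOL_PAUSE:
    case TOOL_END:
        tool_collecting = 0;
        return 0;
    case TOOL_FLUSH:
        return 0;       /* nothing is buffered in this sketch */
    case CMD_RESET_COUNTERS:
        tool_events_seen = 0;
        return 0;
    default:
        return 1;       /* a command this tool is not designed to handle */
    }
}
```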
11 19.5.2.30 ompt_callback_error_t
12 Summary
13 The ompt_callback_error_t type is used for callbacks that dispatch runtime-error events.
14 Format
C / C++
typedef void (*ompt_callback_error_t) (
  ompt_severity_t severity,
  const char *message,
  size_t length,
  const void *codeptr_ra
);
C / C++
21 Trace Record
C / C++
typedef struct ompt_record_error_t {
  ompt_severity_t severity;
  const char *message;
  size_t length;
  const void *codeptr_ra;
} ompt_record_error_t;
C / C++
28 Semantics
29 A thread dispatches a registered ompt_callback_error_t callback when an error directive
30 is encountered for which the at(execution) clause is specified.
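A minimal sketch of an error callback that formats the reported message. The enumeration is a local stand-in for ompt_severity_t in omp-tools.h, and the buffer last_error is a hypothetical name. The sketch prints at most length characters of message instead of relying on a terminating null character, and it caps the printed length at 100 characters, which is an arbitrary limit chosen to fit the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Local stand-in for the enumeration that omp-tools.h provides. */
typedef enum ompt_severity_t {
    ompt_warning = 1,
    ompt_fatal = 2
} ompt_severity_t;

/* Hypothetical storage for the most recent formatted error. */
char last_error[128];

/* Has the ompt_callback_error_t signature. */
void on_error(ompt_severity_t severity, const char *message, size_t length,
              const void *codeptr_ra)
{
    (void)codeptr_ra;
    /* Cap the length so that the precision fits in an int and the
       formatted text fits in last_error. */
    int shown = length > 100 ? 100 : (int)length;
    snprintf(last_error, sizeof last_error, "%s: %.*s",
             severity == ompt_fatal ? "fatal" : "warning", shown, message);
}
```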
6 19.6.1.1 ompt_enumerate_states_t
7 Summary
8 The ompt_enumerate_states_t type is the type signature of the
9 ompt_enumerate_states runtime entry point, which enumerates the thread states that an
10 OpenMP implementation supports.
11 Format
C / C++
typedef int (*ompt_enumerate_states_t) (
  int current_state,
  int *next_state,
  const char **next_state_name
);
C / C++
17 Semantics
18 An OpenMP implementation may support only a subset of the states that the ompt_state_t
19 enumeration type defines. An OpenMP implementation may also support implementation-specific
20 states. The ompt_enumerate_states runtime entry point, which has type signature
21 ompt_enumerate_states_t, enables a tool to enumerate the supported thread states.
22 When a supported thread state is passed as current_state, the runtime entry point assigns the next
23 thread state in the enumeration to the variable passed by reference in next_state and assigns the
24 name associated with that state to the character pointer passed by reference in next_state_name.
25 Whenever one or more states are left in the enumeration, the ompt_enumerate_states
26 runtime entry point returns 1. When the last state in the enumeration is passed as current_state,
27 ompt_enumerate_states returns 0, which indicates that the enumeration is complete.
28 Description of Arguments
29 The current_state argument must be a thread state that the OpenMP implementation supports. To
30 begin enumerating the supported states, a tool should pass ompt_state_undefined as
31 current_state. Subsequent invocations of ompt_enumerate_states should pass the value
32 assigned to the variable that was passed by reference in next_state to the previous call.
33 The value ompt_state_undefined is reserved to indicate an invalid thread state.
34 ompt_state_undefined is defined as an integer with the value 0x102.
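The enumeration protocol described above can be sketched as a loop that starts from ompt_state_undefined and feeds each returned state back into the next call. The typedef repeats the signature above so that the sketch compiles without omp-tools.h. The function mock_enumerate_states and its three-entry table are assumptions that stand in for a real runtime, and visit_supported_states and state_visitor_t are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Repeats the signature above so the sketch compiles on its own. */
typedef int (*ompt_enumerate_states_t)(int current_state, int *next_state,
                                       const char **next_state_name);

enum { STATE_UNDEFINED = 0x102 };   /* value of ompt_state_undefined */

/* Mock runtime (assumption): supports three illustrative states. */
static const struct { int id; const char *name; } mock_states[] = {
    {0x000, "work_serial"}, {0x001, "work_parallel"}, {0x100, "idle"}
};

int mock_enumerate_states(int current, int *next, const char **next_name)
{
    size_t n = sizeof mock_states / sizeof mock_states[0];
    size_t i = 0;
    if (current != STATE_UNDEFINED) {
        while (i < n && mock_states[i].id != current)
            ++i;
        ++i;                        /* step past the current state */
    }
    if (i >= n)
        return 0;                   /* enumeration is complete */
    *next = mock_states[i].id;
    *next_name = mock_states[i].name;
    return 1;
}

/* Hypothetical visitor type; not part of OMPT. */
typedef void (*state_visitor_t)(int state, const char *name, void *ctx);

/* Walks the enumeration and returns the number of states reported. */
int visit_supported_states(ompt_enumerate_states_t enumerate,
                           state_visitor_t visit, void *ctx)
{
    assert(enumerate != NULL);
    int state = STATE_UNDEFINED;    /* begins the enumeration */
    const char *name = NULL;
    int count = 0;
    /* current_state is passed by value, so reusing state is safe. */
    while (enumerate(state, &state, &name)) {
        if (visit != NULL)
            visit(state, name, ctx);
        ++count;
    }
    return count;
}
```

In a tool, the first argument would be the pointer obtained by looking up ompt_enumerate_states rather than the mock.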
5 Constraints on Arguments
6 Any string returned through the next_state_name argument must be immutable and defined for the
7 lifetime of program execution.
8 Cross References
9 • ompt_state_t, see Section 19.4.4.28
10 19.6.1.2 ompt_enumerate_mutex_impls_t
11 Summary
12 The ompt_enumerate_mutex_impls_t type is the type signature of the
13 ompt_enumerate_mutex_impls runtime entry point, which enumerates the kinds of mutual
14 exclusion implementations that an OpenMP implementation employs.
15 Format
C / C++
typedef int (*ompt_enumerate_mutex_impls_t) (
  int current_impl,
  int *next_impl,
  const char **next_impl_name
);
C / C++
21 Semantics
22 Mutual exclusion for locks, critical sections, and atomic regions may be implemented in
23 several ways. The ompt_enumerate_mutex_impls runtime entry point, which has type
24 signature ompt_enumerate_mutex_impls_t, enables a tool to enumerate the supported
25 mutual exclusion implementations.
26 When a supported mutex implementation is passed as current_impl, the runtime entry point assigns
27 the next mutex implementation in the enumeration to the variable passed by reference in next_impl
28 and assigns the name associated with that mutex implementation to the character pointer passed by
29 reference in next_impl_name.
30 Whenever one or more mutex implementations are left in the enumeration, the
31 ompt_enumerate_mutex_impls runtime entry point returns 1. When the last mutex
32 implementation in the enumeration is passed as current_impl, the runtime entry point returns 0,
33 which indicates that the enumeration is complete.
14 Constraints on Arguments
Any string returned through the next_impl_name argument must be immutable and defined for the
lifetime of program execution.
17 19.6.1.3 ompt_set_callback_t
18 Summary
19 The ompt_set_callback_t type is the type signature of the ompt_set_callback runtime
20 entry point, which registers a pointer to a tool callback that an OpenMP implementation invokes
21 when a host OpenMP event occurs.
22 Format
C / C++
typedef ompt_set_result_t (*ompt_set_callback_t) (
  ompt_callbacks_t event,
  ompt_callback_t callback
);
C / C++
27 Semantics
28 OpenMP implementations can use callbacks to indicate the occurrence of events during the
29 execution of an OpenMP program. The ompt_set_callback runtime entry point, which has
type signature ompt_set_callback_t, registers a callback for an OpenMP event on the
current device. The return value of ompt_set_callback indicates the outcome of registering
the callback.
18 19.6.1.4 ompt_get_callback_t
19 Summary
20 The ompt_get_callback_t type is the type signature of the ompt_get_callback runtime
21 entry point, which retrieves a pointer to a registered tool callback routine (if any) that an OpenMP
22 implementation invokes when a host OpenMP event occurs.
23 Format
C / C++
typedef int (*ompt_get_callback_t) (
  ompt_callbacks_t event,
  ompt_callback_t *callback
);
C / C++
28 Semantics
29 The ompt_get_callback runtime entry point, which has type signature
30 ompt_get_callback_t, retrieves a pointer to the tool callback that an OpenMP
31 implementation may invoke when a host OpenMP event occurs. If a non-null tool callback is
32 registered for the specified event, the pointer to the tool callback is assigned to the variable passed
33 by reference in callback and ompt_get_callback returns 1; otherwise, it returns 0. If
34 ompt_get_callback returns 0, the value of the variable passed by reference as callback is
35 undefined.
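The interplay of ompt_set_callback and ompt_get_callback can be sketched as a register-then-verify helper. The typedefs are local stand-ins for declarations in omp-tools.h, with ompt_callbacks_t reduced to int for brevity. The functions mock_set_callback and mock_get_callback form an assumed mock runtime with eight callback slots, and register_and_verify and example_callback are hypothetical names. The helper reads the retrieved pointer only when ompt_get_callback returns 1, because the value is undefined otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for declarations that omp-tools.h normally provides. */
typedef void (*ompt_callback_t)(void);
typedef int ompt_callbacks_t;   /* an enumeration in omp-tools.h */
typedef enum ompt_set_result_t {
    ompt_set_error = 0,
    ompt_set_never = 1,
    ompt_set_impossible = 2,
    ompt_set_sometimes = 3,
    ompt_set_sometimes_paired = 4,
    ompt_set_always = 5
} ompt_set_result_t;
typedef ompt_set_result_t (*ompt_set_callback_t)(ompt_callbacks_t event,
                                                 ompt_callback_t callback);
typedef int (*ompt_get_callback_t)(ompt_callbacks_t event,
                                   ompt_callback_t *callback);

/* Mock runtime (assumption): a table of MOCK_EVENTS callback slots. */
enum { MOCK_EVENTS = 8 };
static ompt_callback_t mock_slots[MOCK_EVENTS];

ompt_set_result_t mock_set_callback(ompt_callbacks_t event,
                                    ompt_callback_t cb)
{
    if (event < 0 || event >= MOCK_EVENTS)
        return ompt_set_error;
    mock_slots[event] = cb;
    return ompt_set_always;
}

int mock_get_callback(ompt_callbacks_t event, ompt_callback_t *cb)
{
    if (event < 0 || event >= MOCK_EVENTS || mock_slots[event] == NULL)
        return 0;               /* *cb is deliberately left untouched */
    *cb = mock_slots[event];
    return 1;
}

void example_callback(void) {}

/* Returns 1 if cb was registered, may be invoked, and is reported back. */
int register_and_verify(ompt_set_callback_t set, ompt_get_callback_t get,
                        ompt_callbacks_t event, ompt_callback_t cb)
{
    assert(set != NULL && get != NULL);
    ompt_set_result_t result = set(event, cb);
    if (result != ompt_set_sometimes &&
        result != ompt_set_sometimes_paired &&
        result != ompt_set_always)
        return 0;
    ompt_callback_t registered;
    if (!get(event, &registered))
        return 0;               /* registered is undefined in this case */
    return registered == cb;
}
```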
10 19.6.1.5 ompt_get_thread_data_t
11 Summary
12 The ompt_get_thread_data_t type is the type signature of the
13 ompt_get_thread_data runtime entry point, which returns the address of the thread data
14 object for the current thread.
15 Format
C / C++
typedef ompt_data_t *(*ompt_get_thread_data_t) (void);
C / C++
17 Semantics
18 Each OpenMP thread can have an associated thread data object of type ompt_data_t. The
19 ompt_get_thread_data runtime entry point, which has type signature
20 ompt_get_thread_data_t, retrieves a pointer to the thread data object, if any, that is
21 associated with the current thread. A tool may use a pointer to an OpenMP thread’s data object that
22 ompt_get_thread_data retrieves to inspect or to modify the value of the data object. When
23 an OpenMP thread is created, its data object is initialized with value ompt_data_none. This
24 runtime entry point is async signal safe.
25 Cross References
26 • ompt_data_t, see Section 19.4.4.4
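A minimal sketch of how a tool might use the thread data object as a per-thread counter. The union is a local stand-in for ompt_data_t in omp-tools.h. The function mock_get_thread_data is an assumed mock that returns a single zero-initialized object, which matches the initial value ompt_data_none, and bump_thread_counter is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the union that omp-tools.h provides. */
typedef union ompt_data_t {
    uint64_t value;
    void *ptr;
} ompt_data_t;

typedef ompt_data_t *(*ompt_get_thread_data_t)(void);

/* Mock runtime (assumption): one thread whose data object starts at 0. */
static ompt_data_t mock_thread_data = {0};

ompt_data_t *mock_get_thread_data(void)
{
    return &mock_thread_data;
}

/* Increments the calling thread's counter and returns the new value.
   Returns 0 if no data object is associated with the thread. */
uint64_t bump_thread_counter(ompt_get_thread_data_t get_thread_data)
{
    assert(get_thread_data != NULL);
    ompt_data_t *data = get_thread_data();
    if (data == NULL)
        return 0;
    return ++data->value;
}
```

A tool that needs more than 64 bits of state per thread could instead store a pointer to its own structure in the ptr member.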
27 19.6.1.6 ompt_get_num_procs_t
28 Summary
29 The ompt_get_num_procs_t type is the type signature of the ompt_get_num_procs
30 runtime entry point, which returns the number of processors currently available to the execution
31 environment on the host device.
5 Semantics
6 The ompt_get_num_procs runtime entry point, which has type signature
7 ompt_get_num_procs_t, returns the number of processors that are available on the host
8 device at the time the routine is called. This value may change between the time that it is
9 determined and the time that it is read in the calling context due to system actions outside the
10 control of the OpenMP implementation. This runtime entry point is async signal safe.
11 19.6.1.7 ompt_get_num_places_t
12 Summary
13 The ompt_get_num_places_t type is the type signature of the ompt_get_num_places
14 runtime entry point, which returns the number of places currently available to the execution
15 environment in the place list.
16 Format
C / C++
typedef int (*ompt_get_num_places_t) (void);
C / C++
18 Binding
19 The binding thread set is all threads on a device.
20 Semantics
21 The ompt_get_num_places runtime entry point, which has type signature
22 ompt_get_num_places_t, returns the number of places in the place list. This value is
23 equivalent to the number of places in the place-partition-var ICV in the execution environment of
24 the initial task. This runtime entry point is async signal safe.
25 Cross References
26 • OMP_PLACES, see Section 21.1.6
27 • place-partition-var ICV, see Table 2.1
6 Format
C / C++
typedef int (*ompt_get_place_proc_ids_t) (
  int place_num,
  int ids_size,
  int *ids
);
C / C++
12 Binding
13 The binding thread set is all threads on a device.
14 Semantics
15 The ompt_get_place_proc_ids runtime entry point, which has type signature
16 ompt_get_place_proc_ids_t, returns the numerical identifiers of each processor that is
17 associated with the specified place. These numerical identifiers are non-negative, and their meaning
18 is implementation defined.
19 Description of Arguments
20 The place_num argument specifies the place that is being queried.
21 The ids argument is an array in which the routine can return a vector of processor identifiers in the
22 specified place.
23 The ids_size argument indicates the size of the result array that is specified by ids.
24 Effect
25 If the ids array of size ids_size is large enough to contain all identifiers then they are returned in ids
26 and their order in the array is implementation defined. Otherwise, if the ids array is too small, the
27 values in ids when the function returns are unspecified. The routine always returns the number of
28 numerical identifiers of the processors that are available to the execution environment in the
29 specified place.
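Because the routine always returns the number of identifiers, a tool can size its array with a first call and fetch the identifiers with a second call. The following sketch shows that pattern. The typedef repeats the signature above, mock_get_place_proc_ids is an assumed mock in which place 0 contains four processors, and fetch_place_proc_ids is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Repeats the signature above so the sketch compiles on its own. */
typedef int (*ompt_get_place_proc_ids_t)(int place_num, int ids_size,
                                         int *ids);

/* Mock runtime (assumption): place 0 holds processors 4, 5, 6 and 7. */
int mock_get_place_proc_ids(int place_num, int ids_size, int *ids)
{
    static const int procs[] = {4, 5, 6, 7};
    const int n = (int)(sizeof procs / sizeof procs[0]);
    if (place_num != 0)
        return 0;
    if (ids_size >= n)
        for (int i = 0; i < n; ++i)
            ids[i] = procs[i];
    return n;       /* the count is returned even if ids is too small */
}

/* Returns a malloc'd array of processor identifiers and stores its
   length in *count. Returns NULL with *count == 0 if the place is
   empty, allocation fails, or the count changes between the calls. */
int *fetch_place_proc_ids(ompt_get_place_proc_ids_t get_ids, int place_num,
                          int *count)
{
    assert(get_ids != NULL && count != NULL);
    *count = 0;

    int probe[1];   /* contents are unspecified if it is too small */
    int needed = get_ids(place_num, 1, probe);
    if (needed <= 0)
        return NULL;

    int *ids = malloc((size_t)needed * sizeof *ids);
    if (ids == NULL)
        return NULL;

    int got = get_ids(place_num, needed, ids);
    if (got != needed) {
        free(ids);
        return NULL;
    }
    *count = got;
    return ids;     /* the caller releases this with free */
}
```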
30 19.6.1.9 ompt_get_place_num_t
31 Summary
32 The ompt_get_place_num_t type is the type signature of the ompt_get_place_num
33 runtime entry point, which returns the place number of the place to which the current thread is
34 bound.
8 19.6.1.10 ompt_get_partition_place_nums_t
9 Summary
10 The ompt_get_partition_place_nums_t type is the type signature of the
11 ompt_get_partition_place_nums runtime entry point, which returns a list of place
12 numbers that correspond to the places in the place-partition-var ICV of the innermost implicit task.
13 Format
C / C++
typedef int (*ompt_get_partition_place_nums_t) (
  int place_nums_size,
  int *place_nums
);
C / C++
18 Semantics
19 The ompt_get_partition_place_nums runtime entry point, which has type signature
20 ompt_get_partition_place_nums_t, returns a list of place numbers that correspond to
21 the places in the place-partition-var ICV of the innermost implicit task. This runtime entry point is
22 async signal safe.
23 Description of Arguments
24 The place_nums argument is an array in which the routine can return a vector of place identifiers.
25 The place_nums_size argument indicates the size of the result array that the place_nums argument
26 specifies.
27 Effect
28 If the place_nums array of size place_nums_size is large enough to contain all identifiers then they
29 are returned in place_nums and their order in the array is implementation defined. Otherwise, if the
30 place_nums array is too small, the values in place_nums when the function returns are unspecified.
31 The routine always returns the number of places in the place-partition-var ICV of the innermost
32 implicit task.
4 19.6.1.11 ompt_get_proc_id_t
5 Summary
6 The ompt_get_proc_id_t type is the type signature of the ompt_get_proc_id runtime
7 entry point, which returns the numerical identifier of the processor of the current thread.
8 Format
C / C++
typedef int (*ompt_get_proc_id_t) (void);
C / C++
10 Semantics
11 The ompt_get_proc_id runtime entry point, which has type signature
12 ompt_get_proc_id_t, returns the numerical identifier of the processor of the current thread.
13 A defined numerical identifier is non-negative, and its meaning is implementation defined. A
14 negative number indicates a failure to retrieve the numerical identifier. This runtime entry point is
15 async signal safe.
16 19.6.1.12 ompt_get_state_t
17 Summary
18 The ompt_get_state_t type is the type signature of the ompt_get_state runtime entry
19 point, which returns the state and the wait identifier of the current thread.
20 Format
C / C++
typedef int (*ompt_get_state_t) (
  ompt_wait_id_t *wait_id
);
C / C++
24 Semantics
25 Each OpenMP thread has an associated state and a wait identifier. If a thread’s state indicates that
26 the thread is waiting for mutual exclusion then its wait identifier contains an opaque handle that
27 indicates the data object upon which the thread is waiting. The ompt_get_state runtime entry
28 point, which has type signature ompt_get_state_t, retrieves the state and wait identifier of the
29 current thread. The returned value may be any one of the states predefined by ompt_state_t or
30 a value that represents an implementation-specific state. The tool may obtain a string representation
5 Description of Arguments
6 The wait_id argument is a pointer to an opaque handle that is available to receive the value of the
7 wait identifier of the thread. If wait_id is not NULL then the entry point assigns the value of the
8 wait identifier of the thread to the object to which wait_id points. If the returned state is not one of
9 the specified wait states then the value of the opaque object to which wait_id points is undefined
10 after the call.
11 Constraints on Arguments
12 The argument passed to the entry point must be a reference to a variable of the specified type or
13 NULL.
14 Cross References
15 • ompt_enumerate_states_t, see Section 19.6.1.1
16 • ompt_state_t, see Section 19.4.4.28
17 • ompt_wait_id_t, see Section 19.4.4.31
18 19.6.1.13 ompt_get_parallel_info_t
19 Summary
20 The ompt_get_parallel_info_t type is the type signature of the
21 ompt_get_parallel_info runtime entry point, which returns information about the parallel
22 region, if any, at the specified ancestor level for the current execution context.
23 Format
C / C++
typedef int (*ompt_get_parallel_info_t) (
  int ancestor_level,
  ompt_data_t **parallel_data,
  int *team_size
);
C / C++
29 Semantics
30 During execution, an OpenMP program may employ nested parallel regions. The
31 ompt_get_parallel_info runtime entry point, which has type signature
32 ompt_get_parallel_info_t, retrieves information about the current parallel region and any
33 enclosing parallel regions for the current execution context. The entry point returns 2 if a parallel
34 region exists at the specified ancestor level and the information is available, 1 if a parallel region
35 exists at the specified ancestor level but the information is currently unavailable, and 0 otherwise.
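The three return values can be used to walk outward through the enclosing parallel regions, as the following sketch shows. The typedefs are local stand-ins for declarations in omp-tools.h. The function mock_get_parallel_info is an assumed mock with three nested regions, of which the middle one reports that its information is currently unavailable, and count_parallel_levels is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for declarations that omp-tools.h normally provides. */
typedef union ompt_data_t {
    uint64_t value;
    void *ptr;
} ompt_data_t;

typedef int (*ompt_get_parallel_info_t)(int ancestor_level,
                                        ompt_data_t **parallel_data,
                                        int *team_size);

/* Mock runtime (assumption): three nested regions; level 1 exists but
   its information is currently unavailable. */
static ompt_data_t mock_regions[3];

int mock_get_parallel_info(int level, ompt_data_t **parallel_data,
                           int *team_size)
{
    static const int team_sizes[3] = {4, 2, 1};
    if (level < 0 || level >= 3)
        return 0;               /* no region at this level */
    if (level == 1)
        return 1;               /* region exists, info unavailable */
    *parallel_data = &mock_regions[level];
    *team_size = team_sizes[level];
    return 2;                   /* region exists, info available */
}

/* Returns the number of parallel regions found when walking from the
   current region (level 0) outward. If unavailable is not NULL, it
   receives the number of levels that returned 1. */
int count_parallel_levels(ompt_get_parallel_info_t get_info,
                          int *unavailable)
{
    assert(get_info != NULL);
    int depth = 0;
    int missing = 0;
    for (;;) {
        ompt_data_t *parallel_data = NULL;
        int team_size = 0;
        int rc = get_info(depth, &parallel_data, &team_size);
        if (rc == 0)
            break;              /* walked past the outermost region */
        if (rc == 1)
            ++missing;          /* outputs must not be read here */
        ++depth;
    }
    if (unavailable != NULL)
        *unavailable = missing;
    return depth;
}
```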
32 19.6.1.14 ompt_get_task_info_t
33 Summary
34 The ompt_get_task_info_t type is the type signature of the ompt_get_task_info
35 runtime entry point, which returns information about the task, if any, at the specified ancestor level
36 in the current execution context.
24 Cross References
25 • ompt_data_t, see Section 19.4.4.4
26 • ompt_frame_t, see Section 19.4.4.29
27 • ompt_task_flag_t, see Section 19.4.4.19
28 19.6.1.15 ompt_get_task_memory_t
29 Summary
30 The ompt_get_task_memory_t type is the type signature of the
31 ompt_get_task_memory runtime entry point, which returns information about memory ranges
32 that are associated with the task.
17 Description of Arguments
The addr argument points to a void pointer through which the entry point returns the start address
of a memory block.
The size argument points to a size type object through which the entry point returns the size of the
memory block.
21 The block argument is an integer value to specify the memory block of interest.
22 19.6.1.16 ompt_get_target_info_t
23 Summary
24 The ompt_get_target_info_t type is the type signature of the
25 ompt_get_target_info runtime entry point, which returns identifiers that specify a thread’s
26 current target region and target operation ID, if any.
27 Format
C / C++
typedef int (*ompt_get_target_info_t) (
  uint64_t *device_num,
  ompt_id_t *target_id,
  ompt_id_t *host_op_id
);
C / C++
8 Description of Arguments
9 The device_num argument returns the device number if the current thread is in a target region.
10 The target_id argument returns the target region identifier if the current thread is in a target
11 region.
12 If the current thread is in the process of initiating an operation on a target device (for example,
13 copying data to or from an accelerator or launching a kernel), then host_op_id returns the identifier
14 for the operation; otherwise, host_op_id returns ompt_id_none.
15 Constraints on Arguments
16 Arguments passed to the entry point must be valid references to variables of the specified types.
17 Cross References
18 • ompt_id_t, see Section 19.4.4.3
19 19.6.1.17 ompt_get_num_devices_t
20 Summary
21 The ompt_get_num_devices_t type is the type signature of the
22 ompt_get_num_devices runtime entry point, which returns the number of available devices.
23 Format
C / C++
typedef int (*ompt_get_num_devices_t) (void);
C / C++
25 Semantics
26 The ompt_get_num_devices runtime entry point, which has type signature
27 ompt_get_num_devices_t, returns the number of devices available to an OpenMP program.
28 This runtime entry point is async signal safe.
29 19.6.1.18 ompt_get_unique_id_t
30 Summary
31 The ompt_get_unique_id_t type is the type signature of the ompt_get_unique_id
32 runtime entry point, which returns a unique number.
8 19.6.1.19 ompt_finalize_tool_t
9 Summary
10 The ompt_finalize_tool_t type is the type signature of the ompt_finalize_tool
11 runtime entry point, which enables a tool to finalize itself.
12 Format
C / C++
typedef void (*ompt_finalize_tool_t) (void);
C / C++
14 Semantics
15 A tool may detect that the execution of an OpenMP program is ending before the OpenMP
16 implementation does. To facilitate clean termination of the tool, the tool may invoke the
17 ompt_finalize_tool runtime entry point, which has type signature
18 ompt_finalize_tool_t. Upon completion of ompt_finalize_tool, no OMPT
19 callbacks are dispatched.
20 Effect
21 The ompt_finalize_tool routine detaches the tool from the runtime, unregisters all callbacks
22 and invalidates all OMPT entry points passed to the tool in the lookup-function. Upon completion
23 of ompt_finalize_tool, no further callbacks will be issued on any thread. Before the
24 callbacks are unregistered, the OpenMP runtime should attempt to dispatch all outstanding
25 registered callbacks as well as the callbacks that would be encountered during shutdown of the
26 runtime, if possible in the current execution context.
16 Description of Arguments
17 The device argument is a pointer to an opaque object that represents the target device instance. The
18 pointer to the device instance object is used by functions in the device tracing interface to identify
19 the device being addressed.
20 Cross References
21 • ompt_device_t, see Section 19.4.4.5
22 19.6.2.2 ompt_get_device_time_t
23 Summary
24 The ompt_get_device_time_t type is the type signature of the
25 ompt_get_device_time runtime entry point, which returns the current time on the specified
26 device.
27 Format
C / C++
typedef ompt_device_time_t (*ompt_get_device_time_t) (
  ompt_device_t *device
);
C / C++
15 19.6.2.3 ompt_translate_time_t
16 Summary
17 The ompt_translate_time_t type is the type signature of the ompt_translate_time
18 runtime entry point, which translates a time value that is obtained from the specified device to a
19 corresponding time value on the host device.
20 Format
C / C++
typedef double (*ompt_translate_time_t) (
  ompt_device_t *device,
  ompt_device_time_t time
);
C / C++
25 Semantics
26 The ompt_translate_time runtime entry point, which has type signature
27 ompt_translate_time_t, translates a time value obtained from the specified device to a
28 corresponding time value on the host device. The returned value for the host time has the same
29 meaning as the value returned from omp_get_wtime.
30 Description of Arguments
31 The device argument is a pointer to an opaque object that represents the target device instance. The
32 pointer to the device instance object is used by functions in the device tracing interface to identify
33 the device being addressed.
34 The time argument is a time from the specified device.
5 19.6.2.4 ompt_set_trace_ompt_t
6 Summary
7 The ompt_set_trace_ompt_t type is the type signature of the ompt_set_trace_ompt
8 runtime entry point, which enables or disables the recording of trace records for one or more types
9 of OMPT events.
10 Format
C / C++
typedef ompt_set_result_t (*ompt_set_trace_ompt_t) (
  ompt_device_t *device,
  unsigned int enable,
  unsigned int etype
);
C / C++
16 Description of Arguments
17 The device argument points to an opaque object that represents the target device instance. Functions
18 in the device tracing interface use this pointer to identify the device that is being addressed.
19 The etype argument indicates the events to which the invocation of ompt_set_trace_ompt
20 applies. If the value of etype is 0 then the invocation applies to all events. If etype is positive then it
21 applies to the event in ompt_callbacks_t that matches that value.
22 The enable argument indicates whether tracing should be enabled or disabled for the event or events
23 that the etype argument specifies. A positive value for enable indicates that recording should be
24 enabled; a value of 0 for enable indicates that recording should be disabled.
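The selection rules for etype and enable can be sketched from a runtime's point of view. This is a minimal mock, not a real entry point: the names trace_state, NUM_EVENTS, and mock_set_trace_ompt are hypothetical, and the 1/0 return value is a simplification that stands in for ompt_set_result_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical number of traceable event kinds; a real implementation
 * would derive this from ompt_callbacks_t. */
#define NUM_EVENTS 8u

/* Hypothetical per-device tracing state: slot i holds the state for the
 * event whose ompt_callbacks_t value is i (slot 0 is unused). */
typedef struct trace_state {
    bool enabled[NUM_EVENTS + 1];
} trace_state;

/* Mock of the selection rules described above:
 * etype == 0 selects all events, a positive etype selects one event;
 * enable > 0 turns recording on, enable == 0 turns it off.
 * Returns 1 if the request was applied and 0 if etype is unknown. */
static int mock_set_trace_ompt(trace_state *st, unsigned int enable,
                               unsigned int etype)
{
    assert(st != NULL);
    bool on = enable > 0;
    if (etype == 0) {
        for (unsigned int e = 1; e <= NUM_EVENTS; ++e)
            st->enabled[e] = on;
        return 1;
    }
    if (etype > NUM_EVENTS)
        return 0;
    st->enabled[etype] = on;
    return 1;
}
```

A tool that wants all events except one could call the entry point with etype 0 and a positive enable, and then call it again with the unwanted event's value and enable 0.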
25 Restrictions
26 Restrictions on the ompt_set_trace_ompt runtime entry point are as follows:
27 • The entry point must not return ompt_set_sometimes_paired.
28 Cross References
29 • Callbacks, see Section 19.4.2
30 • Tracing Activity on Target Devices with OMPT, see Section 19.2.5
31 • ompt_device_t, see Section 19.4.4.5
32 • ompt_set_result_t, see Section 19.4.4.2
6 Format
C / C++
typedef ompt_set_result_t (*ompt_set_trace_native_t) (
  ompt_device_t *device,
  int enable,
  int flags
);
C / C++
12 Semantics
13 This interface is designed for use by a tool that cannot directly use native control functions for the
14 device. If a tool can directly use the native control functions then it can invoke native control
15 functions directly using pointers that the lookup function associated with the device provides and
16 that are described in the documentation string that is provided to the device initializer callback.
17 Description of Arguments
18 The device argument points to an opaque object that represents the target device instance. Functions
19 in the device tracing interface use this pointer to identify the device that is being addressed.
20 The enable argument indicates whether this invocation should enable or disable recording of events.
The flags argument specifies the kinds of native device monitoring to enable or to disable. Each
kind of monitoring is specified by a flag bit. Flags can be composed by using bitwise or to combine
enumeration values from type ompt_native_mon_flag_t.
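As an illustration of composing a flags value, the sketch below uses a local stand-in for ompt_native_mon_flag_t so that it compiles without omp-tools.h. The enumerator names and values shown are assumptions that are intended to mirror that header; a real tool should use the enumerators from the header itself.

```c
#include <assert.h>

/* Local stand-in for ompt_native_mon_flag_t; the enumerator values are
 * assumed to match omp-tools.h and are shown only for illustration. */
typedef enum native_mon_flag {
    native_data_motion_explicit = 0x01,
    native_data_motion_implicit = 0x02,
    native_kernel_invocation    = 0x04,
    native_kernel_execution     = 0x08
} native_mon_flag;

/* Compose a flags argument that selects kernel launches and kernel
 * execution by combining two flag bits. */
static int kernel_monitoring_flags(void)
{
    return native_kernel_invocation | native_kernel_execution;
}

/* Test whether one kind of monitoring is selected in a composed value. */
static int flag_is_set(int flags, native_mon_flag kind)
{
    return (flags & (int)kind) != 0;
}
```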
24 Restrictions
25 Restrictions on the ompt_set_trace_native runtime entry point are as follows:
26 • The entry point must not return ompt_set_sometimes_paired.
27 Cross References
28 • Tracing Activity on Target Devices with OMPT, see Section 19.2.5
29 • ompt_device_t, see Section 19.4.4.5
30 • ompt_native_mon_flag_t, see Section 19.4.4.18
31 • ompt_set_result_t, see Section 19.4.4.2
5 Format
C / C++
typedef int (*ompt_start_trace_t) (
  ompt_device_t *device,
  ompt_callback_buffer_request_t request,
  ompt_callback_buffer_complete_t complete
);
C / C++
11 Semantics
12 A device’s ompt_start_trace runtime entry point, which has type signature
13 ompt_start_trace_t, initiates tracing on the device. Under normal operating conditions,
14 every event buffer provided to a device by a tool callback is returned to the tool before the OpenMP
15 runtime shuts down. If an exceptional condition terminates execution of an OpenMP program, the
16 OpenMP runtime may not return buffers provided to the device. An invocation of
17 ompt_start_trace returns 1 if the command succeeds and 0 otherwise.
18 Description of Arguments
19 The device argument points to an opaque object that represents the target device instance. Functions
20 in the device tracing interface use this pointer to identify the device that is being addressed.
21 The request argument specifies a tool callback that supplies a buffer in which a device can deposit
22 events.
23 The complete argument specifies a tool callback that is invoked by the OpenMP implementation to
24 empty a buffer that contains event records.
25 Cross References
26 • ompt_callback_buffer_complete_t, see Section 19.5.2.24
27 • ompt_callback_buffer_request_t, see Section 19.5.2.23
28 • ompt_device_t, see Section 19.4.4.5
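A tool-side pair of request and complete callbacks might look like the following minimal sketch. The callback parameter lists are assumed from the types defined in Sections 19.5.2.23 and 19.5.2.24, which are not reproduced here; the type stand-ins, TRACE_BUFFER_BYTES, and the buffers_outstanding counter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Local stand-ins for the OMPT types so the sketch compiles without
 * omp-tools.h; a real tool would include that header instead. */
typedef void     ompt_buffer_t;
typedef uint64_t ompt_buffer_cursor_t;

#define TRACE_BUFFER_BYTES 4096u    /* hypothetical buffer size */

static int buffers_outstanding = 0; /* buffers handed out and not yet freed */

/* Request callback: supplies a buffer in which the device can deposit
 * events. On allocation failure it reports a size of zero. */
static void on_buffer_request(int device_num, ompt_buffer_t **buffer,
                              size_t *bytes)
{
    (void)device_num;
    assert(buffer != NULL && bytes != NULL);
    *buffer = malloc(TRACE_BUFFER_BYTES);
    *bytes = (*buffer != NULL) ? TRACE_BUFFER_BYTES : 0;
    if (*buffer != NULL)
        ++buffers_outstanding;
}

/* Completion callback: a tool would process the records that start at
 * begin here. If the tool owns the buffer again, it releases it. */
static void on_buffer_complete(int device_num, ompt_buffer_t *buffer,
                               size_t bytes, ompt_buffer_cursor_t begin,
                               int buffer_owned)
{
    (void)device_num; (void)bytes; (void)begin;
    if (buffer_owned) {
        free(buffer);
        --buffers_outstanding;
    }
}
```

Counting outstanding buffers is one way for a tool to check, at shutdown, that every buffer it provided was returned under normal operating conditions.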
29 19.6.2.7 ompt_pause_trace_t
30 Summary
31 The ompt_pause_trace_t type is the type signature of the ompt_pause_trace runtime
32 entry point, which pauses or restarts activity tracing on a specific device.
11 Description of Arguments
12 The device argument points to an opaque object that represents the target device instance. Functions
13 in the device tracing interface use this pointer to identify the device that is being addressed.
14 The begin_pause argument indicates whether to pause or to resume tracing. To resume tracing,
15 zero should be supplied for begin_pause; to pause tracing, any other value should be supplied.
16 Cross References
17 • ompt_device_t, see Section 19.4.4.5
18 19.6.2.8 ompt_flush_trace_t
19 Summary
20 The ompt_flush_trace_t type is the type signature of the ompt_flush_trace runtime
21 entry point, which causes all pending trace records for the specified device to be delivered.
22 Format
C / C++
typedef int (*ompt_flush_trace_t) (
  ompt_device_t *device
);
C / C++
26 Semantics
27 A device’s ompt_flush_trace runtime entry point, which has type signature
28 ompt_flush_trace_t, causes the OpenMP implementation to issue a sequence of zero or more
29 buffer completion callbacks to deliver all trace records that have been collected prior to the flush.
30 An invocation of ompt_flush_trace returns 1 if the command succeeds and 0 otherwise.
31 Description of Arguments
32 The device argument points to an opaque object that represents the target device instance. Functions
33 in the device tracing interface use this pointer to identify the device that is being addressed.
3 19.6.2.9 ompt_stop_trace_t
4 Summary
5 The ompt_stop_trace_t type is the type signature of the ompt_stop_trace runtime entry
6 point, which stops tracing for a device.
7 Format
C / C++
typedef int (*ompt_stop_trace_t) (
  ompt_device_t *device
);
C / C++
11 Semantics
12 A device’s ompt_stop_trace runtime entry point, which has type signature
13 ompt_stop_trace_t, halts tracing on the device and requests that any pending trace records be
14 flushed. An invocation of ompt_stop_trace returns 1 if the command succeeds and 0
15 otherwise.
16 Description of Arguments
17 The device argument points to an opaque object that represents the target device instance. Functions
18 in the device tracing interface use this pointer to identify the device that is being addressed.
19 Cross References
20 • ompt_device_t, see Section 19.4.4.5
21 19.6.2.10 ompt_advance_buffer_cursor_t
22 Summary
23 The ompt_advance_buffer_cursor_t type is the type signature of the
24 ompt_advance_buffer_cursor runtime entry point, which advances a trace buffer cursor to
25 the next record.
26 Format
C / C++
typedef int (*ompt_advance_buffer_cursor_t) (
  ompt_device_t *device,
  ompt_buffer_t *buffer,
  size_t size,
  ompt_buffer_cursor_t current,
  ompt_buffer_cursor_t *next
);
C / C++
6 Description of Arguments
The device argument points to an opaque object that represents the target device instance. Functions
in the device tracing interface use this pointer to identify the device that is being addressed.
The buffer argument indicates a trace buffer that is associated with the cursors.
The size argument indicates the size of buffer in bytes.
The current argument is an opaque buffer cursor.
The next argument returns the next value of an opaque buffer cursor.
13 Cross References
14 • ompt_buffer_cursor_t, see Section 19.4.4.8
15 • ompt_device_t, see Section 19.4.4.5
16 19.6.2.11 ompt_get_record_type_t
17 Summary
18 The ompt_get_record_type_t type is the type signature of the
19 ompt_get_record_type runtime entry point, which inspects the type of a trace record.
20 Format
C / C++
typedef ompt_record_t (*ompt_get_record_type_t) (
  ompt_buffer_t *buffer,
  ompt_buffer_cursor_t current
);
C / C++
25 Semantics
26 Trace records for a device may be in one of two forms: native record format, which may be
27 device-specific, or OMPT record format, in which each trace record corresponds to an OpenMP
28 event and most fields in the record structure are the arguments that would be passed to the OMPT
29 callback for the event. A device’s ompt_get_record_type runtime entry point, which has
30 type signature ompt_get_record_type_t, inspects the type of a trace record and indicates
31 whether the record at the current position in the trace buffer is an OMPT record, a native record, or
32 an invalid record. An invalid record type is returned if the cursor is out of bounds.
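The interplay of a record-type query and cursor advancement can be sketched with a mock buffer. Everything below is hypothetical: real trace buffers and cursors are opaque, the mock_ functions only imitate the entry points, and the convention that advancement returns 1 while another record follows is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t cursor_t;          /* stand-in for ompt_buffer_cursor_t */

typedef enum record_kind {          /* stand-in for ompt_record_t */
    record_ompt = 1,
    record_native = 2,
    record_invalid = 3
} record_kind;

/* Hypothetical buffer layout: a count followed by one kind tag per
 * record. The cursor in this mock is simply a record index. */
typedef struct mock_buffer {
    size_t count;
    record_kind kinds[4];
} mock_buffer;

/* Mock of the record-type query: out-of-bounds cursors are invalid. */
static record_kind mock_get_record_type(const mock_buffer *buf, cursor_t cur)
{
    assert(buf != NULL);
    return cur < buf->count ? buf->kinds[cur] : record_invalid;
}

/* Mock of cursor advancement: returns 1 and stores the next cursor if
 * another record follows, and returns 0 at the end of the buffer. */
static int mock_advance_cursor(const mock_buffer *buf, cursor_t cur,
                               cursor_t *next)
{
    if (cur + 1 >= buf->count)
        return 0;
    *next = cur + 1;
    return 1;
}

/* Walk the buffer from begin and count the OMPT-format records. */
static size_t count_ompt_records(const mock_buffer *buf, cursor_t begin)
{
    size_t n = 0;
    cursor_t cur = begin;
    if (mock_get_record_type(buf, cur) == record_invalid)
        return 0;
    do {
        if (mock_get_record_type(buf, cur) == record_ompt)
            ++n;
    } while (mock_advance_cursor(buf, cur, &cur));
    return n;
}
```

A buffer completion callback would typically run a loop of this shape, starting at the cursor it receives and dispatching on the record type at each position.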
8 19.6.2.12 ompt_get_record_ompt_t
9 Summary
10 The ompt_get_record_ompt_t type is the type signature of the
11 ompt_get_record_ompt runtime entry point, which obtains a pointer to an OMPT trace
12 record from a trace buffer associated with a device.
13 Format
C / C++
typedef ompt_record_ompt_t *(*ompt_get_record_ompt_t) (
  ompt_buffer_t *buffer,
  ompt_buffer_cursor_t current
);
C / C++
18 Semantics
19 A device’s ompt_get_record_ompt runtime entry point, which has type signature
20 ompt_get_record_ompt_t, returns a pointer that may point to a record in the trace buffer, or
21 it may point to a record in thread-local storage in which the information extracted from a record was
22 assembled. The information available for an event depends upon its type. The return value of the
23 ompt_record_ompt_t type includes a field of a union type that can represent information for
24 any OMPT event record type. Another call to the runtime entry point may overwrite the contents of
25 the fields in a record returned by a prior invocation.
26 Description of Arguments
27 The buffer argument indicates a trace buffer.
28 The current argument is an opaque buffer cursor.
29 Cross References
30 • Standard Trace Record Type, see Section 19.4.3.4
31 • ompt_buffer_cursor_t, see Section 19.4.4.8
32 • ompt_device_t, see Section 19.4.4.5
6 Format
C / C++
typedef void *(*ompt_get_record_native_t) (
  ompt_buffer_t *buffer,
  ompt_buffer_cursor_t current,
  ompt_id_t *host_op_id
);
C / C++
12 Semantics
13 A device’s ompt_get_record_native runtime entry point, which has type signature
14 ompt_get_record_native_t, returns a pointer that may point into the specified trace buffer,
15 or into thread-local storage in which the information extracted from a trace record was assembled.
16 The information available for a native event depends upon its type. If the function returns a non-null
17 result, it will also set the object to which host_op_id points to a host-side identifier for the
18 operation that is associated with the record. A subsequent call to ompt_get_record_native
19 may overwrite the contents of the fields in a record returned by a prior invocation.
20 Description of Arguments
21 The buffer argument indicates a trace buffer.
22 The current argument is an opaque buffer cursor.
23 The host_op_id argument is a pointer to an identifier that is returned by the function. The entry
24 point sets the identifier to which host_op_id points to the value of a host-side identifier for an
25 operation on a target device that was created when the operation was initiated by the host.
26 Cross References
27 • ompt_buffer_cursor_t, see Section 19.4.4.8
28 • ompt_buffer_t, see Section 19.4.4.7
29 • ompt_id_t, see Section 19.4.4.3
30 19.6.2.14 ompt_get_record_abstract_t
31 Summary
32 The ompt_get_record_abstract_t type is the type signature of the
33 ompt_get_record_abstract runtime entry point, which summarizes the context of a native
34 (device-specific) trace record.
10 Description of Arguments
11 The native_record argument is a pointer to a native trace record.
12 Cross References
13 • Native Record Abstract Type, see Section 19.4.3.3
18 Format
C / C++
typedef void (*ompt_interface_fn_t) (void);

typedef ompt_interface_fn_t (*ompt_function_lookup_t) (
  const char *interface_function_name
);
C / C++
24 Semantics
25 An OpenMP implementation provides pointers to lookup routines that provide pointers to OMPT
26 runtime entry points. When the implementation invokes a tool initializer to configure the OMPT
27 callback interface, it provides a lookup function that provides pointers to runtime entry points that
28 implement routines that are part of the OMPT callback interface. Alternatively, when it invokes a
29 tool initializer to configure the OMPT tracing interface for a device, it provides a lookup function
30 that provides pointers to runtime entry points that implement tracing control routines appropriate
31 for that device.
9 Description of Arguments
10 The interface_function_name argument is a C string that represents the name of a runtime entry
11 point.
12 Cross References
13 • Entry Points in the OMPT Callback Interface, see Section 19.6.1
14 • Entry Points in the OMPT Device Tracing Interface, see Section 19.6.2
15 • Tracing Activity on Target Devices with OMPT, see Section 19.2.5
16 • ompt_initialize_t, see Section 19.5.1.1
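The lookup pattern can be sketched as follows. The two lookup typedefs repeat the Format above; mock_lookup and mock_get_num_devices are hypothetical stand-ins for what an implementation provides, and returning NULL for an unrecognized name is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* These two typedefs repeat the Format given above. */
typedef void (*ompt_interface_fn_t)(void);
typedef ompt_interface_fn_t (*ompt_function_lookup_t)(
    const char *interface_function_name);

/* Signature of the entry point that this sketch looks up
 * (see Section 19.6.1.17). */
typedef int (*ompt_get_num_devices_t)(void);

/* Hypothetical runtime-side entry point; the value 2 is arbitrary. */
static int mock_get_num_devices(void) { return 2; }

/* Hypothetical lookup function: maps a name to an entry point, or
 * returns NULL when the name is not recognized. */
static ompt_interface_fn_t mock_lookup(const char *interface_function_name)
{
    assert(interface_function_name != NULL);
    if (strcmp(interface_function_name, "ompt_get_num_devices") == 0)
        return (ompt_interface_fn_t)mock_get_num_devices;
    return NULL;
}

/* Tool side: resolve an entry point by name and cast it to its real
 * type before calling it. Returns -1 if the entry point is unavailable. */
static int query_num_devices(ompt_function_lookup_t lookup)
{
    ompt_get_num_devices_t get_num_devices =
        (ompt_get_num_devices_t)lookup("ompt_get_num_devices");
    return get_num_devices != NULL ? get_num_devices() : -1;
}
```

Because the lookup function returns the generic ompt_interface_fn_t type, the tool must cast each result to the type signature of the entry point that it requested before it invokes the entry point.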
location. The location can, but need not, be a function; it can, for example, simply be a label.
However, the names of the locations must have external C linkage.
19 Cross References
20 • OMP_DEBUG, see Section 21.4.1
21 20.2.2 ompd_dll_locations
22 Summary
23 The ompd_dll_locations global variable points to the locations of OMPD libraries that are
24 compatible with the OpenMP implementation.
25 Format
C
extern const char **ompd_dll_locations;
C
23 Cross References
24 • ompd_dll_locations_valid, see Section 20.2.3
25 20.2.3 ompd_dll_locations_valid
26 Summary
27 The OpenMP runtime notifies third-party tools that ompd_dll_locations is valid by allowing
28 execution to pass through a location that the symbol ompd_dll_locations_valid identifies.
29 Format
C
void ompd_dll_locations_valid(void);
C
31 Semantics
32 Since ompd_dll_locations may not be a static variable, it may require runtime initialization.
33 The OpenMP runtime notifies third-party tools that ompd_dll_locations is valid by having
34 execution pass through a location that the symbol ompd_dll_locations_valid identifies. If
35 ompd_dll_locations is NULL, a third-party tool can place a breakpoint at
36 ompd_dll_locations_valid to be notified that ompd_dll_locations is initialized. In
37 practice, the symbol ompd_dll_locations_valid may not be a function; instead, it may be a
38 labeled machine instruction through which execution passes once the vector is valid.
17 Cross References
18 • ompt_wait_id_t, see Section 19.4.4.31
18 Cross References
19 • Basic Value Types, see Section 20.3.3
10 Format
C / C++
typedef struct ompd_device_type_sizes_t {
  uint8_t sizeof_char;
  uint8_t sizeof_short;
  uint8_t sizeof_int;
  uint8_t sizeof_long;
  uint8_t sizeof_long_long;
  uint8_t sizeof_pointer;
} ompd_device_type_sizes_t;
C / C++
19 Semantics
20 The ompd_device_type_sizes_t type is used in operations through which the OMPD
21 library can interrogate the third-party tool about the size of primitive types for the target
22 architecture of the OpenMP runtime, as returned by the sizeof operator. The fields of
23 ompd_device_type_sizes_t give the sizes of the eponymous basic types used by the
24 OpenMP runtime. As the third-party tool and the OMPD library, by definition, execute on the same
25 architecture, the size of the fields can be given as uint8_t.
26 Cross References
27 • ompd_callback_sizeof_fn_t, see Section 20.4.2.2
4 20.4.1.1 ompd_callback_memory_alloc_fn_t
5 Summary
6 The ompd_callback_memory_alloc_fn_t type is the type signature of the callback routine
7 that the third-party tool provides to the OMPD library to allocate memory.
8 Format
C
typedef ompd_rc_t (*ompd_callback_memory_alloc_fn_t) (
  ompd_size_t nbytes,
  void **ptr
);
C
13 Semantics
14 The ompd_callback_memory_alloc_fn_t type is the type signature of the memory
15 allocation callback routine that the third-party tool provides. The OMPD library may call the
16 ompd_callback_memory_alloc_fn_t callback function to allocate memory.
17 Description of Arguments
18 The nbytes argument is the size in bytes of the block of memory to allocate.
19 The address of the newly allocated block of memory is returned in the location to which the ptr
20 argument points. The newly allocated block is suitably aligned for any type of variable and is not
21 guaranteed to be set to zero.
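A third-party tool could implement this callback on top of malloc, as in the minimal sketch below. The rc_t enumeration is a local stand-in for the OMPD return code type, its numeric values are assumptions, and tool_alloc_memory is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Local stand-ins so the sketch compiles without omp-tools.h. The return
 * codes are a subset chosen for illustration; their numeric values are
 * assumptions. */
typedef uint64_t ompd_size_t;
typedef enum { rc_ok = 0, rc_bad_input = 3, rc_nomem = 10 } rc_t;

/* Allocation callback: returns the address of the new block in *ptr. */
static rc_t tool_alloc_memory(ompd_size_t nbytes, void **ptr)
{
    if (ptr == NULL)
        return rc_bad_input;
    /* malloc returns storage that is suitably aligned for any object
     * type and does not zero it, which matches the description above. */
    *ptr = malloc((size_t)nbytes);
    return *ptr != NULL ? rc_ok : rc_nomem;
}
```

Note that malloc may return a null pointer for a request of zero bytes, which this sketch reports as an out-of-memory condition.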
25 Cross References
26 • Return Code Types, see Section 20.3.12
27 • Size Type, see Section 20.3.1
28 • The Callback Interface, see Section 20.4.6
29 20.4.1.2 ompd_callback_memory_free_fn_t
30 Summary
31 The ompd_callback_memory_free_fn_t type is the type signature of the callback routine
32 that the third-party tool provides to the OMPD library to deallocate memory.
10 Description of Arguments
11 The ptr argument is the address of the block to be deallocated.
15 Cross References
16 • Return Code Types, see Section 20.3.12
17 • The Callback Interface, see Section 20.4.6
18 • ompd_callback_memory_alloc_fn_t, see Section 20.4.1.1
23 20.4.2.1 ompd_callback_get_thread_context_for_thread_id_fn_t
24 Summary
The ompd_callback_get_thread_context_for_thread_id_fn_t type is the type
signature of the callback routine that the third-party tool provides to the OMPD library to map a
native thread identifier to a third-party tool thread context.
28 Restrictions
29 Restrictions on routines that use
30 ompd_callback_get_thread_context_for_thread_id_fn_t are as follows:
31 • The provided thread_context must be valid until the OMPD library returns from the OMPD
32 third-party tool interface routine.
7 20.4.2.2 ompd_callback_sizeof_fn_t
8 Summary
9 The ompd_callback_sizeof_fn_t type is the type signature of the callback routine that the
10 third-party tool provides to the OMPD library to determine the sizes of the primitive types in an
11 address space.
12 Format
C
typedef ompd_rc_t (*ompd_callback_sizeof_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_device_type_sizes_t *sizes
);
C
17 Semantics
The ompd_callback_sizeof_fn_t type is the type signature of the type-size query callback
routine that the third-party tool provides. This callback provides the sizes of the basic primitive
types for a given address space.
21 Description of Arguments
22 The callback returns the sizes of the basic primitive types used by the address space context that the
23 address_space_context argument specifies in the location to which the sizes argument points.
27 Cross References
28 • Primitive Type Sizes, see Section 20.3.13
29 • Return Code Types, see Section 20.3.12
30 • The Callback Interface, see Section 20.4.6
31 • Tool Context Types, see Section 20.3.11
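For the simple case in which the OpenMP program runs on the same architecture as the third-party tool, the type-size callback can be sketched with the sizeof operator. The structure repeats the format from the Primitive Type Sizes section; the opaque context type, the rc_t values, and the name tool_sizeof_type are hypothetical. A tool that examines a different architecture would have to obtain the sizes from its own description of the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Repeats the format of ompd_device_type_sizes_t. */
typedef struct ompd_device_type_sizes_t {
    uint8_t sizeof_char;
    uint8_t sizeof_short;
    uint8_t sizeof_int;
    uint8_t sizeof_long;
    uint8_t sizeof_long_long;
    uint8_t sizeof_pointer;
} ompd_device_type_sizes_t;

/* Hypothetical stand-ins for the tool context and return code types. */
typedef struct address_space_context address_space_context;
typedef enum { rc_ok = 0, rc_bad_input = 3 } rc_t;

/* Type-size callback for a target that shares the tool's architecture
 * (an assumption of this sketch). */
static rc_t tool_sizeof_type(address_space_context *context,
                             ompd_device_type_sizes_t *sizes)
{
    (void)context; /* only one address space is modeled here */
    if (sizes == NULL)
        return rc_bad_input;
    sizes->sizeof_char      = (uint8_t)sizeof(char);
    sizes->sizeof_short     = (uint8_t)sizeof(short);
    sizes->sizeof_int       = (uint8_t)sizeof(int);
    sizes->sizeof_long      = (uint8_t)sizeof(long);
    sizes->sizeof_long_long = (uint8_t)sizeof(long long);
    sizes->sizeof_pointer   = (uint8_t)sizeof(void *);
    return rc_ok;
}
```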
6 20.4.3.1 ompd_callback_symbol_addr_fn_t
7 Summary
8 The ompd_callback_symbol_addr_fn_t type is the type signature of the callback that the
9 third-party tool provides to look up the addresses of symbols in an OpenMP program.
10 Format
C
typedef ompd_rc_t (*ompd_callback_symbol_addr_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_thread_context_t *thread_context,
  const char *symbol_name,
  ompd_address_t *symbol_addr,
  const char *file_name
);
C
18 Semantics
The ompd_callback_symbol_addr_fn_t type is the type signature of the symbol-address query
callback routine that the third-party tool provides. This callback looks up addresses of symbols
within a specified address space.
22 Description of Arguments
23 This callback looks up the symbol provided in the symbol_name argument.
24 The address_space_context argument is the third-party tool’s representation of the address space of
25 the process, core file, or device.
26 The thread_context argument is NULL for global memory accesses. If thread_context is not NULL,
27 thread_context gives the thread-specific context for the symbol lookup for the purpose of
28 calculating thread local storage addresses. In this case, the thread to which thread_context refers
29 must be associated with either the process or the device that corresponds to the
30 address_space_context argument.
31 The third-party tool uses the symbol_name argument that the OMPD library supplies verbatim. In
32 particular, no name mangling, demangling or other transformations are performed prior to the
33 lookup. The symbol_name parameter must correspond to a statically allocated symbol within the
34 specified address space. The symbol can correspond to any type of object, such as a variable,
35 thread local storage variable, function, or untyped label. The symbol can have local, global, or
36 weak binding.
18 Restrictions
19 Restrictions on routines that use the ompd_callback_symbol_addr_fn_t type are as
20 follows:
21 • The address_space_context argument must be non-null.
22 • The symbol that the symbol_name argument specifies must be defined.
23 Cross References
24 • Address Type, see Section 20.3.4
25 • Return Code Types, see Section 20.3.12
26 • The Callback Interface, see Section 20.4.6
27 • Tool Context Types, see Section 20.3.11
28 20.4.3.2 ompd_callback_memory_read_fn_t
29 Summary
30 The ompd_callback_memory_read_fn_t type is the type signature of the callback that the
31 third-party tool provides to read data (read_memory) or a string (read_string) from an OpenMP
32 program.
17 Description of Arguments
18 The address from which the data are to be read in the OpenMP program that
19 address_space_context specifies is given by addr. The nbytes argument is the number of bytes to
20 be transferred. The thread_context argument for global memory accesses should be NULL. If it is
21 non-null, thread_context identifies the thread-specific context for the memory access for the
22 purpose of accessing thread local storage.
23 The data are returned through buffer, which is allocated and owned by the OMPD library. The
24 contents of the buffer are unstructured, raw bytes. The OMPD library must arrange for any
25 transformations such as byte-swapping that may be necessary (see Section 20.4.4) to interpret the
26 data.
8 20.4.3.3 ompd_callback_memory_write_fn_t
9 Summary
10 The ompd_callback_memory_write_fn_t type is the type signature of the callback that
11 the third-party tool provides to write data to an OpenMP program.
12 Format
C
typedef ompd_rc_t (*ompd_callback_memory_write_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_thread_context_t *thread_context,
  const ompd_address_t *addr,
  ompd_size_t nbytes,
  const void *buffer
);
C
20 Semantics
The ompd_callback_memory_write_fn_t type is the type signature of the write callback
routine that the third-party tool provides. The OMPD library may call this callback to have the
third-party tool write a block of data to a location within an address space from a provided buffer.
24 Description of Arguments
25 The address to which the data are to be written in the OpenMP program that address_space_context
26 specifies is given by addr. The nbytes argument is the number of bytes to be transferred. The
27 thread_context argument for global memory accesses should be NULL. If it is non-null, then
28 thread_context identifies the thread-specific context for the memory access for the purpose of
29 accessing thread local storage.
30 The data to be written are passed through buffer, which is allocated and owned by the OMPD
31 library. The contents of the buffer are unstructured, raw bytes. The OMPD library must arrange for
32 any transformations such as byte-swapping that may be necessary (see Section 20.4.4) to render the
33 data into a form that is compatible with the OpenMP runtime.
4 Cross References
5 • Address Type, see Section 20.3.4
6 • Data Format Conversion: ompd_callback_device_host_fn_t, see Section 20.4.4
7 • Return Code Types, see Section 20.3.12
8 • Size Type, see Section 20.3.1
9 • The Callback Interface, see Section 20.4.6
10 • Tool Context Types, see Section 20.3.11
17 Format
C
typedef ompd_rc_t (*ompd_callback_device_host_fn_t) (
  ompd_address_space_context_t *address_space_context,
  const void *input,
  ompd_size_t unit_size,
  ompd_size_t count,
  void *output
);
C
25 Semantics
The architecture on which the third-party tool and the OMPD library execute may be different from
the architecture on which the OpenMP program that is being examined executes. Thus, the
conventions for representing data may differ. The callback interface includes operations to convert
between the conventions, such as the byte order (endianness), that the third-party tool and OMPD
library use and the ones that the OpenMP program uses. The callback with the
ompd_callback_device_host_fn_t type signature converts data between the formats.
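For a host and a target that differ only in byte order, a conversion callback could reverse the bytes of each unit, as in the sketch below. The stand-in types, the rc_t values, and the name swap_byte_order are hypothetical, and the sketch assumes that the input and output buffers do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins so the sketch compiles without omp-tools.h. */
typedef uint64_t ompd_size_t;
typedef enum { rc_ok = 0, rc_bad_input = 3 } rc_t;
typedef struct address_space_context address_space_context;

/* Convert count units of unit_size bytes each by reversing the bytes of
 * every unit. Assumes that host and target differ only in byte order and
 * that input and output do not overlap. */
static rc_t swap_byte_order(address_space_context *context,
                            const void *input, ompd_size_t unit_size,
                            ompd_size_t count, void *output)
{
    const unsigned char *in = input;
    unsigned char *out = output;
    (void)context; /* only one address space is modeled here */
    if (in == NULL || out == NULL)
        return rc_bad_input;
    for (ompd_size_t u = 0; u < count; ++u)
        for (ompd_size_t b = 0; b < unit_size; ++b)
            out[u * unit_size + b] = in[u * unit_size + (unit_size - 1 - b)];
    return rc_ok;
}
```

Because byte reversal is its own inverse, a tool in this situation could supply the same routine for both conversion directions. When host and target share a byte order, a plain copy would suffice.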
16 20.4.5 ompd_callback_print_string_fn_t
17 Summary
18 The ompd_callback_print_string_fn_t type is the type signature of the callback that
19 the third-party tool provides so that the OMPD library can emit output.
20 Format
C
typedef ompd_rc_t (*ompd_callback_print_string_fn_t) (
  const char *string,
  int category
);
C
25 Semantics
The OMPD library may call the ompd_callback_print_string_fn_t callback function to
emit output, such as logging or debug information. The third-party tool may set the
ompd_callback_print_string_fn_t callback function to NULL to prevent the OMPD
library from emitting output. The OMPD library must not write to file descriptors that it did not
open.
31 Description of Arguments
32 The string argument is the null-terminated string to be printed. No conversion or formatting is
33 performed on the string.
34 The category argument is the implementation-defined category of the string to be printed.
4 Cross References
5 • Return Code Types, see Section 20.3.12
6 • The Callback Interface, see Section 20.4.6
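A minimal tool-side implementation of the print callback might forward the string unchanged to the tool's own error stream. In this sketch the rc_t values and the name tool_print_string are hypothetical, and the category is ignored because its meaning is implementation defined.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the OMPD return code type; the numeric
 * values are assumptions. */
typedef enum { rc_ok = 0, rc_bad_input = 3, rc_error = 4 } rc_t;

/* Print callback: writes the null-terminated string verbatim. fputs
 * performs no conversion or formatting and appends no newline. */
static rc_t tool_print_string(const char *string, int category)
{
    (void)category; /* implementation defined; ignored in this sketch */
    if (string == NULL)
        return rc_bad_input;
    return fputs(string, stderr) == EOF ? rc_error : rc_ok;
}
```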
12 Format
C
typedef struct ompd_callbacks_t {
  ompd_callback_memory_alloc_fn_t alloc_memory;
  ompd_callback_memory_free_fn_t free_memory;
  ompd_callback_print_string_fn_t print_string;
  ompd_callback_sizeof_fn_t sizeof_type;
  ompd_callback_symbol_addr_fn_t symbol_addr_lookup;
  ompd_callback_memory_read_fn_t read_memory;
  ompd_callback_memory_write_fn_t write_memory;
  ompd_callback_memory_read_fn_t read_string;
  ompd_callback_device_host_fn_t device_to_host;
  ompd_callback_device_host_fn_t host_to_device;
  ompd_callback_get_thread_context_for_thread_id_fn_t
    get_thread_context_for_thread_id;
} ompd_callbacks_t;
C
27 Semantics
28 The set of callbacks that the OMPD library must use is collected in the ompd_callbacks_t
29 structure. An instance of this type is passed to the OMPD library as a parameter to
30 ompd_initialize (see Section 20.5.1.1). Each field points to a function that the OMPD library
31 must use either to interact with the OpenMP program or for memory operations.
32 The alloc_memory and free_memory fields are pointers to functions the OMPD library uses to
33 allocate and to release dynamic memory.
34 The print_string field points to a function that prints a string.
35 The architecture on which the OMPD library and third-party tool execute may be different from the
36 architecture on which the OpenMP program that is being examined executes. The sizeof_type field
13 Cross References
14 • Data Format Conversion: ompd_callback_device_host_fn_t, see Section 20.4.4
15 • ompd_callback_get_thread_context_for_thread_id_fn_t, see
16 Section 20.4.2.1
17 • ompd_callback_memory_alloc_fn_t, see Section 20.4.1.1
18 • ompd_callback_memory_free_fn_t, see Section 20.4.1.2
19 • ompd_callback_memory_read_fn_t, see Section 20.4.3.2
20 • ompd_callback_memory_write_fn_t, see Section 20.4.3.3
21 • ompd_callback_print_string_fn_t, see Section 20.4.5
22 • ompd_callback_sizeof_fn_t, see Section 20.4.2.2
23 • ompd_callback_symbol_addr_fn_t, see Section 20.4.3.1
16 20.5.1.1 ompd_initialize
17 Summary
18 The ompd_initialize function initializes the OMPD library.
19 Format
C
ompd_rc_t ompd_initialize(
  ompd_word_t api_version,
  const ompd_callbacks_t *callbacks
);
C
24 Semantics
25 A tool that uses OMPD calls ompd_initialize to initialize each OMPD library that it loads.
26 More than one library may be present in a third-party tool, such as a debugger, because the tool
27 may control multiple devices, which may use different runtime systems that require different
28 OMPD libraries. This initialization must be performed exactly once before the tool can begin to
29 operate on an OpenMP process or core file.
30 Description of Arguments
31 The api_version argument is the OMPD API version that the tool requests to use. The tool may call
32 ompd_get_api_version to obtain the latest OMPD API version that the OMPD library
33 supports.
14 20.5.1.2 ompd_get_api_version
15 Summary
16 The ompd_get_api_version function returns the OMPD API version.
17 Format
C
ompd_rc_t ompd_get_api_version(ompd_word_t *version);
C
19 Semantics
20 The tool may call the ompd_get_api_version function to obtain the latest OMPD API
21 version number of the OMPD library. The OMPD API version number is equal to the value of the
22 _OPENMP macro defined in the associated OpenMP implementation, if the C preprocessor is
23 supported. If the associated OpenMP implementation compiles Fortran codes without the use of a
24 C preprocessor, the OMPD API version number is equal to the value of the Fortran integer
25 parameter openmp_version.
26 Description of Arguments
27 The latest version number is returned into the location to which the version argument points.
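Because the version number follows the _OPENMP macro, a tool can split it into a year and a month for display. The following is a minimal sketch that assumes the usual yyyymm layout of that macro; the helper name decode_version_word is hypothetical and is not part of OMPD.

```c
/* Split a version word in the yyyymm layout into year and month.
 * Returns 0 on success and -1 if the value does not look like yyyymm. */
static int decode_version_word(long long version, int *year, int *month) {
    long long m = version % 100;
    long long y = version / 100;
    if (version <= 0 || m < 1 || m > 12)
        return -1;
    *year = (int)y;
    *month = (int)m;
    return 0;
}
```

For example, the value 202111 decodes to year 2021 and month 11.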
30 Cross References
31 • Return Code Types, see Section 20.3.12
25 Cross References
26 • Return Code Types, see Section 20.3.12
27 20.5.1.4 ompd_finalize
28 Summary
29 When the tool is finished with the OMPD library it should call ompd_finalize before it
30 unloads the library.
31 Format
C
ompd_rc_t ompd_finalize(void);
C
11 Cross References
12 • Return Code Types, see Section 20.3.12
18 Format
C
ompd_rc_t ompd_process_initialize(
  ompd_address_space_context_t *context,
  ompd_address_space_handle_t **host_handle
);
C
23 Semantics
24 A tool calls ompd_process_initialize to obtain an address space handle for the host device
25 when it initializes a session on a live process or core file. On return from
26 ompd_process_initialize, the tool owns the address space handle, which it must release
27 with ompd_rel_address_space_handle. The initialization function must be called before
28 any OMPD operations are performed on the OpenMP process or core file. This call allows the
29 OMPD library to confirm that it can handle the OpenMP process or core file that context identifies.
30 Description of Arguments
31 The context argument is an opaque handle that the tool provides to address an address space from
32 the host device. On return, the host_handle argument provides an opaque handle to the tool for this
33 address space, which the tool must release when it is no longer needed.
6 Cross References
7 • OMPD Handle Types, see Section 20.3.8
8 • Return Code Types, see Section 20.3.12
9 • Tool Context Types, see Section 20.3.11
10 • ompd_rel_address_space_handle, see Section 20.5.2.3
11 20.5.2.2 ompd_device_initialize
12 Summary
13 A tool calls ompd_device_initialize to obtain an address space handle for a non-host
14 device that has at least one active target region.
15 Format
C
ompd_rc_t ompd_device_initialize(
  ompd_address_space_handle_t *host_handle,
  ompd_address_space_context_t *device_context,
  ompd_device_t kind,
  ompd_size_t sizeof_id,
  void *id,
  ompd_address_space_handle_t **device_handle
);
C
24 Semantics
25 A tool calls ompd_device_initialize to obtain an address space handle for a non-host
26 device that has at least one active target region. On return from ompd_device_initialize,
27 the tool owns the address space handle.
28 Description of Arguments
29 The host_handle argument is an opaque handle that the tool provides to reference the host device
30 address space associated with an OpenMP process or core file. The device_context argument is an
31 opaque handle that the tool provides to reference a non-host device address space. The kind,
32 sizeof_id, and id arguments represent a device identifier. On return the device_handle argument
33 provides an opaque handle to the tool for this address space.
5 Cross References
6 • OMPD Handle Types, see Section 20.3.8
7 • Return Code Types, see Section 20.3.12
8 • Size Type, see Section 20.3.1
9 • System Device Identifiers, see Section 20.3.6
10 • Tool Context Types, see Section 20.3.11
11 20.5.2.3 ompd_rel_address_space_handle
12 Summary
13 A tool calls ompd_rel_address_space_handle to release an address space handle.
14 Format
C
ompd_rc_t ompd_rel_address_space_handle(
  ompd_address_space_handle_t *handle
);
C
18 Semantics
19 When the tool is finished with the OpenMP process address space handle it should call
20 ompd_rel_address_space_handle to release the handle, which allows the OMPD library
21 to release any resources that it has related to the address space.
22 Description of Arguments
23 The handle argument is an opaque handle for the address space to be released.
24 Restrictions
25 Restrictions to the ompd_rel_address_space_handle routine are as follows:
26 • An address space context must not be used after the corresponding address space handle is
27 released.
4 20.5.2.4 ompd_get_device_thread_id_kinds
5 Summary
6 The ompd_get_device_thread_id_kinds function returns a list of supported native
7 thread identifier kinds and a corresponding list of their respective sizes.
8 Format
C
ompd_rc_t ompd_get_device_thread_id_kinds(
  ompd_address_space_handle_t *device_handle,
  ompd_thread_id_t **kinds,
  ompd_size_t **thread_id_sizes,
  int *count
);
C
15 Semantics
16 The ompd_get_device_thread_id_kinds function returns an array of supported native
17 thread identifier kinds and a corresponding array of their respective sizes for a given device. The
18 OMPD library allocates storage for the arrays with the memory allocation callback that the tool
19 provides. Each supported native thread identifier kind is guaranteed to be recognizable by the
20 OMPD library and may be mapped to and from any OpenMP thread that executes on the device.
21 The third-party tool owns the storage for the array of kinds and the array of sizes that is returned via
22 the kinds and thread_id_sizes arguments, and it is responsible for freeing that storage.
23 Description of Arguments
24 The device_handle argument is a pointer to an opaque address space handle that represents a host
25 device (returned by ompd_process_initialize) or a non-host device (returned by
26 ompd_device_initialize). On return, the kinds argument is the address of a pointer to an
27 array of native thread identifier kinds, the thread_id_sizes argument is the address of a pointer to an
28 array of the corresponding native thread identifier sizes used by the OMPD library, and the count
29 argument is the address of a variable that indicates the sizes of the returned arrays.
15 Format
C
ompd_rc_t ompd_get_omp_version(
  ompd_address_space_handle_t *address_space,
  ompd_word_t *omp_version
);
C
20 Semantics
21 The tool may call the ompd_get_omp_version function to obtain the version of the OpenMP
22 API that is associated with the address space.
23 Description of Arguments
24 The address_space argument is an opaque handle that the tool provides to reference the address
25 space of the OpenMP process or device.
Upon return, the location to which the omp_version argument points contains the version of the
OpenMP runtime in the _OPENMP version macro format.
4 20.5.4.2 ompd_get_omp_version_string
5 Summary
6 The ompd_get_omp_version_string function returns a descriptive string for the OpenMP
7 API version that is associated with an address space.
8 Format
C
ompd_rc_t ompd_get_omp_version_string(
  ompd_address_space_handle_t *address_space,
  const char **string
);
C
13 Semantics
14 After initialization, the tool may call the ompd_get_omp_version_string function to obtain
15 the version of the OpenMP API that is associated with an address space.
16 Description of Arguments
17 The address_space argument is an opaque handle that the tool provides to reference the address
18 space of the OpenMP process or device. A pointer to a descriptive version string is placed into the
19 location to which the string output argument points. After returning from the call, the tool owns the
20 string. The OMPD library must use the memory allocation callback that the tool provides to
21 allocate the string storage. The tool is responsible for releasing the memory.
22 Description of Return Codes
23 This routine must return any of the general return codes listed at the beginning of Section 20.5.
24 Cross References
25 • OMPD Handle Types, see Section 20.3.8
26 • Return Code Types, see Section 20.3.12
25 Cross References
26 • OMPD Handle Types, see Section 20.3.8
27 • Return Code Types, see Section 20.3.12
28 • ompd_get_icv_from_scope, see Section 20.5.10.2
29 20.5.5.2 ompd_get_thread_handle
30 Summary
31 The ompd_get_thread_handle function maps a native thread to an OMPD thread handle.
31 20.5.5.3 ompd_rel_thread_handle
32 Summary
33 The ompd_rel_thread_handle function releases a thread handle.
15 20.5.5.4 ompd_thread_handle_compare
16 Summary
17 The ompd_thread_handle_compare function allows tools to compare two thread handles.
18 Format
C
ompd_rc_t ompd_thread_handle_compare(
  ompd_thread_handle_t *thread_handle_1,
  ompd_thread_handle_t *thread_handle_2,
  int *cmp_value
);
C
24 Semantics
The internal structure of thread handles is opaque to a tool. While the tool can easily compare
pointers to thread handles, it cannot determine whether handles at two different addresses refer to
the same underlying thread. The ompd_thread_handle_compare function compares thread
handles.
29 On success, ompd_thread_handle_compare returns in the location to which cmp_value
30 points a signed integer value that indicates how the underlying threads compare: a value less than,
31 equal to, or greater than 0 indicates that the thread corresponding to thread_handle_1 is,
32 respectively, less than, equal to, or greater than that corresponding to thread_handle_2.
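The three-way result can be reduced to an equality test on the tool side. The sketch below is illustrative only: the handle type and the comparator demo_compare are mock stand-ins with the same shape as ompd_thread_handle_compare, and the helper same_thread is hypothetical.

```c
#include <stddef.h>

/* Local stand-ins; a real tool would use the OMPD header types instead. */
typedef enum { ompd_rc_ok, ompd_rc_bad_input } ompd_rc_t;
typedef struct demo_thread_handle { int native_id; } demo_thread_handle_t;

/* Function pointer with the same shape as ompd_thread_handle_compare. */
typedef ompd_rc_t (*demo_compare_fn)(demo_thread_handle_t *,
                                     demo_thread_handle_t *, int *);

/* Mock comparator that orders handles by a made-up native identifier. */
static ompd_rc_t demo_compare(demo_thread_handle_t *a,
                              demo_thread_handle_t *b, int *cmp_value) {
    if (a == NULL || b == NULL || cmp_value == NULL)
        return ompd_rc_bad_input;
    *cmp_value = (a->native_id > b->native_id) - (a->native_id < b->native_id);
    return ompd_rc_ok;
}

/* Returns 1 if both handles denote the same thread, 0 if they do not,
 * and -1 if the comparison itself failed. */
static int same_thread(demo_compare_fn cmp, demo_thread_handle_t *a,
                       demo_thread_handle_t *b) {
    int cmp_value = 0;
    if (cmp(a, b, &cmp_value) != ompd_rc_ok)
        return -1;
    return cmp_value == 0;
}
```

Two handles stored at different addresses compare as equal here whenever the mock identifier matches, which is the case that pointer comparison alone cannot detect.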
6 Cross References
7 • OMPD Handle Types, see Section 20.3.8
8 • Return Code Types, see Section 20.3.12
9 20.5.5.5 ompd_get_thread_id
10 Summary
11 The ompd_get_thread_id function maps an OMPD thread handle to a native thread.
12 Format
C
ompd_rc_t ompd_get_thread_id(
  ompd_thread_handle_t *thread_handle,
  ompd_thread_id_t kind,
  ompd_size_t sizeof_thread_id,
  void *thread_id
);
C
19 Semantics
20 The ompd_get_thread_id function maps an OMPD thread handle to a native thread identifier.
21 This call yields meaningful results only if the referenced OpenMP thread is stopped.
22 Description of Arguments
23 The thread_handle argument is an opaque thread handle. The kind argument represents the native
24 thread identifier. The sizeof_thread_id argument represents the size of the native thread identifier.
25 On return, the thread_id argument is a buffer that represents a native thread identifier.
6 20.5.5.6 ompd_get_device_from_thread
7 Summary
8 The ompd_get_device_from_thread function obtains a pointer to the address space handle
9 for a device on which an OpenMP thread is executing.
10 Format
C
ompd_rc_t ompd_get_device_from_thread(
  ompd_thread_handle_t *thread_handle,
  ompd_address_space_handle_t **device
);
C
15 Semantics
16 The ompd_get_device_from_thread function obtains a pointer to the address space handle
17 for a device on which an OpenMP thread is executing. The returned pointer will be the same as the
18 address space handle pointer that was previously returned by a call to
19 ompd_process_initialize (for a host device) or a call to ompd_device_initialize
20 (for a non-host device). This call yields meaningful results only if the referenced OpenMP thread is
21 stopped.
22 Description of Arguments
23 The thread_handle argument is a pointer to an opaque thread handle that represents an OpenMP
24 thread. On return, the device argument is the address of a pointer to an OMPD address space
25 handle.
28 Cross References
29 • OMPD Handle Types, see Section 20.3.8
30 • Return Code Types, see Section 20.3.12
6 Format
C
ompd_rc_t ompd_get_curr_parallel_handle(
  ompd_thread_handle_t *thread_handle,
  ompd_parallel_handle_t **parallel_handle
);
C
11 Semantics
12 The ompd_get_curr_parallel_handle function enables the tool to obtain a pointer to the
13 parallel handle for the current parallel region that is associated with an OpenMP thread. This call
14 yields meaningful results only if the referenced OpenMP thread is stopped. The parallel handle is
15 owned by the tool and it must be released by calling ompd_rel_parallel_handle.
16 Description of Arguments
17 The thread_handle argument is an opaque handle for a thread and selects the thread on which to
18 operate. On return, the parallel_handle argument is set to a handle for the parallel region that the
19 associated thread is currently executing, if any.
24 Cross References
25 • OMPD Handle Types, see Section 20.3.8
26 • Return Code Types, see Section 20.3.12
27 • ompd_rel_parallel_handle, see Section 20.5.6.4
28 20.5.6.2 ompd_get_enclosing_parallel_handle
29 Summary
30 The ompd_get_enclosing_parallel_handle function obtains a pointer to the parallel
31 handle for an enclosing parallel region.
26 20.5.6.3 ompd_get_task_parallel_handle
27 Summary
28 The ompd_get_task_parallel_handle function obtains a pointer to the parallel handle for
29 the parallel region that encloses a task region.
30 Format
C
ompd_rc_t ompd_get_task_parallel_handle(
  ompd_task_handle_t *task_handle,
  ompd_parallel_handle_t **task_parallel_handle
);
C
8 Description of Arguments
The task_handle argument is an opaque handle that selects the task on which to operate. On return,
the task_parallel_handle argument is set to a handle for the parallel region that encloses the
selected task.
13 Cross References
14 • OMPD Handle Types, see Section 20.3.8
15 • Return Code Types, see Section 20.3.12
16 • ompd_rel_parallel_handle, see Section 20.5.6.4
17 20.5.6.4 ompd_rel_parallel_handle
18 Summary
19 The ompd_rel_parallel_handle function releases a parallel region handle.
20 Format
C
ompd_rc_t ompd_rel_parallel_handle(
  ompd_parallel_handle_t *parallel_handle
);
C
24 Semantics
25 Parallel region handles are opaque so tools cannot release them directly. Instead, a tool must pass a
26 parallel region handle to the ompd_rel_parallel_handle function for disposal when
27 finished with it.
28 Description of Arguments
29 The parallel_handle argument is an opaque handle to be released.
4 20.5.6.5 ompd_parallel_handle_compare
5 Summary
6 The ompd_parallel_handle_compare function compares two parallel region handles.
7 Format
C
ompd_rc_t ompd_parallel_handle_compare(
  ompd_parallel_handle_t *parallel_handle_1,
  ompd_parallel_handle_t *parallel_handle_2,
  int *cmp_value
);
C
13 Semantics
The internal structure of parallel region handles is opaque to tools. While tools can easily compare
pointers to parallel region handles, they cannot determine whether handles at two different
addresses refer to the same underlying parallel region and instead must use the
ompd_parallel_handle_compare function.
18 On success, ompd_parallel_handle_compare returns a signed integer value in the location
19 to which cmp_value points that indicates how the underlying parallel regions compare. A value less
20 than, equal to, or greater than 0 indicates that the region corresponding to parallel_handle_1 is,
21 respectively, less than, equal to, or greater than that corresponding to parallel_handle_2. This
22 function is provided since the means by which parallel region handles are ordered is
23 implementation defined.
24 Description of Arguments
25 The parallel_handle_1 and parallel_handle_2 arguments are opaque handles that correspond to
26 parallel regions. On return the cmp_value argument points to a signed integer value that indicates
27 how the underlying parallel regions compare.
30 Cross References
31 • OMPD Handle Types, see Section 20.3.8
32 • Return Code Types, see Section 20.3.12
6 Format
C
ompd_rc_t ompd_get_curr_task_handle(
  ompd_thread_handle_t *thread_handle,
  ompd_task_handle_t **task_handle
);
C
11 Semantics
12 The ompd_get_curr_task_handle function obtains a pointer to the task handle for the
13 current task region that is associated with an OpenMP thread. This call yields meaningful results
14 only if the thread for which the handle is provided is stopped. The task handle must be released
15 with ompd_rel_task_handle.
16 Description of Arguments
17 The thread_handle argument is an opaque handle that selects the thread on which to operate. On
18 return, the task_handle argument points to a location that points to a handle for the task that the
19 thread is currently executing.
24 Cross References
25 • OMPD Handle Types, see Section 20.3.8
26 • Return Code Types, see Section 20.3.12
27 • ompd_rel_task_handle, see Section 20.5.7.5
28 20.5.7.2 ompd_get_generating_task_handle
29 Summary
30 The ompd_get_generating_task_handle function obtains a pointer to the task handle of
31 the generating task region.
25 20.5.7.3 ompd_get_scheduling_task_handle
26 Summary
27 The ompd_get_scheduling_task_handle function obtains a task handle for the task that
28 was active at a task scheduling point.
29 Format
C
ompd_rc_t ompd_get_scheduling_task_handle(
  ompd_task_handle_t *task_handle,
  ompd_task_handle_t **scheduling_task_handle
);
C
7 Description of Arguments
8 The task_handle argument is an opaque handle for a task and selects the task on which to operate.
9 On return, the scheduling_task_handle argument points to a location that points to a handle for the
10 task that is still on the stack of execution on the same thread and was deferred in favor of executing
11 the selected task.
16 Cross References
17 • OMPD Handle Types, see Section 20.3.8
18 • Return Code Types, see Section 20.3.12
19 • ompd_rel_task_handle, see Section 20.5.7.5
20 20.5.7.4 ompd_get_task_in_parallel
21 Summary
22 The ompd_get_task_in_parallel function obtains handles for the implicit tasks that are
23 associated with a parallel region.
24 Format
C
ompd_rc_t ompd_get_task_in_parallel(
  ompd_parallel_handle_t *parallel_handle,
  int thread_num,
  ompd_task_handle_t **task_handle
);
C
30 Semantics
31 The ompd_get_task_in_parallel function obtains handles for the implicit tasks that are
32 associated with a parallel region. A successful invocation of ompd_get_task_in_parallel
33 returns a pointer to a task handle in the location to which task_handle points. This call yields
34 meaningful results only if all OpenMP threads in the parallel region are stopped.
12 Restrictions
13 Restrictions on the ompd_get_task_in_parallel function are as follows:
• The value of thread_num must be a non-negative integer that is smaller than the team size,
which is the value of the team-size-var ICV that ompd_get_icv_from_scope returns.
16 Cross References
17 • OMPD Handle Types, see Section 20.3.8
18 • Return Code Types, see Section 20.3.12
19 • ompd_get_icv_from_scope, see Section 20.5.10.2
20 20.5.7.5 ompd_rel_task_handle
21 Summary
The ompd_rel_task_handle function releases a task handle.
23 Format
C
ompd_rc_t ompd_rel_task_handle(
  ompd_task_handle_t *task_handle
);
C
27 Semantics
28 Task handles are opaque to tools; thus tools cannot release them directly. Instead, when a tool is
29 finished with a task handle it must use the ompd_rel_task_handle function to release it.
30 Description of Arguments
31 The task_handle argument is an opaque task handle to be released.
3 Cross References
4 • OMPD Handle Types, see Section 20.3.8
5 • Return Code Types, see Section 20.3.12
6 20.5.7.6 ompd_task_handle_compare
7 Summary
8 The ompd_task_handle_compare function compares task handles.
9 Format
C
ompd_rc_t ompd_task_handle_compare(
  ompd_task_handle_t *task_handle_1,
  ompd_task_handle_t *task_handle_2,
  int *cmp_value
);
C
15 Semantics
The internal structure of task handles is opaque, so tools cannot directly determine if handles at two
different addresses refer to the same underlying task. The ompd_task_handle_compare
function compares task handles. After a successful call to ompd_task_handle_compare, the
value of the location to which cmp_value points is a signed integer that indicates how the underlying
tasks compare: a value less than, equal to, or greater than 0 indicates that the task that corresponds
to task_handle_1 is, respectively, less than, equal to, or greater than the task that corresponds to
task_handle_2. The means by which task handles are ordered is implementation defined.
23 Description of Arguments
24 The task_handle_1 and task_handle_2 arguments are opaque handles that correspond to tasks. On
25 return, the cmp_value argument points to a location in which a signed integer value indicates how
26 the underlying tasks compare.
29 Cross References
30 • OMPD Handle Types, see Section 20.3.8
31 • Return Code Types, see Section 20.3.12
24 20.5.7.8 ompd_get_task_frame
25 Summary
26 The ompd_get_task_frame function extracts the frame pointers of a task.
27 Format
C
ompd_rc_t ompd_get_task_frame(
  ompd_task_handle_t *task_handle,
  ompd_frame_info_t *exit_frame,
  ompd_frame_info_t *enter_frame
);
C
7 Description of Arguments
8 The task_handle argument specifies an OpenMP task. On return, the exit_frame argument points to
9 an ompd_frame_info_t object that has the frame information with the same semantics as the
10 exit_frame field in the ompt_frame_t object that is associated with the specified task. On return,
11 the enter_frame argument points to an ompd_frame_info_t object that has the frame
12 information with the same semantics as the enter_frame field in the ompt_frame_t object that is
13 associated with the specified task.
16 Cross References
17 • Address Type, see Section 20.3.4
18 • Frame Information Type, see Section 20.3.5
19 • OMPD Handle Types, see Section 20.3.8
20 • Return Code Types, see Section 20.3.12
21 • ompt_frame_t, see Section 19.4.4.29
27 Format
C
ompd_rc_t ompd_enumerate_states(
  ompd_address_space_handle_t *address_space_handle,
  ompd_word_t current_state,
  ompd_word_t *next_state,
  const char **next_state_name,
  ompd_word_t *more_enums
);
C
15 Description of Arguments
16 The address_space_handle argument identifies the address space. The current_state argument must
17 be a thread state that the OpenMP implementation supports. To begin enumerating the supported
18 states, a tool should pass ompt_state_undefined as the value of current_state. Subsequent
19 calls to ompd_enumerate_states by the tool should pass the value that the call returned in
20 the next_state argument. On return, the next_state argument points to an integer with the value of
21 the next state in the enumeration. On return, the next_state_name argument points to a character
22 string that describes the next state. On return, the more_enums argument points to an integer with a
23 value of 1 when more states are left to enumerate and a value of 0 when no more states are left.
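The enumeration protocol described above amounts to a loop that feeds each returned state back in as the next starting point. The sketch below runs that loop against a mock enumerator: demo_enumerate_states, its state table, and DEMO_STATE_UNDEFINED (standing in for ompt_state_undefined) are all hypothetical, and the mock omits the address space handle argument for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; a real tool would use the OMPD header types instead. */
typedef int64_t ompd_word_t;
typedef enum { ompd_rc_ok, ompd_rc_bad_input } ompd_rc_t;

#define DEMO_STATE_UNDEFINED 0 /* stand-in for ompt_state_undefined */

/* Made-up state table for the mock enumerator. */
static const struct { ompd_word_t id; const char *name; } demo_states[] = {
    {1, "demo_state_serial"}, {2, "demo_state_parallel"}, {3, "demo_state_idle"}
};
#define DEMO_NSTATES (sizeof demo_states / sizeof demo_states[0])

/* Mock with the calling convention of ompd_enumerate_states. */
static ompd_rc_t demo_enumerate_states(ompd_word_t current_state,
                                       ompd_word_t *next_state,
                                       const char **next_state_name,
                                       ompd_word_t *more_enums) {
    size_t next = 0;
    if (current_state != DEMO_STATE_UNDEFINED) {
        size_t i = 0;
        while (i < DEMO_NSTATES && demo_states[i].id != current_state)
            i++;
        if (i + 1 >= DEMO_NSTATES)
            return ompd_rc_bad_input; /* unknown state or nothing follows */
        next = i + 1;
    }
    *next_state = demo_states[next].id;
    *next_state_name = demo_states[next].name;
    *more_enums = (next + 1 < DEMO_NSTATES) ? 1 : 0;
    return ompd_rc_ok;
}

/* Tool-side loop: returns the number of states seen, or -1 on failure. */
static int count_states(void) {
    ompd_word_t current = DEMO_STATE_UNDEFINED, next = 0, more = 0;
    const char *name = NULL;
    int count = 0;
    do {
        if (demo_enumerate_states(current, &next, &name, &more) != ompd_rc_ok)
            return -1;
        count++;        /* a real tool would record next and name here */
        current = next; /* feed the result back in, as the text requires */
    } while (more);
    return count;
}
```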
28 Cross References
29 • OMPD Handle Types, see Section 20.3.8
30 • Return Code Types, see Section 20.3.12
31 • ompt_state_t, see Section 19.4.4.28
32 20.5.8.2 ompd_get_state
33 Summary
34 The ompd_get_state function obtains the state of a thread.
14 Description of Arguments
15 The address_space_handle argument identifies the address space. On return, the control_vars
16 argument points to the vector of display control variables.
19 Cross References
20 • OMPD Handle Types, see Section 20.3.8
21 • Return Code Types, see Section 20.3.12
22 • ompd_initialize, see Section 20.5.1.1
23 • ompd_rel_display_control_vars, see Section 20.5.9.2
24 20.5.9.2 ompd_rel_display_control_vars
25 Summary
The ompd_rel_display_control_vars function releases a list of name/value pairs of OpenMP
control variables previously acquired with ompd_get_display_control_vars.
28 Format
C
ompd_rc_t ompd_rel_display_control_vars(
  const char * const **control_vars
);
C
9 Cross References
10 • Return Code Types, see Section 20.3.12
11 • ompd_get_display_control_vars, see Section 20.5.9.1
4 Description of Arguments
5 The address_space_handle argument identifies the address space. The current argument must be
6 an ICV that the OpenMP implementation supports. To begin enumerating the ICVs, a tool should
7 pass ompd_icv_undefined as the value of current. Subsequent calls to
8 ompd_enumerate_icvs should pass the value returned by the call in the next_id output
9 argument. On return, the next_id argument points to an integer with the value of the ID of the next
10 ICV in the enumeration. On return, the next_icv_name argument points to a character string with
11 the name of the next ICV. On return, the next_scope argument points to the scope enum value of the
12 scope of the next ICV. On return, the more_enums argument points to an integer with the value of 1
13 when more ICVs are left to enumerate and the value of 0 when no more ICVs are left.
18 Cross References
19 • ICV ID Type, see Section 20.3.10
20 • OMPD Handle Types, see Section 20.3.8
21 • OMPD Scope Types, see Section 20.3.9
22 • Return Code Types, see Section 20.3.12
23 20.5.10.2 ompd_get_icv_from_scope
24 Summary
25 The ompd_get_icv_from_scope function returns the value of an ICV.
26 Format
C
ompd_rc_t ompd_get_icv_from_scope(
  void *handle,
  ompd_scope_t scope,
  ompd_icv_id_t icv_id,
  ompd_word_t *icv_value
);
C
4 Description of Arguments
5 The handle argument provides an OpenMP scope handle. The scope argument specifies the kind of
6 scope provided in handle. The icv_id argument specifies the ID of the requested ICV. On return,
7 the icv_value argument points to a location with the value of the requested ICV.
8 Constraints on Arguments
The provided handle must match the scope as defined in Section 20.3.9.
The provided scope must match the scope for icv_id as reported by ompd_enumerate_icvs.
11 Description of Return Codes
12 This routine must return any of the general return codes listed at the beginning of Section 20.5 or
13 any of the following return codes:
14 • ompd_rc_incompatible if the ICV cannot be represented as an integer;
15 • ompd_rc_incomplete if only the first item of the ICV is returned in the integer (e.g., if
16 nthreads-var is a list); or
17 • ompd_rc_bad_input if an unknown value is provided in icv_id.
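A tool typically needs to act differently on each of these return codes. The sketch below shows one possible mapping; the icv_outcome_t type and classify_icv_result helper are hypothetical, the enumerator values of the stand-in ompd_rc_t are illustrative, and treating ompd_rc_incompatible as a cue to query the ICV as a string instead is an assumption about tool strategy rather than a requirement.

```c
/* Local stand-in; a real tool would use the OMPD header definition. */
typedef enum {
    ompd_rc_ok,
    ompd_rc_bad_input,
    ompd_rc_incompatible,
    ompd_rc_incomplete,
    ompd_rc_error
} ompd_rc_t;

/* Hypothetical tool-side interpretation of an integer ICV query. */
typedef enum {
    ICV_VALUE_COMPLETE, /* the integer holds the whole ICV */
    ICV_VALUE_PARTIAL,  /* the integer holds only the first list item */
    ICV_NEEDS_STRING,   /* not representable as an integer */
    ICV_QUERY_FAILED    /* unknown ICV ID or some other failure */
} icv_outcome_t;

static icv_outcome_t classify_icv_result(ompd_rc_t rc) {
    switch (rc) {
    case ompd_rc_ok:
        return ICV_VALUE_COMPLETE;
    case ompd_rc_incomplete:
        return ICV_VALUE_PARTIAL;
    case ompd_rc_incompatible:
        return ICV_NEEDS_STRING;
    default:
        return ICV_QUERY_FAILED;
    }
}
```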
18 Cross References
19 • ICV ID Type, see Section 20.3.10
20 • OMPD Handle Types, see Section 20.3.8
21 • OMPD Scope Types, see Section 20.3.9
22 • Return Code Types, see Section 20.3.12
23 • ompd_enumerate_icvs, see Section 20.5.10.1
24 20.5.10.3 ompd_get_icv_string_from_scope
25 Summary
The ompd_get_icv_string_from_scope function returns the value of an ICV as a string.
27 Format
C
ompd_rc_t ompd_get_icv_string_from_scope(
  void *handle,
  ompd_scope_t scope,
  ompd_icv_id_t icv_id,
  const char **icv_string
);
C
24 20.5.10.4 ompd_get_tool_data
25 Summary
26 The ompd_get_tool_data function provides access to the OMPT data variable stored for each
27 OpenMP scope.
28 Format
C
ompd_rc_t ompd_get_tool_data(
  void *handle,
  ompd_scope_t scope,
  ompd_word_t *value,
  ompd_address_t *ptr
);
C
8 Cross References
9 • ompd_get_curr_task_handle, see Section 20.5.7.1
14 Format
C
void ompd_bp_task_end(void);
C
16 Semantics
17 The OpenMP implementation must execute ompd_bp_task_end immediately after completion
18 of a structured-block that is associated with a non-merged task. At the point that the implementation
19 reaches ompd_bp_task_end, the binding for ompd_get_curr_task_handle is the task
20 that finished execution. After execution of ompd_bp_task_end, any task_handle that was
21 acquired for the task region is invalid and should be released.
22 Cross References
23 • ompd_get_curr_task_handle, see Section 20.5.7.1
24 • ompd_rel_task_handle, see Section 20.5.7.5
7 Cross References
8 • Initial Task, see Section 12.8
9 • parallel directive, see Section 10.1
13 Format
C
void ompd_bp_thread_end(void);
C
15 Semantics
16 The OpenMP implementation must execute ompd_bp_thread_end at every native-thread-end
17 and initial-thread-end event. This execution occurs after the thread completes the execution of all
18 OpenMP regions. After executing ompd_bp_thread_end, any thread_handle that was acquired
19 for this thread is invalid and should be released.
20 Cross References
21 • Initial Task, see Section 12.8
22 • ompd_rel_thread_handle, see Section 20.5.5.3
23 • parallel directive, see Section 10.1
7 Cross References
8 • Device Initialization, see Section 13.4
12 Format
C
void ompd_bp_device_end(void);
C
14 Semantics
15 The OpenMP implementation must execute ompd_bp_device_end at every device-finalize
16 event. This execution occurs after the thread executes all OpenMP regions. After execution of
17 ompd_bp_device_end, any address_space_handle that was acquired for this device is invalid
18 and should be released.
19 Cross References
20 • Device Initialization, see Section 13.4
21 • ompd_rel_address_space_handle, see Section 20.5.2.3
14 • bash-like shells:
export OMP_SCHEDULE="dynamic"
18 As defined following Table 2.2 in Section 2.2, device-specific environment variables extend many
19 of the environment variables defined in this chapter. If the corresponding environment variable for
20 a specific device number, including the host device, is set, then the setting for that environment
21 variable is used to set the value of the associated ICV of the device with the corresponding device
22 number. If the corresponding environment variable that includes the _DEV suffix but no device
23 number is set, then the setting of that environment variable is used to set the value of the associated
24 ICV of any non-host device for which the device-number-specific corresponding environment
25 variable is not set. In all cases the setting of an environment variable for which a device number is
26 specified takes precedence.
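The precedence rule in this paragraph can be sketched as a two-step lookup. The code below is illustrative only: it assumes that the device-number-specific variable is spelled with a _DEV_<number> suffix, it takes the environment lookup as a function pointer so that the example does not depend on the process environment, and the names resolve_device_env and demo_lookup are hypothetical. What applies when neither form is set is outside this paragraph, so the sketch simply returns NULL in that case.

```c
#include <stdio.h>
#include <string.h>

/* Lookup callback; a tool could pass a thin wrapper around getenv here. */
typedef const char *(*env_lookup_fn)(const char *name);

/* Returns the setting that applies to the given device, or NULL if neither
 * the device-number-specific form nor the plain _DEV form is set. */
static const char *resolve_device_env(env_lookup_fn lookup, const char *base,
                                      int device_num, int is_host) {
    char name[128];
    const char *value;
    int len = snprintf(name, sizeof name, "%s_DEV_%d", base, device_num);
    if (len < 0 || (size_t)len >= sizeof name)
        return NULL;
    value = lookup(name);
    if (value != NULL)
        return value; /* a device-number-specific setting takes precedence */
    if (is_host)
        return NULL;  /* the plain _DEV form covers non-host devices only */
    len = snprintf(name, sizeof name, "%s_DEV", base);
    if (len < 0 || (size_t)len >= sizeof name)
        return NULL;
    return lookup(name);
}

/* Mock environment used only for illustration. */
static const char *demo_lookup(const char *name) {
    if (strcmp(name, "OMP_SCHEDULE_DEV_1") == 0)
        return "static";
    if (strcmp(name, "OMP_SCHEDULE_DEV") == 0)
        return "dynamic";
    return NULL;
}
```

With this mock environment, device 1 resolves to its own setting, any other non-host device falls back to the plain _DEV setting, and the host device is not affected by the plain _DEV setting.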
27 Restrictions
28 Restrictions to device-specific environment variables are as follows:
29 • Device-specific environment variables must not correspond to environment variables that
30 initialize ICVs with global scope.
1 21.1 Parallel Region Environment Variables
2 This section defines environment variables that affect the operation of parallel regions.
3 21.1.1 OMP_DYNAMIC
4 The OMP_DYNAMIC environment variable controls dynamic adjustment of the number of threads
5 to use for executing parallel regions by setting the initial value of the dyn-var ICV.
6 The value of this environment variable must be one of the following:
7 true | false
8 If the environment variable is set to true, the OpenMP implementation may adjust the number of
9 threads to use for executing parallel regions in order to optimize the use of system resources. If
10 the environment variable is set to false, the dynamic adjustment of the number of threads is
11 disabled. The behavior of the program is implementation defined if the value of OMP_DYNAMIC is
12 neither true nor false.
13 Example:
14 setenv OMP_DYNAMIC true
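An implementation or a test harness might validate the value along the following lines. This is a minimal sketch under stated assumptions: it matches the two permitted values without regard to case, it does not trim surrounding white space, and it reports every other value as unrecognized because the resulting behavior is implementation defined. The helper names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive string equality written out by hand to stay portable. */
static int equals_ignore_case(const char *a, const char *b) {
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Returns 1 for true, 0 for false, and -1 for any other value. */
static int parse_omp_dynamic(const char *value) {
    if (value == NULL)
        return -1;
    if (equals_ignore_case(value, "true"))
        return 1;
    if (equals_ignore_case(value, "false"))
        return 0;
    return -1;
}
```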
15 Cross References
16 • omp_get_dynamic, see Section 18.2.7
17 • omp_set_dynamic, see Section 18.2.6
18 • dyn-var ICV, see Table 2.1
19 • parallel directive, see Section 10.1
20 21.1.2 OMP_NUM_THREADS
21 The OMP_NUM_THREADS environment variable sets the number of threads to use for parallel
22 regions by setting the initial value of the nthreads-var ICV. See Chapter 2 for a comprehensive set
23 of rules about the interaction between the OMP_NUM_THREADS environment variable, the
24 num_threads clause, the omp_set_num_threads library routine and dynamic adjustment of
25 threads, and Section 10.1.1 for a complete algorithm that describes how the number of threads for a
26 parallel region is determined.
27 The value of this environment variable must be a list of positive integer values. The values of the
28 list set the number of threads to use for parallel regions at the corresponding nested levels.
29 The behavior of the program is implementation defined if any value of the list specified in the
30 OMP_NUM_THREADS environment variable leads to a number of threads that is greater than an
31 implementation can support, or if any value is not a positive integer.
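The list form can be sketched as follows. The function `parse_num_threads` is a hypothetical helper, not an OpenMP routine; it assumes white space around list items is ignored and returns `None` wherever the text above says the behavior is implementation defined. It does not check whether the implementation can support the requested counts.

```python
def parse_num_threads(value):
    """Parse an OMP_NUM_THREADS-style list into per-level thread counts.

    Element i of the result is the thread count requested for nesting
    level i + 1. Returns None if any item is not a positive integer.
    """
    levels = []
    for item in value.split(","):
        item = item.strip()
        if not (item.isascii() and item.isdigit()) or int(item) == 0:
            return None
        levels.append(int(item))
    return levels


print(parse_num_threads("4,3,2"))
```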
8 Cross References
9 • OMP_MAX_ACTIVE_LEVELS, see Section 21.1.4
10 • OMP_NESTED (Deprecated), see Section 21.1.5
11 • omp_set_num_threads, see Section 18.2.1
12 • nthreads-var ICV, see Table 2.1
13 • num_threads clause, see Section 10.1.2
14 • parallel directive, see Section 10.1
15 21.1.3 OMP_THREAD_LIMIT
16 The OMP_THREAD_LIMIT environment variable sets the maximum number of OpenMP threads
17 to use in a contention group by setting the thread-limit-var ICV. The value of this environment
18 variable must be a positive integer. The behavior of the program is implementation defined if the
19 requested value of OMP_THREAD_LIMIT is greater than the number of threads an implementation
20 can support, or if the value is not a positive integer.
21 Cross References
22 • thread-limit-var ICV, see Table 2.1
23 21.1.4 OMP_MAX_ACTIVE_LEVELS
24 The OMP_MAX_ACTIVE_LEVELS environment variable controls the maximum number of nested
25 active parallel regions by setting the initial value of the max-active-levels-var ICV. The value
26 of this environment variable must be a non-negative integer. The behavior of the program is
27 implementation defined if the requested value of OMP_MAX_ACTIVE_LEVELS is greater than the
28 maximum number of nested active parallel levels an implementation can support, or if the value is
29 not a non-negative integer.
30 Cross References
31 • max-active-levels-var ICV, see Table 2.1
15 Cross References
16 • OMP_MAX_ACTIVE_LEVELS, see Section 21.1.4
17 • max-active-levels-var ICV, see Table 2.1
18 21.1.6 OMP_PLACES
19 The OMP_PLACES environment variable sets the initial value of the place-partition-var ICV. A list
20 of places can be specified in the OMP_PLACES environment variable. The value of OMP_PLACES
21 can be one of two types of values: either an abstract name that describes a set of places or an
22 explicit list of places described by non-negative numbers.
23 The OMP_PLACES environment variable can be defined using an explicit ordered list of
24 comma-separated places. A place is defined by an unordered set of comma-separated non-negative
25 numbers enclosed by braces, or a non-negative number. The meaning of the numbers and how the
26 numbering is done are implementation defined. Generally, the numbers represent the smallest unit
27 of execution exposed by the execution environment, typically a hardware thread.
28 Intervals may also be used to define places. Intervals can be specified using the <lower-bound> :
29 <length> : <stride> notation to represent the following list of numbers: “<lower-bound>,
30 <lower-bound> + <stride>, ..., <lower-bound> + (<length> - 1)*<stride>.” When <stride> is
31 omitted, a unit stride is assumed. Intervals can specify numbers within a place as well as sequences
32 of places.
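The interval notation can be illustrated with two small helpers. Both names are hypothetical, and the second helper assumes that a place interval shifts every number of the starting place by a multiple of the stride.

```python
def expand_interval(lower, length, stride=1):
    """Numbers denoted by <lower-bound>:<length>:<stride> (unit stride if omitted)."""
    return [lower + i * stride for i in range(length)]


def expand_place_interval(place, length, stride=1):
    """Sequence of places produced by applying an interval to a whole place.

    Assumption: each successive place is the starting place with every
    number shifted by one more multiple of `stride`.
    """
    return [[n + i * stride for n in place] for i in range(length)]


# Four places of four consecutive numbers each, starting at 0.
places = expand_place_interval(expand_interval(0, 4), 4, 4)
print(places)
```

Under these assumptions the last call yields four places that cover the numbers 0 to 3, 4 to 7, 8 to 11, and 12 to 15.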
33 An exclusion operator “!” can also be used to exclude the number or place immediately following
34 the operator.
8 where each of the last three definitions corresponds to the same 4 places including the smallest
9 units of execution exposed by the execution environment numbered, in turn, 0 to 3, 4 to 7, 8 to 11,
10 and 12 to 15.
11 Cross References
12 • place-partition-var ICV, see Table 2.1
13 21.1.7 OMP_PROC_BIND
14 The OMP_PROC_BIND environment variable sets the initial value of the bind-var ICV. The value
15 of this environment variable is either true, false, or a comma-separated list of primary,
16 master (master has been deprecated), close, or spread. The values of the list set the thread
17 affinity policy to be used for parallel regions at the corresponding nested level.
18 If the environment variable is set to false, the execution environment may move OpenMP threads
19 between OpenMP places, thread affinity is disabled, and proc_bind clauses on parallel
20 constructs are ignored.
21 Otherwise, the execution environment should not move OpenMP threads between OpenMP places,
22 thread affinity is enabled, and the initial thread is bound to the first place in the place-partition-var
23 ICV prior to the first active parallel region. An initial thread that is created by a teams construct is
24 bound to the first place in its place-partition-var ICV before it begins execution of the associated
25 structured block.
26 If the environment variable is set to true, the thread affinity policy is implementation defined but
27 must conform to the previous paragraph. The behavior of the program is implementation defined if
28 the value in the OMP_PROC_BIND environment variable is not true, false, or a comma
25 21.2.1 OMP_SCHEDULE
26 The OMP_SCHEDULE environment variable controls the schedule kind and chunk size of all
27 worksharing-loop directives that have the schedule kind runtime, by setting the value of the
28 run-sched-var ICV. The value of this environment variable takes the form [modifier:]kind[, chunk],
29 where:
30 • modifier is one of monotonic or nonmonotonic;
31 • kind is one of static, dynamic, guided, or auto;
32 • chunk is an optional positive integer that specifies the chunk size.
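The form [modifier:]kind[, chunk] can be sketched as a small parser. The name `parse_schedule` is hypothetical; the sketch assumes case-insensitive matching and optional white space around each part, and it returns `None` for anything that does not fit the form rather than modeling what a real runtime would do.

```python
MODIFIERS = {"monotonic", "nonmonotonic"}
KINDS = {"static", "dynamic", "guided", "auto"}


def parse_schedule(value):
    """Split an OMP_SCHEDULE-style value into (modifier, kind, chunk).

    modifier and chunk are None when absent. Returns None on a malformed value.
    """
    text = value.strip().lower()  # assumption: case is not significant
    modifier = None
    if ":" in text:
        modifier, text = (part.strip() for part in text.split(":", 1))
        if modifier not in MODIFIERS:
            return None
    kind, sep, chunk_text = (part.strip() for part in text.partition(","))
    if kind not in KINDS:
        return None
    chunk = None
    if sep:
        # chunk must be a positive integer when it is present
        if not (chunk_text.isascii() and chunk_text.isdigit()) or int(chunk_text) == 0:
            return None
        chunk = int(chunk_text)
    return modifier, kind, chunk


print(parse_schedule("nonmonotonic:dynamic, 4"))
```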
11 Cross References
12 • run-sched-var ICV, see Table 2.1
13 • schedule clause, see Section 11.5.3
14 21.2.2 OMP_STACKSIZE
15 The OMP_STACKSIZE environment variable controls the size of the stack for threads created by
16 the OpenMP implementation, by setting the value of the stacksize-var ICV. The environment
17 variable does not control the size of the stack for an initial thread. The value of this environment
18 variable takes the form size[unit], where:
19 • size is a positive integer that specifies the size of the stack for threads that are created by the
20 OpenMP implementation.
21 • unit is B, K, M, or G and specifies whether the given size is in Bytes, Kilobytes (1024 Bytes),
22 Megabytes (1024 Kilobytes), or Gigabytes (1024 Megabytes), respectively. If unit is present,
23 white space may occur between size and it, whereas if unit is not present then K is assumed.
24 The behavior of the program is implementation defined if OMP_STACKSIZE does not conform to
25 the above format, or if the implementation cannot provide a stack with the requested size.
26 Examples:
27 setenv OMP_STACKSIZE 2000500B
28 setenv OMP_STACKSIZE "3000 k "
29 setenv OMP_STACKSIZE 10M
30 setenv OMP_STACKSIZE " 10 M "
31 setenv OMP_STACKSIZE "20 m "
32 setenv OMP_STACKSIZE " 1G"
33 setenv OMP_STACKSIZE 20000
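As a rough illustration of the size[unit] form, the sketch below converts a value to bytes. The helper `parse_stacksize` is hypothetical. It accepts lower-case unit letters and leading or trailing white space because the examples above use them, and it returns `None` where the format is not met; whether a stack of the resulting size can actually be provided is outside its scope.

```python
import re

_UNITS = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
_STACKSIZE = re.compile(r"\s*([0-9]+)\s*([BKMG]?)\s*", re.IGNORECASE)


def parse_stacksize(value):
    """Return the requested stack size in bytes, or None if malformed."""
    match = _STACKSIZE.fullmatch(value)
    if match is None or int(match.group(1)) == 0:
        return None
    unit = match.group(2).upper() or "K"  # K is assumed when unit is absent
    return int(match.group(1)) * _UNITS[unit]


for text in ["2000500B", "3000 k ", " 10 M ", " 1G", "20000"]:
    print(repr(text), parse_stacksize(text))
```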
34 Cross References
35 • stacksize-var ICV, see Table 2.1
19 Cross References
20 • wait-policy-var ICV, see Table 2.1
21 21.2.4 OMP_DISPLAY_AFFINITY
22 The OMP_DISPLAY_AFFINITY environment variable instructs the runtime to display formatted
23 affinity information by setting the display-affinity-var ICV. Affinity information is printed for all
24 OpenMP threads in the parallel region upon entering it and when any change occurs in the
25 information accessible by the format specifiers listed in Table 21.2. If affinity of any thread in a
26 parallel region changes then thread affinity information for all threads in that region is displayed. If
27 the thread affinity for each respective parallel region at each nesting level has already been displayed
28 and the thread affinity has not changed, then the information is not displayed again. Thread affinity
29 information for threads in the same parallel region may be displayed in any order. The value of the
30 OMP_DISPLAY_AFFINITY environment variable may be set to one of these values:
31 true | false
32 The true value instructs the runtime to display the OpenMP thread affinity information, and uses
33 the format setting defined in the affinity-format-var ICV. The runtime does not display the OpenMP
34 thread affinity information when the value of the OMP_DISPLAY_AFFINITY environment
35 variable is false or undefined. For all values of the environment variable other than true or
36 false, the display action is implementation defined.
3 For this example, an OpenMP implementation displays thread affinity information during program
4 execution, in a format given by the affinity-format-var ICV. The following is a sample output:
5 nesting_level= 1, thread_num= 0, thread_affinity= 0,1
6 nesting_level= 1, thread_num= 1, thread_affinity= 2,3
7 Cross References
8 • Controlling OpenMP Thread Affinity, see Section 10.1.3
9 • OMP_AFFINITY_FORMAT, see Section 21.2.5
10 • affinity-format-var ICV, see Table 2.1
11 • display-affinity-var ICV, see Table 2.1
12 21.2.5 OMP_AFFINITY_FORMAT
13 The OMP_AFFINITY_FORMAT environment variable sets the initial value of the
14 affinity-format-var ICV which defines the format when displaying OpenMP thread affinity
15 information. The value of this environment variable is case sensitive and leading and trailing
16 whitespace is significant. Its value is a character string that may contain as substrings one or more
17 field specifiers (as well as other characters). The format of each field specifier is
18 %[[[0].] size ] type
19 where each specifier must contain the percent symbol (%) and a type, that must be either a single
20 character short name or its corresponding long name delimited with curly braces, such as %n or
21 %{thread_num}. A literal percent is specified as %%. Field specifiers can be provided in any
22 order. The behavior is implementation defined for field specifiers that do not conform to this format.
23 The 0 modifier indicates whether or not to add leading zeros to the output, following any indication
24 of sign or base. The . modifier indicates the output should be right justified when size is specified.
25 By default, output is left justified. The minimum field length is size, which is a decimal digit string
26 with a non-zero first digit. If no size is specified, the actual length needed to print the field will be
27 used. If the 0 modifier is used with type of A, {thread_affinity}, H, {host}, or a type that
28 is not printed as a number, the result is unspecified. Any other characters in the format string that
29 are not part of a field specifier will be included literally in the output.
30 Implementations may define additional field types. If an implementation does not have information
31 for a field type or an unknown field type is part of a field specifier, "undefined" is printed for this
32 field when displaying the OpenMP thread affinity information.
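The field specifier rules can be sketched with a regular expression and a formatter. Everything here is illustrative: `format_affinity` is a hypothetical name, the `values` and `short_names` dictionaries are supplied by the caller because the table of field types is not reproduced in this section, and specifiers that do not match the pattern are simply left in the output unchanged.

```python
import re

# %[[[0].]size]type, where type is a single letter or a {long_name}; %% is a literal %.
_FIELD = re.compile(
    r"%(?:%|(?:(?P<dot>(?P<zero>0)?\.)?(?P<size>[1-9][0-9]*))?"
    r"(?:(?P<short>[A-Za-z])|\{(?P<long>[A-Za-z_]+)\}))"
)


def format_affinity(fmt, values, short_names):
    """Expand field specifiers in `fmt`.

    values:      long field name -> value to print (hypothetical data)
    short_names: single-letter name -> long field name (hypothetical table)
    """
    def render(match):
        if match.group(0) == "%%":
            return "%"
        name = match.group("long") or short_names.get(match.group("short"))
        text = str(values[name]) if name in values else "undefined"
        size = int(match.group("size")) if match.group("size") else 0
        if match.group("zero"):
            return text.rjust(size, "0")  # right justified, zero padded
        if match.group("dot"):
            return text.rjust(size)       # right justified, space padded
        return text.ljust(size)           # default: left justified

    return _FIELD.sub(render, fmt)


shorts = {"n": "thread_num", "A": "thread_affinity"}
data = {"thread_num": 1, "thread_affinity": "2-3"}
print(format_affinity("Thread Affinity: %0.3n %.4{thread_num} %A %%", data, shorts))
```

A field type that is missing from `values` or `short_names` prints as "undefined", mirroring the rule in the last paragraph above.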
4 The above example causes an OpenMP implementation to display OpenMP thread affinity
5 information in the following form:
6 Thread Affinity: 001 0 0-1,16-17 nid003
7 Thread Affinity: 001 1 2-3,18-19 nid003
8 Cross References
9 • Controlling OpenMP Thread Affinity, see Section 10.1.3
10 • omp_get_ancestor_thread_num, see Section 18.2.18
11 • omp_get_level, see Section 18.2.17
12 • omp_get_num_teams, see Section 18.4.1
5 21.2.6 OMP_CANCELLATION
6 The OMP_CANCELLATION environment variable sets the initial value of the cancel-var ICV. The
7 value of this environment variable must be one of the following:
8 true | false
9 If the environment variable is set to true, the effects of the cancel construct and of cancellation
10 points are enabled (i.e., cancellation is enabled). If the environment variable is set to false,
11 cancellation is disabled and the cancel construct and cancellation points are effectively ignored.
12 The behavior of the program is implementation defined if OMP_CANCELLATION is set to neither
13 true nor false.
14 Cross References
15 • cancel directive, see Section 16.1
16 • cancel-var ICV, see Table 2.1
17 21.2.7 OMP_DEFAULT_DEVICE
18 The OMP_DEFAULT_DEVICE environment variable sets the device number to use in device
19 constructs by setting the initial value of the default-device-var ICV. The value of this environment
20 variable must be a non-negative integer value.
21 Cross References
22 • Device Directives and Clauses, see Chapter 13
23 • default-device-var ICV, see Table 2.1
24 21.2.8 OMP_TARGET_OFFLOAD
25 The OMP_TARGET_OFFLOAD environment variable sets the initial value of the target-offload-var
26 ICV. Its value must be one of the following:
27 mandatory | disabled | default
28 The mandatory value specifies that the effect of any device construct or device memory routine
29 that uses a device that is unavailable or not supported by the implementation, or uses a
30 non-conforming device number, is as if the omp_invalid_device device number was used.
6 Cross References
7 • Device Directives and Clauses, see Chapter 13
8 • Device Memory Routines, see Section 18.8
9 • target-offload-var ICV, see Table 2.1
10 21.2.9 OMP_MAX_TASK_PRIORITY
11 The OMP_MAX_TASK_PRIORITY environment variable controls the use of task priorities by
12 setting the initial value of the max-task-priority-var ICV. The value of this environment variable
13 must be a non-negative integer.
14 Example:
15 % setenv OMP_MAX_TASK_PRIORITY 20
16 Cross References
17 • max-task-priority-var ICV, see Table 2.1
20 21.3.1 OMP_TOOL
21 The OMP_TOOL environment variable sets the tool-var ICV, which controls whether an OpenMP
22 runtime will try to register a first party tool. The value of this environment variable must be one of
23 the following:
24 enabled | disabled
25 If OMP_TOOL is set to any value other than enabled or disabled, the behavior is unspecified.
26 If OMP_TOOL is not defined, the default value for tool-var is enabled.
27 Example:
28 % setenv OMP_TOOL enabled
29 Cross References
30 • OMPT Interface, see Chapter 19
31 • tool-var ICV, see Table 2.1
17 Cross References
18 • OMPT Interface, see Chapter 19
19 • ompt_start_tool, see Section 19.2.1
20 • tool-libraries-var ICV, see Table 2.1
21 21.3.3 OMP_TOOL_VERBOSE_INIT
22 The OMP_TOOL_VERBOSE_INIT environment variable sets the tool-verbose-init-var ICV, which
23 controls whether an OpenMP implementation will verbosely log the registration of a tool. The
24 value of this environment variable must be one of the following:
25 disabled | stdout | stderr | <filename>
26 If OMP_TOOL_VERBOSE_INIT is set to any value other than case insensitive disabled,
27 stdout, or stderr, the value is interpreted as a filename and the OpenMP runtime will try to
28 log to a file with prefix filename. If the value is interpreted as a filename, whether it is case
29 sensitive is implementation defined. If opening the logfile fails, the output will be redirected to
30 stderr. If OMP_TOOL_VERBOSE_INIT is not defined, the default value for tool-verbose-init-var
31 is disabled. Support for logging to stdout or stderr is implementation defined. Unless
32 tool-verbose-init-var is disabled, the OpenMP runtime will log the steps of the tool activation
33 process defined in Section 19.2.2 to a file with a name that is constructed using the provided
34 filename prefix. The format and detail of the log is implementation defined. At a minimum, the log
35 will contain one of the following:
36 • That the tool-var ICV is disabled;
11 Cross References
12 • OMPT Interface, see Chapter 19
13 • tool-verbose-init-var ICV, see Table 2.1
16 21.4.1 OMP_DEBUG
17 The OMP_DEBUG environment variable sets the debug-var ICV, which controls whether an
18 OpenMP runtime collects information that an OMPD library may need to support a tool. The value
19 of this environment variable must be one of the following:
20 enabled | disabled
21 If OMP_DEBUG is set to any value other than enabled or disabled then the behavior is
22 implementation defined.
23 Example:
24 % setenv OMP_DEBUG enabled
25 Cross References
26 • Enabling Runtime Support for OMPD, see Section 20.2.1
27 • OMPD Interface, see Chapter 20
28 • debug-var ICV, see Table 2.1
3 21.5.1 OMP_ALLOCATOR
4 The OMP_ALLOCATOR environment variable sets the initial value of the def-allocator-var ICV
5 that specifies the default allocator for allocation calls, directives and clauses that do not specify an
6 allocator. The following grammar describes the values accepted for the OMP_ALLOCATOR
7 environment variable.
8 The value can be an integer only if the trait accepts a numerical value; for the fb_data trait, the
9 value can only be predef-allocator. If the value of this environment variable is not a predefined
10 allocator, then a new allocator with the given predefined memory space and optional traits is
11 created and set as the def-allocator-var ICV. If the new allocator cannot be created, the
12 def-allocator-var ICV will be set to omp_default_mem_alloc.
13 Example:
14 setenv OMP_ALLOCATOR omp_high_bw_mem_alloc
15 setenv OMP_ALLOCATOR omp_large_cap_mem_space:alignment=16,\
16 pinned=true
17 setenv OMP_ALLOCATOR omp_high_bw_mem_space:pool_size=1048576,\
18 fallback=allocator_fb,fb_data=omp_low_lat_mem_alloc
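The shape of the values in the examples above can be sketched as a simple splitter. The helper `parse_allocator` is hypothetical and works only from the pattern visible in the examples (a name, optionally followed by a colon and comma-separated trait=value pairs). It does not validate names, trait values, or whether an allocator could be created, and it expects the shell line continuation to have been joined already.

```python
def parse_allocator(value):
    """Split an OMP_ALLOCATOR-style value into (name, traits).

    name is a predefined allocator or memory space name; traits maps each
    trait to its value as text. Returns None if a trait lacks '=' or a side.
    """
    name, sep, trait_text = value.strip().partition(":")
    if not sep:
        return name, {}
    traits = {}
    for item in trait_text.split(","):
        key, eq, val = item.strip().partition("=")
        if not eq or not key or not val:
            return None
        traits[key] = val
    return name, traits


print(parse_allocator("omp_large_cap_mem_space:alignment=16,pinned=true"))
```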
19 Cross References
20 • Memory Allocators, see Section 6.2
21 • def-allocator-var ICV, see Table 2.1
10 21.6.2 OMP_TEAMS_THREAD_LIMIT
11 The OMP_TEAMS_THREAD_LIMIT environment variable sets the maximum number of OpenMP
12 threads to use in each contention group created by a teams construct by setting the
13 teams-thread-limit-var ICV. The value of this environment variable must be a positive integer. The
14 behavior of the program is implementation defined if the requested value of
15 OMP_TEAMS_THREAD_LIMIT is greater than the number of threads that an implementation can
16 support, or if the value is not a positive integer.
17 Cross References
18 • teams directive, see Section 10.2
19 • teams-thread-limit-var ICV, see Table 2.1
20 21.7 OMP_DISPLAY_ENV
21 The OMP_DISPLAY_ENV environment variable instructs the runtime to display the information as
22 described in the omp_display_env routine section (Section 18.15). The value of the
23 OMP_DISPLAY_ENV environment variable may be set to one of these values:
24 true | false | verbose
25 If the environment variable is set to true, the effect is as if the omp_display_env routine is
26 called with the verbose argument set to false at the beginning of the program. If the environment
27 variable is set to verbose, the effect is as if the omp_display_env routine is called with the
28 verbose argument set to true at the beginning of the program. If the environment variable is
29 undefined or set to false, the runtime does not display any information. For all values of the
30 environment variable other than true, false, and verbose, the displayed information is
31 unspecified.
32 Example:
33 % setenv OMP_DISPLAY_ENV true
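The mapping from the variable's value to the effective omp_display_env call can be sketched as follows. The function `display_env_action` and its result tuples are hypothetical, and case-insensitive matching with ignored white space is an assumption.

```python
def display_env_action(value):
    """Map an OMP_DISPLAY_ENV value to (action, verbose_argument).

    action is "call" (as if omp_display_env were called at program start),
    "skip" (nothing is displayed) or "unspecified".
    """
    if value is None:  # variable is undefined
        return ("skip", None)
    text = value.strip().lower()  # assumption: case and padding are ignored
    if text == "false":
        return ("skip", None)
    if text == "true":
        return ("call", False)
    if text == "verbose":
        return ("call", True)
    return ("unspecified", None)


print(display_env_action("verbose"))
```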
6 Chapter 1:
7 • Processor: A hardware unit that is implementation defined (see Section 1.2.1).
8 • Device: An implementation-defined logical execution engine (see Section 1.2.1).
9 • Device pointer: An implementation-defined handle that refers to a device address (see
10 Section 1.2.6).
11 • Supported active levels of parallelism: The maximum number of active parallel regions that
12 may enclose any region of code in the program is implementation defined (see Section 1.2.7).
13 • Deprecated features: For any deprecated feature, whether any modifications provided by its
14 replacement feature (if any) apply to the deprecated feature is implementation defined (see
15 Section 1.2.7).
16 • Memory model: The minimum size at which a memory update may also read and write back
17 adjacent variables that are part of another variable (as array elements or structure elements) is
18 implementation defined but is no larger than the base language requires. The manner in which a
19 program can obtain the referenced device address from a device pointer, outside the mechanisms
20 specified by OpenMP, is implementation defined (see Section 1.4.1).
21 Chapter 2:
22 • Internal control variables: The initial values of dyn-var, nthreads-var, run-sched-var, bind-var,
23 stacksize-var, wait-policy-var, thread-limit-var, max-active-levels-var, place-partition-var,
24 affinity-format-var, default-device-var, num-procs-var and def-allocator-var are implementation
25 defined (see Section 2.2).
26 Chapter 3:
C / C++
27 • A pragma directive that uses ompx as the first processing token is implementation defined (see
28 Section 3.1).
C / C++
20 Chapter 8:
21 • requires directive: Support for any feature specified by a requirement clause on a
22 requires directive is implementation defined (see Section 8.2).
23 Chapter 9:
24 • unroll construct: If no clauses are specified, if and how the loop is unrolled is
25 implementation defined. If the partial clause is specified without an unroll-factor argument
26 then the unroll factor is a positive integer that is implementation defined (see Section 9.2).
27 Chapter 10:
28 • Dynamic adjustment of threads: Providing the ability to adjust the number of threads
29 dynamically is implementation defined (see Section 10.1.1).
30 • Thread affinity: For the close thread affinity policy, if T > P and P does not divide T evenly,
31 the exact number of threads in a particular place is implementation defined. For the spread
32 thread affinity, if T > P and P does not divide T evenly, the exact number of threads in a
33 particular subpartition is implementation defined. The determination of whether the affinity
34 request can be fulfilled is implementation defined. If the affinity request cannot be fulfilled, then
35 the affinity of threads in the team is implementation defined (see Section 10.1.3).
36 • teams construct: The number of teams that are created is implementation defined, but it is
37 greater than or equal to the lower bound and less than or equal to the upper bound values of the
38 num_teams clause if specified. If the num_teams clause is not specified, the number of
3 Chapter 17:
4 • None.
5 Chapter 18:
6 • Runtime Routine names that begin with the ompx_ prefix are implementation-defined extensions
7 to the OpenMP Runtime API (see Chapter 18).
C / C++
8 • Runtime library definitions: The enum types for omp_allocator_handle_t,
9 omp_event_handle_t, omp_interop_fr_t and omp_memspace_handle_t are
10 implementation defined. The integral or pointer type for omp_interop_t is implementation
11 defined. The value of the omp_invalid_device enumerator is implementation defined (see
12 Section 18.1).
C / C++
Fortran
13 • Runtime library definitions: Whether the include file omp_lib.h or the module omp_lib
14 (or both) is provided is implementation defined. Whether the omp_lib.h file provides
15 derived-type definitions or those routines that require an explicit interface is implementation
16 defined. Whether any of the OpenMP runtime library routines that take an argument are
17 extended with a generic interface so arguments of different KIND type can be accommodated is
18 implementation defined. The value of the omp_invalid_device named constant is
19 implementation defined (see Section 18.1).
Fortran
20 • omp_set_num_threads routine: If the argument is not a positive integer, the behavior is
21 implementation defined (see Section 18.2.1).
22 • omp_set_schedule routine: For implementation-specific schedule kinds, the values and
23 associated meanings of the second argument are implementation defined (see Section 18.2.11).
24 • omp_get_schedule routine: The value returned by the second argument is implementation
25 defined for any schedule kinds other than static, dynamic and guided (see
26 Section 18.2.12).
27 • omp_get_supported_active_levels routine: The number of active levels of
28 parallelism supported by the implementation is implementation defined, but must be positive (see
29 Section 18.2.14).
30 • omp_set_max_active_levels routine: If the argument is a negative integer then the
31 behavior is implementation defined. If the argument is less than the active-levels-var ICV, the
32 max-active-levels-var ICV is set to an implementation-defined value between the value of the
33 argument and the value of active-levels-var, inclusive (see Section 18.2.15).
28 Chapter 19:
29 • Tool callbacks: If a tool attempts to register a callback listed in Table 19.3, whether the
30 registered callback may never, sometimes or always invoke this callback for the associated events
31 is implementation defined (see Section 19.2.4).
32 • Device tracing: Whether a target device supports tracing or not is implementation defined; if a
33 target device does not support tracing, a NULL may be supplied for the lookup function to the
34 device initializer of a tool (see Section 19.2.5).
35 • ompt_set_trace_ompt and ompt_get_record_ompt runtime entry points: Whether
36 a device-specific tracing interface defines this runtime entry point, indicating that it can collect
24 Chapter 20:
25 • ompd_callback_print_string_fn_t callback type: The value of category is
26 implementation defined (see Section 20.4.5).
27 • ompd_parallel_handle_compare operation: The means by which parallel region
28 handles are ordered is implementation defined (see Section 20.5.6.5).
29 • ompd_task_handle_compare operation: The means by which task handles are ordered is
30 implementation defined (see Section 20.5.7.6).
31 Chapter 21:
32 • OMP_DYNAMIC environment variable: If the value is neither true nor false, the behavior
33 of the program is implementation defined (see Section 21.1.1).
34 • OMP_NUM_THREADS environment variable: If any value of the list specified leads to a number
35 of threads that is greater than the implementation can support, or if any value is not a positive
36 integer, then the behavior of the program is implementation defined (see Section 21.1.2).
A C
acquire flush, 30 cancel, 332
adjust_args, 195 cancellation constructs, 332
affinity, 228 cancel, 332
affinity, 264 cancellation point, 336
align, 174 cancellation point, 336
aligned, 169 canonical loop nest form, 85
allocate, 176, 178 capture, atomic, 311
allocator, 175 clause format, 56
allocators, 180 clauses
append_args, 196 adjust_args, 195
array sections, 64 affinity, 264
array shaping, 63 align, 174
assumes, 214, 215 aligned, 169
assumption clauses, 213 allocate, 178
assumption directives, 213 allocator, 175
at, 210 append_args, 196
atomic, 311 assumption, 213
atomic, 310 at, 210
atomic construct, 619 atomic, 310
attribute clauses, 108 attribute data-sharing, 108
attributes, data-mapping, 147, 148 bind, 258
attributes, data-sharing, 96 branch, 204
auto, 253 collapse, 93
copyin, 144
B copyprivate, 146
barrier, 301 data copying, 144
barrier, implicit, 303 data-sharing, 108
base language format, 74 default, 109
begin declare target, 207 defaultmap, 161
begin declare variant, 198 depend, 323
begin metadirective, 192 destroy, 73
begin assumes, 215 detach, 265
device, 276 priority, 261
device_type, 275 private, 111
dist_schedule, 256 proc_bind, 229
doacross, 326 reduction, 134
enter, 158 requirement, 212
exclusive, 143 safelen, 237
extended-atomic, 310 schedule, 252
filter, 239 severity, 217
final, 261 shared, 110
firstprivate, 112 simdlen, 237
from, 167 sizes, 220
full, 221 task_reduction, 137
grainsize, 269 thread_limit, 277
has_device_addr, 122 to, 166
hint, 296, 299 uniform, 168
if Clause, 72 untied, 260
in_reduction, 138 update, 321
inclusive, 143 use, 294
indirect, 209 use_device_addr, 123
init, 293 use_device_ptr, 121
initializer, 130 uses_allocators, 181
is_device_ptr, 120 when, 190
lastprivate, 115 collapse, 93
linear, 117 combined and composite directive
link, 159 names, 342
map, 150 combined construct semantics, 343
match, 194 compare, atomic, 311
memory-order, 309 compilation sentinels, 70, 71
mergeable, 260 compliance, 34
message, 217 composite constructs, 343
nocontext, 201 composition of constructs, 338
nogroup, 309 conditional compilation, 69
nontemporal, 236 consistent loop schedules, 95
novariants, 201 construct syntax, 48
nowait, 308 constructs
num_tasks, 270 allocators, 180
num_teams, 233 atomic, 311
num_threads, 227 barrier, 301
order, 233 cancel, 332
ordered, 94 cancellation constructs, 332
otherwise, 191 cancellation point, 336
parallelization-level, 331 combined constructs, 343
partial, 221 composite constructs, 343
Declare Target, 204 OMP_THREAD_LIMIT, 601
declare target, 206 OMP_TOOL, 611
declare variant, 197 OMP_TOOL_LIBRARIES, 612
declare variant, 193 OMP_TOOL_VERBOSE_INIT, 612
error, 216 OMP_WAIT_POLICY, 607
memory management directives, 171 event, 414
metadirective, 189, 192 event callback registration, 446
nothing, 216 event callback signatures, 474
requires, 210 event routines, 414
scan Directive, 141 exclusive, 143
section, 244 execution model, 23
threadprivate, 101 extended-atomic, 310
variant directives, 183
dispatch, 200 F
dist_schedule, 256 features history, 625
distribute, 254 filter, 239
do, 251 final, 261
doacross, 326 firstprivate, 112
dynamic, 252 fixed source form conditional compilation
dynamic thread adjustment, 618 sentinels, 70
fixed source form directives, 54
E flush, 315
enter, 158 flush operation, 29
environment display routine, 438 flush synchronization, 30
environment variables, 599 flush-set, 29
OMP_AFFINITY_FORMAT, 608 for, 250
OMP_ALLOCATOR, 614 frames, 470
OMP_CANCELLATION, 610 free source form conditional compilation
OMP_DEBUG, 613 sentinel, 71
OMP_DEFAULT_DEVICE, 610 free source form directives, 55
OMP_DISPLAY_AFFINITY, 607 from, 167
OMP_DISPLAY_ENV, 615 full, 221
OMP_DYNAMIC, 600
OMP_MAX_ACTIVE_LEVELS, 601 G
OMP_MAX_TASK_PRIORITY, 611 glossary, 2
OMP_NESTED, 602 grainsize, 269
OMP_NUM_TEAMS, 615 guided, 252
OMP_NUM_THREADS, 600
OMP_PLACES, 602 H
OMP_PROC_BIND, 604 happens before, 30
OMP_SCHEDULE, 605 has_device_addr, 122
OMP_STACKSIZE, 606 header files, 345
OMP_TARGET_OFFLOAD, 610 hint, 299
OMP_TEAMS_THREAD_LIMIT, 615 history of features, 625
omp_display_affinity, 370 omp_get_schedule, 356
OMP_DISPLAY_ENV, 615 omp_get_supported_active
omp_display_env, 438 _levels, 358
OMP_DYNAMIC, 600 omp_get_team_num, 373
omp_free, 430 omp_get_team_size, 361
omp_fulfill_event, 414 omp_get_teams_thread_limit, 376
omp_get_active_level, 362 omp_get_thread_limit, 357
omp_get_affinity_format, 369 omp_get_thread_num, 350
omp_get_ancestor_thread_num, 360 omp_get_wtick, 414
omp_get_cancellation, 353 omp_get_wtime, 413
omp_get_default_allocator, 428 omp_in_explicit_task, 377
omp_get_default_device, 382 omp_in_final, 378
omp_get_device_num, 384 omp_in_parallel, 351
omp_get_dynamic, 352 omp_init_allocator, 425
omp_get_initial_device, 385 omp_init_lock, 405, 406
omp_get_interop_int, 417 omp_init_nest_lock, 405, 406
omp_get_interop_name, 420 omp_is_initial_device, 384
omp_get_interop_ptr, 418 OMP_MAX_ACTIVE_LEVELS, 601
omp_get_interop_rc_desc, 421 OMP_MAX_TASK_PRIORITY, 611
omp_get_interop_str, 419 OMP_NESTED, 602
omp_get_interop_type_desc, 421 OMP_NUM_TEAMS, 615
omp_get_level, 360 OMP_NUM_THREADS, 600
omp_get_mapped_ptr, 402 omp_pause_resource, 378
omp_get_max_active_levels, 359 omp_pause_resource_all, 380
omp_get_max_task_priority, 377 OMP_PLACES, 602
omp_get_max_teams, 374 OMP_PROC_BIND, 604
omp_get_max_threads, 350 omp_realloc, 433
omp_get_nested, 354 OMP_SCHEDULE, 605
omp_get_num_devices, 383 omp_set_affinity_format, 368
omp_get_num_interop_properties, omp_set_default_allocator, 427
417 omp_set_default_device, 382
omp_get_num_places, 364 omp_set_dynamic, 352
omp_get_num_procs, 381 omp_set_lock, 408
omp_get_num_teams, 372 omp_set_max_active_levels, 358
omp_get_num_threads, 349 omp_set_nest_lock, 408
omp_get_partition_num_places, omp_set_nested, 353
367 omp_set_num_teams, 373
omp_get_partition_place_nums, omp_set_num_threads, 348
368 omp_set_schedule, 355
omp_get_place_num, 366 omp_set_teams_thread_limit, 375
omp_get_place_num_procs, 365 OMP_STACKSIZE, 606
omp_get_place_proc_ids, 365 omp_target_alloc, 385
omp_get_proc_bind, 363 omp_target_associate_ptr, 399
ompt_callback_target_map_t, 504 release flush, 30
ompt_callback_target_submit_emi_t, 506 requirement, 212
ompt_callback_target_submit_t, 506 requires, 210
ompt_callback_target_t, 502 reserved locators, 62
ompt_callback_task_create_t, 481 resource relinquishing routines, 378
ompt_callback_task_dependence_t, 484 runtime, 253
ompt_callback_task_schedule_t, 484 runtime library definitions, 345
ompt_callback_thread_begin_t, 475 runtime library routines, 345
ompt_callback_thread_end_t, 476
ompt_callback_work_t, 479
S
safelen, 237
scan Directive, 141
schedule, 252
scheduling, 272
scope, 242
OpenMP allocator structured blocks, 77 section, 244
OpenMP argument lists, 60 sections, 243
OpenMP atomic structured blocks, 79 severity, 217
OpenMP compliance, 34 shared, 110
OpenMP context-specific structured blocks, 77 simd, 235
OpenMP function dispatch structured blocks, 78 simdlen, 237
Simple Lock Routines, 404
single, 240
OpenMP operations, 62 sizes, 220
OpenMP stylized expressions, 76 stand-alone directives, 54
OpenMP types, 74 static, 252
order, 233 strong flush, 29
ordered, 94, 328–330 structured blocks, 76
otherwise, 191 synchronization constructs, 296
synchronization constructs and clauses, 296
P synchronization hint type, 296
parallel, 223 synchronization hints, 296
parallelism generating constructs, 223 synchronization terminology, 10
parallelization-level, 331
partial, 221 T
priority, 261 target, 283
private, 111 target data, 279
proc_bind, 229 target memory routines, 385
target update, 289
R task, 262
read, atomic, 311 task scheduling, 272
initializer, 130 task-dependence-type, 321
reduction, 134 task_reduction, 137
reduction clauses, 124 taskgroup, 304
U
uniform, 168
unroll, 220
untied, 260
update, 321
update, atomic, 311
use, 294
use_device_addr, 123
use_device_ptr, 121
uses_allocators, 181
V
variables, environment, 599
variant directives, 183
W
wait identifier, 472
wall clock timer, 413
error, 216