[OpenMP] Add pre sm_70 load hack back in #138589

jhuber6 · 2025-05-05T21:21:45Z

Summary:
Atomic loads with different ordering modes aren't supported on NVPTX
targets older than sm_70, so on NVPTX we perform an atomic add of zero
instead, which returns the same value. It's less efficient, but it works.

Fixes #138560

llvmbot · 2025-05-05T21:22:19Z

@llvm/pr-subscribers-offload

Author: Joseph Huber (jhuber6)

Changes

Summary:
Atomic loads with different ordering modes aren't supported on NVPTX
targets older than sm_70, so on NVPTX we perform an atomic add of zero
instead, which returns the same value. It's less efficient, but it works.

Fixes #138560

Full diff: https://github.com/llvm/llvm-project/pull/138589.diff

1 Files Affected:

(modified) offload/DeviceRTL/include/Synchronization.h (+4)

diff --git a/offload/DeviceRTL/include/Synchronization.h b/offload/DeviceRTL/include/Synchronization.h
index f9eb8d0d23198..7e7c8eacb9173 100644
--- a/offload/DeviceRTL/include/Synchronization.h
+++ b/offload/DeviceRTL/include/Synchronization.h
@@ -59,7 +59,11 @@ V add(Ty *Address, V Val, atomic::OrderingTy Ordering,
 template <typename Ty, typename V = utils::remove_addrspace_t<Ty>>
 V load(Ty *Address, atomic::OrderingTy Ordering,
        MemScopeTy MemScope = MemScopeTy::device) {
+#ifdef __NVPTX__
+  return __scoped_atomic_fetch_add(Address, V(0), Ordering, MemScope);
+#else
   return __scoped_atomic_load_n(Address, Ordering, MemScope);
+#endif
 }
 
 template <typename Ty, typename V = utils::remove_addrspace_t<Ty>>
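The idea behind the patch, reading a value by atomically adding zero, can be illustrated on the host with standard C++. This is only a sketch under stated assumptions: `load_via_fetch_add` is a hypothetical helper that is not part of DeviceRTL, and it uses `std::atomic` rather than the `__scoped_atomic_*` builtins from the diff, so it says nothing about memory scopes or about the code NVPTX actually generates.

```cpp
#include <atomic>

// Hypothetical host-side helper (an assumption, not DeviceRTL code).
// fetch_add returns the value held before the addition, and adding zero
// leaves the stored value unchanged, so the result matches what a load
// with the same ordering would observe. Unlike a plain load, this is a
// read-modify-write operation, which is why it is less efficient.
// Intended for integral T only.
template <typename T>
T load_via_fetch_add(std::atomic<T> &Obj,
                     std::memory_order Order = std::memory_order_seq_cst) {
  return Obj.fetch_add(T(0), Order);
}
```

For example, calling `load_via_fetch_add(Counter, std::memory_order_acquire)` on a `std::atomic<int> Counter{42}` returns 42 and leaves `Counter` unchanged. Because the operation writes as well as reads, it needs writable memory, which a plain load does not.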

ye-luo · 2025-05-06T00:01:58Z

@jhuber6 thank you. All works fine now.

ye-luo · 2025-05-06T03:12:37Z

/cherry-pick dfcb8cb

llvmbot · 2025-05-06T03:18:37Z

/pull-request #138626

Summary: Different ordering modes aren't supported for an atomic load, so we just do an add of zero as the same thing. It's less efficient, but it works. Fixes llvm#138560 (cherry picked from commit dfcb8cb)

[OpenMP] Add pre sm_70 load hack back in

0797f02

jhuber6 requested review from jdoerfert, shiltian and ye-luo May 5, 2025 21:21

llvmbot added the offload label May 5, 2025

shiltian approved these changes May 5, 2025

jhuber6 merged commit dfcb8cb into llvm:main May 5, 2025
10 of 11 checks passed

ye-luo modified the milestones: Clang C++20, LLVM 20.X Release May 6, 2025

github-project-automation bot added this to LLVM Release Status May 6, 2025

github-project-automation bot moved this to Needs Triage in LLVM Release Status May 6, 2025

ye-luo modified the milestone: LLVM 20.X Release May 6, 2025

llvmbot moved this from Needs Triage to Done in LLVM Release Status May 6, 2025
