[OpenMP] Add pre sm_70 load hack back in #138589

Merged
jhuber6 merged 1 commit into llvm:main on May 5, 2025

Conversation

jhuber6
Contributor

@jhuber6 jhuber6 commented May 5, 2025

Summary:
Different ordering modes aren't supported for an atomic load on pre-sm_70
NVPTX targets, so we implement the load as an atomic add of zero, which
returns the current value. It's less efficient, but it works.

Fixes #138560

@jhuber6 jhuber6 requested review from jdoerfert, shiltian and ye-luo May 5, 2025 21:21
@llvmbot llvmbot added the offload label May 5, 2025
@llvmbot
Member

llvmbot commented May 5, 2025

@llvm/pr-subscribers-offload

Author: Joseph Huber (jhuber6)

Changes

Summary:
Different ordering modes aren't supported for an atomic load, so we just
do an add of zero as the same thing. It's less efficient, but it works.

Fixes #138560


Full diff: https://github.com/llvm/llvm-project/pull/138589.diff

1 File Affected:

  • (modified) offload/DeviceRTL/include/Synchronization.h (+4)
diff --git a/offload/DeviceRTL/include/Synchronization.h b/offload/DeviceRTL/include/Synchronization.h
index f9eb8d0d23198..7e7c8eacb9173 100644
--- a/offload/DeviceRTL/include/Synchronization.h
+++ b/offload/DeviceRTL/include/Synchronization.h
@@ -59,7 +59,11 @@ V add(Ty *Address, V Val, atomic::OrderingTy Ordering,
 template <typename Ty, typename V = utils::remove_addrspace_t<Ty>>
 V load(Ty *Address, atomic::OrderingTy Ordering,
        MemScopeTy MemScope = MemScopeTy::device) {
+#ifdef __NVPTX__
+  return __scoped_atomic_fetch_add(Address, V(0), Ordering, MemScope);
+#else
   return __scoped_atomic_load_n(Address, Ordering, MemScope);
+#endif
 }
 
 template <typename Ty, typename V = utils::remove_addrspace_t<Ty>>
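
The idea behind the patch can be tried on the host. The following is a minimal sketch, not the DeviceRTL code: it assumes `std::atomic` in place of Clang's `__scoped_atomic_*` builtins, ignores memory scopes entirely, and the names `loadDirect`, `loadViaAddZero`, and `checkLoadPaths` are hypothetical. It only illustrates that an atomic fetch-add of zero returns the stored value and leaves it unchanged.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical host-side stand-ins. The real helper lives in
// offload/DeviceRTL/include/Synchronization.h and uses scoped builtins.

// Direct atomic load: the path the patch keeps for non-NVPTX targets.
template <typename T>
T loadDirect(std::atomic<T> &Address, std::memory_order Ordering) {
  return Address.load(Ordering);
}

// Workaround path: adding zero is a read-modify-write that returns the
// value held before the add and writes the same value back.
template <typename T>
T loadViaAddZero(std::atomic<T> &Address, std::memory_order Ordering) {
  return Address.fetch_add(T(0), Ordering);
}

// Usage sketch: both paths observe the same value and neither changes it.
inline void checkLoadPaths() {
  std::atomic<int> Flag{42};
  assert(loadViaAddZero(Flag, std::memory_order_acquire) == 42);
  assert(loadDirect(Flag, std::memory_order_acquire) == 42);
  assert(Flag.load() == 42);
}
```

The trade-off matches the PR description: the add-of-zero form is a read-modify-write, so it is likely more expensive than a plain load, but it accepts the requested ordering.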

@jhuber6 jhuber6 merged commit dfcb8cb into llvm:main May 5, 2025
10 of 11 checks passed
@ye-luo
Contributor

ye-luo commented May 6, 2025

@jhuber6 thank you. All works fine now.

@ye-luo
Contributor

ye-luo commented May 6, 2025

/cherry-pick dfcb8cb

@llvmbot
Member

llvmbot commented May 6, 2025

/pull-request #138626

@llvmbot llvmbot moved this from Needs Triage to Done in LLVM Release Status May 6, 2025
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request May 9, 2025
(cherry picked from commit dfcb8cb)
Successfully merging this pull request may close these issues.

[Offload] regression on sm_60
4 participants