[OpenMP] Add pre sm_70 load hack back in #138589
Summary: Pre-sm_70 NVPTX targets don't support the different ordering modes for an atomic load, so on NVPTX the load is done as an atomic add of zero, which returns the same value. It's less efficient, but it works. Fixes llvm#138560
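The "add of zero" idea can be sketched in portable C++ as follows. This is only an illustration, assuming `std::atomic` in place of the `__scoped_atomic_*` builtins used in DeviceRTL; `load_via_fetch_add` and `read_counter_example` are hypothetical names and are not part of the patch.

```cpp
#include <atomic>

// Hypothetical helper, not part of DeviceRTL: emulate an atomic load with a
// read-modify-write that adds zero. fetch_add returns the value held before
// the addition, and adding zero leaves the stored value unchanged.
template <typename T>
T load_via_fetch_add(std::atomic<T> &Obj,
                     std::memory_order Order = std::memory_order_seq_cst) {
  return Obj.fetch_add(T(0), Order);
}

// Small usage example: both paths observe the same value.
inline int read_counter_example() {
  std::atomic<int> Counter{42};
  int ViaAdd = load_via_fetch_add(Counter, std::memory_order_acquire);
  int ViaLoad = Counter.load(std::memory_order_acquire);
  return ViaAdd == ViaLoad ? ViaAdd : -1;
}
```

Because the emulated load is a read-modify-write, it needs write access to the location and generally costs more than a plain load, which matches the "less efficient, but it works" trade-off described above.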
@llvm/pr-subscribers-offload
Author: Joseph Huber (jhuber6)
Changes
Summary: Fixes #138560
Full diff: https://github.com/llvm/llvm-project/pull/138589.diff
1 Files Affected:
diff --git a/offload/DeviceRTL/include/Synchronization.h b/offload/DeviceRTL/include/Synchronization.h
index f9eb8d0d23198..7e7c8eacb9173 100644
--- a/offload/DeviceRTL/include/Synchronization.h
+++ b/offload/DeviceRTL/include/Synchronization.h
@@ -59,7 +59,11 @@ V add(Ty *Address, V Val, atomic::OrderingTy Ordering,
 template <typename Ty, typename V = utils::remove_addrspace_t<Ty>>
 V load(Ty *Address, atomic::OrderingTy Ordering,
        MemScopeTy MemScope = MemScopeTy::device) {
+#ifdef __NVPTX__
+  return __scoped_atomic_fetch_add(Address, V(0), Ordering, MemScope);
+#else
   return __scoped_atomic_load_n(Address, Ordering, MemScope);
+#endif
 }
 template <typename Ty, typename V = utils::remove_addrspace_t<Ty>>
@jhuber6 thank you. All works fine now.
/cherry-pick dfcb8cb
/pull-request #138626
Summary: Different ordering modes aren't supported for an atomic load, so we just do an add of zero as the same thing. It's less efficient, but it works. Fixes llvm#138560 (cherry picked from commit dfcb8cb)