[DO NOT MERGE] optimize and clean bit functions #3226

Open: 11 commits to merge into base branch main

@fbusato (Contributor) commented Dec 31, 2024

(partially) Fixes #2239

Description

Optimize and clean up the following functions:

  • countl_zero, countr_zero
  • countl_one, countr_one
  • popcount
  • has_single_bit
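For readers unfamiliar with these functions, the following is a minimal reference sketch of their semantics for 32-bit unsigned values. It is not the libcu++ implementation and does not include the run-time optimizations from #2239; the `sketch` namespace and the simple loops are assumptions made for illustration only (C++14 or later).

```cpp
#include <cstdint>

// Illustrative reference semantics only; `sketch` is a hypothetical namespace.
namespace sketch {

// Number of consecutive zero bits starting from the most significant bit.
constexpr int countl_zero(std::uint32_t x) noexcept {
  int n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1) {
    ++n;
  }
  return n; // 32 when x == 0
}

// Number of consecutive zero bits starting from the least significant bit.
constexpr int countr_zero(std::uint32_t x) noexcept {
  int n = 0;
  for (std::uint32_t mask = 1u; mask != 0 && (x & mask) == 0; mask <<= 1) {
    ++n;
  }
  return n; // 32 when x == 0
}

// The "one" variants are the "zero" variants applied to the complement.
constexpr int countl_one(std::uint32_t x) noexcept {
  return countl_zero(static_cast<std::uint32_t>(~x));
}

constexpr int countr_one(std::uint32_t x) noexcept {
  return countr_zero(static_cast<std::uint32_t>(~x));
}

// Number of set bits; each iteration clears the lowest set bit.
constexpr int popcount(std::uint32_t x) noexcept {
  int n = 0;
  while (x != 0) {
    x &= x - 1;
    ++n;
  }
  return n;
}

// True when exactly one bit is set, i.e. x is a power of two.
constexpr bool has_single_bit(std::uint32_t x) noexcept {
  return x != 0 && (x & (x - 1)) == 0;
}

} // namespace sketch

static_assert(sketch::countl_zero(1u) == 31, "");
static_assert(sketch::countr_zero(8u) == 3, "");
static_assert(sketch::countl_one(0xF0000000u) == 4, "");
static_assert(sketch::countr_one(0x7u) == 3, "");
static_assert(sketch::popcount(0xFFu) == 8, "");
static_assert(sketch::has_single_bit(64u) && !sketch::has_single_bit(6u), "");
```

An optimized implementation would typically replace these loops with compiler builtins or device intrinsics at run time while keeping a constexpr fallback of this kind.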

Run-time optimizations are described in #2239.

Cleanup:

  • Use C++14 constexpr to avoid several function instantiations
  • Add _CCCL_NODISCARD to all functions
  • Add __detail namespace
  • Fully qualify function calls with their namespace
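The first cleanup item can be illustrated with a small sketch. In C++11 a constexpr function body is limited to a single return statement, so a binary-search count of leading zeros has to be spread across helper functions or recursion; C++14 allows loops and local mutation, so one function suffices. The names `clz_cxx11`, `clz_cxx11_step`, and `clz_cxx14` below are hypothetical and the code is not taken from this PR.

```cpp
#include <cstdint>

// C++11 style: one return statement per function, so each halving step
// becomes a recursive call (hypothetical helper, for illustration only).
constexpr int clz_cxx11_step(std::uint64_t x, int shift, int n) noexcept {
  return shift == 0
           ? n - static_cast<int>(x) // x is 0 or 1 here
           : (x >> shift) != 0
               ? clz_cxx11_step(x >> shift, shift / 2, n - shift)
               : clz_cxx11_step(x, shift / 2, n);
}

constexpr int clz_cxx11(std::uint64_t x) noexcept {
  return clz_cxx11_step(x, 32, 64);
}

// C++14 style: the same binary search written as a single function with a loop.
constexpr int clz_cxx14(std::uint64_t x) noexcept {
  int n = 64;
  for (int shift = 32; shift != 0; shift /= 2) {
    const std::uint64_t y = x >> shift;
    if (y != 0) { // the upper half contains a set bit: discard the lower half
      n -= shift;
      x = y;
    }
  }
  return n - static_cast<int>(x); // x is 0 or 1 here
}

static_assert(clz_cxx11(0) == 64 && clz_cxx14(0) == 64, "");
static_assert(clz_cxx11(1) == 63 && clz_cxx14(1) == 63, "");
static_assert(clz_cxx11(0xFF) == 56 && clz_cxx14(0xFF) == 56, "");
static_assert(clz_cxx11(~0ull) == 0 && clz_cxx14(~0ull) == 0, "");
```

Both versions compute the same result; the C++14 form simply avoids the extra helper, at the cost of no longer being usable in constant expressions under C++11.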

DO NOT MERGE

  • require C++17

@fbusato requested review from a team as code owners on December 31, 2024 22:25
libcudacxx/include/cuda/std/__bit/clz.h (outdated)
@@ -30,124 +30,106 @@

_LIBCUDACXX_BEGIN_NAMESPACE_STD

_LIBCUDACXX_HIDE_FROM_ABI constexpr int __binary_clz2(uint64_t __x, int __c)
Collaborator


We cannot break user code that relies on these functions being constexpr in C++11, even if we have deprecated it.

Contributor Author


Agreed. This is why I added [DO NOT MERGE]: the PR is intended for the next major update.

@fbusato self-assigned this on Jan 2, 2025
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

[FEA]: Provide optimized <bit> functions for device