[DO NOT MERGE] ceil_div return common type and optimize #3229

Open
wants to merge 4 commits into
base: main

Conversation

fbusato
Contributor

@fbusato fbusato commented Jan 1, 2025

Fixes #2845, #2391

Description

ceil_div now returns the result type of the division (decltype(_Tp{} / _Up{})) instead of _Tp, and has been optimized for CUDA.

DO NOT MERGE

  • requires C++17
  • breaking change in the API

@fbusato fbusato requested review from a team as code owners January 1, 2025 01:44
@fbusato fbusato requested review from wmaxey and alliepiper January 1, 2025 01:44
_CUDA_VSTD::enable_if_t<_CCCL_TRAIT(_CUDA_VSTD::is_integral, _Up), int> = 0>
_CCCL_NODISCARD _LIBCUDACXX_HIDE_FROM_ABI _CCCL_CONSTEXPR_CXX14 _Tp ceil_div(const _Tp __a, const _Up __b) noexcept
_CCCL_NODISCARD _LIBCUDACXX_HIDE_FROM_ABI _CCCL_CONSTEXPR_CXX14 decltype(_Tp{} / _Up{})
Collaborator

This should definitely use common_type
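For context on this suggestion, the two candidate return types differ only when both operands have the same type narrower than int. The following is a minimal sketch in plain standard C++17; the aliases `div_result_t` and `common_t` are hypothetical names introduced here for illustration and do not appear in the PR.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical aliases for illustration only; not part of the PR.
template <class T, class U>
using div_result_t = decltype(T{} / U{}); // type produced by the division itself

template <class T, class U>
using common_t = std::common_type_t<T, U>; // type suggested in the review comment

// Integer promotion: short / short is computed in int.
static_assert(std::is_same_v<div_result_t<short, short>, int>);
// common_type of two identical types is that type, with no promotion applied.
static_assert(std::is_same_v<common_t<short, short>, short>);

// For operands at least as wide as int, both approaches agree.
static_assert(std::is_same_v<div_result_t<int, long long>, long long>);
static_assert(std::is_same_v<common_t<int, long long>, long long>);
static_assert(std::is_same_v<div_result_t<int, unsigned>, unsigned>);
static_assert(std::is_same_v<common_t<int, unsigned>, unsigned>);
```

Under this reading, choosing common_type would keep a narrow result type for same-type narrow operands, while the decltype form follows whatever type the built-in division yields.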

Contributor Author

I know this is a breaking change. It is intended for the next major update.

Collaborator

@miscco miscco left a comment

I find the previous implementation much simpler, please keep signed and unsigned separate

@fbusato
Contributor Author

fbusato commented Jan 2, 2025

I find the previous implementation much simpler, please keep signed and unsigned separate

I don't quite agree. With two separate functions, we would need to duplicate 15 lines of code, which is not great.

template <class _Tp,
          class _Up,
          _CUDA_VSTD::enable_if_t<_CCCL_TRAIT(_CUDA_VSTD::is_integral, _Tp), int> = 0,
          _CUDA_VSTD::enable_if_t<_CCCL_TRAIT(_CUDA_VSTD::is_integral, _Up), int> = 0>
_CCCL_NODISCARD _LIBCUDACXX_HIDE_FROM_ABI _CCCL_CONSTEXPR_CXX14 decltype(_Tp{} / _Up{})
ceil_div(const _Tp __a, const _Up __b) noexcept
{
  _CCCL_ASSERT(__b > _Up{0}, "cuda::ceil_div: b must be positive");
  using _Common  = decltype(_Tp{} / _Up{});
  using _UCommon = _CUDA_VSTD::make_unsigned_t<_Common>;
  if constexpr (_CUDA_VSTD::is_signed_v<_Tp>)
  {
    _CCCL_ASSERT(__a >= _Tp{0}, "cuda::ceil_div: a must be non negative");
  }
  auto __a1 = static_cast<_UCommon>(__a);
  auto __b1 = static_cast<_UCommon>(__b);
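
The snippet above is cut off in this view, and its remaining lines are not reproduced here. As a standalone illustration of the single-function approach being argued for, here is a minimal sketch in plain standard C++17. The name `ceil_div_sketch` is hypothetical, `assert` stands in for `_CCCL_ASSERT`, and the remainder-based rounding in the last line is this sketch's own assumption, not necessarily what the PR does.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical sketch, not the PR's implementation: one function template
// covers both signed and unsigned operands by branching with `if constexpr`.
template <class T,
          class U,
          std::enable_if_t<std::is_integral_v<T> && std::is_integral_v<U>, int> = 0>
[[nodiscard]] constexpr auto ceil_div_sketch(const T a, const U b) noexcept -> decltype(T{} / U{})
{
  using Common  = decltype(T{} / U{});          // result type of the division
  using UCommon = std::make_unsigned_t<Common>; // unsigned counterpart used for the arithmetic
  assert(b > U{0}); // b must be positive
  if constexpr (std::is_signed_v<T>)
  {
    assert(a >= T{0}); // a must be non-negative; only checked for signed T
  }
  const auto a1 = static_cast<UCommon>(a);
  const auto b1 = static_cast<UCommon>(b);
  // Assumed rounding step: quotient plus one when a remainder exists.
  // This avoids the overflow that (a + b - 1) / b can hit for large a.
  return static_cast<Common>(a1 / b1 + (a1 % b1 != 0));
}

// The return type follows the division, so two shorts yield int.
static_assert(std::is_same_v<decltype(ceil_div_sketch(short{7}, short{2})), int>);
```

For example, `ceil_div_sketch(7, 2)` evaluates to 4 and `ceil_div_sketch(8, 2)` also evaluates to 4. The signed check compiles away entirely for unsigned `T`, which is what lets one body serve both cases without duplication.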

Contributor

github-actions bot commented Jan 2, 2025

🟨 CI finished in 2h 02m: Pass: 79%/170 | Total: 3d 02h | Avg: 26m 16s | Max: 1h 23m | Hits: 36%/17647
  • 🟨 libcudacxx: Pass: 72%/48 | Total: 15h 01m | Avg: 18m 46s | Max: 1h 23m | Hits: 30%/7578

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  71%/46  | Total: 14h 17m | Avg: 18m 38s | Max:  1h 23m | Hits:  30%/7578  
      🟩 arm64              Pass: 100%/2   | Total: 43m 46s | Avg: 21m 53s | Max: 22m 50s
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 09m | Avg: 17m 24s | Max: 22m 43s
      🔍 nvcc               Pass:  70%/44  | Total: 13h 51m | Avg: 18m 53s | Max:  1h 23m | Hits:  30%/7578  
    🟨 ctk
      🟥 11.1               Pass:   0%/7   | Total: 58m 15s | Avg:  8m 19s | Max: 17m 41s
      🟩 12.5               Pass: 100%/2   | Total: 49m 28s | Avg: 24m 44s | Max: 34m 40s
      🟨 12.6               Pass:  84%/39  | Total: 13h 13m | Avg: 20m 20s | Max:  1h 23m | Hits:  30%/7578  
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 09m | Avg: 17m 24s | Max: 22m 43s
      🟥 nvcc11.1           Pass:   0%/7   | Total: 58m 15s | Avg:  8m 19s | Max: 17m 41s
      🟩 nvcc12.5           Pass: 100%/2   | Total: 49m 28s | Avg: 24m 44s | Max: 34m 40s
      🟨 nvcc12.6           Pass:  82%/35  | Total: 12h 03m | Avg: 20m 40s | Max:  1h 23m | Hits:  30%/7578  
    🟨 cxx
      🟥 Clang9             Pass:   0%/4   | Total: 26m 32s | Avg:  6m 38s | Max: 17m 41s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 45s | Avg:  4m 45s | Max:  4m 45s
      🟩 Clang11            Pass: 100%/1   | Total:  4m 23s | Avg:  4m 23s | Max:  4m 23s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 03s | Avg:  4m 03s | Max:  4m 03s
      🟩 Clang13            Pass: 100%/1   | Total: 21m 21s | Avg: 21m 21s | Max: 21m 21s
      🟩 Clang14            Pass: 100%/1   | Total: 22m 42s | Avg: 22m 42s | Max: 22m 42s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 30s | Avg:  4m 30s | Max:  4m 30s
      🟩 Clang16            Pass: 100%/1   | Total: 21m 34s | Avg: 21m 34s | Max: 21m 34s
      🟩 Clang17            Pass: 100%/1   | Total: 23m 37s | Avg: 23m 37s | Max: 23m 37s
      🟩 Clang18            Pass: 100%/8   | Total:  2h 44m | Avg: 20m 31s | Max: 44m 04s
      🟥 GCC6               Pass:   0%/2   | Total:  3m 39s | Avg:  1m 49s | Max:  1m 50s
      🟥 GCC7               Pass:   0%/2   | Total: 20m 01s | Avg: 10m 00s | Max: 17m 55s
      🟩 GCC8               Pass: 100%/1   | Total:  3m 37s | Avg:  3m 37s | Max:  3m 37s
      🟨 GCC9               Pass:  33%/3   | Total: 40m 54s | Avg: 13m 38s | Max: 22m 57s
      🟩 GCC10              Pass: 100%/1   | Total:  4m 18s | Avg:  4m 18s | Max:  4m 18s
      🟩 GCC11              Pass: 100%/1   | Total: 23m 35s | Avg: 23m 35s | Max: 23m 35s
      🟩 GCC12              Pass: 100%/1   | Total: 22m 31s | Avg: 22m 31s | Max: 22m 31s
      🟨 GCC13              Pass:  80%/10  | Total:  4h 44m | Avg: 28m 28s | Max:  1h 23m
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 22m 52s | Avg: 22m 52s | Max: 22m 52s
      🟥 MSVC14.16          Pass:   0%/1   | Total: 16m 54s | Avg: 16m 54s | Max: 16m 54s
      🟩 MSVC14.29          Pass: 100%/1   | Total: 37m 57s | Avg: 37m 57s | Max: 37m 57s | Hits:  30%/2477  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 13m | Avg: 36m 30s | Max: 37m 15s | Hits:  29%/5101  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 49m 28s | Avg: 24m 44s | Max: 34m 40s
    🟨 cxx_family
      🟨 Clang              Pass:  80%/20  | Total:  4h 57m | Avg: 14m 52s | Max: 44m 04s
      🟨 GCC                Pass:  61%/21  | Total:  6h 43m | Avg: 19m 12s | Max:  1h 23m
      🟩 Intel              Pass: 100%/1   | Total: 22m 52s | Avg: 22m 52s | Max: 22m 52s
      🟨 MSVC               Pass:  75%/4   | Total:  2h 07m | Avg: 31m 58s | Max: 37m 57s | Hits:  30%/7578  
      🟩 NVHPC              Pass: 100%/2   | Total: 49m 28s | Avg: 24m 44s | Max: 34m 40s
    🟨 jobs
      🟨 Build              Pass:  73%/41  | Total: 10h 46m | Avg: 15m 46s | Max: 37m 57s | Hits:  30%/7578  
      🟨 NVRTC              Pass:  50%/4   | Total:  2h 05m | Avg: 31m 23s | Max: 38m 59s
      🟩 Test               Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 23m
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 54s | Avg:  1m 54s | Max:  1m 54s
    🟨 std
      🟥 11                 Pass:   0%/6   | Total: 42m 32s | Avg:  7m 05s | Max: 32m 46s
      🟥 14                 Pass:   0%/5   | Total:  1h 05m | Avg: 13m 00s | Max: 23m 40s
      🟨 17                 Pass:  84%/13  | Total:  4h 29m | Avg: 20m 46s | Max: 37m 57s | Hits:  30%/4954  
      🟩 20                 Pass: 100%/23  | Total:  8h 41m | Avg: 22m 41s | Max:  1h 23m | Hits:  29%/2624  
    🟨 gpu
      🟨 v100               Pass:  72%/48  | Total: 15h 01m | Avg: 18m 46s | Max:  1h 23m | Hits:  30%/7578  
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total: 13m 11s | Avg: 13m 11s | Max: 13m 11s
      🟩 90a                Pass: 100%/2   | Total: 20m 56s | Avg: 10m 28s | Max: 13m 09s
    
  • 🟨 cub: Pass: 76%/47 | Total: 1d 07h | Avg: 40m 03s | Max: 1h 13m | Hits: 27%/2349

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  75%/45  | Total:  1d 05h | Avg: 39m 07s | Max:  1h 13m | Hits:  27%/2349  
      🟩 arm64              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 06m
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 04m
      🔍 nvcc               Pass:  75%/45  | Total:  1d 05h | Avg: 39m 08s | Max:  1h 13m | Hits:  27%/2349  
    🔍 gpu: v100 🔍
      🟩 h100               Pass: 100%/2   | Total: 50m 10s | Avg: 25m 05s | Max: 27m 26s
      🔍 v100               Pass:  75%/45  | Total:  1d 06h | Avg: 40m 43s | Max:  1h 13m | Hits:  27%/2349  
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  72%/40  | Total:  1d 04h | Avg: 42m 24s | Max:  1h 13m | Hits:  27%/2349  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 45m 00s | Avg: 45m 00s | Max: 45m 00s
      🟩 GraphCapture       Pass: 100%/1   | Total: 20m 38s | Avg: 20m 38s | Max: 20m 38s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 02m | Avg: 20m 54s | Max: 22m 44s
      🟩 TestGPU            Pass: 100%/2   | Total: 57m 57s | Avg: 28m 58s | Max: 29m 03s
    🟨 ctk
      🟥 11.1               Pass:   0%/7   | Total: 33m 37s | Avg:  4m 48s | Max: 18m 21s
      🟩 12.5               Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 13m
      🟨 12.6               Pass:  89%/38  | Total:  1d 04h | Avg: 44m 48s | Max:  1h 06m | Hits:  27%/2349  
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 04m
      🟥 nvcc11.1           Pass:   0%/7   | Total: 33m 37s | Avg:  4m 48s | Max: 18m 21s
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 13m
      🟨 nvcc12.6           Pass:  88%/36  | Total:  1d 02h | Avg: 43m 55s | Max:  1h 06m | Hits:  27%/2349  
    🟨 cxx
      🟥 Clang9             Pass:   0%/4   | Total: 12m 14s | Avg:  3m 03s | Max:  3m 14s
      🟩 Clang10            Pass: 100%/1   | Total: 58m 01s | Avg: 58m 01s | Max: 58m 01s
      🟩 Clang11            Pass: 100%/1   | Total: 59m 56s | Avg: 59m 56s | Max: 59m 56s
      🟩 Clang12            Pass: 100%/1   | Total: 52m 27s | Avg: 52m 27s | Max: 52m 27s
      🟩 Clang13            Pass: 100%/1   | Total: 53m 29s | Avg: 53m 29s | Max: 53m 29s
      🟩 Clang14            Pass: 100%/1   | Total: 53m 47s | Avg: 53m 47s | Max: 53m 47s
      🟩 Clang15            Pass: 100%/1   | Total: 57m 41s | Avg: 57m 41s | Max: 57m 41s
      🟩 Clang16            Pass: 100%/1   | Total: 53m 41s | Avg: 53m 41s | Max: 53m 41s
      🟩 Clang17            Pass: 100%/1   | Total: 55m 59s | Avg: 55m 59s | Max: 55m 59s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 35m | Avg: 47m 52s | Max:  1h 04m
      🟥 GCC6               Pass:   0%/2   | Total:  4m 06s | Avg:  2m 03s | Max:  2m 04s
      🟥 GCC7               Pass:   0%/2   | Total:  5m 53s | Avg:  2m 56s | Max:  2m 57s
      🟩 GCC8               Pass: 100%/1   | Total: 58m 49s | Avg: 58m 49s | Max: 58m 49s
      🟨 GCC9               Pass:  33%/3   | Total: 59m 09s | Avg: 19m 43s | Max: 53m 48s
      🟩 GCC10              Pass: 100%/1   | Total: 56m 43s | Avg: 56m 43s | Max: 56m 43s
      🟩 GCC11              Pass: 100%/1   | Total: 58m 25s | Avg: 58m 25s | Max: 58m 25s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 45m | Avg: 35m 01s | Max: 54m 53s
      🟩 GCC13              Pass: 100%/8   | Total:  5h 23m | Avg: 40m 23s | Max:  1h 06m
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟥 MSVC14.16          Pass:   0%/1   | Total: 18m 21s | Avg: 18m 21s | Max: 18m 21s
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 04m | Avg:  1h 04m | Max:  1h 04m | Hits:  28%/783   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 04m | Hits:  27%/1566  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 13m
    🟨 cxx_family
      🟨 Clang              Pass:  78%/19  | Total: 13h 12m | Avg: 41m 42s | Max:  1h 04m
      🟨 GCC                Pass:  71%/21  | Total: 11h 11m | Avg: 31m 57s | Max:  1h 06m
      🟩 Intel              Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟨 MSVC               Pass:  75%/4   | Total:  3h 31m | Avg: 52m 52s | Max:  1h 04m | Hits:  27%/2349  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 13m
    🟨 std
      🟥 11                 Pass:   0%/5   | Total: 13m 41s | Avg:  2m 44s | Max:  3m 14s
      🟥 14                 Pass:   0%/4   | Total: 26m 33s | Avg:  6m 38s | Max: 18m 21s
      🟨 17                 Pass:  83%/12  | Total: 10h 16m | Avg: 51m 21s | Max:  1h 13m | Hits:  28%/1566  
      🟩 20                 Pass: 100%/26  | Total: 20h 26m | Avg: 47m 09s | Max:  1h 12m | Hits:  27%/783   
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 50m 10s | Avg: 25m 05s | Max: 27m 26s
      🟩 90a                Pass: 100%/1   | Total: 27m 02s | Avg: 27m 02s | Max: 27m 02s
    
  • 🟨 thrust: Pass: 76%/46 | Total: 1d 00h | Avg: 32m 06s | Max: 1h 15m | Hits: 43%/7408

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  75%/44  | Total: 23h 21m | Avg: 31m 50s | Max:  1h 15m | Hits:  43%/7408  
      🟩 arm64              Pass: 100%/2   | Total:  1h 15m | Avg: 37m 59s | Max: 41m 06s
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 09m | Avg: 34m 56s | Max: 38m 30s
      🔍 nvcc               Pass:  75%/44  | Total: 23h 27m | Avg: 31m 58s | Max:  1h 15m | Hits:  43%/7408  
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  72%/40  | Total: 22h 23m | Avg: 33m 35s | Max:  1h 15m | Hits:  24%/5556  
      🟩 TestCPU            Pass: 100%/3   | Total: 38m 05s | Avg: 12m 41s | Max: 21m 56s | Hits:  99%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 35m | Avg: 31m 42s | Max: 54m 30s
    🟨 ctk
      🟥 11.1               Pass:   0%/7   | Total: 36m 15s | Avg:  5m 10s | Max: 25m 00s
      🟩 12.5               Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 15m
      🟨 12.6               Pass:  89%/37  | Total: 21h 37m | Avg: 35m 03s | Max:  1h 12m | Hits:  43%/7408  
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 09m | Avg: 34m 56s | Max: 38m 30s
      🟥 nvcc11.1           Pass:   0%/7   | Total: 36m 15s | Avg:  5m 10s | Max: 25m 00s
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 15m
      🟨 nvcc12.6           Pass:  88%/35  | Total: 20h 27m | Avg: 35m 04s | Max:  1h 12m | Hits:  43%/7408  
    🟨 cxx
      🟥 Clang9             Pass:   0%/4   | Total:  8m 43s | Avg:  2m 10s | Max:  2m 26s
      🟩 Clang10            Pass: 100%/1   | Total: 39m 37s | Avg: 39m 37s | Max: 39m 37s
      🟩 Clang11            Pass: 100%/1   | Total: 36m 48s | Avg: 36m 48s | Max: 36m 48s
      🟩 Clang12            Pass: 100%/1   | Total: 36m 20s | Avg: 36m 20s | Max: 36m 20s
      🟩 Clang13            Pass: 100%/1   | Total: 34m 48s | Avg: 34m 48s | Max: 34m 48s
      🟩 Clang14            Pass: 100%/1   | Total: 41m 57s | Avg: 41m 57s | Max: 41m 57s
      🟩 Clang15            Pass: 100%/1   | Total: 43m 30s | Avg: 43m 30s | Max: 43m 30s
      🟩 Clang16            Pass: 100%/1   | Total: 40m 17s | Avg: 40m 17s | Max: 40m 17s
      🟩 Clang17            Pass: 100%/1   | Total: 40m 59s | Avg: 40m 59s | Max: 40m 59s
      🟩 Clang18            Pass: 100%/7   | Total:  3h 32m | Avg: 30m 23s | Max: 40m 27s
      🟥 GCC6               Pass:   0%/2   | Total:  3m 42s | Avg:  1m 51s | Max:  1m 54s
      🟥 GCC7               Pass:   0%/2   | Total:  4m 20s | Avg:  2m 10s | Max:  2m 11s
      🟩 GCC8               Pass: 100%/1   | Total: 41m 50s | Avg: 41m 50s | Max: 41m 50s
      🟨 GCC9               Pass:  33%/3   | Total: 44m 23s | Avg: 14m 47s | Max: 40m 50s
      🟩 GCC10              Pass: 100%/1   | Total: 43m 10s | Avg: 43m 10s | Max: 43m 10s
      🟩 GCC11              Pass: 100%/1   | Total: 41m 24s | Avg: 41m 24s | Max: 41m 24s
      🟩 GCC12              Pass: 100%/1   | Total: 46m 46s | Avg: 46m 46s | Max: 46m 46s
      🟩 GCC13              Pass: 100%/8   | Total:  4h 22m | Avg: 32m 49s | Max: 54m 30s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 50m 01s | Avg: 50m 01s | Max: 50m 01s
      🟥 MSVC14.16          Pass:   0%/1   | Total: 25m 00s | Avg: 25m 00s | Max: 25m 00s
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m | Hits:  24%/1852  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 46m | Avg: 55m 38s | Max:  1h 12m | Hits:  49%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 15m
    🟨 cxx_family
      🟨 Clang              Pass:  78%/19  | Total:  8h 55m | Avg: 28m 11s | Max: 43m 30s
      🟨 GCC                Pass:  68%/19  | Total:  8h 08m | Avg: 25m 41s | Max: 54m 30s
      🟩 Intel              Pass: 100%/1   | Total: 50m 01s | Avg: 50m 01s | Max: 50m 01s
      🟨 MSVC               Pass:  80%/5   | Total:  4h 19m | Avg: 51m 54s | Max:  1h 12m | Hits:  43%/7408  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 15m
    🟨 std
      🟥 11                 Pass:   0%/5   | Total: 10m 05s | Avg:  2m 01s | Max:  2m 17s
      🟥 14                 Pass:   0%/4   | Total: 31m 23s | Avg:  7m 50s | Max: 25m 00s
      🟨 17                 Pass:  83%/12  | Total:  8h 10m | Avg: 40m 50s | Max:  1h 12m | Hits:  24%/3704  
      🟩 20                 Pass: 100%/23  | Total: 14h 54m | Avg: 38m 54s | Max:  1h 15m | Hits:  62%/3704  
    🟨 gpu
      🟨 v100               Pass:  76%/46  | Total:  1d 00h | Avg: 32m 06s | Max:  1h 15m | Hits:  43%/7408  
    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 50m 54s | Avg: 25m 27s | Max: 35m 34s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 26m 46s | Avg: 26m 46s | Max: 26m 46s
    
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 46m | Avg: 6m 25s | Max: 34m 19s | Hits: 90%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 32m | Avg:  6m 57s | Max: 34m 19s | Hits:  90%/312   
      🟩 arm64              Pass: 100%/4   | Total: 13m 55s | Avg:  3m 28s | Max:  3m 41s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 16m 58s | Avg:  5m 39s | Max:  9m 53s | Hits:  91%/156   
      🟩 12.5               Pass: 100%/2   | Total: 12m 04s | Avg:  6m 02s | Max:  6m 15s
      🟩 12.6               Pass: 100%/21  | Total:  2h 17m | Avg:  6m 33s | Max: 34m 19s | Hits:  90%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 16m 58s | Avg:  5m 39s | Max:  9m 53s | Hits:  91%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 12m 04s | Avg:  6m 02s | Max:  6m 15s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  2h 17m | Avg:  6m 33s | Max: 34m 19s | Hits:  90%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 46m | Avg:  6m 25s | Max: 34m 19s | Hits:  90%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  3m 42s | Avg:  3m 42s | Max:  3m 42s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 37s | Avg:  4m 37s | Max:  4m 37s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s
      🟩 Clang12            Pass: 100%/1   | Total:  3m 35s | Avg:  3m 35s | Max:  3m 35s
      🟩 Clang13            Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 09s | Avg:  4m 09s | Max:  4m 09s
      🟩 Clang15            Pass: 100%/1   | Total:  3m 49s | Avg:  3m 49s | Max:  3m 49s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 02s | Avg:  4m 02s | Max:  4m 02s
      🟩 Clang17            Pass: 100%/1   | Total:  3m 54s | Avg:  3m 54s | Max:  3m 54s
      🟩 Clang18            Pass: 100%/4   | Total: 34m 48s | Avg:  8m 42s | Max: 23m 59s
      🟩 GCC9               Pass: 100%/1   | Total:  3m 23s | Avg:  3m 23s | Max:  3m 23s
      🟩 GCC10              Pass: 100%/1   | Total:  3m 37s | Avg:  3m 37s | Max:  3m 37s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 53s | Avg:  3m 53s | Max:  3m 53s
      🟩 GCC12              Pass: 100%/2   | Total: 38m 27s | Avg: 19m 13s | Max: 34m 19s
      🟩 GCC13              Pass: 100%/4   | Total: 13m 50s | Avg:  3m 27s | Max:  3m 41s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  9m 53s | Avg:  9m 53s | Max:  9m 53s | Hits:  91%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 25s | Avg: 11m 25s | Max: 11m 25s | Hits:  90%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 12m 04s | Avg:  6m 02s | Max:  6m 15s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total:  1h 10m | Avg:  5m 24s | Max: 23m 59s
      🟩 GCC                Pass: 100%/9   | Total:  1h 03m | Avg:  7m 01s | Max: 34m 19s
      🟩 MSVC               Pass: 100%/2   | Total: 21m 18s | Avg: 10m 39s | Max: 11m 25s | Hits:  90%/312   
      🟩 NVHPC              Pass: 100%/2   | Total: 12m 04s | Avg:  6m 02s | Max:  6m 15s
    🟩 gpu
      🟩 v100               Pass: 100%/26  | Total:  2h 46m | Avg:  6m 25s | Max: 34m 19s | Hits:  90%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 48m | Avg:  4m 31s | Max: 11m 25s | Hits:  90%/312   
      🟩 Test               Pass: 100%/2   | Total: 58m 18s | Avg: 29m 09s | Max: 34m 19s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 03s | Avg:  3m 03s | Max:  3m 03s
      🟩 90a                Pass: 100%/1   | Total:  3m 35s | Avg:  3m 35s | Max:  3m 35s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 22m 54s | Avg:  3m 49s | Max:  5m 49s
      🟩 20                 Pass: 100%/20  | Total:  2h 23m | Avg:  7m 11s | Max: 34m 19s | Hits:  90%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 55s | Avg: 4m 57s | Max: 7m 34s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  7m 34s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  7m 34s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  7m 34s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  7m 34s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  7m 34s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  7m 34s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  7m 34s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 21s | Avg:  2m 21s | Max:  2m 21s
      🟩 Test               Pass: 100%/1   | Total:  7m 34s | Avg:  7m 34s | Max:  7m 34s
    
  • 🟩 python: Pass: 100%/1 | Total: 27m 52s | Avg: 27m 52s | Max: 27m 52s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 27m 52s | Avg: 27m 52s | Max: 27m 52s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 27m 52s | Avg: 27m 52s | Max: 27m 52s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 27m 52s | Avg: 27m 52s | Max: 27m 52s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 27m 52s | Avg: 27m 52s | Max: 27m 52s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 27m 52s | Avg: 27m 52s | Max: 27m 52s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 27m 52s | Avg: 27m 52s | Max: 27m 52s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 27m 52s | Avg: 27m 52s | Max: 27m 52s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 27m 52s | Avg: 27m 52s | Max: 27m 52s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 170)

# Runner
125 linux-amd64-cpu16
19 linux-amd64-gpu-v100-latest-1
15 windows-amd64-cpu16
10 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@fbusato fbusato self-assigned this Jan 2, 2025
Labels
None yet
Projects
Status: In Review
Development

Successfully merging this pull request may close these issues.

[FEA]: ceil_div should return the resulting type of its operation
2 participants