Expands support for more offset types in segmented benchmark #3231

Open
wants to merge 1 commit into
base: main

elstehle
Collaborator

@elstehle elstehle commented Jan 1, 2025

Description

When benchmarking segmented algorithms such as segmented sort, we need to generate segment sizes that follow different distributions. Until now, the generators for these distributions supported only int32 and int64 offset types. This PR extends that support to unsigned offset types as well, which is needed for experiments related to #3132.
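
To illustrate the idea, here is a minimal host-side sketch of a segment-offset generator templated on the offset type. It is not the code changed in this PR: the name `make_segment_offsets`, its parameters, and the choice of a uniform distribution are all assumptions made for the example. It only shows that the same generator can serve signed and unsigned offset types once the offset type is a template parameter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <type_traits>
#include <vector>

// Hypothetical helper (not part of CUB or its benchmark helpers): builds a
// begin-offsets array of num_segments + 1 entries for segments whose sizes
// are drawn uniformly from [min_segment_size, max_segment_size].
// Preconditions (assumed by this sketch): min_segment_size >= 1, and
// total_elements is representable in OffsetT.
template <typename OffsetT>
std::vector<OffsetT> make_segment_offsets(std::size_t total_elements,
                                          OffsetT min_segment_size,
                                          OffsetT max_segment_size,
                                          std::uint64_t seed = 42)
{
  static_assert(std::is_integral<OffsetT>::value, "OffsetT must be an integer type");
  assert(min_segment_size >= 1 && min_segment_size <= max_segment_size);

  std::mt19937_64 rng(seed);
  // std::uniform_int_distribution accepts both signed and unsigned 32- and
  // 64-bit integer types, so no signed/unsigned special-casing is needed.
  std::uniform_int_distribution<OffsetT> dist(min_segment_size, max_segment_size);

  std::vector<OffsetT> offsets{OffsetT{0}};
  std::size_t consumed = 0;
  while (consumed < total_elements)
  {
    // Clamp the last segment so the offsets end exactly at total_elements.
    const std::size_t size =
      std::min(static_cast<std::size_t>(dist(rng)), total_elements - consumed);
    consumed += size;
    offsets.push_back(static_cast<OffsetT>(consumed));
  }
  return offsets;
}
```

With this shape, `make_segment_offsets<std::uint32_t>(1000, 1u, 64u)` and `make_segment_offsets<std::int64_t>(1000, 1, 64)` go through the same code path; only the element type of the returned offsets differs.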

@elstehle elstehle requested review from a team as code owners January 1, 2025 19:07
Contributor

github-actions bot commented Jan 1, 2025

🟩 CI finished in 52m 54s: Pass: 100%/96 | Total: 14h 01m | Avg: 8m 45s | Max: 31m 13s | Hits: 99%/12404
  • 🟩 cub: Pass: 100%/47 | Total: 7h 03m | Avg: 9m 01s | Max: 27m 55s | Hits: 99%/3144

    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total:  6h 51m | Avg:  9m 08s | Max: 27m 55s | Hits:  99%/3144  
      🟩 arm64              Pass: 100%/2   | Total: 12m 25s | Avg:  6m 12s | Max:  6m 36s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total: 42m 13s | Avg:  6m 01s | Max: 15m 57s | Hits:  99%/786   
      🟩 12.5               Pass: 100%/2   | Total: 19m 01s | Avg:  9m 30s | Max:  9m 40s
      🟩 12.6               Pass: 100%/38  | Total:  6h 02m | Avg:  9m 32s | Max: 27m 55s | Hits:  99%/2358  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  8m 44s | Avg:  4m 22s | Max:  4m 25s
      🟩 nvcc11.1           Pass: 100%/7   | Total: 42m 13s | Avg:  6m 01s | Max: 15m 57s | Hits:  99%/786   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 19m 01s | Avg:  9m 30s | Max:  9m 40s
      🟩 nvcc12.6           Pass: 100%/36  | Total:  5h 53m | Avg:  9m 49s | Max: 27m 55s | Hits:  99%/2358  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  8m 44s | Avg:  4m 22s | Max:  4m 25s
      🟩 nvcc               Pass: 100%/45  | Total:  6h 55m | Avg:  9m 13s | Max: 27m 55s | Hits:  99%/3144  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 21m 56s | Avg:  5m 29s | Max:  6m 45s
      🟩 Clang10            Pass: 100%/1   | Total:  7m 21s | Avg:  7m 21s | Max:  7m 21s
      🟩 Clang11            Pass: 100%/1   | Total:  6m 15s | Avg:  6m 15s | Max:  6m 15s
      🟩 Clang12            Pass: 100%/1   | Total:  6m 02s | Avg:  6m 02s | Max:  6m 02s
      🟩 Clang13            Pass: 100%/1   | Total:  6m 28s | Avg:  6m 28s | Max:  6m 28s
      🟩 Clang14            Pass: 100%/1   | Total:  5m 59s | Avg:  5m 59s | Max:  5m 59s
      🟩 Clang15            Pass: 100%/1   | Total:  6m 43s | Avg:  6m 43s | Max:  6m 43s
      🟩 Clang16            Pass: 100%/1   | Total:  6m 09s | Avg:  6m 09s | Max:  6m 09s
      🟩 Clang17            Pass: 100%/1   | Total:  6m 07s | Avg:  6m 07s | Max:  6m 07s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 08m | Avg:  9m 49s | Max: 23m 55s
      🟩 GCC6               Pass: 100%/2   | Total:  8m 28s | Avg:  4m 14s | Max:  4m 25s
      🟩 GCC7               Pass: 100%/2   | Total: 11m 38s | Avg:  5m 49s | Max:  5m 54s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 55s | Avg:  5m 55s | Max:  5m 55s
      🟩 GCC9               Pass: 100%/3   | Total: 15m 16s | Avg:  5m 05s | Max:  6m 15s
      🟩 GCC10              Pass: 100%/1   | Total:  6m 27s | Avg:  6m 27s | Max:  6m 27s
      🟩 GCC11              Pass: 100%/1   | Total:  6m 04s | Avg:  6m 04s | Max:  6m 04s
      🟩 GCC12              Pass: 100%/3   | Total: 27m 31s | Avg:  9m 10s | Max: 15m 59s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 57m | Avg: 14m 39s | Max: 27m 55s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  6m 40s | Avg:  6m 40s | Max:  6m 40s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 15m 57s | Avg: 15m 57s | Max: 15m 57s | Hits:  99%/786   
      🟩 MSVC14.29          Pass: 100%/1   | Total: 13m 11s | Avg: 13m 11s | Max: 13m 11s | Hits:  99%/786   
      🟩 MSVC14.39          Pass: 100%/2   | Total: 28m 45s | Avg: 14m 22s | Max: 14m 40s | Hits:  99%/1572  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 19m 01s | Avg:  9m 30s | Max:  9m 40s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  2h 21m | Avg:  7m 27s | Max: 23m 55s
      🟩 GCC                Pass: 100%/21  | Total:  3h 18m | Avg:  9m 27s | Max: 27m 55s
      🟩 Intel              Pass: 100%/1   | Total:  6m 40s | Avg:  6m 40s | Max:  6m 40s
      🟩 MSVC               Pass: 100%/4   | Total: 57m 53s | Avg: 14m 28s | Max: 15m 57s | Hits:  99%/3144  
      🟩 NVHPC              Pass: 100%/2   | Total: 19m 01s | Avg:  9m 30s | Max:  9m 40s
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 20m 31s | Avg: 10m 15s | Max: 15m 59s
      🟩 v100               Pass: 100%/45  | Total:  6h 43m | Avg:  8m 57s | Max: 27m 55s | Hits:  99%/3144  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  4h 34m | Avg:  6m 51s | Max: 15m 57s | Hits:  99%/3144  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 27m 55s | Avg: 27m 55s | Max: 27m 55s
      🟩 GraphCapture       Pass: 100%/1   | Total: 17m 15s | Avg: 17m 15s | Max: 17m 15s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 01m | Avg: 20m 20s | Max: 27m 28s
      🟩 TestGPU            Pass: 100%/2   | Total: 43m 21s | Avg: 21m 40s | Max: 23m 55s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 20m 31s | Avg: 10m 15s | Max: 15m 59s
      🟩 90a                Pass: 100%/1   | Total:  4m 56s | Avg:  4m 56s | Max:  4m 56s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total: 25m 00s | Avg:  5m 00s | Max:  6m 45s
      🟩 14                 Pass: 100%/4   | Total: 32m 40s | Avg:  8m 10s | Max: 15m 57s | Hits:  99%/786   
      🟩 17                 Pass: 100%/12  | Total:  1h 29m | Avg:  7m 29s | Max: 14m 05s | Hits:  99%/1572  
      🟩 20                 Pass: 100%/26  | Total:  4h 36m | Avg: 10m 37s | Max: 27m 55s | Hits:  99%/786   
    
  • 🟩 thrust: Pass: 100%/46 | Total: 6h 24m | Avg: 8m 21s | Max: 31m 13s | Hits: 99%/9260

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 37m 20s | Avg: 18m 40s | Max: 31m 13s
    🟩 cpu
      🟩 amd64              Pass: 100%/44  | Total:  6h 14m | Avg:  8m 30s | Max: 31m 13s | Hits:  99%/9260  
      🟩 arm64              Pass: 100%/2   | Total:  9m 42s | Avg:  4m 51s | Max:  5m 10s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total: 46m 01s | Avg:  6m 34s | Max: 19m 53s | Hits:  99%/1852  
      🟩 12.5               Pass: 100%/2   | Total: 29m 17s | Avg: 14m 38s | Max: 15m 29s
      🟩 12.6               Pass: 100%/37  | Total:  5h 09m | Avg:  8m 21s | Max: 31m 13s | Hits:  99%/7408  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 19s | Avg:  5m 09s | Max:  5m 16s
      🟩 nvcc11.1           Pass: 100%/7   | Total: 46m 01s | Avg:  6m 34s | Max: 19m 53s | Hits:  99%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 29m 17s | Avg: 14m 38s | Max: 15m 29s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  4h 58m | Avg:  8m 32s | Max: 31m 13s | Hits:  99%/7408  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 19s | Avg:  5m 09s | Max:  5m 16s
      🟩 nvcc               Pass: 100%/44  | Total:  6h 14m | Avg:  8m 30s | Max: 31m 13s | Hits:  99%/9260  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 21m 22s | Avg:  5m 20s | Max:  6m 45s
      🟩 Clang10            Pass: 100%/1   | Total:  6m 40s | Avg:  6m 40s | Max:  6m 40s
      🟩 Clang11            Pass: 100%/1   | Total:  5m 30s | Avg:  5m 30s | Max:  5m 30s
      🟩 Clang12            Pass: 100%/1   | Total:  5m 34s | Avg:  5m 34s | Max:  5m 34s
      🟩 Clang13            Pass: 100%/1   | Total:  5m 37s | Avg:  5m 37s | Max:  5m 37s
      🟩 Clang14            Pass: 100%/1   | Total:  5m 32s | Avg:  5m 32s | Max:  5m 32s
      🟩 Clang15            Pass: 100%/1   | Total:  5m 45s | Avg:  5m 45s | Max:  5m 45s
      🟩 Clang16            Pass: 100%/1   | Total:  5m 37s | Avg:  5m 37s | Max:  5m 37s
      🟩 Clang17            Pass: 100%/1   | Total:  5m 27s | Avg:  5m 27s | Max:  5m 27s
      🟩 Clang18            Pass: 100%/7   | Total: 45m 55s | Avg:  6m 33s | Max: 13m 06s
      🟩 GCC6               Pass: 100%/2   | Total:  8m 15s | Avg:  4m 07s | Max:  4m 08s
      🟩 GCC7               Pass: 100%/2   | Total: 10m 04s | Avg:  5m 02s | Max:  5m 11s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 32s | Avg:  5m 32s | Max:  5m 32s
      🟩 GCC9               Pass: 100%/3   | Total: 14m 56s | Avg:  4m 58s | Max:  6m 04s
      🟩 GCC10              Pass: 100%/1   | Total:  5m 54s | Avg:  5m 54s | Max:  5m 54s
      🟩 GCC11              Pass: 100%/1   | Total:  6m 08s | Avg:  6m 08s | Max:  6m 08s
      🟩 GCC12              Pass: 100%/1   | Total:  5m 40s | Avg:  5m 40s | Max:  5m 40s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 22m | Avg: 10m 20s | Max: 31m 13s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  7m 22s | Avg:  7m 22s | Max:  7m 22s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 19m 53s | Avg: 19m 53s | Max: 19m 53s | Hits:  99%/1852  
      🟩 MSVC14.29          Pass: 100%/1   | Total: 16m 45s | Avg: 16m 45s | Max: 16m 45s | Hits:  99%/1852  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 58m 54s | Avg: 19m 38s | Max: 24m 27s | Hits:  99%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 29m 17s | Avg: 14m 38s | Max: 15m 29s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  1h 52m | Avg:  5m 56s | Max: 13m 06s
      🟩 GCC                Pass: 100%/19  | Total:  2h 19m | Avg:  7m 19s | Max: 31m 13s
      🟩 Intel              Pass: 100%/1   | Total:  7m 22s | Avg:  7m 22s | Max:  7m 22s
      🟩 MSVC               Pass: 100%/5   | Total:  1h 35m | Avg: 19m 06s | Max: 24m 27s | Hits:  99%/9260  
      🟩 NVHPC              Pass: 100%/2   | Total: 29m 17s | Avg: 14m 38s | Max: 15m 29s
    🟩 gpu
      🟩 v100               Pass: 100%/46  | Total:  6h 24m | Avg:  8m 21s | Max: 31m 13s | Hits:  99%/9260  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  4h 45m | Avg:  7m 08s | Max: 19m 53s | Hits:  99%/7408  
      🟩 TestCPU            Pass: 100%/3   | Total: 39m 58s | Avg: 13m 19s | Max: 24m 27s | Hits:  99%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total: 58m 36s | Avg: 19m 32s | Max: 31m 13s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  4m 39s | Avg:  4m 39s | Max:  4m 39s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total: 22m 46s | Avg:  4m 33s | Max:  5m 36s
      🟩 14                 Pass: 100%/4   | Total: 35m 56s | Avg:  8m 59s | Max: 19m 53s | Hits:  99%/1852  
      🟩 17                 Pass: 100%/12  | Total:  1h 41m | Avg:  8m 29s | Max: 17m 29s | Hits:  99%/3704  
      🟩 20                 Pass: 100%/23  | Total:  3h 06m | Avg:  8m 06s | Max: 24m 27s | Hits:  99%/3704  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 8m 47s | Avg: 4m 23s | Max: 6m 40s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  8m 47s | Avg:  4m 23s | Max:  6m 40s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  8m 47s | Avg:  4m 23s | Max:  6m 40s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  8m 47s | Avg:  4m 23s | Max:  6m 40s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  8m 47s | Avg:  4m 23s | Max:  6m 40s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  8m 47s | Avg:  4m 23s | Max:  6m 40s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  8m 47s | Avg:  4m 23s | Max:  6m 40s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  8m 47s | Avg:  4m 23s | Max:  6m 40s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 07s | Avg:  2m 07s | Max:  2m 07s
      🟩 Test               Pass: 100%/1   | Total:  6m 40s | Avg:  6m 40s | Max:  6m 40s
    
  • 🟩 python: Pass: 100%/1 | Total: 24m 25s | Avg: 24m 25s | Max: 24m 25s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 24m 25s | Avg: 24m 25s | Max: 24m 25s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 24m 25s | Avg: 24m 25s | Max: 24m 25s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 24m 25s | Avg: 24m 25s | Max: 24m 25s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 24m 25s | Avg: 24m 25s | Max: 24m 25s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 24m 25s | Avg: 24m 25s | Max: 24m 25s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 24m 25s | Avg: 24m 25s | Max: 24m 25s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 24m 25s | Avg: 24m 25s | Max: 24m 25s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 24m 25s | Avg: 24m 25s | Max: 24m 25s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 96)

# Runner
71 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

Labels
None yet
Projects
Status: In Review