70B-Prefill-2048-input-token unsharded: ABORTED; the semaphore was aborted; while invoking native function hal.fence.await; while calling import;
#19569
When running 70B prefill with 2048 input tokens on the unsharded model, the run fails with the following error:
iree/runtime/src/iree/hal/drivers/hip/event_semaphore.c:673: ABORTED; the semaphore was aborted; while invoking native function hal.fence.await; while calling import;
[ 0] bytecode module.prefill_bs4:90 prefill_70b_unsharded.mlir:727:3
Abort (core dumped)
The failure occurs only with iree-benchmark-module; iree-run-module works fine.
Looking at rocm-smi, another process is running on hip://4 and taking up 23% of the available RAM. With 70b running under iree-run-module (or iree-benchmark-module with --benchmark_repetitions=1) we hit 98% memory usage. A larger number of repetitions increases memory usage enough to hit the limit.
Using a GPU that is not currently in-use lets it complete.
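To make that check repeatable before picking a device, here is a minimal sketch that parses rocm-smi memory output and lists the cards with enough free memory. It assumes the JSON shape and key names emitted by `rocm-smi --showmeminfo vram --json` ("VRAM Total Memory (B)" and "VRAM Total Used Memory (B)"), which may differ between ROCm versions, so verify them locally. The helper names and the required fraction are hypothetical, and the mapping between rocm-smi card names and hip:// device indices is not verified here.

```python
import json

# Assumed key names from `rocm-smi --showmeminfo vram --json`; verify on your ROCm version.
TOTAL_KEY = "VRAM Total Memory (B)"
USED_KEY = "VRAM Total Used Memory (B)"


def parse_vram_usage(smi_json):
    """Map each card name to the fraction of its VRAM currently in use."""
    usage = {}
    for card, fields in json.loads(smi_json).items():
        # Skip entries that are not per-card memory records.
        if not isinstance(fields, dict) or TOTAL_KEY not in fields or USED_KEY not in fields:
            continue
        total = int(fields[TOTAL_KEY])
        used = int(fields[USED_KEY])
        usage[card] = used / total
    return usage


def devices_with_headroom(usage, required_fraction):
    """Return the cards whose current usage leaves room for `required_fraction` more."""
    return sorted(
        card for card, in_use in usage.items()
        if in_use + required_fraction <= 1.0
    )


if __name__ == "__main__":
    # Hypothetical sample: one idle card and one card already 23% full.
    sample = json.dumps({
        "card0": {TOTAL_KEY: "100", USED_KEY: "0"},
        "card4": {TOTAL_KEY: "100", USED_KEY: "23"},
    })
    usage = parse_vram_usage(sample)
    # 0.80 is a placeholder for the peak fraction the benchmark needs.
    print(devices_with_headroom(usage, required_fraction=0.80))
```

On a real machine, the sample string would be replaced with the captured output of the rocm-smi command. The required fraction has to be measured for the workload, since, as noted above, it grows with the number of benchmark repetitions.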
What happened?
When running 70B prefill with 2048 input tokens on the unsharded model, the run fails with the following error:
The failure occurs only with iree-benchmark-module; iree-run-module works fine.
Command:
Run the above command on the Shark MI300X machine.
Build: a43d893
Steps to reproduce your issue
No response
What component(s) does this issue relate to?
Runtime
Version information
No response
Additional context
No response