The issue was reproduced on Triton Inference Server version r24.11.
Description
We have an ensemble ASR model that crashes with a segfault after 3-10 minutes under load from 15 generator threads on a single L40S GPU.
Triton Information
We've tested vanilla r24.08 and r24.10, as well as a debug build of r24.10. We run Triton in a Docker container.
To Reproduce
The setup uses a private ASR model, so I'm not sure we can share it. The model config is attached. The models inside the ensemble mostly use the onnxruntime backend, except for preprocessing, which uses the Python backend.
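For readers without access to the attachment, the ensemble is structured roughly like this. This is an illustrative sketch, not the real config (see the attached config.txt); all names, shapes, and the two-step pipeline below are assumptions:

```
name: "asr_ensemble"
platform: "ensemble"
max_batch_size: 8
input [ { name: "AUDIO", data_type: TYPE_FP32, dims: [ -1 ] } ]
output [ { name: "TEXT", data_type: TYPE_STRING, dims: [ 1 ] } ]
ensemble_scheduling {
  step [
    {
      model_name: "preprocess"    # python backend
      model_version: -1
      input_map { key: "AUDIO_IN" value: "AUDIO" }
      output_map { key: "FEATURES_OUT" value: "features" }
    },
    {
      model_name: "acoustic"      # onnxruntime backend
      model_version: -1
      input_map { key: "FEATURES_IN" value: "features" }
      output_map { key: "TEXT_OUT" value: "TEXT" }
    }
  ]
}
```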
Expected behavior
The server should not crash. We've collected several core dumps, and I've attached backtraces from them.
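For reference, a minimal sketch of how core dumps can be enabled and inspected inside the container. The tritonserver binary path is the usual location in the NVIDIA containers; the core file name is illustrative and depends on the host's core_pattern:

```shell
# Enable core dumps for the current shell inside the container
ulimit -c unlimited

# After a crash, dump backtraces from every thread with gdb
# (core.12345 is a placeholder for the actual core file).
gdb /opt/tritonserver/bin/tritonserver core.12345 \
    -batch -ex "thread apply all bt" > stacktrace.txt
```

Note that the container may also need to be started with `--ulimit core=-1` for the soft limit to be raisable.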
Also, I saw a very promising commit a few days ago: f5e4f69#diff-78246a41bf5fef27235af811675a07f2262046074eae8da3985e98ae68602065. Should we try building that version of the code and setting some of the TRITONSERVER_DELAY_* variables?
config.txt
stacktrace_01.txt
stacktrace_02.txt
stacktrace_03.txt