Description

Same deployment, but on different GPUs (with and without MIG). DCGM is unable to start when MIG is enabled:

```
CacheManager Init Failed. Error: -17
W1225 10:48:27.718944 4706 metrics.cc:811] "DCGM unable to start: DCGM initialization error"
```

Similar to #3506, but not caused by insufficient memory.

Triton Information

nvcr.io/nvidia/tritonserver:24.11-py3
To Reproduce

GPUs w/ MIG:

```shell
sudo docker run -it --rm --network=host --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -e NVIDIA_VISIBLE_DEVICES=0:0 nvcr.io/nvidia/tritonserver tritonserver --model-repository {my_model_path}
```
outputs:
```
I1225 10:48:25.952289 4706 pinned_memory_manager.cc:277] "Pinned memory pool is created at '0x7f24b8000000' with size 268435456"
I1225 10:48:25.954209 4706 cuda_memory_manager.cc:107] "CUDA memory pool is created on device 0 with size 67108864"
I1225 10:48:25.958281 4706 model_lifecycle.cc:473] "loading: onnx:1"
I1225 10:48:25.960593 4706 onnxruntime.cc:2875] "TRITONBACKEND_Initialize: onnxruntime"
I1225 10:48:25.960634 4706 onnxruntime.cc:2885] "Triton TRITONBACKEND API version: 1.19"
I1225 10:48:25.960657 4706 onnxruntime.cc:2891] "'onnxruntime' TRITONBACKEND API version: 1.19"
I1225 10:48:25.960665 4706 onnxruntime.cc:2921] "backend configuration:\n{\"cmdline\":{\"auto-complete-config\":\"true\",\"backend-directory\":\"/opt/tritonserver/backends\",\"min-compute-capability\":\"6.000000\",\"default-max-batch-size\":\"4\"}}"
I1225 10:48:25.977518 4706 onnxruntime.cc:2986] "TRITONBACKEND_ModelInitialize: onnx (version 1)"
I1225 10:48:25.978169 4706 onnxruntime.cc:984] "skipping model configuration auto-complete for 'onnx': inputs and outputs already specified"
I1225 10:48:25.978790 4706 onnxruntime.cc:3051] "TRITONBACKEND_ModelInstanceInitialize: onnx_0_0 (GPU device 0)"
I1225 10:48:27.703699 4706 model_lifecycle.cc:849] "successfully loaded 'onnx'"
I1225 10:48:27.703793 4706 server.cc:604]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I1225 10:48:27.703839 4706 server.cc:631]
+-------------+-----------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend     | Path                                                            | Config                                                                                                                                                       |
+-------------+-----------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}} |
+-------------+-----------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
I1225 10:48:27.703886 4706 server.cc:674]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
| onnx  | 1       | READY  |
+-------+---------+--------+
CacheManager Init Failed. Error: -17
W1225 10:48:27.718944 4706 metrics.cc:811] "DCGM unable to start: DCGM initialization error"
I1225 10:48:27.719361 4706 metrics.cc:783] "Collecting CPU metrics"
I1225 10:48:27.719448 4706 tritonserver.cc:2598]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                                       |
| server_version                   | 2.52.0                                                                                                                                                                                                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters statistics trace logging |
| model_repository_path[0]         | {my_model_path}                                                                                                                                                                                              |
| model_control_mode               | MODE_NONE                                                                                                                                                                                                    |
| strict_model_config              | 0                                                                                                                                                                                                            |
| model_config_name                |                                                                                                                                                                                                              |
| rate_limit                       | OFF                                                                                                                                                                                                          |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                                    |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                                                     |
| min_supported_compute_capability | 6.0                                                                                                                                                                                                          |
| strict_readiness                 | 1                                                                                                                                                                                                            |
| exit_timeout                     | 30                                                                                                                                                                                                           |
| cache_enabled                    | 0                                                                                                                                                                                                            |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I1225 10:48:27.723652 4706 grpc_server.cc:2558] "Started GRPCInferenceService at 0.0.0.0:8001"
I1225 10:48:27.723879 4706 http_server.cc:4729] "Started HTTPService at 0.0.0.0:8000"
I1225 10:48:27.764810 4706 http_server.cc:362] "Started Metrics Service at 0.0.0.0:8002"
```
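For context, `NVIDIA_VISIBLE_DEVICES=0:0` uses the `<gpu>:<mig-instance>` index form, i.e. "GPU 0, MIG instance 0". A quick sanity check (a sketch only; it assumes the same image and a MIG-enabled host, so it is not runnable without that hardware) is to list the devices the container actually sees:

```shell
# Assumption: same host and image as the reproduction above.
# On a MIG-enabled GPU, `nvidia-smi -L` lists each MIG instance
# (with its MIG-... UUID) under the parent GPU, confirming which
# instance the "<gpu>:<mig>" index selected.
sudo docker run --rm -e NVIDIA_VISIBLE_DEVICES=0:0 \
  nvcr.io/nvidia/tritonserver:24.11-py3 nvidia-smi -L
```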
GPUs w/o MIG:

```shell
sudo docker run -it --rm --network=host --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -e NVIDIA_VISIBLE_DEVICES=0 nvcr.io/nvidia/tritonserver tritonserver --model-repository {my_model_path}
```

outputs:
```
I1225 10:41:12.658976 138 pinned_memory_manager.cc:277] "Pinned memory pool is created at '0x7f0058000000' with size 268435456"
I1225 10:41:12.661708 138 cuda_memory_manager.cc:107] "CUDA memory pool is created on device 0 with size 67108864"
I1225 10:41:12.667006 138 model_lifecycle.cc:473] "loading: onnx:1"
I1225 10:41:12.671093 138 onnxruntime.cc:2875] "TRITONBACKEND_Initialize: onnxruntime"
I1225 10:41:12.671117 138 onnxruntime.cc:2885] "Triton TRITONBACKEND API version: 1.19"
I1225 10:41:12.671123 138 onnxruntime.cc:2891] "'onnxruntime' TRITONBACKEND API version: 1.19"
I1225 10:41:12.671127 138 onnxruntime.cc:2921] "backend configuration:\n{\"cmdline\":{\"auto-complete-config\":\"true\",\"backend-directory\":\"/opt/tritonserver/backends\",\"min-compute-capability\":\"6.000000\",\"default-max-batch-size\":\"4\"}}"
I1225 10:41:12.688318 138 onnxruntime.cc:2986] "TRITONBACKEND_ModelInitialize: onnx (version 1)"
I1225 10:41:12.688871 138 onnxruntime.cc:984] "skipping model configuration auto-complete for 'onnx': inputs and outputs already specified"
I1225 10:41:12.689461 138 onnxruntime.cc:3051] "TRITONBACKEND_ModelInstanceInitialize: onnx_0_0 (GPU device 0)"
I1225 10:41:14.331226 138 model_lifecycle.cc:849] "successfully loaded 'onnx'"
I1225 10:41:14.331320 138 server.cc:604]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I1225 10:41:14.331363 138 server.cc:631]
+-------------+-----------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend     | Path                                                            | Config                                                                                                                                                       |
+-------------+-----------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}} |
+-------------+-----------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
I1225 10:41:14.331410 138 server.cc:674]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
| onnx  | 1       | READY  |
+-------+---------+--------+
I1225 10:41:14.357465 138 metrics.cc:890] "Collecting metrics for GPU 0: NVIDIA A30"
I1225 10:41:14.365078 138 metrics.cc:783] "Collecting CPU metrics"
I1225 10:41:14.365165 138 tritonserver.cc:2598]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                                       |
| server_version                   | 2.52.0                                                                                                                                                                                                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters statistics trace logging |
| model_repository_path[0]         | {my_model_path}                                                                                                                                                                                              |
| model_control_mode               | MODE_NONE                                                                                                                                                                                                    |
| strict_model_config              | 0                                                                                                                                                                                                            |
| model_config_name                |                                                                                                                                                                                                              |
| rate_limit                       | OFF                                                                                                                                                                                                          |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                                    |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                                                     |
| min_supported_compute_capability | 6.0                                                                                                                                                                                                          |
| strict_readiness                 | 1                                                                                                                                                                                                            |
| exit_timeout                     | 30                                                                                                                                                                                                           |
| cache_enabled                    | 0                                                                                                                                                                                                            |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I1225 10:41:14.369295 138 grpc_server.cc:2558] "Started GRPCInferenceService at 0.0.0.0:8001"
I1225 10:41:14.369542 138 http_server.cc:4729] "Started HTTPService at 0.0.0.0:8000"
I1225 10:41:14.410425 138 http_server.cc:362] "Started Metrics Service at 0.0.0.0:8002"
```
Expected behavior

No DCGM error when MIG is enabled.
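Note that the server does come up despite the warning; only DCGM-backed GPU metrics are lost. A possible interim workaround (a sketch, not a fix — it assumes losing GPU metrics is acceptable) is to disable GPU metric collection so Triton never initializes DCGM:

```shell
# Assumption: same reproduction command as above; requires docker and a
# MIG-enabled GPU, so it is not runnable elsewhere. --allow-gpu-metrics=false
# turns off DCGM-backed GPU metrics; CPU metrics and the :8002 metrics
# endpoint remain available.
sudo docker run -it --rm --network=host --ipc=host --ulimit memlock=-1 \
  --ulimit stack=67108864 -e NVIDIA_VISIBLE_DEVICES=0:0 \
  nvcr.io/nvidia/tritonserver:24.11-py3 \
  tritonserver --model-repository {my_model_path} --allow-gpu-metrics=false
```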