Title: Engine Deserialization Error During Deployment on Triton Inference Server
Description:
I am encountering an issue while deploying a YOLOv11 model on Triton Inference Server. The model was successfully converted to a TensorRT engine and performed inference correctly using the YOLO command-line interface. However, when deploying the model on Triton Inference Server, I receive the following error:
ERROR: 1: [stdArchiveReader.cpp::StdArchiveReader::32] Error Code 1: Serialization (Serialization assertion magicTagRead == kMAGIC_TAG failed.Magic tag does not match)
ERROR: 4: [runtime.cpp::deserializeCudaEngine::66] Error Code 4: Internal Error (Engine deserialization failed.)
Environment Details:
Operating System: Ubuntu 20.04
GPUs: NVIDIA V100 and NVIDIA RTX 3080 (tested on both)
CUDA Version: 11.7
TensorRT Versions Tested: 8.4.3.1, 8.2.0.5
Triton Server Versions Tested: 22.06, 24.11
PyTorch Versions Tested: 1.10.1, 2.0.0
NVIDIA Driver Version Tested: 515.105.01
Steps to Reproduce:
Convert the YOLOv11 model to a TensorRT engine using the following command:
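For reference, a typical Ultralytics CLI export for this step looks like the following; the model file name and options are illustrative rather than the exact values used here:

```bash
# Export the YOLOv11 checkpoint to a serialized TensorRT engine.
# The engine is built against the TensorRT version installed in this environment.
yolo export model=yolo11n.pt format=engine device=0 half=True
```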
The conversion completes without any issues!
Test the generated engine locally with the following command:
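A local sanity check with the Ultralytics CLI would look something like this (paths are illustrative):

```bash
# Run inference with the exported engine to confirm it works outside Triton.
yolo predict model=yolo11n.engine source=test_image.jpg device=0
```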
The engine performs inference successfully without any errors!
Deploy the engine on Triton Inference Server. During deployment, the above error is encountered.
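For context, this step typically means placing the engine in a Triton model repository and starting the server; the sketch below uses an assumed model name, repository path, and container tag, not values taken from this report:

```
model_repository/
└── yolo11/
    ├── config.pbtxt
    └── 1/
        └── model.plan        # the exported .engine file, renamed to model.plan
```

with a minimal config.pbtxt such as the following (the TensorRT backend can usually auto-complete input/output details from the plan itself):

```
name: "yolo11"
platform: "tensorrt_plan"
```

and the server started with, for example:

```bash
docker run --gpus all --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v "$(pwd)/model_repository:/models" \
  nvcr.io/nvidia/tritonserver:24.11-py3 \
  tritonserver --model-repository=/models
```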
I would appreciate guidance on any specific compatibility requirements or configuration steps needed to resolve this issue.
Thank you for your assistance!
Response:
Triton Inference Server ships with its own bundled version of TensorRT; check the framework support matrix to see which TensorRT version your Triton release uses. The model should be exported with that same TensorRT version, since an engine serialized with one TensorRT version generally cannot be deserialized by another, which is what the "magic tag does not match" error indicates.
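One way to compare the two TensorRT versions (a sketch, assuming the export ran against the Python tensorrt package and that a standard NGC Triton container is used to serve the engine):

```bash
# TensorRT version in the environment where the engine was exported
python -c "import tensorrt; print(tensorrt.__version__)"

# TensorRT version bundled with the Triton container that will serve the engine
# (TensorRT is typically installed via Debian packages in the NGC containers)
docker run --rm nvcr.io/nvidia/tritonserver:24.11-py3 bash -c 'dpkg -l | grep -i nvinfer'
```

If the two versions differ, rebuild the engine with the TensorRT release that matches the Triton container; the per-release component versions are listed in the NVIDIA framework support matrix.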