After exporting a quantized ppyoloe_plus_s to ONNX with paddle2onnx, onnxruntime inference results do not match Paddle static-graph inference #1441

Open
EvW1998 opened this issue Nov 20, 2024 · 12 comments

EvW1998 commented Nov 20, 2024

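Problem description

Following the model compression documentation provided by PaddleDetection, I applied quantization-aware training to my trained ppyoloe plus s model. Both the quantized model and the Paddle static-graph model exported from it run inference correctly.

However, after exporting the quantized model to ONNX with paddle2onnx, inference with onnxruntime returns no detection boxes at all (inference with the Paddle static graph returns several).

Environment

The environment is as follows: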
paddle2onnx == 1.2.11
paddledet == develop (>= 2.8.0)
paddlepaddle-gpu == 2.6.1.post116
paddleslim == 2.6.0
onnxruntime == 1.19.2
onnxruntime-gpu == 1.19.2

Paddle static model and ONNX files

ppyoloe_s_qat_abnormal_object.zip

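Error details

Original image:
[image: yiwu]
Paddle static-graph inference result:
[image: visualized_result]

I exported the model to ONNX with the following command: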

paddle2onnx --model_dir /mnt/data01/wzj/code/PaddleDetection-develop/export/ppyoloe_s_data1118_quant_lr0000125/ppyoloe_s_qat_abnormal_object \
            --model_filename model.pdmodel \
            --params_filename model.pdiparams \
            --save_file /mnt/data01/wzj/code/PaddleDetection-develop/export/ppyoloe_s_data1118_quant_lr0000125/ppyoloe_s_qat_abnormal_object/ppyoloe_s_qat_241118.onnx

Output:

[Paddle2ONNX] Start to parse PaddlePaddle model...
[Paddle2ONNX] Model file path: /mnt/data01/wzj/code/PaddleDetection-develop/export/ppyoloe_s_data1118_quant_lr0000125/ppyoloe_s_qat_abnormal_object/model.pdmodel
[Paddle2ONNX] Parameters file path: /mnt/data01/wzj/code/PaddleDetection-develop/export/ppyoloe_s_data1118_quant_lr0000125/ppyoloe_s_qat_abnormal_object/model.pdiparams
[Paddle2ONNX] Start to parsing Paddle model...
[Paddle2ONNX] [Info] The Paddle model is a quantized model. 
[Paddle2ONNX] [reduce_mean: mean_0.tmp_0] Requires the minimal opset version of 11.
[Paddle2ONNX] [reduce_mean: mean_1.tmp_0] Requires the minimal opset version of 11.
[Paddle2ONNX] [reduce_mean: mean_2.tmp_0] Requires the minimal opset version of 11.
[Paddle2ONNX] [reduce_mean: mean_3.tmp_0] Requires the minimal opset version of 11.
[Paddle2ONNX] [multiclass_nms3: multiclass_nms3_0.tmp_1] Requires the minimal opset version of 10.
[Paddle2ONNX] Due to the operator: dequantize_linear, requires opset_version >= 13.
[Paddle2ONNX] Opset version will change to 13 from 9
[Paddle2ONNX] Use opset_version = 13 for ONNX export.
[WARN][Paddle2ONNX] [multiclass_nms3: multiclass_nms3_0.tmp_1] [WARNING] Due to the operator multiclass_nms3, the exported ONNX model will only supports inference with input batch_size == 1.
[Paddle2ONNX] Deploy backend is: onnxruntime
[Paddle2ONNX] PaddlePaddle model is exported as ONNX format now.

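However, running inference on the exported ONNX file with onnxruntime yields no detection boxes at all. Is precision being lost somewhere when the quantized model is exported? The non-quantized model exports through the same workflow without any problem.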

@Zheng-Bicheng
Collaborator

Have you aligned the inputs?

Zheng-Bicheng added the Bug label Nov 22, 2024
Author

EvW1998 commented Nov 22, 2024

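Yes, the inputs are aligned. The images are resized to 640×640 in both cases, and the preprocessing is identical.

The ppdet config has NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}

The ONNX input is processed in the same way: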

frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB).astype(np.float32)
input_onnx = np.transpose(np.array([(frame-0)/255]), (0,3,1,2)) # n,c,h,w

@Zheng-Bicheng
Collaborator

What I mean is that you should run the same data through both PaddleInference and ONNXRuntime and compare the results, leaving post-processing aside.

Author

EvW1998 commented Nov 22, 2024

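Thanks for the reply, I understand what you mean now. However, the ppyoloe plus model has NMS post-processing built in, and its outputs are the final detection boxes and the number of boxes.
[image]
By leaving post-processing aside, do you mean comparing the output of the node just before NMS?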

@Zheng-Bicheng
Collaborator

Yes. ONNXRuntime's NMS and the Paddle static graph's NMS are not necessarily equivalent in practice; there are differences.
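
One way to take the NMS implementation out of the comparison is to run the same post-processing on both frameworks' pre-NMS outputs. Below is a minimal NumPy sketch of greedy per-class NMS, assuming boxes shaped (N, 4) as x1, y1, x2, y2 and scores shaped (C, N). The function names and threshold values are illustrative assumptions; this is not PaddleDetection's or ONNX Runtime's implementation.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)


def simple_multiclass_nms(boxes, scores, score_thr=0.3, iou_thr=0.7):
    """Greedy per-class NMS. boxes: (N, 4); scores: (C, N).

    Returns a list of (class_id, score, x1, y1, x2, y2) tuples.
    Thresholds are illustrative defaults, not values from the thread.
    """
    kept = []
    for cls_id, cls_scores in enumerate(scores):
        idx = np.where(cls_scores > score_thr)[0]
        # Visit candidates from highest to lowest score.
        idx = idx[np.argsort(-cls_scores[idx])]
        while idx.size > 0:
            best = idx[0]
            kept.append((cls_id, float(cls_scores[best]), *boxes[best].tolist()))
            if idx.size == 1:
                break
            ious = iou_one_to_many(boxes[best], boxes[idx[1:]])
            # Drop every remaining candidate that overlaps the kept box too much.
            idx = idx[1:][ious <= iou_thr]
    return kept
```

With the batch dimension stripped, the Paddle and ONNX pre-NMS outputs can both be passed through this one function, so any remaining difference in the detections comes from the model outputs rather than from the NMS operator.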

Author

EvW1998 commented Nov 22, 2024

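OK, I have located the NMS node and will try comparing these two nodes.
[image]
Is there a tool for comparing precision layer by layer between Paddle and ONNX? Or a tool for pruning a Paddle static graph?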

@Zheng-Bicheng
Collaborator

No, there isn't, which is why I asked you to compare the two nodes just before NMS.

@Zheng-Bicheng
Collaborator

Generally speaking, the accuracy gap is within 1% after quantization and within 1e-5 before quantization.
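
One way to apply these rules of thumb when no layer-by-layer tool exists is to expose a few intermediate tensors as extra outputs on both sides and look for the first one that drifts. The sketch below covers only the comparison step. It assumes you already have two dicts mapping matching tensor names to NumPy arrays, listed in execution order, and it assumes relative L2 error as the metric, since the thread does not say which metric the 1% and 1e-5 figures refer to. The function name and tensor names are hypothetical.

```python
import numpy as np


def first_divergent_tensor(ref, test, rel_tol=0.01):
    """Return (name, error) for the first tensor whose relative error exceeds
    rel_tol, or None if every shared tensor is within tolerance.

    ref and test map tensor names to arrays; ref's insertion order is taken
    to be execution order. Error is ||ref - test|| / ||ref||.
    """
    for name, ref_value in ref.items():
        if name not in test:
            continue  # tensor not exposed on the other side
        ref_arr = np.asarray(ref_value, dtype=np.float64)
        test_arr = np.asarray(test[name], dtype=np.float64)
        denom = np.linalg.norm(ref_arr) + 1e-12
        error = float(np.linalg.norm(ref_arr - test_arr) / denom)
        if error > rel_tol:
            return name, error
    return None
```

Under these assumptions, `rel_tol=0.01` would suit a quantized model and `rel_tol=1e-5` a float model; the first name returned is a reasonable place to start inspecting the converted graph.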

Zheng-Bicheng self-assigned this Nov 22, 2024
Author

EvW1998 commented Nov 27, 2024

Hello. I compared the two nodes just before NMS experimentally, and the gap after quantization is quite large. Comparison results:

Input shape: (1, 3, 640, 640) mean: 0.45289892 max: 1.0 min:  0.08627451
Paddle output0 shape: (1, 8400, 4) mean: 320.46814 max: 831.07166 min:  -280.5739
Paddle output1 shape: (1, 3, 8400) mean: 0.0025582889 max: 0.95241463 min:  1.0192251e-05
Onnx output0 shape: (1, 8400, 4) mean: 320.3321 max: 876.1374 min:  -249.33652
Onnx output1 shape: (1, 3, 8400) mean: 0.001652396 max: 0.007899523 min:  0.00082200766
Output0 mean cosine distance:  0.011558139 mean euclidean distance 75.20233
Output1 mean cosine distance:  0.6552932 mean euclidean distance 1.5684191

Before quantization there is essentially no gap:

Input shape: (1, 3, 640, 640) mean: 0.45289892 max: 1.0 min:  0.08627451
Paddle output0 shape: (1, 8400, 4) mean: 320.18158 max: 815.7729 min:  -231.14545
Paddle output1 shape: (1, 3, 8400) mean: 0.0025046652 max: 0.94339174 min:  4.4206395e-06
Onnx output0 shape: (1, 8400, 4) mean: 320.18158 max: 815.7726 min:  -231.14482
Onnx output1 shape: (1, 3, 8400) mean: 0.002504666 max: 0.9433913 min:  4.440546e-06
Output0 mean cosine distance:  -7.947286e-09 mean euclidean distance 0.00020155673
Output1 mean cosine distance:  -7.947286e-08 mean euclidean distance 5.4948055e-06

This still confirms that precision is being lost in the paddle2onnx conversion.
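
The quantized statistics above also suggest why no boxes survive in ONNX: the highest class score in the ONNX output1 is about 0.0079, while Paddle's is about 0.95. The sketch below counts scores above a threshold. The threshold of 0.01 is an assumption for illustration (check the `score_threshold` in your own NMS config), and the arrays are tiny stand-ins built from the reported min and max plus a rounded mean, not real model outputs.

```python
import numpy as np


def count_candidates(scores, score_threshold=0.01):
    """Count class scores above the threshold, i.e. boxes that can enter NMS."""
    return int(np.count_nonzero(np.asarray(scores) > score_threshold))


# Stand-in arrays: reported min, rounded mean, reported max (quantized run).
paddle_like = np.array([1.0192251e-05, 0.0025, 0.95241463])
onnx_like = np.array([0.00082200766, 0.0016, 0.007899523])

print(count_candidates(paddle_like))  # 1
print(count_candidates(onnx_like))    # 0
```

If the threshold in use is at or above the ONNX maximum, every candidate is filtered out before NMS, which would match the empty result reported at the top of the thread.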

I removed the NMS node by adding the exclude_nms option during PaddleDetection's dynamic-to-static export. Link to the static-graph and ONNX files: https://pan.baidu.com/s/1wu8_b3gGsMCmA4rlZ73XVw extraction code: emb4

Comparison code:

import paddle
import numpy as np
import cv2
import onnxruntime as rt

def cosine_distance(a, b):
    dot_product = np.sum(a * b, axis=-1)

    norm_a = np.linalg.norm(a, axis=-1)
    norm_b = np.linalg.norm(b, axis=-1)
    
    cosine_similarity = dot_product / (norm_a * norm_b)
    
    cosine_distance = 1 - cosine_similarity
    
    return np.mean(cosine_distance)


def euclidean_distance(a, b):
    diff = a - b
    
    distances = np.linalg.norm(diff, axis=-1)
    
    mean_distance = np.mean(distances)
    
    return mean_distance


#-----------------Paddle inference---------------------

paddle.enable_static()

# Load the model (the commented-out call loads the non-quantized model instead)
# [inference_program, feed_target_names, fetch_targets] = paddle.static.load_inference_model(
#     path_prefix='./compare/ppyoloes_exclude_nms/ppyoloe_plus_crn_s_80e_abnormal_object',
#     model_filename='model.pdmodel',
#     params_filename='model.pdiparams',
#     executor=paddle.static.Executor()
# )

[inference_program, feed_target_names, fetch_targets] = paddle.static.load_inference_model(
    path_prefix='./compare/ppyoloes_qat_exclude_nms/ppyoloe_s_qat_abnormal_object',
    model_filename='model.pdmodel',
    params_filename='model.pdiparams',
    executor=paddle.static.Executor()
)

# Prepare the input data
frame = cv2.imread("./yiwu.png")
original_im = frame.copy()

frame = cv2.resize(frame, (640, 640))
frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB).astype(np.float32)
input_image = np.transpose(np.array([(frame-0)/255]), (0,3,1,2)) # n,c,h,w

print("Input shape:", input_image.shape, "mean:", input_image.mean(), "max:", input_image.max(), "min: ", input_image.min())

input_scale = np.array([[1, 1]], dtype="float32")

# Create the executor
place = paddle.CUDAPlace(0)
exe = paddle.static.Executor(place)

# Run inference
results = exe.run(program=inference_program,
                  feed={feed_target_names[0]: input_image, feed_target_names[1]: input_scale},
                  fetch_list=fetch_targets)

# cv2.imwrite("./compare/yiwu.png", vis_detection(original_im, results[0]))
 
print("Paddle output0 shape:", results[0].shape, "mean:", results[0].mean(), "max:", results[0].max(), "min: ", results[0].min())
print("Paddle output1 shape:", results[1].shape, "mean:", results[1].mean(), "max:", results[1].max(), "min: ", results[1].min())

#-----------------ONNX inference---------------------

# sess = rt.InferenceSession("./compare/ppyoloes_exclude_nms.onnx", providers=['CPUExecutionProvider'])
sess = rt.InferenceSession("./compare/ppyoloes_qat_exclude_nms.onnx", providers=['CPUExecutionProvider'])
output_names = [sess.get_outputs()[0].name, sess.get_outputs()[1].name]
onnx_pred = sess.run(output_names, {sess.get_inputs()[0].name: input_image, sess.get_inputs()[1].name: input_scale})

print("Onnx output0 shape:", onnx_pred[0].shape, "mean:", onnx_pred[0].mean(), "max:", onnx_pred[0].max(), "min: ", onnx_pred[0].min())
print("Onnx output1 shape:", onnx_pred[1].shape, "mean:", onnx_pred[1].mean(), "max:", onnx_pred[1].max(), "min: ", onnx_pred[1].min())

print("Output0 mean cosine distance: ", cosine_distance(results[0], onnx_pred[0]), "mean euclidean distance", euclidean_distance(results[0], onnx_pred[0]))
print("Output1 mean cosine distance: ", cosine_distance(results[1], onnx_pred[1]), "mean euclidean distance", euclidean_distance(results[1], onnx_pred[1]))

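@Zheng-Bicheng
Collaborator

I have reproduced the issue; investigating it will take some time. It is actually quite normal for ONNX and Paddle quantized models not to match exactly, since the two inference frameworks differ, but this is the first time I have seen a gap this large.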

Author

EvW1998 commented Dec 9, 2024

Understood, thank you very much.


WindLWQ commented Dec 27, 2024

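Quoting EvW1998: Thanks for the reply, I understand what you mean now. However, the ppyoloe plus model has NMS post-processing built in, and its outputs are the final detection boxes and the number of boxes. [image] By leaving post-processing aside, do you mean comparing the output of the node just before NMS?

Hello, has this problem been resolved for you? I have run into the same issue. Do you have any solution?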
