src: cpu: aarch64: Re-enable matmul static quantisation through ACL. #2308
Conversation
Force-pushed from 88cf5e0 to 5cec04d
Hi @mgouicem, it's the same as before but more restrictive with Neoverse-N1 until the failing tests are fixed.
Hi @renato-arantes, I see that you added a merge commit; please rebase your branch instead, as we try to keep a linear history.
Force-pushed from ce7f10b to 66d9ea4
@renato-arantes, please resolve conflicts.
Force-pushed from 66d9ea4 to b322d6d
Force-pushed from b322d6d to 38d8255
        arm_compute::QuantizationInfo(*src_scale, -src_zero_point, true));
acl_obj.wei_tensor.info()->set_quantization_info(
        arm_compute::QuantizationInfo(*wei_scale, -wei_zero_point, true));
// for efficiency reasons, OneDNN saves the inverse of the destination
It's oneDNN, not OneDNN. Fixup is here.
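For context, the diff above forwards the runtime quantization parameters from oneDNN into ACL tensor metadata. Below is a minimal, hedged sketch of that pattern, mirroring only what the diff shows: the scale is passed through unchanged, the zero point is negated, and the trailing true flag is used as in the PR (assumed to mark the quantization info as runtime-updatable). The helper name set_tensor_quantization is hypothetical; acl_obj and its tensors come from the diff, not from this sketch.

```cpp
#include "arm_compute/core/QuantizationInfo.h"
#include "arm_compute/runtime/Tensor.h"

// Hypothetical helper illustrating the pattern in the diff: rebuild a
// tensor's QuantizationInfo from a oneDNN per-tensor scale and zero point.
static void set_tensor_quantization(
        arm_compute::Tensor &tensor, float scale, int zero_point) {
    // Scale is forwarded as-is; the zero point is sign-flipped, matching the
    // convention used in the PR. The third argument follows the diff
    // (assumed here to indicate dynamically updatable quantization info).
    tensor.info()->set_quantization_info(
            arm_compute::QuantizationInfo(scale, -zero_point, true));
}
```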
Description
This is to re-enable matmul static quantization operations through ACL. Currently, the supported data type combinations are s8:s8:s8 and u8:s8:u8. Neoverse-N1 is skipped until the failing tests are fixed.

Previous PR: #2198
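As a hedged illustration of what the supported combinations mean for users, the sketch below requests an s8:s8:s8 matmul with static (runtime-supplied) scales and zero-points through the oneDNN C++ API. The shapes, the per-tensor masks, and which arguments carry zero-points are illustrative assumptions, and whether the ACL-backed implementation is actually dispatched depends on the build and the CPU.

```cpp
#include "oneapi/dnnl/dnnl.hpp"

int main() {
    using namespace dnnl;
    engine eng(engine::kind::cpu, 0);

    // Illustrative shapes only.
    const memory::dim M = 64, K = 128, N = 32;
    memory::desc src_md({M, K}, memory::data_type::s8, memory::format_tag::ab);
    memory::desc wei_md({K, N}, memory::data_type::s8, memory::format_tag::ab);
    memory::desc dst_md({M, N}, memory::data_type::s8, memory::format_tag::ab);

    // Static quantization: per-tensor scales and zero-points are declared
    // here and supplied as memories at execution time via
    // DNNL_ARG_ATTR_SCALES / DNNL_ARG_ATTR_ZERO_POINTS.
    primitive_attr attr;
    attr.set_scales_mask(DNNL_ARG_SRC, 0);
    attr.set_scales_mask(DNNL_ARG_WEIGHTS, 0);
    attr.set_scales_mask(DNNL_ARG_DST, 0);
    attr.set_zero_points_mask(DNNL_ARG_SRC, 0);
    attr.set_zero_points_mask(DNNL_ARG_DST, 0);

    matmul::primitive_desc pd(eng, src_md, wei_md, dst_md, attr);
    matmul prim(pd);
    (void)prim; // Execution with runtime scale/zero-point memories omitted.
    return 0;
}
```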
General
Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?