CANN Backend support #1606
Co-authored-by: kandrio <[email protected]>
for segment in segments:
File "/root/exit/envs/python39/lib/python3.9/site-packages/faster_whisper/transcribe.py", line 884, in restore_speech_timestamps
for segment in segments:
File "/root/exit/envs/python39/lib/python3.9/site-packages/faster_whisper/transcribe.py", line 396, in generate_segments
encoder_output = self.encode(segment)
File "/root/exit/envs/python39/lib/python3.9/site-packages/faster_whisper/transcribe.py", line 574, in encode
return self.model.encode(features, to_cpu=True)
RuntimeError: not implemented in CANN
I have finished compiling, but I get this error during use. Is there a good way to resolve it?
In order for faster-whisper to work, additional tensor operators have to be implemented in CANN. This task is already completed on our side; nevertheless, we have not pushed it to GitHub yet due to a change in priorities.
I have finished compiling with CANN 7.0.0.beta1 and, following the documentation example, ran into this problem:
Does the CANN backend only support CANN 7.0.RC1.alpha001?
@3manifold Could you upload the part of the code (related to Whisper) that is not implemented on CANN to your ct2-cann branch? It would be a great help. Thank you.
CANN Backend support
Introduction
CANN (Compute Architecture for Neural Networks), developed by Huawei, is a heterogeneous computing architecture for AI scenarios. It provides multi-layer programming interfaces to help users quickly build AI applications and services based on the Ascend platform.
The CANN backend in CTranslate2 enables running AI models on the Ascend NPU, extending the existing CPU & CUDA workflows. One can find more on Ascend NPU and the CANN library here. Examples of projects that support CANN include ONNX Runtime & OpenCV.
resolves #1609
Notes
c output shape incorrectly when input a is transpose? #1583.
Implementation
CANN backend support implementation introduces Device::CANN similarly to CPU & CUDA. The CANN workflow can be enabled using -DWITH_CANN=ON in the cmake configuration (see examples/cann). As with CUDA, the CANN workflow can coexist alongside the CPU workflow. The CANN workflow is accessible through the examples (examples/cann/main.cc), the cli or the Python module.
Operators & primitives were implemented for CANN in order for the end-to-end example in the ctranslate2 documentation to run successfully.
Tests
Tests were extended for Device::CANN & the respective DataType. Additional tests were also implemented involving extra/edge cases. Gtest output: gtest_cann.log
Environment Setup
AArch64.run category (current implementation involved CANN 7.0.RC1.alpha001).
docker/cann.
For details about how to set up the development environment and operating environment, see Development and Operating Environment Setup and CANN Software Installation Guide.
Build CANN Python module
The CANN Python module is expected to be built using the respective Docker files. Nevertheless, here we provide a quick way of building it, ideal for testing and benchmarking.
Build CANN C++ example
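The build commands themselves did not survive on this page. As a rough sketch only, the configure step could be assembled as below. The only option taken from the PR description is -DWITH_CANN=ON; the source and build directories passed in are hypothetical placeholders, and the helper name is made up for illustration.

```python
from pathlib import Path
import shlex


def cmake_configure_args(source_dir, build_dir, with_cann=True):
    """Return the argv list for a CMake configure step.

    Only the WITH_CANN option comes from the PR description; the
    directories are hypothetical placeholders supplied by the caller.
    """
    flag = "ON" if with_cann else "OFF"
    return [
        "cmake",
        "-S", str(Path(source_dir)),   # source tree containing CMakeLists.txt
        "-B", str(Path(build_dir)),    # out-of-source build directory
        f"-DWITH_CANN={flag}",         # toggles the CANN workflow
    ]


# Hypothetical directories, used only to show the resulting command line.
args = cmake_configure_args("path/to/source", "path/to/build")
print(shlex.join(args))
```

The returned list can be handed to subprocess.run(args, check=True) on a machine where cmake and the CANN toolkit are installed.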
Samples
Python
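The original Python sample is not preserved here. The following is a minimal sketch of what using the Python module might look like, not the PR's actual sample. It assumes the CANN build accepts the device string "cann" (the description only names Device::CANN on the C++ side); ctranslate2.Translator and translate_batch are the regular CTranslate2 Python API, and the two helper names are hypothetical.

```python
def load_translator(model_dir, device="cann"):
    """Create a translator on the requested device.

    Assumption: a CANN-enabled build accepts device="cann". The PR
    description only mentions Device::CANN, so treat this string as a guess.
    """
    import ctranslate2  # imported lazily so translate_tokens works without it

    return ctranslate2.Translator(model_dir, device=device)


def translate_tokens(translator, tokens):
    """Translate one pre-tokenized sentence and return the best hypothesis."""
    results = translator.translate_batch([tokens])
    # Each result carries a list of hypotheses; the first one is the best.
    return results[0].hypotheses[0]
```

With a converted model in ./ende_ctranslate2/ (the directory used in the CLI sample), a call would look like translate_tokens(load_translator("./ende_ctranslate2/"), ["▁H", "ello", "▁world", "!"]).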
C++
Execution example in C++ can be found in examples/cann.
CLI
echo "▁H ello ▁world !" | ./ct2-translator --model "./ende_ctranslate2/"
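The input piped into the CLI above is already split into SentencePiece-style pieces, where the "▁" character marks the start of a word. A small sketch of how such a line maps back to plain text follows; the function names are hypothetical and this is not part of the PR.

```python
WORD_START = "\u2581"  # the "▁" marker used by SentencePiece-style pieces


def split_pieces(line):
    """Split a whitespace-separated line of pieces into a token list."""
    return line.split()


def detokenize(pieces):
    """Join pieces and turn each word-start marker back into a space."""
    return "".join(pieces).replace(WORD_START, " ").strip()


pieces = split_pieces("▁H ello ▁world !")
print(pieces)
print(detokenize(pieces))
```

The same token list is what a Python caller would pass to the translator in place of the piped text line.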
Benchmark
We conducted several runs measuring translation latency using all 192 CPU cores and 1 NPU device for a single batch.
Specifically, the experiments report results for 4 consecutive runs involving 4 and 306 tokens respectively. The NPU proved
faster in all cases.
Input tokens
4 tokens
306 tokens
Hardware
Experiments
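The measured numbers are not preserved on this page. As an illustration of the procedure described above (latency over 4 consecutive runs of a single batch), a timing harness could look like the sketch below. The dummy_translate function is a hypothetical stand-in for the real translation call, and nothing here reproduces the PR's actual benchmark script.

```python
import statistics
import time


def measure_latency(fn, runs=4):
    """Call fn() `runs` times and return per-run wall-clock latencies in seconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return latencies


def dummy_translate():
    # Hypothetical stand-in for a single-batch translation call.
    sum(range(10_000))


latencies = measure_latency(dummy_translate, runs=4)
print(f"runs={len(latencies)} mean={statistics.mean(latencies):.6f}s")
```

Reporting each run separately, rather than only the mean, keeps any warm-up cost of the first run visible.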