
ROCM support for AMD GPUs #566

Open
TheMaddax opened this issue Nov 8, 2023 · 5 comments

@TheMaddax

I've been using whisper on my AMD 6600 XT, which worked well with ROCm support. However, whisperx does not seem to support AMD GPUs. Is there any chance ROCm support could be implemented in whisperx?

@kandeshvari

As I understand it, the reason is the ctranslate2 package. There is an open issue about AMD support, but with no progress :( OpenNMT/CTranslate2#1072

@radna0

radna0 commented Jul 1, 2024

Any update on this?

@arlo-phoenix

@radna0 You can test my CTranslate2 ROCm fork. See OpenNMT/CTranslate2#1072 (comment). It works very well on my RX6800 GPU and should probably work on any AMD GPU that supports ROCm.

@Medhatt21

@arlo-phoenix Worked. Thanks a lot. Is there a chance to publish a docker image of whisperx with rocm support?

@TibixDev

TibixDev commented Jan 5, 2025

I spent a few hours trying to get this to work, but unfortunately I couldn't quite get it to run. I feel like I'm close, but I have zero clue about the last error I hit. Still, I hope this is useful to someone, because I managed to solve a ton of other problems along the way.

The only thing you will need to replace is https://your.url/pytorch_model.bin with a URL where the VAD model is hosted: this whisperx version hardcodes an S3 URL that the author has since disabled, so download pytorch_model.bin from the archive and upload it to a web server / CDN of your own. Put your audio samples in a directory called audio, in the same folder as the compose file.
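One simple way to host the VAD model yourself (my suggestion, not part of the original instructions) is Python's built-in http.server; the sed patch further down can then point at http://your-host:8000/pytorch_model.bin instead of https://your.url/pytorch_model.bin:

```shell
# Serve the current directory (which contains the downloaded pytorch_model.bin)
# over plain HTTP on port 8000
python3 -m http.server 8000
```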

Dockerfile:

FROM rocm/pytorch:latest

ENV PYTORCH_ROCM_ARCH=gfx1032
ENV HSA_OVERRIDE_GFX_VERSION=10.3.0

SHELL ["/bin/bash", "-c"]
RUN conda init bash && \
    echo "conda activate py_3.9" >> ~/.bashrc

RUN apt-get update && apt-get install -y \
    ffmpeg \
    git \
    wget \
    libomp5 \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /opt/whisperx

RUN git clone https://github.com/arlo-phoenix/CTranslate2-rocm.git --recurse-submodules && \
    cd CTranslate2-rocm && \
    source ~/.bashrc && \
    CLANG_CMAKE_CXX_COMPILER=clang++ \
    CXX=clang++ \
    HIPCXX="$(hipconfig -l)/clang" \
    HIP_PATH="$(hipconfig -R)" \
    cmake -S . -B build \
        -DWITH_MKL=OFF \
        -DWITH_HIP=ON \
        -DCMAKE_HIP_ARCHITECTURES=$PYTORCH_ROCM_ARCH \
        -DBUILD_TESTS=ON \
        -DWITH_CUDNN=ON && \
    cmake --build build -- -j$(nproc) && \
    cd build && \
    cmake --install . --prefix /opt/conda/envs/py_3.9 && \
    ldconfig

RUN source ~/.bashrc && \
    cd /opt/whisperx/CTranslate2-rocm/python && \
    pip install -r install_requirements.txt && \
    CPLUS_INCLUDE_PATH=/opt/conda/envs/py_3.9/include \
    LIBRARY_PATH=/opt/conda/envs/py_3.9/lib \
    python setup.py bdist_wheel && \
    pip install dist/*.whl

RUN source ~/.bashrc && \
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.1 --force-reinstall && \
    pip install transformers pandas nltk pyannote.audio==3.1.1 faster-whisper==1.0.1 -U && \
    pip install whisperx==3.1.1 --no-deps

# Patch asr.py to add default options missing in this whisperx version
# (note: adjust the py_3.10 paths below if your base image's conda env
# is named differently, e.g. py_3.9)
RUN sed -i '/"suppress_numerals": False/a \    "max_new_tokens": None,\n    "clip_timestamps": None,\n    "hallucination_silence_threshold": None,' \
    /opt/conda/envs/py_3.10/lib/python3.10/site-packages/whisperx/asr.py

# Patch the vad.py file to update the VAD_SEGMENTATION_URL
RUN sed -i 's|https://whisperx.s3.eu-west-2.amazonaws.com/model_weights/segmentation/0b5b3216d60a2d32fc086b47ea8c67589aaeb26b7e07fcbe620d6d0b83e209ea/pytorch_model.bin|https://your.url/pytorch_model.bin|' \
    /opt/conda/envs/py_3.10/lib/python3.10/site-packages/whisperx/vad.py

# Patch the checksum validation in vad.py
RUN sed -i '/if hashlib.sha256(model_bytes).hexdigest() != VAD_SEGMENTATION_URL.split/,+3d' \
    /opt/conda/envs/py_3.10/lib/python3.10/site-packages/whisperx/vad.py

ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/conda/envs/py_3.9/lib/

# Create symlink for libiomp5
RUN ln -s /opt/rocm-6.3.1/lib/llvm/lib/libiomp5.so /usr/lib/libiomp5.so && \
    ldconfig

# Create an entry script
RUN printf '#!/bin/bash\nsource ~/.bashrc\nwhile true; do sleep 86400; done\n' > /entrypoint.sh && \
    chmod +x /entrypoint.sh

ENTRYPOINT ["/entrypoint.sh"]

docker-compose.yml

version: '3.8'
services:
  whisperx:
    image: whisperx-rocm
    tty: true
    stdin_open: true
    devices:
      - /dev/kfd
      - /dev/dri
    group_add:
      - video
    volumes:
      - ./audio:/audio
    entrypoint: ["/bin/bash", "-c"]
    command: ["source ~/.bashrc && while true; do sleep 1; done"]

After building the image and creating a container with compose, open a shell inside the container (I did it with lazydocker, but use whatever you prefer) and execute whisperx /audio/yourfile.wav.
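A minimal sketch of that workflow with plain docker commands (the image name comes from the compose file above; the container name is whatever compose assigns, so check it with docker ps first):

```shell
# Build the image under the name referenced by docker-compose.yml
docker build -t whisperx-rocm .

# Start the service in the background
docker compose up -d

# Open a shell in the running container
# (replace the name with the one "docker ps" shows)
docker exec -it whisperx-whisperx-1 bash

# Inside the container: transcribe a sample mounted via ./audio
whisperx /audio/yourfile.wav
```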

Here's the last error I got that I was unable to resolve:

  warnings.warn(
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/asteroid_filterbanks/param_sinc_fb.py:94: UserWarning: Attempting to use hipBLASLt on an unsupported architecture! Overriding blas backend to hipblas (Triggered internally at ../aten/src/ATen/Context.cpp:296.)
  ft_low = torch.matmul(low, self.n_)
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/bin/whisperx", line 8, in <module>
    sys.exit(cli())
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/whisperx/transcribe.py", line 176, in cli
    result = model.transcribe(audio, batch_size=batch_size, chunk_size=chunk_size, print_progress=print_progress)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/whisperx/asr.py", line 194, in transcribe
    language = language or self.detect_language(audio)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/whisperx/asr.py", line 252, in detect_language
    encoder_output = self.model.encode(segment)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/whisperx/asr.py", line 86, in encode
    return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: parallel_for failed: hipErrorInvalidDeviceFunction: invalid device function
(py_3.9) root@2bd0988e5bb6:/opt/whisperx# 

I tried many things, but due to a myriad of factors, I wasn't able to use a newer version of whisperx or faster-whisper. Still, I hope this helps someone and that work can continue from this base.
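For what it's worth, hipErrorInvalidDeviceFunction typically means the kernels were not compiled for the GPU's actual gfx target. A diagnostic sketch (assuming rocminfo is available inside the container, as it normally is in the rocm/pytorch images) is to compare the reported ISA against the PYTORCH_ROCM_ARCH / CMAKE_HIP_ARCHITECTURES values used in the build above:

```shell
# List the gfx targets the ROCm runtime reports for this machine;
# they should match PYTORCH_ROCM_ARCH (gfx1032 in the Dockerfile above)
rocminfo | grep -o 'gfx[0-9a-f]*' | sort -u

# Show the HSA override in effect, if any (10.3.0 in the Dockerfile above)
echo "$HSA_OVERRIDE_GFX_VERSION"
```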
