v4.5.0 is not compatible with torch>=2.*.*+cu121
#1806
Comments
Update: it's compatible with |
Thanks to me and my analysis here...I want full citation and credit please: ;-) SYSTRAN/faster-whisper#1082 (comment) @MahmoudAshraf97 please let me know if you need to know how I prefer my contributions to be cited. Thanks. |
Thank you for the list of compatible versions. I have the following versions installed in a Docker container, but I still get the cuDNN error. Am I missing something?
OS:
nvidia-smi:
Name: torch
Name: ctranslate2
Name: nvidia-cudnn-cu12
Name: nvidia-cuda-cupti-cu12
Name: nvidia-cublas-cu12
Name: nvidia-cuda-runtime-cu12 |
which cudnn error are you getting exactly? |
@MahmoudAshraf97 |
@MarkusGehrig Can you try If you found the lib, you can check if this lib is in the runtime search path. If not, try to add it into LD_LIBRARY_PATH and see what happens. |
Thanks for the quick response, but I still get the error.
ldconfig -p | grep cudnn_cnn
printenv LD_LIBRARY_PATH
I have also tried installing it with pip, with no change. |
Try to set only And what is the result of this: |
It works like a charm now. The library directory needed to be appended to LD_LIBRARY_PATH:
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/lib/x86_64-linux-gnu/
Thanks! |
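A minimal Python sketch of the fix above, for anyone who wants to generate the export line from a launcher script. The helper name `append_library_dir` is hypothetical and not part of any library; the directory is the one reported in this thread and may differ on other systems.

```python
import os

SEP = ":"  # LD_LIBRARY_PATH is a Linux variable and uses ":" as separator


def append_library_dir(current: str, directory: str) -> str:
    """Return an LD_LIBRARY_PATH value with `directory` appended at most once."""
    # Drop empty entries that come from a leading, trailing, or doubled ":".
    entries = [p for p in current.split(SEP) if p]
    # Compare without trailing slashes so "/lib/x/" and "/lib/x" count as equal.
    normalized = [p.rstrip("/") for p in entries]
    if directory.rstrip("/") not in normalized:
        entries.append(directory)
    return SEP.join(entries)


if __name__ == "__main__":
    # Directory taken from the comment above; adjust it for your own system.
    updated = append_library_dir(
        os.environ.get("LD_LIBRARY_PATH", ""), "/lib/x86_64-linux-gnu/"
    )
    print(f"export LD_LIBRARY_PATH={updated}")
```

The script only prints the export line. The dynamic loader reads LD_LIBRARY_PATH when a process starts, so the variable has to be set in the shell or Dockerfile before Python is launched, not changed from inside the running interpreter.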
For everybody's future reference: to investigate possible compatibility issues with other CUDA libraries or platforms in general, see the cuDNN support matrix. In the lower left of the page you can select the cuDNN version, and it brings up a page that explains all the nuances: https://docs.nvidia.com/deeplearning/cudnn/v9.1.1/reference/support-matrix.html This might help with troubleshooting based on platform, etc. Also, if you need to correlate what version of cuDNN you have, you can go to either of these places: https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/ |
I noticed that this still says it's using CUDA 12.2 while at the same time using cuDNN 9.1? Prior to 4.5.0, it was cuDNN 8.8.0 and CUDA 12.2? https://github.com/OpenNMT/CTranslate2/blob/master/python/tools/prepare_build_environment_windows.sh
I'm guessing that's because CUDA 12.2 is compatible with cuDNN 8.8.0 still? However, Ctranslate2 4.5.0+ requires cuDNN 9.1 and is only compatible with CUDA 12.4 (assuming you're also using torch, which only has builds for 12.1 and 12.4). Not sure if it's redundant because I already posted on
Ctranslate2 3.24.0 - last to use cuDNN 8.1.1 with CUDA 11.2.2 by default
torch 2.5.0 - supports CUDA 11.8, 12.1, and 12.4
cuDNN 8.9.7 supports CUDA 11 through 12.2
CORRECTION: Beginning with cuDNN 9.1.0 I believe that cuDNN in general is forward compatible with CUDA 12.x
Here are my sources: https://docs.nvidia.com/deeplearning/cudnn/v9.1.0/release-notes.html#cudnn-9-1-0
https://docs.nvidia.com/deeplearning/cudnn/v9.1.0/reference/support-matrix.html
Regardless...right now...unless you're using |
The same applies to Google Colab users. @MahmoudAshraf97 Could you check if with FYI, cuDNN 9 should be compatible with all CUDA versions, see here. It's weird that it does not work with some torch versions. In my opinion, it could be a problem with finding the library path. |
Apparently there are two cuDNN versions installed in Colab:
ldconfig -p | grep cudnn_cnn
libcudnn_cnn_train.so.8 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8
libcudnn_cnn_train.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_cnn_train.so
libcudnn_cnn_infer.so.8 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8
libcudnn_cnn_infer.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_cnn_infer.so
pip show nvidia-cudnn-cu12
Name: nvidia-cudnn-cu12
Version: 9.5.0.50
Summary: cuDNN runtime libraries
Home-page: https://developer.nvidia.com/cuda-zone
Author: Nvidia CUDA Installer Team
Author-email: [email protected]
License: NVIDIA Proprietary Software
Location: /usr/local/lib/python3.10/dist-packages
Requires: nvidia-cublas-cu12
Required-by:
adding the pip cudnn path to |
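The mismatch in the comment above (system libraries ending in `.so.8`, pip package at 9.5.0.50) can be spotted with a short Python sketch. The function names are hypothetical, and the sample text is copied from this thread rather than read from a live system.

```python
import re
from collections import defaultdict

# Matches `ldconfig -p` lines such as:
#   libcudnn_cnn_infer.so.8 (libc6,x86-64) => /lib/x86_64-linux-gnu/...
SONAME = re.compile(r"^\s*(libcudnn\w*)\.so(?:\.(\d+))?(?:\.\d+)*\s")


def system_cudnn_majors(ldconfig_output):
    """Map each cuDNN library name to the major versions the loader cache lists."""
    found = defaultdict(set)
    for line in ldconfig_output.splitlines():
        match = SONAME.match(line)
        if match and match.group(2):  # skip unversioned .so symlinks
            found[match.group(1)].add(int(match.group(2)))
    return dict(found)


def pip_cudnn_major(version):
    """Major version from a `pip show nvidia-cudnn-cu12` Version field."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    # Sample copied from the comment above, not from a live system.
    sample = (
        "libcudnn_cnn_train.so.8 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8\n"
        "libcudnn_cnn_train.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_cnn_train.so\n"
        "libcudnn_cnn_infer.so.8 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8\n"
        "libcudnn_cnn_infer.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_cnn_infer.so\n"
    )
    wanted = pip_cudnn_major("9.5.0.50")
    for name, majors in sorted(system_cudnn_majors(sample).items()):
        if wanted not in majors:
            print(f"{name}: loader cache has {sorted(majors)}, pip package is v{wanted}")
```

On a real machine the sample string could be replaced with the captured output of `ldconfig -p`. The check only compares major versions; it does not tell you which copy a given process will actually load.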
@BBC-Esq the problem here is not CUDA vs cuDNN compatibility, because per the compatibility matrix we should not be having this issue. It's an environment setup issue that might involve a mismatch between CUDA/cuDNN paths. |
I partially understand I think...Will ponder further...In the meantime...you mentioned possibly using the pip CUDA libraries... I spent the time to aggregate the versions of ALL CUDA-RELATED LIBRARIES on pypi.org by Nvidia release. Here are the versions of all such libraries organized by CUDA Toolkit official Release Number: LONG LIST HERE
These libraries, while available in the
With that being said, you might also consider this script or a modification of it...I'm still customizing it for my own program, actually. Here's the script: https://github.com/NVIDIA/build-system-archive-import-examples/blob/main/parse_redist.py Anyhow...I am confident in this versioning...I downloaded all official releases as seen here: I inspected all files that the installer extracts to a temporary directory...there is always a "version.json" file within a sub-folder named "CUDAToolkit" that contains the correlating versions. Here is a brief excerpt of one such file. The version numbers ALWAYS correlate to the versions list on pypi.org. Moreover, they ALWAYS correlate to "distributables" on Nvidia's website. |
@MahmoudAshraf97 please note...you use the If you use the aforementioned download script, however, you will get: This could make a difference if, for example, |
In the meantime, pursuant to our discussion, I've created a script that will download the appropriate CUDA Toolkit files (by version), or you can choose to download the cuDNN files. Remember, you must still set the appropriate PATH and other variables...and you must still make sure that the cuDNN version you're using is compatible with CUDA and/or Torch and/or Ctranslate2 and/or any other library you plan to use in your program. You will only need to
FULL SCRIPT HERE
```
__version__ = "0.5.0"
minimum = "3.8"
import sys
import argparse
ARCHIVES = {}
CUDA_RELEASES = {
CUDNN_RELEASES = [
PRODUCTS = {
OPERATING_SYSTEMS = {
ARCHITECTURES = {
VARIANTS = {
COMPONENTS = {
def err(msg):
def fetch_file(full_path, filename):
def fix_permissions(directory):
def flatten_tree(src, dest, tag=None):
def parse_artifact(
def fetch_action(
def post_action(output_dir, collapse=True):
class DownloadWorker(QThread):
class DownloaderGUI(QMainWindow):
def main():
if __name__ == "__main__":
```
|
Official compatibility matrix that I found at: https://github.com/pytorch/pytorch/blob/main/RELEASE.md#release-compatibility-matrix
|
Hello, I’m running WhisperX on Databricks and encountering the same error:
Databricks Cluster Runtime: Packages - version:
I’ve tried running:
But there was no change. Could someone help me here? I’ve been trying to find a solution for almost a week. |
Here is an ongoing updated .txt that I've been working on if it helps...
RIDICULOUSLY LONG TXT FILE HERE
Following is the Release Compatibility Matrix for PyTorch releases:
CUDNN: 8.9.7.29 = pip install nvidia-cudnn-cu12==8.9.7.29 # cuda 12.2 update 1 recommended for newer GPUs
CUDA 12.6.2:
pip install nvidia-cuda-runtime-cu12==12.6.77
pip install nvidia-cuda-cupti-cu12==12.6.80
CUDA 12.6.1:
pip install nvidia-cuda-runtime-cu12==12.6.68
pip install nvidia-cuda-cupti-cu12==12.6.68
CUDA 12.6.0:
pip install nvidia-cuda-runtime-cu12==12.6.37
pip install nvidia-cuda-cupti-cu12==12.6.37
CUDA 12.5.1:
pip install nvidia-cuda-runtime-cu12==12.5.82
pip install nvidia-cuda-cupti-cu12==12.5.82
CUDA 12.5.0:
pip install nvidia-cuda-runtime-cu12==12.5.39
pip install nvidia-cuda-cupti-cu12==12.5.39
CUDA 12.4.1:
pip install nvidia-cuda-runtime-cu12==12.4.127
pip install nvidia-cuda-cupti-cu12==12.4.127
CUDA 12.4.0:
pip install nvidia-cuda-runtime-cu12==12.4.99
pip install nvidia-cuda-cupti-cu12==12.4.99
CUDA 12.2.2:
pip install nvidia-cuda-runtime-cu12==12.2.140
pip install nvidia-cuda-cupti-cu12==12.2.142
CUDA 12.1.1:
pip install nvidia-cuda-runtime-cu12==12.1.105
pip install nvidia-cuda-cupti-cu12==12.1.105
CUDA 12.1.0:
pip install nvidia-cuda-runtime-cu12==12.1.55
pip install nvidia-cuda-cupti-cu12==12.1.62
CUDA 11.8.0:
pip install nvidia-cuda-runtime-cu11==11.8.89
pip install nvidia-cuda-cupti-cu11==11.8.87
nvidia-npp-cu12: Version 12.3.1.54 was shared between CUDA 12.6.2 and CUDA 12.6.1
nvidia-nvjpeg-cu12: Version 12.3.3.54 was shared between CUDA 12.6.2 and CUDA 12.6.1
nvidia-cuda-opencl-cu12: Version 12.5.39 was shared between CUDA 12.5.1 and CUDA 12.5.0
If your situation involves having to set PATH and what not, however, I'm afraid I can't help there. I'm on Windows...
Also, it's built with CUDA 12.2 in mind, which is seen here... https://github.com/OpenNMT/CTranslate2/blob/master/python/tools/prepare_build_environment_linux.sh Here is what AI says when I fed it your question and my matrices... You'll have to analyze all of this because that's as far as I've gotten. Hope it helps... |
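The per-release pins listed in this comment can be kept as a small lookup table. This is a sketch only: the version numbers are copied from the comment as posted and have not been re-checked against PyPI, only a few releases are included, and the names `PINS` and `pip_requirements` are hypothetical.

```python
# CUDA release -> (nvidia-cuda-runtime version, nvidia-cuda-cupti version),
# copied from the list in the comment above (subset shown).
PINS = {
    "12.6.2": ("12.6.77", "12.6.80"),
    "12.4.1": ("12.4.127", "12.4.127"),
    "12.2.2": ("12.2.140", "12.2.142"),
    "12.1.1": ("12.1.105", "12.1.105"),
    "11.8.0": ("11.8.89", "11.8.87"),
}


def pip_requirements(cuda_release):
    """Return pinned requirement strings for one CUDA release in PINS."""
    runtime, cupti = PINS[cuda_release]  # KeyError for releases not listed
    # Package names carry the CUDA major version: cu12 or cu11.
    suffix = "cu" + cuda_release.split(".")[0]
    return [
        f"nvidia-cuda-runtime-{suffix}=={runtime}",
        f"nvidia-cuda-cupti-{suffix}=={cupti}",
    ]


if __name__ == "__main__":
    for requirement in pip_requirements("12.4.1"):
        print(f"pip install {requirement}")
```

Keeping the runtime and CUPTI versions as a pair matters because, as the list shows, they do not always share the same patch number for a given CUDA release.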
I was having this issue with CUDA 12.4.1 and Torch 2.5.1. It was resolved by manually setting LD_LIBRARY_PATH, but why is that necessary? Under normal circumstances I would expect it to find the lib automatically |
Because now that |
See my pull request here to try and address some compatibility issues and for generally useful info: |
Hello,
The latest release works great with versions of PyTorch that use the Python CUDA packages, but when a torch build that is precompiled with CUDA binaries is installed, this error appears:
The only solution so far is to downgrade to 4.4.0, even though cuDNN v9.1 is installed both via pip and bundled with PyTorch.
jhj0517/Whisper-WebUI#348
Update:
As per @BBC-Esq's research, ctranslate2>=4.5.0 uses cuDNN v9, which requires CUDA >= 12.3.
Since most issues come from conflicting torch and ctranslate2 installations, these are tested working combinations:
torch 2.*.*+cu121: ctranslate2 <=4.4.0
torch 2.*.*+cu124: ctranslate2 >=4.5.0
torch >=2.4.0: ctranslate2 >=4.5.0
torch <2.4.0: ctranslate2 <4.5.0
For Google Colab users, the quick solution is to downgrade to 4.4.0 as of 24/10/2024, as Colab uses torch==2.5.0+cu121.