Consider providing interim instructions for Linux "happy path" using Docker #802
Comments
Edited because I'm occasionally Very Dumb(TM) and forgot to actually run this with the GPU passthrough. The following comment is now accurate. Edit 2: Except this doesn't work with the PyTorch example in I need to basically remove the
In addition, it's recommended to add
I'm not sure if you got it working, but I'm learning ML with this library, and this is the Dockerfile for my dev environment:

FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04
# basic tools
RUN apt update \
    && apt install -y --no-install-recommends \
        git vim openssh-client gnupg curl wget ca-certificates unzip zip less zlib1g sudo coreutils sed grep
#
# cargo/rust
ENV RUSTUP_HOME=/usr/local/rustup
ENV CARGO_HOME=/usr/local/cargo
ENV PATH=/usr/local/cargo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# https://blog.rust-lang.org/2022/06/22/sparse-registry-testing.html
ENV CARGO_UNSTABLE_SPARSE_REGISTRY=true
RUN set -eux; \
    apt update \
    && apt install -y --no-install-recommends \
        ca-certificates gcc build-essential; \
    url="https://static.rust-lang.org/rustup/dist/x86_64-unknown-linux-gnu/rustup-init"; \
    wget "$url"; \
    chmod +x rustup-init; \
    ./rustup-init -y --no-modify-path --default-toolchain nightly; \
    rm rustup-init; \
    chmod -R a+w $RUSTUP_HOME $CARGO_HOME; \
    rustup --version; \
    cargo --version; \
    rustc --version;
#
# https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#environment-setup
RUN echo "export PATH=/usr/local/cuda-12.1/bin${PATH:+:${PATH}}" >> ~/.bashrc

That's for:

[dependencies.dfdx]
version = "0.13.0"
default-features = false
features = [
"std",
"fast-alloc",
"cpu",
"cuda",
"cudnn",
"safetensors",
"numpy",
"nightly",
]
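
For anyone wanting to try the Dockerfile above, here is a minimal sketch of building the image and entering it with GPU passthrough. It assumes the Dockerfile sits in the project root and that the host already has the NVIDIA Container Toolkit set up, which is what makes `--gpus all` work. The image tag `dfdx-dev`, the `/work` mount point, and the `run` and `build_and_enter` helpers are made up for illustration. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Hypothetical image tag; override with IMAGE_TAG=... if you prefer another name.
IMAGE_TAG="${IMAGE_TAG:-dfdx-dev}"
# Print commands by default; set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Either print the command or execute it, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_and_enter() {
    # Build the image from the Dockerfile in the current directory.
    run docker build -t "$IMAGE_TAG" . &&
    # Open a shell in the container with all GPUs exposed and the
    # current project mounted at /work (an assumed mount point).
    run docker run --rm -it --gpus all -v "$PWD:/work" -w /work "$IMAGE_TAG" bash
}

build_and_enter
```

Once inside the container, running `nvidia-smi` is a quick way to check that the GPU is visible before kicking off `cargo build`.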
Hello,

I spent part of this afternoon banging my head against a wall trying to get dfdx with the cuda feature enabled up and running on my computer. It turns out a big part of this was that my CUDA version (11.2) doesn't appear to work well with the build.rs script, with errors appearing in multiple steps. As I think I may have mentioned in previous issues, my setup isn't particularly exotic (just the recent Pop!_OS release with the default NVIDIA drivers), so I suspect that other folks may run into the same issue.

According to System76's docs, the recommended way of dealing with a CUDA version mismatch is just to use Docker. While this isn't ideal (I don't love having to rely on Docker), I can confirm that this solved most of my build issues: I first followed the GPU-enabled container instructions in the link above, then built a dfdx-specific container using the Dockerfile below (which takes a hot minute to build).

I was just thinking that it might be worth adding this kind of process to the crate's documentation to help other people who may run into the same issue, at least until it becomes clear that the base NVIDIA-enabled system configurations shipped with distros like Pop!_OS/Ubuntu are able to support dfdx's build script.
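
As a quick diagnostic for the version mismatch described above, something like the following could show which CUDA toolkit version is on the PATH, either on the host or inside the container. This is only a sketch: the `cuda_release` helper is hypothetical, and it assumes `nvcc --version` prints a line containing `release X.Y`.

```shell
#!/bin/sh
# Read `nvcc --version` output on stdin and print the "X.Y" that follows "release".
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA toolkit: $(nvcc --version | cuda_release)"
else
    echo "nvcc not found on PATH"
fi
```

Comparing the host's value against the one reported inside the container makes it easier to tell whether a build failure comes from the toolkit version or from something else.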