Added a simple rule for executing pytorch in conda environment. (fix #3) #5

Closed
wants to merge 12 commits into from

KSoumya
Collaborator

@KSoumya KSoumya commented Aug 15, 2024

Fixes #3.

@KSoumya KSoumya requested a review from balhoff August 15, 2024 16:31
Member

@balhoff balhoff left a comment

@KSoumya there are a lot of extraneous changes in this PR, which I think were already merged in #4. Can you update your branch by merging the main branch into it?

@balhoff
Member

balhoff commented Aug 15, 2024

Also, put something like "fixes #3" in the PR description.

@KSoumya KSoumya changed the title from "Added a simple rule for executing pytorch in conda environment." to "Added a simple rule for executing pytorch in conda environment. (fix #3)" Aug 15, 2024
@KSoumya
Collaborator Author

KSoumya commented Aug 15, 2024

updated PR description

Member

@balhoff balhoff left a comment

Thanks, I left a few inline questions and comments.

snakemake_conda/env.yaml (outdated)
snakemake_conda/env.yaml (outdated)
snakemake_conda/Snakefile (outdated)
snakemake_conda/Snakefile (outdated)
snakemake_conda/Snakefile (outdated)
@balhoff
Member

balhoff commented Aug 16, 2024

I just realized this is a new Snakefile in a subfolder. Why not add to our existing file?

@balhoff
Member

balhoff commented Aug 16, 2024

I tried running and get this error:

Traceback (most recent call last):
  File "/home/balhoff/test-rule/char-sim/snakemake_conda/.snakemake/scripts/tmplbi3nw0_.sample_script.py", line 5, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'

I don't see torch in the env.yaml; should it be there?

@KSoumya
Collaborator Author

KSoumya commented Aug 19, 2024

I just realized this is a new Snakefile in a subfolder. Why not add to our existing file?

the subfolder is now removed and the existing Snakefile is updated with a new rule.

@balhoff
Member

balhoff commented Aug 20, 2024

@KSoumya thanks for the updates; I am trying it out.

@balhoff
Member

balhoff commented Aug 21, 2024

@KSoumya when I run I get this error:

Traceback (most recent call last):
  File "/home/balhoff/test-rule/char-sim/.snakemake/scripts/tmp1a2ojqu_.create_train_data.py", line 8, in <module>
    import pandas as pd
ModuleNotFoundError: No module named 'pandas'

I see that pandas is in environment.yaml, but under 'pip' rather than directly in 'dependencies'. What is the difference?

@hlapp
Member

hlapp commented Aug 21, 2024

I see that pandas is in environment.yaml, but under 'pip' rather than directly in 'dependencies'. What is the difference?

The difference is only in how it gets installed (via pip from PyPI, or via conda from a conda channel). The error suggests that either the conda environment wasn't created or it isn't activated for the particular step that requires it.
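
One way to tell those two cases apart is to ask the interpreter that runs the failing step which environment it belongs to and which modules it can see. The following is only a minimal sketch using the standard library; the function names `missing_modules` and `describe_environment` are made up for illustration and are not part of this repository.

```python
import importlib.util
import os
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def describe_environment():
    """Collect details that show which Python environment is actually in use."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    print(describe_environment())
    # torch and pandas are the two modules reported missing in this thread.
    print("missing:", missing_modules(["torch", "pandas"]))
```

If `conda_env` comes back as `None`, or `prefix` points at a system Python, the step is probably not running inside the intended environment, regardless of whether pandas was listed under `pip` or directly under `dependencies`.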

@balhoff
Member

balhoff commented Aug 21, 2024

@KSoumya @hlapp now that I actually have conda installed, this is working for me (I installed miniforge and mamba). But I needed to edit environment.yaml. I initially got some conflicts which seemed to be between the version of snakemake I have (presumably one of the newest) and a very old version of python (3.8.19) that is specified in environment.yaml.

My intuition (without being familiar with snakemake/conda practices) would be that the environment should be specified in the most minimal way possible. But I'm not sure if this file is supposed to act as a statement of the direct dependencies or instead like a lock file. But as written it didn't work for me; maybe it would have if I had a specific version of conda or snakemake?

@KSoumya based on the shell snippet you sent me, I think your background environment may have more installed into it, rather than setting up the environment in the rule:

snakemake --cores 4 --use-singularity id12_desc12_simGIC.tsv.gz

I needed to use --use-conda so that the environment was created when the rule was run:

snakemake -c4 --show-failed-logs --use-singularity --use-conda id12_desc12_simGIC.tsv.gz

Maybe this is why you didn't run into these issues in your own runs.
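
To make it harder to forget the flag, the invocation could be assembled in one place. This is a rough sketch only: the helper `snakemake_command` is hypothetical, and it uses just the command-line options already quoted in this comment.

```python
import shlex


def snakemake_command(target, cores=4, use_conda=True, use_singularity=True):
    """Build the snakemake argument list for one target file."""
    cmd = ["snakemake", "--cores", str(cores), "--show-failed-logs"]
    if use_singularity:
        cmd.append("--use-singularity")
    if use_conda:
        # Needed so that the rule's conda environment is created and used.
        cmd.append("--use-conda")
    cmd.append(target)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(snakemake_command("id12_desc12_simGIC.tsv.gz")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` when the command should actually be executed.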

@KSoumya
Collaborator Author

KSoumya commented Aug 21, 2024

@balhoff your snakemake command makes complete sense; --use-conda does need to be enabled. I will check how to make the environment.yaml more generic.

@balhoff
Member

balhoff commented Aug 21, 2024

@KSoumya I also forgot to say—in the snakemake docs it says that without that flag, the conda environment property in a rule is entirely ignored.

@KSoumya
Collaborator Author

KSoumya commented Aug 21, 2024

@KSoumya I also forgot to say—in the snakemake docs it says that without that flag, the conda environment property in a rule is entirely ignored.

That's right. Since I had the env defined and activated during my runs, I didn't come across this requirement. Thanks for sharing this.

@hlapp
Member

hlapp commented Aug 21, 2024

My intuition (without being familiar with snakemake/conda practices) would be that the environment should be specified in the most minimal way possible. But I'm not sure if this file is supposed to act as a statement of the direct dependencies or instead like a lock file.

It can work as either. In the form exported using conda env export, all versions are "locked". This is often desirable: installing a later version of some dependency when run at a later time will not only result in a different environment, but can (and in practice often does) break code that isn't forward compatible.

I do agree that Python 3.8.x is relatively old at this point, and that Python shouldn't need to be held at this version. That is, unless, I think, we're using TensorFlow 1. But I thought we're using Torch, and 3.11 is generally supported by recent versions of TensorFlow, Torch, etc.
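
To illustrate the two forms, here is a small sketch that turns a "locked" dependency list into a minimal one by dropping the version pins. It assumes the `dependencies` section of environment.yaml has already been loaded into a Python list (for example with a YAML parser). The helper names are hypothetical, and the package versions other than Python 3.8.19 are made up for the example. Channel-prefixed specs such as `channel::package` are not handled.

```python
import re

# Matches the package name at the start of a spec such as
# "python=3.8.19" (conda) or "pandas==2.0.3" (pip).
_NAME = re.compile(r"^[A-Za-z0-9_.\-]+")


def package_name(spec):
    """Return the bare package name from a conda or pip dependency spec."""
    match = _NAME.match(spec.strip())
    if match is None:
        raise ValueError(f"cannot parse dependency spec: {spec!r}")
    return match.group(0)


def minimal_dependencies(dependencies):
    """Drop version pins while keeping the conda/pip split intact."""
    result = []
    for entry in dependencies:
        if isinstance(entry, dict):
            # The nested {"pip": [...]} block from environment.yaml.
            result.append(
                {key: [package_name(s) for s in specs] for key, specs in entry.items()}
            )
        else:
            result.append(package_name(entry))
    return result


if __name__ == "__main__":
    # Illustrative "locked" list; only the Python version comes from this thread.
    locked = ["python=3.8.19", "pip", {"pip": ["pandas==2.0.3", "torch==2.1.0"]}]
    print(minimal_dependencies(locked))
```

A middle ground is to keep loose lower bounds for the few packages the code really depends on, and to keep the fully exported file separately when an exact reproduction is needed.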

Collaborator Author

@KSoumya KSoumya left a comment

snakemake completes execution, but exits with an OSError:

Complete log: .snakemake/log/2024-10-31T134614.865141.snakemake.log
Traceback (most recent call last):
  File "/home/skar/anaconda3/envs/snakemake_env/lib/python3.8/weakref.py", line 642, in _exitfunc
    f()
  File "/home/skar/anaconda3/envs/snakemake_env/lib/python3.8/weakref.py", line 566, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File "/home/skar/anaconda3/envs/snakemake_env/lib/python3.8/tempfile.py", line 826, in _cleanup
    cls._rmtree(name)
  File "/home/skar/anaconda3/envs/snakemake_env/lib/python3.8/tempfile.py", line 822, in _rmtree
    _shutil.rmtree(name, onerror=onerror)
  File "/home/skar/anaconda3/envs/snakemake_env/lib/python3.8/shutil.py", line 718, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/home/skar/anaconda3/envs/snakemake_env/lib/python3.8/shutil.py", line 655, in _rmtree_safe_fd
    _rmtree_safe_fd(dirfd, fullname, onerror)
  File "/home/skar/anaconda3/envs/snakemake_env/lib/python3.8/shutil.py", line 655, in _rmtree_safe_fd
    _rmtree_safe_fd(dirfd, fullname, onerror)
  File "/home/skar/anaconda3/envs/snakemake_env/lib/python3.8/shutil.py", line 655, in _rmtree_safe_fd
    _rmtree_safe_fd(dirfd, fullname, onerror)
  File "/home/skar/anaconda3/envs/snakemake_env/lib/python3.8/shutil.py", line 659, in _rmtree_safe_fd
    onerror(os.rmdir, fullname, sys.exc_info())
  File "/home/skar/anaconda3/envs/snakemake_env/lib/python3.8/shutil.py", line 657, in _rmtree_safe_fd
    os.rmdir(entry.name, dir_fd=topfd)
OSError: [Errno 39] Directory not empty: 'char-sim'

Working on resolving it.

@balhoff
Copy link
Member

balhoff commented Jan 7, 2025

Closing this since the main branch has gotten ahead. Will add back any missing pieces as needed.

@balhoff balhoff closed this Jan 7, 2025
Development

Successfully merging this pull request may close these issues.

add simple rule that utilizes a conda environment
3 participants