-
log in to prince (tutorial)
-
once you are in your home folder on prince, clone this repo:
git clone git@github.com:JackLangerman/simpleprince.git
-
run
setup.sbatch
as a batch job from the simpleprince folder to install miniconda and set up the ml environment. Feel free to add your favorite packages to environment.yml before running this step (you can do it later too):
cd simpleprince
sbatch setup.sbatch
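to see whether the setup job has finished, you can check the queue and the job's log; this assumes SLURM's default output file name of slurm-<JOBID>.out, which setup.sbatch may override:
squeue -u YOUR_NET_ID
cat slurm-<JOBID>.out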
to launch a jupyter notebook on prince (once the ml environment exists; a rough sketch of what such a batch script contains follows this list):
On CPU (what you will want to use most of the time)
sbatch run-jupyter.sbatch
On GPU (if you need it)
sbatch run-jupyterGPU.sbatch
Also remember you can use a jupyter notebook with a GPU (or even a TPU) for free on Google Colab
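for orientation, a SLURM batch script that launches Jupyter usually looks something like the sketch below; this is only an illustration of the moving parts, not the contents of run-jupyter.sbatch, and the resource requests are placeholders:
#!/bin/bash
#SBATCH --job-name=jupyter
#SBATCH --time=04:00:00
#SBATCH --mem=8GB
#SBATCH --cpus-per-task=2
# (a GPU variant would additionally request a device, e.g. #SBATCH --gres=gpu:1)

module swap python3/intel anaconda3/5.3.1        # make conda available
source activate ml                               # the environment built by setup.sbatch
jupyter notebook --no-browser --ip=$(hostname)   # note the host/port it prints and connect via an SSH tunnel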
-
to check the status of your job and where it is running:
squeue -u YOUR_NET_ID
-
once you use squeue to see where your job is running, you can get a terminal on that machine by SSHing into it, e.g.:
squeue -u YOUR_NET_ID
   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
15997001     p40_4    setup   jl4722  R       0:18      1 gpu-52
we see the job is running on
gpu-52
so we can ssh in using
ssh gpu-52
-
once you ssh into the machine your job is running on, you can monitor GPU status using
watch nvidia-smi
(exit by pressing ctrl-c)
-
add new packages to your environment by putting them in your environment.yml file and then calling
conda env update --prune
from inside the simpleprince folder
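for example (the package name is just a stand-in for whatever you want to add):
cd ~/simpleprince
# edit environment.yml and add the package as a new line under "dependencies:",
# e.g. "  - scikit-learn" (indented to match the existing entries)
conda env update --prune    # updates the ml environment to match the file; --prune drops packages no longer listed
-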
you can always access your ml environment on prince from any compute job by loading anaconda and then activating the environment:
module load tensorflow/python3.6/1.5.0
module swap python3/intel anaconda3/5.3.1
source activate ml
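for example, a batch job that uses the environment might look roughly like this (a sketch only; the job name, resource requests, and my_script.py are placeholders for your own):
#!/bin/bash
#SBATCH --job-name=my-ml-job
#SBATCH --time=01:00:00
#SBATCH --mem=8GB
# add e.g. #SBATCH --gres=gpu:1 if the job needs a GPU

module load tensorflow/python3.6/1.5.0
module swap python3/intel anaconda3/5.3.1
source activate ml
python my_script.py    # your own code goes here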
-
to check that GPUs are working in Jupyter:
-
run a Jupyter GPU notebook after following the steps above
-
make a new terminal from Jupyter
-
run the test script
python ~/simpleprince/testgpu.py
OR from a standard terminal on prince:
ssh gpu-##
module load tensorflow/python3.6/1.5.0
module swap python3/intel anaconda3/5.3.1
source activate ml
python ~/simpleprince/testgpu.py
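for a quick inline sanity check (illustrative only, not the contents of testgpu.py), the TensorFlow 1.x module loaded above can report whether it sees a GPU:
python -c 'import tensorflow as tf; print(tf.test.is_gpu_available())'    # should print True on a GPU node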
-