Working with Conda environments on Shaheen 2

On Shaheen 2, /home is not mounted on compute nodes. This means that you can access the files in your $HOME directory from login nodes but not from compute nodes. Note also that, as of January 2022, the /project directory is read-only from compute nodes. Both restrictions have consequences if you are a conda user on Shaheen.

Here are some tips to keep the ball rolling and use conda environments on login and compute nodes on Shaheen.

In the documentation below, I am using my username and my project ID. Please replace them with your own credentials in the paths where they apply.

Install Conda in project directory

You will need to install miniconda in your project directory so that it is readable from compute nodes. First create a directory for it in your project directory with the following steps:

mkdir -p /project/k01/$USER/miniconda3
lfs setstripe -c 1 /project/k01/$USER/miniconda3

As an example, here is how I installed miniconda3 in my project directory. Please use the project ID of your choice in the installation path:

cd /project/k01/$USER/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -u
Welcome to Miniconda3 py39_4.11.0
In order to continue the installation process, please review the license agreement.
Please, press ENTER to continue
>>>
....
Notice of Third Party Software Licenses
Do you accept the license terms? [yes|no]
[no] >>> yes
Miniconda3 will now be installed into this location:
/home/shaima0d/miniconda3
  - Press ENTER to confirm the location
  - Press CTRL-C to abort the installation
  - Or specify a different location below
[/home/shaima0d/miniconda3] >>>/project/k01/shaima0d/miniconda3
.....
Do you wish the installer to initialize Miniconda3 by running conda init? [yes|no] no
You have chosen to not have conda modify your shell scripts at all.
.....
Thank you for installing Miniconda3!

conda is now installed in our project directory.

Creating environments

New environments should also be created in the project directory so that they are accessible from compute nodes. To this end, set the environment variable CONDA_PKGS_DIRS so that packages are stored in a prescribed location rather than the default, which is $HOME/.cache.

For example:

mkdir -p /scratch/shaima0d/conda_cache
lfs setstripe -c 4 /scratch/shaima0d/conda_cache
export CONDA_PKGS_DIRS=/scratch/shaima0d/conda_cache

To create a new environment, first initialize the miniconda3 environment by sourcing it from the installation path, then create the environment of your choice:
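As a minimal sketch of these two steps, the helper below sources conda.sh from the install prefix used above and then creates an environment. The function name create_env, the environment name tf_env, and the conda-forge channel are assumptions for illustration; the /project/k01 prefix must be adapted to your own project ID.

```shell
# Sketch only: defines a helper; nothing is installed until it is called.
# Assumes Miniconda lives in /project/k01/$USER/miniconda3 (adjust the project ID).
create_env() {
    env_name="$1"
    shift
    # Make the conda command available in this shell without `conda init`
    source "/project/k01/$USER/miniconda3/etc/profile.d/conda.sh" || return 1
    # Remaining arguments are passed to conda as channels and packages
    conda create -y -n "$env_name" "$@"
}

# Example call on a login node (hypothetical environment name and packages):
#   create_env tf_env -c conda-forge tensorflow
```

Because the helper runs in the current shell, conda activate tf_env can be used directly after the call.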

Once the environment is created, let's check where TensorFlow has been installed.
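One way to run this check, sketched under the assumption that the environment is already active in the current shell, is to ask Python where it resolves the package. The helper name show_module_path is hypothetical.

```shell
# Sketch only: prints the file from which the active Python would import a module.
show_module_path() {
    python -c 'import importlib.util, sys
spec = importlib.util.find_spec(sys.argv[1])
print(spec.origin if spec else "not found")' "$1"
}

# With the environment active (hypothetical name tf_env):
#   conda activate tf_env
#   show_module_path tensorflow
```

A path under the project directory indicates that the package is readable from compute nodes. Running which python gives a similar check for the interpreter itself.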

Interactively using Conda environments

To test interactively on a Shaheen compute node, launch the Python interpreter installed in the conda environment:
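A minimal sketch of such a session follows. It assumes that Slurm forwards the login-shell environment to the job step (the default behaviour of srun), so the environment is activated first and the interpreter is then started through srun. The helper name, the tf_env environment, and the 30-minute limit are illustrative; your allocation may also require an account or partition option.

```shell
# Sketch only: defines a helper; no job is requested until it is called.
interactive_python() {
    # Load conda from the project directory and activate the named environment
    source "/project/k01/$USER/miniconda3/etc/profile.d/conda.sh" || return 1
    conda activate "$1" || return 1
    # Request one node and attach a pseudo-terminal to the Python interpreter
    srun --nodes=1 --time=00:30:00 --pty python
}

# Example (hypothetical environment name):
#   interactive_python tf_env
```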

Jobscript for batch processing

The following is an example of activating your conda environment in a SLURM jobscript for batch processing (tf_test.py is the target Python script to run):
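A jobscript along these lines could look like the sketch below, which writes the script to a file named tf_test.slurm and checks its syntax. The file name, job name, resource requests, account value, and the tf_env environment name are assumptions; only tf_test.py and the miniconda3 location come from this page.

```shell
# Write a hypothetical jobscript to the current directory
cat > tf_test.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=tf_test
#SBATCH --nodes=1
#SBATCH --time=00:30:00
# Assumption: replace k01 with your own project ID
#SBATCH --account=k01

# Load conda from the project directory, which compute nodes can read
source /project/k01/$USER/miniconda3/etc/profile.d/conda.sh
# Hypothetical environment name
conda activate tf_env

# Launch the target script on the allocated compute node
srun python tf_test.py
EOF

# Syntax check only; this does not submit the job
bash -n tf_test.slurm
```

The job would then be submitted with sbatch tf_test.slurm from a directory that compute nodes can write to, since /project is read-only there.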

The steps of sourcing miniconda3 and activating the specific environment must be repeated every time you use an installed package. When installing a new package, make sure CONDA_PKGS_DIRS is set as described in the section on creating environments.

Install environment in specific path

It is sometimes useful to relocate an environment and install it in a custom directory (e.g., in your scratch directory). For this, it is suggested that you write an environment.yaml file listing all the software to install. Once done, the following will install the environment in a specific path:
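A sketch of that command, wrapped in a hypothetical helper, is given here. It relies on conda env create accepting -f for the environment file and -p for the target prefix; the helper name and the example scratch path are assumptions.

```shell
# Sketch only: defines a helper; nothing is installed until it is called.
install_env_at() {
    env_file="$1"     # e.g. environment.yaml
    target_dir="$2"   # absolute path where the environment will live
    # Load conda from the project directory
    source "/project/k01/$USER/miniconda3/etc/profile.d/conda.sh" || return 1
    # Build the environment at the given prefix instead of under miniconda3/envs
    conda env create -f "$env_file" -p "$target_dir"
}

# Example on a login node (hypothetical target path):
#   install_env_at environment.yaml /scratch/$USER/envs/tf_env
```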

Here is an example of environment.yaml:
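For illustration, the snippet below writes a small environment.yaml to the current directory. The environment name, channel, and package list are assumptions rather than the contents of the original file.

```shell
# Write a hypothetical environment file; edit the package list to suit your work
cat > environment.yaml <<'EOF'
name: tf_env
channels:
  - conda-forge
dependencies:
  - python=3.9
  - tensorflow
EOF
```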

 

To activate the above environment, refer to it by its absolute path:
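As a sketch, conda activate accepts a directory path in place of an environment name. The helper name and the example path below are hypothetical.

```shell
# Sketch only: defines a helper; nothing is activated until it is called.
activate_env_at() {
    # Load conda from the project directory
    source "/project/k01/$USER/miniconda3/etc/profile.d/conda.sh" || return 1
    # Passing a path rather than a name selects the environment at that prefix
    conda activate "$1"
}

# Example (hypothetical path):
#   activate_env_at /scratch/$USER/envs/tf_env
```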

 

It is the user's responsibility to clean up temporary files after any package installation. This command can be used to clean the cache:

conda clean --all