Singularity MPI containers on Shaheen and Ibex

On this page we demonstrate how to run a containerized MPI application with the Singularity container platform on Shaheen and Ibex. The two KSL systems differ in how they provide the MPI environment.

Shaheen

Shaheen provides cray-mpich, an MPICH ABI-compatible MPI library that leverages the advanced features of the Cray Aries interconnect. For corner cases, OpenMPI 4.x is also installed on the Shaheen compute nodes.
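As a quick orientation, the snippet below shows how these environments can be inspected from a Shaheen login node. The module names are the ones used in the examples on this page; the available versions may differ.

# cray-mpich is provided through the Cray programming environment (PrgEnv-*)
module list
# Check which OpenMPI and Singularity modules are available for the corner cases mentioned above
module avail openmpi singularity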

MPICH container

Here is an example jobscript that launches an MPICH-based image on Shaheen; srun starts two MPI processes across two compute nodes:

#!/bin/bash
#SBATCH -p haswell
#SBATCH -N 2
#SBATCH -n 4
#SBATCH -t 00:05:00

module load singularity/3.5.1

export IMAGE=mpich_psc_latest.sif
BIND="-B /opt,/var,/usr/lib64,/etc,/sw"

srun -n 2 -N 2 --mpi=pmi2 singularity exec ${BIND} ${IMAGE} /usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency
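To try the example, save the jobscript to a file and submit it with sbatch; the filename below is only illustrative.

# Submit the jobscript (illustrative filename)
sbatch mpich_osu_latency.slurm
# Once the job completes, the OSU latency table appears in the default SLURM output file
cat slurm-<jobid>.out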

OpenMPI container

The following is an example of running an image with OpenMPI to launch an MPI job. Here we bind mount the host OpenMPI into the container. We use mpirun to launch the jobs because PMIx integration of SLURM is not available on the host.

#!/bin/bash
#SBATCH -p haswell
#SBATCH -N 2
#SBATCH -n 4
#SBATCH -t 00:05:00

module swap PrgEnv-$(echo $PE_ENV | tr '[:upper:]' '[:lower:]') PrgEnv-gnu
module load openmpi/4.0.3
module load singularity

# Make the host MPI and Cray libraries visible inside the container
export LD_LIBRARY_PATH=$CRAY_LD_LIBRARY_PATH:$LD_LIBRARY_PATH
export SINGULARITYENV_LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cray/wlm_detect/default/lib64:/etc/alternatives:/usr/lib64:/usr/lib
export SINGULARITYENV_APPEND_PATH=$PATH

export IMAGE=/project/k01/shaima0d/singularity_test/images/openmpi401_latest.sif
# Bind mount the host directories that provide OpenMPI and its dependencies
export BIND_MOUNT="-B /sw,/usr/lib64,/opt,/etc,/var"

echo "On same node"
mpirun -n 2 -N 2 hostname
mpirun -n 2 -N 2 singularity exec ${BIND_MOUNT} ${IMAGE} /usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency

echo "Now trying inside a singularity container"
mpirun -n 2 -N 1 hostname
mpirun -n 2 -N 1 singularity exec ${BIND_MOUNT} ${IMAGE} /usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency
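If you want to check which MPI library the benchmark actually resolves inside the container, a quick ldd check can help. This reuses the IMAGE and BIND_MOUNT variables from the jobscript above; whether libmpi resolves to the host or the container copy depends on the library search path set there.

# Show which MPI library the containerized benchmark links against at runtime
singularity exec ${BIND_MOUNT} ${IMAGE} ldd /usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency | grep -i mpi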


Ibex

CPU job

On Ibex, OpenMPI is installed on the host. It is generally best to launch Singularity MPI jobs with mpirun because PMIx integration of SLURM is not available on the host.

#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --nodes=2
#SBATCH --gres=gpu:v100:2
#SBATCH --time=00:05:00
#SBATCH --account=ibex-cs

module load singularity
module load openmpi/4.0.3-cuda10.2
module list

# Use the InfiniBand (openib) BTL for communication
export OMPI_MCA_btl=openib
export OMPI_MCA_btl_openib_allow_ib=1

export IMAGE=/ibex/scratch/shaima0d/scratch/singularity_mpi_testing/images/osu_cuda_openmpi403_563.sif
export EXE_lat=/usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency
export EXE_bw=/usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw

echo "On same node"
mpirun -n 2 --map-by ppr:2:node hostname
mpirun -n 2 --map-by ppr:2:node singularity exec ${IMAGE} ${EXE_lat}
# --nv exposes the host GPU driver and libraries inside the container
mpirun -n 2 --map-by ppr:2:node singularity exec --nv ${IMAGE} ${EXE_bw}

echo "On two nodes"
mpirun -n 2 --map-by ppr:1:node hostname
mpirun -n 2 --map-by ppr:1:node singularity exec ${IMAGE} ${EXE_lat}
mpirun -n 2 --map-by ppr:1:node singularity exec ${IMAGE} ${EXE_bw}
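Before running, it can be useful to confirm that the host OpenMPI module was built with CUDA support. The check below is a standard OpenMPI query, not specific to KSL, and can be run on a login or compute node with the module loaded.

module load openmpi/4.0.3-cuda10.2
# Reports "true" if this OpenMPI build is CUDA-aware
ompi_info --parsable --all | grep mpi_built_with_cuda_support:value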

GPU job

The following SLURM jobscript demonstrates running a container with an MPI application on Ibex GPUs, leveraging the GPUDirect RDMA feature to get close to the maximum theoretical bandwidth available from a Host Channel Adapter (HCA).
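As a minimal sketch of what such a jobscript could look like, the example below adapts the CPU example above to measure device-to-device bandwidth. The resource request, the btl_openib_want_cuda_gdr setting, and the "D D" buffer arguments to osu_bw are assumptions and may need adjustment for your allocation.

#!/bin/bash
#SBATCH --ntasks=2
#SBATCH --nodes=2
#SBATCH --gres=gpu:v100:1
#SBATCH --time=00:05:00

module load singularity
module load openmpi/4.0.3-cuda10.2

# InfiniBand BTL, as in the example above
export OMPI_MCA_btl=openib
export OMPI_MCA_btl_openib_allow_ib=1
# Assumed setting: ask the openib BTL to use GPUDirect RDMA for CUDA buffers
export OMPI_MCA_btl_openib_want_cuda_gdr=1

export IMAGE=/ibex/scratch/shaima0d/scratch/singularity_mpi_testing/images/osu_cuda_openmpi403_563.sif
export EXE_bw=/usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw

# One process per node; --nv exposes the host GPUs to the container,
# and "D D" measures bandwidth between device (GPU) buffers
mpirun -n 2 --map-by ppr:1:node singularity exec --nv ${IMAGE} ${EXE_bw} D D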