Parallelism tools (GREASY)

On Ibex there are a couple of useful tools for running many very simple tasks, or many invocations of a particular script, in parallel. On this page we show how to use GREASY with a very simple example.

GREASY is straightforward to drive from a SLURM script. It generates three main files (along with whatever results your program produces): the job's standard output file, its standard error file, and a GREASY log file.

We can start with the usual header of the SLURM script.

We define the job name and the names of the output and error files, which we can consult if the code fails.

#!/bin/bash
#SBATCH --partition=batch
#SBATCH --job-name=MyJob
#SBATCH --output=out_folder/greasy-%j.out
#SBATCH --error=err_folder/greasy-%j.err

The second part consists of requesting the number of CPUs according to your needs. Keep in mind that one task is reserved for the master when using the MPI engine (as sketched below). In this example we run 8 tasks, each going to one CPU on the single node we request. Since this is a very simple example, 5 minutes is more than enough.

#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --ntasks-per-core=1
#SBATCH --time=00:05:00
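If you run GREASY with its MPI engine, remember to request one extra task for the master. A minimal sketch of the resource request, assuming you still want 8 worker tasks, could look like this:

# GREASY's MPI engine reserves one task for the master, so request 8 + 1 tasks
#SBATCH --nodes=1
#SBATCH --ntasks=9
#SBATCH --ntasks-per-core=1
#SBATCH --time=00:05:00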


We load the GREASY module from the Ibex application stack:

module load greasy/2.2.3_ompi4.0.3
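If you are unsure which GREASY builds are installed, you can query the module system first with the standard module commands:

module avail greasy
module list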


By default, GREASY generates a report on the usage of the computational resources named greasy-<jobid>.log when running inside a SLURM job. You can specify the path where the log is written by adding this line to the script:

export GREASY_LOGFILE=greasy.log


When using SLURM, we prefer to put the log file into its own folder, which keeps things neat and clean:

export GREASY_LOGFILE=log_folder/greasy_${SLURM_JOB_ID}.log
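Note that SLURM and GREASY do not create these folders for you, so create them before submitting the job. The names below match the paths used throughout this example:

mkdir -p out_folder err_folder log_folder inputs src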


Then we define the task or tasks to execute in parallel. Each task file contains one command per line, and GREASY runs those commands concurrently. In this example we create three task files: the first runs hostname, the second invokes sleep, and the last runs the script my_script.py.

echo "Running hostname" for n in {1..8} ; do echo "hostname"; done > inputs/hostname.txt echo "Running another kind of job" for n in {1..8} ; do echo "/bin/sleep ${n}"; done > inputs/sleep.txt echo "Running a job with scripts" for n in {1..8} ; do echo "python src/my_script.py ${n}"; done > inputs/my_python_job.txt


The Python script looks more complicated, but it is just asking for information about the system environment.

import os
import sys

def get_env_variables():
    """Get all the variables of the environment."""
    env_variables = os.environ.keys()
    return env_variables

my_env = get_env_variables()
# for key in my_env:
#     print(key, os.environ[key])

# if/elif/else statement acting as a menu
my_input = int(sys.argv[1])
if my_input == 1:
    print(" the value of HOME is ", os.environ['HOME'])
elif my_input == 2:
    print(" the value of HISTFILESIZE is ", os.environ['HISTFILESIZE'])
elif my_input == 3:
    print(" the value of KAUST_NODETYPE is ", os.environ['KAUST_NODETYPE'])
elif my_input == 4:
    print(" the value of PWD is ", os.environ['PWD'])
elif my_input == 5:
    print(" the value of LOGNAME is ", os.environ['LOGNAME'])
elif my_input == 6:
    print(" the value of MPI_HOME is ", os.environ['MPI_HOME'])
elif my_input == 7:
    print(" the value of SHELL is ", os.environ['SHELL'])
else:
    print(" the value of USER is ", os.environ['USER'])
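You can try the script by itself before submitting the job; the printed value depends on your environment:

python src/my_script.py 4   # input 4 prints the value of PWD, i.e. the current directory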


Then run GREASY, feeding it the task files produced by the for loops:

greasy inputs/hostname.txt
greasy inputs/sleep.txt
greasy inputs/my_python_job.txt
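With all of the pieces above assembled into a single job script, say greasy_job.slurm (a file name chosen here for illustration), submit it as usual:

sbatch greasy_job.slurm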


Since each of these tasks writes to the command line, all of their results end up in the single output file defined in the SLURM header.
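After the job finishes, you can inspect the combined output, the errors, and the GREASY log; these paths follow from the SLURM header and GREASY_LOGFILE setting above (replace <jobid> with your job's ID):

cat out_folder/greasy-<jobid>.out
cat err_folder/greasy-<jobid>.err
cat log_folder/greasy_<jobid>.log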


You can find more details in the original GitHub repository:

https://github.com/BSC-Support-Team/GREASY