ARM Forge is one of the few debugging and performance profiling tools available on Shaheen. Developers can use it to find bugs through live or offline debugging, and to create performance profiles of parallel, sequential, and even Python applications running on Shaheen compute nodes.
For interactive (live) debugging, we encourage reverse connecting to Shaheen compute nodes. For this you need to download an appropriate version of the Remote client from ARM Forge and install it on your workstation or laptop. The version of the remote client you install must align with the version of ARM Forge installed on Shaheen. At present the following versions are installed on Shaheen:
First, instrument your source code for debugging by compiling it with the -g flag. Debuggers such as gdb, idb, TotalView, and Arm Forge ddt require this instrumentation. Once you have (re)compiled the application, you can launch ddt as a batch job as follows:
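A minimal sketch of such a batch job, not the site's official script. It assumes the module is called arm-forge, that the application is ./my_app launched on 4 ranks through srun, and that ddt is started with its --connect option so the debugger inside the job requests a reverse connection to your Remote client. The job name, node count, and time limit are placeholders.

```shell
# Compile beforehand with debug symbols (-g), using whichever compiler
# wrapper your environment provides.

# Write a small Slurm job script; the quoted 'EOF' keeps the text literal.
cat > ddt_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=ddt-debug
#SBATCH --nodes=1
#SBATCH --time=00:30:00

# Assumed module name; check with "module avail".
module load arm-forge

# --connect asks ddt to reverse connect to a running Remote client.
# "srun -n 4 ./my_app" is a placeholder launch line.
ddt --connect srun -n 4 ./my_app
EOF

# Submit it from a Shaheen login node with:
#   sbatch ddt_job.sh
```

The job waits in the queue like any other; the reverse connection request only reaches your client once the job is running.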
Once the job starts executing, you are ready to fire up your Remote client.
Configuring Remote client
To reverse connect to a ddt or map session running on Shaheen compute nodes, we first need to configure our remote client. This is a one-off configuration and will be remembered by your remote client application installed on your workstation/laptop.
Shaheen has 4 login nodes and 2 gateway nodes. The cdl nodes (login nodes) 1 and 3 connect to gateway1, whereas cdl nodes 2 and 4 connect to gateway2. The following example demonstrates how to configure the client for the even-numbered cdl nodes (you can maintain a separate configuration in your remote client for cdl nodes 1 and 3).
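The pairing of login nodes and gateways can be written down as a small helper, shown here only as an illustrative sketch. The function name gateway_for_cdl is hypothetical, and the gateway labels are the ones used in the paragraph above rather than full hostnames.

```shell
# Map a cdl login-node number (1-4) to the gateway it connects through.
gateway_for_cdl() {
  case "$1" in
    1|3) echo "gateway1" ;;
    2|4) echo "gateway2" ;;
    *)   echo "unknown cdl node: $1" >&2; return 1 ;;
  esac
}

gateway_for_cdl 2   # prints: gateway2
```

Keeping one client configuration per gateway means you pick the configuration that matches the login node you will land on.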
Select Configure from Remote launch menu.
The configuration requires you to update three fields:
a name for the configuration
the hostname (and gateway) to connect to
the path to the ARM-Forge executables, e.g. ddt and map (you can log in to Shaheen, load the module, and check where it is installed, e.g. with which ddt)
After configuring the ARM-Forge client to connect to Shaheen, starting a debugging session on a Shaheen compute node takes two steps.
Step 1: Submit a job on Shaheen to start the debugger server on a compute node
Step 2: Connect your client to Shaheen
You will need your credentials to log in (use your usual username/password/OTP). The second password prompt corresponds to the hop to the gateway node. Use your Shaheen password here too.
When the connection is successful, the reverse connection request should pop up in your ARM-Forge client. A successful connection also means that you should see the license number in the bottom-left corner of the GUI, where it would otherwise show “No license found”.
After accepting, your ARM-Forge client will be connected to a debugging session running on a Shaheen compute node.
To run your application, click on “Run”. This opens a console where you describe the launch configuration of your application:
In the MPI section, choose the number of MPI ranks your application will launch on. Also confirm that the “Implementation” attribute is set to “SLURM MPMD”. If not, click the “Change” button and select it from the list presented. Also make sure the path to the MPI launcher is set correctly.
Once done, click the “Run” button to start debugging. The MPI processes will launch and your source code should appear in the debugger console.
Common Debugging actions
You can now click play to run your application. To set breakpoints or watch variables, go to the relevant line in the editor pane and choose the action. To set a breakpoint, right-click on the line and select the option. Once set, the breakpoint should appear in the Breakpoints tab of the bottom console:
During the run you can dive into a function or subroutine by clicking the “Step Into” button, and leave the function scope with “Step Out” to return to the caller.
During the simulation, if you wish to see the value of a variable or an N-dimensional array, you can select it from the variable pane according to its scope (whether the variable is on the stack or the heap) and view it: