Programming Environment (Building Software)

The Delta programming environment supports the GNU, AMD (AOCC), Intel, and NVIDIA HPC compilers. Support for the HPE/Cray Programming Environment is forthcoming.

Modules provide access to the compiler + MPI environment.

The default environment includes the GCC 11.4.0 compiler + OpenMPI; nvcc is provided by the CUDA module, which is loaded by default.
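For example, to build with the AOCC compilers and their matching OpenMPI instead of the defaults, a module sequence like the sketch below is typical (module names as listed in the MPI table later in this section; check module avail for current versions):

module list               # show the currently loaded compiler and MPI modules
module load aocc openmpi  # swap in the AOCC compiler and its OpenMPI build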

AMD's recommended compiler flags for the GNU, AOCC, and Intel compilers on Milan processors can be found in the AMD Compiler Options Quick Reference Guide for EPYC 7xx3 processors.
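As a brief, hedged illustration of those recommendations (only the common Zen 3 targeting options are shown; consult the guide for the full set):

gcc -O3 -march=znver3 myprog.c -o myprog      # GNU targeting Milan (Zen 3)
clang -O3 -march=znver3 myprog.c -o myprog    # AOCC targeting Milan (Zen 3)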

Serial

To build (compile and link) a serial program in Fortran, C, and C++:

Serial Program Commands

GCC                  AOCC                   NVHPC
gfortran myprog.f    flang myprog.f         nvfortran myprog.f
gcc myprog.c         clang myprog.c         nvc myprog.c
g++ myprog.cc        clang++ myprog.cc      nvc++ myprog.cc

MPI

To build (compile and link) a MPI program in Fortran, C, and C++:

MPI Program Commands

OpenMPI

  Module files for MPI/compiler:

  • gcc openmpi (or openmpi/5.0.5+cuda for GPU-direct)

  • aocc openmpi

  • nvhpc openmpi+cuda (GPU-direct)

  • intel openmpi

  Build commands:

  • Fortran 77: mpif77 myprog.f

  • Fortran 90: mpif90 myprog.f90

  • C: mpicc myprog.c

  • C++: mpic++ myprog.cc

  • Link CUDA codes compiled with nvc or nvc++ using the MPI compiler wrappers.

Cray MPICH (unsupported)

  Module files for MPI/compiler:

  • PrgEnv-gnu (unsupported) or

  • PrgEnv-cray (unsupported)

  • PrgEnv-gnu cuda craype-x86-milan craype-accel-ncsa (GPU-direct)

  Build commands:

  • Fortran 77: ftn myprog.f

  • Fortran 90: ftn myprog.f90

  • C: cc myprog.c

  • C++: CC myprog.cc
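Putting it together, a minimal OpenMPI build and launch might look like the sketch below (module names from the table above; the task count is illustrative, and srun assumes you are inside a job allocation):

module load gcc openmpi
mpicc myprog.c -o myprog
srun --ntasks=4 ./myprog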

OpenMP

To build an OpenMP program, use the -fopenmp option (GCC, AOCC) or the -mp option (NVHPC).

OpenMP Program Commands

GCC                           AOCC                            NVHPC
gfortran -fopenmp myprog.f    flang -fopenmp myprog.f         nvfortran -mp myprog.f
gcc -fopenmp myprog.c         clang -fopenmp myprog.c         nvc -mp myprog.c
g++ -fopenmp myprog.cc        clang++ -fopenmp myprog.cc      nvc++ -mp myprog.cc
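At run time the thread count is controlled with OMP_NUM_THREADS; a minimal sketch:

g++ -fopenmp myprog.cc -o myprog
export OMP_NUM_THREADS=8    # typically one thread per allocated core
./myprog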

Hybrid MPI/OpenMP

To build a hybrid MPI/OpenMP program, use the -fopenmp (GCC) or -mp (NVHPC) option with the MPI compiler wrappers.

Hybrid MPI/OpenMP Program Commands

GCC                           PGI/NVHPC
mpif77 -fopenmp myprog.f      mpif77 -mp myprog.f
mpif90 -fopenmp myprog.f90    mpif90 -mp myprog.f90
mpicc -fopenmp myprog.c       mpicc -mp myprog.c
mpic++ -fopenmp myprog.cc     mpic++ -mp myprog.cc
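A hybrid run typically pairs each MPI rank with OMP_NUM_THREADS threads and reserves that many cores per task; a sketch with illustrative values:

mpicc -fopenmp myprog.c -o myprog
export OMP_NUM_THREADS=4
srun --ntasks=8 --cpus-per-task=4 ./myprog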

Cray xthi.c Sample Code

Document: XC Series User Application Placement Guide CLE 6.0UP01 S-2496 | HPE Support

This code can be compiled using the methods shown above. It appears in some of the batch script examples below to demonstrate core placement options.

#define _GNU_SOURCE

#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sched.h>
#include <mpi.h>
#include <omp.h>

/* Borrowed from util-linux-2.13-pre7/schedutils/taskset.c */
static char *cpuset_to_cstr(cpu_set_t *mask, char *str)
{
  char *ptr = str;
  int i, j, entry_made = 0;
  for (i = 0; i < CPU_SETSIZE; i++) {
    if (CPU_ISSET(i, mask)) {
      int run = 0;
      entry_made = 1;
      for (j = i + 1; j < CPU_SETSIZE; j++) {
        if (CPU_ISSET(j, mask)) run++;
        else break;
      }
      if (!run)
        sprintf(ptr, "%d,", i);
      else if (run == 1) {
        sprintf(ptr, "%d,%d,", i, i + 1);
        i++;
      } else {
        sprintf(ptr, "%d-%d,", i, i + run);
        i += run;
      }
      while (*ptr != 0) ptr++;
    }
  }
  ptr -= entry_made;
  *ptr = 0;
  return(str);
}

int main(int argc, char *argv[])
{
  int rank, thread;
  cpu_set_t coremask;
  char clbuf[7 * CPU_SETSIZE], hnbuf[64];

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  memset(clbuf, 0, sizeof(clbuf));
  memset(hnbuf, 0, sizeof(hnbuf));
  (void)gethostname(hnbuf, sizeof(hnbuf));
  #pragma omp parallel private(thread, coremask, clbuf)
  {
    thread = omp_get_thread_num();
    (void)sched_getaffinity(0, sizeof(coremask), &coremask);
    cpuset_to_cstr(&coremask, clbuf);
    #pragma omp barrier
    printf("Hello from rank %d, thread %d, on %s. (core affinity = %s)\n",
            rank, thread, hnbuf, clbuf);
  }
  MPI_Finalize();
  return(0);
}

A version of xthi is also available from ORNL:

% git clone https://github.com/olcf/XC30-Training
% ls XC30-Training/affinity/Xthi.c
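As a sketch, the sample (saved as xthi.c, for example) can be built and launched like the hybrid examples above; each rank and thread then prints its host and core affinity:

mpicc -fopenmp xthi.c -o xthi
export OMP_NUM_THREADS=2
srun --ntasks=4 --cpus-per-task=2 ./xthi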

OpenACC

To build an OpenACC program, use the -acc option; add the -mp option for a multi-threaded build:

OpenACC Program Commands

Non-Multi-threaded          Multi-threaded
nvfortran -acc myprog.f     nvfortran -acc -mp myprog.f
nvc -acc myprog.c           nvc -acc -mp myprog.c
nvc++ -acc myprog.cc        nvc++ -acc -mp myprog.cc
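When offloading to a GPU it is often useful to request compiler feedback on the generated accelerator code; a hedged sketch using standard NVHPC options (the compute capability shown assumes an A100 target):

nvc -acc -gpu=cc80 -Minfo=accel myprog.c -o myprog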

CUDA

CUDA compilers (nvcc) are included in the CUDA module, which is loaded by default under modtree/gpu. For the CUDA Fortran compiler and other NVIDIA development tools, load the nvhpc module.
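A minimal sketch of a CUDA build, assuming an A100 target (compute capability 8.0) and an illustrative file name:

module load cuda                       # loaded by default under modtree/gpu
nvcc -arch=sm_80 myprog.cu -o myprog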

nv* commands when nvhpc is loaded

[arnoldg@dt-login03 namd]$ nv
nvaccelerror             nvidia-bug-report.sh     nvlink
nvaccelinfo              nvidia-cuda-mps-control  nv-nsight-cu
nvc                      nvidia-cuda-mps-server   nv-nsight-cu-cli
nvc++                    nvidia-debugdump         nvprepro
nvcc                     nvidia-modprobe          nvprof
nvcpuid                  nvidia-persistenced      nvprune
nvcudainit               nvidia-powerd            nvsize
nvdecode                 nvidia-settings          nvunzip
nvdisasm                 nvidia-sleep.sh          nvvp
nvextract                nvidia-smi               nvzip
nvfortran                nvidia-xconfig

See the NVIDIA HPC SDK page for more information.

Note: The Multi-Process Service (MPS) is not enabled on Delta and there are no plans to support it in the future.

HIP/ROCm (AMD MI100)

Note

If using hipcc on the login nodes, add --offload-arch=gfx908 to the flags to match the GPU on the MI100 node.

To access the development environment for the gpuMI100x8 partition, start a job on the node with srun or sbatch.

Next, prepend /opt/rocm/bin, where the HIP and ROCm tools are installed, to your PATH.

A sample Slurm script to obtain an interactive xterm is shown below:

#!/bin/bash -x

# Pass your project account as the first argument;
# match it to a "Project" returned by the "accounts" command.
MYACCOUNT=$1
GPUS=--gpus-per-node=1
PARTITION=gpuMI100x8-interactive
srun --ntasks-per-node=1 --nodes=1 --cpus-per-task=4 \
  --mem=16g \
  --partition=$PARTITION \
  --time=00:30:00 \
  --account=$MYACCOUNT \
  $GPUS --x11 \
  xterm
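Assuming the script is saved as xterm_mi100.sh (name hypothetical), run it from a login node with your project account as the only argument:

bash xterm_mi100.sh my_project    # my_project is a placeholder account name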

AMD HIP development environment on gpud01 (setting the path on the compute node):

[arnoldg@gpud01 bin]$ export PATH=/opt/rocm/bin:$PATH
[arnoldg@gpud01 bin]$ hipcc
No Arguments passed, exiting ...
[arnoldg@gpud01 bin]$
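With the path set, a HIP source file can be compiled for the MI100 as sketched below (file name is a placeholder; the offload-architecture flag is the one recommended in the note above):

hipcc --offload-arch=gfx908 saxpy.hip.cpp -o saxpy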

See the AMD HIP documentation and AMD ROCm documentation for more information.

Visual Studio Code

Note

The Code Server (VS Code) app in Open OnDemand provides an easy method to use VS Code in a web browser.

The following pages provide step-by-step instructions for using VS Code on Delta in different configurations.