Early User Info
This page lists the new versions of software and new features available as Delta transitions to the RedHat 9 OS stack in the fall of 2025.
What is not Changing?
These should function as usual with no noticeable changes:
job scheduler: no change to partition names
filesystems: no changes to /projects and /work. See the IME section regarding /ime going away.
containers: apptainer with existing containers
What is Changing?
OS Update
The operating system is updated from RedHat (RH) 8.8 to 9.4. The table below lists the changes in the Linux kernel, glibc, and the OS-provided GCC compiler version.
|     | OS     | Linux Kernel | glibc | OS GCC |
|-----|--------|--------------|-------|--------|
| Old | RH 8.8 | 4.18.0       | 2.28  | 8.5.0  |
| New | RH 9.4 | 5.14.0       | 2.34  | 11.4.1 |
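To confirm these versions on a test login node, standard commands suffice (a quick sketch; the full path to gcc avoids the module-provided gcc-native that may shadow the OS compiler in your PATH):

cat /etc/redhat-release            # OS release
uname -r                           # Linux kernel version
ldd --version | head -1            # glibc version
/usr/bin/gcc --version | head -1   # OS GCC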
NVIDIA Driver and CUDA
The NVIDIA driver and base CUDA are updated to a more recent (but not the latest) release, as shown in the following table:
|     | NVIDIA Driver | Base CUDA |
|-----|---------------|-----------|
| Old | 550.163.01    | 12.4      |
| New | 570.148.08    | 12.8      |
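You can confirm the driver on a GPU node with nvidia-smi; its header also reports the maximum CUDA version the driver supports (a sketch):

nvidia-smi | head -4                                          # header shows driver and CUDA versions
nvidia-smi --query-gpu=driver_version --format=csv,noheader   # driver version only, e.g. 570.148.08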
Note that CUDA 11.8 will also be available from the new programming environment but is not yet ready for testing.
Default Programming Environment
There are larger changes to the default programming environment compared to the current production environment on Delta.
New Programming Environment (PE)
The compiler, MPI implementation and other base packages will be provided by the Cray Programming Environment (CrayPE), similar to how the default programming environment is provided on DeltaAI.
The default environment will be based on the GNU CrayPE environment, PrgEnv-gnu. The default MPI implementation will be Cray’s MPICH.
|        | GCC compiler | Module names                | CUDA                  | MPI        | Module name |
|--------|--------------|-----------------------------|-----------------------|------------|-------------|
| Old PE | gcc          | gcc/11.4.0                  | cuda/11.8.0           | OpenMPI    | openmpi     |
| New PE | gcc          | PrgEnv-gnu, gcc-native/13.2 | cudatoolkit/25.3_12.8 | Cray MPICH | cray-mpich  |
Other Cray PEs are available, such as PrgEnv-nvidia (NVIDIA HPC SDK compilers) and PrgEnv-cray (Cray compilers). All programming environments use the cray-mpich module by default.
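To try a different environment, the usual CrayPE idiom is to swap the PrgEnv meta-module (a sketch; exact versions may differ):

module swap PrgEnv-gnu PrgEnv-nvidia   # switch from GNU to NVIDIA compilers
module list                            # cray-mpich should remain loaded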
Use the module list command to view the default loaded modules:
[gbauer@dt-testlogin01 ~]$ module list
Currently Loaded Modules:
1) gcc-native/13.2 6) cray-libsci/25.03.0 11) craype-accel-nvidia80
2) craype/2.7.34 7) PrgEnv-gnu/8.6.0 12) cue-login-env/1.1
3) libfabric/1.22.0 8) cray-dsmml/0.3.1 13) slurm-env/0.1
4) craype-network-ofi 9) craype-x86-milan 14) default
5) cray-mpich/8.1.32 10) cudatoolkit/25.3_12.8
Use of Compiler Wrappers
The CrayPE compiler wrappers cc, CC, and ftn are recommended when building C, C++, and Fortran libraries and applications. The wrappers automatically add the include and library paths for MPI, GPU RDMA, and CUDA.
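For example (a sketch using hypothetical source file names):

cc  -o app app.c     # C: MPI, GTL, and CUDA paths added automatically
CC  -o app app.cpp   # C++
ftn -o app app.f90   # Fortran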
The CrayPE also provides the MPI compiler wrappers mpicc, mpicxx/mpic++, and mpifort/mpif77/mpif90. These can be used with CPU-only MPI codes, but they require additional include paths and libraries when compiling GPU-aware libraries for GPU RDMA, as described in the MPI man page (see: man mpi).
The following environment variables have been set to help use the compiler wrappers:
| Environment Variable   | Default Setting |
|------------------------|-----------------|
| CC                     | cc              |
| CXX                    | CC              |
| FC                     | ftn             |
| MPICC                  | mpicc           |
| MPICXX                 | mpicxx          |
| MPIF77                 | mpif77          |
| MPIF90                 | mpif90          |
| CMAKE_C_COMPILER       | cc              |
| CMAKE_CXX_COMPILER     | CC              |
| CMAKE_Fortran_COMPILER | ftn             |
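Most autoconf-based builds pick up CC, CXX, and FC from the environment automatically; for CMake, one option is to pass the wrappers explicitly (a sketch for a hypothetical project):

./configure    # uses $CC, $CXX, $FC from the environment
cmake -DCMAKE_C_COMPILER=cc \
      -DCMAKE_CXX_COMPILER=CC \
      -DCMAKE_Fortran_COMPILER=ftn ..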
Support for GPU RDMA
The Cray programming environments PrgEnv-gnu, PrgEnv-nvidia, and PrgEnv-cray support GPU RDMA. Compiler and runtime support is configured by default for PrgEnv-gnu and PrgEnv-nvidia.
To enable support for GPU RDMA, the environment variable MPICH_GPU_SUPPORT_ENABLED needs to be set:
export MPICH_GPU_SUPPORT_ENABLED=1
If you see the following error:
aborting job:
MPIDI_CRAY_init: GPU_SUPPORT_ENABLED is requested, but GTL library is not linked
then the environment variable is set, but the executable was not properly linked with -lmpi_gtl_cuda or built with the cc, CC, or ftn compiler wrappers.
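Putting it together, a minimal batch-script sketch (account name and executable are hypothetical; the partition name appears in the sinfo listing later on this page):

#!/bin/bash
#SBATCH --account=YOUR_ACCOUNT
#SBATCH --partition=gpuA100x4
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-node=4
#SBATCH --time=00:30:00

export MPICH_GPU_SUPPORT_ENABLED=1   # enable GPU RDMA in Cray MPICH
srun ./my_gpu_app                    # built with the cc/CC/ftn wrappers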
NCCL
Please load the aws-ofi-nccl module so that NCCL uses the appropriate high-speed network provider.
This module provides the AWS OFI network transport plugin for NCCL, optimized for Cray systems with the CXI interconnect.
Dependencies (must be loaded first):
- cudatoolkit/25.3_12.8
- libfabric/1.22.0
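A sketch of the load order (cudatoolkit and libfabric are already in the default module set shown above, so in practice only the last line may be needed):

module load cudatoolkit/25.3_12.8
module load libfabric/1.22.0
module load aws-ofi-nccl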
Python
Several Python packages are available as modules:
miniforge3-python
pytorch-conda/2.8
tensorflow-conda/2.18
Use the module spider command to find packages provided by modules.
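For example (search terms are illustrative):

module spider pytorch            # find available PyTorch modules
module spider pytorch-conda/2.8  # show details and prerequisites
module load pytorch-conda/2.8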
When installing Python packages, especially mpi4py with GPU support, we recommend setting the MPICC environment variable as follows:
# install GPU-aware mpi4py against Cray MPICH
MPICC="cc -shared" pip install mpi4py
Open OnDemand
The OnDemand instance for Jupyter and the Desktop applications is in internal testing. We will make an announcement once it is available.
What is going away?
IME
The /ime caching front-end to /work will not be available on Delta after the upgrade to RH9. Please use /work/hdd, or request space on /work/nvme if you have a use case that required /ime.
OpenMPI
At the moment, only the cray-mpich module is supported. OpenMPI performance is less than half of what we see with the cray-mpich implementation, so redeploying OpenMPI is a lower priority.
How to Access
Login Node Access
Please use the following login node for access to the new configuration.
| Login nodes                        |
|------------------------------------|
| dt-login04.delta.ncsa.illinois.edu |
How to Run Jobs
You must be logged into one of the test login nodes in order to run jobs with the new configuration.
There are compute nodes booted with the new configuration, and a default Slurm reservation called RH9 has been added to your environment to direct jobs to those nodes.
At the moment, 1/4 of the CPU nodes, 1/4 of the A100 nodes, and 1/4 of the A40 nodes are available for use.
To list the nodes available in the reservation:
[gbauer@dt-login04 ~]$ sinfo --long | grep $SLURM_RESERVATION
PARTITION AVAIL TIMELIMIT JOB_SIZE ROOT OVERSUBS GROUPS NODES STATE RESERVATION NODELIST
cpu up 2-00:00:00 1-infinite no NO all 2 reserved RH9 cn[001,027]
cpu-interactive up 1:00:00 1-4 no NO all 2 reserved RH9 cn[001,027]
cpu-preempt up 2-00:00:00 1-infinite no NO all 2 reserved RH9 cn[001,027]
full up 1-00:00:00 1-infinite no NO all 6 reserved RH9 cn[001,027],gpua[007,010],gpub[054,075]
gpuA100x4* up 2-00:00:00 1-infinite no NO all 2 reserved RH9 gpua[007,010]
gpuA100x4-interactive up 1:00:00 1-4 no NO all 2 reserved RH9 gpua[007,010]
gpuA100x4-preempt up 2-00:00:00 1-infinite no NO all 2 reserved RH9 gpua[007,010]
gpuA40x4 up 2-00:00:00 1-infinite no NO all 2 reserved RH9 gpub[054,075]
gpuA40x4-interactive up 1:00:00 1-4 no NO all 2 reserved RH9 gpub[054,075]
gpuA40x4-preempt up 2-00:00:00 1-infinite no NO all 2 reserved RH9 gpub[054,075]
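Because the RH9 reservation is set in your environment on the test login node, jobs should land on the new nodes automatically; you can also pass it explicitly. A sketch of an interactive job (account name is hypothetical):

srun --account=YOUR_ACCOUNT --partition=cpu-interactive \
     --reservation=$SLURM_RESERVATION \
     --nodes=1 --ntasks=4 --time=00:30:00 --pty bash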