Containers

Containerization is a software packaging and execution technology that allows scripts and executables to be distributed together with their libraries, other dependencies, and a complete Linux operating system environment. Unlike virtual machines, which run a separate kernel on virtual processors, containerized applications share the host's kernel and therefore incur practically no overhead.

Campus Cluster supports containers via Apptainer (formerly Singularity), which is like Docker but specialized for traditional HPC environments. Apptainer distinguishes itself in that root/sudo authorization is not required to either run or (as of version 1.1) build containers; technical details can be found in Apptainer Without Setuid - Dave Dykstra.

Apptainer 1.3.4 is installed on Campus Cluster login and compute nodes at /usr/bin/apptainer. In interpreting the Apptainer documentation it is occasionally helpful to know that Apptainer on Campus Cluster runs in non-suid mode.

Using Docker Images with Apptainer

  • Option 1 - Just run it:

    apptainer run docker://rockylinux:8
    

    Images are cached in $APPTAINER_CACHEDIR if set, or in $HOME/.apptainer/cache by default.

  • Option 2 - Download to Singularity Image Format (SIF) file and run:

    apptainer pull docker://rockylinux:8
    
    apptainer run rockylinux_8.sif
    

    A SIF file can also be run directly (assuming execute permission):

    ./rockylinux_8.sif
    
  • Option 3 - Download to local sandbox directory and modify:

    apptainer build --sandbox /tmp/rocky docker://rockylinux:8
    
    apptainer exec --fakeroot --writable /tmp/rocky yum install -y which
    
    apptainer run --fakeroot --writable /tmp/rocky
    

    You can test the sandbox as a normal user in read-only mode:

    apptainer run /tmp/rocky
    

    The Lustre home and projects filesystems lack xattr support, which results in a long stream of error messages from apptainer build and causes yum install transactions to fail. It is therefore necessary to keep sandboxes on a writable local filesystem (/tmp), and then convert the sandbox to a SIF file on a shared (cross-node) filesystem for future use:

    apptainer build --fakeroot newrocky.sif /tmp/rocky
    
  • Option 4 - Convert Dockerfile to Apptainer definition file and build:

    Singularity Python provides a recipe converter from Dockerfile format to Apptainer definition file format. The converter greatly simplifies the process but isn’t perfect, particularly when files are copied using relative paths.

    pip3 install spython --user
    
    spython recipe Dockerfile image.def
    
    apptainer build image.sif image.def
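
The sandbox workflow in Option 3 can be wrapped in a small helper. The following is a minimal sketch, not a supported tool: the function name build_custom_sif, its argument order, and the mktemp-based sandbox location are illustrative assumptions, while the apptainer commands are the ones shown above.

```shell
# Sketch: customize a Docker image in a node-local sandbox, then save it as SIF.
# build_custom_sif and its arguments are hypothetical names for illustration.
build_custom_sif() {
    local source_uri="$1"   # e.g. docker://rockylinux:8
    local output_sif="$2"   # destination, typically on a shared filesystem
    shift 2                 # remaining arguments: command to run in the sandbox

    # Keep the sandbox on node-local /tmp; Lustre lacks the xattr support it needs.
    local workdir
    workdir="$(mktemp -d /tmp/apptainer-sandbox.XXXXXX)" || return 1
    local sandbox="$workdir/rootfs"

    apptainer build --sandbox "$sandbox" "$source_uri" &&
    apptainer exec --fakeroot --writable "$sandbox" "$@" &&
    apptainer build --fakeroot "$output_sif" "$sandbox"
    local status=$?

    # Best-effort cleanup of the node-local sandbox.
    rm -rf "$workdir" || echo "warning: could not remove $workdir" >&2
    return "$status"
}
```

For example, build_custom_sif docker://rockylinux:8 newrocky.sif yum install -y which reproduces the Option 3 steps and leaves newrocky.sif in the current directory.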
    

Interacting with Host Filesystems

Apptainer will bind-mount $HOME, $PWD, and /tmp into the container by default. Additional directories may be mounted with --bind src[:dest[:ro]] and default mounts suppressed with --no-mount home,cwd,tmp or --contain. Note that --no-mount home or --no-home will only disable mounting of the home directory if it is not also the current working directory.

The caller’s current user and group will appear unchanged, but all other users and groups will appear as nobody. With the --fakeroot option, $HOME will be mounted as /root and the caller’s user and group will be mapped to root. Regardless of apparent user and group, processes inside a container have the caller’s full read and write capabilities on mounted host filesystems.

See the Apptainer user guide - Bind Paths and Mounts for details.
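
As an illustration of the mount options described above, the following sketch runs a command with the default mounts suppressed and a single host directory exposed read-only. The function name run_isolated and the /data mount point are hypothetical; the --no-mount and --bind options are used as described in this section.

```shell
# Sketch: run a command with only an explicitly chosen host directory mounted.
# run_isolated and the /data destination are illustrative assumptions.
run_isolated() {
    local image="$1"      # SIF file or sandbox directory
    local data_dir="$2"   # host directory to expose inside the container
    shift 2               # remaining arguments: command to run

    # --no-mount suppresses the default $HOME, $PWD, and /tmp mounts;
    # --bind src:dest:ro exposes data_dir read-only at /data.
    apptainer exec \
        --no-mount home,cwd,tmp \
        --bind "${data_dir}:/data:ro" \
        "$image" "$@"
}
```

Because processes in the container keep the caller's full write access to mounted host filesystems, mounting inputs with :ro is a cheap safeguard against accidental modification.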

Mounting Images of Many-File Datasets

Shared network filesystems, such as Lustre used for home and projects, incur much higher latencies opening and closing files than local filesystems do. For this reason, workflows that process many small files can run orders of magnitude slower on a cluster than on a desktop workstation.

As described in the Apptainer user guide - Image Mounts, Apptainer can bind-mount image files in standard ext3 and squashfs formats as well as its own SIF format. An image file can contain millions of tiny files while providing the simplicity and performance of a single large file. Each image file can safely be mounted either read-write by a single container or read-only by many containers (but not both at the same time).
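
A many-file dataset can be packed and mounted along the following lines. This is a sketch under stated assumptions: it assumes mksquashfs (from squashfs-tools) is available on the node where the image is created, the function names and the /data mount point are hypothetical, and the image-src=/ bind option follows the form given in the Image Mounts guide.

```shell
# Sketch: pack a directory of many small files into one squashfs image,
# then mount that image inside a container.
# pack_dataset, run_with_dataset, and /data are illustrative assumptions.
pack_dataset() {
    local src_dir="$1"   # directory containing the many small files
    local image="$2"     # squashfs image file to create
    # -noappend creates a fresh image instead of appending to an existing one.
    mksquashfs "$src_dir" "$image" -noappend
}

run_with_dataset() {
    local container="$1"   # SIF file or sandbox to run
    local image="$2"       # squashfs image produced by pack_dataset
    shift 2                # remaining arguments: command to run
    # image-src=/ mounts the root of the image at /data in the container.
    apptainer exec --bind "${image}:/data:image-src=/" "$container" "$@"
}
```

A squashfs image is read-only, so it can be mounted by many containers at the same time; the shared filesystem then serves one large file rather than millions of small ones.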

Running with GPU Acceleration

Apptainer GPU support is described in detail in the Apptainer user guide - GPU Support; adding --nv should just work, assuming that GPUs were correctly requested in the Slurm submission options. Devices visible with nvidia-smi outside a container should be visible inside a container launched with --nv.
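
A quick way to confirm GPU visibility from inside a job is sketched below. The function name gpu_exec is hypothetical, and the check on CUDA_VISIBLE_DEVICES assumes that Slurm sets this variable when GPUs are allocated, which depends on the cluster's configuration.

```shell
# Sketch: run a command in a container with NVIDIA GPU support enabled.
# gpu_exec is an illustrative name; the CUDA_VISIBLE_DEVICES check assumes
# Slurm sets that variable for jobs that were allocated GPUs.
gpu_exec() {
    local image="$1"   # SIF file or sandbox to run
    shift              # remaining arguments: command to run

    if [ -z "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo "warning: no GPUs appear to be allocated to this job" >&2
    fi

    # --nv makes the host's NVIDIA driver libraries and devices available.
    apptainer exec --nv "$image" "$@"
}
```

Inside a GPU job, gpu_exec image.sif nvidia-smi -L should list the same devices that nvidia-smi -L reports on the host.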

Images based on Alpine Linux may not work correctly with --nv, failing with the error nvidia-smi: not found. If this happens, try an image based on another Linux distribution such as Ubuntu.

The NVIDIA HPC SDK container distribution includes directions for running with Singularity that can be used as-is with Apptainer (/usr/bin/singularity is a symbolic link to apptainer). Note that by default Apptainer passes most environment variables through to the container, including CC, CXX, FC, and F77 from the gcc module and MPICC, MPICXX, MPIF77, and MPIF90 from the openmpi module. These variables will mislead cmake and configure scripts into attempting to use compilers in /usr/local/... that are not available in the container. This can be prevented either by running module unload gcc openmpi before launching the container or by running Apptainer with the --cleanenv option.
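
To see which of these variables would leak into a container, and to run without them, something like the following sketch may help. The function names list_compiler_vars and run_clean are hypothetical; the variable list is the one given above, and --cleanenv is the option already mentioned.

```shell
# Sketch: report exported compiler variables, then run with a clean environment.
# list_compiler_vars and run_clean are illustrative names.
list_compiler_vars() {
    local name value
    for name in CC CXX FC F77 MPICC MPICXX MPIF77 MPIF90; do
        # printenv only reports exported variables, which are the ones
        # Apptainer would pass through to the container.
        if value="$(printenv "$name")" && [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        fi
    done
}

run_clean() {
    local image="$1"   # SIF file or sandbox to run
    shift              # remaining arguments: command to run
    # --cleanenv starts the container without the host's environment variables.
    apptainer exec --cleanenv "$image" "$@"
}
```

If list_compiler_vars prints nothing, the gcc and openmpi modules are not affecting the environment and --cleanenv is not needed for this purpose.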