TensorFlow on DeltaAI
Summary
The options to run TensorFlow are NGC containers such as tensorflow_24.09-tf2-py3.sif (in /sw/user/NGC_containers) and our module: python/miniforge3_tensorflow_cuda. Power users can install tensorflow in their own venv or conda environments via:
pip install --extra-index-url=https://developer.download.nvidia.com/compute/redist nvidia_tensorflow==2.17.0+nv24.11
jupyter-notebook is in the container and our module. Remember to add the --nv flag to the srun apptainer command line when using any NGC container.
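As an illustration of where the --nv flag sits on the command line, here is a minimal Python sketch that assembles an srun apptainer invocation as an argument list. The container path comes from the summary above. The account, partition, and script names are hypothetical placeholders, and the choice of --gpus-per-node and --bind /projects is an assumption about a typical job, not a site requirement.

```python
import shlex

# Container path taken from the Summary above.
CONTAINER = "/sw/user/NGC_containers/tensorflow_24.09-tf2-py3.sif"


def build_srun_command(script, account, partition, gpus=1, cpus_per_task=64):
    """Return the srun + apptainer argument list for one TensorFlow job step.

    account, partition, and script are placeholders supplied by the caller.
    """
    return [
        "srun",
        f"--account={account}",
        f"--partition={partition}",
        f"--gpus-per-node={gpus}",
        f"--cpus-per-task={cpus_per_task}",
        "apptainer", "exec",
        "--nv",                 # expose the host NVIDIA driver and GPUs to the container
        "--bind", "/projects",  # make /projects visible inside the container
        CONTAINER,
        "python3", script,
    ]


if __name__ == "__main__":
    # Hypothetical values; substitute your own allocation, partition, and script.
    cmd = build_srun_command("train.py", account="my_account", partition="my_partition")
    print(shlex.join(cmd))
```

Printing the joined command gives a line that can be pasted into a shell or a batch script. Note that --nv is an option of apptainer exec, so it comes after "exec" and before the container path.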
Customization
The container does not support python venv (it’s not installed), and conda is not available inside the container. Instead, use the PYTHONUSERBASE environment variable to specify a (possibly shared) path where you will install additions to the tensorflow container’s python. If you are using a jupyter notebook, you will need to “restart kernel” from the menu to make your changes visible to jupyter. See also: PYTHONUSERBASE.
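To see how PYTHONUSERBASE changes where pip install --user places packages, the following sketch asks a fresh interpreter for its user site-packages directory with the variable set. It uses only the standard library. The path in the example is a hypothetical placeholder, and the helper name user_site_for is made up for illustration.

```python
import os
import subprocess
import sys


def user_site_for(base):
    """Ask a fresh interpreter where `pip install --user` packages live
    when PYTHONUSERBASE is set to `base`."""
    env = dict(os.environ, PYTHONUSERBASE=base)
    result = subprocess.run(
        [sys.executable, "-c", "import site; print(site.getusersitepackages())"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical shared path; replace with your own project directory.
    print(user_site_for("/projects/my_project/tensorflow_modules"))
```

On Linux the result should be a lib/pythonX.Y/site-packages directory under the chosen base, where X.Y is the interpreter's version. The variable is read at interpreter startup, which is why a running jupyter kernel needs a restart to pick up changes.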
Installing from within the Container
arnoldg@gh001:~> export PYTHONUSERBASE=/projects/bbka/arnoldg/tensorflow_modules
arnoldg@gh001:~> apptainer shell --bind /projects /sw/user/NGC_containers/tensorflow_24.09-tf2-py3.sif
Apptainer> pip install --user matplotlib
...
Successfully installed contourpy-1.2.1 cycler-0.12.1 fonttools-4.53.1 kiwisolver-1.4.5 matplotlib-3.9.0 pillow-10.4.0
Apptainer> python3
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
Could not open PYTHONSTARTUP
FileNotFoundError: [Errno 2] No such file or directory: '/etc/pythonstart'
>>> import matplotlib
>>> exit()
Apptainer> echo $PYTHONUSERBASE
/projects/bbka/arnoldg/tensorflow_modules
Apptainer> ls $PYTHONUSERBASE/lib/python3.10/site-packages/
PIL fontTools mpl_toolkits
__pycache__ fonttools-4.53.1.dist-info pillow-10.4.0.dist-info
contourpy kiwisolver pillow.libs
contourpy-1.2.1.dist-info kiwisolver-1.4.5.dist-info pylab.py
cycler matplotlib
cycler-0.12.1.dist-info matplotlib-3.9.0.dist-info
Apptainer>
Package Install Location
When PYTHONUSERBASE is not set, pip install --user places packages under ~/.local in your home directory:
arnoldg@gh001:~/.local/lib/python3.10/site-packages> pwd
/u/arnoldg/.local/lib/python3.10/site-packages
arnoldg@gh001:~/.local/lib/python3.10/site-packages> ls
contourpy fontTools matplotlib pillow-10.4.0.dist-info
contourpy-1.2.1.dist-info fonttools-4.53.1.dist-info matplotlib-3.9.1.dist-info pillow.libs
cycler kiwisolver mpl_toolkits __pycache__
cycler-0.12.1.dist-info kiwisolver-1.4.5.dist-info PIL pylab.py
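Because packages can end up in more than one place (a PYTHONUSERBASE path or ~/.local, as in the listings above), it can help to confirm which copy Python will actually import. This is a small sketch using the standard library's importlib. The helper name module_origin is made up for illustration.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from,
    or None if the module cannot be found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Run this inside the container (or with the module loaded).
    for name in ("matplotlib", "tensorflow"):
        print(name, "->", module_origin(name))
```

If the reported path is not under the directory you expect, check the value of PYTHONUSERBASE in the shell that launched Python.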
Runtime Items of Note
Request plenty of CPU cores when using this container or module (--cpus-per-task=64). It takes quite a few ARM cores to keep the H100 GPUs working at peak.
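Requesting the cores only helps if the input pipeline uses them. The sketch below shows one possible way to do that, under stated assumptions: it reads SLURM_CPUS_PER_TASK (which Slurm sets when --cpus-per-task is given) and passes the count to tf.data as the number of parallel preprocessing calls. The function names and the preprocess argument are hypothetical, and whether this setting is optimal for a given workload is not something the source states.

```python
import os


def allocated_cpus(default=1):
    """Number of CPU cores available to this task.

    Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is passed; otherwise
    fall back to the CPU affinity mask, then to os.cpu_count(), then `default`.
    """
    value = os.environ.get("SLURM_CPUS_PER_TASK")
    if value and value.isdigit():
        return max(1, int(value))
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or default


def parallel_input_pipeline(dataset, preprocess, workers):
    """Apply `preprocess` with up to `workers` parallel calls and prefetch.

    `dataset` is assumed to be a tf.data.Dataset; `preprocess` is a
    hypothetical per-example function supplied by the caller.
    """
    import tensorflow as tf  # imported lazily so allocated_cpus() works anywhere

    return (
        dataset
        .map(preprocess, num_parallel_calls=workers)
        .prefetch(tf.data.AUTOTUNE)
    )
```

A typical call would be parallel_input_pipeline(ds, preprocess, allocated_cpus()), run inside a job step that was started with --cpus-per-task.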