Job Accounting
The charge unit for DeltaAI is the service unit (SU), which corresponds to the use of one Grace Hopper GH200 for 1 hour. Charges are based on the resources reserved for your job and do not necessarily reflect how those resources are used. Specifically, a job is charged for the number of GH200 GPUs needed to fulfill the requested memory or CPU cores, whichever is larger.
For example, if a job requests 300 GB of CPU memory, 16 CPU cores, and 1 GPU, then three GH200s will be allocated to the job even though only 1 GPU is requested, because 300 GB exceeds the 240 GB of CPU memory that two GH200s provide. Note that each GH200 has 120 GB of CPU memory and 72 CPU cores.
Also note that 1 GB of memory here means 1e9 bytes (1,000,000,000), not 2^30 bytes (1,073,741,824).
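The charging rule above can be sketched as a small calculation. This is an illustrative estimate, not the scheduler's actual billing code: the function names are hypothetical, rounding up to whole GH200s is an assumption, and treating the explicit GPU request as a lower bound is one reading of the example. The 120 GB and 72-core figures come from the text.

```python
import math

GB = 10**9                  # 1 GB as defined above: 1e9 bytes
MEM_PER_GH200_GB = 120      # CPU memory per GH200, from the text
CORES_PER_GH200 = 72        # CPU cores per GH200, from the text


def gh200s_charged(mem_gb, cpu_cores, gpus):
    """Estimate the GH200s a job is charged for (assumed round-up rule)."""
    by_memory = math.ceil(mem_gb / MEM_PER_GH200_GB)
    by_cores = math.ceil(cpu_cores / CORES_PER_GH200)
    # Assumption: the explicit GPU request also acts as a lower bound.
    return max(gpus, by_memory, by_cores)


def service_units(mem_gb, cpu_cores, gpus, hours):
    """SUs = GH200s charged x wall-clock hours reserved."""
    return gh200s_charged(mem_gb, cpu_cores, gpus) * hours


def gib_to_gb(gib):
    """Convert binary gibibytes (2**30 bytes) to the decimal GB used here."""
    return gib * 2**30 / GB


# Hypothetical request: 200 GB, 16 cores, 1 GPU for 2.5 hours.
print(gh200s_charged(200, 16, 1))       # memory alone needs 2 GH200s
print(service_units(200, 16, 1, 2.5))   # 2 GH200s x 2.5 h
print(round(gib_to_gb(112), 2))         # 112 GiB expressed in decimal GB
```

The last line shows why the unit matters: a request sized as 112 GiB is slightly more than 120 GB, so under this sketch it would no longer fit in the memory of a single GH200.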
Local Account Charging
Use the accounts command to list the accounts available for charging.
$ accounts
Project Summary for User 'A User':
Account       Balance (Hours)    Deposited (Hours)    Project
------------  -----------------  -------------------  ---------------------------
abcd-dtai-gh  999672             1000000              a research allocation
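If you want to track usage in a script, output like the sample above can be parsed with a few lines of Python. This is a minimal sketch that assumes the layout shown here (a dashed rule followed by whitespace-separated rows with integer hour values); the real output may differ, and parse_accounts is a hypothetical helper, not part of the accounts tool.

```python
SAMPLE = """\
Project Summary for User 'A User':
Account Balance (Hours) Deposited (Hours) Project
------------ ----------------- ------------------- ---------------------------
abcd-dtai-gh 999672 1000000 a research allocation
"""


def parse_accounts(text):
    """Return {account: (balance, deposited, used)} from accounts-style output."""
    rows = {}
    in_table = False
    for line in text.splitlines():
        if line.startswith("---"):
            in_table = True          # data rows follow the dashed rule
            continue
        if not in_table or not line.strip():
            continue
        parts = line.split(None, 3)  # account, balance, deposited, description
        if len(parts) < 3:
            continue                 # skip rows that do not match the assumed layout
        balance, deposited = int(parts[1]), int(parts[2])
        rows[parts[0]] = (balance, deposited, deposited - balance)
    return rows


print(parse_accounts(SAMPLE))
```

For the sample row, the hours used so far are the deposit minus the balance, that is 1000000 - 999672 = 328.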
Job Accounting Considerations
A node-exclusive job that runs on a DeltaAI compute node for one hour will be charged 4 GH200 hours as there are 4 GH200s per node (4 GPUs per node).
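As a quick worked example of the node-exclusive case, the charge scales with the number of nodes and the hours reserved. The helper name below is hypothetical, and the linear scaling is inferred from the SU definition rather than taken from the scheduler configuration.

```python
GH200S_PER_NODE = 4  # from the text: 4 GH200s per DeltaAI compute node


def node_exclusive_su(nodes, hours):
    """GH200 hours charged when every GH200 on each node is reserved."""
    return nodes * GH200S_PER_NODE * hours


print(node_exclusive_su(1, 1))     # the one-node, one-hour case described above
print(node_exclusive_su(5, 0.5))   # hypothetical: 5 nodes for 30 minutes
```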
QOSGrpBillingMinutes
If you see QOSGrpBillingMinutes under the NODELIST(REASON) column of the squeue command output, as in:
JOBID    PARTITION  NAME   USER  ST  TIME  NODES  NODELIST(REASON)
1204221  ghx4       myjob  ....  PD  0:00  5      (QOSGrpBillingMinutes)
then the resource allocation specified for the job (e.g., xyzt-dtai-gh) does not have sufficient balance to run the job, based on the amount of resources requested and the wall-clock time. Sometimes other jobs from the same project that use the same resource allocation, and that are also in the QOSGrpBillingMinutes state, prevent a job that would normally “fit” from running. To resolve this, the PI of the project needs to submit a supplement request through the same XRAS proposal system that was used for the current award; see Allocation Supplements and Extensions.
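To estimate in advance whether a job is likely to hit this state, you can compare its full reservation against the account balance. The check below is a simplified model, not Slurm's actual accounting logic: can_start and committed_su are hypothetical names, and the numbers assume a node-exclusive job on nodes with 4 GH200s each.

```python
def can_start(balance_su, job_su, committed_su=0.0):
    """True if the job's full reservation fits in what remains of the allocation."""
    return job_su + committed_su <= balance_su


# Hypothetical numbers: 5 node-exclusive nodes (4 GH200s each) for 48 hours.
job_su = 5 * 4 * 48
print(job_su)                                      # SUs reserved by the job
print(can_start(1000, job_su))                     # fits on its own
print(can_start(1000, job_su, committed_su=100))   # blocked once other jobs are counted
```

The third call illustrates the second situation described above: the job fits the balance by itself, but not once SUs already claimed by other jobs on the same account are taken into consideration.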
Reviewing Job Charges for a Project (jobcharge)
jobcharge in /sw/user/scripts/ shows job charges by user for a project. Example usage:
jobcharge_grp.py
The example jobcharge commands show results for the abcd-dtai-gh account. Accounts available to you are listed under “Account” when you run the accounts command.
Under Construction
Refunds
Refunds are considered, when appropriate, for jobs that failed due to circumstances beyond user control.
To request a refund, submit a support request. Please include the batch job IDs and the standard error and output files produced by the job(s).