Job Accounting

The charge unit for DeltaAI is the service unit (SU), which corresponds to the use of one Grace Hopper GH200 for 1 hour. Charges are based on the resources reserved for your job, not on how those resources are actually used. Specifically, a job is charged for the number of GH200s needed to fulfill the requested memory or CPU cores, whichever is larger.

For example, if a job requests 300 GB of CPU memory, 16 CPU cores, and 1 GPU, then three GH200s will be allocated to the job even though only 1 GPU is requested, because 300 GB does not fit in the CPU memory of two GH200s. Note that each GH200 has 120 GB of CPU memory and 72 CPU cores.

Also note that 1 GB of memory here means 1e9 bytes (1,000,000,000), not 2^30 bytes (1,073,741,824).
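
The charging rule above can be sketched in a few lines of Python. This is an illustration only, not the accounting code used on DeltaAI: the function names are hypothetical, and treating the requested GPU count as a lower bound on the charge is an assumption drawn from the example above.

```python
import math

# Per-GH200 resources as stated in this section (1 GB = 1e9 bytes).
GH200_CPU_MEM_GB = 120
GH200_CPU_CORES = 72


def gh200s_charged(mem_gb: float, cpu_cores: int, gpus: int) -> int:
    """Estimate how many GH200s a request is charged for (hypothetical helper)."""
    by_memory = math.ceil(mem_gb / GH200_CPU_MEM_GB)
    by_cores = math.ceil(cpu_cores / GH200_CPU_CORES)
    # Assumption: the requested GPU count is a floor on the charge,
    # and every job is charged for at least one GH200.
    return max(gpus, by_memory, by_cores, 1)


def service_units(mem_gb: float, cpu_cores: int, gpus: int, hours: float) -> float:
    """SUs = GH200s charged x hours reserved (hypothetical helper)."""
    return gh200s_charged(mem_gb, cpu_cores, gpus) * hours


if __name__ == "__main__":
    # 200 GB exceeds one GH200's 120 GB, so memory drives the charge.
    print(gh200s_charged(mem_gb=200, cpu_cores=16, gpus=1))              # 2
    # 144 cores span two GH200s; 2 GH200s x 2.5 hours = 5.0 SUs.
    print(service_units(mem_gb=100, cpu_cores=144, gpus=1, hours=2.5))   # 5.0
```

Because charges follow reserved resources, requesting less memory or fewer cores than a GH200 boundary (120 GB, 72 cores) can reduce the charge for a job.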

Local Account Charging

Use the accounts command to list the accounts available for charging.

$ accounts
Project Summary for User 'A User':

Account         Balance (Hours)    Deposited (Hours)  Project
------------  -----------------  -------------------  ---------------------------
abcd-dtai-gh             999672              1000000  a research allocation
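
For scripting, the balance can be pulled out of this output with a small parser. The sketch below assumes the output always has the layout shown above (a dashed separator line followed by one row per account, with whole-number hour values); the function name and the embedded sample are illustrative, not part of any DeltaAI tool.

```python
def parse_accounts(output: str) -> dict:
    """Map account name -> (balance_hours, deposited_hours).

    Assumes the layout of the sample `accounts` output: rows follow a
    dashed separator line, and the first three fields are the account
    name, the balance, and the deposited hours.
    """
    lines = output.splitlines()
    sep = next((i for i, line in enumerate(lines) if line.startswith("---")), None)
    if sep is None:
        return {}  # no table found
    balances = {}
    for line in lines[sep + 1:]:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        try:
            balances[fields[0]] = (int(fields[1]), int(fields[2]))
        except ValueError:
            continue  # skip rows whose hour fields are not integers
    return balances


SAMPLE = """\
Project Summary for User 'A User':

Account         Balance (Hours)    Deposited (Hours)  Project
------------  -----------------  -------------------  ---------------------------
abcd-dtai-gh             999672              1000000  a research allocation
"""

if __name__ == "__main__":
    for account, (balance, deposited) in parse_accounts(SAMPLE).items():
        used = deposited - balance
        print(f"{account}: {balance} of {deposited} hours left ({used} used)")
```

On a login node, the same function could be fed the live output, for example via `subprocess.run(["accounts"], capture_output=True, text=True).stdout`.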

Job Accounting Considerations

  • A node-exclusive job that runs on a DeltaAI compute node for one hour is charged 4 SUs (4 GH200 hours), because each node has 4 GH200s (4 GPUs).

QOSGrpBillingMinutes

If you see QOSGrpBillingMinutes in the NODELIST(REASON) column of the squeue output, as in:

  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
1204221      ghx4    myjob     .... PD       0:00      5 (QOSGrpBillingMinutes)

then the resource allocation specified for the job (e.g., xyzt-dtai-gh) does not have sufficient balance to run the job, given the resources requested and the wall-clock time. Sometimes other jobs from the same project that use the same resource allocation, and are also in the QOSGrpBillingMinutes state, prevent a job that would normally “fit” from running. To resolve this, the PI of the project needs to submit a supplement request through the same XRAS proposal system that was used for the current award; see Allocation Supplements and Extensions.
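
Before submitting, it can help to estimate whether a request fits in the remaining balance. The following is a rough sketch, not the scheduler's actual check. It assumes node-exclusive jobs (4 GH200s per node, as described under Job Accounting Considerations), that the requested charge is GH200s multiplied by the requested wall-clock hours, and that time limits are written as [D-]HH:MM:SS. All function names are hypothetical.

```python
GH200S_PER_NODE = 4  # stated in this section for DeltaAI compute nodes


def walltime_to_hours(walltime: str) -> float:
    """Convert a time limit written as [D-]HH:MM:SS to hours."""
    days = 0
    if "-" in walltime:
        day_part, walltime = walltime.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return days * 24 + hours + minutes / 60 + seconds / 3600


def fits_in_balance(nodes: int, walltime: str, balance_hours: float,
                    pending_hours: float = 0.0) -> bool:
    """Return True if the requested GH200 hours fit in the balance.

    pending_hours is the charge already claimed by other jobs of the
    same project (an assumption about how the limit is shared).
    """
    requested = nodes * GH200S_PER_NODE * walltime_to_hours(walltime)
    return requested + pending_hours <= balance_hours


if __name__ == "__main__":
    # 5 nodes x 4 GH200s x 48 hours = 960 GH200 hours requested.
    print(fits_in_balance(nodes=5, walltime="48:00:00", balance_hours=500))     # False
    print(fits_in_balance(nodes=5, walltime="48:00:00", balance_hours=999672))  # True
```

If the estimate exceeds the balance, shortening the requested wall-clock time or reducing the node count may let the job run without a supplement.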

Reviewing Job Charges for a Project (jobcharge)

The jobcharge script in /sw/user/scripts/ shows job charges by user for a project. Example usage:

jobcharge_grp.py

The example jobcharge commands show results for the abcd-dtai-gh account. The accounts available to you are listed under “Account” when you run the accounts command.

Under Construction

Refunds

Refunds are considered, when appropriate, for jobs that failed due to circumstances beyond user control.

To request a refund, submit a support request. Please include the batch job IDs and the standard error and output files produced by the job(s).