Lustre Configuration
Capacity to Inode Ratio
Ratio: 1TB Quota to 1,500,000 inodes
An inode is a record that describes a file, directory, or link. This information is stored in a dedicated flash pool on Taiga with a finite capacity. To ensure that the inode pool doesn’t run out of space before the capacity pools, this quota ratio is implemented.
For example, if your project has a 10TB quota on Taiga, it has a quota of 15.0 million inodes.
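The same arithmetic applies to any quota size. The sketch below assumes the quota is given in whole terabytes; `inode_quota_for_tb` is a hypothetical helper name used for illustration, not a command provided on Taiga.

```shell
# Hypothetical helper (not a Taiga command): inode quota for a capacity quota
# given in whole TB, using the 1TB : 1,500,000 inode ratio.
inode_quota_for_tb() {
  echo $(( $1 * 1500000 ))
}

inode_quota_for_tb 10   # prints 15000000
```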
Block Size
File System Block Size: 2MB
For a balance of throughput performance and file space efficiency, a block size of 2MB has been chosen for the Taiga filesystem. Larger block sizes help large streaming data movement go faster. In general, large I/O to filesystems is encouraged when possible.
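As one illustration of large I/O, the sketch below writes a file in requests that match the block size. It assumes 2MB means 2097152 bytes (2MiB); `write_large_blocks` is a hypothetical helper, and whether this request size is the best choice for a given workload is not something this page states.

```shell
# Hypothetical helper: write a file of zeros using 2MiB (2097152-byte) requests.
# $1 = output path, $2 = number of 2MiB blocks to write.
write_large_blocks() {
  dd if=/dev/zero of="$1" bs=2097152 count="$2" 2>/dev/null
}

# Usage (path is a placeholder): write_large_blocks ./testfile 4
```

Applications that do their own I/O can apply the same idea by buffering writes into multi-megabyte requests rather than issuing many small ones.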
Checking Quota Utilization
The quota command on Illinois Campus Cluster will show all of your allocation usage on Taiga (via the /projects section of the output). The Delta and DeltaAI systems have the taigaquota command, whose output looks like the following:
[username@dt-login01 ~]$ taigaquota
Quota usage for user yourusername:
------------------------------------------------------------------------------------------------------
| Directory Path                    | Block  | Soft   | Hard   | Files     | Soft       | Hard       |
|                                   | Used   | Quota  | Limit  | Used      | Quota      | Limit      |
------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------
| /taiga/nsf/delta/xxxx             | 1.692T | 1.953T | 2.002T | 1821185   | 3000000    | 3300000    |
| /taiga/industry/yy/yyy            | 6.033P | 6.348P | 6.348P | 794314501 | 9750000000 | 9750000000 |
| /taiga/illinois/ovcri/ncsa/shared | 36.71T | 62T    | 62T    | 17242328  | 93000000   | 93000000   |
| /taiga/illinois/eng/cs/pinetid    | 5.109T | 10T    | 10T    | 2516541   | 15000000   | 15000000   |
| /taiga/illinois/las/stat/pinetid  | 59.43G | 1T     | 1T     | 56009     | 1500000    | 1500000    |
------------------------------------------------------------------------------------------------------
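To see at a glance how close each allocation is to its inode limit, the table can be post-processed. The sketch below assumes the pipe-delimited layout shown in the sample output (path in the first column, files used and file soft quota in the fifth and sixth); `inode_usage_pct` is a hypothetical helper, not part of taigaquota.

```shell
# Hypothetical helper: read taigaquota-style output on stdin and print the
# percentage of the file (inode) soft quota used for each directory.
# Assumes the column layout shown in the sample output.
inode_usage_pct() {
  awk -F'|' '
    # Data rows are the ones whose first column is an absolute path.
    $2 ~ /^ *\// {
      path = $2; gsub(/ /, "", path)
      used = $6 + 0; soft = $7 + 0
      if (soft > 0) printf "%s %.1f%%\n", path, 100 * used / soft
    }'
}

# Usage on Delta or DeltaAI: taigaquota | inode_usage_pct
```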
Another method for checking your quota is to leverage the fact that Lustre informs the df utility of project quotas, which is how Taiga partitions capacity. To see your usage vs. your quota, change into your allocation directory and run df with the -h and -i flags to check block and file utilization, respectively, as shown below:
[user@client]$ cd /taiga/illinois/eng/ece/bobsmith
[user@client]$ df -h .
Filesystem                                                                    Size  Used Avail Use% Mounted on
172.30.32.2@o2ib,172.30.32.3@o2ib:..172.30.32.8@o2ib,172.30.32.9@o2ib:/taiga  5.0T  489G  4.6T  10% /taiga
[user@client]$ cd /taiga/illinois/eng/ece/bobsmith
[user@client]$ df -i .
Filesystem                                                                    Inodes  IUsed   IFree   IUse% Mounted on
172.30.32.2@o2ib,172.30.32.3@o2ib:..172.30.32.8@o2ib,172.30.32.9@o2ib:/taiga  7500000 90212   7409788    2% /taiga
Default Stripe Count
Stripe Count: 1
Number of Flash OSTs in Taiga: 32
Number of HDD OSTs in Taiga: 38
Lustre can stripe data over multiple object storage targets (OSTs) to increase performance and help balance data across the disks. The default stripe count for Taiga is 1, but this value is overridden as the file being written gets larger; this behavior is determined by the progressive file layout (PFL) configured for Taiga.
Run lfs getstripe to see how many OSTs a file is striped across. The following example shows a file on Delta that has FIDs in 3 components since it is 500MB in size:
[user@client]$ lfs getstripe /taiga/nsf/delta/bbka/$user/testfiles/file_0
/taiga/nsf/delta/bbka/$user/testfiles/file_0
  lcm_layout_gen:    6
  lcm_mirror_count:  1
  lcm_entry_count:   4
    lcme_id:             1
    lcme_mirror_id:      0
    lcme_flags:          init
    lcme_extent.e_start: 0
    lcme_extent.e_end:   65536
      lmm_stripe_count:  1
      lmm_stripe_size:   65536
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 11
      lmm_pool:          ddn_nvme
      lmm_objects:
      - 0: { l_ost_idx: 11, l_fid: [0x9c00013a0:0x2926ad:0x0] }

    lcme_id:             2
    lcme_mirror_id:      0
    lcme_flags:          init
    lcme_extent.e_start: 65536
    lcme_extent.e_end:   536870912
      lmm_stripe_count:  1
      lmm_stripe_size:   2097152
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 173
      lmm_pool:          ddn_hdd
      lmm_objects:
      - 0: { l_ost_idx: 173, l_fid: [0xdc00013a1:0x30e07:0x0] }

    lcme_id:             3
    lcme_mirror_id:      0
    lcme_flags:          init
    lcme_extent.e_start: 536870912
    lcme_extent.e_end:   8589934592
      lmm_stripe_count:  4
      lmm_stripe_size:   2097152
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 175
      lmm_pool:          ddn_hdd
      lmm_objects:
      - 0: { l_ost_idx: 175, l_fid: [0xe800013a1:0x30cd3:0x0] }
      - 1: { l_ost_idx: 155, l_fid: [0x4c00013a1:0x28bcf:0x0] }
      - 2: { l_ost_idx: 165, l_fid: [0xa400013a1:0x29058:0x0] }
      - 3: { l_ost_idx: 172, l_fid: [0xe400013a1:0x30d77:0x0] }

    lcme_id:             4
    lcme_mirror_id:      0
    lcme_flags:          0
    lcme_extent.e_start: 4294967296
    lcme_extent.e_end:   EOF
      lmm_stripe_count:  -1
      lmm_stripe_size:   2097152
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: -1
      lmm_pool:          ddn_hdd
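The full listing is verbose, so it can help to reduce it to one line per component. The sketch below assumes the field names shown in the output above and that every component reports an lmm_pool line (as in this example); `summarize_components` is a hypothetical helper, not an lfs subcommand.

```shell
# Hypothetical helper: condense `lfs getstripe` output (read from stdin) to one
# line per PFL component. Assumes each component ends with an lmm_pool line.
summarize_components() {
  awk '
    $1 == "lcme_id:"             { id = $2 }
    $1 == "lcme_extent.e_start:" { lo = $2 }
    $1 == "lcme_extent.e_end:"   { hi = $2 }
    $1 == "lmm_stripe_count:"    { count = $2 }
    $1 == "lmm_pool:" {
      printf "component %s: bytes %s-%s stripes=%s pool=%s\n", id, lo, hi, count, $2
    }'
}

# Usage (path is a placeholder): lfs getstripe /path/to/file | summarize_components
```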
Progressive File Layout (PFL)
Taiga deploys a PFL that performs these two key functions:
Allows us to keep the initial 64KB of every file on NVME flash.
This increases performance for small file I/O by keeping it on faster media and keeps that noisy traffic off the spinning media that prefer larger I/O patterns. This helps improve the throughput for workloads doing large I/O by letting them have clearer access to the HDDs that make up the bulk of Taiga’s capacity.
Allows us to dynamically set the stripe count of files so that the bigger a file grows, the more stripes it gets.
This helps improve the performance of the system and helps keep the OST usage rates more balanced, which leads to better overall system responsiveness. The stripe count of a file can be overridden by using lfs setstripe, or by using lfs migrate to change an existing file’s stripe count; however, these actions are strongly discouraged. You should use the system defaults except in rare cases.
PFL Implementation Details:
Component 1:
- Size: 0B to 64KB
- Pool: NVME
- Stripe Count: 1
Component 2:
- Size: 64KB to 512MB
- Pool: HDD
- Stripe Count: 1
Component 3:
- Size: 512MB to 8GB
- Pool: HDD
- Stripe Count: 4
Component 4:
- Size: 8GB+
- Pool: HDD
- Stripe Count: 16 (~ 50% HDD OSTs)
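The component boundaries above can be read as a lookup from file size to the pool and stripe count that hold the end of the file. The sketch below encodes that table; it assumes binary units (64KB = 65536 bytes, 512MB = 536870912 bytes, 8GB = 8589934592 bytes), and `pfl_component_for_size` is a hypothetical helper that only mirrors the list above rather than querying the filesystem.

```shell
# Hypothetical helper: given a file size in bytes, print the pool and stripe
# count of the PFL component that holds the file's last byte.
# Boundaries assume binary units for the sizes listed above.
pfl_component_for_size() {
  if   [ "$1" -le 65536 ];      then echo "NVME 1"
  elif [ "$1" -le 536870912 ];  then echo "HDD 1"
  elif [ "$1" -le 8589934592 ]; then echo "HDD 4"
  else                               echo "HDD 16"
  fi
}

pfl_component_for_size 1073741824   # 1GiB file: prints "HDD 4"
```

For the layout a real file actually received, lfs getstripe remains the authoritative source.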