Lustre Configuration
Capacity to Inode Ratio
Ratio: 1TB Quota to 1,500,000 inodes
An inode is a record that describes a file, directory, or link. This information is stored in a dedicated flash pool on Taiga with a finite capacity. To ensure that the inode pool doesn’t run out of space before the capacity pools, this quota ratio is implemented.
For example, if your project has a 10TB quota on Taiga, it has a quota of 15.0 million inodes.
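The ratio can be applied with simple integer arithmetic. The following is a minimal sketch only; the function name `inode_quota_for_tb` is hypothetical and is not a tool provided on Taiga.

```shell
#!/bin/sh
# Hypothetical helper (not a Taiga tool): estimate the inode quota that
# accompanies a capacity quota, using the 1TB : 1,500,000 inode ratio.
INODES_PER_TB=1500000

inode_quota_for_tb() {
    # $1: capacity quota in whole terabytes
    echo $(( $1 * INODES_PER_TB ))
}

inode_quota_for_tb 10   # prints 15000000
inode_quota_for_tb 5    # prints 7500000
```

The 5TB case matches the inode total reported by df -i in the quota example further down this page.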
Block Size
File System Block Size: 2MB
For a balance of throughput performance and file space efficiency, a block size of 2MB has been chosen for the Taiga filesystem. Larger block sizes help large streaming data movement go faster. In general, large I/O to the filesystem is encouraged when possible.
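One way to issue large I/O from the command line is to set the request size explicitly when copying data. The sketch below assumes the standard dd utility is available; the helper name `copy_large_io` is hypothetical, and matching the request size to the 2MB block size is an illustrative choice, not a documented Taiga requirement.

```shell
#!/bin/sh
# Hypothetical helper: copy a file using 2 MiB reads and writes, so that
# requests are issued in units of the file system block size.
BLOCK_BYTES=2097152   # 2 MiB

copy_large_io() {
    # $1: source file, $2: destination file
    dd if="$1" of="$2" bs="$BLOCK_BYTES" 2>/dev/null
}

# Usage (paths are placeholders):
#   copy_large_io input.dat /path/to/your/allocation/output.dat
```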
Checking Quota Utilization
Lustre informs the df utility of project quotas, which is how Taiga partitions capacity. To see your usage versus your quota, change into your allocation directory and run df with the -h and -i flags to check block and file (inode) utilization, respectively, as shown below:
[user@client]$ cd /taiga/illinois/eng/ece/bobsmith
[user@client]$ df -h .
Filesystem                                                                     Size  Used Avail Use% Mounted on
172.30.32.2@o2ib,172.30.32.3@o2ib:..172.30.32.8@o2ib,172.30.32.9@o2ib:/taiga  5.0T  489G  4.6T  10% /taiga
[user@client]$ cd /taiga/illinois/eng/ece/bobsmith
[user@client]$ df -i .
Filesystem                                                                     Inodes  IUsed   IFree IUse% Mounted on
172.30.32.2@o2ib,172.30.32.3@o2ib:..172.30.32.8@o2ib,172.30.32.9@o2ib:/taiga 7500000  90212 7409788    2% /taiga
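The percentage columns can be reproduced from the used and total figures. This is a rough sketch; the helper name `pct_used` is hypothetical, and rounding up to the next whole percent is an assumption chosen because it agrees with the sample figures above.

```shell
#!/bin/sh
# Hypothetical helper: percentage of a quota in use, rounded up to the
# next whole percent (an assumption that matches the sample output).
pct_used() {
    # $1: amount used, $2: quota (same units for both)
    echo $(( ($1 * 100 + $2 - 1) / $2 ))
}

pct_used 90212 7500000   # inodes: prints 2
pct_used 489 5120        # 489G of a 5.0T (5120G) quota: prints 10
```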
Default Stripe Count
Stripe Count: 1
Number of Flash OSTs in Taiga: 24
Number of HDD OSTs in Taiga: 28
Lustre is capable of striping data over multiple object storage targets (OSTs) to increase performance and help balance data across the disks. The default stripe count for Taiga is 1, but this value is overridden as the file being written grows; this behavior is determined by the progressive file layout (PFL) configured for Taiga.
Run lfs getstripe to see how many OSTs a file is striped across. The following example shows a file in a Delta project directory that has FIDs in 3 components because it is 500MB in size:
[user@client]$ lfs getstripe /taiga/nsf/delta/bbka/$user/testfiles/file_0
/taiga/nsf/delta/bbka/$user/testfiles/file_0
  lcm_layout_gen:    6
  lcm_mirror_count:  1
  lcm_entry_count:   4
    lcme_id:             1
    lcme_mirror_id:      0
    lcme_flags:          init
    lcme_extent.e_start: 0
    lcme_extent.e_end:   65536
      lmm_stripe_count:  1
      lmm_stripe_size:   65536
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 11
      lmm_pool:          ddn_nvme
      lmm_objects:
      - 0: { l_ost_idx: 11, l_fid: [0x9c00013a0:0x2926ad:0x0] }

    lcme_id:             2
    lcme_mirror_id:      0
    lcme_flags:          init
    lcme_extent.e_start: 65536
    lcme_extent.e_end:   268435456
      lmm_stripe_count:  1
      lmm_stripe_size:   2097152
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 173
      lmm_pool:          ddn_hdd
      lmm_objects:
      - 0: { l_ost_idx: 173, l_fid: [0xdc00013a1:0x30e07:0x0] }

    lcme_id:             3
    lcme_mirror_id:      0
    lcme_flags:          init
    lcme_extent.e_start: 268435456
    lcme_extent.e_end:   4294967296
      lmm_stripe_count:  4
      lmm_stripe_size:   2097152
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 175
      lmm_pool:          ddn_hdd
      lmm_objects:
      - 0: { l_ost_idx: 175, l_fid: [0xe800013a1:0x30cd3:0x0] }
      - 1: { l_ost_idx: 155, l_fid: [0x4c00013a1:0x28bcf:0x0] }
      - 2: { l_ost_idx: 165, l_fid: [0xa400013a1:0x29058:0x0] }
      - 3: { l_ost_idx: 172, l_fid: [0xe400013a1:0x30d77:0x0] }

    lcme_id:             4
    lcme_mirror_id:      0
    lcme_flags:          0
    lcme_extent.e_start: 4294967296
    lcme_extent.e_end:   EOF
      lmm_stripe_count:  -1
      lmm_stripe_size:   2097152
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: -1
      lmm_pool:          ddn_hdd
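In the output above, the components that hold data are the ones whose lcme_flags value is init. As a small sketch, that count can be pulled out of the output with awk; the helper name `count_init_components` is hypothetical, and it assumes the field names appear exactly as shown in the example.

```shell
#!/bin/sh
# Hypothetical helper: count the layout components whose lcme_flags
# include "init", reading `lfs getstripe` output from standard input.
count_init_components() {
    awk '$1 == "lcme_flags:" && $2 ~ /init/ { n++ } END { print n + 0 }'
}

# Example using the four flag lines from the output above.
printf '%s\n' \
    'lcme_flags: init' \
    'lcme_flags: init' \
    'lcme_flags: init' \
    'lcme_flags: 0' | count_init_components   # prints 3
```

On a Lustre client, the same function could be fed directly, for example `lfs getstripe somefile | count_init_components`, where `somefile` is a placeholder.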
Progressive File Layout (PFL)
Taiga deploys a PFL that performs these two key functions:
Allows us to keep the initial 64KB of every file on NVME flash.
This increases performance for small file I/O by keeping it on faster media and keeps that noisy traffic off the spinning media that prefer larger I/O patterns. This helps improve the throughput for workloads doing large I/O by letting them have clearer access to the HDDs that make up the bulk of Taiga’s capacity.
Allows us to dynamically set the stripe count of files so that the bigger a file grows, the more stripes it gets.
This helps improve the performance of the system and helps keep OST usage more balanced, which leads to better overall system responsiveness. The stripe count of a file can be overridden either by using lfs setstripe or by using lfs migrate to change an existing file’s stripe count; however, these actions are strongly discouraged. Users should use the system defaults except in rare cases.
PFL Implementation Details:
Component 1:
- Size: 0B to 64KB
- Pool: NVME
- Stripe Count: 1
Component 2:
- Size: 64KB to 256MB
- Pool: HDD
- Stripe Count: 1
Component 3:
- Size: 256MB to 4GB
- Pool: HDD
- Stripe Count: 4
Component 4:
- Size: 4GB+
- Pool: HDD
- Stripe Count: 28 (all HDD OST)
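To make the boundaries concrete, the sketch below maps a file size to the last component that size reaches, using the extent limits listed above (64KB = 65536 bytes, 256MB = 268435456 bytes, 4GB = 4294967296 bytes, as in the lfs getstripe example). The helper name `pfl_component_for_size` is hypothetical, and treating each upper bound as exclusive is an assumption based on that example output.

```shell
#!/bin/sh
# Hypothetical helper: report which PFL component holds the last byte of a
# file of the given size. Upper extent bounds are treated as exclusive.
pfl_component_for_size() {
    # $1: file size in bytes (must be at least 1)
    last=$(( $1 - 1 ))
    if   [ "$last" -lt 65536 ];      then echo "1 NVME 1"
    elif [ "$last" -lt 268435456 ];  then echo "2 HDD 1"
    elif [ "$last" -lt 4294967296 ]; then echo "3 HDD 4"
    else                                  echo "4 HDD 28"
    fi
}

# Output fields: component number, pool, stripe count.
pfl_component_for_size 524288000   # 500MB file: prints "3 HDD 4"
```

The 500MB case agrees with the lfs getstripe example, where the third component is the last one with objects allocated.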