Lustre Configuration

Capacity to Inode Ratio

Ratio: 1TB Quota to 1,500,000 inodes

An inode is a record that describes a file, directory, or link. This information is stored in a dedicated flash pool on Taiga with a finite capacity. To ensure that the inode pool doesn’t run out of space before the capacity pools, this quota ratio is implemented.

For example, if your project has a 10TB quota on Taiga, it has a quota of 15.0 million inodes.
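The ratio arithmetic can be sketched in a few lines of Python. The function and constant names here are illustrative only and are not part of any Taiga tooling.

```python
# Ratio stated above: 1TB of quota corresponds to 1,500,000 inodes.
INODES_PER_TB = 1_500_000


def inode_quota(capacity_tb: float) -> int:
    """Return the inode quota implied by a capacity quota given in TB."""
    return round(capacity_tb * INODES_PER_TB)


print(inode_quota(10))  # 15000000
print(inode_quota(5))   # 7500000
```

A 5TB allocation works out to 7,500,000 inodes under this ratio.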

Block Size

File System Block Size: 2MB

For a balance of throughput performance and file space efficiency, a block size of 2MB has been chosen for the Taiga filesystem. Larger block sizes help large streaming data movement go faster. In general, large I/O to filesystems is encouraged when possible.

Checking Quota Utilization

Taiga partitions capacity using Lustre project quotas, and Lustre reports these quotas to the df utility. To see your usage against your quota, change into your allocation directory and run df with the -h flag to check block (capacity) utilization and with the -i flag to check file (inode) utilization, as shown below:

[user@client]$ cd /taiga/illinois/eng/ece/bobsmith
[user@client]$ df -h .
Filesystem                                                                    Size  Used Avail Use% Mounted on
172.30.32.2@o2ib,172.30.32.3@o2ib:..172.30.32.8@o2ib,172.30.32.9@o2ib:/taiga  5.0T  489G  4.6T  10% /taiga

[user@client]$ cd /taiga/illinois/eng/ece/bobsmith
[user@client]$ df -i .
Filesystem                                                                    Inodes  IUsed IFree   IUse% Mounted on
172.30.32.2@o2ib,172.30.32.3@o2ib:..172.30.32.8@o2ib,172.30.32.9@o2ib:/taiga  7500000 90212 7409788    2% /taiga
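The same numbers can be read from a script. The following is a minimal sketch using Python's standard os.statvfs call, on the assumption that it reports the same project-quota figures that df shows when run inside your allocation directory; the function name quota_usage is hypothetical.

```python
import os


def quota_usage(path: str = ".") -> dict:
    """Return block and inode usage for the filesystem that contains path."""
    st = os.statvfs(path)
    return {
        # Capacity figures, comparable to the Size and Used columns of `df -h`.
        "size_bytes": st.f_blocks * st.f_frsize,
        "used_bytes": (st.f_blocks - st.f_bfree) * st.f_frsize,
        # File-count figures, comparable to the Inodes and IUsed columns of `df -i`.
        "inodes": st.f_files,
        "inodes_used": st.f_files - st.f_ffree,
    }


if __name__ == "__main__":
    usage = quota_usage(".")
    print(f"capacity: {usage['used_bytes']} of {usage['size_bytes']} bytes used")
    print(f"inodes:   {usage['inodes_used']} of {usage['inodes']} used")
```

Run it from your allocation directory, or pass the directory path as the argument, so that the values correspond to your project rather than to another location.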

Default Stripe Count

  • Stripe Count: 1

  • Number of Flash OSTs in Taiga: 24

  • Number of HDD OSTs in Taiga: 28

Lustre can stripe data over multiple object storage targets (OSTs) to increase performance and help balance data across the disks. The default stripe count for Taiga is 1, but this value is overridden as the file being written gets larger; this behavior is determined by the progressive file layout (PFL) configured for Taiga.

Run lfs getstripe to see how many OSTs a file is striped across. The following example shows a file on Delta that has FIDs in 3 components because it is 500MB in size:

[user@client]$ lfs getstripe /taiga/nsf/delta/bbka/$user/testfiles/file_0
   /taiga/nsf/delta/bbka/$user/testfiles/file_0
     lcm_layout_gen:    6
     lcm_mirror_count:  1
     lcm_entry_count:   4
       lcme_id:             1
       lcme_mirror_id:      0
       lcme_flags:          init
       lcme_extent.e_start: 0
       lcme_extent.e_end:   65536
         lmm_stripe_count:  1
         lmm_stripe_size:   65536
         lmm_pattern:       raid0
         lmm_layout_gen:    0
         lmm_stripe_offset: 11
         lmm_pool:          ddn_nvme
         lmm_objects:
         - 0: { l_ost_idx: 11, l_fid: [0x9c00013a0:0x2926ad:0x0] }

       lcme_id:             2
       lcme_mirror_id:      0
       lcme_flags:          init
       lcme_extent.e_start: 65536
       lcme_extent.e_end:   268435456
         lmm_stripe_count:  1
         lmm_stripe_size:   2097152
         lmm_pattern:       raid0
         lmm_layout_gen:    0
         lmm_stripe_offset: 173
         lmm_pool:          ddn_hdd
         lmm_objects:
         - 0: { l_ost_idx: 173, l_fid: [0xdc00013a1:0x30e07:0x0] }

       lcme_id:             3
       lcme_mirror_id:      0
       lcme_flags:          init
       lcme_extent.e_start: 268435456
       lcme_extent.e_end:   4294967296
         lmm_stripe_count:  4
         lmm_stripe_size:   2097152
         lmm_pattern:       raid0
         lmm_layout_gen:    0
         lmm_stripe_offset: 175
         lmm_pool:          ddn_hdd
         lmm_objects:
         - 0: { l_ost_idx: 175, l_fid: [0xe800013a1:0x30cd3:0x0] }
         - 1: { l_ost_idx: 155, l_fid: [0x4c00013a1:0x28bcf:0x0] }
         - 2: { l_ost_idx: 165, l_fid: [0xa400013a1:0x29058:0x0] }
         - 3: { l_ost_idx: 172, l_fid: [0xe400013a1:0x30d77:0x0] }

       lcme_id:             4
       lcme_mirror_id:      0
       lcme_flags:          0
       lcme_extent.e_start: 4294967296
       lcme_extent.e_end:   EOF
         lmm_stripe_count:  -1
         lmm_stripe_size:   2097152
         lmm_pattern:       raid0
         lmm_layout_gen:    0
         lmm_stripe_offset: -1
         lmm_pool:          ddn_hdd
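To pull the key fields out of output like this, a small parser can help. The sketch below assumes the text layout shown above (it may differ between Lustre versions) and uses a shortened, hand-written sample rather than real command output; the function name summarize_layout is hypothetical.

```python
import re

# Shortened sample in the same "key: value" layout as the output above.
SAMPLE = """
  lcme_id:             1
  lcme_flags:          init
    lmm_stripe_count:  1
    lmm_pool:          ddn_nvme
    lmm_objects:
    - 0: { l_ost_idx: 11, l_fid: [0x9c00013a0:0x2926ad:0x0] }

  lcme_id:             4
  lcme_flags:          0
    lmm_stripe_count:  -1
    lmm_pool:          ddn_hdd
"""

WANTED = ("lcme_flags", "lmm_stripe_count", "lmm_pool")


def summarize_layout(text: str) -> list:
    """Collect id, flags, stripe count, and pool for each layout component."""
    components = []
    for line in text.splitlines():
        match = re.match(r"\s*(\w[\w.]*):\s+(\S+)", line)
        if not match:
            continue  # skips blank lines, "lmm_objects:", and "- 0: {...}" entries
        key, value = match.groups()
        if key == "lcme_id":
            components.append({"id": int(value)})  # a new component starts here
        elif components and key in WANTED:
            components[-1][key] = value
    return components


layout = summarize_layout(SAMPLE)
initialized = [c["id"] for c in layout if c["lcme_flags"] == "init"]
print(initialized)  # [1]
```

Components whose lcme_flags value is init are the ones that have been instantiated for the file; in the full example above, those are components 1 through 3.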

Progressive File Layout (PFL)

Taiga deploys a PFL that performs these two key functions:

  • Allows us to keep the initial 64KB of every file on NVME flash.

    This increases performance for small file I/O by keeping it on faster media and keeps that noisy traffic off the spinning media that prefer larger I/O patterns. This helps improve the throughput for workloads doing large I/O by letting them have clearer access to the HDDs that make up the bulk of Taiga’s capacity.

  • Allows us to dynamically set the stripe count of files so that the bigger a file grows, the more stripes it gets.

    This improves system performance and keeps OST usage more balanced, which leads to better overall system responsiveness. A file's stripe count can be overridden with lfs setstripe (for new files) or changed with lfs migrate (for existing files); however, these actions are strongly discouraged. Users should use the system defaults except in rare cases.

PFL Implementation Details:

Component 1:
- Size: 0B to 64KB
- Pool: NVME
- Stripe Count: 1

Component 2:
- Size: 64KB to 256MB
- Pool: HDD
- Stripe Count: 1

Component 3:
- Size: 256MB to 4GB
- Pool: HDD
- Stripe Count: 4

Component 4:
- Size: 4GB+
- Pool: HDD
- Stripe Count: 28 (all HDD OSTs)
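As a rough illustration of how these components apply to a given file, the sketch below encodes the four extents listed above and reports which ones a file of a given size extends into. The names are illustrative, and the sizes are taken as binary units (64KB = 65536 bytes), which matches the extent boundaries in the lfs getstripe example.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# (start offset, end offset or None for EOF, pool, stripe count)
PFL_COMPONENTS = [
    (0, 64 * KIB, "NVME", 1),
    (64 * KIB, 256 * MIB, "HDD", 1),
    (256 * MIB, 4 * GIB, "HDD", 4),
    (4 * GIB, None, "HDD", 28),
]


def components_for(file_size: int) -> list:
    """Return the components that a file of file_size bytes has data in."""
    return [c for c in PFL_COMPONENTS if file_size > c[0]]


used = components_for(500 * MIB)
print(len(used))    # 3
print(used[-1][3])  # 4
```

A 500MB file reaches into the first three components, so its largest extent is striped across 4 HDD OSTs, which agrees with the lfs getstripe example.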