Lustre module cannot obtain OST list efficiently
We are using Lustre's ioctl interface to get striping information about each file that is opened. There are a few options to get stripe information:
- LL_IOC_LOV_GETSTRIPE only needs an open file handle, but only returns stripe size and width
- IOC_MDC_GETFILESTRIPE needs an open handle on the parent directory as well as the file basename as a string. It returns stripe size, width, and exact layout.
Darshan already has the file handle for option 1, but has to issue an additional open(2)
call on the parent directory to use option 2. Our tests show that having each MPI process issue this additional open
introduces overhead that scales with MPI process count.
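To make the extra step in option 2 concrete, here is a minimal sketch of the work needed before IOC_MDC_GETFILESTRIPE can be issued: split the path, then open the parent directory. The helper name `open_parent_dir` is hypothetical and is not taken from Darshan or Lustre. The ioctl itself is left out because it needs the Lustre headers.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <libgen.h>
#include <limits.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Darshan or Lustre): split `path` into its
 * parent directory and basename, open the parent directory, and copy the
 * basename into `base_out`. Returns the directory fd, or -1 on error.
 */
static int open_parent_dir(const char *path, char *base_out, size_t base_len)
{
    char dir_buf[PATH_MAX];
    char base_buf[PATH_MAX];

    if (strlen(path) >= PATH_MAX)
        return -1;

    /* dirname(3) and basename(3) may modify their argument, so use copies. */
    strcpy(dir_buf, path);
    strcpy(base_buf, path);

    const char *base = basename(base_buf);
    if (strlen(base) >= base_len)
        return -1;
    strcpy(base_out, base);

    /* This is the additional open(2) that option 2 requires for each file. */
    return open(dirname(dir_buf), O_RDONLY | O_DIRECTORY);
}
```

A caller would pass the returned descriptor and the basename to the IOC_MDC_GETFILESTRIPE ioctl and then close(2) the descriptor. That per-file open/close pair on the parent directory is the cost that option 1 avoids.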
I made some simple code that demonstrates the two ioctl calls' behaviors here:
https://github.com/glennklockwood/nersc-ssio/blob/master/llapi-perf/test-getstripe.c
And here's some example output of the above tool to illustrate:
$ ./test-getstripe /scratch1/scratchdirs/glock/random.bin
This is with ioctl LL_IOC_LOV_GETSTRIPE:
Stripe size: 1048576
Stripe count: 2
OST idx: 0
os_id: 0 oi_seq: 0 f_seq: 0 f_oid: 0 f_ver: 0 l_ost_gen: 0 l_ost_idx: 0
os_id: 0 oi_seq: 0 f_seq: 0 f_oid: 0 f_ver: 0 l_ost_gen: 0 l_ost_idx: 0
$ ./test-getstripe /scratch1/scratchdirs/glock/ random.bin
This is with IOC_MDC_GETFILESTRIPE:
Stripe size: 1048576
Stripe count: 2
OST idx: 54
os_id: 76620879 oi_seq: 0 f_seq: 76620879 f_oid: 0 f_ver: 0 l_ost_gen: 0 l_ost_idx: 54
os_id: 76656970 oi_seq: 0 f_seq: 76656970 f_oid: 0 f_ver: 0 l_ost_gen: 0 l_ost_idx: 22
We need to find a way to get the stripe layout from an open file handle alone so we don't have to issue any additional open(2) calls.