Summary of NAASC Lustre FID-in-dirent Activation Activity

This page provides a summary of the activity conducted at NAASC on Fri/Sat April 21/22, 2017.

1 Overview

1.1 General info

  • The goal of this activity was to resolve file system traversal issues that had prevented robinhood scans, backups, and access to certain files.
  • The activation of the FID-in-dirent feature (available in Lustre 2.x) appears to have resolved this problem.

2 Activities on Friday, April 21, after EOB (5pm Eastern)

2.1 LNet activities

  • To confirm that our issues were not connectivity-related, we ran a few tests using the LNet self-test suite (see the script sketch below).
  • The output strongly suggested that there were no connectivity problems.
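
For reference, the script is a thin wrapper around the standard lnet_selftest ('lst') commands. A minimal sketch follows: the group names, the batch name 'thisbatch', and the fact that four tests were added come from the output below; the brw test sizes and the stat interval are assumptions.

#!/bin/bash
# lnet-self-test.sh <server_nid> <reader_nid> <writer_nid>
# Minimal sketch; brw sizes and stat interval are assumptions.
# The lnet_selftest module must be loaded on all test nodes.
modprobe lnet_selftest
export LST_SESSION=$$
lst new_session read_write
lst add_group servers $1
lst add_group readers $2
lst add_group writers $3
lst add_batch thisbatch
lst add_test --batch thisbatch --from readers --to servers brw read size=1M
lst add_test --batch thisbatch --from readers --to servers brw read size=4K
lst add_test --batch thisbatch --from writers --to servers brw write size=1M
lst add_test --batch thisbatch --from writers --to servers brw write size=4K
lst run thisbatch
lst stat --delay 15 servers & sleep 90; kill $!
lst show_error servers readers writers
lst end_session
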
#### From cvpost005: running a modified version of krowe's lnet_test script (modified so you can plug in different NIDs as command-line args) ####
#### No errors; I think this is enough to rule out lnet nonsense... ####

[root@cvpost005 ~]# sh lnet-self-test.sh 10.7.17.132@o2ib 10.7.17.134@o2ib 10.7.17.135@o2ib
10.7.17.132@o2ib  cvpost005
10.7.17.134@o2ib  cvpost007
10.7.17.135@o2ib  cvpost008
SESSION: read_write FEATURES: 0 TIMEOUT: 300 FORCE: No
10.7.17.132@o2ib are added to session
10.7.17.134@o2ib are added to session
10.7.17.135@o2ib are added to session
Test was added successfully
Test was added successfully
Test was added successfully
Test was added successfully
thisbatch is running now
[LNet Rates of servers]
[R] Avg: 21002    RPC/s Min: 21002    RPC/s Max: 21002    RPC/s
[W] Avg: 23781    RPC/s Min: 23781    RPC/s Max: 23781    RPC/s
[LNet Bandwidth of servers]
[R] Avg: 3016.25  MB/s  Min: 3016.25  MB/s  Max: 3016.25  MB/s
[W] Avg: 2780.35  MB/s  Min: 2780.35  MB/s  Max: 2780.35  MB/s

[LNet Rates of servers]
[R] Avg: 21157    RPC/s Min: 21157    RPC/s Max: 21157    RPC/s
[W] Avg: 23958    RPC/s Min: 23958    RPC/s Max: 23958    RPC/s
[LNet Bandwidth of servers]
[R] Avg: 3035.47  MB/s  Min: 3035.47  MB/s  Max: 3035.47  MB/s
[W] Avg: 2803.87  MB/s  Min: 2803.87  MB/s  Max: 2803.87  MB/s

[LNet Rates of servers]
[R] Avg: 20996    RPC/s Min: 20996    RPC/s Max: 20996    RPC/s
[W] Avg: 23771    RPC/s Min: 23771    RPC/s Max: 23771    RPC/s
[LNet Bandwidth of servers]
[R] Avg: 3013.24  MB/s  Min: 3013.24  MB/s  Max: 3013.24  MB/s
[W] Avg: 2778.04  MB/s  Min: 2778.04  MB/s  Max: 2778.04  MB/s

[LNet Rates of servers]
[R] Avg: 20980    RPC/s Min: 20980    RPC/s Max: 20980    RPC/s
[W] Avg: 23752    RPC/s Min: 23752    RPC/s Max: 23752    RPC/s
[LNet Bandwidth of servers]
[R] Avg: 3014.84  MB/s  Min: 3014.84  MB/s  Max: 3014.84  MB/s
[W] Avg: 2775.64  MB/s  Min: 2775.64  MB/s  Max: 2775.64  MB/s

[LNet Rates of servers]
[R] Avg: 20960    RPC/s Min: 20960    RPC/s Max: 20960    RPC/s
[W] Avg: 23732    RPC/s Min: 23732    RPC/s Max: 23732    RPC/s
[LNet Bandwidth of servers]
[R] Avg: 3011.14  MB/s  Min: 3011.14  MB/s  Max: 3011.14  MB/s
[W] Avg: 2774.24  MB/s  Min: 2774.24  MB/s  Max: 2774.24  MB/s

[LNet Rates of servers]
[R] Avg: 21168    RPC/s Min: 21168    RPC/s Max: 21168    RPC/s
[W] Avg: 23975    RPC/s Min: 23975    RPC/s Max: 23975    RPC/s
[LNet Bandwidth of servers]
[R] Avg: 3033.27  MB/s  Min: 3033.27  MB/s  Max: 3033.27  MB/s
[W] Avg: 2809.47  MB/s  Min: 2809.47  MB/s  Max: 2809.47  MB/s

servers:
Total 0 error nodes in servers
readers:
Total 0 error nodes in readers
writers:
Total 0 error nodes in writers
session is ended
[root@cvpost005 ~]# sh lnet-self-test.sh 10.7.17.132@o2ib 10.7.17.8@o2ib 10.7.17.16@o2ib
10.7.17.132@o2ib  cvpost005
10.7.17.8@o2ib  asimov
10.7.17.16@o2ib naasc-oss-1
SESSION: read_write FEATURES: 0 TIMEOUT: 300 FORCE: No
10.7.17.132@o2ib are added to session
10.7.17.8@o2ib are added to session
10.7.17.16@o2ib are added to session
Test was added successfully
Test was added successfully
Test was added successfully
Test was added successfully
thisbatch is running now
[LNet Rates of servers]
[R] Avg: 18402    RPC/s Min: 18402    RPC/s Max: 18402    RPC/s
[W] Avg: 21218    RPC/s Min: 21218    RPC/s Max: 21218    RPC/s
[LNet Bandwidth of servers]
[R] Avg: 2607.55  MB/s  Min: 2607.55  MB/s  Max: 2607.55  MB/s
[W] Avg: 2828.83  MB/s  Min: 2828.83  MB/s  Max: 2828.83  MB/s

[LNet Rates of servers]
[R] Avg: 18346    RPC/s Min: 18346    RPC/s Max: 18346    RPC/s
[W] Avg: 21164    RPC/s Min: 21164    RPC/s Max: 21164    RPC/s
[LNet Bandwidth of servers]
[R] Avg: 2601.00  MB/s  Min: 2601.00  MB/s  Max: 2601.00  MB/s
[W] Avg: 2822.60  MB/s  Min: 2822.60  MB/s  Max: 2822.60  MB/s

[LNet Rates of servers]
[R] Avg: 18377    RPC/s Min: 18377    RPC/s Max: 18377    RPC/s
[W] Avg: 21218    RPC/s Min: 21218    RPC/s Max: 21218    RPC/s
[LNet Bandwidth of servers]
[R] Avg: 2609.61  MB/s  Min: 2609.61  MB/s  Max: 2609.61  MB/s
[W] Avg: 2839.01  MB/s  Min: 2839.01  MB/s  Max: 2839.01  MB/s

[LNet Rates of servers]
[R] Avg: 18450    RPC/s Min: 18450    RPC/s Max: 18450    RPC/s
[W] Avg: 21286    RPC/s Min: 21286    RPC/s Max: 21286    RPC/s
[LNet Bandwidth of servers]
[R] Avg: 2602.72  MB/s  Min: 2602.72  MB/s  Max: 2602.72  MB/s
[W] Avg: 2842.42  MB/s  Min: 2842.42  MB/s  Max: 2842.42  MB/s

[LNet Rates of servers]
[R] Avg: 18463    RPC/s Min: 18463    RPC/s Max: 18463    RPC/s
[W] Avg: 21305    RPC/s Min: 21305    RPC/s Max: 21305    RPC/s
[LNet Bandwidth of servers]
[R] Avg: 2612.32  MB/s  Min: 2612.32  MB/s  Max: 2612.32  MB/s
[W] Avg: 2840.92  MB/s  Min: 2840.92  MB/s  Max: 2840.92  MB/s

[LNet Rates of servers]
[R] Avg: 18336    RPC/s Min: 18336    RPC/s Max: 18336    RPC/s
[W] Avg: 21164    RPC/s Min: 21164    RPC/s Max: 21164    RPC/s
[LNet Bandwidth of servers]
[R] Avg: 2599.50  MB/s  Min: 2599.50  MB/s  Max: 2599.50  MB/s
[W] Avg: 2831.70  MB/s  Min: 2831.70  MB/s  Max: 2831.70  MB/s

servers:
Total 0 error nodes in servers
readers:
Total 0 error nodes in readers
writers:
Total 0 error nodes in writers
session is ended

2.2 Activate FID-in-dirent and run an oi_scrub lfsck

  • While lfsck itself runs on a mounted filesystem, the tune2fs command must first be run against the unmounted MDT device, so the system had to be partially taken down (a sketch of the full command sequence follows this list):
    • Clients were shut down or had lustre unmounted
    • The MDT and OSTs were unmounted
  • The command tune2fs -O dirdata /dev/md127 was issued on the MDS
  • The MDT and OSTs were remounted
  • lfsck was started on the MDT
    • See here for instructions and explanation
  • The oi_scrub finished at 8:30pm, having checked 34,650,419 inodes... Yes, that's 34+ MILLION!
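
A sketch of the sequence (the client unmount details and the MDT/OST mount points here are illustrative assumptions; only the tune2fs line is verbatim from the session):

# On each client: unmount lustre (or shut the client down)
umount -a -t lustre
# On the MDS (asimov) and each OSS: unmount the targets
umount /mnt/mdt                  # MDT mount point is an assumption
umount /mnt/ost*                 # OST mount points likewise
# With the MDT device unmounted, enable the dirdata feature:
tune2fs -O dirdata /dev/md127
# Remount the MDT (and likewise the OSTs), then start the scrub:
mount -t lustre /dev/md127 /mnt/mdt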

I had to reboot asimov to get it to actually unmount the MDT, which did not fill me with joy.

Thus, when I initially ran the tune2fs command, it prompted me to replay the journal (not surprisingly)...

[root@asimov ~]# e2fsck -fp /dev/md127
naaschpc-MDT0000: recovering journal
naaschpc-MDT0000: 34650573/61046784 files (0.1% non-contiguous), 14833722/61035136 blocks
[root@asimov ~]# tune2fs -O dirdata /dev/md127
tune2fs 1.42.13.wc4 (28-Nov-2015)

At this point, we were able to start the oi_scrub.
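
The exact start command isn't captured in this log; with the Lustre 2.x lfsck interface it is issued on the MDS roughly as follows:

lctl lfsck_start -M naaschpc-MDT0000 -t scrub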

Here's a snapshot of the output (you simply 'watch' the statistics file):
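
In other words, something like (the interval and path are visible in the header below):

watch -n 15 cat /proc/fs/lustre/osd-ldiskfs/naaschpc-MDT0000/oi_scrub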

Every 15.0s: cat /proc/fs/lustre/osd-ldiskfs/naaschpc-MDT0000/oi_scrub                                                                              Fri Apr 21 19:41:49 2017

name: OI_scrub
magic: 0x4c5fd252
oi_files: 64
status: scanning
flags: inconsistent
param:
time_since_last_completed: 24151988 seconds
time_since_latest_start: 2080 seconds
time_since_last_checkpoint: 40 seconds
latest_start_position: 12
last_checkpoint_position: 23801804
first_failure_position: 15
checked: 13793917
updated: 0
failed: 1
prior_updated: 0
noscrub: 64
igif: 3620871
success_count: 1
run_time: 2080 seconds
average_speed: 6631 objects/sec
real-time_speed: 6749 objects/sec
current_position: 24128150
lf_scanned: 1
lf_repaired: 0
lf_failed: 0

We also tried running an oi_scrub on an OST, and it finished in just a few seconds. So that may not be a thing--or if it is a thing, maybe it's better to do it after the MDT finishes? Who knows....
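
For completeness, the OST-side scrub would be started the same way on the OSS (the target name here is an illustrative assumption):

lctl lfsck_start -M naaschpc-OST0000 -t scrub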

2.3 Checking for resolution

  • An ls of the two previously non-ls-able files specified in Ambulance tickets #93553 and #94590 worked.
  • A tree command issued against a large user directory completed successfully.
  • A robinhood scan (essentially a complete directory traversal) completed successfully.
  • The lfs setstripe command that is used to put all directories into the 'all' pool is (as of this writing) still running (see the sketch after this list).
    • It would be nice if this would complete--it would serve as another verification of the fix.
  • It was agreed, based on the evidence, that the problem was resolved. At this point, we simply brought the lustre clients back up and concluded the activity.
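
For reference, the pool assignment is of roughly this form (the filesystem path is hypothetical; only the pool name 'all' comes from the notes above):

# Put every directory into the 'all' pool; the path is an assumption
find /lustre/naasc -type d -exec lfs setstripe --pool all {} \;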