Catch and Release

TL;DR: remove an OST from a pool on CV Lustre

  • On any Lustre client, check the status of the OSTs:
    • lfs df | grep cvlustre-OST |sort -g -k5 | grep -v "/.lustre/cv"
  • On sauron, you can see what's in the pool, see what's not in the pool, and remove OSTs from the pool:
    • What's in the pool?
      • inpool
    • What's not in the pool?
      • notinpool
    • Remove cvlustre-OST0033 from the pool:
      • lctl pool_remove cvlustre.all cvlustre-OST0033
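
The inpool and notinpool helpers used above are shell functions on sauron; the definition of inpool is not shown on this page. As a rough sketch of what such a helper could look like, assuming it simply wraps lctl pool_list for the cvlustre.all pool (the function body is an assumption, not the version installed on sauron):

```shell
# Hypothetical sketch of inpool; the real helper on sauron may differ.
inpool() {
   # pool_list prints a "Pool: <fsname>.<pool>" header followed by the member OSTs;
   # drop the header and the _UUID suffix, then sort the names
   lctl pool_list cvlustre.all | grep -vi pool | sed 's/_UUID$//' | sort
}
```

Run on the MDS, this would print one line per OST in the pool, e.g. cvlustre-OST0033.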

1 The way to monitor and balance OSTs without using functions or aliases:

  • As any user on any client with Lustre mounted and the Lustre file system tools installed, view the OSTs sorted by used capacity:
lfs df | sort -nk 5 | grep -v '/.lu'

2 What is "catch and release?"

  • "Catch and release" is Jessica's term of art for removing OSTs from the 'all' pool and later returning them to it.
  • She has created tools that make this easy to do from the MDS and one dedicated client.

3 Tools on the client (in user's .bashrc)

As root on any Lustre client with the LFS tools installed:

/lustre/naasc/admin/root/.bash_naasc/lustreclient

#catch and release
fishlist(){
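   # which naaschpc OSTs are at or above 90% full (or at or above $1 percent, if given)
   # adding 0 drops the trailing % so awk compares numbers; as strings, "100%" sorts below "90"
   if [ $# -eq 0 ]; then
      lfs df | grep naaschpc-OST | sort -g -k5 | awk '($5+0)>=90'
   else
      lfs df | grep naaschpc-OST | sort -g -k5 | awk -v full="$1" '($5+0)>=(full+0)'
   fi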
}

checkost(){
   # check fullness of a given OST, e.g., naaschpc-OST003b
   lfs df | grep "$1"
}
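
checkost prints the whole lfs df line for an OST. When scripting (for example, polling a drain), a bare number is easier to compare. A minimal sketch, assuming a hypothetical helper named ostpct that is not part of the original tools:

```shell
# ostpct is a hypothetical companion to checkost.
ostpct() {
   # $1: OST name including the fsname, e.g. naaschpc-OST003b
   # Print the Use% column (field 5) as a plain number for the matching OST.
   lfs df | awk -v ost="$1" 'index($1, ost) == 1 { print $5 + 0 }'
}
```

For example, ostpct naaschpc-OST003b would print 87 if that OST is 87% full.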

4 Tools on the MDS (in user's .bashrc)

#catch and release
notinpool(){
   # from the list of all devices, keep only the naaschpc OSTs, then drop the first 8 rows, which are inactive OSTs
   lctl dl | awk '{print $4}' | grep -v naaschpc-MDT | grep naaschpc-OST00 | sort | sed 's/-osc-MDT0000//g' | awk 'NR>8' > /tmp/allhex
   # list the members of the pool
   lctl pool_list naaschpc.all | grep -vi pool | sort | sed 's/_UUID//g' > /tmp/allpool
   # both lists are now reduced to naaschpc-OST#### so they can be compared
   echo "OSTs not in naaschpc.all pool:"; diff /tmp/allpool /tmp/allhex | grep '>'
}

catch(){
# removes an OST from the naaschpc.all pool, e.g.: catch OST003b
# usage: lctl pool_remove <fsname>.<poolname> <ostname indexed list>
   lctl pool_remove naaschpc.all "$1"
}

release(){
# adds an OST to the naaschpc.all pool, e.g.: release OST003b
# usage: lctl pool_add <fsname>.<poolname> <ostname indexed list>
   lctl pool_add naaschpc.all "$1"
}
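
notinpool compares two sorted lists through fixed files in /tmp. As an alternative sketch (not the version in use on the MDS), the same comparison can be done with a private temporary file and grep -v. The function name notinpool_sketch and the fsname argument are assumptions, and unlike notinpool this version does not skip the first 8 inactive OSTs:

```shell
# Hypothetical variant of notinpool; takes the fsname as $1 (default naaschpc).
notinpool_sketch() {
   local fs tmp
   fs=${1:-naaschpc}
   tmp=$(mktemp) || return 1
   # OSTs in the pool, reduced to <fsname>-OSTxxxx
   lctl pool_list "$fs.all" | grep -vi pool | sed 's/_UUID$//' > "$tmp"
   # every OST device the MDS knows about, minus the exact names found in the pool
   lctl dl | awk '{print $4}' | grep "^$fs-OST" | sed 's/-osc-MDT0000$//' | sort -u | grep -vxF -f "$tmp"
   rm -f "$tmp"
}
```

Using mktemp avoids collisions when two administrators run the check at the same time.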

5 Catch and release in action, including use of drain script

  • On the designated client, determine which OSTs are eligible using fishlist
  • On the MDS (asimov), "catch" the eligible OST, e.g.: catch OST003b
# help, I'm lazy, I want to catch them all!
# run fishlist on a client and paste its output into a variable named fishlist on the MDS
# you will need to replace naaschpc with your fsname
for i in $(echo "$fishlist" | awk '{print $1}' | sed 's/naaschpc-\(.*\)_UUID/\1/g'); do catch "$i"; done
  • On the MDS (asimov), verify that your OST is no longer in the pool using notinpool
  • On the designated client, start a screen session, e.g.: screen -S naaschpc-OST003b
  • In the screen session, start the drain, e.g.: sh /lustre/naasc/admin/OST-percent-naasc-drain.sh OST003b 80
  • Outside the screen, follow the OST's progress, e.g.: checkost naaschpc-OST003b
  • On the MDS, once the drain has completed, "release" the OST back into the pool, e.g.: release OST003b
  • On the MDS, verify that your OST is once again in the pool using notinpool (it should no longer be listed)
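
Before catching every eligible OST in one loop, it can help to preview the commands that would run. The sketch below assumes the output of fishlist has been pasted into a shell variable named fishlist on the MDS; ostnames and catchall_dryrun are hypothetical names, not part of the original tools:

```shell
# ostnames (hypothetical): reduce fishlist output on stdin to bare OST names,
# e.g. naaschpc-OST003b_UUID -> OST003b, for any fsname.
ostnames() {
   awk '{print $1}' | sed 's/^.*-\(OST[0-9a-fA-F]*\)_UUID$/\1/'
}

# catchall_dryrun (hypothetical): print the catch commands instead of running them.
# $fishlist is assumed to hold output pasted from fishlist on a client.
catchall_dryrun() {
   for i in $(echo "$fishlist" | ostnames); do
      echo catch "$i"
   done
}
```

Once the printed list looks right, dropping the echo turns the dry run into the real loop.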

5.1 Notes on these tools

  • fishlist as shown here is configured for our naaschpc filesystem
  • checkost as shown here is filesystem agnostic, so you need to include the fsname in your query, e.g.: checkost naaschpc-OST003b

6 Scripts will require localization for your filesystem setup!
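
One way to reduce the amount of localization is to pass the fsname as an argument instead of hard-coding it. A minimal sketch, assuming a hypothetical function named fishlist_fs that is not part of the original tools:

```shell
# fishlist_fs is a hypothetical generalization of fishlist.
fishlist_fs() {
   # $1: fsname (required), $2: threshold in percent (default 90)
   # ($5 + 0) turns the Use% column, e.g. "92%", into a number before comparing
   lfs df | grep "^$1-OST" | sort -g -k5 | awk -v full="${2:-90}" '($5 + 0) >= (full + 0)'
}
```

For example, fishlist_fs naaschpc 85 would list the naaschpc OSTs at or above 85% full. The pool name and drain script path still need to be adjusted per site.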

Topic attachments

  • OST-percent-naasc-drain.sh (2 K, 2017-01-13 10:47, JessicaOtey): revised localize naasc percent drain
Topic revision: r12 - 2019-04-12, CjAllen