Lustre Training, January 24-27, 2012, Denver, Colorado

Network

  • Check Lustre cabling, look for errors or inexplicable client evictions
  • Ensure IB cards are in x8 PCIe slots
  • Possible that QOS could be beneficial
  • Consider replacing multihomed clients with LNET router
    • Allows OSSes to be tuned for IB with TCP tuned on the routers
    • Use 2 or more routers for failover and resource pooling
    • See slide 11 module 4 of handout
  • Evictions are probably more concerning than I'd assumed
  • oss_num_threads (already set this but will be affected by lnet router)
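
A minimal sketch of what the LNET router layout above might look like as module options. It assumes IB-only servers on o2ib0, TCP-only clients on tcp0, and routers on both; the interface names (ib0, eth0) and all addresses are hypothetical, and the handout should be checked for the exact settings used in class.

```python
def lnet_options(role, router_ib_ips=(), router_tcp_ips=(),
                 ib_if="ib0", tcp_if="eth0"):
    """Build one 'options lnet ...' line for a modprobe configuration file.

    role is "router", "server" (IB-only OSS/MDS) or "client" (TCP-only).
    Network names, interface names and addresses are illustrative only.
    """
    if role == "router":
        # Routers sit on both networks and forward traffic between them.
        return ('options lnet networks="tcp0({0}),o2ib0({1})" '
                'forwarding="enabled"'.format(tcp_if, ib_if))
    if role == "server":
        # IB-only servers reach tcp0 through each router's IB-side address.
        routes = "; ".join("tcp0 {0}@o2ib0".format(ip) for ip in router_ib_ips)
        return 'options lnet networks="o2ib0({0})" routes="{1}"'.format(ib_if, routes)
    if role == "client":
        # TCP-only clients reach o2ib0 through each router's TCP-side address.
        routes = "; ".join("o2ib0 {0}@tcp0".format(ip) for ip in router_tcp_ips)
        return 'options lnet networks="tcp0({0})" routes="{1}"'.format(tcp_if, routes)
    raise ValueError("unknown role: {0!r}".format(role))


if __name__ == "__main__":
    # Two hypothetical routers, listed once per network side.
    print(lnet_options("router"))
    print(lnet_options("server", router_ib_ips=["10.0.0.1", "10.0.0.2"]))
    print(lnet_options("client", router_tcp_ips=["192.168.0.1", "192.168.0.2"]))
```

Listing every router in the routes string is what gives the failover and pooling mentioned above; with only one entry there is a single point of failure.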

Lustre kernel and modules

  • Lustre 1.8.x limited to RHEL 5.x

Filesystem creation and maintenance

  • If necessary can set OST index with --index rather than relying on mount order (but don't)
  • If possible, shut down clients first, then the MDT, then the OSTs. Simplifies recovery process on startup
  • /proc/fs/lustre/lov/*/qos_threshold_rr controls the balance between QOS (0) and round robin (100); default is 20.
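
A small sketch for checking the current qos_threshold_rr setting on each lov device. The proc_root parameter and the helper names are my own, and stripping a trailing "%" is a defensive assumption about how the file is formatted.

```python
import glob
import os


def parse_threshold(text):
    """Turn the contents of qos_threshold_rr into an int.

    Tolerates surrounding whitespace and a trailing '%' (an assumption).
    """
    return int(text.strip().rstrip("%"))


def describe_threshold(value):
    """Describe a threshold using the 0 = QOS, 100 = round robin scale."""
    if not 0 <= value <= 100:
        raise ValueError("threshold must be between 0 and 100")
    if value == 0:
        return "always QOS allocation"
    if value == 100:
        return "always round robin"
    return "mix of QOS and round robin (threshold {0})".format(value)


def read_thresholds(proc_root="/proc/fs/lustre"):
    """Return {path: threshold} for every lov device found under proc_root."""
    pattern = os.path.join(proc_root, "lov", "*", "qos_threshold_rr")
    values = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            values[path] = parse_threshold(handle.read())
    return values


if __name__ == "__main__":
    for path, value in read_thresholds().items():
        print(path, describe_threshold(value))
```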

Failover and recovery

  • STONITH required
  • Need multiple methods of up/down confirmation to ensure shared storage is never in use by more than one node at once

Client tuning

  • Max read ahead (currently do at block level)
    /proc/fs/lustre/llite//max_read_ahead_whole_mb
    /proc/fs/lustre/llite//max_read_ahead_mb (default 40MB)
  • Client cache per OST
    /proc/fs/lustre/osc//max_dirty_mb (danger if set too high that a block requires a flush)
  • Client RPCs in flight (already do)
    /proc/fs/lustre/osc/*/max_rpcs_in_flight
    /proc/fs/lustre/osc/*/rpc_stats (to monitor)
    /proc/fs/lustre/llite//max_cached_mb ( limits inactive data cache on client)
    /proc/fs/lustre/llite/*/statahead_max ( max number of directory entries to stat ahead )
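
To record the client settings above before changing anything, something like the following sketch could be used. It assumes each tunable lives one directory below llite/ or osc/ (matched with a wildcard); the proc_root parameter and function name are hypothetical, and rpc_stats is left out because it is a report rather than a single value.

```python
import glob
import os

# (subsystem, tunable) pairs taken from the list above.
CLIENT_TUNABLES = [
    ("llite", "max_read_ahead_mb"),
    ("llite", "max_read_ahead_whole_mb"),
    ("llite", "max_cached_mb"),
    ("osc", "max_dirty_mb"),
    ("osc", "max_rpcs_in_flight"),
]


def snapshot_tunables(proc_root="/proc/fs/lustre"):
    """Return {relative_path: value_string} for every tunable that exists."""
    snapshot = {}
    for subsystem, name in CLIENT_TUNABLES:
        pattern = os.path.join(proc_root, subsystem, "*", name)
        for path in sorted(glob.glob(pattern)):
            with open(path) as handle:
                # Keep the raw string; some files may not be plain integers.
                snapshot[os.path.relpath(path, proc_root)] = handle.read().strip()
    return snapshot


if __name__ == "__main__":
    for rel_path, value in sorted(snapshot_tunables().items()):
        print("{0} = {1}".format(rel_path, value))
```

Saving this output before and after a tuning change makes it easier to back the change out.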

Recovery

  • Should be fsck'ing disks quarterly or so

Misc

  • Lock count on MDS
    /proc/fs/lustre/ldlm/namespaces/mds-lustre-MDT0000_UUID/lock_count
  • Root squash
    lctl conf_param Lustre.mdt.rootsquash="UID:GID"
    lctl conf_param Lustre.mdt.nosquash_nid="<clientip@network>"
  • Tune down opportunistic locking for NFS server
  • "lctl dk" to dump the kernel debug log output
  • Tell Lustre to temporarily avoid a degraded OST by echoing a non-zero value to:
    /proc/fs/lustre/obdfilter//degraded
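
A hedged sketch of flagging an OST as degraded from a script run on the OSS. The OST name passed in (for example "lustre-OST0000") is hypothetical, writing 0 to clear the flag is my assumption, and dry_run defaults to True so nothing is written by accident.

```python
import glob
import os


def set_degraded(ost_name, degraded, proc_root="/proc/fs/lustre", dry_run=True):
    """Write 1 (degraded) or 0 (assumed to clear) to an OST's degraded file.

    ost_name may contain shell-style wildcards. Returns the list of matching
    paths; files are only written when dry_run is False.
    """
    value = "1" if degraded else "0"
    pattern = os.path.join(proc_root, "obdfilter", ost_name, "degraded")
    paths = sorted(glob.glob(pattern))
    if not dry_run:
        for path in paths:
            with open(path, "w") as handle:
                handle.write(value + "\n")
    return paths


if __name__ == "__main__":
    # Dry run: only lists the files that would be touched on this OSS.
    for path in set_degraded("*", degraded=True):
        print("would mark degraded:", path)
```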

-- JamesRobnett - 2012-01-25
Topic revision: r7 - 2012-01-26, JamesRobnett