-- DanielKlopp - 2012-10-10

NOTE: This is a draft

I Just Want To Reduce Data, Not Read All This

These are minimalist instructions. If you have problems, refer to the extensive documentation outside of this section.

First, make sure you are inside the NRAO network. If you are not, ssh to polaris.cv.nrao.edu.

Next, log in to foundation.cv.nrao.edu:

ssh foundation.cv.nrao.edu

Reserve a node for 2 days (change the duration to suit your needs):

nodescheduler --request 2.0

When the node is ready, you will receive an email at your NRAO email address. It may take up to one minute after the email arrives before you have access. Check which nodes you were given with:

nodescheduler --list mynodes

Then log in to them. Lustre is located at /lustre/naasc . You should have a directory there named after your NRAO username. For example, Dan Klopp's data reduction directory is:

/lustre/naasc/dklopp
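As a small convenience, the path above can be built from a username. This is only a sketch: naasc_dir is a hypothetical helper name, and it assumes your login name on the node is the same as your NRAO username.

```shell
# Hypothetical helper (not part of any NRAO tooling): print the NAASC Lustre
# directory for a user. Defaults to the current login name.
# Usage: naasc_dir [username]
naasc_dir() {
    printf '/lustre/naasc/%s\n' "${1:-${USER:-$(id -un)}}"
}

# Example: cd "$(naasc_dir)"
```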

Now you have two options. You can either reduce data over a forwarded X connection or you can use VNC.
  • The only case where a forwarded X connection is superior to VNC is for trivial and quick operations inside the NRAO network.
  • In all other cases, use VNC.

Forwarded X: this is so needlessly complicated that it could fill three chapters of an O'Reilly book. If you require this, please send Dan Klopp an email with the specifics.

VNC: very simple. Use the latest and greatest script version (at this point in time located in /users/dklopp/thegreatescape). If you are running it from outside NRAO, try this (replace multivac01.cv.nrao.edu with whichever node you were assigned, and change "dklopp" to your user name):

/path/to/thegreatescape polaris.cv.nrao.edu multivac01.cv.nrao.edu dklopp

If you are running it inside NRAO, the easiest method is:

/path/to/thegreatescape localhost multivac01.cv.nrao.edu dklopp
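
The two invocations differ only in the first argument. A minimal sketch of that choice follows; escape_command is a hypothetical helper that prints the command line rather than running it, and /path/to/thegreatescape remains a placeholder for wherever the script actually lives.

```shell
# Hypothetical helper: print the thegreatescape command line for your location.
# Usage: escape_command <inside|outside> <node> <username>
escape_command() {
    case "$1" in
        inside)  _gateway=localhost ;;            # already on the NRAO network
        outside) _gateway=polaris.cv.nrao.edu ;;  # go through polaris first
        *) echo "location must be 'inside' or 'outside'" >&2; return 1 ;;
    esac
    # /path/to/thegreatescape is a placeholder; substitute the real location.
    printf '%s %s %s %s\n' /path/to/thegreatescape "$_gateway" "$2" "$3"
}
```

Printing the command first lets you check the node name and user name before anything is executed.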

Introduction

This title is a misnomer. But 'Lustre' is the word everyone uses to refer to the compute nodes, desktop access, data storage, and what Lustre actually is. This guide covers everything that people normally associate with 'Lustre' and how to access it. Unfortunately, I really only have myself to blame for this neologism.

This guide is open to the public as we have many remote clients accessing our compute storage.

Performance

If you want Lustre access to be fast, follow these guidelines:

  1. Always access NAASC Lustre from the compute "Multivac" nodes.
  2. Only use desktop access to NAASC Lustre for casual browsing, never data reduction.
  3. Never perform a recursive listing (ls -R)
  4. Ensure every file is at least 1MB in size (CASA, by and large, does this)
  5. Access files in parallel as much as possible (CASA does this)
  6. CV Lustre is slow and data reduction on it is not recommended
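
To check guideline 4 for a single directory, something like the following may help. It is a sketch, not an official tool: small_files is a hypothetical name, "1MB" is taken to mean 1048576 bytes, and the search deliberately stays in one directory rather than recursing.

```shell
# Hypothetical helper: list regular files directly inside a directory that are
# smaller than 1048576 bytes. It does not descend into subdirectories.
# Usage: small_files <directory>
small_files() {
    # The "c" suffix makes find compare exact byte counts, with no block rounding.
    find "$1" -maxdepth 1 -type f -size -1048576c
}
```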

The "Lustres"

There are two physically and logically distinct Lustre installations at NRAO's Charlottesville office: CV Lustre and NAASC Lustre. The distinction is functional:

  • NAASC Lustre: high-performance scratch area for NAASC-related data reduction.

  • CV Lustre: general storage for CV staff, up to 2 TB. For more details please see TODO: Attach document.

These are DISTINCT installations. For example, dklopp's directory on NAASC Lustre can be found via:

/lustre/naasc/dklopp

Whereas on CV Lustre it is:

/lustre/cv/dklopp

These areas are NOT the same. You can copy data from one to the other to make them look the same, but they are not synchronized. Best case synchronization speed would be 100 MB/s.
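
To put the 100 MB/s figure in perspective, here is a rough lower bound on copy time. The helper name min_copy_seconds is hypothetical, and real transfers will usually be slower than this best case.

```shell
# Hypothetical helper: lower bound, in whole seconds, on the time needed to
# copy a given number of megabytes at a given rate in MB/s.
# The rate defaults to 100, the best case quoted above.
# Usage: min_copy_seconds <megabytes> [rate_mb_per_s]
min_copy_seconds() {
    _rate=${2:-100}
    # Integer division, rounded up so that a partial second still counts.
    echo $(( ($1 + $_rate - 1) / $_rate ))
}
```

Taking 1 TB as 1,000,000 MB, min_copy_seconds 2000000 gives 20000 seconds, so copying a full 2 TB takes at least about five and a half hours.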

Storage Limits

NAASC Lustre (/lustre/naasc) is limited to 2 TB of storage per user.

CV Lustre (/lustre/cv) is limited to 2 TB or 100,000 inodes (whichever comes first) of storage per user. If you require more, please see this document TODO: Place link here on people's usage and limits.
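
The CV Lustre rule can be written out as a simple comparison. This sketch assumes 2 TB means 2 × 10^12 bytes (the text does not say whether the figure is decimal or binary), and cv_limit_status is a hypothetical name. How you obtain your usage numbers is not covered here.

```shell
# Hypothetical check against the CV Lustre limits quoted above.
# Assumption: 2 TB = 2000000000000 bytes.
# Usage: cv_limit_status <bytes_used> <inodes_used>
cv_limit_status() {
    if [ "$1" -gt 2000000000000 ]; then
        echo "over byte limit"
    elif [ "$2" -gt 100000 ]; then
        echo "over inode limit"
    else
        echo "within limits"
    fi
}
```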

Remote VNC Access

TODO: Link Mark Lacy's area on this subject and reference 'thegreatescape'

Lustre Access

Lustre can be accessed via most desktops, and all the compute "Multivac" nodes.

  • For data reduction, please use /lustre/naasc .

  • For 2 TB data storage, please use /lustre/cv . TODO: Place link here on people's usage and limits.

Where Lustre Can Be Found

All compute "Multivac" nodes and desktops.

Where Lustre Cannot Be Found

On all servers unless explicitly mentioned otherwise.
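
If you are unsure whether the machine you are logged in to has Lustre, a quick test is to look for the directory. This is a sketch with a hypothetical helper name, has_lustre. A directory that exists is a good hint but not proof that the filesystem is actually mounted.

```shell
# Hypothetical check: succeed if the given Lustre path exists as a directory
# on this host. Defaults to /lustre/naasc.
# Usage: has_lustre [path]
has_lustre() {
    [ -d "${1:-/lustre/naasc}" ]
}

# Example: has_lustre /lustre/cv && echo "CV Lustre is visible here"
```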

Copying Between CV and NAASC Lustre

The ideal place to copy data is on the compute "Multivac" nodes or the CV compute "Zuul" nodes. If Dan Klopp wanted to copy a file called testdoc.odt from his NAASC Lustre directory to his CV Lustre directory, all he would do is:

cp /lustre/naasc/dklopp/testdoc.odt /lustre/cv/dklopp/

I strongly encourage performing copies directly on either the "Multivac" or "Zuul" nodes.
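
Because the two installations are not synchronized, it can be worth confirming that a copy arrived intact. A minimal sketch follows; copy_verified is a hypothetical helper, and it uses only cp, cmp, and basename.

```shell
# Hypothetical helper: copy one file into a directory, then compare the copy
# byte for byte against the original. Fails if either step fails.
# Usage: copy_verified <source_file> <destination_directory>
copy_verified() {
    cp "$1" "$2"/ || return 1
    cmp -s "$1" "$2/$(basename "$1")"
}

# Example: copy_verified /lustre/naasc/dklopp/testdoc.odt /lustre/cv/dklopp
```

Note that the comparison reads both files again, so for large data sets it roughly doubles the I/O.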

Where to Crunch Data

Where you attempt to access the Lustre filesystem will in part determine the performance.

For best performance, please reserve access to one of the compute "Multivac" nodes as described in this document.

Though you can use your desktop, the bandwidth from Lustre to your desktop is several orders of magnitude lower than to the compute "Multivac" nodes.

How to Crunch Data

http://casaguides.nrao.edu

Please be sure to put all your data in your Lustre area.
Topic revision: r11 - 2013-02-12, DanielKlopp