NOTE: This is a draft
I Just Want To Reduce Data, Not Read All This
These are minimalist instructions. If you have problems, refer to the extensive documentation outside of this section.
First, make sure you are inside the NRAO network. If you are not, ssh to polaris.cv.nrao.edu.
Reserve a node for 2 days (change the duration to your preference):
nodescheduler --request 2.0
When it is ready, you will receive an email at your NRAO address. It may take up to one minute after the email arrives before you have access. Check which nodes you were given with:
nodescheduler --list mynodes
Then log in to them. Lustre is located at /lustre/naasc. You should have a directory there named after your NRAO username. For example, Dan Klopp's data reduction directory is:
/lustre/naasc/dklopp
Now you have two options. You can either reduce data over a forwarded X connection or you can use VNC.
- The only case where a forwarded X connection is superior to VNC is for trivial and quick operations inside the NRAO network.
- In all other cases, use VNC.
Setting up a forwarded X connection is so needlessly complicated that it could fill three chapters of an O'Reilly book. If you require this, please send Dan Klopp an email with the specifics.
VNC is very simple: use the latest and greatest script version (at this point in time located in /users/dklopp/thegreatescape). If you are running it from outside NRAO, try this (replace multivac01.cv.nrao.edu with whichever node you were assigned, and change "dklopp" to your user name):
/path/to/thegreatescape polaris.cv.nrao.edu multivac01.cv.nrao.edu dklopp
If you are running it inside NRAO, the easiest method is:
/path/to/thegreatescape localhost multivac01.cv.nrao.edu dklopp
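The two invocations differ only in their first argument. A minimal sketch of a helper that picks it for you is shown here. It assumes (this is not stated in this guide) that machines inside NRAO have a host name ending in .nrao.edu. The function names are hypothetical, and /path/to/thegreatescape is the same placeholder used above.

```shell
# Choose the first argument to thegreatescape.
# ASSUMPTION: hosts inside NRAO have names ending in ".nrao.edu".
choose_gateway() {
    case "$1" in
        *.nrao.edu) echo "localhost" ;;            # already inside NRAO
        *)          echo "polaris.cv.nrao.edu" ;;  # outside: go through polaris
    esac
}

# Print (do not run) the full command line.
# $1 = this machine's host name, $2 = assigned node, $3 = NRAO user name
escape_command() {
    echo "/path/to/thegreatescape $(choose_gateway "$1") $2 $3"
}
```

For example, escape_command "$(hostname -f)" multivac01.cv.nrao.edu dklopp prints the command line to run, so you can inspect it before executing it.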
This title is a misnomer, but 'Lustre' is the word everyone uses to refer to the compute nodes, desktop access, data storage, and what Lustre actually is. This guide covers everything that people normally associate with 'Lustre' and how to access it. Unfortunately, I really only have myself to blame for this neologism.
This guide is open to the public as we have many remote clients accessing our compute storage.
If you want Lustre access to be fast, follow these guidelines:
- Always access NAASC Lustre from the compute "Multivac" nodes.
- Only use desktop access to NAASC Lustre for casual browsing, never for data reduction.
- Never perform a recursive listing (ls -R).
- Ensure every file is at least 1 MB in size (CASA, by and large, does this).
- Access files in parallel as much as possible (CASA does this).
- CV Lustre is slow, and data reduction on it is not recommended.
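To check the 1 MB guideline on a single directory, something like the following could be used. It is only a sketch: the function name is hypothetical and 1 MB is taken to mean 1048576 bytes. Because walking a directory tree is itself metadata-heavy, it is best pointed at one modest directory from a Multivac node rather than at your whole area.

```shell
# List regular files under directory $1 that are smaller than 1 MB.
# ASSUMPTION: "1 MB" means 1048576 bytes.
small_files() {
    # The "c" suffix makes find compare exact byte counts (no block rounding).
    find "$1" -type f -size -1048576c
}
```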
There are two physically and logically distinct Lustre installations at NRAO's Charlottesville office. One is called CV Lustre; the other is NAASC Lustre. The distinction is functional:
- NAASC Lustre: high performance scratch area for NAASC related data reduction.
- CV Lustre: general storage for CV staff, up to 2 TB. For more details please see TODO: Attach document.
These are DISTINCT installations. For example, dklopp's directory on NAASC Lustre can be found via:
/lustre/naasc/dklopp
Whereas on CV Lustre it is:
/lustre/cv/dklopp
These areas are NOT the same. You can copy data from one to the other to make them look the same, but they are not synchronized. Best case synchronization speed would be 100 MB/s.
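To put the 100 MB/s figure in perspective, here is a back-of-the-envelope sketch of the minimum copy time. The function name is hypothetical, and decimal units (1 TB = 1,000,000 MB) are assumed.

```shell
# Best-case number of seconds needed to copy $1 megabytes at 100 MB/s.
min_copy_seconds() {
    rate_mb_per_s=100
    # Round up so that a partial second still counts as a full one.
    echo $(( ($1 + rate_mb_per_s - 1) / rate_mb_per_s ))
}
```

A full 2 TB area gives min_copy_seconds 2000000, which is 20000 seconds, or roughly five and a half hours. Real copies should be expected to take longer.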
NAASC Lustre (/lustre/naasc) is limited to 2 TB of storage per user.
CV Lustre (/lustre/cv) is limited to 2 TB or 100,000 inodes (whichever comes first) of storage per user. If you require more, please see this document TODO: Place link here on people's usage and limits.
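The "whichever comes first" rule can be expressed as a small check. This is a sketch only: the function name is hypothetical, 2 TB is treated as 2 * 1024^3 KiB, and how usage is actually measured at NRAO is not stated in this guide.

```shell
# Succeeds (exit 0) if usage is within the CV Lustre limits quoted above.
# $1 = usage in KiB (for example from du -sk), $2 = number of inodes
# ASSUMPTION: 2 TB is treated as 2 * 1024^3 KiB.
within_cv_limits() {
    max_kib=$((2 * 1024 * 1024 * 1024))
    max_inodes=100000
    # Exceeding either limit is enough to fail the check.
    [ "$1" -le "$max_kib" ] && [ "$2" -le "$max_inodes" ]
}
```

For example, within_cv_limits 1000 100001 fails even though very little space is used, because the inode limit has been passed.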
Remote VNC Access
TODO: Link Mark Lacy's area on this subject and reference 'thegreatescape'
Lustre can be accessed via most desktops, and all the compute "Multivac" nodes.
- For data reduction, please use /lustre/naasc .
- For 2 TB data storage, please use /lustre/cv . TODO: Place link here on people's usage and limits.
Where Lustre Can Be Found
All compute "Multivac" nodes and desktops.
Where Lustre Cannot Be Found
On all servers unless explicitly mentioned otherwise.
Copying Between CV and NAASC Lustre
The ideal place to copy data is on the compute "Multivac" nodes or the CV compute "Zuul" nodes. If Dan Klopp wanted to copy a file called testdoc.odt from his NAASC Lustre directory to his CV Lustre directory, all he would do is:
cp /lustre/naasc/dklopp/testdoc.odt /lustre/cv/dklopp/
I strongly encourage performing copies directly on either the "Multivac" or "Zuul" nodes.
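Since the two areas are not synchronized, it can be worth confirming that a copy matches its source. A minimal sketch using only the standard cp and cmp tools follows; the function name is hypothetical.

```shell
# Copy file $1 into directory $2, then confirm the copy is byte-identical.
copy_and_verify() {
    src=$1
    dest_dir=$2
    cp -p "$src" "$dest_dir/" || return 1
    # cmp -s prints nothing and exits non-zero if the files differ.
    cmp -s "$src" "$dest_dir/$(basename "$src")"
}
```

For the example above this would be copy_and_verify /lustre/naasc/dklopp/testdoc.odt /lustre/cv/dklopp, which exits with a non-zero status if the copy failed or the files differ.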
Where to Crunch Data
Where you attempt to access the Lustre filesystem will in part determine the performance.
For best performance, please reserve access to one of the compute "Multivac" nodes as described in this document.
Though you can use your desktop, the bandwidth from Lustre to your desktop is several orders of magnitude lower than the bandwidth to the compute "Multivac" nodes.
How to Crunch Data
Please be sure to put all your data in your Lustre area.
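A reduction script can guard against accidentally working outside Lustre. The sketch below is one way to do that. The function name is hypothetical, and it only checks the path prefix, not whether the directory actually exists.

```shell
# Succeeds if path $1 lies inside NAASC Lustre (/lustre/naasc).
in_naasc_lustre() {
    case "$1" in
        /lustre/naasc/*) return 0 ;;
        *)               return 1 ;;
    esac
}
```

A script could start with in_naasc_lustre "$PWD" || echo "Move your data to /lustre/naasc first" >&2 to warn before any work is done in the wrong place.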