The NAASC Cluster and Lustre filesystems - etiquette and helpful hints


Lustre Access

For access to the Lustre filesystem, please submit a helpdesk ticket to helpdesk-cv. Your machine must be booted with a different kernel before it can see the Lustre filesystem. Note too that reboots may not always pick up the right kernel; if you lose the Lustre filesystem, that is probably what happened, and again a helpdesk ticket should sort it out.

Note that the Lustre system is still experimental, and your data may get wiped at short notice. Please treat it as scratch space and back up any important reductions to your desktop. Right now we are asking everyone to stick to a voluntary quota of 2TB. If you need to exceed this for any reason let me know and we'll see how much space is available.

Cluster Access

Reserving Cluster Nodes

  • Log into cvpost-master: ssh -Y bnaked@cvpost-master.cv.nrao.edu
  • If you want to see what the syntax is for the nodescheduler command, just type: nodescheduler
-bash-4.1$ nodescheduler

Usage:
  nodescheduler --request <days>[:hh][:mm] <nodecount>

  nodescheduler --terminate [#|me]
Examples:
  • Request two half-nodes for fourteen days: nodescheduler --r 14 2
  • Request a single half-node for 15 days: nodescheduler --r 15 1
  • Request a single full-node for 15 days: nodescheduler --w 15 1

Getting Information About Cluster Nodes

-bash-4.1$ qstat -n1 | tail -n +6 | sort -k2
80.cvpost-serv-2.cv.nr  acosta      interact interactive_j    119723   --     --        --  1080:00:0 R  82:23:05   cvpost008/0-5
118.cvpost-serv-2.cv.n  aginsbur    interact interactive_j     50930   --     --        --  336:00:00 R  68:59:35   cvpost019/6-13
129.cvpost-serv-2.cv.n  alipnick    interact interactive_j     47726   --     --        --  720:00:00 R  67:39:17   cvpost022/0-7
...
  • Terminate a reservation:
    • Log into cvpost-master: ssh -Y bnaked@cvpost-master.cv.nrao.edu
    • nodescheduler --terminate 191.cvpost-serv-1.cv.nrao.edu
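
To find your own reservation (and its job id for --terminate) in the qstat listing above, the output can be filtered by user name. The following is only a sketch: the function name my_jobs is hypothetical, and it assumes the column layout shown in the sample output, where user names appear to be cut to 8 characters.

```shell
# Filter "qstat -n1" output (read from stdin) down to one user's jobs.
# Prints: job id, elapsed time, node/cores.
my_jobs() {
    # The sample listing shows user names truncated to 8 characters,
    # so compare against the truncated name (an assumption).
    awk -v u="$1" 'BEGIN { u = substr(u, 1, 8) }
                   $2 == u { print $1, $(NF-1), $NF }'
}

# Example use on cvpost-master (not run here):
#   qstat -n1 | tail -n +6 | my_jobs "$USER"
```

The job id in the listing is truncated as well, so treat it as a hint for locating the full id rather than as something to paste directly into nodescheduler --terminate.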

You can find many more lustre node interaction commands on the Clever Notes on the Cluster Nodes page.

Now that you have a node available, here is how to access it using VNC.

Quick Start VNC Connection Instructions

NOTE: for visitors with the "observer" accounts (e.g., cv-481) there is documentation on the Science Site that includes simple VNC instructions.

Detailed instructions are given below which describe each of the steps in this section. For the impatient, I have listed the steps one needs to take to start a VNC session. This assumes that you:
  1. Have already checked-out a cvpost node (cvpost666 in this example)
  2. Are on the NRAO network or are tunneling your connection through ssh
  3. Have already set your vnc password.

  1. From a terminal on your local computer:
    1. ssh cvpost666.cv.nrao.edu
    2. vncserver -geometry 1400x900
    3. The server responds with: New 'cvpost666:1 (jmangum)' desktop is cvpost666:1. The VNC session number is therefore X=1.
  2. IF YOU ARE ON THE NRAO NETWORK (either via ethernet, or through a VPN client like Cisco AnyConnect):
    1. Start your VNC client and configure your connection using Host=cvpost666.cv.nrao.edu. Port=X (from above), vnc password, allow other clients. You do NOT have to set "Tunnel over SSH".
  3. IF YOU ARE NOT ON THE NRAO NETWORK:
    1. From another terminal on your local computer:
      1. Pick a free local port 590Y. Here we assume port 5901 is not being used, so that Y=1.
      2. ssh -N -C -L 5901:cvpost666.cv.nrao.edu:5901 YOURLOGIN@ssh.cv.nrao.edu
    2. Start your VNC client and configure your connection using "localhost" as the connection address.
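
The mapping from the vncserver reply to the tunnel command in the steps above can be sketched as follows. The function names vnc_display and vnc_tunnel_cmd are hypothetical helpers, not part of any NRAO tooling. The sketch relies on the usual VNC convention that display X listens on TCP port 5900+X, which is what "590X" abbreviates for single-digit displays.

```shell
# Extract the display number X from a vncserver reply read on stdin,
# e.g. "New 'cvpost666:1 (jmangum)' desktop is cvpost666:1" gives 1.
vnc_display() {
    sed -n 's/.*desktop is [^:]*:\([0-9][0-9]*\).*/\1/p'
}

# Print (do not run) the ssh tunnel command.
# Usage: vnc_tunnel_cmd NODE X LOGIN [Y]   (Y defaults to X)
vnc_tunnel_cmd() {
    node=$1; x=$2; login=$3; y=${4:-$x}
    echo "ssh -N -C -L $((5900 + y)):${node}.cv.nrao.edu:$((5900 + x)) ${login}@ssh.cv.nrao.edu"
}

# Example (prints the command used in the steps above):
#   vnc_tunnel_cmd cvpost666 1 YOURLOGIN
```

Computing the port as 5900+X rather than pasting the digit after "590" keeps the command correct if the session number ever reaches 10 or more.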

More detailed VNC Connection Instructions

Setting Up Your Xserver

NOTE: for most staff, this should be unnecessary. I recommend skipping this section and going to "Starting Your VNC Server" instead. Staff accounts are already configured to use Gnome or KDE, and you should never see twm. -- PatrickMurphy

If you have not done so already, you need to set up your X Windows environment in your home directory on ssh.cv.nrao.edu before starting a VNC session:

  1. Log into your home account on ssh.cv.nrao.edu: ssh -Y USERNAME@ssh.cv.nrao.edu
  2. Edit your .vnc/xstartup file as follows:
    • Uncomment the two lines "unset SESSION_MANAGER" and "exec /etc/X11/xinit/xinitrc"
    • Change the final line from "twm &" to either "startkde &" or "gnome-session &"

Starting Your VNC Server

In the following we assume that you have an active reservation on the lustre node cvpost666:

  1. If this is your first time running VNC on the NRAO lustre computing system, you need to set your VNC password as follows:
    1. Log into the lustre node: ssh -Y cvpost666.cv.nrao.edu
    2. In the terminal window on the remote (lustre) computer, type at the Linux prompt: vncpasswd
    3. Enter your chosen VNC server password. Remember, this password MUST be different from your NRAO Linux account password as it will be shared with support staff when you require assistance.
  2. Start the VNC server on the lustre node computer:
    1. Log into the lustre node (you can use the same login session as you started above if you wish): ssh -Y cvpost666.cv.nrao.edu
    2. Once you are logged into the lustre node, and regardless of your remote computer:
      1. At the Linux prompt on cvpost666, type: vncserver. If you have a big monitor connected, you can use something like vncserver -geometry 1400x900, which will give you a lot of real estate to work with during your VNC session but will not consume your entire monitor screen.
      2. Once you have typed vncserver, the system will reply (e.g.) New 'cvpost666:1 (USERNAME)' desktop is cvpost666:1. In this case, your VNC session number is 1. Remember this number since you will need it later. In the instructions below, the VNC session number is designated as X.
  3. Open up an SSH tunnel from your local computer to the lustre node. Do this by starting another terminal on your local computer and typing (the "Y" is a free local port on your machine, while the "X" in the port number is your VNC server session number from above): ssh -N -C -L 590Y:cvpost666.cv.nrao.edu:590X YOURLOGIN@ssh.cv.nrao.edu. For the "Y" in the above, consider the following:
    • If you are running the Apple Remote Desktop Manager (installed in October 2007 on many Macs in CV by CIS), you cannot use "0", as this remote desktop application uses port 5900.
    • A good start is to try "1", but higher numbers may be necessary.
    • It's easiest to make X = Y for simplicity.
    • If you have trouble finding a free port use netstat -a | grep 59 to list all ports in use.
  4. Start the VNC viewer on your local computer. There are quite a few VNC client viewers in use:
    1. Screen Sharing on OS X (built-in): available in newer versions of Mac OS X:
      1. Open the Finder.
      2. Under the "Go" pulldown menu, choose "Connect to Server" (or type Command-K)
      3. Specify vnc://cvpost666.cv.nrao.edu:590X (where X is specified as above).
      4. Enter your password as above. You can manage this password in your keychain if you want.
    2. Chicken: A favourite of Mac users. It has recently shown signs of its age and lack of support, though.
      1. Start Chicken.
      2. Go to Connection, Open Connection
      3. Enter Host = localhost, Display = Y, your password, check Remember Password, and check Allow other clients to connect. The "Y" in the display parameter above is the "Y" in the port number chosen above.
      4. Hit the Connect button.
    3. RealVNC: A free VNC viewer that works well on Macs.
      1. Start RealVNC.
      2. From the VNC Viewer menu select Launch Listening VNC Viewer. This will start a listener process from which you can open multiple VNC connections. An icon on the top menu bar will appear.
      3. From the VNC listening icon select New Connection...
      4. Enter your VNC server address and port number: localhost:1 (assuming you have set up an ssh tunnel as instructed above).
      5. Select Connect. You will get a warning about this being an insecure connection, even if you have tunneled through ssh (it does not seem to be smart enough to know that you have set up an ssh tunnel...). Select Continue. This behaviour is normal. Rest assured that your connection is secure.
      6. You will next get a popup asking for your "Password". This is your VNC password that RealVNC wants. Enter it and select OK.
    4. Tiger VNC:
      1. Similar to RealVNC (will write instructions for this later...)
    5. vncviewer: Apparently a linux-based VNC client that I have not used...
    6. remmina: the default remote desktop client on Ubuntu. Details TBD.
  5. The VNC Viewer window to cvpost666 will now appear.
  6. If you want to be able to cut-copy-paste to-from your local computer and your VNC session, type the command vncconfig in one of your VNC session's terminals. This trick seems to work for the RealVNC and TigerVNC clients.
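
Choosing the local "Y" in step 3 above amounts to finding the first port 590Y that is not already in use. A minimal sketch, assuming you have already reduced the netstat output to a plain list of port numbers (how to do that differs between Mac and Linux netstat, so it is left out here); the function name first_free_display is hypothetical.

```shell
# Read in-use TCP port numbers on stdin (one per line) and print the first
# Y in 1..99 whose port 5900+Y is not in the list.
# The search starts at 1 so that port 5900 (Apple Remote Desktop) is never chosen.
first_free_display() {
    used=$(cat)
    y=1
    while [ "$y" -le 99 ]; do
        port=$((5900 + y))
        if ! printf '%s\n' "$used" | grep -qx "$port"; then
            echo "$y"
            return 0
        fi
        y=$((y + 1))
    done
    return 1   # nothing free in the range searched
}

# Example: with 5900, 5901 and 5903 in use, this prints 2
#   printf '5900\n5901\n5903\n' | first_free_display
```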

How to Transfer Files Through an SSH Tunnel

You will need to set up an ssh tunnel, separate from the one(s) you have set up for your VNC connection, to transfer files from the lustre node to your local machine if you are not connected to the NRAO network. As above, assume you have reserved cvpost666:

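  1. Open a terminal window and type: ssh -L 2222:cvpost666.cv.nrao.edu:22 YOURLOGIN@ssh.cv.nrao.edu. After entering your password you will be logged into ssh.cv.nrao.edu, and local port 2222 on your machine will be forwarded to the ssh port (22) on cvpost666. Leave this session open while you transfer files.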
  2. Open a second terminal window and change to the directory you want to copy the files from cvpost666 to.
  3. To copy an entire directory from cvpost666 to your local machine: scp -r -P 2222 YOURLOGIN@localhost:/lustre/naasc/YOURLOGIN/DIRECTORY . (the trailing "." is the destination, i.e. your current local directory).
  4. To copy just a single file from cvpost666 to your local machine: scp -r -P 2222 YOURLOGIN@localhost:/lustre/naasc/YOURLOGIN/FILENAME . (again with "." as the destination). The -r switch is not necessary here, but I usually just include it so that I can use the same command for file and directory transfer.
  5. To copy files from your local machine to cvpost666: scp -r -P 2222 FILENAME YOURLOGIN@localhost:/lustre/naasc/YOURLOGIN/.
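
The pull and push forms above differ only in which side of the scp command names localhost. A small sketch that prints the command for either direction; tunnel_scp_cmd is a hypothetical helper, and it assumes the /lustre/naasc/LOGIN layout and the local port 2222 used in the steps above.

```shell
# Print (do not run) the scp command for a transfer through the tunnel.
# Usage: tunnel_scp_cmd pull|push LOGIN NAME [PORT]
tunnel_scp_cmd() {
    direction=$1; login=$2; name=$3; port=${4:-2222}
    # The remote side is always "localhost": the tunnel forwards it to cvpost666.
    remote="${login}@localhost:/lustre/naasc/${login}"
    case "$direction" in
        pull) echo "scp -r -P ${port} ${remote}/${name} ." ;;
        push) echo "scp -r -P ${port} ${name} ${remote}/" ;;
        *)    echo "usage: tunnel_scp_cmd pull|push LOGIN NAME [PORT]" >&2
              return 1 ;;
    esac
}

# Example:
#   tunnel_scp_cmd pull YOURLOGIN DIRECTORY
```

Note that scp takes the port with an upper-case -P, unlike ssh, which uses a lower-case -p.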

How to transfer files directly between the JAO clusters and the CV cluster (or any machine behind a firewall)

You need to have a CV cluster node reserved, and define a temporary ssh tunnel, as follows:
  • Check-out a node on the CV cluster (see instructions above)
  • Create a terminal window on the CV cluster node
  • Type this command in that window:
    • ssh -N -f -L 2122:casa06.sco.alma.cl:22 username@login.alma.cl
  • Now, still on the CV cluster node:
    • If you need to pull the files from JAO:
      • cd to the location that you want to place the files
      • In the following command, replace username with your username, but do not change localhost. When transferring large files, the --partial-dir option allows one to resume from where you left off if the connection gets broken partway through a large file.
      • rsync -vau --progress --partial-dir=.rsync -e 'ssh -p 2122' username@localhost:/mnt/jaosco/data/whatever .
    • If you need to push the files to JAO:
      • cd to the parent directory of the file or directory you need to push
      • In the following command, replace username with your username, but do not change localhost
      • rsync -vau --progress --partial-dir=.rsync -e 'ssh -p 2122' myDirectory username@localhost:/mnt/jaosco/data/whatever/
  • Notes:
    • In the ssh command, the password that you give is your password on login.alma.cl.
    • In the rsync command, the password that you give is your password on casa06.sco.alma.cl.
    • If the rsync command gives you the dreaded "man-in-the-middle attack" error, then remove the entry for localhost in your ~/.ssh/known_hosts file.
    • If you need to pull multiple files or directories you can repeat the whole string "username@localhost:/mnt..." as many times as you want, ending with a " . ", and you will only need to give your password once!
    • When you are done you can run "ps ax | grep ssh" to find the ssh tunnel process id, and then kill it.
    • Having problems? Email Todd.
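
The "ps ax | grep ssh" step in the notes above can be narrowed so that only the tunnel process is reported. A sketch, assuming the common "ps ax" layout in which the command starts in the fifth column; the function name tunnel_pids is hypothetical.

```shell
# Read "ps ax" output on stdin and print the PID of every ssh process
# that forwards the given local port (default 2122).
tunnel_pids() {
    port=${1:-2122}
    # Require the command itself to be ssh, so that grep or awk processes
    # whose arguments merely mention ssh are not reported.
    awk -v pat="-L ${port}:" '$5 ~ /(^|\/)ssh$/ && index($0, pat) { print $1 }'
}

# Example (not run here): list the tunnel PID, then kill it by hand
#   ps ax | tunnel_pids 2122
#   kill PID
```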

How to transfer files directly between your home machine and the CV cluster (same as above except for the machine names)

You need to define a temporary ssh tunnel, as follows:
  • Create a terminal window on your home machine.
  • Type this command in that window:
    • ssh -N -f -L 2122:cvpost666.cv.nrao.edu:22 username@polaris.cv.nrao.edu
  • Now, still on your home machine:
    • If you need to pull the files from the cluster:
      • cd to the location that you want to place the files
      • In the following command, replace username with your username, but do not change localhost. When transferring large files, the --partial-dir option allows one to resume from where you left off if the connection gets broken partway through a large file.
      • rsync -vau --progress --partial-dir=.rsync -e 'ssh -p 2122' username@localhost:/lustre/naasc/whatever .
    • If you need to push the files to the cluster:
      • cd to the parent directory of the file or directory you need to push
      • In the following command, replace username with your username, but do not change localhost
      • rsync -vau --progress --partial-dir=.rsync -e 'ssh -p 2122' myDirectory username@localhost:/lustre/naasc/whatever/
  • Notes:
    • In the ssh command, the password that you give is your password on polaris.
    • In the rsync command, the password that you give is your password on cvpost666 (which is often the same).
    • If the rsync command gives you the dreaded "man-in-the-middle attack" error, then remove the entry for localhost in your ~/.ssh/known_hosts file.
    • If you need to pull multiple files or directories you can repeat the whole string "username@localhost:/lustre..." as many times as you want, ending with a " . ", and you will only need to give your password once!
    • When you are done you can run "ps ax | grep ssh" to find the ssh tunnel process id, and then kill it.
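
For the "man-in-the-middle attack" note above: OpenSSH records a host reached on a non-default port as [host]:port in known_hosts, and ssh-keygen -R removes an entry by that name, which avoids editing the file by hand. A sketch that builds the name; known_hosts_name is a hypothetical helper.

```shell
# Print the name under which OpenSSH stores a host key in known_hosts:
# plain "host" for port 22, "[host]:port" otherwise.
known_hosts_name() {
    host=$1; port=${2:-22}
    if [ "$port" -eq 22 ]; then
        echo "$host"
    else
        echo "[$host]:$port"
    fi
}

# Example (not run here): remove the stale key for the tunnel on port 2122
#   ssh-keygen -R "$(known_hosts_name localhost 2122)"
```

Because the tunnel endpoint is always "localhost", this entry goes stale whenever the same local port is pointed at a different machine.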

How to Shuffle Files Between /lustre/naasc and /lustre/cv

As space on /lustre/naasc often reaches a critically-low state, it can often be useful to find a "holding pen" for projects that you are working on but perhaps not in the short-term. One can use /lustre/cv for this purpose. In the following I describe how to use rsync to perform this shuffling operation:

  • Reserve a lustre node as described above.
  • Open an SSH tunnel as described above. For example, say you have reserved lustre node 666 and started a VNC server which uses port 5901 on that lustre node. Furthermore, you don't have any other local ssh sessions running, so you can use local port 5901:
    • ssh -N -C -L 5901:cvpost666.cv.nrao.edu:5901 YOURLOGIN@ssh.cv.nrao.edu
  • Start your VNC server as described above.
  • Determine if you really have a data space overuse problem. Check by using the following command from your /lustre/naasc/YOURLOGIN directory, which will give you a size-ordered listing (in MB) of the directories under YOURLOGIN (think of it as a "hit-list"):
    • du -m --max-depth=1 | sort -rn
  • Using your VNC session to talk to the lustre node you have reserved, use the following rsync command to copy all of the files from your CrappyData directory on /lustre/naasc/YOURLOGIN to /lustre/cv/YOURLOGIN:
    • rsync -av --progress --partial CrappyData /lustre/cv/YOURLOGIN/
    • where:
      • -a == Stands for "archive" and syncs recursively and preserves symbolic links, special and device files, modification times, group, owner, and permissions.
      • -v == Be chatty about it.
      • --progress == Tell me how the file transfer is going, with file transfer percentages and other useful information.
      • --partial == This option allows you to resume the transfer if it is interrupted (due to network failure, for example). Note that one can combine the --progress and --partial options with the -P switch.
  • Once the transfer is complete, use ssh to log into the main file server (ssh.cv.nrao.edu) and go to /lustre/cv/YOURLOGIN/CrappyData.
  • Check that the files have really made it by using du -sm to see the total size of the files transferred.
  • Compare the transferred data size to that of the directory you wanted to transfer. They should be equal (or within a MB). If it looks like everything transferred, delete the contents of the directory on /lustre/naasc/YOURLOGIN/CrappyData.
  • If you get to the point where you want to shuffle your data back from /lustre/cv to /lustre/naasc, simply reverse the source and target on the rsync command above. From your /lustre/cv/YOURLOGIN directory do the following:
    • rsync -av --progress --partial CrappyData /lustre/naasc/YOURLOGIN/
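
The size comparison described above (du -sm on both copies, equal to within a MB) can be scripted so that the delete step is only reached when the check passes. A minimal sketch; the function name same_size_mb is hypothetical. A size match is only a coarse check and does not prove the file contents are identical.

```shell
# Succeed (exit status 0) when two directories differ in total size
# by at most 1 MB, as reported by du -sm.
same_size_mb() {
    a=$(du -sm "$1" | cut -f1)
    b=$(du -sm "$2" | cut -f1)
    diff=$((a - b))
    # ${diff#-} strips a leading minus sign, giving the absolute value.
    [ "${diff#-}" -le 1 ]
}

# Example (not run here):
#   same_size_mb /lustre/naasc/YOURLOGIN/CrappyData /lustre/cv/YOURLOGIN/CrappyData \
#       && echo "sizes match; safe to clean up the original"
```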

How to Remotely Mount the Lustre File System

You need to use the sshfs program to do this (for Macs, you can install this from Managed Software Center). The syntax for sshfs is as follows:
  • sshfs [user@]host:/path/to/what/you/want  /local/mount/point
After you have sshfs installed do the following:
  • Create a directory to be the mount-point for the remote system:
    • cd (go to your home directory; not necessary, but more intuitive)
    • mkdir lustre
    • mkdir lustre/naasc
  • sshfs from cvpost-master:
    • sshfs cvpost-master:/lustre/naasc ./lustre/naasc
  • Check the mount point:
    • ls lustre/naasc
You should be good-to-go now. To use this new mount point, you can for example point a browser to a file on /lustre/naasc.
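
The mount steps above, together with the matching unmount, can be summarised as follows. This is a sketch only: lustre_mount_cmds is a hypothetical helper that prints the commands rather than running them. The unmount command is not covered in the text above; sshfs mounts are normally released with fusermount -u on Linux and with umount on a Mac.

```shell
# Print the commands to create the mount point, mount /lustre/naasc via
# sshfs, and unmount it again. BASE defaults to your home directory.
# Usage: lustre_mount_cmds [BASE]
lustre_mount_cmds() {
    base=${1:-$HOME}
    mnt="$base/lustre/naasc"
    echo "mkdir -p $mnt"
    echo "sshfs cvpost-master:/lustre/naasc $mnt"
    case "$(uname -s)" in
        Linux) echo "fusermount -u $mnt" ;;
        *)     echo "umount $mnt" ;;
    esac
}

# Example:
#   lustre_mount_cmds "$HOME"
```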

Access to analysisUtils

The analysisUtils package contains functions necessary to calibrate ALMA data. It is stored in a CVS repository in Santiago. Checking out a personal copy requires a CVS account, which Todd and Stuartt have. You are welcome to use the copy in Todd's area (either in CV or SCO). Simply edit the file init.py in your ~/.casa directory to contain the following lines:
import sys
sys.path.append("/users/thunter/AIV/science/analysis_scripts/") 
import analysisUtils as aU

If you have problems, then email Todd.

How to set focus policy when you cannot access System menus from VNC

  • gconftool-2 --type string --set /apps/metacity/general/focus_mode mouse

Topic revision: 2018-12-03, JeffMangum