The data reduction is Python-based. Dysh is still under development and not yet stable, so we avoid it for now. Instead, the data reduction relies on custom Python code and astropy.

Getting started

If you are working with Pedro, he has hopefully already given you an overview of how to set things up. These instructions are meant to be used on GBO computers (GBO specifics in red), though they can be adapted to a general use case as well.

Where to work

You should work on one of the public data reduction machines. If you are outside of GBO, you can use this to get here (see also "Remote access to GBO computers" below).

You should keep your data in /home/scratch/<your user name>, not in your home (or ~) directory or other places.

Some frequently used Linux terminal commands

If we do not know our current working directory, we can print it by typing pwd, which stands for "print working directory", in the terminal. If we already know it, we can skip this step. For example, /home/scratch/mtasnim is my (the student's) working directory in this case. To change directory, we type cd /home/scratch/mtasnim/DriftScan or cd /home/scratch/mtasnim/DriftScan/GLANDS, depending on the directory we would like to access.

To see the contents of a directory, type ls and press Enter. To list only the files that match a pattern, type: ls *contentnamefragment.fileextension. For example, to see only the Bank E merged FITS files, we type: ls *merged_Bank_E.fits
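The same kind of pattern matching is available from Python, which can be handy inside a notebook. The following is a minimal sketch using only the standard library; the helper name list_matching is made up for this illustration and is not part of GLANDS.

```python
from pathlib import Path


def list_matching(directory, pattern):
    """Return the sorted names of the files in `directory` that match
    a shell-style pattern such as "*merged_Bank_E.fits"."""
    return sorted(p.name for p in Path(directory).glob(pattern))


if __name__ == "__main__":
    # List the Bank E merged files in the current directory, similar to
    # what `ls *merged_Bank_E.fits` shows in the terminal.
    for name in list_matching(".", "*merged_Bank_E.fits"):
        print(name)
```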

When we want to initialize

To use a different host, type ssh hostname in the terminal. For example: ssh newton

To make data cubes with gbtgridder, we do not need to follow the installation instructions on GitHub, since it is already installed for us. The command differs depending on whether we are gridding a single file or multiple files.

Let's say we want to grid the merged fits files for AGBT22B_234_02_pol_0 only. Go to the terminal and type:

gbtgridder -o merged_cube_1 -a 20 AGBT22B_234_02_pol_0_merged_Bank_E.fits

The name after -o sets the base name of the output files, and -a 20 tells gbtgridder to average channels in groups of 20 before gridding, since individual channels are very narrow. Once gridding has finished, type:

ls file_name_cube.fits

where file_name is the name given after -o. The file should appear in the listing, which is a quick way to verify that the gridder wrote its output.

If instead, we want to grid multiple files with the same extension, we can type in:

gbtgridder -o desiredfilename -a 20 *pol_0*cal_scan_*_bank_E.fits

This will grid all the polarization 0 scans in Bank E.
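If you prefer to launch the gridder from Python, the command can be assembled from a file pattern first and inspected before it runs. This is only a sketch: gridder_command is a hypothetical helper, and it uses only the -o and -a options shown above.

```python
import glob
import shlex


def gridder_command(output, naverage, pattern):
    """Build the gbtgridder argument list for every file matching `pattern`.

    Only the -o and -a options described in the text are used here.
    """
    files = sorted(glob.glob(pattern))
    if not files:
        raise FileNotFoundError(f"no files match {pattern!r}")
    return ["gbtgridder", "-o", output, "-a", str(naverage), *files]


if __name__ == "__main__":
    try:
        cmd = gridder_command("desiredfilename", 20, "*pol_0*cal_scan_*_bank_E.fits")
    except FileNotFoundError as err:
        print(err)
    else:
        # Print the command so it can be checked before running it.
        print(shlex.join(cmd))
```

Once the printed command looks right, it could be run with subprocess.run(cmd, check=True) from the same session.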

Let's say we want to check the size of the FITS files inside the DriftScan directory. To do so, we type: du -hs *.fits
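A rough Python counterpart is sketched below, in case you want the sizes inside a script or notebook. The helper names are made up for this example. Note that it reports the apparent file size in bytes, which can differ slightly from the disk usage that du reports.

```python
from pathlib import Path


def human_size(nbytes):
    """Format a byte count with binary units (B, K, M, G, T)."""
    size = float(nbytes)
    for unit in ("B", "K", "M", "G"):
        if size < 1024:
            return f"{size:.1f}{unit}"
        size /= 1024
    return f"{size:.1f}T"


def fits_sizes(directory):
    """Map each *.fits file name in `directory` to its size in bytes."""
    return {p.name: p.stat().st_size for p in sorted(Path(directory).glob("*.fits"))}


if __name__ == "__main__":
    for name, nbytes in fits_sizes(".").items():
        print(f"{human_size(nbytes):>8}  {name}")
```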

To exit from a terminal editor such as vi, type :q (a colon followed by q) and then press Enter.

To open a png file, type in: xdg-open image.png

To open a FITS cube in DS9, type in: ds9 file_name_cube.fits

Setting up a virtual environment

You can use the virtual environment in /home/scratch/psalas/projects/DriftScan/py3.11 if you like.

We want to avoid using the system-wide Python installation, as it is not meant to be modified. Instead, we will use our own Python installation in a virtual environment. To create a virtual environment and install some packages you can use:

~gbosdd/pythonversions/3.11/bin/python -m venv py3.11
source py3.11/bin/activate # This will "activate" the virtual environment. After this your shell prompt should have a (py3.11) at the start of the line (if not something might be wrong).
python -m pip install --upgrade pip # This will upgrade the package installer for Python.
python -m pip install wheel build setuptools # These are useful for installing other packages.
python -m pip install astropy numpy matplotlib pandas fitsio # Core working components.
python -m pip install jupyterlab ipympl # This is optional if you would like to use jupyterlab.

You can also directly use Python from the virtual environment as:
py3.11/bin/python
So, for example, if you wanted to install a package into the virtual environment without "activating" it you could use:
py3.11/bin/python -m pip install pandas # This would install pandas into the py3.11 virtual environment.
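To confirm which interpreter you are actually running, a short check like the one below can help. It is a sketch using only the standard library; it detects environments created with python -m venv as above, and it would not flag a conda environment as virtual.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment.

    Inside a venv, sys.prefix points at the environment while
    sys.base_prefix still points at the Python it was created from.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtualenv())
```

If the first line does not point inside your py3.11 directory, the environment is probably not activated.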

Alternatively, you can use conda to create and manage the virtual environments. Conda has already been initialized in our bash profile, so to enter the conda environment we only need to type conda activate DriftScan in the terminal.

If using conda instead, we might need to install the modules glands and fitsio before we can import them. To do this, type pip install -e . in the GLANDS directory for the glands module, and type pip install fitsio for the fitsio module.
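Whichever kind of environment you use, you can check which modules are still missing before starting to work. The sketch below relies only on the standard library; missing_packages is a hypothetical helper, and the list of names simply repeats the packages mentioned in this section.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be found
    by the current interpreter."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    wanted = ["astropy", "numpy", "matplotlib", "pandas", "fitsio", "glands"]
    missing = missing_packages(wanted)
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("All packages found.")
```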

Other advantages of using virtual environments for a project:
  • It makes it easier to reproduce your results even if Python, or other libraries, change over time.
  • It makes it easier to share your results; you can share the details of the environment and others can reproduce your installation.

Working using jupyterlab

I would recommend using jupyter-lab or jupyter notebook to work, as these provide a natural way of documenting your work as well as keeping code, outputs and errors all in one place.

Once you have activated a working Python environment (your own, or the one in /home/scratch/psalas/projects/DriftScan/py3.11), you can start a jupyter server with the command

jupyter-lab

this will launch a new browser window (or tab) with a jupyter server.

Remote notebooks

If you are working remotely you might want to change the above to:

jupyter-lab --ip=${HOST}.gb.nrao.edu --port=<port> --no-browser

where <port> is a number like 9020, and $HOST is the computer you are working on. For this example we'll use thales as $HOST. The --no-browser option keeps jupyter from trying to launch a web browser window, which could be very slow when working remotely.

Then, to reach the jupyter server remotely, you'll have to open an ssh tunnel from your local machine:

ssh -N [-f] -L 9020:thales.gb.nrao.edu:9020 <your user name>@ssh.gb.nrao.edu

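this will forward port 9020 of thales to port 9020 on your local machine. The -N option tells ssh not to run a remote command, and the optional -f sends ssh to the background once the connection is established. Then you can connect by going to localhost:9020 in a web browser (it will also ask for the authentication token, which you can find in the terminal where you first started the jupyter server).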
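Since the host and port have to agree between the two commands, it can help to generate both from the same values. The sketch below does this; remote_jupyter_commands is a hypothetical helper, the user name jdoe is a placeholder, and the optional -f flag is left out.

```python
def remote_jupyter_commands(user, host, port,
                            domain="gb.nrao.edu", gateway="ssh.gb.nrao.edu"):
    """Return (server_command, tunnel_command) for a remote jupyter session.

    The first command is run on the GBO machine, the second one on your
    local machine.
    """
    server = f"jupyter-lab --ip={host}.{domain} --port={port} --no-browser"
    tunnel = f"ssh -N -L {port}:{host}.{domain}:{port} {user}@{gateway}"
    return server, tunnel


if __name__ == "__main__":
    # "jdoe" is a placeholder user name; thales and 9020 follow the example above.
    server, tunnel = remote_jupyter_commands("jdoe", "thales", 9020)
    print("On the GBO machine: ", server)
    print("On your own machine:", tunnel)
```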

To close the tunnel, close the terminal where you started it. Alternatively, find the process using ps aux | grep 9020 and then kill it (e.g., using: kill -15 12345 where 12345 is the process ID).
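To check from your local machine whether something is listening on the forwarded port, for example to confirm that the tunnel is up or that it has really been closed, a small standard-library probe like this one can be used. The function name port_is_open is made up for this sketch.

```python
import socket


def port_is_open(port, host="localhost", timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    port = 9020  # the port used in the example above
    state = "open" if port_is_open(port) else "closed"
    print(f"localhost:{port} is {state}")
```

An open port only shows that something is accepting connections there; it does not prove that it is your jupyter server.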

If you like the old notebooks better, change jupyter-lab to jupyter notebook in the above commands.

If your internet connection is unstable, I'd recommend starting the jupyter server inside a screen (GNU screen). A basic tutorial can be found here. Once you launch a screen session, your terminal starts fresh, so you'll have to activate the Python environment inside the screen session to pick up the right Python and jupyter versions.

Downloading the GLANDS Python data reduction tools

To clone the GLANDS project:
git clone git@github.com:astrofle/GLANDS.git

You might need to have an ssh-key enabled for the above (try it and hopefully it will work without issues). Thomas Chamberlin has some instructions on how to set up git at GBO here.

You can follow these instructions to install the GLANDS tools. To install, run

python -m pip install -e .

in the GLANDS root directory. So if you cloned the code to /home/scratch/mtasnim/DriftScan/GLANDS and your Python virtual environment is in /home/scratch/mtasnim/DriftScan/python3.11 you'd do:

source /home/scratch/mtasnim/DriftScan/python3.11/bin/activate
cd /home/scratch/mtasnim/DriftScan/GLANDS
python -m pip install -e .

The -e option tells pip to install the package in editable mode, so changes to the cloned source take effect without reinstalling. We use this because the code is still under development.
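One way to confirm that the editable install points at your clone is to ask Python where it would import the package from. This is a sketch using the standard library; package_location is a hypothetical helper, and the import name glands is taken from the module name mentioned earlier on this page.

```python
from importlib.util import find_spec


def package_location(name):
    """Return the file a top-level package would be imported from,
    or None if the package cannot be found."""
    spec = find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    location = package_location("glands")
    if location is None:
        print("glands is not installed in this environment")
    else:
        # For an editable install this should be a path inside the cloned
        # GLANDS directory rather than inside site-packages.
        print("glands is imported from:", location)
```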

To update the GLANDS project to its latest version:

cd /home/scratch/mtasnim/DriftScan/GLANDS
git pull origin main

assuming you cloned the project to /home/scratch/mtasnim/DriftScan/GLANDS.

There are examples of how to use the GLANDS package in here.

Working directory example

Pedro is working on the project in:
/home/scratch/psalas/projects/DriftScan
feel free to use the Python virtual environment in there or adopt a similar structure.

Remote access to GBO computers

There are a number of options to access the GBO computers remotely:
  • FastX. (Please remember to close your sessions when you are not using them, as there are only a limited number of seats available.)
  • ssh
  • You can also use a Virtual Network Connection (VNC) to one of the public data reduction machines.

Other useful applications at GBO

When working on a GBO computer the following applications are available:
  • ds9 : FITS cube viewer. Website.
  • psum : Generates summaries of GBT sessions. Example usage: psum -L AGBT22B_234_02
  • gbtgridder : Grids the contents of an SDFITS file to a cube. GitHub page with some documentation.
  • gbt-rfi-gui : Graphical user interface (GUI) to examine radio frequency interference (RFI) scans from the GBT. User guide.
  • GBT archive : Data archive for the GBT. It requires login.

-- PedroSalas - 2023-06-01
Topic revision: r17 - 2023-09-06, PedroSalas