The data reduction is Python based.
Dysh is still under development and not yet stable, so we avoid it for now. Instead, the data reduction relies on custom Python code and
astropy.
Getting started
If you are working with Pedro, he has hopefully already given you an overview of how to set things up. These instructions are meant to be used on GBO computers (GBO specifics in
red), though they can be adapted to a general use case as well.
Where to work
You should use one of the
public data reduction machines to work. If you are outside of GBO you can use
this to get here.
You should keep your data in
/home/scratch/<your user name>
, not in your
home
(or ~) directory or other places.
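As a quick sanity check before writing large files, the rule above can be tested from Python. This is a minimal sketch: the helper name `is_under_scratch` is made up for illustration, and it compares paths as written without resolving symbolic links.

```python
from pathlib import Path


def is_under_scratch(path, user, scratch_root="/home/scratch"):
    """Return True if `path` is the user's scratch area or lies inside it."""
    scratch = Path(scratch_root) / user
    path = Path(path)
    return path == scratch or scratch in path.parents


# Example paths taken from this guide.
print(is_under_scratch("/home/scratch/mtasnim/DriftScan", "mtasnim"))  # True
print(is_under_scratch("/home/mtasnim", "mtasnim"))                    # False
```

To check the directory you are actually in, pass `Path.cwd()` as the first argument.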
Some frequently used Linux terminal commands
In the terminal, if we do not know our current working directory, we can print it by typing
pwd
which stands for "print working directory". If we already know our working directory, we can skip this. For example,
/home/scratch/mtasnim
is my (the student's) working directory in this case. To change the directory, we type
cd /home/scratch/mtasnim/DriftScan
or
cd /home/scratch/mtasnim/DriftScan/GLANDS
, depending on the directory we would like to access.
To see the contents of a directory, type
ls
and press Enter. To list only the files whose names match a pattern, type:
ls *contentnamefragment.fileextension
. For example, to see the Bank E merged fits files only, we type:
ls *merged_Bank_E.fits
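The same wildcard matching is available inside Python, which is handy when a notebook needs to pick up the files that `ls` would show. The sketch below uses the standard `fnmatch` module on a list of names; apart from the first one, the file names are invented for illustration.

```python
import fnmatch


def select_files(names, pattern):
    """Return the names that match a shell-style wildcard pattern, sorted."""
    return sorted(fnmatch.filter(names, pattern))


# Only the first name appears in this guide; the others are hypothetical.
names = [
    "AGBT22B_234_02_pol_0_merged_Bank_E.fits",
    "AGBT22B_234_02_pol_1_merged_Bank_E.fits",
    "AGBT22B_234_02_pol_0_merged_Bank_A.fits",
    "notes.txt",
]
print(select_files(names, "*merged_Bank_E.fits"))
```

Against a real directory, `sorted(Path(".").glob("*merged_Bank_E.fits"))` from `pathlib` applies the same pattern to the files on disk.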
When we want to initialize
To use a different host, type in:
ssh hostname
in the terminal. For example:
ssh newton
.
To make data cubes using the gbtgridder, we don't need to follow the installation instructions on GitHub, since it has already been installed for us. The command looks different depending on whether we are gridding a single file or multiple files.
Let's say we want to grid the merged fits files for AGBT22B_234_02_pol_0 only. Go to the terminal and type:
gbtgridder -o merged_cube_1 -a 20 AGBT22B_234_02_pol_0_merged_Bank_E.fits
The name after
-o
sets our desired output file name, and
-a
sets the number of adjacent channels to average together (20 here), since individual channels are very narrow. After gridding has finished, type:
ls file_name_cube.fits
where file_name is the name given after -o. If the file is listed, the cube was written to the directory (this is just to verify that gridding has been done properly).
If, instead, we want to grid multiple files whose names match a pattern, we can type:
gbtgridder -o desiredfilename -a 20 *pol_0*cal_scan_*_bank_E.fits
This will grid all the polarization 0 scans in Bank E.
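When gridding from a script or notebook, the command above can be assembled in Python. This is only a sketch: `gbtgridder_command` is a hypothetical helper, it uses just the `-o` and `-a` options shown above, and it prints the command instead of running it.

```python
import glob
import shlex


def gbtgridder_command(output_root, sdfits_files, average=20):
    """Build the gbtgridder argument list for one or more SDFITS files."""
    files = sorted(sdfits_files)
    if not files:
        raise ValueError("no SDFITS files to grid")
    return ["gbtgridder", "-o", output_root, "-a", str(average), *files]


# Expand the same wildcard the shell would expand in the example above.
files = glob.glob("*pol_0*cal_scan_*_bank_E.fits")
if files:
    cmd = gbtgridder_command("desiredfilename", files)
    print(shlex.join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
else:
    print("No matching files in the current directory.")
```

Passing the arguments as a list avoids shell quoting problems if a file name ever contains spaces.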
To see how much disk space each FITS file in the DriftScan directory uses, type:
du -hs *.fits
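A rough Python counterpart, in case the sizes are needed inside a notebook. This is a sketch with a made-up helper, `human_size`. It reports the apparent file size, which can differ slightly from the disk usage that `du` reports.

```python
from pathlib import Path


def human_size(n_bytes):
    """Format a byte count with binary units, similar in spirit to `du -h`."""
    size = float(n_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


# List every FITS file in the current directory with its size.
for path in sorted(Path(".").glob("*.fits")):
    print(human_size(path.stat().st_size), path.name)
```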
To exit from a terminal editor such as vi or vim, type :q and then press Enter.
To open a png file, type in:
xdg-open image.png
To open a FITS cube in DS9, type:
ds9 file_name_cube.fits
Setting up a virtual environment
You can use the virtual environment in
/home/scratch/psalas/projects/DriftScan/py3.11
if you like.
We want to avoid using the system-wide Python installation, as it is not meant to be modified. Instead, we will use our own Python installation in a virtual environment. To create a virtual environment and install some packages you can use:
~gbosdd/pythonversions/3.11/bin/python -m venv py3.11
source py3.11/bin/activate
# This will "activate" the virtual environment. After this your shell prompt should have a (py3.11) at the start of the line (if not something might be wrong).
python -m pip install --upgrade pip
# This will upgrade the package installer for Python.
python -m pip install wheel build setuptools
# These are useful for installing other packages.
python -m pip install astropy numpy matplotlib pandas fitsio
# Core working components.
python -m pip install jupyterlab ipympl
# This is optional if you would like to use jupyterlab.
You can also directly use Python from the virtual environment as:
py3.11/bin/python
So, for example, if you wanted to install a package into the virtual environment without "activating" it you could use:
py3.11/bin/python -m pip install pandas # This would install pandas into the py3.11 virtual environment.
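If it is unclear which Python is running, the interpreter itself can tell you. The following sketch relies on the fact that, inside an environment created with `venv`, `sys.prefix` differs from `sys.base_prefix`. The function name is illustrative, and the check is not meant for conda environments.

```python
import sys


def in_virtual_environment():
    """Return True when running inside an environment created with venv."""
    return sys.prefix != sys.base_prefix


print("Python executable:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```

With the environment from the example activated, the executable path should end in `py3.11/bin/python`.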
Alternatively you can use
conda to create and manage virtual environments. To activate the conda environment, type
conda activate DriftScan
in the terminal. conda has already been initialized in our bash profile.
If using conda instead, we might need to install the modules glands and fitsio before we can import them. To do this, type
pip install -e .
in the GLANDS directory for module glands, and type
pip install fitsio
for the fitsio module.
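To see which of the packages mentioned so far still need installing in the active environment, something like the following can be run. It is a sketch: `missing_modules` is a hypothetical helper, and the module names are the ones listed in this guide.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["astropy", "numpy", "matplotlib", "pandas", "fitsio", "glands"]
print("Missing modules:", missing_modules(required) or "none")
```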
Other advantages of using virtual environments for a project:
- It makes it easier to reproduce your results even if Python, or other libraries, change over time.
- It makes it easier to share your results; you can share the details of the environment and others can reproduce your installation.
Working using jupyterlab
I would recommend using
jupyter-lab
or
jupyter notebook
to work, as these provide a natural way of documenting your work as well as keeping code, outputs and errors all in one place.
Once you have a working Python environment of your own, or have activated the one in
/home/scratch/psalas/projects/DriftScan/py3.11
, you can start a jupyter server with the command
jupyter-lab
This will launch a new browser window (or tab) connected to the jupyter server.
Remote notebooks
If you are working remotely you might want to change the above to:
jupyter-lab --ip=${HOST}.gb.nrao.edu --port=<port> --no-browser
where <port> is a number like 9020, and
$HOST
is the computer you are working on. For this example we'll use thales as
$HOST
. The
--no-browser
option keeps jupyter from trying to launch a web browser window, which could be very slow when working remotely.
Then, to reach the jupyter server remotely, you'll have to open an ssh tunnel:
ssh -N [-f] -L 9020:thales.gb.nrao.edu:9020 <your user name>@ssh.gb.nrao.edu
This forwards port 9020 on your local machine to port 9020 on thales (the optional -f flag sends ssh to the background). Then you can connect by going to
localhost:9020
in a web browser (it will also ask for the authentication token which you can find in the terminal where you first started the jupyter server).
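Since the host and port must agree between the server command and the tunnel command, it can help to generate both from the same values. The sketch below only builds and prints the two command lines shown above. The helper names are invented, and the user name is the example one from this guide.

```python
import shlex

GBO_DOMAIN = "gb.nrao.edu"


def jupyter_command(host, port):
    """Command to run on the GBO machine `host` to start the server."""
    return ["jupyter-lab", f"--ip={host}.{GBO_DOMAIN}", f"--port={port}", "--no-browser"]


def tunnel_command(host, port, user, background=False):
    """Command to run on your local machine to open the ssh tunnel."""
    cmd = ["ssh", "-N"]
    if background:
        cmd.append("-f")  # send ssh to the background after connecting
    cmd += ["-L", f"{port}:{host}.{GBO_DOMAIN}:{port}", f"{user}@ssh.{GBO_DOMAIN}"]
    return cmd


print(shlex.join(jupyter_command("thales", 9020)))
print(shlex.join(tunnel_command("thales", 9020, "mtasnim")))
```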
To close the tunnel, close the terminal where you started it. Alternatively, find the tunnel process using
ps aux | grep 9020
and then kill it (e.g., using
kill -15 12345
where 12345 is the process ID).
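Signal 15 is SIGTERM, which asks a process to exit. The same request can be sent from Python. To stay safe to run, the sketch below starts its own throwaway child process and terminates that, so it does not touch a real tunnel. It assumes a Linux (POSIX) system.

```python
import signal
import subprocess
import sys

# Start a throwaway child process that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
print("Started process", child.pid)

# Equivalent of `kill -15 <pid>`: send SIGTERM to the process.
child.send_signal(signal.SIGTERM)
return_code = child.wait(timeout=10)

# On POSIX, a negative return code is the number of the terminating signal.
print("Return code:", return_code)
```

For a process that Python did not start, `os.kill(pid, signal.SIGTERM)` sends the same signal given its process ID.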
If you prefer the classic notebook interface, change
jupyter-lab
to
jupyter notebook
in the above commands.
If your internet connection is unstable, I'd recommend starting the jupyter server inside a
screen
(
GNU screen). A basic tutorial can be found
here. Once you launch a
screen
session, your terminal starts fresh, so you'll have to activate the Python environment inside the
screen
session to pick up the right Python and jupyter versions.
To clone the
GLANDS
project:
git clone git@github.com:astrofle/GLANDS.git
You might need to have an ssh-key enabled for the above (try it and hopefully it will work without issues). Thomas Chamberlin has some instructions on how to set up git at GBO
here.
You can follow
these instructions to install the
GLANDS
tools. To install them, run
python -m pip install -e .
in the
GLANDS
root directory. So if you cloned the code to
/home/scratch/mtasnim/DriftScan/GLANDS
and your Python virtual environment is in
/home/scratch/mtasnim/DriftScan/python3.11
you'd do:
source /home/scratch/mtasnim/DriftScan/python3.11/bin/activate
cd /home/scratch/mtasnim/DriftScan/GLANDS
python -m pip install -e .
the
-e
option tells
pip
to install the package in editable mode, so changes to the source code (for example after updating it with git) take effect without reinstalling. We use this because the code is still under development.
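One way to confirm that an editable install worked is to ask Python where it would import the package from; the path should point into the cloned directory instead of `site-packages`. This is a sketch with a hypothetical helper, `module_location`, and it assumes the importable module is called `glands`.

```python
import importlib.util


def module_location(name):
    """Return the file a module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# Prints None if glands is not installed in the current environment.
print("glands is imported from:", module_location("glands"))
```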
To update the
GLANDS
project to its latest version:
cd /home/scratch/mtasnim/DriftScan/GLANDS
git pull origin main
assuming you cloned the project to
/home/scratch/mtasnim/DriftScan/GLANDS
.
There are examples of how to use the
GLANDS
package
here.
Working directory example
Pedro is working on the project in:
/home/scratch/psalas/projects/DriftScan
Feel free to use the Python virtual environment in there or adopt a similar structure.
Remote access to GBO computers
There are a number of options to access the GBO computers remotely:
- FastX. (Please remember to close your sessions when you are not using them, as there are only a limited number of seats available.)
- ssh
- You can also use a Virtual Network Connection (VNC) to one of the public data reduction machines.
Other useful applications at GBO
When working on a GBO computer the following applications are available:
-
ds9
: FITS cube viewer. Website.
-
psum
: Generates summaries of GBT sessions. Example usage: psum -L AGBT22B_234_02
-
gbtgridder
: Grids the contents of an SDFITS file to a cube. GitHub page with some documentation.
-
gbt-rfi-gui
: Graphical user interface (GUI) to examine radio frequency interference (RFI) scans from the GBT. User guide.
- GBT archive : Data archive for the GBT. It requires login.
--
PedroSalas - 2023-06-01