Compute Nodes

Basic System Specs

The SWC (SoftWare Correlator) hosts are Dell R640 systems. These have two Intel Xeon Gold 6132 CPUs which run at 2.60 GHz; each CPU provides 14 cores (hyperthreading is disabled on these systems). Each host has 32 GB of memory.

The AOC correlator is intended for test work, which is desirable because the DC correlator is an operational system in fairly heavy use. As such, the AOC correlator has only enough hardware resources to allow reasonable test work: it has 5 SWC nodes, whereas the operational DC correlator has 64. The test correlator has also served as a backup to the DC correlator when the USNO encountered site-related problems; after this, it was decided to improve the USNO's ability to use the AOC test correlator as an actual, rather than ad hoc, backup.
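
As a rough illustration of the difference in scale, the per-host figures and node counts above can be multiplied out. This is only a sketch of the arithmetic; the constant and function names are made up for the example and do not refer to any correlator software.

```python
# Per-host figures taken from the specs above.
CPUS_PER_HOST = 2
CORES_PER_CPU = 14        # hyperthreading is disabled, so no extra logical cores
MEMORY_GB_PER_HOST = 32

# SWC node counts for the two correlators.
NODE_COUNTS = {"AOC": 5, "DC": 64}


def aggregate_resources(nodes):
    """Return (total cores, total memory in GB) for a number of SWC nodes."""
    cores = nodes * CPUS_PER_HOST * CORES_PER_CPU
    memory_gb = nodes * MEMORY_GB_PER_HOST
    return cores, memory_gb


for site, nodes in NODE_COUNTS.items():
    cores, memory_gb = aggregate_resources(nodes)
    print(f"{site}: {nodes} nodes, {cores} cores, {memory_gb} GB memory")
```

With these figures the AOC correlator comes to 140 cores and 160 GB of memory, against 1792 cores and 2048 GB for the DC correlator.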

The compute nodes are diskless and use a system image served by server-1. In general, all the SWCs use the same read-only system image, although the system design supports some host-specific files. Currently, only swc-001 uses its host-specific files; this is because swc-001 has a 1G Ethernet external connection, whereas all of the other SWCs support only a single 1G Ethernet connection to the correlator admin net (10.1.36). See the disk layout section for more details. The SWCs also NFS-mount a data partition served by server-1, which houses users' home directories, the DiFX installation, etc.

Network Hardware

Integral NICs

This particular model of the Dell R640 server has a group of four NIC ports on the back; there is an additional NIC for the iDRAC system located at the far left. On this model there are two 10G ports (requiring an SFP; named em1 and em2) and two 1G ports (em3 and em4). The ports are numbered from left to right, so the first 1G port is actually called port 3 on hardware setup screens.
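
The port numbering can be written down as a small lookup table, which may help when matching a hardware setup screen to an interface name. This is a sketch only: the table and the helper `describe_port` are hypothetical, and the positions count the four grouped ports from the left, leaving out the separate iDRAC NIC.

```python
# Integral NIC ports, numbered left to right as on the hardware setup screens.
# The iDRAC NIC at the far left is separate and not included here.
INTEGRAL_PORTS = {
    1: ("em1", "10G"),  # requires an SFP
    2: ("em2", "10G"),  # requires an SFP
    3: ("em3", "1G"),
    4: ("em4", "1G"),
}


def describe_port(position):
    """Return a short description of the port at the given position."""
    name, speed = INTEGRAL_PORTS[position]
    return f"port {position}: {name} ({speed})"


for position in sorted(INTEGRAL_PORTS):
    print(describe_port(position))
```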

Infiniband NIC(s)

The SWCs were augmented with one or two Infiniband (IB) NICs to allow high-rate data transfer. At the AOC, each host has a single IB NIC. At the DC correlator, sixteen hosts, SWC-001..016, have two IB NICs, while the remaining 48 hosts, SWC-017..064, have only a single IB NIC. The reason for this is that the first 16 nodes are optimized to serve as DiFX data nodes, whose purpose is to read in raw data from disk and distribute it to the computational nodes. The second NIC theoretically gives a data node twice the output bandwidth of a single-NIC node.
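
The DC layout described above can be expressed as a short function, which also gives the total number of IB NICs across the 64 hosts. This is an illustrative sketch; `dc_ib_nic_count` is a hypothetical helper, not part of DiFX or any site tooling.

```python
def dc_ib_nic_count(host_number):
    """Return the number of IB NICs on DC host SWC-<host_number>."""
    if not 1 <= host_number <= 64:
        raise ValueError("DC correlator hosts are numbered 1..64")
    # SWC-001..016 are the data nodes with two IB NICs; the rest have one.
    return 2 if host_number <= 16 else 1


# Total IB NICs at the DC correlator: 16 * 2 + 48 * 1.
total_ib_nics = sum(dc_ib_nic_count(n) for n in range(1, 65))
print(total_ib_nics)
```

The total comes to 80 IB NICs for the DC correlator, compared with 5 at the AOC, where each of the 5 hosts has one.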

-- JimJacobs - 2019-05-13
Topic revision: r3 - 2020-12-08, JimJacobs