Compute Nodes
Basic System Specs
The SWC (SoftWare Correlator) hosts are Dell R640 systems. These have two Intel Xeon Gold 6132 CPUs which run at 2.60 GHz; each CPU provides 14 cores (hyperthreading is disabled on these systems). Each host has 32 GB of memory.
The AOC correlator is intended for test work, which is desirable since the DC correlator is an operational system in fairly heavy use. As such, the AOC correlator has only enough hardware resources to allow reasonable test work: it has 5 SWC nodes, whereas the operational DC correlator has 64. The test correlator has also served as a backup to the DC correlator when the USNO encountered site-related problems; after this, it was decided to improve the USNO's ability to use the AOC test correlator as an actual, rather than ad hoc, backup.
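As a rough illustration of the difference in scale, the per-host figures and node counts given above can be combined into cluster totals. This is only a sketch of the arithmetic; the names `SWC_NODES` and `cluster_totals` are made up for the example and do not refer to any tool on the correlator.

```python
# Per-host figures taken from the text above.
CPUS_PER_HOST = 2
CORES_PER_CPU = 14        # hyperthreading is disabled, so cores == hardware threads
MEMORY_GB_PER_HOST = 32

# SWC node counts per site, from the text (the dict name is illustrative only).
SWC_NODES = {"AOC": 5, "DC": 64}


def cluster_totals(node_count):
    """Return (total cores, total memory in GB) for a cluster of SWC hosts."""
    cores = node_count * CPUS_PER_HOST * CORES_PER_CPU
    memory_gb = node_count * MEMORY_GB_PER_HOST
    return cores, memory_gb


if __name__ == "__main__":
    for site, nodes in SWC_NODES.items():
        cores, memory_gb = cluster_totals(nodes)
        print(f"{site}: {nodes} nodes, {cores} cores, {memory_gb} GB memory")
```

With these inputs the AOC correlator comes to 140 cores and 160 GB of memory, against 1792 cores and 2048 GB for the DC correlator.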
The compute nodes are diskless and use a system image served by server-1. In general, all the SWCs use the same read-only system image, although the system design supports some host-specific files. Currently, only swc-001 uses its host-specific files; this is because swc-001 has an external 1G Ethernet connection, whereas all of the other SWCs support only a single 1G Ethernet connection to the correlator admin net (10.1.36). See the disk layout section for more details. The SWCs also NFS-mount a data partition served by server-1, which houses users' home directories, the DiFX installation, etc.
Network Hardware
Integral NICs
This particular model of the Dell R640 server has a group of four NIC ports on the back; there is an additional NIC for the iDRAC system located at the far left. On this model there are two 10G ports (requiring an SFP; named em1 and em2) and two 1G ports (em3 and em4). The ports are numbered from left to right, so the first 1G port is actually called port 3 on hardware setup screens.
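The port numbering can be written down as a small lookup table. This is a sketch only: it assumes em1 is the leftmost port (port 1) and em2 is port 2, which follows from the left-to-right numbering but is not stated explicitly, and it leaves out the iDRAC NIC. The names `NicPort`, `INTEGRAL_PORTS` and `port_for_interface` are hypothetical.

```python
from typing import NamedTuple


class NicPort(NamedTuple):
    port_number: int   # position counted left to right, as shown on setup screens
    name: str          # interface name given in the text
    speed_gbit: int    # nominal link speed in Gbit/s
    needs_sfp: bool    # the 10G ports require an SFP module


# The four integral ports in left-to-right order (assumes em1 is port 1).
INTEGRAL_PORTS = [
    NicPort(1, "em1", 10, True),
    NicPort(2, "em2", 10, True),
    NicPort(3, "em3", 1, False),
    NicPort(4, "em4", 1, False),
]


def port_for_interface(name):
    """Return the setup-screen port number for an integral interface name."""
    for port in INTEGRAL_PORTS:
        if port.name == name:
            return port.port_number
    raise KeyError(f"unknown integral interface: {name}")


if __name__ == "__main__":
    for port in INTEGRAL_PORTS:
        print(f"port {port.port_number}: {port.name} ({port.speed_gbit}G)")
```

For example, `port_for_interface("em3")` gives 3, matching the note above that the first 1G port appears as port 3.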
Infiniband NIC(s)
The SWCs were augmented with one or two Infiniband (IB) NICs to allow high-rate data transfer. At the AOC, each host has a single IB NIC. At the DC correlator, sixteen hosts, SWC-001..016, have two IB NICs, while the remaining 48 hosts, SWC-017..064, have only a single IB NIC. The reason for this is that the first 16 nodes are optimized to serve as DiFX data nodes, whose purpose is to read in raw data from disk and distribute it to the computational nodes. The second NIC theoretically gives a data node twice the output bandwidth of a single-NIC node.
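The DC layout described above reduces to a simple rule on the host number. The sketch below encodes that rule and counts the IB NICs across the DC correlator; the function names are invented for the example and the host name format simply follows the SWC-001..064 notation used here.

```python
DC_HOST_COUNT = 64        # DC correlator hosts SWC-001..064, from the text
DC_DATA_NODE_COUNT = 16   # SWC-001..016 are the dual-NIC data nodes


def swc_name(host_number):
    """Format a host number in the SWC-NNN notation used in this section."""
    return f"SWC-{host_number:03d}"


def dc_ib_nic_count(host_number):
    """Return the number of IB NICs in the given DC correlator host."""
    if not 1 <= host_number <= DC_HOST_COUNT:
        raise ValueError("DC correlator hosts are numbered 1..64")
    return 2 if host_number <= DC_DATA_NODE_COUNT else 1


if __name__ == "__main__":
    total = sum(dc_ib_nic_count(n) for n in range(1, DC_HOST_COUNT + 1))
    print(f"{swc_name(1)}: {dc_ib_nic_count(1)} IB NICs")
    print(f"{swc_name(17)}: {dc_ib_nic_count(17)} IB NIC")
    print(f"total IB NICs at the DC correlator: {total}")
```

The total is 16 x 2 + 48 x 1 = 80 IB NICs.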
-- JimJacobs - 2019-05-13