This page and its subpages are intended to track Wide Area Network (WAN) performance within the NRAO, primarily to improve the performance of instrument data mirroring and user access.
Graphs
- Need links to local network graphs
- Need links to remote network graphs
Procedures and Documentation
- Documentation for how to run tests against
- Internal networks
- Point to point (AOC to CV) networks
- External networks
- Documentation or pointers to documentation for the effects of things like buffer size, window size, latency, etc.
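As a rough illustration of how buffer size, window size and latency interact, the sketch below computes the bandwidth-delay product (the amount of data that must be in flight to keep a path full) and the throughput ceiling that a fixed TCP window imposes. The 50 ms round-trip time and the 64 KiB window are assumptions chosen for illustration, not measured values for any NRAO path, and the function names are hypothetical.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to keep the path full."""
    return bandwidth_bits_per_s * rtt_s / 8


def window_limited_rate(window_bytes, rtt_s):
    """Upper bound on single-stream TCP throughput (bytes/s) for a given window."""
    return window_bytes / rtt_s


rtt = 0.050   # ASSUMPTION: 50 ms round trip, not a measured value
link = 1e9    # 1 Gbit/s link
window = 64 * 1024  # ASSUMPTION: 64 KiB window, for illustration only

print(f"Buffer needed to fill the link: {bdp_bytes(link, rtt) / 1e6:.2f} MB")
print(f"Ceiling with this window: {window_limited_rate(window, rtt) / 1e6:.2f} MB/s")
```

Under these assumed numbers a single stream with a small window would fall far short of a 1 Gbit/s link, which is why buffer and window settings matter on long-haul paths.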
Diagnostics
- Understanding what's expected
- How to interpret results
Tuning
- Proper long haul Gbit tuning (targeted at both internal and external users)
AOC-CV network
We need to support a minimum of 25MB/s transfers from AOC to CV to effectively mirror EVLA data. This may need to be as high as 75MB/s to handle peak data rates. We can potentially meet that rate by using multiple channels. There are 4 active NGAS nodes mirroring to 4 slave nodes; in theory, if each node can sustain nearly 15MB/s, we will be in good shape.
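The arithmetic behind these figures can be sketched as follows. It assumes the four node-to-node streams add up linearly, which may not hold if they share a bottleneck link; the variable and function names are hypothetical.

```python
MIN_RATE_MB_S = 25    # minimum aggregate rate from the text
PEAK_RATE_MB_S = 75   # possible peak aggregate rate from the text
NODES = 4             # active NGAS nodes, each mirroring to one slave node
PER_NODE_MB_S = 15    # per-node rate from the text ("nearly 15MB/s")


def per_node_required(aggregate_mb_s, nodes):
    """Rate each node must sustain, ASSUMING streams scale linearly."""
    return aggregate_mb_s / nodes


aggregate = NODES * PER_NODE_MB_S
print(f"Aggregate at {PER_NODE_MB_S} MB/s per node: {aggregate} MB/s")
print(f"Per node for the minimum: {per_node_required(MIN_RATE_MB_S, NODES)} MB/s")
print(f"Per node for the peak: {per_node_required(PEAK_RATE_MB_S, NODES)} MB/s")
```

Under that linear-scaling assumption, 15MB/s per node comfortably covers the 25MB/s minimum, while the 75MB/s peak would need somewhat more per node.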
Tests and Results
- AOC test system
- The AOC test system has an internal and external interface.
- 146.88.4.213 is connected to an internal network and is behind the site router and main building switch.
- 146.88.241.11 is connected to a switch between the AOC and New Mexico Tech (NMT) routers.
- Note: selecting a different interface requires changing a static route on the test node; otherwise it will reply out the default interface and the path will be asymmetric.
- For example, route add -net 192.33.115.0 netmask 255.255.255.0 dev eth1 forces all traffic to the CV test fixture out the external interface.
- CV test system
- The CV test system has a single internal interface.
- 192.33.115.157 is connected to the main building switch.
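As a quick sanity check on the static route mentioned above, the following sketch uses Python's standard ipaddress module to confirm that the CV test address falls inside the routed network and that the AOC addresses do not. It only checks address arithmetic; it does not inspect or modify any routing table.

```python
import ipaddress

# Network and netmask taken from the route command in the text.
route_net = ipaddress.ip_network("192.33.115.0/255.255.255.0")

cv_test = ipaddress.ip_address("192.33.115.157")      # CV test system
aoc_internal = ipaddress.ip_address("146.88.4.213")   # AOC internal interface
aoc_external = ipaddress.ip_address("146.88.241.11")  # AOC external interface

print(f"Route covers /{route_net.prefixlen}")
print(f"CV test system covered: {cv_test in route_net}")
print(f"AOC internal covered: {aoc_internal in route_net}")
print(f"AOC external covered: {aoc_external in route_net}")
```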
Initial tests show insufficient bandwidth between the sites. More importantly, they show highly varying rates between portions of the path. There is a minor difference between testing through the internal and external interfaces of the AOC test fixture described above.
A current diagram showing all the devices between the test points, their connection bandwidth and initial results can be found here:
WANdiag.pdf
- Links are color coded for 1, 10 and 100Gbit
- Results are color coded depending on whether they were the result of Web100 tests or iperf tests.
- Rates describe the input rate experienced by the receiving end.
- For example, the 106Mbit/s and 755Mbit/s rates from 192.33.115.157 to the UVa Web100 server mean 106Mbit/s from CV to UVa and 755Mbit/s from UVa to CV.
- Tests between AOC and TeraGrid show 16.6MB/s to 32.3MB/s today on upload and 5.7MB/s to 7.4MB/s on download (ssh.aoc.nrao.edu and mss.ncsa.uiuc.edu)
- Tests between CV and TeraGrid show upload fairly saturated and 6.5MB/s to 21.0MB/s today on download (fatman.cv.nrao.edu and mss.ncsa.uiuc.edu)
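Because the results above mix Mbit/s and MB/s, a small conversion helps when comparing them. The sketch below converts the two Web100 rates quoted in the example to MB/s and compares them with the 25MB/s minimum stated for the AOC-CV mirror. It assumes decimal units and 8 bits per byte and ignores protocol overhead; the comparison is illustrative only, since the CV-UVa segment is just one portion of the path.

```python
def mbit_to_mbyte(mbit_per_s):
    """Convert Mbit/s to MB/s, ASSUMING 8 bits per byte and decimal prefixes."""
    return mbit_per_s / 8


TARGET_MB_S = 25  # minimum mirroring rate from the text

# Web100 rates quoted in the example above (Mbit/s).
rates_mbit = {"CV to UVa": 106, "UVa to CV": 755}

for label, mbit in rates_mbit.items():
    mbyte = mbit_to_mbyte(mbit)
    verdict = "at or above" if mbyte >= TARGET_MB_S else "below"
    print(f"{label}: {mbit} Mbit/s = {mbyte} MB/s ({verdict} {TARGET_MB_S} MB/s)")
```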
To be Done
- Contact UVa and NMT/UNM (James)
- Check link congestion; UVa is the more critical issue, NMT seems fine.
- Check traffic shaping
- Check topology
- Move iperf to Cato (Mike)
- Allow inbound port 12345 to all test IPs from any address for symmetric iperf tests. Currently we can only test outbound. (Gene/Derek)
- Get UNM and Virginia NLR contact info (James)
- I'll handle talking to the two sites
- pchang@unm.edu (Paul Chang)
- Request tests between UNM-NLR and UVa-NLR
- Contact Tang at UVa and verify that the Web100 server buffer settings didn't revert, which would explain the low inbound rate. I opted not to send the email; in retrospect, I don't have evidence that there's anything wrong with the Web100 server at UVa.
Configuration changes
Congestion control options: /proc/sys/net/ipv4/tcp_congestion_control
modprobe tcp_htcp
No clear difference
modprobe tcp_cubic
No clear difference
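When comparing congestion control modules, it can help to record which algorithm was actually active for each test run. The sketch below reads the current and available algorithms from /proc on a Linux host; the helper name is hypothetical, and it returns None where the files are absent (for example on a non-Linux system).

```python
from pathlib import Path


def read_proc_value(path):
    """Return the stripped contents of a /proc file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


current = read_proc_value("/proc/sys/net/ipv4/tcp_congestion_control")
available = read_proc_value("/proc/sys/net/ipv4/tcp_available_congestion_control")

print(f"Current congestion control: {current}")
print(f"Available algorithms: {available}")
```

Logging these two values alongside each iperf or Web100 result makes it easier to tell whether a module such as tcp_htcp or tcp_cubic was in effect during the run.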
-- JamesRobnett - 2011-07-19