NAASCLustreUpgrade2018 (2018-05-14, AndrewMcNichols)
Before cutover
AEON
[done] Finish and verify data copies of comm area (2018-05-11 14:16, AndrewMcNichols)
[done] Todd's adapted immoments script (2018-05-11 15:39, AndrewMcNichols)
[todo] 10x cluster nodes mounted and running batch pipeline jobs (2018-05-11 15:39, AndrewMcNichols)
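One way to verify a data copy before cutover is an rsync dry run that counts the files still waiting to be transferred. The following is a minimal sketch, not the procedure actually used for this upgrade; the function name pending_files is hypothetical, and the check relies on rsync's default size-and-mtime comparison rather than checksums.

```shell
#!/bin/sh
# Count regular files that rsync would still transfer from SRC to DST.
#   -a                 archive mode (recursive, preserves times, perms, links)
#   -n                 dry run: nothing is written to DST
#   --itemize-changes  one line per pending change; lines that start
#                      with ">f" are files that would be sent
# pending_files is a hypothetical helper name used for illustration.
pending_files() {
    src=$1
    dst=$2
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function's exit status clean.
    rsync -an --itemize-changes "$src"/ "$dst"/ | grep -c '^>f' || true
}
```

A result of 0 for a given source and destination pair suggests that piece is in sync by size and modification time; any other value is the number of files a further rsync pass would copy.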
During cutover (May 11 - May 13)
AEON
17:00 EDT Friday
[done] Disconnect desktop clients, non-NAASC server clients (2018-05-11 17:36, CjAllen)
[done] Disconnect cvpost lustre clients (2018-05-11 17:30, CjAllen)
[done] Disconnect remaining NAASC server clients (excluding lustre data movers) (2018-05-11 17:30, CjAllen)
[done] Disable write access to /.lustre/naasc OSTs (i.e., unmount and remount with 'ro' option) (2018-05-11 19:12, AndrewMcNichols)
[done] Run final rsyncs of the ~18 filesystem "pieces" (2018-05-14 08:38, AndrewMcNichols)
Run final global sync (echo "rsync -vaW --stats --links /.lustre/naasc/* /.lustre/aeon/" | at -m -t 201805111730.00)
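The global sync above is queued with at, whose -t option takes a timestamp of the form [[CC]YY]MMDDhhmm[.SS], and whose -m option mails the submitting user when the job completes. As a small illustration of that format, the sketch below converts a readable date and time into the -t form; the helper name at_stamp is hypothetical and not part of the original procedure.

```shell
#!/bin/sh
# Convert "YYYY-MM-DD HH:MM" into the CCYYMMDDhhmm.SS form accepted by `at -t`.
# at_stamp is a hypothetical helper name; seconds are fixed at 00.
at_stamp() {
    # Strip spaces, colons and hyphens, then append ".00" for the seconds.
    printf '%s.00\n' "$(printf '%s' "$1" | tr -d ' :-')"
}
```

For example, at_stamp "2018-05-11 17:30" yields 201805111730.00, the value used in the command above.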
09:30 EDT Saturday
[done] Update /etc/sysconfig/network-scripts/ on LNET routers (terminus, trantor) and validate (2018-05-12 11:08, AndrewMcNichols)
[done] Update /etc/fstab and reboot cvpost-master, cvpost-serv-1, skidmark (2018-05-12 11:08, AndrewMcNichols)
[done] Validate successful changeover on hex, elwood (2018-05-12 11:08, AndrewMcNichols)
[done] Validate successful changeover on cambryn, cvpost[061-064] (2018-05-12 11:08, AndrewMcNichols)
[done] Validate successful changeover on mano, osiris, cvpost[101-104] (2018-05-12 11:09, AndrewMcNichols)
[done] Validate successful changeover on valkyrie (2018-05-12 11:08, AndrewMcNichols)
[done] After validation, reenable full service (2018-05-14 08:38, AndrewMcNichols)
Boot servers
Boot cluster
Boot workstations
[done] Uptime notification emails to NAASC (staff and visitors) and cvlinux (to catch bcotton++) (2018-05-14 08:38, AndrewMcNichols)
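The validation steps above amount to confirming, on each client, that the new filesystem is mounted and the old one is not. A minimal sketch of such a check follows, assuming clients mount the new filesystem at /.lustre/aeon (the path used in the rsync command); the function names lustre_mounts and is_lustre_mounted are hypothetical, and this is not necessarily how the changeover was validated.

```shell
#!/bin/sh
# Print the mount point of every Lustre filesystem listed in a mounts table.
# The table defaults to /proc/mounts, whose fields are:
#   device  mountpoint  fstype  options  dump  pass
# lustre_mounts and is_lustre_mounted are hypothetical helper names.
lustre_mounts() {
    awk '$3 == "lustre" { print $2 }' "${1:-/proc/mounts}"
}

# Succeed only if MOUNTPOINT ($1) appears as a Lustre mount in the table ($2).
is_lustre_mounted() {
    # -x whole-line match, -F fixed string (the path contains dots), -q quiet
    lustre_mounts "${2:-/proc/mounts}" | grep -qxF -- "$1"
}
```

On a client, `is_lustre_mounted /.lustre/aeon && ! is_lustre_mounted /.lustre/naasc` would succeed only when the new mount is present and the old one is gone.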
After cutover
AEON
[done] Ensure closed access to old lustre - disable MDS (asimov) (2018-05-14 08:38, AndrewMcNichols)
[done] Test/enable cron jobs that use lustre (2018-05-14 13:19, AndrewMcNichols)
[done] cubach operational validation (2018-05-14 08:38, AndrewMcNichols)
[done] Enable inventory script on all servers (2018-05-14 13:18, AndrewMcNichols)
[done] Add user 'apache' to /etc/[shadow,passwd] on MDS pair (2018-05-14 13:18, AndrewMcNichols)
[done] thunter development benchmarking (2018-06-18 08:19, AndrewMcNichols)
[done] bemonts performance benchmarking (2018-06-18 08:20, AndrewMcNichols)
[todo] Configure and enable BackupPC on all servers (2018-05-08 15:07, WikiGuest)
[done] Configure and enable nrpe/Nagios on all servers (2018-05-21 15:32, AndrewMcNichols)
[done] Configure and enable logwatch on all servers (2018-05-21 15:32, AndrewMcNichols)
[todo] Configure and enable rsyslog on all servers (2018-05-14 10:56, AndrewMcNichols)
[done] Enable quotas for observers (2018-06-18 08:20, AndrewMcNichols)
https://staff.nrao.edu/wiki/bin/view/CIS/Documentation/LustreQuotas
[todo] Configure MDT ring to enable changelogs so Robinhood can run against new file system (2018-05-14 10:56, AndrewMcNichols)
[todo] ~30 days after cutover, de-installation of old lustre (2018-05-14 13:18, AndrewMcNichols)
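For the observer quota item above, per-user limits on a Lustre client are set with lfs setquota, whose block limits are given in kilobytes by default. The sketch below only prints such a command so it can be reviewed before being run. The helper names, the user name and the limits in the example are hypothetical, and the mount point /.lustre/aeon is assumed from the rsync command; the linked LustreQuotas page remains the authoritative procedure, and quota enforcement must also be enabled on the servers in a way that depends on the Lustre version.

```shell
#!/bin/sh
# Convert whole GiB to KiB, the default unit for lfs setquota block limits.
# gib_to_kib and quota_cmd are hypothetical helper names.
gib_to_kib() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print (do not run) an lfs setquota command for one user.
# Arguments: user, soft limit in GiB, hard limit in GiB, mount point.
#   -u  user the quota applies to
#   -b  block soft limit (KiB)
#   -B  block hard limit (KiB)
quota_cmd() {
    printf 'lfs setquota -u %s -b %s -B %s %s\n' \
        "$1" "$(gib_to_kib "$2")" "$(gib_to_kib "$3")" "$4"
}
```

For instance, `quota_cmd observer1 500 550 /.lustre/aeon` prints a command with a 500 GiB soft limit and a 550 GiB hard limit for the made-up user observer1; the resulting limits can later be inspected with `lfs quota -u observer1 /.lustre/aeon`.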
-- Created 2018-05-03
Topic revision: r4 - 2018-05-14, AndrewMcNichols