Clever Notes on the Cluster Nodes
1 Know your Nodes (Tech Specs)
1.1 Sort by attributes (model, for example)
- Use technode -n cvpost### to query the model of a node
- Use technode -m model to list all nodes matching a particular model
[root@cvpost-master bin]# technode -n cvpost013
cvpost013 is a PowerEdge R620
[root@cvpost-master bin]# technode -m R620
cvpost002: Product Name: PowerEdge R620
cvpost003: Product Name: PowerEdge R620
cvpost004: Product Name: PowerEdge R620
cvpost011: Product Name: PowerEdge R620
cvpost012: Product Name: PowerEdge R620
cvpost013: Product Name: PowerEdge R620
cvpost014: Product Name: PowerEdge R620
cvpost015: Product Name: PowerEdge R620
cvpost017: Product Name: PowerEdge R620
cvpost018: Product Name: PowerEdge R620
cvpost030: Product Name: PowerEdge R620
cvpost031: Product Name: PowerEdge R620
cvpost032: Product Name: PowerEdge R620
cvpost033: Product Name: PowerEdge R620
cvpost034: Product Name: PowerEdge R620
cvpost035: Product Name: PowerEdge R620
cvpost036: Product Name: PowerEdge R620
cvpost037: Product Name: PowerEdge R620
cvpost038: Product Name: PowerEdge R620
cvpost039: Product Name: PowerEdge R620
cvpost040: Product Name: PowerEdge R620
cvpost041: Product Name: PowerEdge R620
cvpost042: Product Name: PowerEdge R620
cvpost043: Product Name: PowerEdge R620
cvpost044: Product Name: PowerEdge R620
cvpost045: Product Name: PowerEdge R620
cvpost046: Product Name: PowerEdge R620
cvpost047: Product Name: PowerEdge R620
cvpost048: Product Name: PowerEdge R620
cvpost049: Product Name: PowerEdge R620
cvpost050: Product Name: PowerEdge R620
cvpost051: Product Name: PowerEdge R620
cvpost052: Product Name: PowerEdge R620
cvpost053: Product Name: PowerEdge R620
cvpost054: Product Name: PowerEdge R620
cvpost055: Product Name: PowerEdge R620
cvpost056: Product Name: PowerEdge R620
cvpost057: Product Name: PowerEdge R620
cvpost058: Product Name: PowerEdge R620
cvpost059: Product Name: PowerEdge R620
cvpost060: Product Name: PowerEdge R620
cvpost061: Product Name: PowerEdge R620
cvpost062: Product Name: PowerEdge R620
cvpost063: Product Name: PowerEdge R620
cvpost064: Product Name: PowerEdge R620
cvpost065: Product Name: PowerEdge R620
[root@cvpost-master bin]#
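For scripting, the list printed by technode -m can be reduced to bare node names or to a count. The following is a minimal sketch, assuming the output keeps the "name: Product Name: model" shape shown above. The function names are hypothetical helpers and are not part of technode.

```shell
# Hypothetical helpers; both read `technode -m MODEL` output on stdin.
# Assumption: each matching line looks like
#   "cvpost013: Product Name: PowerEdge R620"

# Print only the node names (the text before the first colon).
model_node_names() {
    grep 'Product Name:' | cut -d: -f1
}

# Print how many nodes were listed.
model_node_count() {
    grep -c 'Product Name:'
}

# Intended use on cvpost-master (not run here):
#   technode -m R620 | model_node_names
#   technode -m R620 | model_node_count
```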
1.2 What's reserved or in use
2 Know your Nodes (Users)
- Use whonode on cvpost-master
2.1 How many nodes are reserved
- whonode -c prints just the number of nodes reserved
[root@cvpost-master bin]# whonode -c
51 nodes are currently reserved.
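If the count is needed inside a script, the leading number can be pulled out of that message. This is a small sketch, assuming the message wording stays as shown above; reserved_count is a hypothetical helper name.

```shell
# Hypothetical helper: reads `whonode -c` output on stdin and prints the number.
# Assumption: the message has the form "51 nodes are currently reserved."
reserved_count() {
    awk '/currently reserved/ { print $1 }'
}

# Intended use on cvpost-master (not run here):
#   whonode -c | reserved_count
```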
2.2 How many nodes does a particular user have?
- whonode -u username
- You can look up multiple users at a time by passing the -u flag multiple times
- Output will include the cvpost node names corresponding to the JobIDs belonging to the user.
Note: on an actual terminal the output below is aligned in columns.
[root@cvpost-master ~]# whonode -u swood -u jmangum -u rindebet -u jmeyer
Node Name Job ID Username SessID Nodes Tasks Req.Time Status Elap.Time
cvpost065/0 1360.cvpost-serv swood 23926 1 1 720:00:00 R 558:42:00
User swood has 1 node reserved.
Node Name Job ID Username SessID Nodes Tasks Req.Time Status Elap.Time
cvpost030/0 1392.cvpost-serv jmangum 31640 1 1 1080:00:0 R 479:37:14
cvpost025+23/0 1400.cvpost-serv jmangum 126351 2 2 1080:00:0 R 386:17:18
User jmangum has 3 nodes reserved.
Node Name Job ID Username SessID Nodes Tasks Req.Time Status Elap.Time
cvpost058/0 1261.cvpost-serv rindebet 15230 1 1 2088:00:0 R 1273:51:5
cvpost035/0 1453.cvpost-serv rindebet 13081 1 1 1080:00:0 R 54:47:48
User rindebet has 2 nodes reserved.
Node Name Job ID Username SessID Nodes Tasks Req.Time Status Elap.Time
cvpost039/0 1404.cvpost-serv jmeyer 11693 1 1 504:00:00 R 337:56:02
cvpost040/0 1415.cvpost-serv jmeyer 15127 1 1 504:00:00 R 219:46:01
cvpost026/0 1419.cvpost-serv jmeyer 92140 1 1 504:00:00 R 213:29:28
User jmeyer has 3 nodes reserved.
2.3 Who has node cvpostxxx?
[root@cvpost-master bin]# whonode -n cvpost044
cvpost044 is reserved by amcnicho
2.4 What node does the JobID correspond to?
- whonode -j jobid
- You can look up multiple JobIDs at once by passing the -j flag multiple times
[root@cvpost-master bin]# whonode -j 1447 -j 1455 -j 1400
JobID 1447 is on cvpost049/0
JobID 1455 is on cvpost059/0
JobID 1400 is on cvpost025/0+cvpost023/0
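A job that spans several nodes is reported as a single "+"-joined string, which is awkward to feed into other commands. The sketch below splits it into one node per line. It assumes the "JobID N is on ..." wording shown above and simply drops the "/0" suffix; job_nodes is a hypothetical helper name.

```shell
# Hypothetical helper: reads `whonode -j` output on stdin and prints
# one "JOBID NODE" pair per line.
# Assumption: lines look like "JobID 1400 is on cvpost025/0+cvpost023/0".
job_nodes() {
    awk '$1 == "JobID" {
        n = split($5, parts, "+")          # one entry per node
        for (i = 1; i <= n; i++) {
            sub(/\/.*/, "", parts[i])      # drop the "/0" suffix
            print $2, parts[i]
        }
    }'
}

# Intended use on cvpost-master (not run here):
#   whonode -j 1447 -j 1400 | job_nodes
```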
2.5 All users on reserved nodes
- whonode (with no arguments) lists every reservation
- The output ends with the total count of reserved nodes
[root@cvpost-master ~]# whonode
Node Name Job ID Username SessID Nodes Tasks Req.Time Status Elap.Time
cvpost058/0 1261.cvpost-serv rindebet 15230 1 1 2088:00:0 R 1273:17:1
cvpost029/0 1305.cvpost-serv cbrogan 127311 1 1 2160:00:0 R 1084:43:2
cvpost046/0 1328.cvpost-serv awootten 21319 1 1 744:00:00 R 722:54:00
cvpost005/0 1346.cvpost-serv elastufk 79962 1 1 960:00:00 R 653:11:08
cvpost065/0 1360.cvpost-serv swood 23926 1 1 720:00:00 R 558:07:21
cvpost051/0 1361.cvpost-serv awells 20860 1 1 672:00:00 R 557:31:10
cvpost016/0 1382.cvpost-serv efomalon 50061 1 1 720:00:00 R 506:44:00
cvpost043/0 1390.cvpost-serv amoullet 24986 1 1 720:00:00 R 479:36:19
cvpost030/0 1392.cvpost-serv jmangum 31640 1 1 1080:00:0 R 479:02:35
cvpost031/0 1399.cvpost-serv thunter 10718 1 1 1080:00:0 R 387:45:27
cvpost025+23/0 1400.cvpost-serv jmangum 126351 2 2 1080:00:0 R 385:42:39
cvpost048/0 1402.cvpost-serv akepley 15312 1 1 1080:00:0 R 360:35:19
cvpost039/0 1404.cvpost-serv jmeyer 11693 1 1 504:00:00 R 337:21:23
cvpost057/0 1410.cvpost-serv dschieb 29156 1 1 336:00:00 R 242:41:48
cvpost044/0 1413.cvpost-serv amcnicho 10867 1 1 336:00:00 R 222:12:12
cvpost040/0 1415.cvpost-serv jmeyer 15127 1 1 504:00:00 R 219:11:22
cvpost033+24/0 1416.cvpost-serv bmason 30493 2 2 336:00:00 R 217:17:21
cvpost026/0 1419.cvpost-serv jmeyer 92140 1 1 504:00:00 R 212:54:49
cvpost062/0 1421.cvpost-serv cv-8648 31844 1 1 336:00:00 R 197:58:11
cvpost056/0 1422.cvpost-serv cv-6434 5243 1 1 336:00:00 R 197:52:52
cvpost018/0 1424.cvpost-serv cv-8708 10290 1 1 336:00:00 R 197:51:06
cvpost017/0 1425.cvpost-serv cv-8650 40536 1 1 336:00:00 R 197:50:13
cvpost014/0 1426.cvpost-serv cv-8709 15098 1 1 240:00:00 R 197:48:16
cvpost013/0 1427.cvpost-serv cv-8602 31856 1 1 336:00:00 R 197:48:03
cvpost011/0 1429.cvpost-serv cv-8412 35302 1 1 336:00:00 R 197:37:52
cvpost063/0 1430.cvpost-serv cv-45 38757 1 1 336:00:00 R 196:22:22
cvpost009/0 1432.cvpost-serv reharris 2172 1 1 336:00:00 R 189:57:43
cvpost008/0 1433.cvpost-serv cubach 14880 1 1 1080:00:0 R 173:59:44
cvpost055/0 1434.cvpost-serv jhibbard 33564 1 1 960:00:00 R 171:52:04
cvpost054/0 1435.cvpost-serv ekeller 44365 1 1 1080:00:0 R 169:19:47
cvpost037/0 1437.cvpost-serv aremijan 4461 1 1 504:00:00 R 165:54:32
cvpost007/0 1438.cvpost-serv ppatil 44415 1 1 480:00:00 R 162:24:16
cvpost045/0 1440.cvpost-serv bkirk 32657 1 1 1080:00:0 R 147:32:38
cvpost036/0 1442.cvpost-serv cv-7798 23731 1 1 744:00:00 R 137:51:37
cvpost064/0 1445.cvpost-serv jdturner 4981 1 1 1080:00:0 R 83:30:48
cvpost052/0 1446.cvpost-serv cv-4963 4253 1 1 504:00:00 R 76:49:46
cvpost049/0 1447.cvpost-serv mlacy 30504 1 1 840:00:00 R 76:42:58
cvpost053/0 1448.cvpost-serv pfisher 46396 1 1 336:00:00 R 76:33:32
cvpost047/0 1449.cvpost-serv reharris 8249 1 1 336:00:00 R 76:26:12
cvpost041/0 1451.cvpost-serv gschieve 5834 1 1 120:00:00 R 71:59:21
cvpost038/0 1452.cvpost-serv pteuben 22985 1 1 480:00:00 R 54:37:06
cvpost035/0 1453.cvpost-serv rindebet 13081 1 1 1080:00:0 R 54:13:09
cvpost059/0 1455.cvpost-serv mlacy 9016 1 1 840:00:00 R 51:22:39
cvpost061/0 1457.cvpost-serv cv-7254 27340 1 1 336:00:00 R 27:46:10
cvpost042/0 1459.cvpost-serv bkent 23101 1 1 96:00:00 R 24:40:05
cvpost034/0 1460.cvpost-serv rrosen 25729 1 1 336:00:00 R 24:34:16
cvpost032/0 1461.cvpost-serv emurphy 45456 1 1 720:00:00 R 23:50:38
----------------------------
49 nodes are currently reserved.
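The full listing can also be summarized per user by adding up the Nodes column. This is a rough sketch, assuming the rows are whitespace-separated as shown above, with the username in the third field and the node count in the fifth; nodes_per_user is a hypothetical helper name.

```shell
# Hypothetical helper: reads plain `whonode` output on stdin and prints
# "USERNAME TOTAL_NODES" per user, sorted by username.
# Assumption: data rows start with "cvpost" and carry the username in
# field 3 and the number of nodes in field 5.
nodes_per_user() {
    awk '$1 ~ /^cvpost/ && $5 ~ /^[0-9]+$/ { total[$3] += $5 }
         END { for (u in total) print u, total[u] }' | sort
}

# Intended use on cvpost-master (not run here):
#   whonode | nodes_per_user
```

Header lines and the closing "nodes are currently reserved" line do not start with "cvpost", so they are skipped.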
3 Reservations
- Reservations are made from cvpost-master
3.1 Requirements
- If you are over your quota on cvfiler (/users/YOU), reserving a node fails with a "Disk quota exceeded" error
- To check your quota, type quota -vs (or visit The Account Check Page)
- You will not be able to reserve a node until you drop below your quota.
3.2 Make a reservation
- SSH to cvpost-master
- To reserve cluster nodes, type (number of days first, then number of nodes)
nodescheduler --request days nodes
- e.g.: to reserve 1 node for 7 days:
nodescheduler --request 7 1
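Because the two arguments are both plain numbers, they are easy to swap. The sketch below checks them and prints the command rather than running it. It assumes the order is days, then nodes, as in the 7-day, 1-node example above; request_cmd and is_positive_int are hypothetical helper names.

```shell
# Succeeds only for a positive integer without leading zeros.
is_positive_int() {
    case $1 in
        ''|*[!0-9]*|0*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical wrapper: validates DAYS and NODES, then prints the
# nodescheduler command instead of executing it.
# Assumption: the argument order is DAYS first, then NODES.
request_cmd() {
    if ! is_positive_int "$1" || ! is_positive_int "$2"; then
        echo "usage: request_cmd DAYS NODES (positive integers)" >&2
        return 1
    fi
    echo "nodescheduler --request $1 $2"
}

# Example: 1 node for 7 days
#   request_cmd 7 1
```

Printing the command first leaves a chance to read it back before pasting it on cvpost-master.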
3.3 Release a node (end a reservation)
- Return to cvpost-master
- Type
nodescheduler --terminate $JOBID.cvpost-serv-1.cv.nrao.edu
- Remember, you can run whonode -u $USERNAME to find out which nodes correspond to which JobIDs
- e.g.:
nodescheduler --terminate 1292.cvpost-serv-1.cv.nrao.edu
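whonode shows JobIDs in a shortened form such as 1360.cvpost-serv, while the terminate command above uses the full server name. The sketch below builds the full argument from either form and prints the command without running it. It assumes the suffix is always .cvpost-serv-1.cv.nrao.edu, as in the example above; terminate_cmd is a hypothetical helper name.

```shell
# Hypothetical helper: accepts "1292" or "1292.cvpost-serv" and prints the
# matching terminate command instead of executing it.
# Assumption: every job uses the server suffix shown in this section.
terminate_cmd() {
    id=${1%%.*}                 # keep only the numeric part before the first dot
    case $id in
        ''|*[!0-9]*)
            echo "usage: terminate_cmd JOBID" >&2
            return 1 ;;
    esac
    echo "nodescheduler --terminate $id.cvpost-serv-1.cv.nrao.edu"
}

# Example:
#   terminate_cmd 1292
```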