Clever Notes on the Cluster Nodes

1 Know your Nodes (Tech Specs)

1.1 Sort by attributes (model, for example)

  • Use technode -n cvpost### to query the model of a single node
  • Use technode -m model to list all nodes matching a particular model

[root@cvpost-master bin]# technode -n cvpost013
cvpost013 is a PowerEdge R620
[root@cvpost-master bin]# technode -m R620     
cvpost002:    Product Name: PowerEdge R620
cvpost003:    Product Name: PowerEdge R620
cvpost004:    Product Name: PowerEdge R620
cvpost011:    Product Name: PowerEdge R620
cvpost012:    Product Name: PowerEdge R620
cvpost013:    Product Name: PowerEdge R620
cvpost014:    Product Name: PowerEdge R620
cvpost015:    Product Name: PowerEdge R620
cvpost017:    Product Name: PowerEdge R620
cvpost018:    Product Name: PowerEdge R620
cvpost030:    Product Name: PowerEdge R620
cvpost031:    Product Name: PowerEdge R620
cvpost032:    Product Name: PowerEdge R620
cvpost033:    Product Name: PowerEdge R620
cvpost034:    Product Name: PowerEdge R620
cvpost035:    Product Name: PowerEdge R620
cvpost036:    Product Name: PowerEdge R620
cvpost037:    Product Name: PowerEdge R620
cvpost038:    Product Name: PowerEdge R620
cvpost039:    Product Name: PowerEdge R620
cvpost040:    Product Name: PowerEdge R620
cvpost041:    Product Name: PowerEdge R620
cvpost042:    Product Name: PowerEdge R620
cvpost043:    Product Name: PowerEdge R620
cvpost044:    Product Name: PowerEdge R620
cvpost045:    Product Name: PowerEdge R620
cvpost046:    Product Name: PowerEdge R620
cvpost047:    Product Name: PowerEdge R620
cvpost048:    Product Name: PowerEdge R620
cvpost049:    Product Name: PowerEdge R620
cvpost050:    Product Name: PowerEdge R620
cvpost051:    Product Name: PowerEdge R620
cvpost052:    Product Name: PowerEdge R620
cvpost053:    Product Name: PowerEdge R620
cvpost054:    Product Name: PowerEdge R620
cvpost055:    Product Name: PowerEdge R620
cvpost056:    Product Name: PowerEdge R620
cvpost057:    Product Name: PowerEdge R620
cvpost058:    Product Name: PowerEdge R620
cvpost059:    Product Name: PowerEdge R620
cvpost060:    Product Name: PowerEdge R620
cvpost061:    Product Name: PowerEdge R620
cvpost062:    Product Name: PowerEdge R620
cvpost063:    Product Name: PowerEdge R620
cvpost064:    Product Name: PowerEdge R620
cvpost065:    Product Name: PowerEdge R620
[root@cvpost-master bin]# 
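
If you want to reuse the technode -m listing in a script, a minimal sketch like the one below can turn it into a lookup table. It assumes every result line has the "nodename:    Product Name: model" shape shown in the transcript above. The function name parse_technode_models is hypothetical and not part of technode.

```python
import re

# Matches lines such as "cvpost002:    Product Name: PowerEdge R620".
# The line shape is taken from the transcript; anything else is ignored.
LINE_RE = re.compile(r"^(cvpost\d+):\s+Product Name:\s+(.+?)\s*$")


def parse_technode_models(text):
    """Return a dict mapping node name to product name (hypothetical helper)."""
    models = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            models[match.group(1)] = match.group(2)
    return models


# A few lines copied from the listing above, including a shell prompt line
# that the parser is expected to skip.
sample = """\
[root@cvpost-master bin]# technode -m R620
cvpost002:    Product Name: PowerEdge R620
cvpost003:    Product Name: PowerEdge R620
cvpost065:    Product Name: PowerEdge R620
"""

models = parse_technode_models(sample)
print(sorted(models))
```

Saved output from technode -m can be passed to the function as a single string; prompt lines and blank lines are skipped.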

1.2 What's reserved or in use

2 Know your Node (Users)

  • Use whonode on cvpost-master

2.1 How many nodes are reserved

  • whonode -c
  • Just the number of nodes reserved
[root@cvpost-master bin]# whonode -c
51 nodes are currently reserved.

2.2 How many nodes does a particular user have?

  • whonode -u username
  • You can look up multiple users at a time by passing the -u flag multiple times
  • Output will include the cvpost node names corresponding to the JobIDs belonging to the user.
#### NOTE: OUTPUT LOOKS BETTER ON AN ACTUAL TERMINAL ####
[root@cvpost-master ~]# whonode -u swood -u jmangum -u rindebet -u jmeyer

Node Name   Job ID         Username   SessID   Nodes   Tasks   Req.Time   Status   Elap.Time
cvpost065/0   1360.cvpost-serv   swood      23926   1   1   720:00:00   R   558:42:00
User swood has 1 node reserved.

Node Name   Job ID         Username   SessID   Nodes   Tasks   Req.Time   Status   Elap.Time
cvpost030/0   1392.cvpost-serv   jmangum    31640   1   1   1080:00:0   R   479:37:14
cvpost025+23/0   1400.cvpost-serv   jmangum    126351   2   2   1080:00:0   R   386:17:18
User jmangum has 3 nodes reserved.

Node Name   Job ID         Username   SessID   Nodes   Tasks   Req.Time   Status   Elap.Time
cvpost058/0   1261.cvpost-serv   rindebet   15230   1   1   2088:00:0   R   1273:51:5
cvpost035/0   1453.cvpost-serv   rindebet   13081   1   1   1080:00:0   R   54:47:48
User rindebet has 2 nodes reserved.

Node Name   Job ID         Username   SessID   Nodes   Tasks   Req.Time   Status   Elap.Time
cvpost039/0   1404.cvpost-serv   jmeyer     11693   1   1   504:00:00   R   337:56:02
cvpost040/0   1415.cvpost-serv   jmeyer     15127   1   1   504:00:00   R   219:46:01
cvpost026/0   1419.cvpost-serv   jmeyer     92140   1   1   504:00:00   R   213:29:28
User jmeyer has 3 nodes reserved.
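
The per-user totals above come from the Nodes column, not from the number of jobs: jmangum has two jobs, but one of them spans two nodes, for three nodes in total. The sketch below reproduces that arithmetic from saved whonode -u output. It assumes each job row splits into exactly nine whitespace-separated fields, as in the transcript. The helper name nodes_per_user is hypothetical.

```python
from collections import Counter


def nodes_per_user(text):
    """Sum the Nodes column per username (hypothetical helper).

    Assumes job rows have nine whitespace-separated fields:
    node, job ID, username, session ID, nodes, tasks,
    requested time, status, elapsed time.
    """
    totals = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Header and summary lines do not have nine fields starting with "cvpost".
        if len(fields) != 9 or not fields[0].startswith("cvpost"):
            continue
        username, node_count = fields[2], fields[4]
        if node_count.isdigit():
            totals[username] += int(node_count)
    return totals


# Rows copied from the transcript above.
sample = """\
Node Name   Job ID         Username   SessID   Nodes   Tasks   Req.Time   Status   Elap.Time
cvpost030/0   1392.cvpost-serv   jmangum    31640   1   1   1080:00:0   R   479:37:14
cvpost025+23/0   1400.cvpost-serv   jmangum    126351   2   2   1080:00:0   R   386:17:18
User jmangum has 3 nodes reserved.
cvpost065/0   1360.cvpost-serv   swood      23926   1   1   720:00:00   R   558:42:00
"""

totals = nodes_per_user(sample)
print(totals["jmangum"], totals["swood"])
```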

2.3 Who has node cvpostxxx?

  • whonode -n cvpostxxx
[root@cvpost-master bin]# whonode -n cvpost044
cvpost044 is reserved by amcnicho

2.4 What node does the JobID correspond to?

  • whonode -j JobID
  • You can look up multiple JobIDs at once by passing the -j flag multiple times
[root@cvpost-master bin]# whonode -j 1447 -j 1455 -j 1400
JobID 1447 is on cvpost049/0
JobID 1455 is on cvpost059/0
JobID 1400 is on cvpost025/0+cvpost023/0
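
A multi-node job is reported as several hosts joined with "+", as in the last line above. The following sketch splits such lines into a job-to-nodes mapping. It assumes the "JobID N is on host/slot+host/slot" shape seen in the transcript and simply drops whatever follows the "/" in each host entry. The helper name parse_job_nodes is hypothetical.

```python
import re

# Matches lines such as "JobID 1400 is on cvpost025/0+cvpost023/0".
JOB_RE = re.compile(r"^JobID (\d+) is on (\S+)$")


def parse_job_nodes(text):
    """Return a dict mapping job ID (as a string) to a list of node names."""
    jobs = {}
    for line in text.splitlines():
        match = JOB_RE.match(line.strip())
        if match:
            # "cvpost025/0+cvpost023/0" -> ["cvpost025", "cvpost023"]
            hosts = [entry.split("/")[0] for entry in match.group(2).split("+")]
            jobs[match.group(1)] = hosts
    return jobs


# Lines copied from the transcript above.
sample = """\
JobID 1447 is on cvpost049/0
JobID 1455 is on cvpost059/0
JobID 1400 is on cvpost025/0+cvpost023/0
"""

print(parse_job_nodes(sample))
```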

2.5 All users on reserved nodes

  • whonode
  • With no arguments, whonode lists every reservation and ends with the total count of reserved nodes

[root@cvpost-master ~]# whonode
Node Name   Job ID         Username   SessID   Nodes   Tasks   Req.Time   Status   Elap.Time
cvpost058/0   1261.cvpost-serv   rindebet   15230   1   1   2088:00:0   R   1273:17:1
cvpost029/0   1305.cvpost-serv   cbrogan    127311   1   1   2160:00:0   R   1084:43:2
cvpost046/0   1328.cvpost-serv   awootten   21319   1   1   744:00:00   R   722:54:00
cvpost005/0   1346.cvpost-serv   elastufk   79962   1   1   960:00:00   R   653:11:08
cvpost065/0   1360.cvpost-serv   swood      23926   1   1   720:00:00   R   558:07:21
cvpost051/0   1361.cvpost-serv   awells     20860   1   1   672:00:00   R   557:31:10
cvpost016/0   1382.cvpost-serv   efomalon   50061   1   1   720:00:00   R   506:44:00
cvpost043/0   1390.cvpost-serv   amoullet   24986   1   1   720:00:00   R   479:36:19
cvpost030/0   1392.cvpost-serv   jmangum    31640   1   1   1080:00:0   R   479:02:35
cvpost031/0   1399.cvpost-serv   thunter    10718   1   1   1080:00:0   R   387:45:27
cvpost025+23/0   1400.cvpost-serv   jmangum    126351   2   2   1080:00:0   R   385:42:39
cvpost048/0   1402.cvpost-serv   akepley    15312   1   1   1080:00:0   R   360:35:19
cvpost039/0   1404.cvpost-serv   jmeyer     11693   1   1   504:00:00   R   337:21:23
cvpost057/0   1410.cvpost-serv   dschieb    29156   1   1   336:00:00   R   242:41:48
cvpost044/0   1413.cvpost-serv   amcnicho   10867   1   1   336:00:00   R   222:12:12
cvpost040/0   1415.cvpost-serv   jmeyer     15127   1   1   504:00:00   R   219:11:22
cvpost033+24/0   1416.cvpost-serv   bmason     30493   2   2   336:00:00   R   217:17:21
cvpost026/0   1419.cvpost-serv   jmeyer     92140   1   1   504:00:00   R   212:54:49
cvpost062/0   1421.cvpost-serv   cv-8648    31844   1   1   336:00:00   R   197:58:11
cvpost056/0   1422.cvpost-serv   cv-6434    5243   1   1   336:00:00   R   197:52:52
cvpost018/0   1424.cvpost-serv   cv-8708    10290   1   1   336:00:00   R   197:51:06
cvpost017/0   1425.cvpost-serv   cv-8650    40536   1   1   336:00:00   R   197:50:13
cvpost014/0   1426.cvpost-serv   cv-8709    15098   1   1   240:00:00   R   197:48:16
cvpost013/0   1427.cvpost-serv   cv-8602    31856   1   1   336:00:00   R   197:48:03
cvpost011/0   1429.cvpost-serv   cv-8412    35302   1   1   336:00:00   R   197:37:52
cvpost063/0   1430.cvpost-serv   cv-45      38757   1   1   336:00:00   R   196:22:22
cvpost009/0   1432.cvpost-serv   reharris   2172   1   1   336:00:00   R   189:57:43
cvpost008/0   1433.cvpost-serv   cubach     14880   1   1   1080:00:0   R   173:59:44
cvpost055/0   1434.cvpost-serv   jhibbard   33564   1   1   960:00:00   R   171:52:04
cvpost054/0   1435.cvpost-serv   ekeller    44365   1   1   1080:00:0   R   169:19:47
cvpost037/0   1437.cvpost-serv   aremijan   4461   1   1   504:00:00   R   165:54:32
cvpost007/0   1438.cvpost-serv   ppatil     44415   1   1   480:00:00   R   162:24:16
cvpost045/0   1440.cvpost-serv   bkirk      32657   1   1   1080:00:0   R   147:32:38
cvpost036/0   1442.cvpost-serv   cv-7798    23731   1   1   744:00:00   R   137:51:37
cvpost064/0   1445.cvpost-serv   jdturner   4981   1   1   1080:00:0   R   83:30:48
cvpost052/0   1446.cvpost-serv   cv-4963    4253   1   1   504:00:00   R   76:49:46
cvpost049/0   1447.cvpost-serv   mlacy      30504   1   1   840:00:00   R   76:42:58
cvpost053/0   1448.cvpost-serv   pfisher    46396   1   1   336:00:00   R   76:33:32
cvpost047/0   1449.cvpost-serv   reharris   8249   1   1   336:00:00   R   76:26:12
cvpost041/0   1451.cvpost-serv   gschieve   5834   1   1   120:00:00   R   71:59:21
cvpost038/0   1452.cvpost-serv   pteuben    22985   1   1   480:00:00   R   54:37:06
cvpost035/0   1453.cvpost-serv   rindebet   13081   1   1   1080:00:0   R   54:13:09
cvpost059/0   1455.cvpost-serv   mlacy      9016   1   1   840:00:00   R   51:22:39
cvpost061/0   1457.cvpost-serv   cv-7254    27340   1   1   336:00:00   R   27:46:10
cvpost042/0   1459.cvpost-serv   bkent      23101   1   1   96:00:00   R   24:40:05
cvpost034/0   1460.cvpost-serv   rrosen     25729   1   1   336:00:00   R   24:34:16
cvpost032/0   1461.cvpost-serv   emurphy    45456   1   1   720:00:00   R   23:50:38

----------------------------

49 nodes are currently reserved.
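
The full listing can also help spot reservations that are close to running out. The sketch below is only an illustration: it assumes Req.Time and Elap.Time are the requested and elapsed durations in hours:minutes:seconds, and it uses only the hours field because the minutes and seconds are sometimes cut off in the listing. The helper name hours_left is hypothetical.

```python
def hours_left(text):
    """Return (username, node, hours) tuples, soonest to expire first.

    The hours value is the requested hours minus the elapsed hours, using only
    the leading hours field of each column, so it can overestimate the true
    remaining time by up to one hour.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Only job rows have nine fields and start with a cvpost node name.
        if len(fields) != 9 or not fields[0].startswith("cvpost"):
            continue
        requested = fields[6].split(":")[0]
        elapsed = fields[8].split(":")[0]
        if requested.isdigit() and elapsed.isdigit():
            rows.append((fields[2], fields[0], int(requested) - int(elapsed)))
    return sorted(rows, key=lambda row: row[2])


# Rows copied from the listing above.
sample = """\
cvpost058/0   1261.cvpost-serv   rindebet   15230   1   1   2088:00:0   R   1273:17:1
cvpost046/0   1328.cvpost-serv   awootten   21319   1   1   744:00:00   R   722:54:00
cvpost042/0   1459.cvpost-serv   bkent      23101   1   1   96:00:00   R   24:40:05
"""

for username, node, hours in hours_left(sample):
    print(username, node, hours)
```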

3 Reservations

  • Reservations are made from cvpost-master

3.1 Requirements

  • You will get a "Disk quota exceeded" error when reserving a node if you are over your quota on cvfiler (/users/YOU)
  • To check your quota, type quota -vs (or visit The Account Check Page)
  • You will not be able to reserve a node until you drop below your quota.

3.2 Make a reservation

  • SSH to cvpost-master
  • To reserve cluster nodes, type nodescheduler --request (numberofdays) (numberofnodes)
  • e.g.: to reserve 1 node for 7 days:
nodescheduler --request 7 1 
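
If you generate reservation commands from a script, a small sketch like this keeps the argument order explicit. It follows the worked example above (days first, then node count), which is an assumption taken from that example rather than from nodescheduler's own help text. The function name request_command is hypothetical, and the sketch only builds the command without running it.

```python
def request_command(days, nodes=1):
    """Build the nodescheduler request command as an argument list.

    Argument order (days, then nodes) is assumed from the example
    "nodescheduler --request 7 1" for 1 node for 7 days.
    """
    if days < 1 or nodes < 1:
        raise ValueError("days and nodes must be positive integers")
    return ["nodescheduler", "--request", str(days), str(nodes)]


# Reserve 1 node for 7 days, as in the example above.
print(" ".join(request_command(7, 1)))  # nodescheduler --request 7 1
```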

3.3 Release a node (end a reservation)

  • Return to cvpost-master
  • Type nodescheduler --terminate $JOBID.cvpost-serv-1.cv.nrao.edu
  • Remember, you can run whonode -u $USERNAME to find out which nodes correspond to which JobIDs
nodescheduler --terminate 1292.cvpost-serv-1.cv.nrao.edu 
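
whonode prints job IDs in a short form such as 1292.cvpost-serv, while the terminate command above uses the longer cvpost-serv-1.cv.nrao.edu suffix. The sketch below builds the terminate command from either a bare number or the short form. The function name terminate_command is hypothetical, the server suffix is copied from the example above, and nothing is executed.

```python
# Server suffix copied from the terminate example above.
SERVER_SUFFIX = "cvpost-serv-1.cv.nrao.edu"


def terminate_command(job_id):
    """Build the nodescheduler terminate command as an argument list.

    Accepts a bare job number (1292 or "1292") or the short form shown by
    whonode ("1292.cvpost-serv"); only the leading number is kept.
    """
    number = str(job_id).split(".")[0]
    if not number.isdigit():
        raise ValueError(f"not a numeric job ID: {job_id!r}")
    return ["nodescheduler", "--terminate", f"{number}.{SERVER_SUFFIX}"]


print(" ".join(terminate_command("1292.cvpost-serv")))
# nodescheduler --terminate 1292.cvpost-serv-1.cv.nrao.edu
```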
Topic revision: r1 - 2016-05-10, JessicaOtey