Hi Andy,

I updated the RPMs on the redirector in our production system to test the supervisors. Now 4 out of 9 supervisors seem to be working and getting the necessary traffic. However, I see that some of them are going to status `suspend + nostage`.

If the data servers (~250) are already handled by 3-4 supervisors, do the extra supervisors go to `suspend + nostage` status by default?

Here is the log from one of the working supervisors:
```
150415 10:16:22 23726 Pander trying to connect to lvl 0 cmsxrootd.hep.wisc.edu:1213
150415 10:16:22 23726 Add cmsxrootd.hep.wisc.edu to manager config; id=0
150415 10:16:22 23726 ManTree: Now connected to 1 root node(s)
150415 10:16:22 23726 Protocol: Logged into cmsxrootd
150415 10:16:22 23708 AddNode srv server.10987:20@g19n12:31094 cluster 1213cmsxrootd.hep.wisc.edu mask=1 anum=0
150415 10:16:22 23708 Add server.10987:20@g19n12:31094 to cluster anon-s 1213cmsxrootd.hep.wisc.edu slot 0.4 (nodecnt=1 supn=1)
150415 10:16:22 23708 Update Counts Parm1=1 Parm2=0
150415 10:16:22 23708 Admit g19n12 TSpace=1GB NumFS=0 FSpace=0MB MinFR=0 MB Util=0 Share=100 TZone=-6
150415 10:16:22 23888 State: Status changed to active
150415 10:16:22 23888 Send status to redirector.23870:7@localhost
150415 10:16:22 23708 Admit g19n12 adding path: w /
150415 10:16:22 23708 server.10987:20@g19n12:31094 do_Space: 0MB free; 0% util
150415 10:16:22 23888 Inform cmsxrootd.hep.wisc.edu status
150415 10:16:22 23708 Inform cmsxrootd.hep.wisc.edu avail
150415 10:16:22 23708 Protocol: Primary server.10987:20@g19n12:31094 logged in.
=====> Routing for g19n12.hep.wisc.edu: local pub4 prv4 pub6 prv6
=====> Route all4: g19n12.hep.wisc.edu Dest=[::144.92.181.79]:31094
=====> Route all6: g19n12.hep.wisc.edu Dest=[2607:f388:101c:1000::198]:31094
150415 10:16:23 27313 AddNode srv server.8545:25@g26n24:31094 cluster 1213cmsxrootd.hep.wisc.edu mask=3 anum=0
150415 10:16:23 27313 Add server.8545:25@g26n24:31094 to cluster anon-s 1213cmsxrootd.hep.wisc.edu slot 1.5 (nodecnt=2 supn=1)
150415 10:16:23 27313 Update Counts Parm1=1 Parm2=0
150415 10:16:23 27313 Admit g26n24 TSpace=1GB NumFS=0 FSpace=0MB MinFR=0 MB Util=0 Share=100 TZone=-6
```

This is `cmsd.log` from another supervisor that isn't working:

```
------ cmsd [log in to unmask]:46536 initialization completed.
150415 10:31:27 23180 Start: Waiting for primary server to login.
150415 10:31:31 23178 Inet: Accepted connection from 7@localhost
150415 10:31:31 23178 Protocol: redirector.22586:7@localhost logged in.
150415 10:31:31 23178 Admit_Redirector redirector.22586:7@localhost assigned slot 1
150415 10:31:34 23178 Protocol: redirector.22586:7@localhost logged out; request read failed
150415 10:31:34 23178 Inet: Accepted connection from 7@localhost
150415 10:31:34 23178 Protocol: redirector.23267:7@localhost logged in.
150415 10:31:34 23178 Admit_Redirector redirector.23267:7@localhost assigned slot 1
150415 10:31:34 23284 Admin_Login initial request: 'login p 23267 port 31094'
150415 10:31:34 23284 Update FrontEnd Parm1=1 Parm2=31094
150415 10:31:34 23284 do_Login:: Primary server 23267 logged in; data port is 31094
150415 10:31:34 23179 Pander supervisor services to cmsxrootd.hep.wisc.edu:1213
150415 10:31:34 23179 Pander trying to connect to lvl 0 cmsxrootd.hep.wisc.edu:1213
150415 10:31:34 23179 Add cmsxrootd.hep.wisc.edu to manager config; id=0
150415 10:31:34 23179 ManTree: Now connected to 1 root node(s)
150415 10:31:34 23179 Protocol: Logged into cmsxrootd
150415 10:31:37 23162 Update Stage Parm1=-1 Parm2=0
150415 10:31:37 23162 Update Active Parm1=-1 Parm2=0
150415 10:31:37 23162 Config: supervisor service enabled.
150415 10:31:37 23285 State: Status changed to suspended + nostaging
150415 10:31:37 23285 Inform cmsxrootd.hep.wisc.edu status
```
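
To compare all 9 supervisors quickly, something like the following could pull the state transitions out of each `cmsd.log`. This is only a minimal sketch: it assumes every log line has the `YYMMDD HH:MM:SS <thread id> <message>` layout seen in the excerpts above, and the `state_changes` helper and the regular expression are my own illustration, not part of XRootD.

```python
import re

# Assumed line layout, taken from the log excerpts above:
# "YYMMDD HH:MM:SS <thread id> State: Status changed to <status>"
STATE_RE = re.compile(
    r"^(\d{6} \d{2}:\d{2}:\d{2}) (\d+) State: Status changed to (.+?)\s*$"
)


def state_changes(lines):
    """Return (timestamp, thread_id, status) for each state-change line."""
    changes = []
    for line in lines:
        match = STATE_RE.match(line)
        if match:
            timestamp, tid, status = match.groups()
            changes.append((timestamp, int(tid), status))
    return changes


# Sample lines copied from the two logs above.
sample = [
    "150415 10:31:37 23162 Config: supervisor service enabled.",
    "150415 10:31:37 23285 State: Status changed to suspended + nostaging",
    "150415 10:16:22 23888 State: Status changed to active",
]

for timestamp, tid, status in state_changes(sample):
    print(timestamp, tid, status)
```

Running it over each supervisor's log (for example by passing `open(path)` instead of `sample`) would show which ones ever reach `active` and which stay at `suspended + nostaging`.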

---
Reply to this email directly or view it on GitHub:
https://github.com/xrootd/xrootd/issues/227#issuecomment-93456345
