Hi Tommaso,

Comments interspersed....

On Fri, 28 Aug 2015, Tommaso Boccali wrote:

> ciao,
> the supervisor (started by hand with
> 'cmsd -d -l /var/log/xrootd/superv.log -c
> /etc/xrootd/xrootd-redir-cms-superv.cfg -k 7' and
> 'xrootd -d -l /var/log/xrootd/superv-x.log -c
> /etc/xrootd/xrootd-redir-cms-superv.cfg -k 7' )
>

The above may be fine, except that if you are running a supervisor on
the same node as the manager then you really need to differentiate it
using an instance name. You do this with the '-n' option (e.g.
'-n sup1'); otherwise the two daemons will stomp on each other, which is
not good. Instance names are described in:

http://xrootd.org/doc/dev42/xrd_config.htm#_Toc411375787

Yes, you need to start an xrootd/cmsd pair.

> while in the supervisor cmsd log I see
> which seems good, but then a series of:
>
> ...
> 150828 14:08:54 28799 manager.0:25@xrootd do_StateFWD: *Path find failed
> for state*
> /store/test/xrootd/T2_MY_UPM_BIRUNI/store/mc/HC/GenericTTbar/GEN-SIM-RECO/CMSSW_7_0_4_START70_V7-v1/00000/D00E55FF-F6CC-E311-9B51-02163E00E88E.root
> 150828 14:08:54 28799 Dispatch manager.0:25@xrootd for state dlen=148
> 150828 14:08:54 28799 manager.0:25@xrootd do_State:
> ...

Well, this is why I recommend not turning on debugging. This is output
from debugging, and the log will be flooded with these kinds of messages
(usually leading to a logfile that is several gigabytes long). All it
means is that the manager asked the supervisor to look up a file, but
there were no subscribers to the supervisor, so the lookup could not
occur. Quite normal.

> So I have a few questions
>
> 0 - very naive: for the supervisor do I need to start cmsd AND xrootd? If I
> do not do that, I see no effect at all

Yes, you need to start a supervisor xrootd/cmsd pair.

> 1 - is the error expected with my config ?

Yes, because full debugging is turned on (I would not recommend that).
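
To put the two points above together (instance name, no debugging),
here is a rough sketch of how the supervisor pair might be started. The
paths are the ones from your mail and 'sup1' is only an example instance
name. Printing the commands instead of running them is my own addition,
so you can review them first; treat this as a starting point rather than
a tested recipe.

```shell
#!/bin/sh
# Sketch only: build the start commands for a supervisor cmsd/xrootd pair
# that shares a node with the manager. "sup1" is an example instance name.
CFG=/etc/xrootd/xrootd-redir-cms-superv.cfg
NAME=sup1

# -n sets the instance name so the supervisor does not collide with the
# manager's daemons; -d (debugging) is deliberately left out.
CMSD_CMD="cmsd -n $NAME -l /var/log/xrootd/superv.log -c $CFG -k 7"
XROOTD_CMD="xrootd -n $NAME -l /var/log/xrootd/superv-x.log -c $CFG -k 7"

# Dry run: print the commands for review, then run them by hand.
echo "$CMSD_CMD"
echo "$XROOTD_CMD"
```

Both daemons get the same instance name and the same config file, as
they form one supervisor pair.
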
> 2 - I did not set explicitly the port numbers to the supervisor, and they
> of course cannot be 1094/1213 since they are already taken by the manager.
> I just have
> xrd.port any
>
> is that enough?

Yes, in fact the supervisor forces arbitrary port number assignment.
There is no need to give supervisors a specific port.

> 3 - should I see some servers being "moved" by the manager to the
> supervisor?

No, the only time a supervisor is used is when the 65th connection is
made to the manager. In that case, the connectee is told to connect to
the supervisor.

> 4 - atm I have not opened any additional port on the firewall, since with
> 'any' I do not know which port will be used. Should I open something?

Well, then we may have a problem here. What exactly is firewalled here?
That is, where do you restrict port number usage?

As for the config:

>
> [root@xrootd xrootd]# cat /etc/xrootd/xrootd-redir-cms-superv.cfg
>
> xrd.port any
> all.role supervisor
> # The known managers
> all.manager xrootd.ba.infn.it 1213
>
> # Allow any path to be exported; this is further refined in the authfile.

!!! Note that this causes problems for the xrdmapc command and isn't
fixed until 4.2.3 (Marian has the details).

> all.export / r/w
>
> # Hosts allowed to use this xrootd cluster
> cms.allow host *
> # Logging verbosity
> xrootd.trace emsg login stall redirect

!!! Unless there is a reason to turn on complete debugging, I strongly
advise against it, as it produces a humongous amount of log output, most
of which is useless in the general case.

> ofs.trace all -debug
> xrd.trace all -debug
> cms.trace all -debug
>
>
>
> On Thu, Aug 27, 2015 at 5:26 PM, Marian Zvada wrote:
>
>> Hi Tom,
>>
>> I haven't tried it though, but looks good to me.
>>
>> One more thing, we should not use thread limit by hard from whatever
>> v4.x.x version, I think. This is well cared of within recent fixes I
>> believe. I don't recall details right now but can search through if needed.
>>
>> So, feel free to remove the line "xrd.sched maxt 16000".
>>
>> -Marian
>>
>> On 8/27/15 3:49 AM, Tommaso Boccali wrote:
>>
>>> Ciao, I am trying to see if there is the need for a supervisor on one of
>>> our CMS EU redirs.
>>> In the logs, I never really see anything like 'If you suspect this,
>>> check the manager's log. It will contain warnings about orphaned data
>>> servers'
>>>
>>> so I am not sure we have a problem, but still we are very close to the
>>> 64 limit so better to be proactive.
>>>
>>>
>>> What I want to do as step #1 is to run the supervisor as an additional
>>> daemon on the redirector (it is a test, I really want to see what
>>> happens first, and the machine is big so should not be an issue)
>>>
>>> I looked at the documentation below, but I have to admit it is a bit
>>> obscure (to me).
>>>
>>> So, I have a cmsd/xrootd (the eu redir) running on ports 1213/1094, and
>>> redirecting "up" if needed (to the global redirector).
>>> Starting from their config, I just wanted to prepare a config for the
>>> supervisor.
>>>
>>> The minimal one I am trying to guess from the documentation would be
>>> ===
>>> xrd.port any
>>> all.role supervisor
>>> # The known managers
>>> all.manager xrootd.ba.infn.it 1213
>>>
>>> # Allow any path to be exported; this is further refined in the authfile.
>>> all.export / r/w
>>>
>>> # Hosts allowed to use this xrootd cluster
>>> cms.allow host *
>>> # Logging verbosity
>>> xrootd.trace emsg login stall redirect
>>> ofs.trace all -debug
>>> xrd.trace all -debug
>>> cms.trace all -debug
>>>
>>> cms.fxhold 8h
>>> xrd.sched maxt 16000
>>> ===
>>>
>>> but again, this is a sort of guess .... Do you have an example of a
>>> standalone cfg file for a supervisor?
>>>
>>>
>>> thanks
>>>
>>> tom
>>>
>>> http://xrootd.org/doc/dev42/cms_config.htm#_Toc405927050
>>>
>>> --
>>> Tommaso Boccali
>>> INFN Pisa
>>>
>>> ------------------------------------------------------------------------
>>>
>>> Use REPLY-ALL to reply to list
>>>
>>> To unsubscribe from the XROOTD-L list, click the following link:
>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>>>
>
>
> --
> Tommaso Boccali
> INFN Pisa