Hi Andy,

I'm afraid our distributed setup will be broken for a while...
Thanks for your help and for the future fix ;-)

Regards,

On 11/11/2015 03:13 PM, Andrew Hanushevsky wrote:
> Hi Fabrice,
>
> Ah, OK, I see. This is a problem. There is no easy solution here. I
> need to rework a bit of code to get the cmsd running. It has to do with
> the way the initialization is ordered, sigh. I won't have something
> immediately and it will require code changes in the SSI.
>
> Andy
>
> On Wed, 11 Nov 2015, Fabrice Jammes wrote:
>
>> Hi Andy,
>>
>> Here are the requested traces:
>>
>> *cmsd starts successfully with the first config:*
>>
>> qserv@ccqserv126:~$ cat cmsd.conf
>> all.role server
>> all.manager ccqserv125.in2p3.fr:2131
>> ssi.svclib libxrdsvc.so
>> #oss.statlib -2 libXrdSsi.so
>> qserv@ccqserv126:~$ cmsd -d -c cmsd.conf
>> 151111 22:45:19 103 Starting on Linux 3.10.0-229.20.1.el7.x86_64
>> Copr. 2004-2012 Stanford University, xrd version unknown
>> ++++++ cmsd [log in to unmask] initialization started.
>> Config using configuration file cmsd.conf
>> Config maximum number of connections restricted to 1048576
>> Config maximum number of threads restricted to 1048576
>> 151111 22:45:19 103 XrdConfig: sendfile enabled.
>> 151111 22:45:19 103 XrdSched: scheduling underused thread monitor in
>> 780 seconds
>> 151111 22:45:19 104 XrdXeq: Buffer Manager reshaper thread started
>> 151111 22:45:19 105 XrdXeq: Time scheduler thread started
>> 151111 22:45:19 103 XrdSched: Starting with 2 workers
>> 151111 22:45:19 103 XrdLink: Allocating 8 link objects at a time
>> 151111 22:45:19 107 XrdXeq: Worker thread started
>> 151111 22:45:19 106 XrdXeq: Worker thread started
>> 151111 22:45:19 103 XrdPoll: Starting poller 0
>> 151111 22:45:19 108 XrdXeq: Poller thread started
>> 151111 22:45:19 103 XrdPoll: Starting poller 1
>> 151111 22:45:19 109 XrdXeq: Poller thread started
>> 151111 22:45:19 103 XrdPoll: Starting poller 2
>> 151111 22:45:19 110 XrdXeq: Poller thread started
>> 151111 22:45:19 103 XrdProtocol: getting port from protocol cmsd
>> Copr. 2007 Stanford University/SLAC cmsd.
>> ++++++ [log in to unmask] phase 1 initialization started.
>> =====> all.role server
>> =====> all.manager ccqserv125.in2p3.fr:2131
>> The following paths are available to the redirector:
>> r /
>>
>> ------ [log in to unmask] phase 1 server initialization completed.
>> 151111 22:45:19 103 XrdConfig: LCL port 37568 wsz=87380 (87380)
>> 151111 22:45:19 103 XrdProtocol: getting protocol object cmsd
>> ++++++ [log in to unmask] phase 2 server initialization started.
>> Config warning: adminpath resides in /tmp and may be unstable!
>> 151111 22:45:19 103 Configure2 Global System Identification: anon-s
>> 2131ccqserv125.in2p3.fr
>> ++++++ Storage system initialization started.
>> ++++++ Configuring standalone mode . . .
>> 151111 22:45:19 103 oss_AioInit: started AIO read signal thread;
>> tid=1278469888
>> 151111 22:45:19 103 oss_AioInit: started AIO write signal thread;
>> tid=1277417216
>> Config effective cmsd.conf oss configuration:
>> oss.alloc 0 0 0
>> oss.cachescan 600
>> oss.fdlimit 524288 1048576
>> oss.maxsize 0
>> oss.trace fff
>> oss.xfr 1 deny 10800 keep 1200
>> oss.memfile off max 8355569664
>> oss.defaults r/w nocheck nodread nomig norcreate nopurge
>> nostage xattr
>> ------ Storage system initialization completed.
>> 151111 22:45:19 103 Start Srv=0 dfs=0 lcl=0 Pre=1 dmLife=0 0
>> 151111 22:45:19 103 Start Lim=0 0 fix=0 Qmax=1
>> 151111 22:45:19 103 Meter: Warning! No writable filesystems found.
>> 151111 22:45:19 103 Update Space Parm1=0 Parm2=0
>> 151111 22:45:19 103 Meter: Write access and staging prohibited.
>> ------ [log in to unmask] phase 2 server initialization completed.
>> 151111 22:45:19 107 XrdSched: running cmsd startup inq=0
>> 151111 22:45:19 113 XrdXeq: Notification handler thread started
>> 151111 22:45:19 115 XrdXeq: Admin traffic thread started
>> 151111 22:45:19 114 XrdXeq: Prep handler thread started
>> 151111 22:45:19 115 Start: Waiting for primary server to login.
>> ------ cmsd [log in to unmask]:37568 initialization completed.
>> 151111 22:45:19 106 XrdSched: Now have 3 workers
>> 151111 22:45:19 106 XrdSched: running main accept inq=0
>> 151111 22:45:19 117 XrdXeq: Worker thread started
>>
>> *cmsd crashes with the second config:*
>>
>> qserv@ccqserv126:~$ cat cmsd.conf
>> all.role server
>> all.manager ccqserv125.in2p3.fr:2131
>> ssi.svclib libxrdsvc.so
>> oss.statlib -2 libXrdSsi.so
>> qserv@ccqserv126:~$
>> qserv@ccqserv126:~$ cmsd -d -c cmsd.conf
>> 151111 22:58:54 137 Starting on Linux 3.10.0-229.20.1.el7.x86_64
>> Copr. 2004-2012 Stanford University, xrd version unknown
>> ++++++ cmsd [log in to unmask] initialization started.
>> Config using configuration file cmsd.conf
>> Config maximum number of connections restricted to 1048576
>> Config maximum number of threads restricted to 1048576
>> 151111 22:58:54 137 XrdConfig: sendfile enabled.
>> 151111 22:58:54 137 XrdSched: scheduling underused thread monitor in
>> 780 seconds
>> 151111 22:58:54 138 XrdXeq: Buffer Manager reshaper thread started
>> 151111 22:58:54 141 XrdXeq: Worker thread started
>> 151111 22:58:54 137 XrdSched: Starting with 2 workers
>> 151111 22:58:54 137 XrdLink: Allocating 8 link objects at a time
>> 151111 22:58:54 139 XrdXeq: Time scheduler thread started
>> 151111 22:58:54 140 XrdXeq: Worker thread started
>> 151111 22:58:54 137 XrdPoll: Starting poller 0
>> 151111 22:58:54 142 XrdXeq: Poller thread started
>> 151111 22:58:54 137 XrdPoll: Starting poller 1
>> 151111 22:58:54 143 XrdXeq: Poller thread started
>> 151111 22:58:54 137 XrdPoll: Starting poller 2
>> 151111 22:58:54 144 XrdXeq: Poller thread started
>> 151111 22:58:54 137 XrdProtocol: getting port from protocol cmsd
>> Copr. 2007 Stanford University/SLAC cmsd.
>> ++++++ [log in to unmask] phase 1 initialization started.
>> =====> all.role server
>> =====> all.manager ccqserv125.in2p3.fr:2131
>> The following paths are available to the redirector:
>> r /
>>
>> ------ [log in to unmask] phase 1 server initialization completed.
>> 151111 22:58:54 137 XrdConfig: LCL port 52851 wsz=87380 (87380)
>> 151111 22:58:54 137 XrdProtocol: getting protocol object cmsd
>> ++++++ [log in to unmask] phase 2 server initialization started.
>> Config warning: adminpath resides in /tmp and may be unstable!
>> 151111 22:58:54 137 Configure2 Global System Identification: anon-s
>> 2131ccqserv125.in2p3.fr
>> ++++++ Storage system initialization started.
>> =====> oss.statlib -2 libXrdSsi.so
>> Plugin No such file or directory loading statlib libXrdSsi-4.so
>> Config Falling back to using libXrdSsi.so
>> ++++++ ssi phase 1 initialization started.
>> =====> all.role server
>> =====> ssi.svclib libxrdsvc.so
>> ------ ssi phase 1 initialization completed.
>> ++++++ ssi phase 2 initialization started.
>> 151111 22:58:54 137 sysFinder: Network i/f undefined; unable to
>> self-locate.
>> ------ ssi phase 2 initialization failed.
>> ++++++ Configuring standalone mode . . .
>> ------ Storage system initialization failed.
>> ------ [log in to unmask] phase 2 server initialization failed.
>> 151111 22:58:54 137 XrdProtocol: Protocol cmsd could not be loaded
>> ------ cmsd [log in to unmask]:-1 initialization failed.
>>
>> Hope it'll help.
>>
>> Thanks
>>
>>
>> On 11/11/2015 02:10 PM, Andrew Hanushevsky wrote:
>>> Hi Fabrice,
>>>
>>> Odd. OK, my answers....
>>>
>>> On Wed, 11 Nov 2015, Fabrice Jammes wrote:
>>>
>>>>> 1) Who is producing the following messages?
>>>> These messages are in the cmsd logs and are produced by xrootd:
>>> Got it. OK, this is because of static initialization of something we
>>> will not use but cannot easily avoid initializing. It should be OK.
>>>
>>>>> 2) The "statlib" uses the libXrdSsi.so because we packaged it
>>>>> there as a convenience since we need to use the file registry. Do
>>>>> you have a static initialization section that expects it will fire
>>>>> up all of qserv? We don't want that.
>>>> I don't really understand this question, sorry. Here's our
>>>> configuration file, it may help?
>>> I just answered it in (1). This is the xrootd client doing static
>>> initialization and this is because the SSI library uses the client
>>> so it is forced to be initialized when the client library is loaded.
>>>
>>>>> 3) This is a container, right?
>>>> Yes. FYI, our previous cmsd version was running fine under the same
>>>> sort of container with the same network settings.
>>> Then it should run here.
>>>
>>>>> 5) I assume things are registered in DNS or at least appear
>>>>> correctly in /etc/hosts otherwise we will have a problem. The
>>>>> container has to look like an actual machine.
>>>> # run inside the container
>>>> root@ccqserv126:/qserv# ping ccqserv126
>>>> PING ccqserv126.in2p3.fr (172.17.0.7): 56 data bytes
>>>> 64 bytes from 172.17.0.7: icmp_seq=0 ttl=64 time=0.061 ms
>>>> 64 bytes from 172.17.0.7: icmp_seq=1 ttl=64 time=0.049 ms
>>> OK, it's properly registered. So, type up a small config file, as
>>> follows:
>>>
>>> all.role server
>>> all.manager ccqserv125.in2p3.fr:2131
>>> ssi.svclib libxrdsvc.so
>>> #oss.statlib -2 libXrdSsi.so
>>>
>>> Set up the environment as you normally would but don't start
>>> anything. By hand do:
>>>
>>> <path>/cmsd -d -c <path to config file above>
>>>
>>> Send the output to me. Then uncomment the "statlib" directive and do
>>> the same thing again. Send that output to me as well.
>>>
>>> Andy
>>
>>

########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the QSERV-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
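
The two-run test from the thread (the same config, once with oss.statlib commented out and once with it enabled) could be scripted roughly as below. This is only a sketch: the helper names set_directive_enabled and failed_phases are made up for illustration and are not part of xrootd, and the log matching assumes the "------ ... initialization failed." line format seen in the traces above.

```python
import re

# Config text taken from the thread, with the statlib directive commented out.
BASE_CONFIG = """\
all.role server
all.manager ccqserv125.in2p3.fr:2131
ssi.svclib libxrdsvc.so
#oss.statlib -2 libXrdSsi.so
"""


def set_directive_enabled(config_text, directive, enabled):
    """Comment or uncomment every line whose first token is `directive`.

    Hypothetical helper: it only edits text, it does not validate the config.
    """
    pattern = re.compile(r"^\s*#?\s*(" + re.escape(directive) + r"\b.*)$")
    out = []
    for line in config_text.splitlines():
        match = pattern.match(line)
        if match:
            # Keep the directive body, with or without a leading '#'.
            line = match.group(1) if enabled else "#" + match.group(1)
        out.append(line)
    return "\n".join(out) + "\n"


def failed_phases(log_text):
    """Return the log lines that report a failed initialization phase.

    Assumes the '------ <name> initialization failed.' format from the traces.
    """
    return [
        line.strip()
        for line in log_text.splitlines()
        if line.startswith("------")
        and line.rstrip().endswith("initialization failed.")
    ]
```

A possible use: write set_directive_enabled(BASE_CONFIG, "oss.statlib", True) to cmsd.conf, run cmsd -d -c cmsd.conf by hand as in the thread, and pass the captured output to failed_phases to see which phase stopped first.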