Hi Fabrice,

Ah, OK, I see. This is a problem. There is no easy solution here. I need 
to rework a bit of code to get the cmsd running. It has to do with the way 
the initialization is ordered, sigh. I won't have something immediately and 
it will require code changes in the SSI.

Andy

On Wed, 11 Nov 2015, Fabrice Jammes wrote:

> Hi Andy,
>
> Here are the requested traces:
>
> *cmsd starts successfully with the first config:*
>
> qserv@ccqserv126:~$ cat cmsd.conf
> all.role server
> all.manager ccqserv125.in2p3.fr:2131
> ssi.svclib libxrdsvc.so
> #oss.statlib -2 libXrdSsi.so
> qserv@ccqserv126:~$ cmsd -d -c cmsd.conf
> 151111 22:45:19 103 Starting on Linux 3.10.0-229.20.1.el7.x86_64
> Copr.  2004-2012 Stanford University, xrd version unknown
> ++++++ cmsd [log in to unmask] initialization started.
> Config using configuration file cmsd.conf
> Config maximum number of connections restricted to 1048576
> Config maximum number of threads restricted to 1048576
> 151111 22:45:19 103 XrdConfig: sendfile enabled.
> 151111 22:45:19 103 XrdSched: scheduling underused thread monitor in 780 
> seconds
> 151111 22:45:19 104 XrdXeq: Buffer Manager reshaper thread started
> 151111 22:45:19 105 XrdXeq: Time scheduler thread started
> 151111 22:45:19 103 XrdSched: Starting with 2 workers
> 151111 22:45:19 103 XrdLink: Allocating 8 link objects at a time
> 151111 22:45:19 107 XrdXeq: Worker thread started
> 151111 22:45:19 106 XrdXeq: Worker thread started
> 151111 22:45:19 103 XrdPoll: Starting poller 0
> 151111 22:45:19 108 XrdXeq: Poller thread started
> 151111 22:45:19 103 XrdPoll: Starting poller 1
> 151111 22:45:19 109 XrdXeq: Poller thread started
> 151111 22:45:19 103 XrdPoll: Starting poller 2
> 151111 22:45:19 110 XrdXeq: Poller thread started
> 151111 22:45:19 103 XrdProtocol: getting port from protocol cmsd
> Copr.  2007 Stanford University/SLAC cmsd.
> ++++++ [log in to unmask] phase 1 initialization started.
> =====> all.role server
> =====> all.manager ccqserv125.in2p3.fr:2131
> The following paths are available to the redirector:
> r  /
>
> ------ [log in to unmask] phase 1 server initialization completed.
> 151111 22:45:19 103 XrdConfig: LCL port 37568 wsz=87380 (87380)
> 151111 22:45:19 103 XrdProtocol: getting protocol object cmsd
> ++++++ [log in to unmask] phase 2 server initialization started.
> Config warning: adminpath resides in /tmp and may be unstable!
> 151111 22:45:19 103 Configure2 Global System Identification: anon-s 
> 2131ccqserv125.in2p3.fr
> ++++++ Storage system initialization started.
> ++++++ Configuring standalone mode . . .
> 151111 22:45:19 103 oss_AioInit: started AIO read signal thread; 
> tid=1278469888
> 151111 22:45:19 103 oss_AioInit: started AIO write signal thread; 
> tid=1277417216
> Config effective cmsd.conf oss configuration:
>       oss.alloc        0 0 0
>       oss.cachescan    600
>       oss.fdlimit      524288 1048576
>       oss.maxsize      0
>       oss.trace        fff
>       oss.xfr          1 deny 10800 keep 1200
>       oss.memfile off  max 8355569664
>       oss.defaults  r/w  nocheck nodread nomig norcreate nopurge nostage 
> xattr
> ------ Storage system initialization completed.
> 151111 22:45:19 103 Start Srv=0 dfs=0 lcl=0 Pre=1 dmLife=0 0
> 151111 22:45:19 103 Start Lim=0 0 fix=0 Qmax=1
> 151111 22:45:19 103 Meter: Warning! No writable filesystems found.
> 151111 22:45:19 103 Update Space Parm1=0 Parm2=0
> 151111 22:45:19 103 Meter: Write access and staging prohibited.
> ------ [log in to unmask] phase 2 server initialization completed.
> 151111 22:45:19 107 XrdSched: running cmsd startup inq=0
> 151111 22:45:19 113 XrdXeq: Notification handler thread started
> 151111 22:45:19 115 XrdXeq: Admin traffic thread started
> 151111 22:45:19 114 XrdXeq: Prep handler thread started
> 151111 22:45:19 115 Start: Waiting for primary server to login.
> ------ cmsd [log in to unmask]:37568 initialization completed.
> 151111 22:45:19 106 XrdSched: Now have 3 workers
> 151111 22:45:19 106 XrdSched: running main accept inq=0
> 151111 22:45:19 117 XrdXeq: Worker thread started
>
> *cmsd crashes with the second config:*
>
> qserv@ccqserv126:~$ cat cmsd.conf
> all.role server
> all.manager ccqserv125.in2p3.fr:2131
> ssi.svclib libxrdsvc.so
> oss.statlib -2 libXrdSsi.so
> qserv@ccqserv126:~$
> qserv@ccqserv126:~$ cmsd -d -c cmsd.conf
> 151111 22:58:54 137 Starting on Linux 3.10.0-229.20.1.el7.x86_64
> Copr.  2004-2012 Stanford University, xrd version unknown
> ++++++ cmsd [log in to unmask] initialization started.
> Config using configuration file cmsd.conf
> Config maximum number of connections restricted to 1048576
> Config maximum number of threads restricted to 1048576
> 151111 22:58:54 137 XrdConfig: sendfile enabled.
> 151111 22:58:54 137 XrdSched: scheduling underused thread monitor in 780 
> seconds
> 151111 22:58:54 138 XrdXeq: Buffer Manager reshaper thread started
> 151111 22:58:54 141 XrdXeq: Worker thread started
> 151111 22:58:54 137 XrdSched: Starting with 2 workers
> 151111 22:58:54 137 XrdLink: Allocating 8 link objects at a time
> 151111 22:58:54 139 XrdXeq: Time scheduler thread started
> 151111 22:58:54 140 XrdXeq: Worker thread started
> 151111 22:58:54 137 XrdPoll: Starting poller 0
> 151111 22:58:54 142 XrdXeq: Poller thread started
> 151111 22:58:54 137 XrdPoll: Starting poller 1
> 151111 22:58:54 143 XrdXeq: Poller thread started
> 151111 22:58:54 137 XrdPoll: Starting poller 2
> 151111 22:58:54 144 XrdXeq: Poller thread started
> 151111 22:58:54 137 XrdProtocol: getting port from protocol cmsd
> Copr.  2007 Stanford University/SLAC cmsd.
> ++++++ [log in to unmask] phase 1 initialization started.
> =====> all.role server
> =====> all.manager ccqserv125.in2p3.fr:2131
> The following paths are available to the redirector:
> r  /
>
> ------ [log in to unmask] phase 1 server initialization completed.
> 151111 22:58:54 137 XrdConfig: LCL port 52851 wsz=87380 (87380)
> 151111 22:58:54 137 XrdProtocol: getting protocol object cmsd
> ++++++ [log in to unmask] phase 2 server initialization started.
> Config warning: adminpath resides in /tmp and may be unstable!
> 151111 22:58:54 137 Configure2 Global System Identification: anon-s 
> 2131ccqserv125.in2p3.fr
> ++++++ Storage system initialization started.
> =====> oss.statlib -2 libXrdSsi.so
> Plugin No such file or directory loading statlib libXrdSsi-4.so
> Config Falling back to using libXrdSsi.so
> ++++++ ssi phase 1 initialization started.
> =====> all.role server
> =====> ssi.svclib libxrdsvc.so
> ------ ssi phase 1 initialization completed.
> ++++++ ssi phase 2 initialization started.
> 151111 22:58:54 137 sysFinder: Network i/f undefined; unable to self-locate.
> ------ ssi phase 2 initialization failed.
> ++++++ Configuring standalone mode . . .
> ------ Storage system initialization failed.
> ------ [log in to unmask] phase 2 server initialization failed.
> 151111 22:58:54 137 XrdProtocol: Protocol cmsd could not be loaded
> ------ cmsd [log in to unmask]:-1 initialization failed.
>
> Hope it'll help.
>
> Thanks
>
>
> On 11/11/2015 02:10 PM, Andrew Hanushevsky wrote:
>> Hi Fabrice,
>> 
>> Odd. OK, my answers....
>> 
>> On Wed, 11 Nov 2015, Fabrice Jammes wrote:
>> 
>>>> 1) Who is producing the following messages?
>>> These messages are in cmsd logs and are produced by xrootd:
>> Got it. OK, this is because of static initialization of something we will 
>> not use but cannot easily avoid initializing. It should be OK.
>> 
>>>> 2) The "statlib" uses the libXrdSsi.so because we packaged it there as a 
>>>> convenience since we need to use the file registry. Do you have a static 
>>>> initialization section that expects it will fire up all of qserv? We 
>>>> don't want that.
>>> I don't really understand this question, sorry. Here's our configuration 
>>> file, it may help?
>> I just answered it in (1). This is the xrootd client doing static 
>> initialization and this is because the SSI library uses the client so it is 
>> forced to be initialized when the client library is loaded.
>> 
>>>> 3) This is a container, right?
>>> Yes. FYI, our previous cmsd version was running fine under the same sort 
>>> of container with the same network settings.
>> Then it should run here.
>> 
>>>> 5) I assume things are registered in DNS or at least appear correctly in 
>>>> /etc/hosts otherwise we will have a problem. The container has to look 
>>>> like an actual machine.
>>> # run inside the container
>>> root@ccqserv126:/qserv# ping ccqserv126
>>> PING ccqserv126.in2p3.fr (172.17.0.7): 56 data bytes
>>> 64 bytes from 172.17.0.7: icmp_seq=0 ttl=64 time=0.061 ms
>>> 64 bytes from 172.17.0.7: icmp_seq=1 ttl=64 time=0.049 ms
>> OK, it's properly registered. So, type up a small config file, as follows:
>> 
>> all.role server
>> all.manager ccqserv125.in2p3.fr:2131
>> ssi.svclib libxrdsvc.so
>> #oss.statlib -2 libXrdSsi.so
>> 
>> Set up the environment as you normally would but don't start anything. By 
>> hand do:
>> 
>> <path>/cmsd -d -c <path to config file above>
>> 
>> Send the output to me. Then uncomment the "statlib" directive and do the 
>> same thing again. Send that output to me as well.
>> 
>> Andy
>
>
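
The two-run comparison requested in the thread (the same four-line config, once with the "statlib" directive commented out and once with it active) could be prepared with a small script along these lines. This is only a sketch: the helper name write_conf and the output file names are made up for illustration, and the cmsd invocations are left as comments because a successfully started cmsd stays in the foreground.

```shell
#!/bin/sh
# Hypothetical helper (not part of xrootd): print the test config from the
# thread. With argument "on" the oss.statlib directive is active; with any
# other argument it is commented out.
write_conf() {
    prefix='#'
    [ "$1" = "on" ] && prefix=''
    cat <<EOF
all.role server
all.manager ccqserv125.in2p3.fr:2131
ssi.svclib libxrdsvc.so
${prefix}oss.statlib -2 libXrdSsi.so
EOF
}

# File names below are assumptions chosen for this example.
write_conf off > cmsd-nostatlib.conf
write_conf on  > cmsd-statlib.conf

# Then run each config by hand and keep the output, for example:
#   cmsd -d -c cmsd-nostatlib.conf > nostatlib.log 2>&1
#   cmsd -d -c cmsd-statlib.conf   > statlib.log 2>&1
```

Generating both files from one template keeps the only difference between the runs to the single "statlib" line, which makes the two logs easier to compare.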
