Hi Andy,

I'm afraid our distributed setup will be broken for a while... Thanks for 
your help and for the future fix ;-)

Regards,

On 11/11/2015 03:13 PM, Andrew Hanushevsky wrote:
> Hi Fabrice,
>
> Ah, OK, I see. This is a problem. There is no easy solution here. I 
> need to rework a bit of code to get the cmsd running. It has to do with 
> the way the initialization is ordered, sigh. I won't have something 
> immediately and it will require code changes in the SSI.
>
> Andy
>
> On Wed, 11 Nov 2015, Fabrice Jammes wrote:
>
>> Hi Andy,
>>
>> Here are the requested traces:
>>
>> *cmsd starts successfully with the first config:*
>>
>> qserv@ccqserv126:~$ cat cmsd.conf
>> all.role server
>> all.manager ccqserv125.in2p3.fr:2131
>> ssi.svclib libxrdsvc.so
>> #oss.statlib -2 libXrdSsi.so
>> qserv@ccqserv126:~$ cmsd -d -c cmsd.conf
>> 151111 22:45:19 103 Starting on Linux 3.10.0-229.20.1.el7.x86_64
>> Copr.  2004-2012 Stanford University, xrd version unknown
>> ++++++ cmsd [log in to unmask] initialization started.
>> Config using configuration file cmsd.conf
>> Config maximum number of connections restricted to 1048576
>> Config maximum number of threads restricted to 1048576
>> 151111 22:45:19 103 XrdConfig: sendfile enabled.
>> 151111 22:45:19 103 XrdSched: scheduling underused thread monitor in 
>> 780 seconds
>> 151111 22:45:19 104 XrdXeq: Buffer Manager reshaper thread started
>> 151111 22:45:19 105 XrdXeq: Time scheduler thread started
>> 151111 22:45:19 103 XrdSched: Starting with 2 workers
>> 151111 22:45:19 103 XrdLink: Allocating 8 link objects at a time
>> 151111 22:45:19 107 XrdXeq: Worker thread started
>> 151111 22:45:19 106 XrdXeq: Worker thread started
>> 151111 22:45:19 103 XrdPoll: Starting poller 0
>> 151111 22:45:19 108 XrdXeq: Poller thread started
>> 151111 22:45:19 103 XrdPoll: Starting poller 1
>> 151111 22:45:19 109 XrdXeq: Poller thread started
>> 151111 22:45:19 103 XrdPoll: Starting poller 2
>> 151111 22:45:19 110 XrdXeq: Poller thread started
>> 151111 22:45:19 103 XrdProtocol: getting port from protocol cmsd
>> Copr.  2007 Stanford University/SLAC cmsd.
>> ++++++ [log in to unmask] phase 1 initialization started.
>> =====> all.role server
>> =====> all.manager ccqserv125.in2p3.fr:2131
>> The following paths are available to the redirector:
>> r  /
>>
>> ------ [log in to unmask] phase 1 server initialization completed.
>> 151111 22:45:19 103 XrdConfig: LCL port 37568 wsz=87380 (87380)
>> 151111 22:45:19 103 XrdProtocol: getting protocol object cmsd
>> ++++++ [log in to unmask] phase 2 server initialization started.
>> Config warning: adminpath resides in /tmp and may be unstable!
>> 151111 22:45:19 103 Configure2 Global System Identification: anon-s 
>> 2131ccqserv125.in2p3.fr
>> ++++++ Storage system initialization started.
>> ++++++ Configuring standalone mode . . .
>> 151111 22:45:19 103 oss_AioInit: started AIO read signal thread; 
>> tid=1278469888
>> 151111 22:45:19 103 oss_AioInit: started AIO write signal thread; 
>> tid=1277417216
>> Config effective cmsd.conf oss configuration:
>>       oss.alloc        0 0 0
>>       oss.cachescan    600
>>       oss.fdlimit      524288 1048576
>>       oss.maxsize      0
>>       oss.trace        fff
>>       oss.xfr          1 deny 10800 keep 1200
>>       oss.memfile off  max 8355569664
>>       oss.defaults  r/w  nocheck nodread nomig norcreate nopurge 
>> nostage xattr
>> ------ Storage system initialization completed.
>> 151111 22:45:19 103 Start Srv=0 dfs=0 lcl=0 Pre=1 dmLife=0 0
>> 151111 22:45:19 103 Start Lim=0 0 fix=0 Qmax=1
>> 151111 22:45:19 103 Meter: Warning! No writable filesystems found.
>> 151111 22:45:19 103 Update Space Parm1=0 Parm2=0
>> 151111 22:45:19 103 Meter: Write access and staging prohibited.
>> ------ [log in to unmask] phase 2 server initialization completed.
>> 151111 22:45:19 107 XrdSched: running cmsd startup inq=0
>> 151111 22:45:19 113 XrdXeq: Notification handler thread started
>> 151111 22:45:19 115 XrdXeq: Admin traffic thread started
>> 151111 22:45:19 114 XrdXeq: Prep handler thread started
>> 151111 22:45:19 115 Start: Waiting for primary server to login.
>> ------ cmsd [log in to unmask]:37568 initialization completed.
>> 151111 22:45:19 106 XrdSched: Now have 3 workers
>> 151111 22:45:19 106 XrdSched: running main accept inq=0
>> 151111 22:45:19 117 XrdXeq: Worker thread started
>>
>> *cmsd crashes with the second config:*
>>
>> qserv@ccqserv126:~$ cat cmsd.conf
>> all.role server
>> all.manager ccqserv125.in2p3.fr:2131
>> ssi.svclib libxrdsvc.so
>> oss.statlib -2 libXrdSsi.so
>> qserv@ccqserv126:~$
>> qserv@ccqserv126:~$ cmsd -d -c cmsd.conf
>> 151111 22:58:54 137 Starting on Linux 3.10.0-229.20.1.el7.x86_64
>> Copr.  2004-2012 Stanford University, xrd version unknown
>> ++++++ cmsd [log in to unmask] initialization started.
>> Config using configuration file cmsd.conf
>> Config maximum number of connections restricted to 1048576
>> Config maximum number of threads restricted to 1048576
>> 151111 22:58:54 137 XrdConfig: sendfile enabled.
>> 151111 22:58:54 137 XrdSched: scheduling underused thread monitor in 
>> 780 seconds
>> 151111 22:58:54 138 XrdXeq: Buffer Manager reshaper thread started
>> 151111 22:58:54 141 XrdXeq: Worker thread started
>> 151111 22:58:54 137 XrdSched: Starting with 2 workers
>> 151111 22:58:54 137 XrdLink: Allocating 8 link objects at a time
>> 151111 22:58:54 139 XrdXeq: Time scheduler thread started
>> 151111 22:58:54 140 XrdXeq: Worker thread started
>> 151111 22:58:54 137 XrdPoll: Starting poller 0
>> 151111 22:58:54 142 XrdXeq: Poller thread started
>> 151111 22:58:54 137 XrdPoll: Starting poller 1
>> 151111 22:58:54 143 XrdXeq: Poller thread started
>> 151111 22:58:54 137 XrdPoll: Starting poller 2
>> 151111 22:58:54 144 XrdXeq: Poller thread started
>> 151111 22:58:54 137 XrdProtocol: getting port from protocol cmsd
>> Copr.  2007 Stanford University/SLAC cmsd.
>> ++++++ [log in to unmask] phase 1 initialization started.
>> =====> all.role server
>> =====> all.manager ccqserv125.in2p3.fr:2131
>> The following paths are available to the redirector:
>> r  /
>>
>> ------ [log in to unmask] phase 1 server initialization completed.
>> 151111 22:58:54 137 XrdConfig: LCL port 52851 wsz=87380 (87380)
>> 151111 22:58:54 137 XrdProtocol: getting protocol object cmsd
>> ++++++ [log in to unmask] phase 2 server initialization started.
>> Config warning: adminpath resides in /tmp and may be unstable!
>> 151111 22:58:54 137 Configure2 Global System Identification: anon-s 
>> 2131ccqserv125.in2p3.fr
>> ++++++ Storage system initialization started.
>> =====> oss.statlib -2 libXrdSsi.so
>> Plugin No such file or directory loading statlib libXrdSsi-4.so
>> Config Falling back to using libXrdSsi.so
>> ++++++ ssi phase 1 initialization started.
>> =====> all.role server
>> =====> ssi.svclib libxrdsvc.so
>> ------ ssi phase 1 initialization completed.
>> ++++++ ssi phase 2 initialization started.
>> 151111 22:58:54 137 sysFinder: Network i/f undefined; unable to 
>> self-locate.
>> ------ ssi phase 2 initialization failed.
>> ++++++ Configuring standalone mode . . .
>> ------ Storage system initialization failed.
>> ------ [log in to unmask] phase 2 server initialization failed.
>> 151111 22:58:54 137 XrdProtocol: Protocol cmsd could not be loaded
>> ------ cmsd [log in to unmask]:-1 initialization failed.
>>
>> Hope it'll help.
>>
>> Thanks
>>
>>
>> On 11/11/2015 02:10 PM, Andrew Hanushevsky wrote:
>>> Hi Fabrice,
>>>
>>> Odd. OK, my answers....
>>>
>>> On Wed, 11 Nov 2015, Fabrice Jammes wrote:
>>>
>>>>> 1) Who is producing the following messages?
>>>> These messages are in the cmsd logs and are produced by xrootd:
>>> Got it. OK, this is because of static initialization of something we 
>>> will not use but cannot easily avoid initializing. It should be OK.
>>>
>>>>> 2) The "statlib" uses the libXrdSsi.so because we packaged it 
>>>>> there as a convenience since we need to use the file registry. Do 
>>>>> you have a static initialization section that expects it will fire 
>>>>> up all of qserv? We don't want that.
>>>> I don't really understand this question, sorry. Here's our 
>>>> configuration file, it may help?
>>> I just answered it in (1). This is the xrootd client doing static 
>>> initialization and this is because the SSI library uses the client 
>>> so it is forced to be initialized when the client library is loaded.
>>>
>>>>> 3) This is a container, right?
>>>> Yes. FYI, our previous cmsd version was running fine under the same 
>>>> sort of container with the same network settings.
>>> Then it should run here.
>>>
>>>>> 5) I assume things are registered in DNS or at least appear 
>>>>> correctly in /etc/hosts otherwise we will have a problem. The 
>>>>> container has to look like an actual machine.
>>>> # run inside the container
>>>> root@ccqserv126:/qserv# ping ccqserv126
>>>> PING ccqserv126.in2p3.fr (172.17.0.7): 56 data bytes
>>>> 64 bytes from 172.17.0.7: icmp_seq=0 ttl=64 time=0.061 ms
>>>> 64 bytes from 172.17.0.7: icmp_seq=1 ttl=64 time=0.049 ms
>>> OK, it's properly registered. So, type up a small config file, as 
>>> follows:
>>>
>>> all.role server
>>> all.manager ccqserv125.in2p3.fr:2131
>>> ssi.svclib libxrdsvc.so
>>> #oss.statlib -2 libXrdSsi.so
>>>
>>> Set up the environment as you normally would but don't start 
>>> anything. By hand do:
>>>
>>> <path>/cmsd -d -c <path to config file above>
>>>
>>> Send the output to me. Then uncomment the "statlib" directive and do 
>>> the same thing again. Send that output to me as well.
>>>
>>> Andy
>>
>>
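
For anyone comparing the two traces quoted above, here is a minimal Python sketch of one way to find the first phase that reports a failure, together with the message printed just before it. It assumes the raw cmsd output (no "> " quote prefixes and no mail line wrapping), and it relies only on the "++++++ ... started." and "------ ... completed./failed." markers visible in the traces. The name first_failure is purely illustrative and is not part of xrootd.

```python
import re

# Phase markers as they appear in the traces:
#   "++++++ <name> started."   and   "------ <name> completed." / "failed."
MARKER = re.compile(
    r"^(?:\+{6}|-{6}) (?P<name>.+?) (?P<state>started|completed|failed)\.$"
)


def first_failure(log_text):
    """Return (failed_phase, preceding_line) for the first failed phase, or None."""
    previous = None  # last non-empty line that was not a phase marker
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = MARKER.match(line)
        if match is None:
            previous = line
            continue
        if match.group("state") == "failed":
            return match.group("name"), previous
    return None


# Excerpt of the second trace, with the wrapped sysFinder line rejoined.
trace = """\
++++++ ssi phase 2 initialization started.
151111 22:58:54 137 sysFinder: Network i/f undefined; unable to self-locate.
------ ssi phase 2 initialization failed.
++++++ Configuring standalone mode . . .
------ Storage system initialization failed.
"""

print(first_failure(trace))
```

On this excerpt the function returns the "ssi phase 2 initialization" marker paired with the sysFinder line, which is the same point where the second run diverges from the first.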
