The fnal.gov address belongs to a worker node (probably running a CMS job).

Gerard

On Thu, Sep 3, 2015 at 4:54 PM, Andrew Hanushevsky <[log in to unmask]> wrote:
Hi Tommaso,

What are fw-nat-inside-outside.gridka.de and cmswn2148.fnal.gov? The
message clearly shows that whatever they sent over was incorrect. Yes,
4.2.2 would crash in this case, sigh.

Andy

On Wed, 26 Aug 2015, Tommaso Boccali wrote:

> ciao, another piece of info which might be interesting:
>
> I was looking into the bari eu redir, which uses xrootd
>
> xrootd-4.1.1-1.el5
>
> the cmsd.log has TONS of messages like
>
> 150826 05:18:00 30442 XrdInet: Accepted connection from [log in to unmask]
> 150826 05:18:00 30442 ?:[log in to unmask] XrdPoll: FD 90 attached to poller 0; num=23
> 150826 05:18:00 30442 Pup: buffer overrun unpacking short arg 0: ident.
> 150826 05:18:00 30442 Login: fw-nat-inside-outside.gridka.de login failed; invalid login data
> 150826 05:18:00 30442 ?:[log in to unmask] XrdPoll: FD 90 detached from poller 0; num=22
>
> from many servers, most from FNAL
>
> 150826 21:41:28 3396 Login: cmswn2148.fnal.gov login failed; invalid login data
> 150826 21:41:28 3436 Login: cmswn2146.fnal.gov login failed; invalid login data
> 150826 21:41:35 3461 Login: cmswn2131.fnal.gov login failed; invalid login data
> 150826 21:41:36 2475 Login: cmswn2158.fnal.gov login failed; invalid login data
> 150826 21:41:40 3461 Login: cmswn2150.fnal.gov login failed; invalid login data
> 150826 21:41:45 3458 Login: cmswn2160.fnal.gov login failed; invalid login data
> 150826 21:41:47 3396 Login: cmswn2131.fnal.gov login failed; invalid login data
> 150826 21:41:50 3461 Login: cmswn2140.fnal.gov login failed; invalid login data
> 150826 21:41:56 3458 Login: cmswn2147.fnal.gov login failed; invalid login data
>
> apparently, we did not notice because 4.1.1-1 does not crash the way 4.2.2
> does, but just carries on ...
>
> tom
>
> On Tue, Aug 25, 2015 at 9:07 PM, Marian Zvada <[log in to unmask]> wrote:
>
>> On 8/25/15 11:58 AM, Tommaso Boccali wrote:
>>
>>> Well, but: isn't the global redir only subscribed to by regional redirs (so
>>> not many)?
>>>
>>
>> you're right, I neglected this fact (outsmarted myself ;))...
>>
>>> Probably eu redirs are the most connected, with close to 64 cmsd
>>> entering... It's just normal we saw the problem there.
>>>
>>
>> ok, this is alarming and we should revise the current setup and introduce
>> more redirectors in the EU if needed. Btw, I recently talked with Andy about
>> this - using supervisors looks like a much more promising way to handle the
>> limit of 64:
>>
>> http://xrootd.org/doc/dev42/cms_config.htm#_Toc405927050
>>
>> I'm going to do this in the transitional federation, where there is one
>> global redirector for all T3s plus those subscribers who will be kicked off
>> the production federation and subscribed there instead.
>>
>> -Marian
>>
>>> Ifca said it has 336-1, which is fairly common. I guess it cannot be due
>>> to (just) the release....
>>>
>>> Andy, did you understand the source of the bad login data? Is it worth
>>> trying to debug it?
>>>
>>> Tom
>>>
>>> On 25 Aug 2015 06:21 PM, "Jan Iven" <[log in to unmask]> wrote:
>>>
>>>     On 08/25/2015 05:56 PM, Marian Zvada wrote:
>>>
>>>         Hi Tom,
>>>
>>>     [..]
>>>
>>>         yeah, that is my guess too, but then we have global redirectors
>>>         at CERN running 4.2.2 dealing with a hell of a lot of cmsd
>>>         subscriptions, so I'd expect some visible trouble there as well.
>>>         So maybe we're lucky there too so far... (I believe that
>>>         autorestart of cmsd if it crashes is disabled there, Jan?)
>>>
>>>
>>>     No, the CMS global redirectors are on CC7, and will auto-restart
>>>     cmsd on "unclean" exit (Restart=on-abort).  I hope that SEGV counts
>>>     as such...
>>>
>>>     Not sure whether we'd even notice the occasional restart, unless
>>>     another tool (abrt) picks this up.
>>>
>>>     Cheers
>>>     jan
>>>
>>>
>
>
> --
> Tommaso Boccali
> INFN Pisa
>
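
The repeated "Login: <host> login failed" lines quoted from cmsd.log in this thread can be tallied per host to see which clients send the bad login data. The following is only a minimal sketch: the function name count_failed_logins is hypothetical, and the regular expression assumes the line layout seen in the excerpts quoted here (date, time, thread id, "Login:", host). It matches only up to "login failed", so it does not depend on how the rest of the line is wrapped.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpts quoted in this thread:
#   150826 21:41:28 3396 Login: cmswn2148.fnal.gov login failed; invalid login data
LOGIN_FAILED = re.compile(
    r"^(?P<date>\d{6}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<tid>\d+) "
    r"Login: (?P<host>\S+) login failed"
)


def count_failed_logins(lines):
    """Return a Counter mapping host name -> number of failed-login lines."""
    counts = Counter()
    for line in lines:
        match = LOGIN_FAILED.match(line)
        if match:
            counts[match.group("host")] += 1
    return counts
```

To use it on a real log, pass an open file object (for example open("cmsd.log"), where the path is whatever your installation uses) and print counts.most_common() to list the noisiest hosts first.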

########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1



--
Gerard Bernabeu Altayó
Deputy Department Head

Distributed Computing Services Operations
Fermi National Accelerator Laboratory
630 840 6509 office
www.fnal.gov

