ciao, coming back to this, a few months later (on 4.2.3)

i still see TONS of

160218 17:08:17 42171 Pup: buffer overrun unpacking short arg 0: ident.
160218 17:08:17 42171 Login: gridlink.hephy.oeaw.ac.at login failed; invalid login data
...
160218 17:07:52 42118 Login: grid-wn080.physik.rwth-aachen.de login failed; invalid login data
...
160218 17:04:06 40163 Login: fw-nat-inside-outside.gridka.de login failed; invalid login data
...
160218 16:53:56 25501 Login: wna033.jinr-t1.ru login failed; invalid login data


in cmsd.log

not sure it has any bad effect ... but: should we care?

this is at least 1 Hz, and comes from multiple sites ....
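
To put a number on the rate and on the set of hosts involved, the "login failed" entries can be tallied per host. The following is only a minimal sketch: it assumes each entry sits on a single line in the actual cmsd.log (the wrapping above comes from the mail client) and that the timestamp is YYMMDD HH:MM:SS as in the excerpt. The function name and the log path are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Matches entries such as:
#   160218 17:08:17 42171 Login: gridlink.hephy.oeaw.ac.at login failed; invalid login data
# (format assumed from the excerpt above, one entry per line)
LOGIN_FAILED = re.compile(
    r"^(\d{6} \d{2}:\d{2}:\d{2}) \d+ Login: (\S+) login failed"
)


def summarize_login_failures(lines):
    """Return (failures per host, average failures per second)."""
    per_host = Counter()
    stamps = []
    for line in lines:
        m = LOGIN_FAILED.match(line)
        if m is None:
            continue  # not a failed-login entry
        stamps.append(datetime.strptime(m.group(1), "%y%m%d %H:%M:%S"))
        per_host[m.group(2)] += 1
    if len(stamps) < 2:
        return per_host, 0.0
    # Average rate over the time span covered by the matching entries.
    span = (max(stamps) - min(stamps)).total_seconds()
    rate = len(stamps) / span if span > 0 else 0.0
    return per_host, rate


# Hypothetical usage (the path is an assumption):
#   with open("/var/log/xrootd/cmsd.log") as f:
#       per_host, rate = summarize_login_failures(f)
#   print(per_host.most_common(10), rate)
```

Grouping by host rather than by site keeps the sketch simple; collapsing hosts to their domain would be a small extra step if a per-site view is more useful.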


tom

On Fri, Sep 4, 2015 at 7:55 AM, Andrew Hanushevsky <[log in to unmask]>
wrote:

> Hi Tommaso,
>
> You mentioned that the fnal.gov addresses are worker nodes. Why are they
> connecting to the cmsd?
>
> Andy
>
>
> On Fri, 4 Sep 2015, Tommaso Boccali wrote:
>
> By the way, yesterday I upgraded the eu redir to 4.2.3. Seems to work fine,
>> even if there is less than 1 day of statistics for the moment....
>>
>> Tom
>> On 04 Sep 2015 01:33 AM, "Gerard Bernabeu" <[log in to unmask]> wrote:
>>
>> the fnal.gov address is from a WorkerNode (probably running a CMS job).
>>>
>>> Gerard
>>>
>>> On Thu, Sep 3, 2015 at 4:54 PM, Andrew Hanushevsky <
>>> [log in to unmask]>
>>> wrote:
>>>
>>> Hi Tommaso,
>>>>
>>>> What are fw-nat-inside-outside.gridka.de and cmswn2148.fnal.gov? The
>>>> message clearly shows that whatever they sent over was incorrect. Yes,
>>>> 4.2.2 would crash in this case, sigh.
>>>>
>>>> Andy
>>>>
>>>> On Wed, 26 Aug 2015, Tommaso Boccali wrote:
>>>>
>>>> ciao, another piece of info which might be interesting:
>>>>>
>>>>> I was looking into the bari eu redir, which uses xrootd
>>>>>
>>>>> xrootd-4.1.1-1.el5
>>>>>
>>>>> the cmsd.log has TONS of messages like
>>>>>
>>>>> 150826 05:18:00 30442 XrdInet: Accepted connection from [log in to unmask]
>>>>> 150826 05:18:00 30442 ?:[log in to unmask] XrdPoll: FD 90 attached to poller 0; num=23
>>>>> 150826 05:18:00 30442 Pup: buffer overrun unpacking short arg 0: ident.
>>>>> 150826 05:18:00 30442 Login: fw-nat-inside-outside.gridka.de login failed; invalid login data
>>>>> 150826 05:18:00 30442 ?:[log in to unmask] XrdPoll: FD 90 detached from poller 0; num=22
>>>>>
>>>>> from many servers, most from FNAL
>>>>>
>>>>> 150826 21:41:28 3396 Login: cmswn2148.fnal.gov login failed; invalid login data
>>>>> 150826 21:41:28 3436 Login: cmswn2146.fnal.gov login failed; invalid login data
>>>>> 150826 21:41:35 3461 Login: cmswn2131.fnal.gov login failed; invalid login data
>>>>> 150826 21:41:36 2475 Login: cmswn2158.fnal.gov login failed; invalid login data
>>>>> 150826 21:41:40 3461 Login: cmswn2150.fnal.gov login failed; invalid login data
>>>>> 150826 21:41:45 3458 Login: cmswn2160.fnal.gov login failed; invalid login data
>>>>> 150826 21:41:47 3396 Login: cmswn2131.fnal.gov login failed; invalid login data
>>>>> 150826 21:41:50 3461 Login: cmswn2140.fnal.gov login failed; invalid login data
>>>>> 150826 21:41:56 3458 Login: cmswn2147.fnal.gov login failed; invalid login data
>>>>>
>>>>> apparently, we did not notice since 4.1.1-1 does not crash as 4.2.2
>>>>> does, but moves along ...
>>>>>
>>>>> tom
>>>>>
>>>>> On Tue, Aug 25, 2015 at 9:07 PM, Marian Zvada <[log in to unmask]> wrote:
>>>>
>>>>>
>>>>> On 8/25/15 11:58 AM, Tommaso Boccali wrote:
>>>>>>
>>>>>>> Well, but: isn't the global redir only subscribed by regional redirs
>>>>>>> (so not many)?
>>>>>>
>>>>>> you're right, I neglected this fact (outsmarted myself ;))...
>>>>>>
>>>>>>> Probably eu redirs are the most connected, with close to 64 cmsd
>>>>>>> entering... It's just normal we saw the problem there.
>>>>>>
>>>>>> ok, this is alarming and we should revise the current setup and
>>>>>> introduce more redirectors if needed in EU. Btw, I recently talked
>>>>>> with Andy about this - a much more promising way to handle the 64
>>>>>> limit looks to be supervisors:
>>>>>>
>>>>>> http://xrootd.org/doc/dev42/cms_config.htm#_Toc405927050
>>>>>>
>>>>>> I'm going to do this in the transitional federation, where there is
>>>>>> one global redirector for all T3s and then for those subscribers who
>>>>>> will be kicked off from the production federation and subscribed
>>>>>> there instead.
>>>>>>
>>>>>> -Marian
>>>>>>
>>>>>>> Ifca said it has 336-1, which is fairly common. I guess it cannot be
>>>>>>> due to (just) the release....
>>>>>>>
>>>>>>> Andy, did you understand the source of the bad login data? Is it
>>>>>>> worth trying to debug it?
>>>>>>>
>>>>>>> Tom
>>>>>>>
>>>>>>> On 25 Aug 2015 06:21 PM, "Jan Iven" <[log in to unmask]
>>>>>>> <mailto:[log in to unmask]>> wrote:
>>>>>>>
>>>>>>>     On 08/25/2015 05:56 PM, Marian Zvada wrote:
>>>>>>>
>>>>>>>         Hi Tom,
>>>>>>>
>>>>>>>     [..]
>>>>>>>
>>>>>>>         yeah, that is my guess too, but then we have global
>>>>>>>         redirectors at CERN running 4.2.2 dealing with a hell of a
>>>>>>>         lot of cmsd subscriptions, so I'd expect some visible
>>>>>>>         trouble there as well. So maybe we're lucky there too so
>>>>>>>         far... (I believe that autorestart of cmsd if it crashes is
>>>>>>>         disabled there, Jan?)
>>>>>>>
>>>>>>>
>>>>>>>     No, the CMS global redirectors are on CC7, and will auto-restart
>>>>>>>     cmsd on "unclean" exit (Restart=on-abort).  I hope that SEGV
>>>>>>>     counts as such...
>>>>>>>
>>>>>>>     Not sure whether we'd even notice the occasional restart, unless
>>>>>>>     another tool (abrt) picks this up.
>>>>>>>
>>>>>>>     Cheers
>>>>>>>     jan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> Tommaso Boccali
>>>>> INFN Pisa
>>>>>
>>>>>
>>>> ########################################################################
>>>> Use REPLY-ALL to reply to list
>>>>
>>>> To unsubscribe from the XROOTD-L list, click the following link:
>>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>>>>
>>>>
>>>
>>>
>>> --
>>> *Gerard Bernabeu Altayó*
>>> Deputy Department Head
>>>
>>> Distributed Computing Services Operations
>>> Fermi National Accelerator Laboratory
>>> 630 840 6509 office
>>> www.fnal.gov
>>>
>>>


-- 
Tommaso Boccali
INFN Pisa
