By the way, yesterday I upgraded the EU redir to 4.2.3. It seems to work fine,
although we have less than a day of statistics so far.

Tom
On 04/Sep/2015 01:33 AM, "Gerard Bernabeu" <[log in to unmask]> wrote:

> the fnal.gov address is from a WorkerNode (probably running a CMS job).
>
> Gerard
>
> On Thu, Sep 3, 2015 at 4:54 PM, Andrew Hanushevsky <[log in to unmask]>
> wrote:
>
>> Hi Tommaso,
>>
>> What are fw-nat-inside-outside.gridka.de and cmswn2148.fnal.gov? The
>> message clearly shows that whatever they sent over was incorrect. Yes,
>> 4.2.2 would crash in this case, sigh.
>>
>> Andy
>>
>> On Wed, 26 Aug 2015, Tommaso Boccali wrote:
>>
>> > ciao, another piece of info which might be interesting:
>> >
>> > I was looking into the bari eu redir, which uses xrootd
>> >
>> > xrootd-4.1.1-1.el5
>> >
>> > the cmsd.log has TONS of messages like
>> >
>> > 150826 05:18:00 30442 XrdInet: Accepted connection from [log in to unmask]
>> > 150826 05:18:00 30442 ?:[log in to unmask] XrdPoll: FD 90 attached to poller 0; num=23
>> > 150826 05:18:00 30442 Pup: buffer overrun unpacking short arg 0: ident.
>> > 150826 05:18:00 30442 Login: fw-nat-inside-outside.gridka.de login failed; invalid login data
>> > 150826 05:18:00 30442 ?:[log in to unmask] XrdPoll: FD 90 detached from poller 0; num=22
>> >
>> > from many servers, most from FNAL
>> >
>> > 150826 21:41:28 3396 Login: cmswn2148.fnal.gov login failed; invalid login data
>> > 150826 21:41:28 3436 Login: cmswn2146.fnal.gov login failed; invalid login data
>> > 150826 21:41:35 3461 Login: cmswn2131.fnal.gov login failed; invalid login data
>> > 150826 21:41:36 2475 Login: cmswn2158.fnal.gov login failed; invalid login data
>> > 150826 21:41:40 3461 Login: cmswn2150.fnal.gov login failed; invalid login data
>> > 150826 21:41:45 3458 Login: cmswn2160.fnal.gov login failed; invalid login data
>> > 150826 21:41:47 3396 Login: cmswn2131.fnal.gov login failed; invalid login data
>> > 150826 21:41:50 3461 Login: cmswn2140.fnal.gov login failed; invalid login data
>> > 150826 21:41:56 3458 Login: cmswn2147.fnal.gov login failed; invalid login data
>> >
>> > apparently, we did not notice because 4.1.1-1, unlike 4.2.2, does not crash
>> > but carries on ...
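
A quick way to see which hosts produce most of the failed logins quoted above is to tally them per host. This is only a minimal sketch in Python. It assumes that each message occupies a single line in the real cmsd.log (the wrapping in the mail comes from the mail client) and that the line layout is exactly as quoted. The name count_failed_logins is made up for illustration and is not part of xrootd.

```python
import re
from collections import Counter

# Matches lines laid out like the ones quoted in this thread, e.g.:
# 150826 21:41:28 3396 Login: cmswn2148.fnal.gov login failed; invalid login data
LOGIN_FAILED = re.compile(
    r"^\d{6} \d{2}:\d{2}:\d{2} \d+ Login: (?P<host>\S+) "
    r"login failed; invalid login data"
)


def count_failed_logins(lines):
    """Return a Counter mapping host name -> number of failed logins."""
    counts = Counter()
    for line in lines:
        match = LOGIN_FAILED.match(line.strip())
        if match:
            counts[match.group("host")] += 1
    return counts


# Small demo on a few of the lines quoted above (unwrapped).
sample = [
    "150826 21:41:28 3396 Login: cmswn2148.fnal.gov login failed; invalid login data",
    "150826 21:41:35 3461 Login: cmswn2131.fnal.gov login failed; invalid login data",
    "150826 21:41:47 3396 Login: cmswn2131.fnal.gov login failed; invalid login data",
    "150826 05:18:00 30442 Pup: buffer overrun unpacking short arg 0: ident.",
]

for host, n in count_failed_logins(sample).most_common():
    print(f"{n:6d}  {host}")
```

To run it on a real log, pass an open file object instead of the sample list, since the function accepts any iterable of lines.
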
>> >
>> > tom
>> >
>> > On Tue, Aug 25, 2015 at 9:07 PM, Marian Zvada <[log in to unmask]> wrote:
>> >
>> >> On 8/25/15 11:58 AM, Tommaso Boccali wrote:
>> >>
>> >>> Well, but: isn't the global redir subscribed to only by regional redirs
>> >>> (so not many)?
>> >>>
>> >>
>> >> you're right, I neglected this fact (outsmarted myself ;))...
>> >>
>> >>> Probably the EU redirs are the most connected, with close to 64 cmsd
>> >>> entering... It's only natural that we saw the problem there.
>> >>>
>> >>
>> >> ok, this is alarming and we should revise the current setup and introduce
>> >> more redirectors in the EU if needed. Btw, I recently talked with Andy
>> >> about this: a much more promising way to handle the limit of 64 looks to
>> >> be supervisors:
>> >>
>> >> http://xrootd.org/doc/dev42/cms_config.htm#_Toc405927050
>> >>
>> >> I'm going to do this in the transitional federation, which has one global
>> >> redirector for all T3s plus those subscribers that will be kicked off the
>> >> production federation and subscribed there instead.
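
To get a feeling for how far supervisors stretch the limit of 64 mentioned above, here is a back-of-the-envelope sketch in Python. It assumes that every redirector or supervisor accepts at most 64 direct subscribers and that the tree is perfectly balanced with every slot usable, which a real federation will not achieve. The names FANOUT, capacity and levels_needed are made up for illustration.

```python
FANOUT = 64  # assumed per-node subscriber limit, as discussed in this thread


def capacity(supervisor_levels):
    """Upper bound on servers below one redirector.

    supervisor_levels is the number of supervisor layers between the
    redirector and the servers (0 means servers subscribe directly).
    """
    return FANOUT ** (supervisor_levels + 1)


def levels_needed(n_servers):
    """Smallest number of supervisor layers whose capacity covers n_servers."""
    levels = 0
    while capacity(levels) < n_servers:
        levels += 1
    return levels


for n in (60, 64, 65, 500):
    print(f"{n:4d} servers -> {levels_needed(n)} supervisor level(s)")
```

Under these assumptions a single layer of supervisors already raises the bound from 64 to 4096, so one layer would be enough for a redirector that is just hitting the limit.
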
>> >>
>> >> -Marian
>> >>
>> >>> IFCA said it has 3.3.6-1, which is fairly common. I guess it cannot be
>> >>> due to (just) the release....
>> >>>
>> >>> Andy, did you understand the source of the bad login data? Is it worth
>> >>> trying to debug it?
>> >>>
>> >>> Tom
>> >>>
>> >>> On 25/Aug/2015 06:21 PM, "Jan Iven" <[log in to unmask]> wrote:
>> >>>
>> >>>     On 08/25/2015 05:56 PM, Marian Zvada wrote:
>> >>>
>> >>>         Hi Tom,
>> >>>
>> >>>     [..]
>> >>>
>> >>>         yeah, that is my guess too, but then we have global redirectors
>> >>>         at CERN running 4.2.2 dealing with a hell of a lot of cmsd
>> >>>         subscriptions, so I'd expect some visible trouble there as well.
>> >>>         So maybe we're lucky there too so far... (I believe that
>> >>>         autorestart of cmsd if it crashes is disabled there, Jan?)
>> >>>
>> >>>
>> >>>     No, the CMS global redirectors are on CC7, and will auto-restart
>> >>>     cmsd on "unclean" exit (Restart=on-abort).  I hope that SEGV
>> >>>     counts as such...
>> >>>
>> >>>     Not sure whether we'd even notice the occasional restart, unless
>> >>>     another tool (abrt) picks this up.
>> >>>
>> >>>     Cheers
>> >>>     jan
>> >>>
>> >>>
>> >
>> >
>> > --
>> > Tommaso Boccali
>> > INFN Pisa
>> >
>>
>> ########################################################################
>> Use REPLY-ALL to reply to list
>>
>> To unsubscribe from the XROOTD-L list, click the following link:
>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>>
>
>
>
> --
> *Gerard Bernabeu Altayó*
> Deputy Department Head
>
> Distributed Computing Services Operations
> Fermi National Accelerator Laboratory
> 630 840 6509 office
> www.fnal.gov
>
