ciao, another piece of info which might be interesting:

I was looking into the bari eu redir, which uses xrootd

xrootd-4.1.1-1.el5

the cmsd.log has TONS of messages like

150826 05:18:00 30442 XrdInet: Accepted connection from [log in to unmask]
150826 05:18:00 30442 ?:[log in to unmask] XrdPoll: FD 90 attached to poller 0; num=23
150826 05:18:00 30442 Pup: buffer overrun unpacking short arg 0: ident.
150826 05:18:00 30442 Login: fw-nat-inside-outside.gridka.de login failed; invalid login data
150826 05:18:00 30442 ?:[log in to unmask] XrdPoll: FD 90 detached from poller 0; num=22

from many servers, most from FNAL 

150826 21:41:28 3396 Login: cmswn2148.fnal.gov login failed; invalid login data
150826 21:41:28 3436 Login: cmswn2146.fnal.gov login failed; invalid login data
150826 21:41:35 3461 Login: cmswn2131.fnal.gov login failed; invalid login data
150826 21:41:36 2475 Login: cmswn2158.fnal.gov login failed; invalid login data
150826 21:41:40 3461 Login: cmswn2150.fnal.gov login failed; invalid login data
150826 21:41:45 3458 Login: cmswn2160.fnal.gov login failed; invalid login data
150826 21:41:47 3396 Login: cmswn2131.fnal.gov login failed; invalid login data
150826 21:41:50 3461 Login: cmswn2140.fnal.gov login failed; invalid login data
150826 21:41:56 3458 Login: cmswn2147.fnal.gov login failed; invalid login data
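
To see which sites these failures come from, the log could be tallied per host and per domain. The following is only a minimal sketch, assuming the line layout shown in the excerpts above (date, time, thread id, then "Login: <host> login failed; invalid login data"); the function names and the SAMPLE list are made up for illustration and are not part of xrootd.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpts above:
# 150826 21:41:28 3396 Login: cmswn2148.fnal.gov login failed; invalid login data
FAILED_LOGIN = re.compile(
    r"^\d{6} \d{2}:\d{2}:\d{2} \d+ Login: (?P<host>\S+) "
    r"login failed; invalid login data"
)


def count_failed_logins(lines):
    """Return a Counter mapping host name -> number of failed logins."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.match(line.strip())
        if match:
            counts[match.group("host")] += 1
    return counts


def count_by_domain(host_counts, levels=2):
    """Aggregate per-host counts by the last `levels` labels of the host name."""
    domains = Counter()
    for host, n in host_counts.items():
        domains[".".join(host.split(".")[-levels:])] += n
    return domains


# A few lines copied from the excerpts above (hypothetical test input).
SAMPLE = [
    "150826 05:18:00 30442 Pup: buffer overrun unpacking short arg 0: ident.",
    "150826 05:18:00 30442 Login: fw-nat-inside-outside.gridka.de login failed; invalid login data",
    "150826 21:41:28 3396 Login: cmswn2148.fnal.gov login failed; invalid login data",
    "150826 21:41:35 3461 Login: cmswn2131.fnal.gov login failed; invalid login data",
    "150826 21:41:47 3396 Login: cmswn2131.fnal.gov login failed; invalid login data",
]

if __name__ == "__main__":
    per_host = count_failed_logins(SAMPLE)
    for domain, n in count_by_domain(per_host).most_common():
        print(domain, n)
```

To run it on a real log, pass an open file object for cmsd.log to count_failed_logins() instead of SAMPLE.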

apparently, we did not notice because 4.1.1-1 does not crash the way 4.2.2 does, but just carries on ...

tom

On Tue, Aug 25, 2015 at 9:07 PM, Marian Zvada <[log in to unmask]> wrote:
On 8/25/15 11:58 AM, Tommaso Boccali wrote:
Well, but: isn't the global redir only subscribed to by regional redirs (so
not many)?

you're right, I neglected this fact (outsmarted myself ;))...

Probably the EU redirs are the most connected, with close to 64 cmsd
entering... It's just normal we saw the problem there.

ok, this is alarming; we should revise the current setup and introduce more redirectors in the EU if needed. Btw, I recently talked with Andy about this - a much more promising way to handle the 64 limit seems to be supervisors:

http://xrootd.org/doc/dev42/cms_config.htm#_Toc405927050

I'm going to do this in the transitional federation, where there is one global redirector for all T3s plus those subscribers who will be kicked off the production federation and subscribed there instead.
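
As a rough illustration of why supervisors help with the 64 limit: assuming every manager or supervisor accepts at most 64 direct subscribers, each extra supervisor layer multiplies the number of reachable servers by 64. The sketch below only does that arithmetic; the names are made up for illustration, and it assumes a perfectly balanced tree.

```python
MAX_SUBSCRIBERS = 64  # per-node subscriber limit discussed in this thread


def max_servers(supervisor_levels):
    """Upper bound on servers below one redirector.

    supervisor_levels = 0 means servers subscribe directly to the
    redirector; each additional level inserts one layer of supervisors.
    Assumes a perfectly balanced tree with every node fully subscribed.
    """
    if supervisor_levels < 0:
        raise ValueError("supervisor_levels must be >= 0")
    return MAX_SUBSCRIBERS ** (supervisor_levels + 1)


if __name__ == "__main__":
    for levels in range(3):
        print(levels, max_servers(levels))
```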

-Marian

IFCA said it has 336-1, which is fairly common. I guess it cannot be due
to (just) the release....

Andy, did you understand the source of the bad login data? Is it worth
trying to debug it?

Tom

On 25 Aug 2015 06:21 PM, "Jan Iven" <[log in to unmask]> wrote:

    On 08/25/2015 05:56 PM, Marian Zvada wrote:

        Hi Tom,

    [..]

        yeah, that is my guess too, but then we have global redirectors at
        CERN running 4.2.2 dealing with a hell of a lot of cmsd
        subscriptions, so I'd expect some visible trouble there as well.
        So maybe we're lucky there too so far... (I believe that
        auto-restart of cmsd if it crashes is disabled there, Jan?)


    No, the CMS global redirectors are on CC7, and will auto-restart
    cmsd on "unclean" exit (Restart=on-abort).  I hope that SEGV counts
    as such...

    Not sure whether we'd even notice the occasional restart, unless
    another tool (abrt) picks this up.

    Cheers
    jan




--
Tommaso Boccali
INFN Pisa

