Thank you for the response, Marian!

Cheers,
   Lukasz

On Wed, Apr 6, 2016, at 18:56, Marian Zvada wrote:
> Hi Lukasz,
> 
> sorry I missed this email... We left it under the radar; my suspicion is
> also that we had tons of client requests (more than the usual load) coming
> at the same time... By "left it under the radar" I mean that our admin will
> watch for this particular behavior, and when it occurs again I'll try to do
> some quick parsing through the xrootd logs to see who is doing what at that
> moment.
> 
> Since then I haven't heard of it occurring again. We did have other issues
> in the last few days, cmsd dying under heavy load etc., but that's something
> for others to open and discuss. Let's see what we find out, though.
> 
> Thanks,
> Marian
> 
> On 3/25/16 5:15 PM, Lukasz Janyst wrote:
> > Hi Marian,
> >
> > has this issue been resolved? If so, I would very much like to know how.
> >
> > Cheers,
> >     Lukasz
> >
> > On Thu, Mar 17, 2016, at 11:06, Lukasz Janyst wrote:
> >> Hi Marian,
> >>
> >> the only other not-completely-improbable explanation is that a massive
> >> number of clients try to connect within a time slice of one RTT.
> >>
> >> Cheers,
> >>     Lukasz
> >>
> >> On Wed, Mar 16, 2016, at 20:58, Marian Zvada wrote:
> >>> Hi Lukasz,
> >>>
> >>> thanks for the feedback. Yep, it looks more like a system-wide scalability
> >>> issue, which might or might not be connected to a bug in xrootd.
> >>> Still, xrootd is the service being hammered by something here, which
> >>> needs attention, too.
> >>>
> >>> We'll closely watch SYNs on the UNL host and try to debug live when this
> >>> occurs again.
> >>>
> >>> Thanks,
> >>> Marian
> >>>
> >>> On 3/16/16 5:34 AM, Lukasz Janyst wrote:
> >>>> One way to debug this would be to run wireshark to see where the bogus
> >>>> SYN packets are coming from.
> >>>>
> >>>>      Lukasz
> >>>>
> >>>> On Wed, Mar 16, 2016, at 11:27, Lukasz Janyst wrote:
> >>>>> Isn't it a sign of either a DOS attack or a network problem? I would
> >>>>> guess that a restart of the service helps because, by closing the
> >>>>> listening socket, you close the corresponding kernel SYN queue.
> >>>>>
> >>>>>      Lukasz
> >>>>>
> >>>>> On Wed, Mar 16, 2016, at 00:39, Marian Zvada wrote:
> >>>>>> Hi Folks,
> >>>>>>
> >>>>>> we're seeing these two types of kernel messages, which are obviously
> >>>>>> connected to the xrootd process on the US regional redirectors running
> >>>>>> on port 1094:
> >>>>>>
> >>>>>> ---
> >>>>>> kernel: TCPv6: Possible SYN flooding on port 1094. Sending cookies.
> >>>>>> kernel: possible SYN flooding on port 1094. Sending cookies.
> >>>>>> ---
> >>>>>>
> >>>>>> This is happening intermittently on both US regional redirectors,
> >>>>>> cmsxrootd1.fnal.gov and xrootd.unl.edu. Both are behind the DNS-aliased
> >>>>>> host cmsxrootd.fnal.gov. We're pretty confident that these messages
> >>>>>> typically show up in syslog when the redirector is giving very long
> >>>>>> waits for access to files through xrootd.
> >>>>>>
> >>>>>> A simple restart of the service brings response time back to normal. We
> >>>>>> also didn't notice any significant increase in memory or CPU use on the
> >>>>>> machines themselves, so we're wondering if anyone from the list or the
> >>>>>> developers can explain whether this is something to worry about. It is
> >>>>>> also hard to catch, so if you have any idea what to watch and record
> >>>>>> next time (besides a core file), that will help. Luckily, we at least
> >>>>>> know that when the xrootd-fallback SAM test goes into warning state,
> >>>>>> this 'flooding' is likely happening again...
> >>>>>>
> >>>>>> The FNAL and UNL regional redirectors run xrootd-4.3.0-0.rc3.el6.x86_64,
> >>>>>> and apart from the slowness and the odd kernel records in the system
> >>>>>> logs, there is nothing obvious in the xrootd and cmsd logs to report. Do
> >>>>>> you know which specific xrootd process chain might trigger these kernel
> >>>>>> errors?
> >>>>>>
> >>>>>> Any feedback is very welcome!
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Marian
> >>>>>>
> >>>>>> ########################################################################
> >>>>>> Use REPLY-ALL to reply to list
> >>>>>>
> >>>>>> To unsubscribe from the XROOTD-L list, click the following link:
> >>>>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
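
On the question of what to watch and record next time: one low-effort option is to count the kernel warnings themselves, so they can be lined up against the SAM test state. The following is only a minimal sketch, not something taken from the thread. It assumes the syslog lines contain the two message forms quoted above; the function name and the sample list are made up for illustration.

```python
import re
from collections import Counter

# Matches both message variants quoted in this thread: the plain (IPv4) form
# and the "TCPv6:" form. Case-insensitive because the two differ in
# capitalisation ("possible" vs "Possible").
SYN_FLOOD_RE = re.compile(
    r"kernel: (?P<v6>TCPv6: )?possible SYN flooding on port (?P<port>\d+)",
    re.IGNORECASE,
)


def count_syn_flood_warnings(lines):
    """Return a Counter keyed by (address family, port)."""
    counts = Counter()
    for line in lines:
        match = SYN_FLOOD_RE.search(line)
        if match:
            family = "ipv6" if match.group("v6") else "ipv4"
            counts[(family, int(match.group("port")))] += 1
    return counts


# Sample input: the two messages from the original report plus one
# unrelated line (hypothetical).
sample = [
    "kernel: TCPv6: Possible SYN flooding on port 1094. Sending cookies.",
    "kernel: possible SYN flooding on port 1094. Sending cookies.",
    "kernel: some unrelated message",
]
result = count_syn_flood_warnings(sample)
print(result)
```

In practice the lines would come from the system log file rather than a list; the regular expression uses search(), so a timestamp and hostname prefix in front of "kernel:" does not matter.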
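
Regarding the kernel SYN queue mentioned in the thread: on Linux, the TcpExt counters in /proc/net/netstat include fields related to SYN cookies and listen-queue overflows. Sampling them periodically may help tell whether the accept queue is overflowing when the warnings appear. The sketch below is an assumption-laden illustration: the counter names are the ones I would expect on a Linux kernel and should be checked on the actual hosts, and the sample values are invented.

```python
def parse_tcpext(text):
    """Parse the TcpExt header/value line pair of /proc/net/netstat into a dict."""
    rows = [line.split() for line in text.splitlines() if line.startswith("TcpExt:")]
    if len(rows) < 2:
        return {}
    # First TcpExt line holds the counter names, second holds the values.
    names, values = rows[0][1:], rows[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def syn_counters(text):
    """Pick out the counters related to SYN cookies and listen-queue overflow."""
    wanted = (
        "SyncookiesSent",
        "SyncookiesRecv",
        "SyncookiesFailed",
        "ListenOverflows",
        "ListenDrops",
    )
    stats = parse_tcpext(text)
    return {name: stats[name] for name in wanted if name in stats}


# Invented sample in the /proc/net/netstat layout (real files have many
# more columns and additional sections).
SAMPLE = (
    "TcpExt: SyncookiesSent SyncookiesRecv SyncookiesFailed ListenOverflows ListenDrops\n"
    "TcpExt: 120 80 5 40 45\n"
)
counters = syn_counters(SAMPLE)
print(counters)

# On a Linux host one would read the real file instead:
#     with open("/proc/net/netstat") as f:
#         print(syn_counters(f.read()))
```

The counters are cumulative since boot, so what matters is how much they grow between two samples, not their absolute values.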