Hi Marian,

has this issue been resolved? If so, I would very much like to know how.

Cheers,
   Lukasz

On Thu, Mar 17, 2016, at 11:06, Lukasz Janyst wrote:
> Hi Marian,
> 
> the only other not-completely-improbable explanation is that a massive
> number of clients try to connect within a time slice of one RTT.
> 
> Cheers,
>    Lukasz
> 
> On Wed, Mar 16, 2016, at 20:58, Marian Zvada wrote:
> > Hi Lukasz,
> > 
> > thanks for the feedback. Yes, it looks more like a system-wide
> > scalability issue, which may or may not be connected to a bug in
> > xrootd. Still, xrootd is the service being hammered by something
> > here, and that needs attention too.
> > 
> > We'll closely watch the SYNs on the UNL host and try to debug live
> > when this occurs again.
> > 
> > Thanks,
> > Marian
> > 
> > On 3/16/16 5:34 AM, Lukasz Janyst wrote:
> > > One way to debug this would be to run wireshark to see where the bogus
> > > SYN packets are coming from.
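
A rough sketch of the same idea using tcpdump text output instead of the wireshark GUI (a swap made here for scripting convenience, not something proposed in the thread). It assumes lines captured with something like `tcpdump -n 'tcp dst port 1094'`; the function name and the sample lines (documentation addresses only) are made up for illustration.

```python
import re
from collections import Counter

# Pure SYN segments show up as "Flags [S]" in tcpdump's text output;
# SYN-ACKs are "Flags [S.]" and are deliberately not matched.
SYN_LINE_RE = re.compile(r"\bIP6? (?P<src>\S+) > (?P<dst>\S+): Flags \[S\]")

def count_syn_sources(lines, port=1094):
    """Count SYNs per source address for the given destination port."""
    counts = Counter()
    suffix = "." + str(port)
    for line in lines:
        m = SYN_LINE_RE.search(line)
        if m and m.group("dst").endswith(suffix):
            # tcpdump prints "address.port"; drop the trailing port.
            host, _, _sport = m.group("src").rpartition(".")
            counts[host] += 1
    return counts

# Hypothetical sample lines, not taken from any real capture.
sample = [
    "12:00:00.000001 IP 192.0.2.10.40001 > 198.51.100.5.1094: Flags [S], seq 1, win 29200, length 0",
    "12:00:00.000002 IP 192.0.2.10.40002 > 198.51.100.5.1094: Flags [S], seq 2, win 29200, length 0",
    "12:00:00.000003 IP6 2001:db8::1.40003 > 2001:db8::2.1094: Flags [S], seq 3, win 28800, length 0",
    "12:00:00.000004 IP 198.51.100.5.1094 > 192.0.2.10.40001: Flags [S.], seq 9, ack 2, win 28960, length 0",
]

for host, n in count_syn_sources(sample).most_common():
    print(host, n)
```

A handful of sources dominating the tally would point at misbehaving clients; a flat distribution over many sources would fit the burst-of-legitimate-clients explanation better.
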
> > >
> > >     Lukasz
> > >
> > > On Wed, Mar 16, 2016, at 11:27, Lukasz Janyst wrote:
> > >> Isn't it a sign of either a DoS attack or a network problem? I would
> > >> guess that a restart of the service helps because, by closing the
> > >> listening socket, you close the corresponding kernel SYN queue.
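
One way to check whether the listener's queues are really under pressure is to compare two snapshots of the kernel's TcpExt counters from /proc/net/netstat. The sketch below is only an illustration: the helper names and the sample numbers are made up, the counters are system-wide rather than per port, and which counters exist depends on the kernel version.

```python
import os

# Counters related to SYN cookies and listen-queue overflow.
# Availability varies by kernel; missing names are simply skipped.
WATCHED = (
    "SyncookiesSent",
    "SyncookiesRecv",
    "SyncookiesFailed",
    "ListenOverflows",
    "ListenDrops",
)

def parse_tcpext(text):
    """Parse /proc/net/netstat-style text into a dict of TcpExt counters."""
    rows = [l.split() for l in text.splitlines() if l.startswith("TcpExt:")]
    if len(rows) < 2:
        return {}
    names, values = rows[0][1:], rows[1][1:]
    return {n: int(v) for n, v in zip(names, values)}

def delta(before, after, names=WATCHED):
    """Return how much each watched counter grew between two snapshots."""
    return {n: after[n] - before.get(n, 0) for n in names if n in after}

# Hypothetical snapshots, trimmed to the watched columns.
HEADER = "TcpExt: SyncookiesSent SyncookiesRecv SyncookiesFailed ListenOverflows ListenDrops\n"
SAMPLE_BEFORE = HEADER + "TcpExt: 100 60 2 10 10\nIpExt: InNoRoutes\nIpExt: 0\n"
SAMPLE_AFTER = HEADER + "TcpExt: 250 90 2 55 60\nIpExt: InNoRoutes\nIpExt: 0\n"

print(delta(parse_tcpext(SAMPLE_BEFORE), parse_tcpext(SAMPLE_AFTER)))

# On a Linux host, the live counters can be read the same way.
if os.path.exists("/proc/net/netstat"):
    with open("/proc/net/netstat") as fh:
        live = parse_tcpext(fh.read())
    print({n: live[n] for n in WATCHED if n in live})
```

Taking one snapshot while the redirector is slow and another after the restart would show whether the overflow and cookie counters stop growing once the socket is reopened.
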
> > >>
> > >>     Lukasz
> > >>
> > >> On Wed, Mar 16, 2016, at 00:39, Marian Zvada wrote:
> > >>> Hi Folks,
> > >>>
> > >>> we're seeing these two types of kernel messages, which are evidently
> > >>> connected to the xrootd process on the US regional redirectors,
> > >>> listening on port 1094:
> > >>>
> > >>> ---
> > >>> kernel: TCPv6: Possible SYN flooding on port 1094. Sending cookies.
> > >>> kernel: possible SYN flooding on port 1094. Sending cookies.
> > >>> ---
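
Since the events are intermittent, it may help to count these warnings in syslog over time. A minimal sketch, assuming the message text is exactly as quoted above; the function name is illustrative, and the "untagged" label for the second variant is just a placeholder chosen here.

```python
import re
from collections import Counter

# Matches both message variants quoted above: one with a "TCPv6:" tag,
# one without any protocol tag and with a lower-case "possible".
SYN_FLOOD_RE = re.compile(
    r"kernel: (?:(?P<proto>TCPv6): )?[Pp]ossible SYN flooding on port (?P<port>\d+)\."
)

def count_syn_flood_warnings(lines):
    """Return a Counter keyed by (protocol tag, port)."""
    counts = Counter()
    for line in lines:
        m = SYN_FLOOD_RE.search(line)
        if m:
            proto = m.group("proto") or "untagged"
            counts[(proto, int(m.group("port")))] += 1
    return counts

sample = [
    "kernel: TCPv6: Possible SYN flooding on port 1094. Sending cookies.",
    "kernel: possible SYN flooding on port 1094. Sending cookies.",
    "kernel: some unrelated message",  # hypothetical filler line
]

for key, n in sorted(count_syn_flood_warnings(sample).items()):
    print(key, n)
```

Feeding it the lines of the system log (for example from `open("/var/log/messages")`, if that is where the host logs kernel messages) gives a quick tally per port.
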
> > >>>
> > >>> This is happening intermittently on both US regional redirectors,
> > >>> cmsxrootd1.fnal.gov and xrootd.unl.edu. Both sit behind the DNS alias
> > >>> cmsxrootd.fnal.gov. We're fairly confident that these messages
> > >>> typically appear in syslog when the redirector is giving very long
> > >>> waits for access to files through xrootd.
> > >>>
> > >>> A simple restart of the service brings response times back to normal.
> > >>> We also didn't notice any significant increase in memory or CPU use on
> > >>> the machines themselves, so we're wondering whether anyone on the list
> > >>> or among the developers can explain if this is something to worry
> > >>> about. It is also hard to catch, so if you have any idea what to watch
> > >>> and record next time (besides a core file), that will help. Luckily,
> > >>> we at least know that when the xrootd-fallback SAM test goes into a
> > >>> warning state, this 'flooding' is likely happening again...
> > >>>
> > >>> The FNAL and UNL regional redirectors run xrootd-4.3.0-0.rc3.el6.x86_64,
> > >>> and apart from the slowness and the odd kernel records in the system
> > >>> logs there is nothing obvious in the xrootd and cmsd logs to report.
> > >>> Do you know which specific xrootd process chain might trigger these
> > >>> kernel errors?
> > >>>
> > >>> Any feedback is very welcome!
> > >>>
> > >>> Thanks,
> > >>> Marian
> > >>>
> > >>> ########################################################################
> > >>> Use REPLY-ALL to reply to list
> > >>>
> > >>> To unsubscribe from the XROOTD-L list, click the following link:
> > >>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
