Hello,

We have been experiencing an issue at Purdue where our redirector and
servers will stop responding to client requests to read data.  It looks
like the issue happens during the authentication process.  The xrootd
process stops logging anything during this time.  The only solution I
have found is to restart the xrootd process.  After that, things start
working normally again.

I attached output from strace, netstat, lsof and limits.  Strace shows a
bunch of reads/writes for what looks like lcmaps logging.  Netstat shows
a ton of connections in a CLOSE_WAIT state, but not to the point where
the process is going to run out of FDs.
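
For anyone who wants to repeat the CLOSE_WAIT check, here is a rough
Python sketch of how the count can be compared against the FD limit.
It assumes `netstat -tan` style output with the state in the sixth
column; the function names are made up for illustration, and the count
covers every socket in the input, not only those owned by xrootd.

```python
from collections import Counter


def count_tcp_states(netstat_text):
    """Count TCP connection states in `netstat -tan` style output.

    Assumes the usual column layout:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    states = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Data rows start with the protocol name (tcp, tcp6);
        # banner and header lines are skipped.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


def close_wait_fraction(netstat_text, fd_limit):
    """Return the CLOSE_WAIT socket count as a fraction of fd_limit."""
    return count_tcp_states(netstat_text)["CLOSE_WAIT"] / fd_limit
```

The text would come from the netstat capture and `fd_limit` from the
"Max open files" value in the limits output.  A fraction well below 1
matches what we see here: many CLOSE_WAIT sockets, but no FD exhaustion.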

I also attached two xrdcp attempts against an unresponsive server: one
redirected from our redirector and one copying directly from the
server.

Other processes on the same server like gridftp are still authenticating
with gums properly during this time.

Can you help?  We are seeing servers become unresponsive on a daily
basis.  Please let me know if you need more information.

There is also an OSG ticket on this issue:
https://ticket.grid.iu.edu/20867

Thanks,
-Erik

