OK, so if it happens again please get the log file messages ahead of the issue
and maybe a gcore. I assume once the messages started they never stopped, yes?

Andy

On Thu, 16 Jan 2014, Matevz Tadel wrote:

> On 01/16/14 12:02, Andrew Hanushevsky wrote:
>> Yes, 1024 for a proxy server is too few. Make it 2048. We start over-riding the
>> soft limit in 3.3.6 as we run into this problem way too often.
>
> That's not a proxy server ... it's the redirector for UCSD, not loaded. Will
> still increase it :)
>
> There was a peak of 520 connections yesterday early afternoon but that wasn't
> it, it went down to below 50 over period of 4 hours. It happened after
> midnight today.
>
> Matevz
>
>> Andy
>>
>> On Thu, 16 Jan 2014, Matevz Tadel wrote:
>>
>>> Hi Brian,
>>>
>>> On 01/16/14 11:18, Brian Bockelman wrote:
>>>> Hi Matevz,
>>>>
>>>> I've seen this happen before, but never tracked it down. Two thoughts:
>>>>
>>>> 1) Did the process run out of file descriptors?
>>>> 2) Did you hit the thread limit?
>>>
>>> Don't really think so ... if anything, it would be a thread limit. I have
>>> (from /proc/xxx/limits on current redirector, I stopped the old one before
>>> massaging the log):
>>> Max processes             1024                 514545               processes
>>> Max open files            65536                65536                files
>>>
>>> Matevz
>>>
>>>> Many daemons (in general, don't know about Xrootd) tend to hit weird error
>>>> conditions when this happens.
>>>>
>>>> Brian
>>>>
>>>> On Jan 16, 2014, at 1:15 PM, Matevz Tadel <[log in to unmask]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I got the disk filled up during the night on the UCSD redirector with stuff
>>>>> like this:
>>>>> ...
>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link nagios.14319:[log in to unmask]; bad file descriptor
>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for nagios.14319:[log in to unmask]
>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link nagios.14319:[log in to unmask]; bad file descriptor
>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for nagios.14319:[log in to unmask]
>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link nagios.14319:[log in to unmask]; bad file descriptor
>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for nagios.14319:[log in to unmask]
>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link nagios.14319:[log in to unmask]; bad file descriptor
>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for nagios.14319:[log in to unmask]
>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link nagios.14319:[log in to unmask]; bad file descriptor
>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for nagios.14319:[log in to unmask]
>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link nagios.14319:[log in to unmask]; bad file descriptor
>>>>> ...
>>>>>
>>>>> Any ideas? :) This was with 3.3.3.
>>>>>
>>>>> Cheers,
>>>>> Matevz
>>>>>
>>>>> ########################################################################
>>>>> Use REPLY-ALL to reply to list
>>>>>
>>>>> To unsubscribe from the XROOTD-DEV list, click the following link:
>>>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1