On 01/16/14 13:07, Andrew Hanushevsky wrote:
> OK, so if it happens again please get the log file messages ahead of the issue
> and maybe a gcore. I assume once the messages started they never stopped, yes?

OK, deal. Yes, the messages never stopped, well ... until the disk was full :)

Matevz
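
For grabbing the messages that precede the flood next time, here is a minimal sketch (plain Python, not part of xrootd; the function name and the default of 200 lines are only illustrative assumptions). It keeps a rolling window of recent lines and stops at the first line containing the "Unable to exclude link" message quoted below, so it never has to hold a disk-filling log in memory.

```python
from collections import deque
from typing import Iterable, List


def context_before_first_match(lines: Iterable[str], needle: str,
                               context: int = 200) -> List[str]:
    """Return up to `context` lines preceding the first line that
    contains `needle`, followed by that line itself.

    Returns an empty list if `needle` never appears.
    """
    recent = deque(maxlen=context)  # rolling window of the latest lines
    for line in lines:
        if needle in line:
            return list(recent) + [line]
        recent.append(line)
    return []
```

It can be fed an open file object directly (files iterate line by line), with "XrdPoll: Unable to exclude link" as the needle; the log path depends on the local setup.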

> Andy
>
> On Thu, 16 Jan 2014, Matevz Tadel wrote:
>
>> On 01/16/14 12:02, Andrew Hanushevsky wrote:
>>> Yes, 1024 for a proxy server is too few. Make it 2048. We start over-riding the
>>> soft limit in 3.3.6 as we run into this problem way too often.
>>
>> That's not a proxy server ... it's the redirector for UCSD, not loaded. Will
>> still increase it :)
>>
>> There was a peak of 520 connections yesterday early afternoon, but that wasn't
>> it; it went down to below 50 over a period of 4 hours. The problem happened
>> after midnight today.
>>
>> Matevz
>>
>>> Andy
>>>
>>> On Thu, 16 Jan 2014, Matevz Tadel wrote:
>>>
>>>> Hi Brian,
>>>>
>>>> On 01/16/14 11:18, Brian Bockelman wrote:
>>>>> Hi Matevz,
>>>>>
>>>>> I've seen this happen before, but never tracked it down.  Two thoughts:
>>>>>
>>>>> 1) Did the process run out of file descriptors?
>>>>> 2) Did you hit the thread limit?
>>>>
>>>> Don't really think so ... if anything, it would be a thread limit. I have
>>>> (from /proc/xxx/limits on current redirector, I stopped the old one before
>>>> massaging the log):
>>>> Max processes             1024                 514545               processes
>>>> Max open files            65536                65536                files
>>>>
>>>> Matevz
>>>>
>>>>> Many daemons (in general, don't know about Xrootd) tend to hit weird error
>>>>> conditions when this happens.
>>>>>
>>>>> Brian
>>>>>
>>>>> On Jan 16, 2014, at 1:15 PM, Matevz Tadel <[log in to unmask]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I got the disk filled up during the night on the UCSD redirector with stuff
>>>>>> like this:
>>>>>> ...
>>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link
>>>>>> nagios.14319:[log in to unmask]; bad file descriptor
>>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for
>>>>>> nagios.14319:[log in to unmask]
>>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link
>>>>>> nagios.14319:[log in to unmask]; bad file descriptor
>>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for
>>>>>> nagios.14319:[log in to unmask]
>>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link
>>>>>> nagios.14319:[log in to unmask]; bad file descriptor
>>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for
>>>>>> nagios.14319:[log in to unmask]
>>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link
>>>>>> nagios.14319:[log in to unmask]; bad file descriptor
>>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for
>>>>>> nagios.14319:[log in to unmask]
>>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link
>>>>>> nagios.14319:[log in to unmask]; bad file descriptor
>>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for
>>>>>> nagios.14319:[log in to unmask]
>>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link
>>>>>> nagios.14319:[log in to unmask]; bad file descriptor
>>>>>> ...
>>>>>>
>>>>>> Any ideas? :) This was with 3.3.3.
>>>>>>
>>>>>> Cheers,
>>>>>> Matevz
>>>>>>
>>>>>> ########################################################################
>>>>>> Use REPLY-ALL to reply to list
>>>>>>
>>>>>> To unsubscribe from the XROOTD-DEV list, click the following link:
>>>>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1
>>>>>
>>>>
>>
>

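
Regarding the limits quoted in the thread from /proc/xxx/limits: a rough sketch for checking the soft and hard values of a running process programmatically. This is only an illustration under assumptions (Linux only; the function names are made up here and are not an xrootd tool). Note that on Linux "Max processes" counts threads as well, which is why a soft limit of 1024 can be hit by a threaded daemon.

```python
import re
from typing import Dict, Optional, Tuple

# One row of /proc/<pid>/limits: a multi-word name starting with "Max",
# then the soft limit, the hard limit and an optional units column.
_ROW = re.compile(
    r"^(?P<name>Max .+?)\s{2,}(?P<soft>\S+)\s+(?P<hard>\S+)(?:\s+\S+)?\s*$"
)


def _to_int(value: str) -> Optional[int]:
    """Convert a limit column to int; 'unlimited' becomes None."""
    return None if value == "unlimited" else int(value)


def parse_limits(text: str) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Map each limit name to a (soft, hard) pair."""
    limits = {}
    for line in text.splitlines():
        match = _ROW.match(line)
        if match:  # the header line does not start with "Max" and is skipped
            limits[match["name"]] = (_to_int(match["soft"]),
                                     _to_int(match["hard"]))
    return limits


def read_limits(pid: int) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Read and parse /proc/<pid>/limits (Linux only)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_limits(handle.read())
```

For example, read_limits(pid)["Max processes"] gives the (soft, hard) pair for the redirector's pid, which can then be compared against the 2048 suggested above.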