OK, so if it happens again, please get the log file messages ahead of the
issue and maybe a gcore. I assume once the messages started they never
stopped, yes?
Andy
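
A minimal sketch of one way to pull out the log messages ahead of the issue, assuming the flood begins with the first "XrdPoll: Unable to exclude link" line seen in the excerpt quoted below. The function names, the default of 200 context lines, and the log path are illustrative assumptions, not part of XRootD.

```python
from collections import deque
from typing import Iterable, List

# Marker taken from the log excerpt quoted in this thread.
FLOOD_MARKER = "XrdPoll: Unable to exclude link"


def lines_before_first_match(lines: Iterable[str], marker: str = FLOOD_MARKER,
                             context: int = 200) -> List[str]:
    """Return up to `context` lines preceding the first line that contains
    `marker`, followed by that line. Return [] if the marker never appears."""
    recent = deque(maxlen=context)  # rolling window of the most recent lines
    for line in lines:
        if marker in line:
            return list(recent) + [line]
        recent.append(line)
    return []


def print_context(path: str, context: int = 200) -> None:
    """Print the lead-up to the flood from the log file at `path`
    (hypothetical path; pass whatever the redirector actually logs to)."""
    # The file is streamed line by line, so a log that filled the disk
    # is never loaded into memory all at once.
    with open(path, errors="replace") as fh:
        for line in lines_before_first_match(fh, context=context):
            print(line, end="")
```

Calling print_context() on the redirector log would print the last 200 lines before the flood started, plus the first flood line itself.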
On Thu, 16 Jan 2014, Matevz Tadel wrote:
> On 01/16/14 12:02, Andrew Hanushevsky wrote:
>> Yes, 1024 for a proxy server is too few. Make it 2048. We start over-riding the
>> soft limit in 3.3.6 as we run into this problem way too often.
>
> That's not a proxy server ... it's the redirector for UCSD, not loaded. Will
> still increase it :)
>
> There was a peak of 520 connections yesterday early afternoon but that wasn't
> it, it went down to below 50 over a period of 4 hours. It happened after
> midnight today.
>
> Matevz
>
>> Andy
>>
>> On Thu, 16 Jan 2014, Matevz Tadel wrote:
>>
>>> Hi Brian,
>>>
>>> On 01/16/14 11:18, Brian Bockelman wrote:
>>>> Hi Matevz,
>>>>
>>>> I've seen this happen before, but never tracked it down. Two thoughts:
>>>>
>>>> 1) Did the process run out of file descriptors?
>>>> 2) Did you hit the thread limit?
>>>
>>> Don't really think so ... if anything, it would be a thread limit. I have
>>> (from /proc/xxx/limits on current redirector, I stopped the old one before
>>> massaging the log):
>>> Max processes     1024     514545    processes
>>> Max open files    65536    65536     files
>>>
>>> Matevz
>>>
>>>> Many daemons (in general, don't know about Xrootd) tend to hit weird error
>>>> conditions when this happens.
>>>>
>>>> Brian
>>>>
>>>> On Jan 16, 2014, at 1:15 PM, Matevz Tadel <[log in to unmask]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I got the disk filled up during the night on the UCSD redirector with stuff
>>>>> like this:
>>>>> ...
>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link nagios.14319:[log in to unmask]; bad file descriptor
>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for nagios.14319:[log in to unmask]
>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link nagios.14319:[log in to unmask]; bad file descriptor
>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for nagios.14319:[log in to unmask]
>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link nagios.14319:[log in to unmask]; bad file descriptor
>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for nagios.14319:[log in to unmask]
>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link nagios.14319:[log in to unmask]; bad file descriptor
>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for nagios.14319:[log in to unmask]
>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link nagios.14319:[log in to unmask]; bad file descriptor
>>>>> 140116 04:08:34 109203 XrdPoll: Sever event occured for nagios.14319:[log in to unmask]
>>>>> 140116 04:08:34 109203 XrdPoll: Unable to exclude link nagios.14319:[log in to unmask]; bad file descriptor
>>>>> ...
>>>>>
>>>>> Any ideas? :) This was with 3.3.3.
>>>>>
>>>>> Cheers,
>>>>> Matevz
>>>>>
>>>>> ########################################################################
>>>>> Use REPLY-ALL to reply to list
>>>>>
>>>>> To unsubscribe from the XROOTD-DEV list, click the following link:
>>>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1
>>>>
>>>
>>>
>
>
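
For reference, the two numeric columns in the /proc/xxx/limits excerpt quoted above are the soft and hard limits, so "Max processes" sits at a soft limit of 1024 under a hard limit of 514545. Below is a minimal sketch of raising a soft limit toward a wanted value without exceeding the hard limit, using Python's standard resource module. It is not XRootD's own code for over-riding the soft limit; the helper names are assumptions made for illustration.

```python
import resource


def target_soft(soft: int, hard: int, wanted: int,
                infinity: int = resource.RLIM_INFINITY) -> int:
    """Pick the new soft limit: never lower it, never exceed the hard limit."""
    if soft == infinity or soft >= wanted:
        return soft            # already unlimited or high enough
    if hard == infinity:
        return wanted          # no ceiling to respect
    return min(wanted, hard)   # an unprivileged process cannot pass the hard limit


def raise_soft_limit(which: int, wanted: int) -> tuple:
    """Raise the soft limit `which` (e.g. resource.RLIMIT_NPROC) for the
    calling process and return the resulting (soft, hard) pair."""
    soft, hard = resource.getrlimit(which)
    new_soft = target_soft(soft, hard, wanted)
    if new_soft != soft:
        resource.setrlimit(which, (new_soft, hard))
    return resource.getrlimit(which)
```

With the values from the excerpt, target_soft(1024, 514545, 2048) picks 2048. The change applies only to the process that calls raise_soft_limit() and to children it starts afterwards; an already running daemon is not affected.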