Hi Brian,
So, this is the global redirector, yes?
Andy
-----Original Message-----
From: Brian Bockelman
Sent: Wednesday, July 06, 2011 3:19 PM
To: Andrew Hanushevsky
Cc: xrootd-dev
Subject: Re: Hitting thread limits?
On Jul 6, 2011, at 5:14 PM, Andrew Hanushevsky wrote:
> Hi Brian,
>
> Hmmm, are you specifying the xrd.sched maxt directive? If so, shame on you
> and immediately remove it!
>
No, actually.
> If not, is your OS limit set to 500? It shouldn't be; typically it should be
> at least 1K and usually 2K. Is the message coming from the xrootd or the
> cmsd? It makes a big difference. For the xrootd, the limit can be reached
> depending on how fast one can turn around a transaction. Internally, it's
> set to no less than 5 seconds to avoid rescheduling if the client has
> another request in the queue. For the redirector that may be longer than
> need be. If it's the cmsd then we need to look where the requests are
> coming from. This is just a local redirector, yes? Or is this the global
> one?
>
This is from the cmsd: it turns out that one T2 has completely broken-down
storage, and all requests were going to the redirector. Unfortunately, the
broken T2 is the only site in the US with heavy-ion data... meaning the
redirector searched pointlessly for files for the 500 clients, easily
hitting 2048 threads.
Not sure what we can do about this.
Brian