Hi Andy,
Here is some more information on the problem we are seeing at RAL
regarding the creation of worker threads.
The number of allowed processes on both the redirector and the server is
set to 29000, with the maximum allowed set to 30000:
/afs/slac.stanford.edu/u/br/olaiya/tmp/proc.txt
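
For reference, a minimal sketch of how the current and maximum limits
could be read from inside a process, assuming a Unix host and Python's
standard resource module. It is illustrative only and not part of
xrootd; the helper names read_limits and fmt are hypothetical, and the
choice of which limits to print is an assumption about what may be
relevant to thread creation.

```python
import resource

# Limits that can plausibly affect thread creation on Linux (an assumption,
# not something taken from the xrootd documentation).
LIMIT_NAMES = ["RLIMIT_NPROC", "RLIMIT_NOFILE", "RLIMIT_STACK"]


def read_limits(names=LIMIT_NAMES):
    """Return {name: (soft, hard)} for each limit this platform defines."""
    limits = {}
    for name in names:
        code = getattr(resource, name, None)
        if code is None:
            # Not every platform defines every limit; skip unknown ones.
            continue
        limits[name] = resource.getrlimit(code)
    return limits


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    for name, (soft, hard) in read_limits().items():
        print(f"{name}: current={fmt(soft)} max={fmt(hard)}")
```

The values reported are those of the process that runs the script, so
it would need to be started in the same environment as the xrootd
daemon to be comparable.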
When we start 448 jobs (2 on each batch machine) that access data via
xrootd, we start to see the following output in the server log:
041205 02:57:58 24692 XrdScheduler: Unable to create worker thread ;
resource temporarily unavailable
041205 02:58:02 24600 XrdScheduler: Unable to create worker thread ;
resource temporarily unavailable
(The full log is here:
/afs/slac.stanford.edu/u/br/olaiya/tmp/xrdlog.20041205)
This happens when the number of open files reaches ~340. At the same
time, when listing the open files, we see that some of the connections
are flagged with (CLOSE_WAIT).
The output of lsof on the server can be found here:
/afs/slac.stanford.edu/u/br/olaiya/tmp/server_lsof.txt
Is there some other setting I should tweak in order to allow the
creation of more worker threads?
Cheers,
Manny