I am running the XRootD 3.0.5 release from the yum repo and am seeing 
rather severe performance issues.  I have a system of 12 active batch 
machines with a total of just over 200 batch slots.  When the batch 
system is full of jobs -- and this happens with a number of different 
types of jobs -- I see very poor performance from the xrootd server.

I see load averages of 140 to 200 or more, a high percentage of CPU 
time in I/O wait (70-80%), and a large number of interrupts (10-15K), 
all reported by dstat.
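
In case it helps, here is roughly how I have been cross-checking the 
dstat numbers -- a quick sketch that samples /proc directly (the 5 s 
interval is just what I happen to use):

    #!/usr/bin/env python
    # Sanity check on the dstat numbers: sample /proc twice and report
    # the 1-minute load average, iowait fraction, and interrupt rate.
    import time

    def read_counters():
        iowait = total = intr = 0
        for line in open('/proc/stat'):
            fields = line.split()
            if fields[0] == 'cpu':          # aggregate CPU counters
                vals = [int(v) for v in fields[1:]]
                total = sum(vals)
                iowait = vals[4]            # 5th counter is iowait
            elif fields[0] == 'intr':
                intr = int(fields[1])       # interrupts since boot
        return iowait, total, intr

    io1, tot1, int1 = read_counters()
    time.sleep(5)                           # 5 s sample interval
    io2, tot2, int2 = read_counters()

    load1 = open('/proc/loadavg').read().split()[0]
    print("load average (1 min): " + load1)
    print("iowait: %.1f%%" % (100.0 * (io2 - io1) / (tot2 - tot1)))
    print("interrupts/s: %.0f" % ((int2 - int1) / 5.0))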

The server is a Dell R710 with MD1000 disk arrays attached via SAS.  It 
has a 10G network interface, and I have measured over 800 MB/s of 
aggregate network throughput using xrdcp running simultaneously on 10 
machines.
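
For what it's worth, the throughput test on each client was along these 
lines -- a rough sketch, with the server name and file path below 
standing in for our actual ones:

    #!/usr/bin/env python
    # Rough per-client timing of one xrdcp transfer; run at the same
    # time on each of the 10 client machines.  Host, port, path, and
    # file size are placeholders for our real server and test file.
    import subprocess, time

    url = "root://xrootd.example.edu:1094//store/test/bigfile.root"
    size_bytes = 2.0 * 1024 ** 3            # assumed ~2 GB test file

    start = time.time()
    subprocess.check_call(["xrdcp", "-f", url, "/dev/null"])
    elapsed = time.time() - start

    print("%.0f MB/s" % (size_bytes / elapsed / 1e6))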

There are two namespaces, managed by two xrootd servers.  When the main 
namespace gets clogged up by batch jobs, the second namespace still 
performs well.

Is it expected that an xrootd process would not be able to handle 200 
simultaneous data flows?  If not, how should I go about debugging this 
system?

Thanks.

	Paul T. Keener
	Department of Physics and Astronomy
	University of Pennsylvania