	Hi Daniele,

do you have an example of a log file for these jobs? I do not know
exactly which servers these disks have been installed on, but we
noticed in E158, where most of the data were sitting on one
(relatively slow) server, that jobs were limited by I/O throughput to
about 2 MB/sec. This limit comes from the random access pattern that
split ROOT trees produce. If your job is sufficiently fast, you can
saturate the I/O limit quite quickly -- with 2-3 jobs. If you submit
too many jobs (tens or even hundreds), the server will thrash to the
point that the clients will receive NFS timeouts. ROOT usually does
not like that -- you may see error messages in the log file about
files not found (when the files are actually on disk), or about
problems uncompressing branches. These are usually more severe on
Linux clients, where the NFS client implementation is not very robust.

There are several ways to cope with this problem:

1) Submit fewer jobs at one time. I would not submit more than 10
   I/O-limited jobs in parallel. 
2) Place your data on different servers. Spreading it across
   different sulky servers is best. Even if you stay on the same
   sulky server but split your data onto different partitions, you
   still get some benefit from parallelizing disk access.
3) Rewrite your jobs to first copy the data onto a local disk on the
   batch worker (for instance, /tmp), then run on the local copy, and
   finally delete the local copy. The benefit is that the cp command
   accesses the file in direct-access mode (with 10-20 MB/sec
   throughput, depending on the network interface); see the first
   sketch after this list.
4) Make your ntuples non-split (very highly recommended). This
   usually increases the throughput by a factor of 10-20. If your
   typical job reads most of the branches of the tree, making the
   tree split makes no sense. Non-split trees provide direct access
   to disk, which is much more efficient; see the second sketch after
   this list.
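
For point 3, a minimal sketch of the stage-run-cleanup pattern as a
ROOT macro. The NFS path, the local /tmp path, and the tree name
"ntp1" are placeholders, not the actual layout of AWG18:

  // stage_and_run.C -- illustrative only; adapt paths and names to your job.
  #include "TFile.h"
  #include "TTree.h"
  #include "TSystem.h"
  #include "TString.h"

  void stage_and_run()
  {
     const char *remote = "/nfs/sulky/awg18/myntuple.root"; // placeholder NFS path
     const char *local  = "/tmp/myntuple.root";             // local disk on the worker

     // Sequential copy over NFS -- this is the fast, direct-access read.
     gSystem->Exec(Form("cp %s %s", remote, local));

     // Run the analysis on the local copy.
     TFile *f = TFile::Open(local);
     TTree *t = (TTree *) f->Get("ntp1");   // placeholder tree name
     // ... event loop over t ...
     f->Close();

     // Clean up so /tmp on the batch worker does not fill up.
     gSystem->Unlink(local);
  }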
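
For point 4, a sketch of what writing a non-split ntuple looks like:
the last argument of TTree::Branch is the split level, and setting it
to 0 stores the whole object in a single branch. "MyEvent" and the
file and tree names are made up for the example:

  // write_nonsplit.C -- illustrative only.
  #include "TFile.h"
  #include "TTree.h"
  #include "MyEvent.h"   // placeholder event class with a ROOT dictionary

  void write_nonsplit()
  {
     TFile f("myntuple.root", "RECREATE");    // placeholder file name
     TTree tree("ntp1", "non-split ntuple");  // placeholder tree name

     MyEvent *event = new MyEvent();
     // splitlevel = 0: the object is streamed into one branch, so reading
     // it back means a few large sequential reads instead of one small
     // random read per sub-branch.
     tree.Branch("event", "MyEvent", &event, 64000, 0);

     // ... fill loop: set up *event, then tree.Fill(); ...

     tree.Write();
     f.Close();
  }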

							Yury


On Thu, Oct 31, 2002 at 09:26:08AM -0800, Daniele del Re wrote:
> 
> Hi all,
> 
>  in the last two days I tried to run on data and MC on the new disk AWG18.
> No way. I had problems with about 80% of the jobs. Some crashed, and most of
> them did not read a large number of ROOT files (which are actually there).
> 
>  This problem seems to be worse than ever. Do we have to contact
> computing people about this?
> 
>  Daniele
> 
>