Hi,

   I can't attend today, so here is a status report for GridKa:

>     o FZK - Gregory
>        o status of setting up the xrootd system
>        o GPFS servers?

* The GPFS fileservers are not yet ready. The current setup has 12 NAS
boxes, of which 6 are actually used to serve the collections that are
being imported from SLAC.


* I did some tests to compare the performance of xrd and nfs.

   I am running 10 to 70 simultaneous jobs (taking extra care that they
start at the same time), each of which runs "KanCopyUtil -r -n 25000" on
one of the 42 test collections (chosen randomly) that I have on disk on one
of the NAS boxes. I also took care to keep the running conditions as
constant as possible (for example, by staying on the same batch worker
throughout).

    For each test, I monitor the network I/O and the time the jobs take
with the ganglia monitoring system. The results are summarized in this
plot:

http://www.slac.stanford.edu/~schott/internal/gridka/comparison_xrd-nfs.jpg

    When running through nfs, the net in is saturated and, as a consequence,
the running time increases linearly with the number of jobs. When running
through xrd, the net in is higher (double that of nfs with 10 jobs) and
still increases as I run more jobs (it is 5 times higher with 70 jobs).
For a given number of jobs running in parallel, the running time is
therefore shorter with xrd than with nfs (the time ratio is 1/3 with 70
jobs); it also increases more slowly.
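
    To illustrate why a saturated link gives a running time that grows
linearly with the number of jobs, here is a minimal sketch. The data volume
per job and the link rate are made-up numbers chosen only for illustration,
not measurements from these tests, and the model ignores everything except
the shared network link.

```python
def runtime_seconds(n_jobs, bytes_per_job, aggregate_bytes_per_s):
    """Wall time for n_jobs that together move n_jobs * bytes_per_job
    over a link delivering aggregate_bytes_per_s in total."""
    return n_jobs * bytes_per_job / aggregate_bytes_per_s


# Hypothetical values, NOT taken from the measurements above.
BYTES_PER_JOB = 1.0e9   # assumed data read by one job
SATURATED_RATE = 1.0e7  # assumed ceiling of the saturated link

# With a fixed (saturated) aggregate rate, time scales with the job count.
for n in (10, 20, 40, 70):
    t = runtime_seconds(n, BYTES_PER_JOB, SATURATED_RATE)
    print(f"{n:3d} jobs -> {t:8.0f} s")
```

    If the aggregate rate keeps growing as jobs are added, as seen with
xrd, the same formula gives a running time that rises more slowly than
linearly.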


*  Pete suggested that I also run a test with specific packages checked
out, in order to:

> have a version of KanCopyUtil in which the TFile buffer cache
> has been turned on. This _should_ cause ROOT to switch from making lots
> of very small reads (4-5kB) to reading 512kB blocks. (Which may be a
> bit large, but the exact size can be tuned later. I'm just curious if
> this type of change has any effect on the performance you see.)

   The results are:

                 | time          net.in          CPU
----------------------------------------------------
NFS (old tags)  | 45 min        140 kB          10%
XRD (old tags)  | 15 min        640 kB          35%
----------------------------------------------------
NFS (new tags)  | 58 min        500 kB          10%
XRD (new tags)  |  6 min        600 kB          80%


   I added the system CPU because it varied a lot between these tests.
With the new tags, xrd got faster, with a higher CPU and a similar net.in,
while nfs got slower, with a higher net.in (similar to xrd) and the same
CPU.
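
   For convenience, the time ratios implied by the table can be worked out
with a short script. The numbers are copied from the table above; the
dictionary layout and the helper name are only my choice for this sketch.

```python
# Running time in minutes, copied from the results table.
time_min = {
    ("NFS", "old"): 45,
    ("XRD", "old"): 15,
    ("NFS", "new"): 58,
    ("XRD", "new"): 6,
}


def time_ratio(slower, faster):
    """How many times longer the 'slower' configuration ran."""
    return time_min[slower] / time_min[faster]


# xrd speed-up from switching to the new tags
print(round(time_ratio(("XRD", "old"), ("XRD", "new")), 2))
# nfs slow-down from switching to the new tags
print(round(time_ratio(("NFS", "new"), ("NFS", "old")), 2))
# nfs versus xrd, old tags and new tags
print(round(time_ratio(("NFS", "old"), ("XRD", "old")), 2))
print(round(time_ratio(("NFS", "new"), ("XRD", "new")), 2))
```

   So xrd ran 2.5 times faster with the new tags, nfs about 1.3 times
slower, and the nfs/xrd time ratio went from 3 to almost 10.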

   Any comments are welcome!

Cheers,
   Gregory