Hi Pablo,
Interesting study!
Thanks for sharing these results.
It would also be nice to have more information about CPU load, network
I/O, memory consumption, etc. during the tests.
Performance in events/s alone is usually not a very revealing number,
and it does not directly tell which part of the system limited the
performance scaling.
I would like to comment on your conclusions a bit.
If you plan to build a one-node PROOF farm, then you should probably go
with centralized storage.
If you are looking at something that scales up, the decision is not as
straightforward.
The problem is to balance the data intake rate of your farm against the
bandwidth provided by a given storage solution.
A canonical ballpark number for a ROOT job's read rate is about 10 MB/s.
If you consider, say, a 10-node farm with 8 cores per node, each core
running an I/O-bound PROOF worker process, then your sustained input
bandwidth demand will be about 800 MB/s (10 nodes x 8 workers x 10 MB/s).
If you go with a centralized storage solution, you might have a problem:
such a load will definitely kill a typical 1 Gbit/s local switch and is
likely to strain your site's network infrastructure.
Now expand your imaginary PROOF farm to 100 nodes, and the demand grows
to roughly 8 GB/s.
The point is that the network quickly becomes the bottleneck.
If you have or can cheaply build network infrastructure that can sustain
such loads, then you are in good shape and can proceed with centralized
storage. Otherwise local distributed storage is your only solution.
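The back-of-envelope arithmetic above can be put in a small script. This is just a sketch: the 10 MB/s per-worker rate is the ballpark figure mentioned earlier, and the 125 MB/s figure is simply the theoretical payload of a 1 Gbit/s link.

```python
# Farm input-bandwidth estimate, using the ballpark figures from the text.

MB_PER_WORKER = 10      # canonical ROOT job read rate, MB/s (assumption)
CORES_PER_NODE = 8      # one I/O-bound PROOF worker per core
GBIT_LINK_MB_S = 125.0  # 1 Gbit/s ~ 125 MB/s, ignoring protocol overhead

def farm_demand_mb_s(nodes, cores_per_node=CORES_PER_NODE, rate=MB_PER_WORKER):
    """Sustained aggregate input bandwidth the farm demands, in MB/s."""
    return nodes * cores_per_node * rate

for nodes in (1, 10, 100):
    demand = farm_demand_mb_s(nodes)
    links = demand / GBIT_LINK_MB_S
    print(f"{nodes:4d} nodes -> {demand:6d} MB/s (~{links:.1f} Gbit links)")
```

For the 10-node case this reproduces the 800 MB/s quoted above, i.e. more than six fully saturated 1 Gbit/s links just for PROOF input.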
For your particular case: one disk can sustain approximately 4 PROOF
jobs, which usually means that your jobs have a read rate of about
5 MB/s each. So a second hard drive (RAIDed or not) should help feed
8 jobs, which seems to be the optimal number of jobs per node in your
configuration anyway.
An extra 500 GB disk costs about $100 these days.
Add one to each node, and you can expand your farm without much worry.
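The per-node disk arithmetic can be sketched the same way, assuming the figures above (4 jobs per disk, hence about 5 MB/s per job):

```python
# Local-disk capacity estimate per node (figures from the discussion above).

JOBS_PER_DISK = 4    # what one disk sustains in this setup (assumption)
JOB_READ_RATE = 5    # MB/s per PROOF job, inferred from 4 jobs per disk

def disks_needed(jobs_per_node, jobs_per_disk=JOBS_PER_DISK):
    """Minimum number of local disks needed to feed the given jobs."""
    return -(-jobs_per_node // jobs_per_disk)   # ceiling division

print(disks_needed(8))   # -> 2: a second disk feeds 8 jobs per node
```

So for the 8-jobs-per-node configuration discussed here, two local disks per node cover the sustained read load.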
You might find some information on PROOF scaling studies in the talks
given at the last PROOF workshop: http://root.cern.ch/root/PROOF2007/
Cheers,
Sergey
Pablo Fernandez wrote:
> Thanks!
>
> Unfortunately the xrootd protocol does not work as expected in dcache. The
> idea was to use a conventional SE to store all the data for Tier2 and also
> serve files to the Tier3... I don't know if Lustre implements an xrootd door
> as well, maybe in a few months I'll try that.
>
> BR/Pablo
>
> On Wednesday 20 February 2008 11:40, Fabrizio Furano wrote:
>> Hi Pablo,
>>
>> that's very interesting, and I agree completely with your conclusion,
>> i.e. that in most cases LAN data access is more efficient and scales
>> better than local disk access. Many times this is not well understood
>> by people, who always strive to keep files local at any cost.
>>
>> It would be very interesting to have a comparison between the
>> performance in proof between a dcache storage and an analogous xrootd
>> storage, which is the default solution for that. With the same pool of
>> workers of course.
>>
>> From what I've understood, dCache uses a read-ahead mechanism (on the
>> client side), while xrootd uses a mixed scheme with informed async
>> prefetching.
>>
>> Fabrizio
>>
>> Pablo Fernandez wrote:
>>> Hi all,
>>>
>>> I would like to share with you some information about my testings of
>>> performance in Proof with different storage schemas.
>>>
>>> http://root.cern.ch/phpBB2/viewtopic.php?t=6236
>>>
>>> I have moved this topic to the PROOF forum since it seems to me more
>>> PROOF-related than just xrootd-related; I hope you don't mind.
>>>
>>> BR/Pablo
>