Hi Fons,

yes, in principle I agree, and I believe that the 'shortcut' feature in
PROOF is a cool idea which enhances the overall processing rate.

However, to make this possible without having to track file locations
(in a slow DB? And what about spreading replicas? First you have to make
them, and count this pre-processing in the benchmarks :-D) you chose to
make the xrootd cluster and the worker nodes coexist, to exploit the
speed at which the xrootd storage part locates files. That is a cool
idea too, and I don't know whether that will be possible or easy to
accomplish with other systems.

The only thing is that I am skeptical that the PROOF scheduler will
*in every case* be able to keep the most efficient matching between
worker nodes and their local storage. I suppose that, with many
concurrent users and many different files, the gain will be reduced
because the scheduler cannot always find the optimal match. I suppose
that the efficiency would decrease asymptotically to a middle point
between the performance with and without the shortcut. But that is not
a loss, it's an advantage over the 'without' case, i.e. with the storage
system completely detached from the PROOF cluster.

So I don't think you can do without the local storage or the shortcut
feature, unless you have a very powerful network/storage behind. You
would end up putting more horsepower into the storage/network relative
to the worker pool.

The advantage that storage systems typically give, however, is greater
flexibility. With xrootd you share the nodes and get the best of both
worlds; with other storage systems I really do not know.

Fabrizio

Fons Rademakers wrote:
> Hi Fabrizio,
>
> assume a small rack of 20 1U/2U dual-quad-cores + 8 disks each. Such a
> rack can process: 20 * 8 * 15 MB/s = 2.4 GB/s (15 MB/s ROOT compressed
> file reading speed, I/O bound query). Now such a rack would need a
> switch with a dual 10Gb uplink to get just 2 GB/s in over the network.
> Now add another couple of such racks. You would need a disk pool + a
> lot of 10Gb eth equipment per rack. Do you still think it scales better
> than having disks close to the CPUs?
>
> Cheers, Fons.
>
>
> Fabrizio Furano wrote:
>> Hi Pablo,
>>
>> that's very interesting, and I agree completely with your conclusion,
>> i.e. in most cases LAN data access is more efficient and scales
>> better than local disk access. Many times this is not very well
>> understood by people, who always strive to keep files local at any
>> cost.
>>
>> It would be very interesting to have a comparison of the performance
>> in PROOF between a dCache storage and an analogous xrootd storage,
>> which is the default solution for that. With the same pool of
>> workers of course.
>>
>> From what I've understood, dCache uses a read-ahead mechanism (at the
>> client side), while xrootd uses a scheme which is mixed with informed
>> async prefetching.
>>
>> Fabrizio
>>
>> Pablo Fernandez wrote:
>>> Hi all,
>>>
>>> I would like to share with you some information about my tests of
>>> performance in PROOF with different storage schemas.
>>> http://root.cern.ch/phpBB2/viewtopic.php?t=6236
>>>
>>> I have moved this topic to the Proof Forum since it seems to me
>>> more PROOF-related than just xrootd, I hope you don't mind.
>>>
>>> BR/Pablo
>>
>
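
For anyone who wants to recheck the rack arithmetic in Fons's message
above, here is a rough Python sketch. The node, disk and per-disk
figures are the ones quoted in the thread. The function names, the
nominal 10 Gb/s line rate and the decimal unit conversion are
assumptions made for illustration, and protocol overhead is ignored,
so real links would deliver somewhat less.

```python
import math


def rack_read_rate_mb_s(nodes, disks_per_node, mb_s_per_disk):
    """Aggregate read rate (MB/s) a rack consumes when every disk is busy."""
    return nodes * disks_per_node * mb_s_per_disk


def uplinks_needed(demand_mb_s, link_gbit_s=10.0):
    """Links required to carry demand_mb_s at nominal line rate.

    Assumption: decimal units, so 10 Gb/s = 1250 MB/s, with no
    allowance for protocol overhead.
    """
    link_mb_s = link_gbit_s * 1000 / 8
    return math.ceil(demand_mb_s / link_mb_s)


# Figures from the thread: 20 nodes, 8 disks each, 15 MB/s per disk.
rate = rack_read_rate_mb_s(nodes=20, disks_per_node=8, mb_s_per_disk=15)
print(rate / 1000, "GB/s per rack")
print(uplinks_needed(rate), "x 10 Gb/s uplinks for one rack")
print(uplinks_needed(3 * rate), "x 10 Gb/s uplinks for three racks")
```

With the thread's figures this gives 2.4 GB/s and two nominal 10 Gb/s
links for one rack, in line with the dual uplink Fons mentions, and six
links once three such racks read everything over the network.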