Hi Niall,

> Yes. The writes in particular can be very high rate and bursty (and the 
> sequence of the messages is very important, so it's critical that they are 
> appended in the right order). Currently up to 125,000 messages are written 
> per second, with another doubling in the next 12 months not entirely 
> out of the question. Each message is pretty small (fits in an ethernet 
> frame..).
What are you assuming your backing store will be? I ask because 125,000 
ops/sec is not at all easy to support with many of today's storage devices. 
Additionally, the interference you're describing from other streams (see 
below) poses additional burdens. You may also need more than one server 
here. The highest single-server op rate we've ever measured was about 
90,000/sec (a two-CPU Sun v20z) using a DRAM storage model. I suspect that, 
given the right hardware, we would really have no problem with 125,000 
ops/sec.
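
As a rough sanity check on raw bandwidth (assuming each message fills most 
of a standard ~1500-byte ethernet frame; that size is my guess, not your 
figure):

    # Back-of-the-envelope write bandwidth for the stated message rate.
    msgs_per_sec = 125_000   # stated write rate
    msg_bytes    = 1_500     # assumed: one standard ethernet frame
    mb_per_sec   = msgs_per_sec * msg_bytes / 1e6
    print(mb_per_sec)        # ~187 MB/s sustained; ~375 MB/s after a doubling

So the raw bandwidth is modest; it's the per-operation overhead at that 
rate that dominates.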

> The reads are a mix of 'realtime', which are reasonably small as well, say 
> a few hundred kB, and 'historical', which may stream through a TB or more. 
> One critical aspect which your applications do not appear to have is that 
> the time between a message arriving and being available to data readers 
> must be as small as possible. Preferably milliseconds.
Latency is not the problem here. So far our tests have shown that xrootd 
will easily turn around a request, measured as a full round trip, in 
60 microseconds (that includes all 10Gb network and client overhead). Our 
measurements show that the theoretical lower limit is probably on the order 
of 20 microseconds, assuming a 2.8GHz Intel processor and InfiniBand. We 
could probably do better with some tinkering. Also remember that 
single-stream latency does not necessarily correlate with ops/sec (a bit of 
arithmetic below makes this concrete). We have yet to figure out what that 
correlation is for xrootd, but we do know that low latency is a 
precondition for a high transaction rate. Our focus on latency comes from 
the PetaCache project, where we are trying to support thousands of parallel 
random read requests.

What is a problem is the "other" non-high-transaction-rate work that would 
be going on at the same time. We've found that mixing such workloads on a 
single server or device hurts one or the other, and many times both. I 
don't think you can expect a high transaction rate if the data holder also 
has to support "bulk" requests.
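
To make the latency-versus-rate point concrete, here is a quick sketch 
using Little's law, reusing the 60-microsecond round trip measured above 
(the concurrency figure is illustrative, not a measurement):

    # Little's law: ops/sec = requests in flight / round-trip time.
    rtt_sec = 60e-6                # measured full round trip
    one_stream_cap = 1 / rtt_sec   # ~16,600 ops/sec from a single
                                   # strictly synchronous client
    target = 125_000               # the stated write rate
    in_flight = target * rtt_sec   # ~7.5 requests must be kept in flight
    print(one_stream_cap, in_flight)

In other words, low latency keeps the required parallelism small, but the 
client side still has to keep several requests in flight to reach the 
target rate.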

Andy