Hi,
Slide 19 in the presentation addresses open/redirection performance.
The numbers should be in microseconds, not milliseconds. Some of the
numbers are wrong, perhaps typos. The slide is also vague about which
combination of client and server platforms was used. I've attached a
file that summarizes the results of my testing; it looks like the
numbers came from these tests.
--Bill Weeks, SLAC, (650) 926-2909
>Date: Sun, 17 Jul 2005 20:18:29 +0200
>From: Peter Elmer <[log in to unmask]>
>To: Andreas Petzold <[log in to unmask]>
>Cc: [log in to unmask], [log in to unmask]
>Subject: Re: xrootd seminar talk at GSI (Darmstadt)
>
> Hi Andreas,
>
>On Sun, Jul 17, 2005 at 07:50:10PM +0200, Andreas Petzold wrote:
>> thanks for the comments.
>>
>> One more question (I've already emailed Andy, but he may be on the
>> road): on the slide that shows the numbers of the timing comparison
>> direct disk access, xrootd, xrootd w/ redirection I'm not sure about
>> the units. In Andy's talk, he uses [us] but does it mean 10^-6s which
>> would be really fast, or does it mean 10^-3s?
>
> I had understood it to be 10^-6s, but Andy or Bill should confirm this.
>
> Those numbers do have a tendency to confuse people, however. Andy or Bill
>Weeks can explain better what they were trying to do, but I understood it to
>be (roughly) an attempt to isolate and measure all pieces of the process of
>redirecting and opening files in order to ensure that each piece within
>the xrootd daemon itself was scalable and performant. This means (again, I
>think) that they were probably measuring this with memory mapped files, which
>removes the latencies for finding/opening something on the disk system itself
>from their measurement. This _does_ demonstrate that the overhead from the
>xrootd daemon itself is minimal, but doesn't of course give one the true
>number you would see opening a file from disk. A couple of people reacted
>as if a fast one was being pulled on them. ;-) The numbers should be more
>clearly labeled to indicate this. In practice the disk part will presumably
>vary depending on what hardware one has and how many things are hitting it.
>All of this is of course somewhat independent from the data server itself.
>
> [Recall also that they were doing this in the context of measurements
>for the "Big Memory Machine", so in that case there is no disk subsystem
>part, but clearly for most people that has to be added on top.]
>
> Andy and Bill, is this correct?
>
> Pete
>
>
>> Peter Elmer wrote:
>> > Hi Andreas,
>> >
>> >On Sun, Jul 17, 2005 at 01:23:11PM +0200, Andreas Petzold wrote:
>> >
>> >>I'm going to give a talk on xrootd at Darmstadt next week. I've put my slides
>> >>on the web:
>> >>
>> >>http://iktp.tu-dresden.de/~petzold/work/xrootd.pdf
>> >>http://iktp.tu-dresden.de/~petzold/work/xrootd.sxi
>> >>
>> >>If you've any comments or suggestions please let me know.
>> >
>> >
>> > It looks very nice. I see only a couple of small things (see below).
>> > When you have the final version let me know and I can add it to the
>> >presentations
>> >section of the xrootd webpage.
>> >
>> > thanks,
>> > Pete
>> >
>> > slide 9 - the protocol should be "root" instead of "xrootd" (the clients
>> > even enforce this, although strictly speaking they probably
>> > don't need to do that)
>> >
>> > slide 23 - The BaBar Padova Tier-A also uses xrootd, plus there is some
>> > set of university sites using it (e.g. for serving background
>> > events for MC production)
>> >
>> > Objectiviy -> Objectivity (technically this is the AMS, of
>> > course)
>> >
>> >                 One of the other things that the system has been able to
>> >                 achieve is that the basic system (even a load-balanced
>> >                 system) is fairly easy to set up.
>> >
>> > slide 24 - I just realized Derek Feichtinger wasn't on the list. He has
>> > been contributing code, so I've added him... I'd call the
>> > list "core collaborators".
>> >
>> > Also, Priceton -> Princeton ;-)
>
>
>
>-------------------------------------------------------------------------
>Peter Elmer E-mail: [log in to unmask] Phone: +41 (22) 767-4644
>Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
>-------------------------------------------------------------------------
Test program - xrdopen
This is a C program that uses the xroot API to communicate with an xrootd
or olbd server. The inner loop of this program consists of open(), close(),
open(), close(), read(). The first open incurs the cost of connecting
to the server, but the connection is retained for the second open. The
read() is a one-byte read. Before and after each open() and read(), the
gettimeofday() system function is used to measure the latency of the
open() and read() functions. The gettimeofday() function appears to have
a granularity of 1 microsecond. The program returns the latency for the
first open(), second open(), and read() in microseconds.
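For reference, here is a minimal sketch of the inner loop. Plain POSIX
open()/close()/read() against a local path stand in for the xroot API
calls, and the file path and the placement of the close() around the
one-byte read are my assumptions, not necessarily the original program's:

    /* Sketch of the xrdopen inner loop, per the description above. */
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/time.h>

    /* Microseconds elapsed between two gettimeofday() samples. */
    static long usec(struct timeval *a, struct timeval *b)
    {
        return (b->tv_sec - a->tv_sec) * 1000000L
             + (b->tv_usec - a->tv_usec);
    }

    int main(int argc, char **argv)
    {
        const char *path = (argc > 1) ? argv[1] : "/tmp/testfile";
        struct timeval t0, t1;
        char buf[1];

        gettimeofday(&t0, NULL);     /* 1st open: pays the connection cost */
        int fd = open(path, O_RDONLY);
        gettimeofday(&t1, NULL);
        if (fd < 0) { perror("open"); return 1; }
        long open1 = usec(&t0, &t1);
        close(fd);

        gettimeofday(&t0, NULL);     /* 2nd open: connection is retained */
        fd = open(path, O_RDONLY);
        gettimeofday(&t1, NULL);
        long open2 = usec(&t0, &t1);

        gettimeofday(&t0, NULL);     /* one-byte read */
        ssize_t n = read(fd, buf, 1);
        gettimeofday(&t1, NULL);
        close(fd);
        if (n != 1) { perror("read"); return 1; }

        /* Latencies for 1st open, 2nd open, read, in microseconds. */
        printf("%ld %ld %ld\n", open1, open2, usec(&t0, &t1));
        return 0;
    }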
Outer script - runxrdopen
The outer script invokes the test program 10,000 times, sorts the values
for the first open(), second open(), and read(), and then calculates the
average and median values. To avoid unwanted skewing of the average values,
the highest 1% of the sorted values are discarded.
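The sorting/trimming step might look like the following. The original
runxrdopen is a script, so this C version, which reads one latency value
per line from stdin, is only an illustration; computing the median on the
trimmed set rather than the full set is also an assumption on my part:

    /* Sketch of the runxrdopen statistics step: sort the samples,
     * drop the highest 1%, then report the average and median. */
    #include <stdio.h>
    #include <stdlib.h>

    static int cmp(const void *a, const void *b)
    {
        long x = *(const long *)a, y = *(const long *)b;
        return (x > y) - (x < y);
    }

    int main(void)
    {
        long *v = NULL, x;
        size_t n = 0, cap = 0;

        while (scanf("%ld", &x) == 1) {   /* one sample per line */
            if (n == cap) {
                cap = cap ? cap * 2 : 1024;
                long *nv = realloc(v, cap * sizeof *v);
                if (!nv) { free(v); return 1; }
                v = nv;
            }
            v[n++] = x;
        }
        if (n == 0) return 1;

        qsort(v, n, sizeof *v, cmp);
        size_t keep = n - n / 100;        /* discard highest 1% */

        double sum = 0;
        for (size_t i = 0; i < keep; i++)
            sum += v[i];

        printf("avg %.0f  median %ld\n", sum / keep, v[keep / 2]);
        free(v);
        return 0;
    }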
Test results
The test program was run against the local filesystem on the xrootd
server to measure the operating system cost of opening a file. Then
all combinations of Solaris and Linux clients and xrootd servers were
tested. Finally, a Solaris olbd was tested with both clients.
All numbers in the following table are in microseconds.
                                   1st Open         2nd Open         Read
                                  Avg  Median     Avg  Median     Avg  Median
Linux local filesystem             13      13       3       3       4       5
Solaris local filesystem           43      42       4       5      18      17
Solaris client -> Linux xrootd   4312    4241     884     862     122     122
Solaris client -> Solaris xrootd 4402    4315     917     895     133     133
Linux client -> Linux xrootd     7568    7544    5378    5329     108     108
Linux client -> Solaris xrootd   7685    7638    5429    5376     118     118
Solaris client -> Solaris olbd   5570    5424    1235    1206     133     132
Linux client -> Solaris olbd    11011   10916    8046    7969     117     117