To follow up on our meeting with Rene this afternoon, I have
investigated a few throughput scenarios.
All of this was done on my home PC (W98, P2 400 MHz, 128 MB memory),
with the data and code sitting on a SCSI-attached JAZ disk. According
to Norton Utilities, the disk benchmarks at 3.7 MB/s (compared to
4.9 MB/s for one of my internal IDE disks).
I ran two tests each on two datasets, both using the Small detector
design: 500 single 5 GeV pions, and 73 ZZ events at 500 GeV. The first
test just reads the dataset; the second adds up the energy in the EM
Cal and plots the x-y positions of the Tracker hits, *as a root
interactive macro*. One event loop for each, i.e. no nested loops. (A
sketch of the analysis macro follows below.)
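For reference, here is a minimal sketch of what the second test does.
The class and accessor names (Event, GetEMCalEnergy, GetTrackerHits,
TrackerHit) and the tree name "T" are hypothetical stand-ins for our
actual event model, which I haven't reproduced here:

// anal.C -- sketch only; Event, TrackerHit and the accessors are
// hypothetical stand-ins for the real classes.
#include <cstdio>
#include "TFile.h"
#include "TTree.h"
#include "TH2F.h"
#include "TCollection.h"

void anal(const char *fname = "pions.root")
{
   TFile f(fname);
   TTree *tree = (TTree*)f.Get("T");        // assumed tree name
   Event *event = 0;                        // hypothetical event class
   tree->SetBranchAddress("event", &event);

   TH2F *hxy = new TH2F("hxy", "Tracker hits;x;y",
                        100, -50., 50., 100, -50., 50.);
   Double_t esum = 0;

   Long64_t nent = tree->GetEntries();
   for (Long64_t i = 0; i < nent; i++) {    // one loop over events
      tree->GetEntry(i);
      esum += event->GetEMCalEnergy();      // hypothetical accessor
      TIter next(event->GetTrackerHits());  // hypothetical hit list
      while (TrackerHit *hit = (TrackerHit*)next())
         hxy->Fill(hit->GetX(), hit->GetY());
   }
   hxy->Draw();
   printf("total EM Cal energy: %g GeV\n", esum);
}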
Here are the results (in all cases CPU time = wall time; the MB/s
rates are the root input volume divided by the corresponding times):

dataset | ascii size | root input | root size | read MB/s | anal MB/s | tot read time | tot anal time
--------|------------|------------|-----------|-----------|-----------|---------------|--------------
pions   |     7.1 MB |     7.2 MB |    1.1 MB |       0.9 |       0.6 |           8 s |          12 s
ZZ      |    42.7 MB |    40.1 MB |    8.3 MB |       0.8 |       0.6 |          49 s |          69 s
We are getting factors of 5-6.5 in file size relative to ascii. Doing
the analysis adds about 50% to the read time. Reading is apparently not
limited by disk rates, since 0.8-0.9 MB/s is well below the 3.7 MB/s
the disk benchmarks at. You can see that just reading 20k ZZ events at
these rates would take 3.7 hours!
Rene said Root would max out at 3-4 MB/s, so at 0.5 MB/event, 20k ZZ
events is about 10 GB of data, and one is looking at roughly 45 minutes
to read the dataset (10,000 MB / 3.7 MB/s ~ 2700 s).
I guess we'd better address the speedup suggestions pronto (a sketch
of suggestion 3 follows after the list):
1. put compare functions into the TMap keys
2. index McPart (rather than use pointers)
3. write separate branches for each subsystem.
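To make suggestion 3 concrete, here is a rough sketch of splitting the
event into one top-level branch per subsystem. The container contents
(EMCalHit, TrackerHit, McPart) are hypothetical stand-ins for whatever
we actually store; the point is only the branch layout:

// write_split.C -- sketch only; the hit classes are hypothetical.
#include "TFile.h"
#include "TTree.h"
#include "TClonesArray.h"

void write_split()
{
   TFile f("events.root", "RECREATE");
   TTree tree("T", "events split by subsystem");

   TClonesArray *emcal   = new TClonesArray("EMCalHit");   // hypothetical
   TClonesArray *tracker = new TClonesArray("TrackerHit"); // hypothetical
   TClonesArray *mcpart  = new TClonesArray("McPart");

   // one top-level branch per subsystem
   tree.Branch("EMCal",   &emcal);
   tree.Branch("Tracker", &tracker);
   tree.Branch("McPart",  &mcpart);

   // ... fill the arrays for each event, then tree.Fill() ...

   tree.Write();
}

On the read side an analysis could then switch off what it doesn't
need, e.g. tree->SetBranchStatus("*", 0) followed by
tree->SetBranchStatus("Tracker*", 1), and pay the deserialization cost
only for the Tracker.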
Richard
--
Richard Dubois
SLD, Stanford Linear Accelerator Center
http://www.slac.stanford.edu/~richard/
650-926-3824
650-926-2923 (FAX)