Hi all,
it seems that the parallel open is now quite stable: I am unable to
get crashes with 20-30 opens. A test with 533 opens is in progress, but
I see no reason to expect trouble.
Here I describe the TFile/TXNetFile interface as it stands, in order
to collect comments/fixes/agreement and later submit a stable set of
patches to ROOT, since some fixes to TFile are involved.
There are two ways to issue parallel opens:
a) explicit async open requests via TFile
b) set XNet.ForceParallelOpen, so that every regular TFile::Open going
through TXNetFile will try to be parallel. NOTE: given the structure of
TFile, this works only under a particular assumption (see case b below).
Any comments?
Fabrizio
-----
Case a), here is the interface (from an idea by Fons/Gerri):
with:
tfh = TFile::AsyncOpenRequest(myurl);
we start a request for a parallel open. The returned value is not a
TFile but an internal handle structure. We can request LOTS of parallel
opens at light speed, provided the requested URLs are xrootd:// URLs, of
course. Otherwise the files are simply opened through TFile, as usual:
no parallel rfio or rootd or whatever.
When we need the TFile, because the app has to read/write something,
we obtain it this way:
TFile *f = TFile::Open(tfh);
That's all.
-----
Case b)
Question: since XrdClient is parallel by default, why can't
TFile-->TXNetFile simply be parallel too?
Answer: because A LOT of ROOT functions (e.g. Map()) assume that a
TFile instance IS fully initialized immediately after TFile::Open(). If
the open is still proceeding in parallel, this is not true.
Workaround: set XNet.ForceParallelOpen to 1, and check that the file
is open before doing anything with it. The openness check will WAIT for
the open to complete, so we can postpone it, effectively keeping
multiple files opening in the background.
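For reference, the setting could look like this (the resource name comes
from this mail; putting it in a .rootrc resource file is my assumption
about the mechanism):

# .rootrc fragment (assumed way of setting the resource named above)
XNet.ForceParallelOpen: 1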
Example:
TFile *f = TFile::Open("root://a.b.c.d//xyz.root");
... big and important computation ...
... but the file open is proceeding in parallel ...
... still we don't touch f ...
f->IsOpen();
f->Map();
Question: what happens if I invoke some ROOT function without calling
IsOpen() first?
Answer: anything can happen. The best you can get is a crash. Or
nothing.
---------------------------------------------------------------
And here is the simple ROOT script I used for the tests against
the Kanga cluster at SLAC:
{
   // We want to open all the files listed in this text file.
   // The text file contains a list of xrootd URLs.
   ifstream ff("/afs/slac.stanford.edu/u/br/furano/someurls.txt");

   // The user needing to open many files will hold all the TFile pointers
   // somewhere. We use this silly array. This is a test proggy!
   TFile *tfv[2000];

   // Handles for the parallel open feature. Strategy #1.
   TFile::TFileHandler_t *tfhv[2000];

   // Various initializations
   TFile::TFileHandler_t *tfh;
   string s;
   int i;
   for (i = 0; i < 2000; i++) {
      tfhv[i] = 0;
      tfv[i] = 0;
   }

   i = 0;
   // Reading with >> in the loop condition avoids processing the
   // last URL twice when eof() is hit.
   while (ff >> s) {
      cout << s << endl << endl;

      // This requests the file open. It will be carried out
      // in the background.
      tfh = TFile::AsyncOpenRequest(s.c_str());

      // The handle for the file open has to be kept, to get
      // the TFile pointer when needed.
      if (tfh) tfhv[i] = tfh;
      i++;
   }

   // All the requests have been submitted. The only thing we are sure of
   // is that they are in progress.
   // By calling TFile::Open with a handle, we actually WAIT for THAT file
   // open request to be FINISHED. All the others continue in the background.
   //
   // This is a stupid but massive test proggy, so we get ALL the TFiles
   // sequentially.
   TFile *f;
   for (i = 0; i < 2000; i++) {
      if (tfhv[i]) {
         f = TFile::Open(tfhv[i]);
         tfv[i] = f;
      }
   }

   // And we close everything
   for (i = 0; i < 2000; i++) {
      f = tfv[i];
      cout << endl << endl << endl << "Destroying instance " << i << " f="
           << f << endl;
      if (f && !f->IsZombie()) {
         f->Close();
         delete f;
      }
   }

   cout << endl << endl << endl << "End of the script!!!" << endl;
}