Hi Jan,

  I see; imho this means that there is very little overhead you can 
overlap, at least on the client side. Or that you are opening all those 
files against very few servers, or even the same one. I hope not.

  Anyway, the async open was not meant as a way to speed up the open 
primitive itself, but as a way to do other things while the open is in 
progress, or to stage many files in parallel without serializing the 
waits. In your situation, though, it seems that there are not so many 
waits to parallelize.

Fabrizio


Jan Iwaszkiewicz wrote:
> Hi!
> 
> I have done some tests as Fabrizio advised.
> The results of tests with asynchronous open are similar to those with 
> standard open:
> 
> I used the following code:
> 
>    TTime starttime = gSystem->Now();
>    TList *toOpenList = new TList();
>    toOpenList->SetOwner(kFALSE);
>    TIter nextElem(fDset->GetListOfElements());
>    // Submit all the opens without waiting for any of them.
>    while (TDSetElement *elem = dynamic_cast<TDSetElement*>(nextElem())) {
>       TFile::AsyncOpen(elem->GetFileName());
>       toOpenList->Add(elem);
>    }
> 
>    // Poll the open status; run the lookup on each file as soon as its
>    // open has completed (successfully or not), then drop it from the list.
>    TFile::EAsyncOpenStatus aos;
>    TIter nextToOpen(toOpenList);
>    while (toOpenList->GetSize() > 0) {
>       while (TDSetElement *elem =
>                 dynamic_cast<TDSetElement*>(nextToOpen())) {
>          aos = TFile::GetAsyncOpenStatus(elem->GetFileName());
>          if (aos == TFile::kAOSSuccess || aos == TFile::kAOSNotAsync
>              || aos == TFile::kAOSFailure) {
>             elem->Lookup();
>             toOpenList->Remove(elem);
>          }
>          else if (aos != TFile::kAOSInProgress)
>             Error("fileOpenTestTmp", "unknown async open status");
>       }
>       nextToOpen.Reset();
>    }
>    toOpenList->Delete();
> 
>    TTime endtime = gSystem->Now();
>    Float_t time_holder = Long_t(endtime - starttime) / Float_t(1000);
>    cout << "Opening time was " << time_holder << " seconds" << endl;
> 
> 
> The result is:
> 
> #files    asynchronous (s)    standard TFile::Open (s)
> 300       12.5                11.7
> 240       9.68                9.4
> 120       4.5                 4.6
> 
> Have a nice weekend!
> Jan
> 
> Jan Iwaszkiewicz wrote:
>> Hi Fabrizio, Hi Andy!
>>
>> Thank you for the answers.
>> I'm running tests with TFile::AsyncOpen and will keep you informed. 
>> Maybe I should clarify that we want to look up the locations of the 
>> files on the PROOF master node but then open the files on the worker 
>> nodes. The point of the lookup is to determine which files each worker 
>> will open/process. As for the problems that Andy described:
>> 1) I agree. 2) It seems even more important, then, to parallelize it.
>>
>> In fact, the possibility to get all the locations of a file is also 
>> high on our wish-list. It would prevent us from opening a remote copy 
>> of a file while another copy sits on one of our workers; at the moment 
>> we have no mechanism to avoid that. I think it's quite a different use 
>> case from file serving: we want to make the best use of the set of 
>> nodes belonging to a PROOF session. It would be very useful to have 
>> this functionality!
>> Cheers,
>> Jan
>>
>> -----Original Message-----
>> From: Andrew Hanushevsky [mailto:[log in to unmask]]
>> Sent: Wed 8/16/2006 10:47 PM
>> To: Fabrizio Furano; Jan Iwaszkiewicz
>> Cc: [log in to unmask]; [log in to unmask]; Gerardo Ganis
>> Subject: Re: Quering locations of a vector of files
>>  
>> Hi Jan,
>>
>> Another way to speed up the processing is to use the Prepare method, 
>> which allows you to set in motion all the steps needed to get file 
>> location information. As for finding out the locations of a list of 
>> files, that may be doable but has problems of its own. In your case it 
>> probably doesn't matter, but in the general case two things may happen: 
>> 1) the location may be incorrect by the time you get the information 
>> (i.e., the file has been moved or deleted), and 2) there is no 
>> particular location for files that don't exist yet (this includes 
>> files that may be in an MSS but not yet on disk). The latter is more 
>> problematic, as it takes a while to determine. Anyway, we'll look into 
>> a mechanism to get you file location information (one of n for each 
>> file) using a list.
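>>
>> As an illustration only (treat the exact Prepare signature and the 
>> flag values below as assumptions to be checked against the 
>> XrdClientAdmin interface), the prepare for a whole list could be 
>> issued up front:
>>
>>    #include "XrdClient/XrdClientAdmin.hh"
>>
>>    // Fire one prepare request for a whole list of paths, so that the
>>    // location machinery starts working before the individual opens.
>>    void prepareAll(XrdClientAdmin &adm, vecString &paths)
>>    {
>>       adm.Prepare(paths, 0, 0); // opts and priority left at 0 (assumed)
>>    }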
>>
>> Andy
>>
>> ----- Original Message ----- From: "Fabrizio Furano" 
>> <[log in to unmask]>
>> To: "Jan Iwaszkiewicz" <[log in to unmask]>
>> Cc: <[log in to unmask]>; "Maarten Ballintijn" 
>> <[log in to unmask]>; "Gerri Ganis" <[log in to unmask]>
>> Sent: Wednesday, August 16, 2006 10:09 AM
>> Subject: Re: Quering locations of a vector of files
>>
>>
>>> Hi Jan,
>>>
>>>  at the moment such a primitive is not part of the protocol. The 
>>> simplest way of doing it is to call Stat for each file, but this 
>>> reduces the per-file overhead only by a small amount with respect to 
>>> an Open call.
>>>  In fact, both primitives actually drive the client to the final 
>>> endpoint (the file), so you cannot avoid the overhead (mainly 
>>> communication latencies) of being redirected to other servers.
>>>
>>>  Since you say it's critical for you, my suggestion is to open as 
>>> many files as you can in parallel. Doing so, all the latencies 
>>> overlap, and you can expect much higher performance.
>>>
>>>  To do this, just call TFile::AsyncOpen(fname) for each file you need 
>>> to open (one loop), and then, later, call the regular TFile::Open 
>>> (another loop).
>>>   The async call is non-blocking and very fast. You can find an 
>>> example of its ROOT-based usage here:
>>>
>>> http://root.cern.ch/root/Version512.news.html
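>>>
>>>  A minimal sketch of the two loops (assuming the file names sit in a 
>>> std::vector<std::string> called fileNames; openAll is just an 
>>> illustrative name):
>>>
>>>    #include <string>
>>>    #include <vector>
>>>    #include "TFile.h"
>>>
>>>    void openAll(const std::vector<std::string> &fileNames)
>>>    {
>>>       // First loop: submit every open; AsyncOpen returns immediately.
>>>       for (size_t i = 0; i < fileNames.size(); i++)
>>>          TFile::AsyncOpen(fileNames[i].c_str());
>>>
>>>       // Second loop: the regular Open attaches to the pending request,
>>>       // blocking only if that particular file is not ready yet.
>>>       for (size_t i = 0; i < fileNames.size(); i++) {
>>>          TFile *f = TFile::Open(fileNames[i].c_str());
>>>          if (f) {
>>>             // ... use the file ...
>>>          }
>>>       }
>>>    }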
>>>
>>>  The ugly thing is that doing this uses a lot of resources, so if you 
>>> really have a lot of files to open (say, 5000) and resources are a 
>>> problem, you can work around it by opening them in bunches of fixed 
>>> size, as in the sketch below.
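>>>
>>>  A sketch of the bunched variant, reusing fileNames from the previous 
>>> sketch (the bunch size of 500 is an arbitrary example value, to be 
>>> tuned to the available resources):
>>>
>>>    #include <algorithm>
>>>
>>>    const size_t kBunch = 500;
>>>    for (size_t start = 0; start < fileNames.size(); start += kBunch) {
>>>       size_t end = std::min(start + kBunch, fileNames.size());
>>>       for (size_t i = start; i < end; i++)   // submit one bunch
>>>          TFile::AsyncOpen(fileNames[i].c_str());
>>>       for (size_t i = start; i < end; i++) { // collect the bunch
>>>          TFile *f = TFile::Open(fileNames[i].c_str());
>>>          // ... process the file, then release it before the next bunch
>>>          delete f;
>>>       }
>>>    }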
>>>
>>> Fabrizio
>>>
>>> Jan Iwaszkiewicz wrote:
>>>> Hi,
>>>>
>>>> In PROOF we have realized that we need the possibility to query the 
>>>> exact locations of a set of files. As far as I have seen in the 
>>>> xrootd protocol, there is no way to ask for the locations of a 
>>>> vector of files.
>>>>
>>>> At the beginning of a query, we want to check the exact locations of 
>>>> all the files from a data set. The current implementation does this 
>>>> by opening all the files, one by one, at a speed of about 30 
>>>> files/sec. For many queries, the lookup therefore takes much longer 
>>>> than the processing.
>>>> It is a critical problem for us.
>>>>
>>>> The bool XrdClientAdmin::SysStatX(const char *paths_list, kXR_char 
>>>> *binInfo) method can check multiple files, but it only verifies 
>>>> whether the files exist. I imagine that it would be best for us to 
>>>> have something similar that returns file locations instead. Is such 
>>>> an extension to the protocol possible/reasonable to implement?
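>>>>
>>>> For illustration, the existence check we do today looks roughly like 
>>>> this (the newline-separated path list, the one-status-byte-per-path 
>>>> layout and the server URL are assumptions for the example):
>>>>
>>>>    #include "XrdClient/XrdClientAdmin.hh"
>>>>
>>>>    XrdClientAdmin adm("root://master.example.org:1094");
>>>>    kXR_char binInfo[2]; // one status byte per queried path
>>>>    if (adm.Connect())
>>>>       adm.SysStatX("/store/f1.root\n/store/f2.root", binInfo);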
>>>>
>>>> Cheers,
>>>> Jan