Hi Andy,

  that would be no problem, assuming we can easily query the olb admin 
interface. How would this be done: via popen/pclose, or is there an API?

Cheers, Fons.



Andrew Hanushevsky wrote:
> Hi Fons,
> 
> It would probably be relatively easy to do if the query was entered via 
> the OLB admin interface. It's more difficult to do via an xroot protocol 
> query request. Would that satisfy you?
> 
> Andy
> 
> ----- Original Message ----- From: "Fons Rademakers" 
> <[log in to unmask]>
> To: "Fabrizio Furano" <[log in to unmask]>
> Cc: "Jan Iwaszkiewicz" <[log in to unmask]>; <[log in to unmask]>; 
> <[log in to unmask]>; "Gerri Ganis" <[log in to unmask]>
> Sent: Saturday, August 19, 2006 3:02 PM
> Subject: Re: Querying locations of a vector of files
> 
> 
>> Hi Andy, Fabrizio,
>>
>>   what we urgently need is an xrootd command that takes as input a 
>> vector of generic xrootd URLs and returns a vector of resolved URLs 
>> (including multiple URLs in case the same file exists on more than one 
>> leaf node). Of course, the first time this will take some time, since 
>> the head node will have to ask the leaf nodes, but from then on this 
>> info lives in the xrootd head node cache, so it should be very quick. 
>> We need the final locations in PROOF to submit work packets with 
>> priority to the nodes that hold the data locally.
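>>
>>   Just to illustrate how we would use the answer (a rough sketch only, 
>> assuming the resolved URLs carry the data-server host names; the helper 
>> below is hypothetical, not existing PROOF code):
>>
>> #include <string>
>> #include <vector>
>> #include "TUrl.h"
>>
>> // Given the resolved URLs of one file and the list of worker host
>> // names, return the workers that hold the file locally (sketch only).
>> std::vector<std::string>
>> LocalWorkers(const std::vector<std::string> &resolvedUrls,
>>              const std::vector<std::string> &workerHosts)
>> {
>>    std::vector<std::string> local;
>>    for (size_t i = 0; i < resolvedUrls.size(); ++i) {
>>       TUrl u(resolvedUrls[i].c_str());       // e.g. root://<node>:1094//path
>>       for (size_t j = 0; j < workerHosts.size(); ++j)
>>          if (workerHosts[j] == u.GetHost())
>>             local.push_back(workerHosts[j]); // packet goes here with priority
>>    }
>>    return local;
>> }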
>>
>> Can you tell me if this feature is possible and if we can get it soon?
>>
>> Cheers, Fons.
>>
>>
>>
>> Fabrizio Furano wrote:
>>> Hi Jan,
>>>
>>>  I see; imho this means that there is very little overhead you can 
>>> overlap, at least on the client side. Either that, or you are opening 
>>> all those files against very few servers, or the same one. I hope not.
>>>
>>>  Anyway the async open was not meant as a way to speed up the open 
>>> primitive, but as a way to do other things while the open is in 
>>> progress, or to stage many files in parallel without serializing the 
>>> waits. But in your situation it seems that there are not so many 
>>> waits to parallelize.
>>>
>>> Fabrizio
>>>
>>>
>>> Jan Iwaszkiewicz wrote:
>>>> Hi!
>>>>
>>>> I have done some tests as Fabrizio advised.
>>>> The results of tests with asynchronous open are similar to those 
>>>> with standard open:
>>>>
>>>> I used the following code:
>>>>
>>>>    // Fire a non-blocking AsyncOpen request for every file in the data set
>>>>    TTime starttime = gSystem->Now();
>>>>    TList *toOpenList = new TList();
>>>>    toOpenList->SetOwner(kFALSE);
>>>>    TIter nextElem(fDset->GetListOfElements());
>>>>    while (TDSetElement *elem =
>>>>           dynamic_cast<TDSetElement*>(nextElem())) {
>>>>       TFile::AsyncOpen(elem->GetFileName());
>>>>       toOpenList->Add(elem);
>>>>    }
>>>>
>>>>    // Poll until every pending open has completed (or failed)
>>>>    TFile::EAsyncOpenStatus aos;
>>>>    TIter nextToOpen(toOpenList);
>>>>    while (toOpenList->GetSize() > 0) {
>>>>       while (TDSetElement *elem =
>>>>              dynamic_cast<TDSetElement*>(nextToOpen())) {
>>>>          aos = TFile::GetAsyncOpenStatus(elem->GetFileName());
>>>>          if (aos == TFile::kAOSSuccess || aos == TFile::kAOSNotAsync
>>>>              || aos == TFile::kAOSFailure) {
>>>>             elem->Lookup();
>>>>             toOpenList->Remove(elem);
>>>>          } else if (aos != TFile::kAOSInProgress) {
>>>>             Error("fileOpenTestTmp", "unknown async open status");
>>>>          }
>>>>       }
>>>>       nextToOpen.Reset();
>>>>    }
>>>>    toOpenList->Delete();
>>>>
>>>>    TTime endtime = gSystem->Now();
>>>>    Float_t time_holder = Long_t(endtime - starttime) / Float_t(1000);
>>>>    cout << "Opening time was " << time_holder << " seconds" << endl;
>>>>
>>>>
>>>> The result is:
>>>>
>>>>   #files   asynchronous [s]   standard TFile::Open [s]
>>>>      300        12.5                 11.7
>>>>      240         9.68                 9.4
>>>>      120         4.5                  4.6
>>>>
>>>> Have a nice weekend!
>>>> Jan
>>>>
>>>> Jan Iwaszkiewicz wrote:
>>>>> Hi Fabrizio, Hi Andy!
>>>>>
>>>>> Thank you for the answers.
>>>>> I'm making tests with TFile::AsyncOpen and will keep you informed. 
>>>>> Maybe I should clarify that we want to look up the locations of the 
>>>>> files on the PROOF master node, but then open the files on the worker 
>>>>> nodes. The point of the lookup is to determine which files each 
>>>>> worker will open/process. Regarding the problems that Andy described:
>>>>> 1) I agree. 2) It seems even more important to parallelize the lookups.
>>>>>
>>>>> In fact, the possibility to get all locations of a file is also high 
>>>>> on our wish-list. It would prevent us from opening a remote file 
>>>>> while another copy sits on one of our workers; we currently have no 
>>>>> mechanism to avoid that. I think it's quite a different use case from 
>>>>> file serving: we want to make the best use of the set of nodes 
>>>>> belonging to a PROOF session. It would be very useful to have this 
>>>>> functionality!
>>>>> Cheers,
>>>>> Jan
>>>>>
>>>>> -----Original Message-----
>>>>> From: Andrew Hanushevsky [mailto:[log in to unmask]]
>>>>> Sent: Wed 8/16/2006 10:47 PM
>>>>> To: Fabrizio Furano; Jan Iwaszkiewicz
>>>>> Cc: [log in to unmask]; [log in to unmask]; Gerardo Ganis
>>>>> Subject: Re: Querying locations of a vector of files
>>>>>
>>>>>  Hi Jan,
>>>>>
>>>>> Another way to speed up the processing is to use the Prepare method 
>>>>> that allows you to set in motion all the steps needed to get file 
>>>>> location information. As for finding out the locations of a list 
>>>>> of files, that may be doable but has problems of its own. In your 
>>>>> case it probably doesn't matter but in the general case two things 
>>>>> may happen: 1) the location may be incorrect by the time you get 
>>>>> the information (i.e., the file has been moved or deleted), and 2) 
>>>>> there is no particular location for files that don't exist yet 
>>>>> (this includes files that may be in an MSS but not yet on disk). 
>>>>> The latter is more problematical as it takes a while to determine 
>>>>> that. Anyway, we'll look into a mechanism to get you file location 
>>>>> information (one of n for each file) using a list.
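>>>>>
>>>>> A rough sketch of how the Prepare route could be driven from the 
>>>>> client side. The XrdClientAdmin::Prepare signature assumed below 
>>>>> (a vecString of paths plus option and priority bytes), the 
>>>>> vecString/Push_back usage and the redirector URL are assumptions 
>>>>> to be checked against the current XrdClientAdmin.hh:
>>>>>
>>>>> #include <string>
>>>>> #include <vector>
>>>>> #include "XrdClient/XrdClientAdmin.hh"
>>>>>
>>>>> // Ask the head node to start locating/staging all files at once,
>>>>> // instead of paying the lookup latency one file at a time.
>>>>> void PrepareFiles(const std::vector<std::string> &paths)
>>>>> {
>>>>>    // Hypothetical redirector URL, replace with the real head node
>>>>>    XrdClientAdmin adm("root://headnode:1094//dummy");
>>>>>    if (!adm.Connect()) return;
>>>>>
>>>>>    vecString vs;                     // xrootd vector of XrdOucString
>>>>>    for (size_t i = 0; i < paths.size(); ++i) {
>>>>>       XrdOucString p(paths[i].c_str());
>>>>>       vs.Push_back(p);
>>>>>    }
>>>>>
>>>>>    adm.Prepare(vs, 0, 0);            // opts = 0, prty = 0 (assumed meaning)
>>>>> }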
>>>>>
>>>>> Andy
>>>>>
>>>>> ----- Original Message ----- From: "Fabrizio Furano" 
>>>>> <[log in to unmask]>
>>>>> To: "Jan Iwaszkiewicz" <[log in to unmask]>
>>>>> Cc: <[log in to unmask]>; "Maarten Ballintijn" 
>>>>> <[log in to unmask]>; "Gerri Ganis" <[log in to unmask]>
>>>>> Sent: Wednesday, August 16, 2006 10:09 AM
>>>>> Subject: Re: Querying locations of a vector of files
>>>>>
>>>>>
>>>>>> Hi Jan,
>>>>>>
>>>>>>  at the moment such a primitive is not part of the protocol. The 
>>>>>> simplest way of doing it is to call Stat for each file, but this 
>>>>>> reduces the per-file overhead only by a small amount with respect 
>>>>>> to an Open call.
>>>>>>  In fact, both primitives actually drive the client to the final 
>>>>>> endpoint (the file), so you cannot avoid the overhead (mainly 
>>>>>> communication latencies) of being redirected to other servers.
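>>>>>>
>>>>>>  For reference, the per-file Stat loop would look roughly like this 
>>>>>> (a sketch only; the XrdClientAdmin::Stat(const char*, long&, 
>>>>>> long long&, long&, long&) signature is assumed and should be 
>>>>>> checked against the header):
>>>>>>
>>>>>> #include <string>
>>>>>> #include <vector>
>>>>>> #include "XrdClient/XrdClientAdmin.hh"
>>>>>>
>>>>>> void StatOneByOne(XrdClientAdmin &adm,
>>>>>>                   const std::vector<std::string> &paths)
>>>>>> {
>>>>>>    for (size_t i = 0; i < paths.size(); ++i) {
>>>>>>       long id, flags, modtime;
>>>>>>       long long size;
>>>>>>       // Each call may still redirect the client to the data server
>>>>>>       // holding the file, so the latency is close to an Open.
>>>>>>       adm.Stat(paths[i].c_str(), id, size, flags, modtime);
>>>>>>    }
>>>>>> }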
>>>>>>
>>>>>>  Since you say it's critical for you, my suggestion is to open as 
>>>>>> many files as you can in parallel. Doing so, all the latencies 
>>>>>> overlap, and you can expect much better performance.
>>>>>>
>>>>>>  To do this, just call TFile::AsyncOpen(fname) for each file you 
>>>>>> need to open (one loop), and then, later, call the regular 
>>>>>> TFile::Open (a second loop).
>>>>>>   The async call is non-blocking and very fast. You can find an 
>>>>>> example of its ROOT-based usage here:
>>>>>>
>>>>>> http://root.cern.ch/root/Version512.news.html
>>>>>>
>>>>>>  The ugly thing is that doing this uses a lot of resources, so if 
>>>>>> you really have a lot of files to open (say, 5000) and resources 
>>>>>> are a problem, you can work around it by opening them in bunches 
>>>>>> of fixed size, as sketched below.
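>>>>>>
>>>>>>  Something along these lines (a sketch only; the bunch size of 50 
>>>>>> and the helper name are arbitrary):
>>>>>>
>>>>>> #include <string>
>>>>>> #include <vector>
>>>>>> #include "TFile.h"
>>>>>>
>>>>>> // Open files in fixed-size bunches so that only kBunch non-blocking
>>>>>> // opens are in flight at any time.
>>>>>> void OpenInBunches(const std::vector<std::string> &fileNames)
>>>>>> {
>>>>>>    const size_t kBunch = 50;
>>>>>>    for (size_t first = 0; first < fileNames.size(); first += kBunch) {
>>>>>>       size_t last = fileNames.size();
>>>>>>       if (first + kBunch < last) last = first + kBunch;
>>>>>>
>>>>>>       // Cycle 1: fire the non-blocking opens for this bunch
>>>>>>       for (size_t i = first; i < last; ++i)
>>>>>>          TFile::AsyncOpen(fileNames[i].c_str());
>>>>>>
>>>>>>       // Cycle 2: the regular Open now mostly finds the file ready
>>>>>>       for (size_t i = first; i < last; ++i) {
>>>>>>          TFile *f = TFile::Open(fileNames[i].c_str());
>>>>>>          if (f) { f->Close(); delete f; }
>>>>>>       }
>>>>>>    }
>>>>>> }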
>>>>>>
>>>>>> Fabrizio
>>>>>>
>>>>>> Jan Iwaszkiewicz wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> In PROOF we have realized that we need a way to query the exact 
>>>>>>> locations of a set of files. As far as I can see, the xrootd 
>>>>>>> protocol has no way to ask for the locations of a vector of files.
>>>>>>>
>>>>>>> At the beginning of a query, we want to check the exact locations 
>>>>>>> of all the files from a data set. The current implementation does 
>>>>>>> this by opening all the files, one by one.
>>>>>>> The speed is about 30 files/sec. For many queries, the lookup 
>>>>>>> takes much longer than the processing.
>>>>>>> It is a critical problem for us.
>>>>>>>
>>>>>>> The bool XrdClientAdmin::SysStatX(const char *paths_list, 
>>>>>>> kXR_char *binInfo) method can check multiple files but it only 
>>>>>>> verifies whether the files exist.
>>>>>>> I imagine that it would be best for us to have something similar 
>>>>>>> but returning file locations. Is such an extension to the 
>>>>>>> protocol possible/reasonable to implement?
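>>>>>>>
>>>>>>> Just to illustrate the shape of what we are after (the name and 
>>>>>>> signature below are purely hypothetical, by analogy with SysStatX; 
>>>>>>> I also assume paths_list is a newline-separated list of paths):
>>>>>>>
>>>>>>> #include <string>
>>>>>>> #include <vector>
>>>>>>>
>>>>>>> // Existing bulk call (existence check only), as in XrdClientAdmin:
>>>>>>> //   bool SysStatX(const char *paths_list, kXR_char *binInfo);
>>>>>>>
>>>>>>> // Hypothetical extension: same bulk interface, but returning, for
>>>>>>> // each path, the list of data servers currently holding a copy.
>>>>>>> // Neither the name nor the signature exists in the protocol today.
>>>>>>> bool LocateX(const char *paths_list,
>>>>>>>              std::vector<std::vector<std::string> > &locations);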
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Jan
>>>>>
>>>>>
>>>>>
>>>
>>
>> -- 
>> Org:    CERN, European Laboratory for Particle Physics.
>> Mail:   1211 Geneve 23, Switzerland
>> E-Mail: [log in to unmask]              Phone: +41 22 7679248
>> WWW:    http://fons.rademakers.org           Fax:   +41 22 7669640
>>
> 

-- 
Org:    CERN, European Laboratory for Particle Physics.
Mail:   1211 Geneve 23, Switzerland
E-Mail: [log in to unmask]              Phone: +41 22 7679248
WWW:    http://fons.rademakers.org           Fax:   +41 22 7669640