Hi Fons,

It's pretty easy to do. I would assume you'd roll that into your PROOF 
protocol, yes?

Andy

----- Original Message ----- 
From: "Fons Rademakers" <[log in to unmask]>
To: "Andrew Hanushevsky" <[log in to unmask]>
Cc: "Fabrizio Furano" <[log in to unmask]>; "Jan Iwaszkiewicz" 
<[log in to unmask]>; <[log in to unmask]>; <[log in to unmask]>; "Gerri 
Ganis" <[log in to unmask]>
Sent: Tuesday, August 22, 2006 2:26 AM
Subject: Re: Querying locations of a vector of files


> Hi Andy,
>
>  that would be no problem, assuming we can easily query the OLB admin 
> interface. How would this be done: via popen/pclose, or is there an API?
>
> Cheers, Fons.
>
>
>
> Andrew Hanushevsky wrote:
>> Hi Fons,
>>
>> It would probably be relatively easy to do if the query was entered via 
>> the OLB admin interface. It's more difficult to do via an xroot protocol 
>> query request. Would that satisfy you?
>>
>> Andy
>>
>> ----- Original Message ----- From: "Fons Rademakers" 
>> <[log in to unmask]>
>> To: "Fabrizio Furano" <[log in to unmask]>
>> Cc: "Jan Iwaszkiewicz" <[log in to unmask]>; <[log in to unmask]>; 
>> <[log in to unmask]>; "Gerri Ganis" <[log in to unmask]>
>> Sent: Saturday, August 19, 2006 3:02 PM
>> Subject: Re: Querying locations of a vector of files
>>
>>
>>> Hi Andy, Fabrizio,
>>>
>>>   what we urgently need is an xrootd command that takes as input a 
>>> vector of generic xrootd URLs and returns a vector of resolved URLs 
>>> (including multiple URLs when the same file exists on more than one 
>>> leaf node). Of course, the first time this will take some time, since 
>>> the head node will have to ask the leaf nodes, but from then on the 
>>> info lives in the xrootd head node cache, so it should be very quick. 
>>> We need the final locations in PROOF to submit work packets with 
>>> priority to the nodes that have the data locally.
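>>>
>>> To make the request concrete, here is a rough sketch of the kind of 
>>> interface we have in mind; the LocateMany name and signature are 
>>> purely hypothetical, not an existing XrdClientAdmin method:
>>>
>>>    #include <string>
>>>    #include <vector>
>>>
>>>    // Hypothetical bulk-locate interface (does not exist today):
>>>    // urls:      one generic root://head//path URL per entry
>>>    // locations: on return, locations[i] holds one resolved URL per
>>>    //            leaf node that has a copy of urls[i]
>>>    // returns true if every file could be located
>>>    bool LocateMany(const std::vector<std::string> &urls,
>>>                    std::vector<std::vector<std::string> > &locations);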
>>>
>>> Can you tell me if this feature is possible and if we can get it soon?
>>>
>>> Cheers, Fons.
>>>
>>>
>>>
>>> Fabrizio Furano wrote:
>>>> Hi Jan,
>>>>
>>>>  I see; IMHO this means that there is very little overhead you can 
>>>> overlap, at least on the client side. Either that, or you are opening 
>>>> all those files against very few servers, or even the same one. I 
>>>> hope not.
>>>>
>>>>  Anyway, the async open was not meant as a way to speed up the open 
>>>> primitive, but as a way to do other things while the open is in 
>>>> progress, or to stage many files in parallel without serializing the 
>>>> waits. But in your situation it seems there are not that many waits 
>>>> to parallelize.
>>>>
>>>> Fabrizio
>>>>
>>>>
>>>> Jan Iwaszkiewicz wrote:
>>>>> Hi!
>>>>>
>>>>> I have done some tests as Fabrizio advised.
>>>>> The results with asynchronous open are similar to those with the
>>>>> standard open:
>>>>>
>>>>> I used the following code:
>>>>>
>>>>>    TTime starttime = gSystem->Now();
>>>>>    TList *toOpenList = new TList();
>>>>>    toOpenList->SetOwner(kFALSE);
>>>>>    TIter nextElem(fDset->GetListOfElements());
>>>>>    // Fire off a non-blocking open for each element of the data set.
>>>>>    while (TDSetElement *elem =
>>>>>              dynamic_cast<TDSetElement*>(nextElem())) {
>>>>>       TFile::AsyncOpen(elem->GetFileName());
>>>>>       toOpenList->Add(elem);
>>>>>    }
>>>>>
>>>>>    // Poll the pending opens until all have reached a final state.
>>>>>    TFile::EAsyncOpenStatus aos;
>>>>>    TIter nextToOpen(toOpenList);
>>>>>    while (toOpenList->GetSize() > 0) {
>>>>>       while (TDSetElement *elem =
>>>>>                 dynamic_cast<TDSetElement*>(nextToOpen())) {
>>>>>          aos = TFile::GetAsyncOpenStatus(elem->GetFileName());
>>>>>          if (aos == TFile::kAOSSuccess || aos == TFile::kAOSNotAsync
>>>>>              || aos == TFile::kAOSFailure) {
>>>>>             elem->Lookup();
>>>>>             toOpenList->Remove(elem);
>>>>>          } else if (aos != TFile::kAOSInProgress) {
>>>>>             Error("fileOpenTestTmp", "unknown aos");
>>>>>          }
>>>>>       }
>>>>>       nextToOpen.Reset();
>>>>>    }
>>>>>    toOpenList->Delete();
>>>>>
>>>>>    TTime endtime = gSystem->Now();
>>>>>    Float_t time_holder = Long_t(endtime - starttime) / Float_t(1000);
>>>>>    cout << "Opening time was " << time_holder << " seconds" << endl;
>>>>>
>>>>>
>>>>> The results (open times, in seconds) are:
>>>>>
>>>>> #files   asynchronous   standard TFile::Open
>>>>> 300      12.5           11.7
>>>>> 240      9.68           9.4
>>>>> 120      4.5            4.6
>>>>>
>>>>> Have a nice weekend!
>>>>> Jan
>>>>>
>>>>> Jan Iwaszkiewicz wrote:
>>>>>> Hi Fabrizio, Hi Andy!
>>>>>>
>>>>>> Thank you for the answers.
>>>>>> I'm making tests with TFile::AsyncOpen and will keep you informed. 
>>>>>> Maybe I should clarify that we want to look up the locations of the 
>>>>>> files on the PROOF master node, but then open the files on the worker 
>>>>>> nodes. The point of the lookup is to determine which files each worker 
>>>>>> will open/process. Regarding the problems that Andy described:
>>>>>> 1) I agree. 2) It seems even more important to parallelize it.
>>>>>>
>>>>>> In fact, the possibility of getting all the locations of a file is 
>>>>>> also high on our wish-list. It would prevent us from opening a remote 
>>>>>> file while another copy sits on one of our workers; at the moment we 
>>>>>> have no mechanism to avoid that. I think it's quite a different use 
>>>>>> case from file serving: we want to make the best use of the set of 
>>>>>> nodes belonging to a PROOF session. It would be very useful to have 
>>>>>> this functionality!
>>>>>> Cheers,
>>>>>> Jan
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Andrew Hanushevsky [mailto:[log in to unmask]]
>>>>>> Sent: Wed 8/16/2006 10:47 PM
>>>>>> To: Fabrizio Furano; Jan Iwaszkiewicz
>>>>>> Cc: [log in to unmask]; [log in to unmask]; Gerardo Ganis
>>>>>> Subject: Re: Querying locations of a vector of files
>>>>>>  Hi Jan,
>>>>>>
>>>>>> Another way to speed up the processing is to use the Prepare method, 
>>>>>> which lets you set in motion all the steps needed to get file 
>>>>>> location information. As for finding out the locations of a list of 
>>>>>> files, that may be doable, but it has problems of its own. In your 
>>>>>> case it probably doesn't matter, but in the general case two things 
>>>>>> may happen: 1) the location may be incorrect by the time you get the 
>>>>>> information (i.e., the file has been moved or deleted), and 2) there 
>>>>>> is no particular location for files that don't exist yet (this 
>>>>>> includes files that may be in an MSS but not yet on disk). The latter 
>>>>>> is more problematic, as it takes a while to determine. Anyway, we'll 
>>>>>> look into a mechanism to get you file location information (one of n 
>>>>>> for each file) using a list.
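>>>>>>
>>>>>> As a rough illustration (the exact Prepare signature and option 
>>>>>> values here are assumptions; check XrdClientAdmin.hh in your 
>>>>>> release), the idea is simply to hand the whole list over ahead of 
>>>>>> time:
>>>>>>
>>>>>>    #include "XrdClient/XrdClientAdmin.hh"
>>>>>>
>>>>>>    // Kick off the location/staging machinery for many files at
>>>>>>    // once, then do the real opens later.
>>>>>>    XrdClientAdmin adm("root://headnode:1094//dummy");
>>>>>>    if (adm.Connect()) {
>>>>>>       vecString paths;
>>>>>>       XrdOucString p("/store/run1/file1.root");
>>>>>>       paths.Push_back(p);
>>>>>>       // opts and priority left at 0 for this sketch
>>>>>>       adm.Prepare(paths, 0, 0);
>>>>>>    }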
>>>>>>
>>>>>> Andy
>>>>>>
>>>>>> ----- Original Message ----- From: "Fabrizio Furano" 
>>>>>> <[log in to unmask]>
>>>>>> To: "Jan Iwaszkiewicz" <[log in to unmask]>
>>>>>> Cc: <[log in to unmask]>; "Maarten Ballintijn" 
>>>>>> <[log in to unmask]>; "Gerri Ganis" <[log in to unmask]>
>>>>>> Sent: Wednesday, August 16, 2006 10:09 AM
>>>>>> Subject: Re: Querying locations of a vector of files
>>>>>>
>>>>>>
>>>>>>> Hi Jan,
>>>>>>>
>>>>>>>  at the moment such a primitive is not part of the protocol. The 
>>>>>>> simplest way of doing it is to call Stat for each file, but this 
>>>>>>> reduces the per-file overhead only by a small amount with respect 
>>>>>>> to an Open call.
>>>>>>>  In fact, both primitives actually drive the client to the final 
>>>>>>> endpoint (the file), so you cannot avoid the overhead (mainly 
>>>>>>> communication latencies) of being redirected to other servers.
>>>>>>>
>>>>>>>  Since you say it's critical for you, my suggestion is to open as 
>>>>>>> many files as you can in parallel. That way the latencies overlap, 
>>>>>>> and you can expect much better performance.
>>>>>>>
>>>>>>>  To do this, just call TFile::AsyncOpen(fname) for each file you 
>>>>>>> need to open (one cycle), and then, later, call the regular 
>>>>>>> TFile::Open (another cycle).
>>>>>>>   The async call is non-blocking and very fast. You can find an 
>>>>>>> example of its ROOT-based usage here:
>>>>>>>
>>>>>>> http://root.cern.ch/root/Version512.news.html
>>>>>>>
>>>>>>>  The ugly thing is that doing this uses a lot of resources; so, if 
>>>>>>> you have a really large number of files to open (say, 5000) and 
>>>>>>> resources are a problem, you can work around it by opening them in 
>>>>>>> bunches of a fixed size, as in the sketch below.
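>>>>>>>
>>>>>>>  A minimal sketch of the two cycles, in bunches (the bunch size of 
>>>>>>> 100 is just an example value, and fileNames is assumed to be an 
>>>>>>> array of nFiles URL strings):
>>>>>>>
>>>>>>>    // Cycle 1: fire off the non-blocking opens for one bunch;
>>>>>>>    // cycle 2: the regular Open then only waits for completion,
>>>>>>>    // so the redirection latencies of the bunch overlap.
>>>>>>>    const Int_t kBunch = 100;
>>>>>>>    for (Int_t i = 0; i < nFiles; i += kBunch) {
>>>>>>>       Int_t last = TMath::Min(i + kBunch, nFiles);
>>>>>>>       for (Int_t j = i; j < last; j++)
>>>>>>>          TFile::AsyncOpen(fileNames[j]);
>>>>>>>       for (Int_t j = i; j < last; j++) {
>>>>>>>          TFile *f = TFile::Open(fileNames[j]);
>>>>>>>          if (f) { /* ... use the file ... */ f->Close(); delete f; }
>>>>>>>       }
>>>>>>>    }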
>>>>>>>
>>>>>>> Fabrizio
>>>>>>>
>>>>>>> Jan Iwaszkiewicz wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> In PROOF we realized that we need a way to query the exact 
>>>>>>>> locations of a set of files. As far as I have seen in the xrootd 
>>>>>>>> protocol, there is no way to ask for the locations of a vector of 
>>>>>>>> files.
>>>>>>>>
>>>>>>>> At the beginning of a query, we want to check the exact locations 
>>>>>>>> of all the files from a data set. The current implementation does 
>>>>>>>> it by opening all the files, one by one, at a speed of about 30 
>>>>>>>> files/sec. For many queries, this lookup takes much longer than 
>>>>>>>> the processing itself. It is a critical problem for us.
>>>>>>>>
>>>>>>>> The bool XrdClientAdmin::SysStatX(const char *paths_list, kXR_char 
>>>>>>>> *binInfo) method can check multiple files, but it only verifies 
>>>>>>>> whether the files exist.
>>>>>>>> I imagine that it would be best for us to have something similar, 
>>>>>>>> but returning file locations. Is such an extension to the protocol 
>>>>>>>> possible/reasonable to implement?
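>>>>>>>>
>>>>>>>> For reference, this is how I would expect the existing call to be 
>>>>>>>> used (the newline-separated path list is my assumption from reading 
>>>>>>>> the client code; please correct me if the convention differs):
>>>>>>>>
>>>>>>>>    #include "XrdClient/XrdClientAdmin.hh"
>>>>>>>>
>>>>>>>>    XrdClientAdmin adm("root://headnode:1094//dummy");
>>>>>>>>    if (adm.Connect()) {
>>>>>>>>       // One byte of flag bits per path comes back in info.
>>>>>>>>       const char *paths = "/store/f1.root\n/store/f2.root";
>>>>>>>>       kXR_char info[2];
>>>>>>>>       if (adm.SysStatX(paths, info)) {
>>>>>>>>          // info[i] tells whether file i exists (and similar
>>>>>>>>          // flags), but not where it is; hence this request.
>>>>>>>>       }
>>>>>>>>    }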
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Jan
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>> -- 
>>> Org:    CERN, European Laboratory for Particle Physics.
>>> Mail:   1211 Geneve 23, Switzerland
>>> E-Mail: [log in to unmask]              Phone: +41 22 7679248
>>> WWW:    http://fons.rademakers.org           Fax:   +41 22 7669640
>>>
>>
>
> -- 
> Org:    CERN, European Laboratory for Particle Physics.
> Mail:   1211 Geneve 23, Switzerland
> E-Mail: [log in to unmask]              Phone: +41 22 7679248
> WWW:    http://fons.rademakers.org           Fax:   +41 22 7669640
>