Hi all,

Wilko Kroeger wrote:
> 
> Hello Horst
> 
> On Wed, 8 Mar 2006, Goeringer Dr. Horst wrote:
> 
>> Hi all,
>>
>> at GSI in Darmstadt/Germany, the home of large heavy ion experiments
>> and Alice tier 2 center,
>> we offer some ten file servers as file cache for batch farm analysis.
>> These file servers are filled and managed by gStore,
>> the GSI Mass Storage System.
>>
>> In preparation for a batch farm analysis, each authorized user is able
>> to fill any amount of the available cache with files from our tape
>> libraries by submitting a single gStore command. gStore takes care of
>> proper distribution of the new files among the file server nodes.
>> During analysis, the jobs in the batch farm read their input files from
>> cache very efficiently with gStore commands or via the RFIO API.
>> Sorry for the long explanation, but although this is a very common
>> scheme, I think it was necessary as background to understand our problem.
>>
>> For the future we plan to provide xrootd as additional access method
>> from the batch farm (read single files).
>> First test installations of xrootd servers and redirector are done.
>> Now my question: What is the most efficient way to tell the redirector
>> that there are new files (100s or 1000s) on "his" file server disks?
>> Provide a filelist to the redirector? How?
> 
> The most efficient way is to use the prepare command. This command is
> sent to the redirector with a list of files. The redirector will then
> locate these files and update its cache. The c-client library, the Perl
> bindings to this library, or the xrd command line tool could be used to
> issue the prepare command. So far I have only tested the Perl interface,
> which would look something like this:
> 
> XrdClientAdmin::XrdInitialize("root://redirectorName///");
> my $f = "//prod/tp/f4\n///prod/tp/f5\n///prod/tp/f6";
> XrdClientAdmin::XrdPrepare($f," "," ");
> XrdClientAdmin::XrdTerminate();
> 
> I will test the xrd command and will put some documentation on the xrootd
> web page.
> 
>> Make open/close calls for each new file just written?
>> Tell the redirector to rebuild his cache tables? How?
> 
> The redirector keeps only files that have been requested by users in the
> cache. If a file hasn't been requested for more than 8h (this can be set
> in the config) it is removed from the cache.
> The redirector doesn't try to keep all files that are on disk in the
> cache.
> 
>> It is very important that the redirector knows all files
>> before an analysis run starts, as the 5 seconds latency time to find
>> a new file is not acceptable.
> 
> Changing the 5s delay is in development. The redirector will be
> able to respond to the client as soon as it finds a file.
> Using prepare should work for you.

  Yes, this is correct, but just for the record: within those 5 secs you 
can open many files in parallel. If you have to open 100 files and you 
are unlucky enough that the redirector does not know where they are, you 
will NOT wait 500 seconds. So, if you have a list of files to open, you 
can open them all at the maximum achievable speed by just looping over 
your list and:
- opening the files normally (if using XrdClient)
- invoking TFile::AsyncOpenRequest if under ROOT 5
  Doing this, you also have the advantage that you can stage many files 
in parallel, provided that you have more than one tape unit, of course.
  Theoretically you could open up to 2000-3000 files (already in the cache) 
in those 5 secs. In practice it depends on many things... but anyway 
it's very efficient.
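
  To illustrate why the waits overlap instead of adding up, here is a toy 
sketch in Python. It is not xrootd or ROOT code: open_remote is a made-up 
stand-in that just sleeps, the paths are only illustrative, and the delay 
is scaled down from 5 s to 0.05 s so that it runs quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Assumption: stand-in for the redirector's 5 s lookup delay, scaled down.
LOCATE_DELAY = 0.05


def open_remote(path):
    """Hypothetical blocking open; it only simulates the lookup wait."""
    time.sleep(LOCATE_DELAY)
    return "handle:" + path


def open_all(paths):
    """Issue every open at once so that the lookup delays overlap."""
    with ThreadPoolExecutor(max_workers=max(1, len(paths))) as pool:
        # map() keeps the results in the same order as the input list.
        return list(pool.map(open_remote, paths))


paths = ["/prod/tp/f%d" % i for i in range(20)]  # illustrative names only
start = time.monotonic()
handles = open_all(paths)
elapsed = time.monotonic() - start
print("opened %d files in %.2f s; one by one would need about %.2f s"
      % (len(handles), elapsed, len(paths) * LOCATE_DELAY))
```

  With all requests in flight at the same time, the total wait stays close 
to a single lookup delay rather than growing with the number of files.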

  Unfortunately, this is possible only in the newer releases of the 
client. There should already be a ROOT 5 version using it. Just ask 
Gerri for it.
  In a few days I am going to retest this stuff on the head, which looks 
suspect to me after some commits I saw. Then I will do a big commit 
that also contains the newest additions.

Fabrizio



> 
> Cheers,
>    Wilko
> 
>> Thank you very much for your help!
>> Horst Goeringer