Hello Pete, Andy, and all,

          At the moment, it is a read-only system, so the user
recreated his file and migrated it to HPSS using the usual rfio
command "rfcp".
This can be considered a backdoor, and it is a special case (I did not
think I would encounter it): files are usually written once and read
many times, so the problem I mentioned is usually avoided in the HEP
world, but it can happen with user or privately produced files. I
understand it will rarely be seen, but it could happen.
I think this issue can be encountered even if the writing is done within
the xrootd service (as opposed to the case where a "backdoor" was used).
There would be 2 cases:

a) 1 pool allowing r/w: assume one would like to update a file foo
(initial version: v1). We can have 2 copies of foo_v1.root on 2
different servers, s1 and s2. Then, let's assume that s1:foo_v1.root
is updated. We will then have on disk:
s1:foo_v2.root
s2:foo_v1.root
and in HPSS, foo_v2.root will supersede foo_v1.root after migration,
but the cache is then inconsistent.
Someone who wants to read the latest version of foo can be directed
to s2:foo_v1.root. That is not what we want, and this is also why I
don't really like the idea of mixing read and write in a single pool.
A way to avoid the situation would be to check against the MSS core
server, as I proposed. I understand the concern that when the MSS is
down, all of xrootd could a priori be down too, even for files already
on disk. But in the case of an MSS failure, the checking operation
would return an error, and in that case one could resume activity with
the copy on disk.
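
To make the check concrete, here is a minimal sketch in Python of the
decision only. It is not xrootd code: must_restage and the mss_mtime
callable are hypothetical names, and mss_mtime stands in for whatever
issues the "statx" request to the MSS. It assumes an MSS failure shows
up as an OSError.

```python
from typing import Callable


def must_restage(cache_ctime: float, mss_mtime: Callable[[], float]) -> bool:
    """Return True if the cached copy is older than the master copy in the MSS.

    cache_ctime: creation time of the file on the cache disk (t1).
    mss_mtime:   hypothetical callable returning the last-modified time of
                 the master copy (t2); assumed to raise OSError when the
                 MSS cannot be reached.
    """
    try:
        t2 = mss_mtime()
    except OSError:
        # MSS is down: the check fails, so keep serving the copy on disk.
        return False
    # Restage only when the master copy is newer than the cached one.
    return cache_ctime < t2
```

With t1 = 100 and t2 = 200 the file is restaged; with the MSS down the
function returns False and the disk copy keeps being served.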

b) 1 pool for read access + 1 for writing:
clearly, if files get updated in the writing pool, then the cache of
the read pool would no longer be up to date and you would have to find
a mechanism to update it. It could be done using xrdcp.
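
As an illustration of the bookkeeping such a mechanism would need, here
is a small Python sketch that lists the files to refresh. The function
name and the two dictionaries (path -> last modification time, one per
pool) are assumptions made for the example; how these times are
collected, and the copy itself (e.g. with xrdcp), are left out.

```python
from typing import Dict, List


def stale_in_read_pool(write_pool: Dict[str, float],
                       read_pool: Dict[str, float]) -> List[str]:
    """List files cached in the read pool that are older than the write pool's copy.

    Both arguments map a file path to its last modification time.
    Files absent from the read pool are ignored: they will simply be
    staged on first access.
    """
    return sorted(
        path
        for path, written_at in write_pool.items()
        if path in read_pool and read_pool[path] < written_at
    )
```

Each path returned would then be recopied into the read pool.
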
cheers,
JY

Peter Elmer wrote:

>  Hi Jean-Yves,
>
>  I didn't realize that you were offering a read/write service to the D0
>people, is that really correct?
>
>  As you know we only recently (last fall) brought up the read/write output
>buffer for skimming (which isn't backed by MSS, but a temp buffer), and
>Wilko and I are just now planning to bring up the /store/users read/write 
>system (backed by MSS) at SLAC in the next weeks. That was supposed to have 
>the stat check to MSS when a new file was being opened for write (only) to 
>verify that it does not already exist (i.e. this is the oss.check config
>directive, I think).
>
>  As Andy has pointed out, it was not foreseen to allow _updates_ to files 
>via any other mechanism than xrootd itself (if that is what is going 
>on). While a mechanism such as the one you suggest could be added to cover 
>cases such as "somebody came in through the backdoor to update the file", 
>it seems to me that it would be difficult and costly to do it properly and 
>generally. 
>
>  What alternate door did the user use to update the file in HPSS?
>
>                                   Pete
>
>On Wed, Mar 02, 2005 at 08:52:33PM +0100, Jean-Yves Nief wrote:
>  
>
>>         One of the D0 users accessing files via xrootd encountered the 
>>following issue: after having accessed a file via xrootd (so after it was 
>>staged from the "master" copy stored in HPSS), he modified the master 
>>copy in HPSS and wanted to access the modified file via xrootd. But as 
>>the old version of the file was already in the disk cache, no staging 
>>occurred (which is of course the expected behavior) and he 
>>got the old version, which is not what he wanted. As an 
>>emergency solution, and as it was the first time it happened, I 
>>deleted the old version from the disk cache so he could proceed.
>>However, I think it would be nice to have some control over the validity 
>>of the cache. One solution would be to add the following test: in case 
>>the file is already in the cache, compare the creation time on the cache 
>>disk (t1) with the last modified time of the file stored in HPSS (t2): 
>>if t1<t2 then restage the file.
>>It would add only a very small overhead each time a file is 
>>accessed: it just has to issue a "statx" request to the MSS.
>>Or maybe there is a simpler solution.
>>cheers,
>>JY
>>    
>>
>
>
>
>-------------------------------------------------------------------------
>Peter Elmer     E-mail: [log in to unmask]      Phone: +41 (22) 767-4644
>Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
>-------------------------------------------------------------------------
>  
>