Hello Pete, Andy, and all,

At the moment, it is a read-only system, so the user recreated his file and migrated it to HPSS using the usual RFIO command "rfcp". It can be considered a backdoor, and it is a special case (I did not think I would encounter it), since files are usually written once and read many times. So the problem I mentioned is usually avoided in the HEP world, but it can happen with user or privately produced files. I understand it will rarely be seen, but it could happen.

I think this issue can be encountered even if the writing is done within the xrootd service (as opposed to the case where a "backdoor" was used). There would be 2 cases:

a) 1 pool allowing r/w: assume one would like to update a file foo (initial version: v1). We can encounter the case where there are 2 duplicates of foo_v1.root on 2 different servers, s1 and s2. Then let's assume that s1:foo_v1.root is being updated. We will then have on disk:
     s1:foo_v2.root
     s2:foo_v1.root
and in HPSS, foo_v2.root will supersede foo_v1.root after migration. But then there is an inconsistency in the cache: someone who wants to read the latest version of foo can be directed to s2:foo_v1.root. That is not what we want, and this is also why I don't really like the idea of mixing read and write in a single pool. A way to avoid the situation would be to add a check against the MSS core server, as I proposed. I understand the concern that, with such a check, an MSS outage could a priori bring down all of xrootd, even for files already on disk. But in the case of an MSS failure the check would simply return an error, and in that case one could carry on with the copy on disk.

b) 1 pool for read access + 1 for writing: clearly, if files are updated in the writing pool, then the cache of the read pool will be out of date and you will have to find a mechanism to update it. It could be done using xrdcp.
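To make the check concrete, here is a minimal sketch in Python. It is not xrootd code: the names should_restage, MSSUnavailable and lookup_from_table are hypothetical, and the lookup function merely stands in for a "statx" request to the MSS. It compares the time the cached copy was created (t1) with the last modification time in the MSS (t2), and keeps the disk copy if the MSS cannot be reached.

```python
from typing import Callable, Dict, Optional


class MSSUnavailable(Exception):
    """Hypothetical error: the MSS could not answer the stat request."""


def should_restage(t1: float, mss_mtime: Callable[[str], float], path: str) -> bool:
    """Return True if the cached copy of `path` is older than the MSS copy.

    t1        -- creation time of the copy on the cache disk
    mss_mtime -- stand-in for a stat request to the MSS; returns the last
                 modification time (t2) or raises MSSUnavailable
    """
    try:
        t2 = mss_mtime(path)
    except MSSUnavailable:
        # MSS is down: keep serving the copy that is already on disk.
        return False
    return t1 < t2


def lookup_from_table(table: Optional[Dict[str, float]]) -> Callable[[str], float]:
    """Build a fake MSS lookup from a dict; None simulates an MSS outage."""
    def lookup(path: str) -> float:
        if table is None:
            raise MSSUnavailable(path)
        return table[path]
    return lookup


if __name__ == "__main__":
    mss = lookup_from_table({"foo.root": 200.0})
    print(should_restage(100.0, mss, "foo.root"))  # cache older than MSS copy
    print(should_restage(300.0, mss, "foo.root"))  # cache newer than MSS copy
    print(should_restage(100.0, lookup_from_table(None), "foo.root"))  # MSS down
```

Returning False on an MSS error is a choice made here for availability: a stale copy may be served during an outage, but files already on disk stay readable.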
cheers,
JY

Peter Elmer wrote:

> Hi Jean-Yves,
>
>   I didn't realize that you were offering a read/write service to the D0
> people, is that really correct?
>
>   As you know we only recently (last fall) brought up the read/write output
> buffer for skimming (which isn't backed by MSS, but a temp buffer), and
> Wilko and I are just now planning to bring up the /store/users read/write
> system (backed by MSS) at SLAC in the next weeks. That was supposed to have
> the stat check to MSS when a new file was being opened for write (only) to
> verify that it does not already exist (i.e. this is the oss.check config
> directive, I think).
>
>   As Andy has pointed out, it was not foreseen to allow _updates_ to files
> via any other mechanism than via xrootd itself (if that is what is going
> on). While a mechanism such as the one you suggest could be added to cover
> such "somebody came in through the backdoor to update the file", it seems to
> me like it would be difficult and costly to do it properly and
> generally.
>
>   What alternate door did the user use to update the file in HPSS?
>
>                                    Pete
>
> On Wed, Mar 02, 2005 at 08:52:33PM +0100, Jean-Yves Nief wrote:
>
>> one of the D0 users accessing files via xrootd encountered the
>> following issue: after having accessed a file via xrootd (so after being
>> staged from the "master" copy stored in HPSS), he modified the master
>> copy in HPSS and wanted to access the modified file via xrootd: but as
>> the old version of the file was already on the disk cache, no staging
>> occurred of course (but that is the expected behavior obviously) and he
>> grabbed the old version, which is not what he wanted. Well as an
>> emergency solution and as it was the first time it happened, I've
>> deleted the old version on the disk cache so he could proceed.
>> However, I think it would be nice to have some control over the validity
>> of the cache: one solution would be to add the following test: in case
>> the file is already in the cache, compare the creation time on the cache
>> disk (t1) with the last modified time of the file stored in HPSS (t2):
>> if t1<t2 then restage the file.
>> It will be a very small overhead to the mechanism, each time a file is
>> accessed: it just has to issue a "statx" request to the MSS.
>> Or maybe there is a simpler solution.
>> cheers,
>> JY
>
> -------------------------------------------------------------------------
> Peter Elmer     Phone: +41 (22) 767-4644
> Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
> -------------------------------------------------------------------------