Hi Guys,

Did you manage to sort something out, despite the cancellation of the meeting? These are serious problems for us.

Tim.

> -----Original Message-----
> From: [log in to unmask] [mailto:[log in to unmask]] On Behalf Of Emmanuel Olaiya
> Sent: 06 June 2005 22:57
> To: Andy Hanushevsky
> Cc: Brew, CAJ (Chris); [log in to unmask]; Bill Weeks
> Subject: Re: PreStage Problems
>
> Hi Andy
>
> Yes, it would be good if you could have a look at this with me. We can arrange a time in the xrootd meeting tomorrow.
>
> cheers
>
> Manny
>
> Andy Hanushevsky wrote:
> > Hi Manny,
> >
> > I find this quite mysterious, as this should never be the case and, frankly, appears to violate causality. I suspect something else is going on. If this is reproducible, then why don't we run a test with all debugging turned on. Yes?
> >
> > Andy
> >
> > ----- Original Message -----
> > From: "Emmanuel Olaiya" <[log in to unmask]>
> > To: "Andrew Hanushevsky" <[log in to unmask]>
> > Cc: "Brew, CAJ (Chris)" <[log in to unmask]>; <[log in to unmask]>; "Bill Weeks" <[log in to unmask]>
> > Sent: Monday, June 06, 2005 1:41 PM
> > Subject: Re: PreStage Problems
> >
> >
> >> Hi Andy
> >>
> >> I should have mentioned that we also removed the prestage queue and restarted both the server and redirector. However, the old request to wait did not change. Moreover, any similar new requests were also told to wait until the old request was terminated.
> >>
> >> cheers
> >>
> >> Manny
> >>
> >> Andrew Hanushevsky wrote:
> >>
> >>> Hi Manny,
> >>>
> >>> Yes, but who is telling the client to wait? The redirector or the server that originally wanted to stage the file in? When you restart the redirector it loses all its memory but the data server does not. So, it will happily tell the redirector that it has the file even though the file is merely in the pre-stage queue.
> >>> As long as the file is in the prestage queue and not on disk, the only option is to direct clients to where the file will be staged in and then the clients simply wait for the file (which in this case will never appear). So, if you remove staging you also need to remove the prestage queue and restart the data server.
> >>>
> >>> Andy
> >>>
> >>> On Fri, 3 Jun 2005, Emmanuel Olaiya wrote:
> >>>
> >>>
> >>>> Hi Andy
> >>>>
> >>>> One other issue we have spotted at RAL. We removed the staging capabilities and restarted the redirector and server. However, we found that previous requests for a file that were told to wait continued being told to wait. We also found that if somebody else asked for this same file that was not on disk they were also told to wait rather than being told the file could not be found. We needed to kill the previous request and restart the server and redirector for xrootd to know the file was not on disk.
> >>>>
> >>>> cheers
> >>>>
> >>>> Manny
> >>>>
> >>>> Andrew Hanushevsky wrote:
> >>>>
> >>>>> Hi Chris,
> >>>>>
> >>>>> Oh yeah, different problem. I think that Bill Weeks fixed that. Bill, did you fix that problem?
> >>>>>
> >>>>> Andy
> >>>>>
> >>>>> On Mon, 30 May 2005, Brew, CAJ (Chris) wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I might be being stupid but I don't see how this relates to the problem. The files I wanted were on a different disk server which then went down. The server in question was registered with the OLB as being able to stage in the name space so the request was redirected to it. If mps_Stage is used without the PreStage queuing system everything works as expected. If we try to go through the PreStage queue to limit the number of concurrent accesses to the tapestore the stage in fails.
> >>>>>> Apparently because the DIR_LOCK file does not exist (which it doesn't, since the file, and its directory structure, has never existed on this server).
> >>>>>>
> >>>>>> Yours,
> >>>>>> Chris.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Andrew Hanushevsky [mailto:[log in to unmask]]
> >>>>>>> Sent: 28 May 2005 07:39
> >>>>>>> To: Brew, CAJ (Chris)
> >>>>>>> Cc: [log in to unmask]; abh; Olaiya, EO (Emmanuel)
> >>>>>>> Subject: RE: PreStage Problems
> >>>>>>>
> >>>>>>> Hi Chris,
> >>>>>>>
> >>>>>>> This was traced to overzealous testing. The system does not put a new entry in the pre-stage queue until about 10-20 minutes have elapsed since the last time the entry was added. So, this is not a bug but a test case that was not "real". Generally, files live in the disk cache for at least 10-20 minutes.
> >>>>>>>
> >>>>>>> Andy
> >>>>>>>
> >>>>>>> On Fri, 27 May 2005, Brew, CAJ (Chris) wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> At the meeting a couple of weeks ago, it was said that someone was looking into this but I haven't heard anything back. Is there any news?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Chris.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Brew, CAJ (Chris)
> >>>>>>>>> Sent: 17 May 2005 13:50
> >>>>>>>>> To: [log in to unmask]; abh
> >>>>>>>>> Cc: Olaiya, EO (Emmanuel)
> >>>>>>>>> Subject: PreStage Problems
> >>>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I've been running some more tests of the staging at RAL and have run into a problem somewhere in the mps_Stage/PreStage/prep system.
> >>>>>>>>>
> >>>>>>>>> Everything works fine when staging a file that was on the system and has been deleted, but if I try to stage in a file that was on a different server, so that the directory structure for the file does not exist on the staging server, it fails and I see the following error in the PreStage log file:
> >>>>>>>>>
> >>>>>>>>> 12:45:43 [ 10859] mps_Stage: Open '/stage/bdata-data50/kanga//store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/DIR_LOCK' r/w failed; No such file or directory.
> >>>>>>>>> 12:45:43 [ 10859] do_stagein: xfr failed for /store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_001005_3247.01.root, rc=4, retry=1
> >>>>>>>>> 12:45:45 [ 3255] file=/store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_0010053247.01.root, rc=1024, reqid=ef000001:1cd2.425d27e1:3762
> >>>>>>>>>
> >>>>>>>>> If I create the directories and the DIR_LOCK file before running the import, everything works.
> >>>>>>>>>
> >>>>>>>>> The config file I'm using on the server is below.
> >>>>>>>>>
> >>>>>>>>> Is there some setting I'm missing which is needed to create the directories/DIR_LOCK file or does the code need fixing?
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Chris
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Chris Brew ([log in to unmask]) +44 1235 446326
> >>>>>>>>> Particle Physics Department
> >>>>>>>>> Rutherford Appleton Laboratory
> >>>>>>>>> Chilton, Didcot. Oxfordshire.
> >>>>>>>>> OX11 0QX. United Kingdom.
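
The workaround described in the 17 May message (creating the directories and the DIR_LOCK file before running the import) could be scripted roughly as below. This is only a sketch under assumptions the thread does not confirm: that DIR_LOCK is an ordinary empty file, that it is only needed in the file's own directory, and that the stage root is the /stage/bdata-data50/kanga prefix seen in the log. The function name ensure_dir_lock is hypothetical and is not part of xrootd or mps_Stage.

```python
from pathlib import Path


def ensure_dir_lock(stage_root, logical_file):
    """Pre-create the directory tree and DIR_LOCK file for one logical file.

    stage_root   -- local prefix under which files are staged (assumed)
    logical_file -- logical path such as /store/.../name.root
    Returns the path of the DIR_LOCK file.
    """
    # Strip the leading slash so the logical path nests under stage_root.
    relative = Path(logical_file.lstrip("/"))
    target_dir = Path(stage_root) / relative.parent

    # Create any missing parent directories; do nothing if they exist.
    target_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: an empty DIR_LOCK file in the leaf directory is enough.
    lock = target_dir / "DIR_LOCK"
    lock.touch(exist_ok=True)
    return lock


# Example call (paths taken from the log above, not run here):
# ensure_dir_lock(
#     "/stage/bdata-data50/kanga",
#     "/store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_001005_3247.01.root",
# )
```

This only papers over the missing directory structure on the staging server; whether the staging code should create it itself is the open question in the thread.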