Hi Manny,

Yep!

Andy

----- Original Message -----
From: "Emmanuel Olaiya" <[log in to unmask]>
To: "Andy Hanushevsky" <[log in to unmask]>
Cc: "Brew, CAJ (Chris)" <[log in to unmask]>; <[log in to unmask]>; "Bill Weeks" <[log in to unmask]>
Sent: Monday, June 06, 2005 4:37 PM
Subject: Re: PreStage Problems

> Hi Andy
>
> Ok, we can try and provide log files.
>
> Will you be at the meeting tomorrow?
>
> cheers
>
> Manny
>
> Andy Hanushevsky wrote:
>> Hi Manny,
>>
>> No problem other than the rest of the week is tied up -- Seattle Tue/Wed,
>> Oakland/Thu, home Fri. Though looking at log files is always possible
>> here.
>>
>> Andy
>>
>> ----- Original Message -----
>> From: "Emmanuel Olaiya" <[log in to unmask]>
>> To: "Andy Hanushevsky" <[log in to unmask]>
>> Cc: "Brew, CAJ (Chris)" <[log in to unmask]>;
>> <[log in to unmask]>; "Bill Weeks" <[log in to unmask]>
>> Sent: Monday, June 06, 2005 2:57 PM
>> Subject: Re: PreStage Problems
>>
>>> Hi Andy
>>>
>>> Yes, it would be good if you could have a look at this with me. We can
>>> arrange a time in the xrootd meeting tomorrow.
>>>
>>> cheers
>>>
>>> Manny
>>>
>>> Andy Hanushevsky wrote:
>>>
>>>> Hi Manny,
>>>>
>>>> I find this quite mysterious, as this should never be the case and,
>>>> frankly, appears to violate causality. I suspect something else is
>>>> going on. If this is reproducible then why don't we run a test with all
>>>> debugging turned on. Yes?
>>>>
>>>> Andy
>>>>
>>>> ----- Original Message -----
>>>> From: "Emmanuel Olaiya" <[log in to unmask]>
>>>> To: "Andrew Hanushevsky" <[log in to unmask]>
>>>> Cc: "Brew, CAJ (Chris)" <[log in to unmask]>;
>>>> <[log in to unmask]>; "Bill Weeks" <[log in to unmask]>
>>>> Sent: Monday, June 06, 2005 1:41 PM
>>>> Subject: Re: PreStage Problems
>>>>
>>>>> Hi Andy
>>>>>
>>>>> I should have mentioned that we also removed the prestage queue and
>>>>> restarted both the server and redirector. However, the old request to
>>>>> wait did not change.
>>>>> Moreover, any similar new requests were also told
>>>>> to wait until the old request was terminated.
>>>>>
>>>>> cheers
>>>>>
>>>>> Manny
>>>>>
>>>>> Andrew Hanushevsky wrote:
>>>>>
>>>>>> Hi Manny,
>>>>>>
>>>>>> Yes, but who is telling the client to wait? The redirector or the
>>>>>> server that originally wanted to stage the file in? When you restart
>>>>>> the redirector it loses all its memory but the data server does not.
>>>>>> So, it will happily tell the redirector that it has the file even
>>>>>> though the file is merely in the pre-stage queue. As long as the file
>>>>>> is in the prestage queue and not on disk, the only option is to direct
>>>>>> clients to where the file will be staged in and then the clients
>>>>>> simply wait for the file (which in this case will never appear). So,
>>>>>> if you remove staging you also need to remove the prestage queue and
>>>>>> restart the data server.
>>>>>>
>>>>>> Andy
>>>>>>
>>>>>> On Fri, 3 Jun 2005, Emmanuel Olaiya wrote:
>>>>>>
>>>>>>> Hi Andy
>>>>>>>
>>>>>>> One other issue we have spotted at RAL. We removed the staging
>>>>>>> capabilities and restarted the redirector and server. However, we
>>>>>>> found that previous requests for a file that were told to wait
>>>>>>> continued being told to wait. We also found that if somebody else
>>>>>>> asked for this same file that was not on disk they were also told to
>>>>>>> wait rather than being told the file could not be found. We needed
>>>>>>> to kill the previous request and restart the server and redirector
>>>>>>> for xrootd to know the file was not on disk.
>>>>>>>
>>>>>>> cheers
>>>>>>>
>>>>>>> Manny
>>>>>>>
>>>>>>> Andrew Hanushevsky wrote:
>>>>>>>
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> Oh yeah, different problem. I think that Bill Weeks fixed that.
>>>>>>>> Bill, did you fix that problem?
>>>>>>>>
>>>>>>>> Andy
>>>>>>>>
>>>>>>>> On Mon, 30 May 2005, Brew, CAJ (Chris) wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I might be being stupid but I don't see how this relates to the
>>>>>>>>> problem. The files I wanted were on a different disk server which
>>>>>>>>> then went down. The server in question was registered with the OLB
>>>>>>>>> as being able to stage in the name space so the request was
>>>>>>>>> redirected to it. If mps_Stage is used without the PreStage
>>>>>>>>> queuing system everything works as expected. If we try to go
>>>>>>>>> through the PreStage queue to limit the number of concurrent
>>>>>>>>> accesses to the tapestore, the stage-in fails, apparently because
>>>>>>>>> the DIR_LOCK file does not exist (which it doesn't, since the
>>>>>>>>> file, and its directory structure, has never existed on this
>>>>>>>>> server).
>>>>>>>>>
>>>>>>>>> Yours,
>>>>>>>>> Chris.
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Andrew Hanushevsky [mailto:[log in to unmask]]
>>>>>>>>>> Sent: 28 May 2005 07:39
>>>>>>>>>> To: Brew, CAJ (Chris)
>>>>>>>>>> Cc: [log in to unmask]; abh; Olaiya, EO (Emmanuel)
>>>>>>>>>> Subject: RE: PreStage Problems
>>>>>>>>>>
>>>>>>>>>> Hi Chris,
>>>>>>>>>>
>>>>>>>>>> This was traced to overzealous testing. The system does not put
>>>>>>>>>> a new entry in the pre-stage queue until about 10-20 minutes
>>>>>>>>>> have elapsed since the last time the entry was added. So, this
>>>>>>>>>> is not a bug but a test case that was not "real". Generally,
>>>>>>>>>> files live in the disk cache for at least 10-20 minutes.
>>>>>>>>>>
>>>>>>>>>> Andy
>>>>>>>>>>
>>>>>>>>>> On Fri, 27 May 2005, Brew, CAJ (Chris) wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> At the meeting a couple of weeks ago, it was said that someone
>>>>>>>>>>> was looking into this but I haven't heard anything back. Is
>>>>>>>>>>> there any news?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Chris.
>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Brew, CAJ (Chris)
>>>>>>>>>>>> Sent: 17 May 2005 13:50
>>>>>>>>>>>> To: [log in to unmask]; abh
>>>>>>>>>>>> Cc: Olaiya, EO (Emmanuel)
>>>>>>>>>>>> Subject: PreStage Problems
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I've been running some more tests of the staging at RAL and
>>>>>>>>>>>> have run into a problem somewhere in the
>>>>>>>>>>>> mps_Stage/PreStage/prep system.
>>>>>>>>>>>>
>>>>>>>>>>>> Everything works fine when staging a file that was on the
>>>>>>>>>>>> system and has been deleted, but if I try to stage in a file
>>>>>>>>>>>> that was on a different server, so that the directory
>>>>>>>>>>>> structure for the file does not exist on the staging server,
>>>>>>>>>>>> it fails and I see the following error in the PreStage log
>>>>>>>>>>>> file:
>>>>>>>>>>>>
>>>>>>>>>>>> 12:45:43 [ 10859] mps_Stage: Open '/stage/bdata-data50/kanga//store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/DIR_LOCK' r/w failed; No such file or directory.
>>>>>>>>>>>> 12:45:43 [ 10859] do_stagein: xfr failed for >>>>>>>>>>>> /store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_00100 >>>>>>>>>>>> 5_3247.01.root, rc=4, retry=1 >>>>>>>>>>>> 12:45:45 [ 3255] >>>>>>>>>>>> file=/store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_ >>>>>>>>>>>> 0010053247.01.root, rc=1024, reqid=ef000001:1cd2.425d27e1 >>>>>>>>>>>> :3762 >>>>>>>>>>>> >>>>>>>>>>>> If I create the directories and the DIR_LOCK file before >>>>>>>>>>>> running the import, everything works. >>>>>>>>>>>> >>>>>>>>>>>> The config file I'm using on the server is below. >>>>>>>>>>>> >>>>>>>>>>>> Is there some setting I'm missing which is needed to create >>>>>>>>>>>> the directories/DIR_LOCK file or does the code need fixing? >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Chris Brew ([log in to unmask]) +44 1235 446326 >>>>>>>>>>>> Particle Physics Department >>>>>>>>>>>> Rutherford Appleton Laboratory >>>>>>>>>>>> Chilton, Didcot. Oxfordshire. >>>>>>>>>>>> OX11 0QX. United Kingdom. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>> >>>> >>> >> >