Hi Andy

OK, we can try to provide log files. Will you be at the meeting tomorrow?

cheers

Manny

Andy Hanushevsky wrote:

> Hi Manny,
>
> No problem other than the rest of the week is tied up -- Seattle
> Tue/Wed, Oakland/Thu, home Fri. Though looking at log files is always
> possible here.
>
> Andy
>
> ----- Original Message ----- From: "Emmanuel Olaiya" <[log in to unmask]>
> To: "Andy Hanushevsky" <[log in to unmask]>
> Cc: "Brew, CAJ (Chris)" <[log in to unmask]>;
> <[log in to unmask]>; "Bill Weeks" <[log in to unmask]>
> Sent: Monday, June 06, 2005 2:57 PM
> Subject: Re: PreStage Problems
>
>
>> Hi Andy
>>
>> Yes, it would be good if you could have a look at this with me. We can
>> arrange a time in the xrootd meeting tomorrow.
>>
>> cheers
>>
>> Manny
>>
>> Andy Hanushevsky wrote:
>>
>>> Hi Manny,
>>>
>>> I find this quite mysterious as this should never be the case and,
>>> frankly, appears to violate causality. I suspect something else is
>>> going on. If this is reproducible then why don't we run a test with
>>> all debugging turned on. Yes?
>>>
>>> Andy
>>>
>>> ----- Original Message ----- From: "Emmanuel Olaiya" <[log in to unmask]>
>>> To: "Andrew Hanushevsky" <[log in to unmask]>
>>> Cc: "Brew, CAJ (Chris)" <[log in to unmask]>;
>>> <[log in to unmask]>; "Bill Weeks" <[log in to unmask]>
>>> Sent: Monday, June 06, 2005 1:41 PM
>>> Subject: Re: PreStage Problems
>>>
>>>
>>>> Hi Andy
>>>>
>>>> I should have mentioned that we also removed the prestage queue and
>>>> restarted both the server and redirector. However the old request to
>>>> wait did not change. Moreover, any similar new requests were also
>>>> told to wait until the old request was terminated.
>>>>
>>>> cheers
>>>>
>>>> Manny
>>>>
>>>> Andrew Hanushevsky wrote:
>>>>
>>>>> Hi Manny,
>>>>>
>>>>> Yes, but who is telling the client to wait? The redirector or the
>>>>> server that wanted to originally stage the file in?
>>>>> When you restart the redirector it loses all its memory but the
>>>>> data server does not. So, it will happily tell the redirector that
>>>>> it has the file even though the file is merely in the pre-stage
>>>>> queue. As long as the file is in the prestage queue and not on
>>>>> disk, the only option is to direct clients to where the file will
>>>>> be staged in and then the clients simply wait for the file (which
>>>>> in this case will never appear). So, if you remove staging you also
>>>>> need to remove the prestage queue and restart the data server.
>>>>>
>>>>> Andy
>>>>>
>>>>> On Fri, 3 Jun 2005, Emmanuel Olaiya wrote:
>>>>>
>>>>>
>>>>>> Hi Andy
>>>>>>
>>>>>> One other issue we have spotted at RAL. We removed the staging
>>>>>> capabilities and restarted the redirector and server. However we
>>>>>> found that previous requests for a file that had been told to wait
>>>>>> continued being told to wait. We also found that if somebody else
>>>>>> asked for this same file that was not on disk they were also told
>>>>>> to wait rather than being told the file could not be found. We
>>>>>> needed to kill the previous request and restart the server and
>>>>>> redirector for xrootd to know the file was not on disk.
>>>>>>
>>>>>> cheers
>>>>>>
>>>>>> Manny
>>>>>>
>>>>>> Andrew Hanushevsky wrote:
>>>>>>
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> Oh yeah, different problem. I think that Bill Weeks fixed that.
>>>>>>> Bill, did you fix that problem?
>>>>>>>
>>>>>>> Andy
>>>>>>>
>>>>>>> On Mon, 30 May 2005, Brew, CAJ (Chris) wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I might be being stupid but I don't see how this relates to the
>>>>>>>> problem. The files I wanted were on a different disk server
>>>>>>>> which then went down. The server in question was registered with
>>>>>>>> the OLB as being able to stage in the name space so the request
>>>>>>>> was redirected to it.
>>>>>>>> If mps_Stage is used without the PreStage queuing system
>>>>>>>> everything works as expected. If we try to go through the
>>>>>>>> PreStage queue to limit the number of concurrent accesses to the
>>>>>>>> tapestore the stage in fails, apparently because the DIR_LOCK
>>>>>>>> file does not exist (which it doesn't, since the file, and its
>>>>>>>> directory structure, has never existed on this server).
>>>>>>>>
>>>>>>>> Yours,
>>>>>>>> Chris.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Andrew Hanushevsky [mailto:[log in to unmask]]
>>>>>>>>> Sent: 28 May 2005 07:39
>>>>>>>>> To: Brew, CAJ (Chris)
>>>>>>>>> Cc: [log in to unmask]; abh; Olaiya, EO (Emmanuel)
>>>>>>>>> Subject: RE: PreStage Problems
>>>>>>>>>
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> This was traced to overzealous testing. The system does not
>>>>>>>>> put a new entry in the pre-stage queue until about 10-20
>>>>>>>>> minutes have elapsed since the last time the entry was added.
>>>>>>>>> So, this is not a bug but a test case that was not "real".
>>>>>>>>> Generally, files live in the disk cache for at least 10-20
>>>>>>>>> minutes.
>>>>>>>>>
>>>>>>>>> Andy
>>>>>>>>>
>>>>>>>>> On Fri, 27 May 2005, Brew, CAJ (Chris) wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> At the meeting a couple of weeks ago, it was said that someone
>>>>>>>>>> was looking into this but I haven't heard anything back. Is
>>>>>>>>>> there any news?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Chris.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Brew, CAJ (Chris)
>>>>>>>>>>> Sent: 17 May 2005 13:50
>>>>>>>>>>> To: [log in to unmask]; abh
>>>>>>>>>>> Cc: Olaiya, EO (Emmanuel)
>>>>>>>>>>> Subject: PreStage Problems
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I've been running some more tests of the staging at RAL and
>>>>>>>>>>> have run into a problem somewhere in the
>>>>>>>>>>> mps_Stage/PreStage/prep system.
>>>>>>>>>>>
>>>>>>>>>>> Everything works fine when staging a file that was on the
>>>>>>>>>>> system and has been deleted, but if I try to stage in a file
>>>>>>>>>>> that was on a different server, so that the directory
>>>>>>>>>>> structure for the file does not exist on the staging server,
>>>>>>>>>>> it fails and I see the following error in the PreStage log
>>>>>>>>>>> file:
>>>>>>>>>>>
>>>>>>>>>>> 12:45:43 [ 10859] mps_Stage: Open '/stage/bdata-data50/kanga//store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/DIR_LOCK' r/w failed; No such file or directory.
>>>>>>>>>>> 12:45:43 [ 10859] do_stagein: xfr failed for /store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_001005_3247.01.root, rc=4, retry=1
>>>>>>>>>>> 12:45:45 [ 3255] file=/store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_0010053247.01.root, rc=1024, reqid=ef000001:1cd2.425d27e1:3762
>>>>>>>>>>>
>>>>>>>>>>> If I create the directories and the DIR_LOCK file before
>>>>>>>>>>> running the import, everything works.
>>>>>>>>>>>
>>>>>>>>>>> The config file I'm using on the server is below.
>>>>>>>>>>>
>>>>>>>>>>> Is there some setting I'm missing which is needed to create
>>>>>>>>>>> the directories/DIR_LOCK file or does the code need fixing?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Chris Brew ([log in to unmask]) +44 1235 446326
>>>>>>>>>>> Particle Physics Department
>>>>>>>>>>> Rutherford Appleton Laboratory
>>>>>>>>>>> Chilton, Didcot. Oxfordshire.
>>>>>>>>>>> OX11 0QX. United Kingdom.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>
>>>
>>
>
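
The workaround Chris mentions in the original message (creating the directories and the DIR_LOCK file before running the import) could be scripted roughly as below. This is only a sketch under assumptions: the helper name ensure_dir_lock is hypothetical and not part of mps_Stage or xrootd, the stage root is whatever prefix appears in the log (here /stage/bdata-data50/kanga), and it assumes an empty DIR_LOCK file with default permissions is enough, which the thread does not confirm.

```python
import os


def ensure_dir_lock(stage_root, logical_path):
    """Create the directory tree for logical_path under stage_root and an
    empty DIR_LOCK file inside it, if they are not already there.

    Hypothetical helper: the name and the "empty file is sufficient"
    behaviour are assumptions, not taken from mps_Stage itself.
    """
    # Drop the leading slash so os.path.join does not discard stage_root.
    rel_dir = os.path.dirname(logical_path).lstrip("/")
    target_dir = os.path.join(stage_root, rel_dir)

    # Build any missing parent directories; no error if they already exist.
    os.makedirs(target_dir, exist_ok=True)

    # Append mode creates the file when missing and leaves an existing
    # DIR_LOCK untouched.
    lock_path = os.path.join(target_dir, "DIR_LOCK")
    with open(lock_path, "a"):
        pass
    return lock_path
```

It would be called once per file before the stage-in is requested, for example ensure_dir_lock("/stage/bdata-data50/kanga", "/store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_001005_3247.01.root"). Whether the staging code should create these itself is the open question in the thread; this only automates the manual step.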