Hi Tim,

Bill Weeks should have the fix available. You can also find the fixed mps
scripts in /afs/slac/package/xrd/xrootd/utils (I think you just need an
update for mps_Stage and mps_prep).

Otherwise, the earliest time I can get together with Manny is Monday. How
about the afternoon, say 1:30pm?

Andy

On Tue, 7 Jun 2005, Adye, TJ (Tim) wrote:

> Hi Guys,
>
> Did you manage to sort something out, despite the cancellation of the
> meeting? These are serious problems for us.
>
> Tim.
>
> > -----Original Message-----
> > From: [log in to unmask]
> > [mailto:[log in to unmask]] On Behalf Of
> > Emmanuel Olaiya
> > Sent: 06 June 2005 22:57
> > To: Andy Hanushevsky
> > Cc: Brew, CAJ (Chris); [log in to unmask]; Bill Weeks
> > Subject: Re: PreStage Problems
> >
> > Hi Andy
> >
> > Yes, it would be good if you could have a look at this with me. We can
> > arrange a time in the xrootd meeting tomorrow.
> >
> > cheers
> >
> > Manny
> >
> > Andy Hanushevsky wrote:
> > > Hi Manny,
> > >
> > > I find this quite mysterious, as this should never be the case and,
> > > frankly, it appears to violate causality. I suspect something else is
> > > going on. If this is reproducible, then why don't we run a test with
> > > all debugging turned on. Yes?
> > >
> > > Andy
> > >
> > > ----- Original Message ----- From: "Emmanuel Olaiya"
> > <[log in to unmask]>
> > > To: "Andrew Hanushevsky" <[log in to unmask]>
> > > Cc: "Brew, CAJ (Chris)" <[log in to unmask]>;
> > > <[log in to unmask]>; "Bill Weeks" <[log in to unmask]>
> > > Sent: Monday, June 06, 2005 1:41 PM
> > > Subject: Re: PreStage Problems
> > >
> > >
> > >> Hi Andy
> > >>
> > >> I should have mentioned that we also removed the prestage queue and
> > >> restarted both the server and redirector. However, the old request to
> > >> wait did not change. Moreover, any similar new requests were also told
> > >> to wait until the old request was terminated.
> > >>
> > >> cheers
> > >>
> > >> Manny
> > >>
> > >> Andrew Hanushevsky wrote:
> > >>
> > >>> Hi Manny,
> > >>>
> > >>> Yes, but who is telling the client to wait: the redirector or the
> > >>> server that originally wanted to stage the file in? When you restart
> > >>> the redirector it loses all its memory, but the data server does not.
> > >>> So, it will happily tell the redirector that it has the file even
> > >>> though the file is merely in the pre-stage queue. As long as the file
> > >>> is in the pre-stage queue and not on disk, the only option is to
> > >>> direct clients to where the file will be staged in, and then the
> > >>> clients simply wait for the file (which in this case will never
> > >>> appear). So, if you remove staging you also need to remove the
> > >>> pre-stage queue and restart the data server.
> > >>>
> > >>> Andy
> > >>>
> > >>> On Fri, 3 Jun 2005, Emmanuel Olaiya wrote:
> > >>>
> > >>>
> > >>>> Hi Andy
> > >>>>
> > >>>> One other issue we have spotted at RAL. We removed the staging
> > >>>> capabilities and restarted the redirector and server. However, we
> > >>>> found that previous requests for a file that were told to wait
> > >>>> continued being told to wait. We also found that if somebody else
> > >>>> asked for this same file, which was not on disk, they were also told
> > >>>> to wait rather than being told the file could not be found. We
> > >>>> needed to kill the previous request and restart the server and
> > >>>> redirector for xrootd to know the file was not on disk.
> > >>>>
> > >>>> cheers
> > >>>>
> > >>>> Manny
> > >>>>
> > >>>> Andrew Hanushevsky wrote:
> > >>>>
> > >>>>> Hi Chris,
> > >>>>>
> > >>>>> Oh yeah, different problem. I think that Bill Weeks fixed that.
> > >>>>> Bill, did you fix that problem?
> > >>>>>
> > >>>>> Andy
> > >>>>>
> > >>>>> On Mon, 30 May 2005, Brew, CAJ (Chris) wrote:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>> I might be being stupid but I don't see how this relates to the
> > >>>>>> problem. The files I wanted were on a different disk server which
> > >>>>>> then went down. The server in question was registered with the OLB
> > >>>>>> as being able to stage in the name space, so the request was
> > >>>>>> redirected to it. If mps_Stage is used without the PreStage queuing
> > >>>>>> system everything works as expected. If we try to go through the
> > >>>>>> PreStage queue to limit the number of concurrent accesses to the
> > >>>>>> tapestore, the stage in fails, apparently because the DIR_LOCK file
> > >>>>>> does not exist (which it doesn't, since the file, and its directory
> > >>>>>> structure, has never existed on this server).
> > >>>>>>
> > >>>>>> Yours,
> > >>>>>> Chris.
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>> -----Original Message-----
> > >>>>>>> From: Andrew Hanushevsky [mailto:[log in to unmask]]
> > >>>>>>> Sent: 28 May 2005 07:39
> > >>>>>>> To: Brew, CAJ (Chris)
> > >>>>>>> Cc: [log in to unmask]; abh; Olaiya, EO (Emmanuel)
> > >>>>>>> Subject: RE: PreStage Problems
> > >>>>>>>
> > >>>>>>> Hi Chris,
> > >>>>>>>
> > >>>>>>> This was traced to overzealous testing. The system does not put a
> > >>>>>>> new entry in the pre-stage queue until about 10-20 minutes have
> > >>>>>>> elapsed since the last time the entry was added. So, this is not a
> > >>>>>>> bug but a test case that was not "real". Generally, files live in
> > >>>>>>> the disk cache for at least 10-20 minutes.
> > >>>>>>>
> > >>>>>>> Andy
> > >>>>>>>
> > >>>>>>> On Fri, 27 May 2005, Brew, CAJ (Chris) wrote:
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>> Hi,
> > >>>>>>>>
> > >>>>>>>> At the meeting a couple of weeks ago, it was said that someone was
> > >>>>>>>> looking into this but I haven't heard anything back. Is there any
> > >>>>>>>> news?
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> Chris.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>> -----Original Message-----
> > >>>>>>>>> From: Brew, CAJ (Chris)
> > >>>>>>>>> Sent: 17 May 2005 13:50
> > >>>>>>>>> To: [log in to unmask]; abh
> > >>>>>>>>> Cc: Olaiya, EO (Emmanuel)
> > >>>>>>>>> Subject: PreStage Problems
> > >>>>>>>>>
> > >>>>>>>>> Hi,
> > >>>>>>>>>
> > >>>>>>>>> I've been running some more tests of the staging at RAL and
> > >>>>>>>>> have run into a problem somewhere in the
> > >>>>>>>>> mps_Stage/PreStage/prep system.
> > >>>>>>>>>
> > >>>>>>>>> Everything works fine staging a file that was on the system and
> > >>>>>>>>> has been deleted, but if I try to stage in a file that was on
> > >>>>>>>>> a different server, so that the directory structure for the
> > >>>>>>>>> file does not exist on the staging server, it fails and I see
> > >>>>>>>>> the following error in the PreStage log file:
> > >>>>>>>>>
> > >>>>>>>>> 12:45:43 [ 10859] mps_Stage: Open '/stage/bdata-data50/kanga//store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/DIR_LOCK' r/w failed; No such file or directory.
> > >>>>>>>>> 12:45:43 [ 10859] do_stagein: xfr failed for /store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_001005_3247.01.root, rc=4, retry=1
> > >>>>>>>>> 12:45:45 [  3255] file=/store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_0010053247.01.root, rc=1024, reqid=ef000001:1cd2.425d27e1:3762
> > >>>>>>>>>
> > >>>>>>>>> If I create the directories and the DIR_LOCK file before
> > >>>>>>>>> running the import, everything works.
> > >>>>>>>>>
> > >>>>>>>>> The config file I'm using on the server is below.
> > >>>>>>>>>
> > >>>>>>>>> Is there some setting I'm missing which is needed to create
> > >>>>>>>>> the directories/DIR_LOCK file or does the code need fixing?
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>> Chris
> > >>>>>>>>>
> > >>>>>>>>> --
> > >>>>>>>>> Chris Brew  ([log in to unmask])  +44 1235 446326
> > >>>>>>>>> Particle Physics Department
> > >>>>>>>>> Rutherford Appleton Laboratory
> > >>>>>>>>> Chilton, Didcot. Oxfordshire.
> > >>>>>>>>> OX11 0QX. United Kingdom.
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>
> > >
> >
>
>