In recent releases it comes with ROOT. In older releases it would print
the version number when it opens the file.
On Fri, 10 Jun 2005, Adye, TJ (Tim) wrote:
> Hi Fabrizio and Pete,
>
> Andrew Hanushevsky <[log in to unmask]> wrote:
>
> > The "continuing to hang" problem is a client problem. Here
> > the client is always asking for a cache refresh. So, either
> > an old client is being used (old clients had this bug and it
> > was fixed about 6 months ago) or the bug has returned under
> > this new scenario (I suspect that the latter is true).
>
> How can we check which version of the XTNetFile client we are using? Is
> it part of the release or a shared library installed somewhere else? How
> can we update?
>
> Hopefully with Manny's fix to the staging system we won't see this
> problem so often, but from what Andy says it's probably still lurking.
>
> Thanks,
> Tim.
>
> > -----Original Message-----
> > From: Andrew Hanushevsky <[log in to unmask]>
> > Sent: 09 June 2005 23:36
> > To: Bill Weeks
> > Cc: [log in to unmask]; Adye, TJ (Tim); Brew, CAJ (Chris);
> > Fabrizio Furano; [log in to unmask]
> > Subject: Re: PreStage Problems
> >
> > Hi Manny,
> >
> > I guess we will completely sort this out on Monday.
> > Distilling all of the below, there is only one salient issue:
> >
> > a) Why is the file *not* getting basedir prepended to it? We
> > can figure this out by doing a diff on what you installed and
> > what is in utils to see why mps_PreStage is not prefixing the path.
> >
> > The "continuing to hang" problem is a client problem. Here
> > the client is always asking for a cache refresh. So, either
> > an old client is being used (old clients had this bug and it
> > was fixed about 6 months ago) or the bug has returned under
> > this new scenario (I suspect that the latter is true).
> >
> > So, Fabrizio, do you see anywhere in the client where the
> > code may get caught in a cache refresh loop?
> > Andy
> >
> > On Thu, 9 Jun 2005, Bill Weeks wrote:
> >
> > > Hi,
> > > > I hope I can help sort out what's going on here, but it is confusing.
> > > > First off, mps_PreStage and mps_Stage never really handled "mssdir"
> > > > and "basedir" correctly. This was never a problem for us because
> > > > these have always been the same. For RAL, this is not the case. So
> > > > RAL (Chris?) changed mps_PreStage to add $basedir to the target
> > > > filename, e.g.
> > >
> > > $cmd = "$pstgcmd $rflag $Lflag $file $basedir/$file 2>&1";
> > >
> > > > Once this was done, mps_Stage failed for a file whose path did not
> > > > previously exist because $basedir/$file created a filepath with a
> > > > "//" in it and the MakePath subroutine didn't handle this properly.
> > > > The change I made in version 1.9 of mps_Stage removed the double
> > > > //'s so MakePath would work properly.
> > >
> > > > The problem you are now reporting seems to indicate that you have
> > > > either removed your mod to mps_PreStage or have redefined basedir in
> > > > your config file because mps_Stage is trying to write into /store
> > > > instead of /basedir/store, e.g. /stage/bdata-data50/kanga/store.
> > > > Is this what happened?
> > >
> > > I think once the file is correctly staged in, the waiting jobs that
> > > are polling for the file will continue.
> > >
> > > > We still have some work to do to correctly handle the situation
> > > > where mssdir and basedir are different.
> > > --Bill Weeks, SLAC, (650) 926-2909
> > >
> > >
> > > >Date: Tue, 07 Jun 2005 14:30:56 -0700
> > > >From: Emmanuel Olaiya <[log in to unmask]>
> > > >To: Andrew Hanushevsky <[log in to unmask]>
> > > >CC: "Adye, TJ (Tim)" <[log in to unmask]>, "Brew, CAJ (Chris)"
> > > <[log in to unmask]>, [log in to unmask], Bill Weeks
> > > <[log in to unmask]>
> > > >Subject: Re: PreStage Problems
> > > >
> > > >Hi Andy, Bill
> > > >
> > > > >I took the versions of mps_Stage and mps_prep from
> > > > >/afs/slac/package/xrd/xrootd/utils. These are mps_Stage and mps_prep
> > > > >versions 1.9 and 1.8 respectively.
> > > > >
> > > > >I still see the problem Chris reported. Restarting the directors and
> > > > >the server (with prestaging on the server) I get the following
> > > > >message in the prestage log when asking for a file that doesn't exist
> > > > >at RAL
> > > >
> > > >Starting new cycle, pstg proc = 0
> > > >21:17:41 [ 17543] getlock: locking file
> > > > >>/opt/xrootd/stageQ/PreStageQ.0.lock, flags 2
> > > >21:17:41 [ 17543] getlock: locking file
> > > >+</opt/xrootd/stageQ/PreStageQ.0.old, flags 2
> > > >21:17:41 [ 17543] unlock: unlocking file
> > > >/opt/xrootd/stageQ/PreStageQ.0.old
> > > >21:17:41 [ 17543] unlock: unlocking file
> > > >/opt/xrootd/stageQ/PreStageQ.0.lock
> > > >21:17:41 [ 17543] getlock: locking file
> > > > >>/opt/xrootd/stageQ/PreStageQ.1.lock, flags 2
> > > >21:17:41 [ 17543] unlock: unlocking file
> > > >/opt/xrootd/stageQ/PreStageQ.1.lock
> > > >21:21:29 [ 17772] mps_Stage: cannot create 'store' in
> > > >'/store/PRskims/R14/16.1.1b/BToPPP/58/'; Permission denied
> > > >21:21:29 [ 17772] mps_Stage: Invalid file system path,
> > > >'/store/PRskims/R14/16.1.1b/BToPPP/58/'.
> > > >21:21:29 [ 17772] do_stagein: xfr failed for
> > > >/store/PRskims/R14/16.1.1b/BToPPP/58/BToPPP_5831.01.root, rc=4,
> > > >retry=1
> > > >
> > > >Whilst my job just hangs. If I take the log file literally, it is
> > > >trying to write to /store when it should be trying to write to
> > > >/base_directory/store.
> > > >
> > > >Doing further tests I can reproduce the problem I reported earlier.
> > > >Whilst still asking for the above file I turn off staging, restart
> > > > >the directors and servers and the request for the file continues to
> > > > >hang (is told to wait). Then I make another request for the same file
> > > > >and this request is also continually told to wait:
> > > >
> > > >050607 21:55:13 2915 odc_Locate:
> > olaiya.8042:[log in to unmask] asked
> > > >to wait 5 by xrootd107
> > > >path=/store/PRskims/R14/16.1.1b/BToPPP/58/BToPPP_5831.01.root
> > > >050607 21:55:14 2915 odc_Locate:
> > olaiya.23507:[log in to unmask] asked
> > > >to wait 5 by xrootd107
> > > >path=/store/PRskims/R14/16.1.1b/BToPPP/58/BToPPP_5831.01.root
> > > >050607 21:55:18 2915 odc_Locate:
> > olaiya.8042:[log in to unmask] asked
> > > >to wait 5 by xrootd107
> > > >path=/store/PRskims/R14/16.1.1b/BToPPP/58/BToPPP_5831.01.root
> > > >...
> > > >
> > > >
> > > > >It is only after I kill the first request that any more requests for
> > > > >this file return correctly with a message indicating that the file
> > > > >cannot be found.
> > > >
> > > >cheers
> > > >
> > > >Manny
> > > >
> > > >Andrew Hanushevsky wrote:
> > > >> Hi Tim,
> > > >>
> > > >> Bill Weeks should have the fix available. You can also find the
> > > >> fixed mps scripts in /afs/slac/package/xrd/xrootd/utils (I think
> > > >> you just need an update for mps_Stage and mps_prep).
> > > >>
> > > > >> Otherwise, the earliest time I can get together with Manny is
> > > >> Monday. How about the afternoon, say 1:30pm?
> > > >>
> > > >> Andy
> > > >>
> > > >> On Tue, 7 Jun 2005, Adye, TJ (Tim) wrote:
> > > >>
> > > >>
> > > >>>Hi Guys,
> > > >>>
> > > > >>>Did you manage to sort something out, despite the cancellation of
> > > > >>>the meeting? These are serious problems for us.
> > > >>>
> > > >>>Tim.
> > > >>>
> > > >>>
> > > >>>>-----Original Message-----
> > > >>>>From: [log in to unmask]
> > > >>>>[mailto:[log in to unmask]] On Behalf Of
> > > >>>>Emmanuel Olaiya
> > > >>>>Sent: 06 June 2005 22:57
> > > >>>>To: Andy Hanushevsky
> > > >>>>Cc: Brew, CAJ (Chris); [log in to unmask]; Bill Weeks
> > > >>>>Subject: Re: PreStage Problems
> > > >>>>
> > > >>>>Hi Andy
> > > >>>>
> > > > >>>>Yes, it would be good if you could have a look at this with me. We
> > > > >>>>can arrange a time in the xrootd meeting tomorrow.
> > > >>>>
> > > >>>>cheers
> > > >>>>
> > > >>>>Manny
> > > >>>>
> > > >>>>Andy Hanushevsky wrote:
> > > >>>>
> > > >>>>>Hi Manny,
> > > >>>>>
> > > > >>>>>I find this quite mysterious as this should never be the case
> > > > >>>>>and, frankly, appears to violate causality. I suspect something
> > > > >>>>>else is going on. If this is reproducible then why don't we run a
> > > > >>>>>test with all debugging turned on. Yes?
> > > >>>>>
> > > >>>>>Andy
> > > >>>>>
> > > > >>>>>----- Original Message -----
> > > > >>>>>From: "Emmanuel Olaiya" <[log in to unmask]>
> > > > >>>>>To: "Andrew Hanushevsky" <[log in to unmask]>
> > > > >>>>>Cc: "Brew, CAJ (Chris)" <[log in to unmask]>;
> > > > >>>>><[log in to unmask]>; "Bill Weeks" <[log in to unmask]>
> > > >>>>>Sent: Monday, June 06, 2005 1:41 PM
> > > >>>>>Subject: Re: PreStage Problems
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>>Hi Andy
> > > >>>>>>
> > > > >>>>>>I should have mentioned that we also removed the prestage queue
> > > > >>>>>>and restarted both the server and redirector. However the old
> > > > >>>>>>request to wait did not change. Moreover, any similar new requests
> > > > >>>>>>were also told to wait until the old request was terminated.
> > > >>>>>>
> > > >>>>>>cheers
> > > >>>>>>
> > > >>>>>>Manny
> > > >>>>>>
> > > >>>>>>Andrew Hanushevsky wrote:
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>>Hi Manny,
> > > >>>>>>>
> > > > >>>>>>>Yes, but who is telling the client to wait? The redirector or the
> > > > >>>>>>>server that wanted to originally stage the file in? When you
> > > > >>>>>>>restart the redirector it loses all its memory but the data server
> > > > >>>>>>>does not. So, it will happily tell the redirector that it has the
> > > > >>>>>>>file even though the file is merely in the pre-stage queue. As long
> > > > >>>>>>>as the file is in the prestage queue and not on disk, the only
> > > > >>>>>>>option is to direct clients to where the file will be staged in and
> > > > >>>>>>>then the clients simply wait for the file (which in this case will
> > > > >>>>>>>never appear). So, if you remove staging you also need to remove
> > > > >>>>>>>the prestage queue and restart the data server.
> > > >>>>>>>
> > > >>>>>>>Andy
> > > >>>>>>>
> > > >>>>>>>On Fri, 3 Jun 2005, Emmanuel Olaiya wrote:
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>>Hi Andy
> > > >>>>>>>>
> > > > >>>>>>>>One other issue we have spotted at RAL. We removed the staging
> > > > >>>>>>>>capabilities and restarted the director and server. However we
> > > > >>>>>>>>found previous requests for a file that were told to wait
> > > > >>>>>>>>continued being told to wait. We also found that if somebody else
> > > > >>>>>>>>asked for this same file that was not on disk they were also told
> > > > >>>>>>>>to wait rather than being told the file could not be found. We
> > > > >>>>>>>>needed to kill the previous request and restart the server and
> > > > >>>>>>>>director for xrootd to know the file was not on disk.
> > > >>>>>>>>
> > > >>>>>>>>cheers
> > > >>>>>>>>
> > > >>>>>>>>Manny
> > > >>>>>>>>
> > > >>>>>>>>Andrew Hanushevsky wrote:
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>>Hi Chris,
> > > >>>>>>>>>
> > > > >>>>>>>>>Oh yeah, different problem. I think that Bill Weeks fixed that.
> > > > >>>>>>>>>Bill, did you fix that problem?
> > > >>>>>>>>>
> > > >>>>>>>>>Andy
> > > >>>>>>>>>
> > > >>>>>>>>>On Mon, 30 May 2005, Brew, CAJ (Chris) wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>>Hi,
> > > >>>>>>>>>>
> > > > >>>>>>>>>>I might be being stupid but I don't see how this relates to the
> > > > >>>>>>>>>>problem.
> > > > >>>>>>>>>>The files I wanted were on a different disk server which then
> > > > >>>>>>>>>>went down.
> > > > >>>>>>>>>>The server in question was registered with the OLB as being able
> > > > >>>>>>>>>>to stage in the name space so the request was redirected to it.
> > > > >>>>>>>>>>If mps_Stage is used without the PreStage queuing system
> > > > >>>>>>>>>>everything works as expected. If we try to go through the
> > > > >>>>>>>>>>PreStage queue to limit the number of concurrent accesses to the
> > > > >>>>>>>>>>tapestore the stage in fails, apparently because the DIR_LOCK
> > > > >>>>>>>>>>file does not exist (which it doesn't, since the file, and its
> > > > >>>>>>>>>>directory structure, has never existed on this server).
> > > >>>>>>>>>>
> > > >>>>>>>>>>Yours,
> > > >>>>>>>>>>Chris.
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>>-----Original Message-----
> > > >>>>>>>>>>>From: Andrew Hanushevsky [mailto:[log in to unmask]]
> > > >>>>>>>>>>>Sent: 28 May 2005 07:39
> > > >>>>>>>>>>>To: Brew, CAJ (Chris)
> > > >>>>>>>>>>>Cc: [log in to unmask]; abh; Olaiya, EO
> > (Emmanuel)
> > > >>>>>>>>>>>Subject: RE: PreStage Problems
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>Hi Chris,
> > > >>>>>>>>>>>
> > > > >>>>>>>>>>>This was traced to overzealous testing. The system does
> > > > >>>>>>>>>>>not put in a new entry in the pre-stage queue until after
> > > > >>>>>>>>>>>about 10-20 minutes have elapsed since the last time the
> > > > >>>>>>>>>>>entry was added. So, this is not a bug but a test case that
> > > > >>>>>>>>>>>was not "real". Generally, files live in the disk cache for
> > > > >>>>>>>>>>>at least 10-20 minutes.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>Andy
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>On Fri, 27 May 2005, Brew, CAJ (Chris) wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>>Hi,
> > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>At the meeting a couple of weeks ago, it was said that someone
> > > > >>>>>>>>>>>>was looking into this but I haven't heard anything back. Is
> > > > >>>>>>>>>>>>there any news?
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>>Thanks,
> > > >>>>>>>>>>>>Chris.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>>-----Original Message-----
> > > >>>>>>>>>>>>>From: Brew, CAJ (Chris)
> > > >>>>>>>>>>>>>Sent: 17 May 2005 13:50
> > > >>>>>>>>>>>>>To: [log in to unmask]; abh
> > > >>>>>>>>>>>>>Cc: Olaiya, EO (Emmanuel)
> > > >>>>>>>>>>>>>Subject: PreStage Problems
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>Hi,
> > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>I've been running some more tests of the staging at RAL
> > > > >>>>>>>>>>>>>and have run into a problem somewhere in the
> > > > >>>>>>>>>>>>>mps_Stage/PreStage/prep system.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>Everything works fine staging a file that was on the system
> > > > >>>>>>>>>>>>>and has been deleted, but if I try to stage in a file that
> > > > >>>>>>>>>>>>>was on a different server, so that the directory structure
> > > > >>>>>>>>>>>>>for the file does not exist on the staging server, it fails
> > > > >>>>>>>>>>>>>and I see the following error in the PreStage log file:
> > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>12:45:43 [ 10859] mps_Stage: Open
> > > > >>>>>>>>>>>>>'/stage/bdata-data50/kanga//store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/DIR_LOCK'
> > > > >>>>>>>>>>>>>r/w failed; No such file or directory.
> > > > >>>>>>>>>>>>>12:45:43 [ 10859] do_stagein: xfr failed for
> > > > >>>>>>>>>>>>>/store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_001005_3247.01.root,
> > > > >>>>>>>>>>>>>rc=4, retry=1
> > > > >>>>>>>>>>>>>12:45:45 [ 3255]
> > > > >>>>>>>>>>>>>file=/store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_0010053247.01.root,
> > > > >>>>>>>>>>>>>rc=1024, reqid=ef000001:1cd2.425d27e1:3762
> > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>If I create the directories and the DIR_LOCK file before
> > > > >>>>>>>>>>>>>running the import, everything works.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>The config file I'm using on the server is below.
> > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>Is there some setting I'm missing which is needed to
> > > > >>>>>>>>>>>>>create the directories/DIR_LOCK file or does the code
> > > > >>>>>>>>>>>>>need fixing?
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>Thanks,
> > > >>>>>>>>>>>>>Chris
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>--
> > > >>>>>>>>>>>>>Chris Brew ([log in to unmask]) +44 1235 446326
> > > >>>>>>>>>>>>>Particle Physics Department Rutherford Appleton
> > > >>>>>>>>>>>>>Laboratory Chilton, Didcot. Oxfordshire.
> > > >>>>>>>>>>>>>OX11 0QX. United Kingdom.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>
> > >
> > >
> >
>
--
/------------------------------------+-------------------------\
|Stephen J. Gowdy | SLAC, MailStop 34, |
|http://www.slac.stanford.edu/~gowdy/ | 2575 Sand Hill Road, |
|http://calendar.yahoo.com/gowdy | Menlo Park CA 94025, USA |
|EMail: [log in to unmask] | Tel: +1 650 926 3144 |
\------------------------------------+-------------------------/