Hi Manny,

I find this quite mysterious, as this should never be the case and, 
frankly, it appears to violate causality. I suspect something else is going 
on. If this is reproducible, then why don't we run a test with all debugging 
turned on? Yes?

Andy

----- Original Message ----- 
From: "Emmanuel Olaiya" <[log in to unmask]>
To: "Andrew Hanushevsky" <[log in to unmask]>
Cc: "Brew, CAJ (Chris)" <[log in to unmask]>; <[log in to unmask]>; 
"Bill Weeks" <[log in to unmask]>
Sent: Monday, June 06, 2005 1:41 PM
Subject: Re: PreStage Problems


> Hi Andy
>
> I should have mentioned that we also removed the prestage queue and 
> restarted both the server and redirector. However, the old request was 
> still told to wait. Moreover, any similar new requests were also told to 
> wait until the old request was terminated.
>
> cheers
>
> Manny
>
> Andrew Hanushevsky wrote:
>> Hi Manny,
>>
>> Yes, but who is telling the client to wait? The redirector, or the server
>> that originally wanted to stage the file in? When you restart the
>> redirector it loses all its memory, but the data server does not. So, it
>> will happily tell the redirector that it has the file even though the file
>> is merely in the pre-stage queue. As long as the file is in the pre-stage
>> queue and not on disk, the only option is to direct clients to where the
>> file will be staged in, and then the clients simply wait for the file
>> (which in this case will never appear). So, if you remove staging you also
>> need to remove the pre-stage queue and restart the data server.
>>
>> Andy
>>
>> On Fri, 3 Jun 2005, Emmanuel Olaiya wrote:
>>
>>
>>>Hi Andy
>>>
>>>One other issue we have spotted at RAL. We removed the staging
>>>capabilities and restarted the redirector and server. However, we found
>>>that previous requests for a file that had been told to wait continued
>>>to be told to wait. We also found that if somebody else asked for this
>>>same file, which was not on disk, they were also told to wait rather than
>>>being told the file could not be found. We needed to kill the previous
>>>request and restart the server and redirector for xrootd to know the file
>>>was not on disk.
>>>
>>>cheers
>>>
>>>Manny
>>>
>>>Andrew Hanushevsky wrote:
>>>
>>>>Hi Chris,
>>>>
>>>>Oh yeah, different problem. I think that Bill Weeks fixed that. Bill,
>>>>did you fix that problem?
>>>>
>>>>Andy
>>>>
>>>>On Mon, 30 May 2005, Brew, CAJ (Chris) wrote:
>>>>
>>>>
>>>>
>>>>>Hi,
>>>>>
>>>>>I might be being stupid but I don't see how this relates to the
>>>>>problem. The files I wanted were on a different disk server which then
>>>>>went down. The server in question was registered with the OLB as being
>>>>>able to stage in the name space, so the request was redirected to it. If
>>>>>mps_Stage is used without the PreStage queuing system everything works
>>>>>as expected. If we try to go through the PreStage queue to limit the
>>>>>number of concurrent accesses to the tapestore, the stage-in fails,
>>>>>apparently because the DIR_LOCK file does not exist (which it doesn't,
>>>>>since the file, and its directory structure, has never existed on this
>>>>>server).
>>>>>
>>>>>Yours,
>>>>>Chris.
>>>>>
>>>>>
>>>>>
>>>>>>-----Original Message-----
>>>>>>From: Andrew Hanushevsky [mailto:[log in to unmask]]
>>>>>>Sent: 28 May 2005 07:39
>>>>>>To: Brew, CAJ (Chris)
>>>>>>Cc: [log in to unmask]; abh; Olaiya, EO (Emmanuel)
>>>>>>Subject: RE: PreStage Problems
>>>>>>
>>>>>>Hi Chris,
>>>>>>
>>>>>>This was traced to overzealous testing. The system does not put a new
>>>>>>entry in the pre-stage queue until about 10-20 minutes have elapsed
>>>>>>since the last time the entry was added. So, this is not a bug but a
>>>>>>test case that was not "real". Generally, files live in the disk cache
>>>>>>for at least 10-20 minutes.
>>>>>>
>>>>>>Andy
>>>>>>
>>>>>>On Fri, 27 May 2005, Brew, CAJ (Chris) wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>>Hi,
>>>>>>>
>>>>>>>At the meeting a couple of weeks ago, it was said that someone was
>>>>>>>looking into this but I haven't heard anything back. Is there any
>>>>>>>news?
>>>>>>
>>>>>>
>>>>>>>Thanks,
>>>>>>>Chris.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>-----Original Message-----
>>>>>>>>From: Brew, CAJ (Chris)
>>>>>>>>Sent: 17 May 2005 13:50
>>>>>>>>To: [log in to unmask]; abh
>>>>>>>>Cc: Olaiya, EO (Emmanuel)
>>>>>>>>Subject: PreStage Problems
>>>>>>>>
>>>>>>>>Hi,
>>>>>>>>
>>>>>>>>I've been running some more tests of the staging at RAL and
>>>>>>>>have run into a problem somewhere in the
>>>>>>>>mps_Stage/PreStage/prep system.
>>>>>>>>
>>>>>>>>Everything works fine when staging a file that was on the system
>>>>>>>>and has been deleted, but if I try to stage in a file that was on
>>>>>>>>a different server, so that the directory structure for the
>>>>>>>>file does not exist on the staging server, it fails and I see
>>>>>>>>the following error in the PreStage log file:
>>>>>>>>
>>>>>>>>12:45:43 [ 10859] mps_Stage: Open '/stage/bdata-data50/kanga//store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/DIR_LOCK' r/w failed; No such file or directory.
>>>>>>>>12:45:43 [ 10859] do_stagein: xfr failed for /store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_001005_3247.01.root, rc=4, retry=1
>>>>>>>>12:45:45 [  3255] file=/store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_0010053247.01.root, rc=1024, reqid=ef000001:1cd2.425d27e1:3762
>>>>>>>>
>>>>>>>>If I create the directories and the DIR_LOCK file before
>>>>>>>>running the import, everything works.
>>>>>>>>
>>>>>>>>The config file I'm using on the server is below.
>>>>>>>>
>>>>>>>>Is there some setting I'm missing which is needed to create
>>>>>>>>the directories/DIR_LOCK file or does the code need fixing?
>>>>>>>>
>>>>>>>>Thanks,
>>>>>>>>Chris
>>>>>>>>
>>>>>>>>--
>>>>>>>> Chris Brew  ([log in to unmask])  +44 1235 446326
>>>>>>>> Particle Physics Department
>>>>>>>> Rutherford Appleton Laboratory
>>>>>>>> Chilton, Didcot. Oxfordshire.
>>>>>>>> OX11 0QX. United Kingdom.
>>>>>>>>
>>>>>>>
>>>>>>>
>
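
The workaround Chris describes in the original report (creating the
directories and the DIR_LOCK file before running the import) could be
sketched roughly as below. This is only an illustration: the helper name
ensure_dir_lock, its arguments, and the assumption that an empty DIR_LOCK
file is sufficient are all hypothetical and are not taken from mps_Stage or
the PreStage scripts.

```python
import os


def ensure_dir_lock(stage_root, logical_path, lock_name="DIR_LOCK"):
    """Create the directory tree for logical_path under stage_root and
    make sure an (empty) lock file exists in the file's directory.

    Hypothetical helper: the real staging scripts may expect different
    ownership, permissions, or lock-file contents.
    """
    # Directory of the file, relative to the stage root.
    rel_dir = os.path.dirname(logical_path.lstrip("/"))
    target_dir = os.path.join(stage_root, rel_dir)

    # Create any missing parent directories; no error if they exist.
    os.makedirs(target_dir, exist_ok=True)

    # Open in append mode so an existing lock file is left untouched.
    lock_path = os.path.join(target_dir, lock_name)
    with open(lock_path, "a"):
        pass
    return lock_path
```

Called with the stage root and the /store/... path of the file to be staged
before the import runs, this would leave the directory and lock file in
place, which is the condition under which the thread reports the stage-in
succeeding.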