Hi Manny,

Yep!

Andy
----- Original Message ----- 
From: "Emmanuel Olaiya" <[log in to unmask]>
To: "Andy Hanushevsky" <[log in to unmask]>
Cc: "Brew, CAJ (Chris)" <[log in to unmask]>; <[log in to unmask]>; 
"Bill Weeks" <[log in to unmask]>
Sent: Monday, June 06, 2005 4:37 PM
Subject: Re: PreStage Problems


> Hi Andy
>
> Ok, we can try and provide log files.
>
> Will you be at the meeting tomorrow?
>
> cheers
>
> Manny
>
> Andy Hanushevsky wrote:
>> Hi Manny,
>>
>> No problem other than the rest of the week is tied up -- Seattle Tue/Wed,
>> Oakland Thu, home Fri. Though looking at log files is always possible
>> here.
>>
>> Andy
>>
>> ----- Original Message ----- From: "Emmanuel Olaiya" <[log in to unmask]>
>> To: "Andy Hanushevsky" <[log in to unmask]>
>> Cc: "Brew, CAJ (Chris)" <[log in to unmask]>; 
>> <[log in to unmask]>; "Bill Weeks" <[log in to unmask]>
>> Sent: Monday, June 06, 2005 2:57 PM
>> Subject: Re: PreStage Problems
>>
>>
>>> Hi Andy
>>>
>>> Yes, it would be good if you could have a look at this with me. We can 
>>> arrange a time in the xrootd meeting tomorrow.
>>>
>>> cheers
>>>
>>> Manny
>>>
>>> Andy Hanushevsky wrote:
>>>
>>>> Hi Manny,
>>>>
>>>> I find this quite mysterious, as this should never be the case and,
>>>> frankly, appears to violate causality. I suspect something else is
>>>> going on. If this is reproducible, then why don't we run a test with all
>>>> debugging turned on. Yes?
>>>>
>>>> Andy
>>>>
>>>> ----- Original Message ----- From: "Emmanuel Olaiya" 
>>>> <[log in to unmask]>
>>>> To: "Andrew Hanushevsky" <[log in to unmask]>
>>>> Cc: "Brew, CAJ (Chris)" <[log in to unmask]>; 
>>>> <[log in to unmask]>; "Bill Weeks" <[log in to unmask]>
>>>> Sent: Monday, June 06, 2005 1:41 PM
>>>> Subject: Re: PreStage Problems
>>>>
>>>>
>>>>> Hi Andy
>>>>>
>>>>> I should have mentioned that we also removed the prestage queue and
>>>>> restarted both the server and redirector. However, the old request to
>>>>> wait did not change. Moreover, any similar new requests were also told
>>>>> to wait until the old request was terminated.
>>>>>
>>>>> cheers
>>>>>
>>>>> Manny
>>>>>
>>>>> Andrew Hanushevsky wrote:
>>>>>
>>>>>> Hi Manny,
>>>>>>
>>>>>> Yes, but who is telling the client to wait? The redirector, or the
>>>>>> server that originally wanted to stage the file in? When you restart
>>>>>> the redirector it loses all its memory, but the data server does not.
>>>>>> So, the data server will happily tell the redirector that it has the
>>>>>> file even though the file is merely in the pre-stage queue. As long as
>>>>>> the file is in the prestage queue and not on disk, the only option is
>>>>>> to direct clients to where the file will be staged in, and then the
>>>>>> clients simply wait for the file (which in this case will never
>>>>>> appear). So, if you remove staging you also need to remove the
>>>>>> prestage queue and restart the data server.
>>>>>>
>>>>>> Andy
>>>>>>
>>>>>> On Fri, 3 Jun 2005, Emmanuel Olaiya wrote:
>>>>>>
>>>>>>
>>>>>>> Hi Andy
>>>>>>>
>>>>>>> One other issue we have spotted at RAL. We removed the staging
>>>>>>> capabilities and restarted the redirector and server. However, we
>>>>>>> found that previous requests for a file that had been told to wait
>>>>>>> continued being told to wait. We also found that if somebody else
>>>>>>> asked for this same file, which was not on disk, they were also told
>>>>>>> to wait rather than being told the file could not be found. We needed
>>>>>>> to kill the previous request and restart the server and redirector
>>>>>>> for xrootd to know the file was not on disk.
>>>>>>>
>>>>>>> cheers
>>>>>>>
>>>>>>> Manny
>>>>>>>
>>>>>>> Andrew Hanushevsky wrote:
>>>>>>>
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> Oh yeah, different problem. I think that Bill Weeks fixed that.
>>>>>>>> Bill, did you fix that problem?
>>>>>>>>
>>>>>>>> Andy
>>>>>>>>
>>>>>>>> On Mon, 30 May 2005, Brew, CAJ (Chris) wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I might be being stupid but I don't see how this relates to the
>>>>>>>>> problem. The files I wanted were on a different disk server which
>>>>>>>>> then went down. The server in question was registered with the OLB
>>>>>>>>> as being able to stage in the name space, so the request was
>>>>>>>>> redirected to it. If mps_Stage is used without the PreStage queuing
>>>>>>>>> system, everything works as expected. If we try to go through the
>>>>>>>>> PreStage queue to limit the number of concurrent accesses to the
>>>>>>>>> tapestore, the stage-in fails, apparently because the DIR_LOCK file
>>>>>>>>> does not exist (which it doesn't, since the file, and its directory
>>>>>>>>> structure, have never existed on this server).
>>>>>>>>>
>>>>>>>>> Yours,
>>>>>>>>> Chris.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Andrew Hanushevsky [mailto:[log in to unmask]]
>>>>>>>>>> Sent: 28 May 2005 07:39
>>>>>>>>>> To: Brew, CAJ (Chris)
>>>>>>>>>> Cc: [log in to unmask]; abh; Olaiya, EO (Emmanuel)
>>>>>>>>>> Subject: RE: PreStage Problems
>>>>>>>>>>
>>>>>>>>>> Hi Chris,
>>>>>>>>>>
>>>>>>>>>> This was traced to overzealous testing. The system does not put a
>>>>>>>>>> new entry in the pre-stage queue until about 10-20 minutes have
>>>>>>>>>> elapsed since the last time the entry was added. So, this is not a
>>>>>>>>>> bug but a test case that was not "real". Generally, files live in
>>>>>>>>>> the disk cache for at least 10-20 minutes.
>>>>>>>>>>
>>>>>>>>>> Andy
>>>>>>>>>>
>>>>>>>>>> On Fri, 27 May 2005, Brew, CAJ (Chris) wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> At the meeting a couple of weeks ago, it was said that someone was
>>>>>>>>>>> looking into this but I haven't heard anything back. Is there any
>>>>>>>>>>> news?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Chris.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Brew, CAJ (Chris)
>>>>>>>>>>>> Sent: 17 May 2005 13:50
>>>>>>>>>>>> To: [log in to unmask]; abh
>>>>>>>>>>>> Cc: Olaiya, EO (Emmanuel)
>>>>>>>>>>>> Subject: PreStage Problems
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I've been running some more tests of the staging at RAL and
>>>>>>>>>>>> have run into a problem somewhere in the
>>>>>>>>>>>> mps_Stage/PreStage/prep system.
>>>>>>>>>>>>
>>>>>>>>>>>> Everything works fine when staging a file that was on the system
>>>>>>>>>>>> and has been deleted, but if I try to stage in a file that was on
>>>>>>>>>>>> a different server, so that the directory structure for the file
>>>>>>>>>>>> does not exist on the staging server, it fails and I see the
>>>>>>>>>>>> following error in the PreStage log file:
>>>>>>>>>>>>
>>>>>>>>>>>> 12:45:43 [ 10859] mps_Stage: Open
>>>>>>>>>>>> '/stage/bdata-data50/kanga//store/SPskims/R12/16.0.2e/BtoKKKL/
>>>>>>>>>>>> 001005/200002/DIR_LOCK' r/w failed; No such file or directory.
>>>>>>>>>>>> 12:45:43 [ 10859] do_stagein: xfr failed for
>>>>>>>>>>>> /store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_00100
>>>>>>>>>>>> 5_3247.01.root, rc=4, retry=1
>>>>>>>>>>>> 12:45:45 [  3255]
>>>>>>>>>>>> file=/store/SPskims/R12/16.0.2e/BtoKKKL/001005/200002/BtoKKKL_
>>>>>>>>>>>> 0010053247.01.root, rc=1024, reqid=ef000001:1cd2.425d27e1
>>>>>>>>>>>> :3762
>>>>>>>>>>>>
>>>>>>>>>>>> If I create the directories and the DIR_LOCK file before
>>>>>>>>>>>> running the import, everything works.
>>>>>>>>>>>>
>>>>>>>>>>>> The config file I'm using on the server is below.
>>>>>>>>>>>>
>>>>>>>>>>>> Is there some setting I'm missing which is needed to create
>>>>>>>>>>>> the directories/DIR_LOCK file or does the code need fixing?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> -- 
>>>>>>>>>>>> Chris Brew  ([log in to unmask])  +44 1235 446326
>>>>>>>>>>>> Particle Physics Department
>>>>>>>>>>>> Rutherford Appleton Laboratory
>>>>>>>>>>>> Chilton, Didcot. Oxfordshire.
>>>>>>>>>>>> OX11 0QX. United Kingdom.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>
>>>>
>>>
>>
>