On Fri, 18 May 2018, Andrew Hanushevsky wrote:
> From my reading of Lustre blogs, the consensus is that it degrades performance
> of the metadata server, and unless you have an overpowering reason to enable
> extended attributes, you should disable them. I suspect that was the reason.
Yes, that's what I also suspect. We (LCLS) also have it turned off (but I
can turn it on).
Cheers,
Wilko
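As a side note, whether a given mount honors user extended attributes can be
checked directly from the shell. This is a minimal sketch, not the actual
NERSC setup: the mount point argument is a placeholder (defaulting to /tmp
here), and on the DTNs it would be the Lustre scratch mount instead.

```shell
#!/bin/sh
# Probe whether a filesystem honors user extended attributes.
# The mount point is a placeholder argument; defaults to /tmp for
# illustration. On a real check, pass the Lustre mount point instead.
MNT="${1:-/tmp}"
probe="$MNT/.xattr_probe.$$"
touch "$probe"
# setfattr fails with "Operation not supported" (ENOTSUP) when the
# filesystem is mounted without the user_xattr option.
if setfattr -n user.test -v 1 "$probe" 2>/dev/null; then
    echo "user_xattr supported on $MNT"
else
    echo "user_xattr NOT supported on $MNT"
fi
rm -f "$probe"
```

On a Lustre client mounted without user_xattr, the setfattr call fails with
"Operation not supported", which is the same error XCache reports further
down in this thread when it tries to set its own attributes on cached files.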
> Andy
>
> On Fri, 18 May 2018, Wilko Kroeger wrote:
>
>>
>> Hello Wei
>>
>> I think the issue is that Lustre is mounted without xattr support (the
>> user_xattr mount option). Maybe one could ask NERSC if they would be willing
>> to enable it, at least on the dtn's. There might be some reason why they don't.
>>
>> Cheers,
>> Wilko
>>
>>
>> On Fri, 18 May 2018, Yang, Wei wrote:
>>
>>> Hi Andy,
>>>
>>> In order to use NERSC Lustre, I followed your suggestion and added noxattr
>>> to all.export (at least that is what XrdOucConfig.cc says), but I got:
>>>
>>> 180518 10:44:16 6234 proxy_Export: warning, invalid path option noxattr
>>> =====> all.export /atlas/rucio stage r/o noxattr
>>> 180518 10:44:16 6234 proxy_Export: warning, invalid path option noxattr
>>> =====> all.export /root:/ stage r/o noxattr
>>> 180518 10:44:16 6234 proxy_Export: warning, invalid path option noxattr
>>> =====> all.export /xroot:/ stage r/o noxattr
>>> ...
>>> 180518 10:56:37 6240 XrootdXeq: yangw.20594:27@cori09-224 pub IPv4 login
>>> 180518 10:56:37 6240 XrdFileCache_Manager: info Cache::Attach()
>>> root://27@localfile:1094//atlas/rucio/transient/44/f0/panda.HITS.14132300._000997.pool.root.1.14132300-3933394871-13456989456-468-8.zip
>>> 180518 10:56:37 6240 ofs_FAttr: Unable to set attr XrdFrm.Pfn from
>>> /global/cscratch1/sd/yangw/xcache/dtn04.nersc.gov/xrd/datafiles/data/00/F010FF5A5A18000000008037cd1500000000404%;
>>> operation not supported
>>> 180518 10:56:37 6240 XrdFileCache_File: error File::Open() Create failed
>>> for data file
>>> /atlas/rucio/transient/44/f0/panda.HITS.14132300._000997.pool.root.1.14132300-3933394871-13456989456-468-8.zip,
>>> err=Operation not supported
>>> /atlas/rucio/transient/44/f0/panda.HITS.14132300._000997.pool.root.1.14132300-3933394871-13456989456-468-8.zip
>>> 180518 10:56:39 6240 XrdFileCache_IO: info IOEntireFile::Detach()
>>> 0x1e19300
>>> 180518 10:56:39 6240 XrootdXeq: yangw.20594:27@cori09-224 disc 0:00:02
>>>
>>> And again, the file is not cached. So I switched back to using GPFS.
>>>
>>> --
>>> Wei Yang | [log in to unmask] | 650-926-3338 (O)
>>>
>>>
>>> On 5/17/18, 10:20 PM, "Yang, Wei" <[log in to unmask]> wrote:
>>>
>>> The log is at
>>> /global/project/projectdirs/atlas/xcache/cache/dtn04.nersc.gov/xrd/var/log/xrootd.log.
>>> Nothing interesting there.
>>>
>>> I hit the same hanging issue on that file. RUCIO returns a long list of
>>> data sources in the metalink. I know that the 1st data source was Univ. of
>>> Victoria, which does not work (even from SLAC). I manually changed the 1st
>>> data source to Univ. of Chicago, but it still hung. So I attached gdb and
>>> tried a few other files; all worked. But when I quit gdb, I saw this:
>>>
>>> (gdb) c
>>> Continuing.
>>> [New Thread 0x7f12cd764780 (LWP 12188)]
>>>
>>> Program received signal SIGUSR1, User defined signal 1.
>>> [Switching to Thread 0x7f12cd74a780 (LWP 8916)]
>>> 0x00007f12cc95879b in do_futex_wait.constprop.1 () from
>>> /lib64/libpthread.so.0
>>> (gdb) c
>>> Continuing.
>>> [Thread 0x7f12cd764780 (LWP 12188) exited]
>>> [New Thread 0x7f12cd75c780 (LWP 12271)]
>>>
>>> Program received signal SIGUSR1, User defined signal 1.
>>> 0x00007f12cc95879b in do_futex_wait.constprop.1 () from
>>> /lib64/libpthread.so.0
>>> (gdb) quit
>>> A debugging session is active.
>>>
>>> Inferior 1 [process 8916] will be detached.
>>>
>>> Quit anyway? (y or n) y
>>> Detaching from program:
>>> /global/project/projectdirs/atlas/xcache/test/git/xrdbld/src/xrootd,
>>> process 8916
>>>
>>> I don't know where this SIGUSR1 is coming from, and I quit anyway.
>>> But then that hanging file started working! At this point I don't know
>>> what is going on. The same xcache instance is still running. I will keep
>>> an eye on it.
>>>
>>> --
>>> Wei Yang | [log in to unmask] | 650-926-3338(O)
>>>
>>> -----Original Message-----
>>> From: Vakho Tsulaia <[log in to unmask]>
>>> Date: Thursday, May 17, 2018 at 9:51 PM
>>> To: Andrew Hanushevsky <[log in to unmask]>
>>> Cc: Wei Yang <[log in to unmask]>, Zachary Marshall
>>> <[log in to unmask]>, Paolo Calafiura <[log in to unmask]>
>>> Subject: Re: Interest in an LBNL project
>>>
>>> Hi Andy,
>>>
>>> > Could you send the xrootd log from dtn04 (I really should get a
>>> > NERSC account).
>>>
>>> I don't know how to get this log. Perhaps Wei can help?
>>>
>>> -- vakho
>>>
>>>
>>> On 05/17/2018 05:06 PM, Andrew Hanushevsky wrote:
>>> > Hi Vakho,
>>> >
>>> > Something happened at the server on dtn04 and it thinks it doesn't
>>> > have access to the file but will at some point in the future. So, it
>>> > is waiting for the future to arrive and stalling the client until
>>> > then. Could you send the xrootd log from dtn04 (I really should get
>>> > a NERSC account).
>>> >
>>> > Andy
>>> >
>>> > On Thu, 17 May 2018, Vakho Tsulaia wrote:
>>> >
>>> >> Hi Wei,
>>> >>
>>> >>> Take the first one as an example:
>>> >>>
>>> >>> yangw@cori02 $ ~yangw/bin/xrdcp -f
>>> >>>
>>> root://dtn04.nersc.gov//atlas/rucio/mc16_13TeV:EVNT.13836203._000001.pool.root.1
>>> >>> /dev/null
>>> >>>
>>> [213.7MB/213.7MB][100%][==================================================][7.914MB/s]
>>> >>>
>>> >>> yangw@cori02 $ ~yangw/bin/xrdcp -f
>>> >>>
>>> root://dtn04.nersc.gov//atlas/rucio/mc16_13TeV:EVNT.13836203._000001.pool.root.1
>>> >>> /dev/null
>>> >>>
>>> [213.7MB/213.7MB][100%][==================================================][213.7MB/s]
>>> >>>
>>> >> Yesterday I played around with it from a Shifter container. I
>>> >> successfully downloaded several EVNT files using commands like
>>> >> the following one (for example):
>>> >>
>>> >> xrdcp -f
>>> >>
>>> root://dtn04.nersc.gov//atlas/rucio/mc16_13TeV:EVNT.13836203._000001.pool.root.1
>>> >> EVNT.13836203._000001.pool.root.1
>>> >>
>>> >> But then at some point this command stopped working for me; it
>>> >> was hanging forever with no response. So I reran it with '-d 3'
>>> >> and it started to generate a log which looked like an infinite
>>> >> loop. At some point I killed it and saved the log (attached).
>>> >>
>>> >> Could you please have a look at it and tell me what's going on
>>> >> there?
>>> >>
>>> >> Thanks,
>>> >> -- vakho
>>> >>
>>>
>>>
>>>
>>>
>>>
>>>
>>> ########################################################################
>>> Use REPLY-ALL to reply to list
>>>
>>> To unsubscribe from the XCACHE-L list, click the following link:
>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XCACHE-L&A=1
>>>
>>
>