XCACHE-L Archives


XCACHE-L@LISTSERV.SLAC.STANFORD.EDU



XCACHE-L  May 2018

Subject:

Re: Interest in an LBNL project

From:

"Yang, Wei" <[log in to unmask]>

Reply-To:

list for xcache development and deployment <[log in to unmask]>

Date:

Fri, 18 May 2018 18:10:02 +0000

Content-Type:

text/plain


Hi Andy,

In order to use NERSC Lustre, I followed your suggestion and added noxattr to all.export (at least that is what XrdOucConfig.cc says), but I got:

180518 10:44:16 6234 proxy_Export: warning, invalid path option noxattr
=====> all.export /atlas/rucio stage r/o noxattr
180518 10:44:16 6234 proxy_Export: warning, invalid path option noxattr
=====> all.export /root:/ stage r/o noxattr
180518 10:44:16 6234 proxy_Export: warning, invalid path option noxattr
=====> all.export /xroot:/ stage r/o noxattr
...
180518 10:56:37 6240 XrootdXeq: yangw.20594:27@cori09-224 pub IPv4 login
180518 10:56:37 6240 XrdFileCache_Manager: info Cache::Attach() root://27@localfile:1094//atlas/rucio/transient/44/f0/panda.HITS.14132300._000997.pool.root.1.14132300-3933394871-13456989456-468-8.zip
180518 10:56:37 6240 ofs_FAttr: Unable to set attr XrdFrm.Pfn from /global/cscratch1/sd/yangw/xcache/dtn04.nersc.gov/xrd/datafiles/data/00/F010FF5A5A18000000008037cd1500000000404%; operation not supported
180518 10:56:37 6240 XrdFileCache_File: error File::Open() Create failed for data file /atlas/rucio/transient/44/f0/panda.HITS.14132300._000997.pool.root.1.14132300-3933394871-13456989456-468-8.zip, err=Operation not supported /atlas/rucio/transient/44/f0/panda.HITS.14132300._000997.pool.root.1.14132300-3933394871-13456989456-468-8.zip
180518 10:56:39 6240 XrdFileCache_IO: info IOEntireFile::Detach() 0x1e19300
180518 10:56:39 6240 XrootdXeq: yangw.20594:27@cori09-224 disc 0:00:02
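
For anyone scanning a longer xrootd.log for the same symptoms, here is a minimal Python sketch that pulls out the warning and error lines. It assumes the layout seen in the excerpt above (a six-digit date read as YYMMDD, a time, a numeric ID, a component name, then the message); that reading is inferred from these lines only, and the helper names are hypothetical.

```python
import re
from datetime import datetime

# Layout inferred from the excerpt above (an assumption, not a documented format):
#   YYMMDD HH:MM:SS <numeric id> <component>: <message>
LOG_LINE = re.compile(
    r"^(?P<date>\d{6}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<tid>\d+) "
    r"(?P<component>[^:\s]+): (?P<message>.*)$"
)


def parse_line(line):
    """Return a dict for a log line in the assumed layout, or None otherwise."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None  # e.g. the "=====> all.export ..." echo lines
    stamp = datetime.strptime(m["date"] + " " + m["time"], "%y%m%d %H:%M:%S")
    return {
        "time": stamp,
        "tid": int(m["tid"]),
        "component": m["component"],
        "message": m["message"],
    }


def errors_and_warnings(lines):
    """Keep only the records whose message looks like a warning or an error."""
    hits = []
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        text = rec["message"].lower()
        if "warning" in text or "error" in text or "unable to" in text:
            hits.append(rec)
    return hits


sample = [
    "180518 10:44:16 6234 proxy_Export: warning, invalid path option noxattr",
    "=====> all.export /atlas/rucio stage r/o noxattr",
    "180518 10:56:37 6240 XrootdXeq: yangw.20594:27@cori09-224 pub IPv4 login",
    "180518 10:56:37 6240 ofs_FAttr: Unable to set attr XrdFrm.Pfn; operation not supported",
    "180518 10:56:37 6240 XrdFileCache_File: error File::Open() Create failed",
]

for rec in errors_and_warnings(sample):
    print(rec["time"].isoformat(), rec["component"], rec["message"])
```

The sample messages are shortened copies of the lines above; the keyword filter is deliberately crude and would need tuning for a real log.
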

And again, the file is not cached. So I switched back to using GPFS.
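
The "operation not supported" error above is what a filesystem returns when it refuses extended attributes. A quick way to check a candidate cache directory before pointing xcache at it is sketched below in Python. This is an illustration only: the attribute name `user.xattr_probe` and the function name are made up, and the thread does not say which xattr namespace XRootD uses for XrdFrm.Pfn, so the `user.` namespace here is an assumption.

```python
import os
import tempfile


def supports_user_xattr(directory):
    """Return True if a user.* extended attribute can be set on a file in `directory`.

    Any OSError from the attempt (for example "Operation not supported")
    is reported as False; platforms without os.setxattr also return False.
    """
    if not hasattr(os, "setxattr"):
        return False  # os.setxattr is only available on Linux
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        # Hypothetical attribute name, used only for this probe.
        os.setxattr(path, "user.xattr_probe", b"1")
        return True
    except OSError as exc:
        print("cannot set xattr in %s: %s" % (directory, exc))
        return False
    finally:
        os.unlink(path)  # always remove the scratch file


if __name__ == "__main__":
    # Replace with the cache data directory to be checked.
    print(supports_user_xattr(tempfile.gettempdir()))
```

Running it once against the Lustre scratch path and once against the GPFS path would show whether the two filesystems differ in this respect.
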

--
Wei Yang  |  [log in to unmask]  |  650-926-3338 (O)
 

On 5/17/18, 10:20 PM, "Yang, Wei" <[log in to unmask]> wrote:

    The log is at /global/project/projectdirs/atlas/xcache/cache/dtn04.nersc.gov/xrd/var/log/xrootd.log. Nothing interesting there.
    
    I hit the same hang on that file. RUCIO returns a long list of data sources in the metalink. I know that the 1st data source was Univ. of Victory, which does not work (even from SLAC). I manually changed the 1st data source to Univ. of Chicago, but it still hung. So I attached gdb and tried a few other files, all of which worked. But when I quit gdb, I saw this:
    
    (gdb) c
    Continuing.
    [New Thread 0x7f12cd764780 (LWP 12188)]
    
    Program received signal SIGUSR1, User defined signal 1.
    [Switching to Thread 0x7f12cd74a780 (LWP 8916)]
    0x00007f12cc95879b in do_futex_wait.constprop.1 () from /lib64/libpthread.so.0
    (gdb) c
    Continuing.
    [Thread 0x7f12cd764780 (LWP 12188) exited]
    [New Thread 0x7f12cd75c780 (LWP 12271)]
    
    Program received signal SIGUSR1, User defined signal 1.
    0x00007f12cc95879b in do_futex_wait.constprop.1 () from /lib64/libpthread.so.0
    (gdb) quit
    A debugging session is active.
    
    	Inferior 1 [process 8916] will be detached.
    
    Quit anyway? (y or n) y
    Detaching from program: /global/project/projectdirs/atlas/xcache/test/git/xrdbld/src/xrootd, process 8916
    
    I don't know where this SIGUSR1 is coming from, and I quit anyway. But then the file that had been hanging started working! At this point I don't know what is going on. The same xcache instance is still running. I will keep an eye on it.
    
    --
    Wei Yang  |  [log in to unmask]  |  650-926-3338(O)
    
    -----Original Message-----
    From: Vakho Tsulaia <[log in to unmask]>
    Date: Thursday, May 17, 2018 at 9:51 PM
    To: Andrew Hanushevsky <[log in to unmask]>
    Cc: Wei Yang <[log in to unmask]>, Zachary Marshall <[log in to unmask]>, Paolo Calafiura <[log in to unmask]>
    Subject: Re: Interest in an LBNL project
    
        Hi Andy,
        
         > Could you send the xrootd log from dtn04 (I really should get a NERSC 
        account).
        
        I don't know how to get this log. Perhaps Wei can help?
        
        -- vakho
        
        
        On 05/17/2018 05:06 PM, Andrew Hanushevsky wrote:
        > Hi Vakho,
        >
        > Something happened at the server on dtn04 and it thinks it doesn't 
        > have access to the file but will some time in the future. So, it is 
        > waiting for the future to arrive and stalling the client until then. 
        > Could you send the xrootd log from dtn04 (I really should get a NERSC 
        > account).
        >
        > Andy
        >
        > On Thu, 17 May 2018, Vakho Tsulaia wrote:
        >
        >> Hi Wei,
        >>
        >>> Take the first one as an example:
        >>>
        >>> yangw@cori02 $ ~yangw/bin/xrdcp -f 
        >>> root://dtn04.nersc.gov//atlas/rucio/mc16_13TeV:EVNT.13836203._000001.pool.root.1 
        >>> /dev/null
        >>> [213.7MB/213.7MB][100%][==================================================][7.914MB/s] 
        >>>
        >>> yangw@cori02 $ ~yangw/bin/xrdcp -f 
        >>> root://dtn04.nersc.gov//atlas/rucio/mc16_13TeV:EVNT.13836203._000001.pool.root.1 
        >>> /dev/null
        >>> [213.7MB/213.7MB][100%][==================================================][213.7MB/s] 
        >>>
        >> Yesterday I played around with it from a Shifter container. I 
        >> successfully downloaded several EVNT files using commands
        >> such as the following one:
        >>
        >> xrdcp -f 
        >> root://dtn04.nersc.gov//atlas/rucio/mc16_13TeV:EVNT.13836203._000001.pool.root.1 
        >> EVNT.13836203._000001.pool.root.1
        >>
        >> But then at some point this command stopped working for me, it was 
        >> hanging forever with no response. So I reran it with
        >> '-d 3' and the thing started to generate a log which looked like an 
        >> infinite loop. At some point I killed it and saved the log (attached).
        >>
        >> Could you please have a look at it and tell me what's going on there?
        >>
        >> Thanks,
        >> -- vakho
        >>
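
The two xrdcp runs quoted above differ only in the final rate field, which is how a cache hit shows up from the client side. A small Python sketch for extracting that rate and comparing a cold and a warm copy follows. It assumes the progress line ends with a bracketed rate in MB/s, as in the two lines quoted in this thread; other units are not handled, and the function name is hypothetical.

```python
import re

# Matches the trailing "[7.914MB/s]" field; MB/s is the only unit seen in
# this thread, so other units are deliberately not handled (assumption).
RATE_FIELD = re.compile(r"\[([0-9.]+)MB/s\]")


def rate_mb_per_s(progress_line):
    """Extract the transfer rate in MB/s from an xrdcp progress line."""
    m = RATE_FIELD.search(progress_line)
    if m is None:
        raise ValueError("no MB/s rate field found in: %r" % progress_line)
    return float(m.group(1))


bar = "=" * 50
cold = "[213.7MB/213.7MB][100%][" + bar + "][7.914MB/s]"  # first copy
warm = "[213.7MB/213.7MB][100%][" + bar + "][213.7MB/s]"  # second copy

speedup = rate_mb_per_s(warm) / rate_mb_per_s(cold)
print("cold: %.3f MB/s, warm: %.1f MB/s, ratio: %.1f"
      % (rate_mb_per_s(cold), rate_mb_per_s(warm), speedup))
```

A ratio near 1 on the second copy would suggest the file was not served from the cache.
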
        
        
    
    

