Hi all,

We set up an xrootd system using davs for Belle-II. Transfers between 
most sites work, but between some sites we always see failures.
An example FTS log can be found here:
https://particle.phys.uvic.ca/~mebert/xrootd/fts.log

While pull doesn't work at all, push initially seems to work fine:
the file is copied, and the checksum is calculated and returned. 
However, at the end FTS apparently asks again whether the file is 
there, and this time a server which did not receive the file responds, 
so the whole transfer fails.

Setup is
- redirector rdc-redirector.belle.uvic.ca
- 3 servers xrd{1..3}.belle.uvic.ca

For the above file copy in push mode, xrd4 responds and gets the file. It 
also answers the checksum request. However, the last propfind is answered 
by xrd5 which does not have the file for some reason.

I would appreciate it if anyone has a clue why that would happen. Is 
there any known caching in dCache (at the source) or in FTS that remembers 
that the initial pull used xrd5? Is there anything we can do here?

In the logs on the servers at the time stamp of the last propfind, I see:

redirector cmsd.log:
240314 17:52:22 15461 cms_SelNode: xrd4.belle.uvic.ca serving /TMP/belle/Raw/e0026/physics/r00311/sub00/physics.0026.00311.HLT3.f00001.root
240314 17:52:26 16119 cms_SelNode: xrd4.belle.uvic.ca serving /TMP/belle/Raw/e0026/physics/r00311/sub00/physics.0026.00311.HLT3.f00001.root


redirector xrootd.log:
240314 17:52:26 16521 1f81d4c7.3635:[log in to unmask] ofs_stat:  fn=/TMP/belle/Raw/e0026/physics/r00311/sub00/physics.0026.00311.HLT3.f00001.root
240314 17:52:26 15507 cms_Decode: rdc-redirector redirects 1f81d4c7.3635:[log in to unmask] to xrd4.belle.uvic.ca:1094 /TMP/belle/Raw/e0026/physics/r00311/sub00/physics.0026.00311.HLT3.f00001.root
240314 17:52:26 16521 http_Req:  XrdHttpReq::Redir Redirecting to Location: https://xrd4.belle.uvic.ca:1094/TMP/belle/Raw/e0026/physics/r00311/sub00/physics.0026.00311.HLT3.f00001.root?authz=Bearer%20MDAxOWxvY2F0aW9uIFVWaWMtUkFXLVNFCjAwMzRpZGVudGlmaWVyIGFkZmRlZmIxLTE0ZmMtNGFlZC1hY2E5LTdkMDM1MDUwNTc0NAowMDE4Y2lkIG5hbWU6MWY4MWQ0YzcuMAowMDUyY2lkIGFjdGl2aXR5OlJFQURfTUVUQURBVEEsVVBMT0FELERPV05MT0FELERFTEVURSxNQU5BR0UsVVBEQVRFX01FVEFEQVRBLExJU1QKMDAyYmNpZCBhY3Rpdml0eTpNQU5BR0UsVVBMT0FELERFTEVURSxMSVNUCjAwNWJjaWQgcGF0aDovVE1QL2JlbGxlL1Jhdy9lMDAyNi9waHlzaWNzL3IwMDMxMS9zdWIwMC9waHlzaWNzLjAwMjYuMDAzMTEuSExUMy5mMDAwMDEucm9vdAowMDI0Y2lkIGJlZm9yZToyMDI0LTAzLTE1VDAyOjMxOjQzWgowMDJmc2lnbmF0dXJlIL-E0x0-46MFqEULHl4I8ISbjcM2XxievOP6ibdlf2-SCg
240314 17:52:26 16521 1f81d4c7.3635:[log in to unmask] http_Protocol: Sending resp: 307 header len:771
240314 17:52:26 16521 http_Protocol: Sending 771 bytes
240314 17:52:26 16521 http_Req:  XrdHttpReq request ended.
240314 17:52:26 16521 http_Protocol:  Cleanup
240314 17:52:26 16521 http_Protocol:  Reset
240314 17:52:26 16521 http_Req:  XrdHttpReq request ended.

xrd4 xrootd.log (last entry about the file):
240314 17:52:26 17204 1f81d4c7.616:[log in to unmask] http_Req: PostProcessHTTPReq req: 3 reqstate: 1 final_:True
240314 17:52:26 17204 1f81d4c7.616:[log in to unmask] http_Req: Checksum for HEAD /TMP/belle/Raw/e0026/physics/r00311/sub00/physics.0026.00311.HLT3.f00001.root adler32=25b7c302
240314 17:52:26 17204 1f81d4c7.616:[log in to unmask] http_Protocol: Sending resp: 200 header len:142


xrd4 cmsd.log:
no entry at all between 17:49 and 17:59

xrd5 cmsd.log:
no entry at all between 17:49 and 17:59

xrd5 xrootd.log:
240314 17:52:26 26833 1f81d4c7.478:[log in to unmask] ofs_stat:  fn=/TMP/belle/Raw/e0026/physics/r00311/sub00/physics.0026.00311.HLT3.f00001.root
240314 17:52:26 26833 ofs_stat: 1f81d4c7.478:[log in to unmask] Unable to locate /TMP/belle/Raw/e0026/physics/r00311/sub00/physics.0026.00311.HLT3.f00001.root; no such file or directory
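
To cross-check which data server the redirector selected for a given path, the redirector logs can be tallied with a small script. The following is only a sketch: the two regular expressions are guessed from the cms_SelNode and cms_Decode lines quoted above (they are not taken from any format specification), and the function name servers_for_path is made up for illustration.

```python
import re
from collections import Counter

# Patterns inferred from the log excerpts in this post (an assumption,
# not an official description of the cmsd/xrootd log format).
SELNODE = re.compile(r"cms_SelNode: (?P<host>\S+) serving (?P<path>\S+)")
REDIRECT = re.compile(
    r"cms_Decode: \S+ redirects .* to (?P<host>[^\s:]+):(?P<port>\d+) (?P<path>\S+)"
)


def servers_for_path(lines, path):
    """Count how often each host was selected or redirected to for `path`."""
    counts = Counter()
    for line in lines:
        for pattern in (SELNODE, REDIRECT):
            match = pattern.search(line)
            if match and match.group("path") == path:
                counts[match.group("host")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python tally.py <path> < cmsd.log
    for host, n in servers_for_path(sys.stdin, sys.argv[1]).items():
        print(host, n)
```

If a host other than the one that received the file shows up in this tally, the redirector itself made the choice; if not, the client reached the other server by some route that bypassed the redirector.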


What I don't understand is why xrd5 answers in the end.
(I also don't understand why the initial pull doesn't work, but that 
seems to be an issue on our site.)
Any help would be much appreciated!

Cheers,
  Marcus

########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1