We have learned more about this bug, which is related to issue #1366. Three ingredients are needed to trigger it: gfal-copy writing files, the root: protocol, and HDFS as the file system. The sequence of events is as follows.

  1. A request is made by gfal-copy to write a file using the root: protocol.
  2. At the data node, this request fails due to #1366. The file name is added to the posc.log.
  3. The sysadmin restarts xrootd on the data node (for maintenance or other reasons).
  4. During start-up, xrootd goes through the posc.log to clean up the files. It tries to delete the files from step 2, but once again it fails because of bug #1366.
  5. Because xrootd can't clean up the files in the posc.log, it enters a non-responsive state.
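
To make the sequence above easier to reason about, here is a toy Python model of it. This is only a sketch of the behaviour we observed, not xrootd code: the `ToyDataNode` class, its methods, the `bug_1366_active` flag, and the example path are all hypothetical names invented for illustration, and the real POSC handling inside xrootd may well differ in detail.

```python
from dataclasses import dataclass, field


class DeleteFailed(Exception):
    """Hypothetical stand-in for the failure attributed to #1366."""


@dataclass
class ToyDataNode:
    # All names here are illustrative assumptions, not xrootd APIs.
    bug_1366_active: bool = True
    posc_log: list = field(default_factory=list)
    responsive: bool = False

    def write(self, path: str) -> bool:
        """Steps 1-2: a write request arrives; on failure the path is logged."""
        if self.bug_1366_active:
            self.posc_log.append(path)
            return False
        return True

    def delete(self, path: str) -> None:
        """Deleting a file fails for the same reason the write did."""
        if self.bug_1366_active:
            raise DeleteFailed(path)

    def start(self) -> None:
        """Steps 3-5: on start-up, replay the log and try to clean up."""
        self.responsive = False
        for path in list(self.posc_log):
            try:
                self.delete(path)
            except DeleteFailed:
                # Cleanup failed, so the node never becomes responsive.
                return
            self.posc_log.remove(path)
        self.responsive = True


node = ToyDataNode()
node.write("/store/example/file.root")  # hypothetical path
node.start()
```

In this model the node stays unresponsive across restarts for as long as the entry remains in the log and the delete keeps failing, which matches what we see on the data node.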

