Previously, the underlying OFS file handle was not closed until the destructor of the XrdTpcStream object ran, which happens _after_ the response has been sent to the client.

In this case, subsequent operations may be performed while the file is still open for writing, which may mean that data has not yet been written out from the client cache to the distributed file system. If a subsequent operation (such as `stat`) is load-balanced to a different data server, it may observe an incomplete file.

The fix is to explicitly close the file handle before sending the response to the client. Then, as long as the filesystem has close-to-open consistency (which NFS and HDFS do have; it is a much looser requirement than POSIX consistency), this should remove the possibility of reading an incomplete file.
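
To illustrate the ordering issue, here is a minimal, self-contained sketch. It is not the actual patch: `SketchStream`, `Finalize`, and `HandleTransfer` are hypothetical names used only for this example, and the "file handle" is simulated by appending events to a log. It shows how an explicit, idempotent finalize step lets the close happen before the response, while the destructor remains as a fallback.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a stream object that owns a file handle.
// The real XrdTpcStream wraps an OFS file; here we only record events.
class SketchStream {
public:
    explicit SketchStream(std::vector<std::string> &log) : m_log(log) {}

    // Fallback: if nobody finalized explicitly, close on destruction.
    ~SketchStream() { Finalize(); }

    // Close the handle now. Safe to call more than once.
    void Finalize() {
        if (m_open) {
            m_open = false;
            m_log.push_back("close");
        }
    }

private:
    std::vector<std::string> &m_log;
    bool m_open = true;
};

// Simulates handling one transfer and returns the order of events.
// finalize_first == false models the old behavior (close in destructor),
// finalize_first == true models closing before the response is sent.
std::vector<std::string> HandleTransfer(bool finalize_first) {
    std::vector<std::string> log;
    {
        SketchStream stream(log);
        // ... the transfer itself would write data here ...
        if (finalize_first) {
            stream.Finalize();
        }
        log.push_back("respond");  // response goes out to the client
    }  // destructor runs here, after the response
    return log;
}
```

With `finalize_first` set to `false`, the log reads `respond` then `close`, which is the window in which another data server could observe an incomplete file. With it set to `true`, the order is `close` then `respond`.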

This bug was revealed while testing load-balanced Xrootd on top of an HDFS filesystem at Nebraska. Depending on how often the race condition was triggered, it caused failure rates of up to 10%.
You can view, comment on, or merge this pull request online at:

  https://github.com/xrootd/xrootd/pull/891

-- Commit Summary --

  * Allow the state and stream objects to explicitly finalize.
  * Finalize successful transfers.

-- File Changes --

    M src/XrdTpc/XrdTpcMultistream.cc (7)
    M src/XrdTpc/XrdTpcState.cc (6)
    M src/XrdTpc/XrdTpcState.hh (9)
    M src/XrdTpc/XrdTpcStream.cc (21)
    M src/XrdTpc/XrdTpcStream.hh (14)

-- Patch Links --

https://github.com/xrootd/xrootd/pull/891.patch
https://github.com/xrootd/xrootd/pull/891.diff
