LISTSERV 16.5 - XROOTD-DEV Archives

The corruption was first discovered in the CASTOR system while recalling data from tape. Below I summarize the steps to fully reproduce this behavior, which is not specific to CASTOR and can be observed on any installation where the disk is slow or has a lot of I/O wait.

This happens when the XrdCl client enters the retry mechanism for an open operation. To simulate a slow server, I added the following code to the "open" function in XrdOfs.cc:

```c++
diff --git a/src/XrdOfs/XrdOfs.cc b/src/XrdOfs/XrdOfs.cc
index dffe18a..7710d41 100644
--- a/src/XrdOfs/XrdOfs.cc
+++ b/src/XrdOfs/XrdOfs.cc
@@ -443,6 +443,14 @@ int XrdOfsFile::open(const char          *path,      // In
    int find_flag = open_mode & (SFS_O_NOWAIT | SFS_O_RESET | SFS_O_MULTIW);
    XrdOucEnv Open_Env(info,0,client);

+   static int counter = 0;
+
+   if (counter < 3) {
+     ++counter;
+     fprintf(stderr, "Delay the open for 65 seconds ...\n");
+     sleep(65);
+   }
+
 // Trace entry
 //
```

Recompile and install the new libraries. Use the default "standalone" configuration for an XRootD server, but add the "nolock" option to the all.export directive. Also enable tracing at the OFS layer. Example:
```bash
all.export /tmp nolock
ofs.trace all
```
Now, just try to transfer a big file (2GB) to the server:
```bash 
xrdcp -f /tmp/2GB.dat root://localhost//tmp/test_file1.dat
```
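The report does not say how the 2GB source file was produced. The following is a minimal sketch, assuming that random data from `/dev/urandom` is acceptable as a payload; the helper name `make_test_file` is hypothetical and not part of XRootD. Random content is convenient here because any zero-filled region in the transferred copy then stands out clearly.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of XRootD): create a file filled with
# random data.
# Usage: make_test_file PATH SIZE_MB
make_test_file() {
  # $1 = output path, $2 = size in MiB (1 MiB = 1048576 bytes)
  dd if=/dev/urandom of="$1" bs=1048576 count="$2" 2>/dev/null
}
```

For the transfer above this could be called as `make_test_file /tmp/2GB.dat 2048`, followed by `cksum /tmp/2GB.dat` to record a checksum for later comparison with the copy on the server.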

The XRD_STREAMTIMEOUT is 60 seconds, which is shorter than the 65-second delay, so the client enters the open retry mechanism and retries the open operation 3 times. Eventually the transfer succeeds, but the file on the server is corrupted: either the beginning of the file is filled with zeros or the whole file is empty. This is caused by one of the open recovery requests, which truncates the file after some writes have already gone through.
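The effect of a late truncating open can be illustrated locally, without XRootD. This is only a sketch of the suspected sequence of events, assuming the recovery open truncates the file while the client keeps writing at increasing offsets; the 4-byte chunks and the temporary file are made up for the illustration.

```bash
#!/usr/bin/env bash
# Local simulation only; no XRootD involved. The temp file is a throwaway.
f=$(mktemp)

# 1. The first open succeeds and the client writes a chunk at offset 0.
printf 'AAAA' | dd of="$f" bs=4 seek=0 conv=notrunc 2>/dev/null

# 2. A delayed recovery open arrives and truncates the file to zero length.
: > "$f"

# 3. The client, unaware of the truncation, writes the next chunk at offset 4.
printf 'BBBB' | dd of="$f" bs=4 seek=1 conv=notrunc 2>/dev/null

# The first chunk is lost: bytes 0-3 now read back as zeros, followed by
# the bytes of the second chunk.
od -An -tx1 "$f"
```

Because the second write lands beyond the new end of file, the file system fills the gap with zeros, which matches the zero-filled beginning seen in the corrupted transfers. If the truncation arrives after the last write, the whole file ends up empty.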

For example, the contents of such a transferred file look like this:
```bash
[esindril@esvm000 build]$ hexdump /tmp/test_fil.dat | head -10
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
6f5c0000 0f48 7591 537e 7554 2cc4 b31d 3b4c 3980
6f5c0010 96fc 8e76 fdc9 115e 1ef4 c423 62e2 66ff
6f5c0020 9beb 0211 10af 2af8 49a5 9939 3c6c 53b9
6f5c0030 a931 69b3 0a1b 045f 0955 fc71 4e1f 8971
6f5c0040 91b7 d556 1a01 8201 9d4c 3aa3 0966 39ec
6f5c0050 0cea 293e aee8 be5a 4d92 db5a b875 e397
6f5c0060 81e3 fad4 2c85 add0 9dff ac35 ca58 01b0
6f5c0070 175b 855d 347e 0c86 87cb 300f 8b17 2d76
```
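In the dump above the first non-zero data appears at offset 0x6f5c0000 (1868300288 bytes, roughly 1.74 GiB into the file). To find that offset without reading the hexdump by eye, something like the following sketch could be used; the helper name `first_nonzero_offset` is hypothetical, and it assumes a `cmp` that reports the first difference as `... differ: byte N, line M` (or `char N`), as the GNU and BusyBox versions do.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the zero-based offset of the first non-zero
# byte in a file. Prints nothing if the file is empty or all zeros.
# Usage: first_nonzero_offset PATH
first_nonzero_offset() {
  # Compare the file (on stdin) against an endless stream of zeros; cmp
  # reports the 1-based position of the first differing byte in field 5.
  n=$(cmp - /dev/zero < "$1" 2>/dev/null | awk '{ sub(",", "", $5); print $5 }')
  if [ -n "$n" ]; then
    echo $((n - 1))
  fi
}
```

Running `first_nonzero_offset /tmp/test_file1.dat` on the server copy gives the size of the zero-filled prefix in bytes; an offset of 0 means the file does not start with zeros.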

The full client logs in Dump mode are attached:
[xrdcp.txt](https://github.com/xrootd/xrootd/files/891082/xrdcp.txt)

and the server logs:
[xrootd.txt](https://github.com/xrootd/xrootd/files/891073/xrootd.txt)

It's pretty clear from the server logs that one of the recovery open requests arrives while the client is already pushing data, and this leads to the truncation of the file. Michal and I looked at the code together today and we suspect that the recycling of SIDs might lead to this situation, but we haven't fully confirmed it.

Before going any further, I have a question for @abh3: the key modification in this setup is the "nolock" option, which allows multiple writers to the same file. In this scenario, is the corruption we observe expected by the design of the protocol, or is it really a bug in the client code? I suspect the latter ...


-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/xrootd/xrootd/issues/496
