Hi Andy,

Indeed, with the current HEAD of master the Invalid message problem goes away. The client now receives a kXR_wait response, but for 1 second rather than the 0 seconds specified here: https://github.com/xrootd/xrootd/blob/master/src/XrdXrootd/XrdXrootdCallBack.cc#L141

The client logs look like this:

[2015-10-06 09:45:29.953368 +0200][Dump   ][XRootDTransport   ] [msg: 0xb8000b70] Expecting 20 bytes of message body
[2015-10-06 09:45:29.953429 +0200][Dump   ][AsyncSock         ] [lxc2dev6d1.cern.ch:1095 #0.0] Received message header for 0xb8000b70 size: 8
[2015-10-06 09:45:29.953458 +0200][Dump   ][AsyncSock         ] [lxc2dev6d1.cern.ch:1095 #0.0] Received message 0xb8000b70 of 28 bytes
[2015-10-06 09:45:29.953483 +0200][Dump   ][PostMaster        ] [lxc2dev6d1.cern.ch:1095 #0] Queuing received message: 0xb8000b70.
[2015-10-06 09:45:29.953572 +0200][Dump   ][XRootD            ] [lxc2dev6d1.cern.ch:1095] Got an async response to message kXR_open (file: /castor/cern.ch/dev/e/esindril/dir_default/test2G_1.dat?tpc.key=000e6309712f6faf56137c15&tpc.or
[log in to unmask], mode: 00, flags: kXR_open_read kXR_async kXR_retstat ), processing it
[2015-10-06 09:45:29.953646 +0200][Dump   ][XRootD            ] [lxc2dev6d1.cern.ch:1095] Got kXR_wait response of 1 seconds to message kXR_open (file: /castor/cern.ch/dev/e/esindril/dir_default/test2G_1.dat?tpc.key=000e6309712f6faf56137c15&[log in to unmask], mode: 00, flags: kXR_open_read kXR_async kXR_retstat ): 

I've looked deeper into this, and the problem actually comes from the latest commit. Therefore, instead of using commit f8ec5c6, I used the following patch:

diff --git a/src/XrdOfs/XrdOfsTPCAuth.cc b/src/XrdOfs/XrdOfsTPCAuth.cc
index b1f7f62..2367717 100644
--- a/src/XrdOfs/XrdOfsTPCAuth.cc
+++ b/src/XrdOfs/XrdOfsTPCAuth.cc
@@ -87,7 +87,7 @@ int XrdOfsTPCAuth::Add(XrdOfsTPC::Facts &Args)
       {if (aP->Info.cbP)
           {aP->expT = expT;
            aP->Next = authQ; authQ = aP;
-           aP->Info.Reply(SFS_STALL, 0, "", &authMutex);
+           aP->Info.Reply(SFS_OK, 0, "", &authMutex);
            return 1;
           } else {
            authMutex.UnLock();
diff --git a/src/XrdXrootd/XrdXrootdCallBack.cc b/src/XrdXrootd/XrdXrootdCallBack.cc
index aaf300f..6d37572 100644
--- a/src/XrdXrootd/XrdXrootdCallBack.cc
+++ b/src/XrdXrootd/XrdXrootdCallBack.cc
@@ -143,7 +143,7 @@ void XrdXrootdCBJob::DoIt()
 // the client to wait zero seconds. Protocol demands a client retry.
 //
    if (SFS_OK == Result)
-      {if (*(cbFunc->Func()) == 'o') cbFunc->sendResp(eInfo, kXR_wait, 0);
+     {if (*(cbFunc->Func()) == 'o') {int rc = 0; cbFunc->sendResp(eInfo, kXR_wait, &rc);}
           else {if (*(cbFunc->Func()) == 'x') DoStatx(eInfo);
                 cbFunc->sendResp(eInfo, kXR_ok, 0, eInfo->getErrText(),
                                                    eInfo->getErrTextLen());

I believe this fixes the underlying problem: the XrdXrootdCBJob::DoIt function has a special code path for async responses to open, and that path is not used if we return SFS_STALL from XrdOfsTPCAuth::Add. The Invalid message came from the fact that the XrdXrootdCBJob::sendResp call shown above was not building the message properly.
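
To make the branching explicit, here is a simplified, self-contained sketch of the decision described above. It is only a model of the idea, not the actual XRootD code: the names Result, Reply and classify are hypothetical, and the generic (non-SFS_OK) path is deliberately left unmodeled.

```cpp
// Hypothetical stand-ins for the callback result codes; the real
// SFS_* constants live in the XRootD headers and are not reproduced here.
enum class Result { Ok, Stall, Error };

// What the callback job would send back in this simplified model.
enum class Reply {
    Ok,          // plain success reply
    WaitZero,    // "wait 0 seconds", i.e. the client retries at once
    GenericPath  // anything handled outside the special case (not modeled)
};

// func is the first character of the operation name, as in the patch:
// 'o' stands for open. Only a successful result reaches the special
// open branch; a stall result never does.
Reply classify(Result result, char func) {
    if (result != Result::Ok) return Reply::GenericPath;
    return func == 'o' ? Reply::WaitZero : Reply::Ok;
}
```

In this model classify(Result::Stall, 'o') falls through to the generic path, which is why the patch changes the value passed to Reply in XrdOfsTPCAuth::Add from SFS_STALL to SFS_OK.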

Let me know your thoughts on this; if it makes sense, I can push it to master.

Thanks,
Elvin

