URL:
  <http://savannah.cern.ch/bugs/?99002>

                 Summary: Server still crashes in do_WriteNone (xrootd 3.2.5)
                 Project: XROOTD
            Submitted by: apeters
            Submitted on: 2012-11-22 10:26
                Severity: 3 - Normal
                Priority: 7 - High
                  Status: None
                 Privacy: Public
             Assigned to: None
        Originator Email: 
             Open/Closed: Open
         Discussion Lock: Any
      Fixed by commit(s): 

    _______________________________________________________

Details:

We still observe a crash in the do_WriteNone implementation.

I have found the workflow that triggers it.

It is triggered by the client recovery mechanism (in our case a gridftp
gateway is writing, but it does not matter who writes):

1 the xrootd client opens a file in create/truncate mode
2 the client gets redirected to a server where a write fails
3 the client tries to re-open the file on the disk server; EOS does not allow
re-open of files in create mode and bounces the client to the redirector
4 the client issues its open in append mode to the redirector; the redirector
fails it because the file has been cleaned up in the meantime
5 the client ignores the error and replays the write request, this time
towards the redirector
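
The last two steps can be modelled with a small toy in C++. This is only a
sketch to illustrate the sequence: ToyRedirector, Status, replayRecovery and
the path "/eos/file" are hypothetical names, not xrootd or EOS code. It shows
the behaviour one would hope for, where a write that follows a failed open is
answered with an error.

```cpp
#include <optional>
#include <set>
#include <string>

// Hypothetical toy model; none of these names come from xrootd or EOS.
enum class Status { Ok, NotFound, FileNotOpen };

struct ToyRedirector {
    std::set<std::string> files;        // files currently in the namespace
    std::optional<std::string> opened;  // file opened on this connection, if any

    // Step 4: an open in append mode fails once the file has been cleaned up.
    Status openAppend(const std::string& path) {
        if (files.count(path) == 0) return Status::NotFound;
        opened = path;
        return Status::Ok;
    }

    // Step 5: a write that arrives without a successful open is rejected
    // with an error instead of being processed.
    Status write(const std::string& /*data*/) {
        if (!opened) return Status::FileNotOpen;
        return Status::Ok;
    }
};

// Replays steps 4 and 5 and returns the status of the final write.
Status replayRecovery() {
    ToyRedirector redirector;                  // empty namespace: file was cleaned up
    (void)redirector.openAppend("/eos/file");  // fails; the client ignores the error
    return redirector.write("payload");        // replayed write, no file is open
}
```

In this model the replayed write ends in Status::FileNotOpen, which is the
kind of reply the real server would be expected to give instead of a SEGV.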

Although the client does some 'crappy' stuff, the server probably should not
SEGV.

Here is the stack trace:


Core was generated by `/usr//bin/xrootd -n mgm -c /etc/xrd.cf.mgm -m -l
/var/log/eos/xrdlog.mgm -b -Rd'.
Program terminated with signal 11, Segmentation fault.
#0  getErrText (this=0x1ab24540, rc=931135488, opC=0 '\000', myError=...,
Path=0x0) at /usr/src/debug/xrootd/xrootd/src/XrdOuc/XrdOucErrInfo.hh:90
90	                                   {ecode = ErrInfo.code; 
(gdb) where
#0  getErrText (this=0x1ab24540, rc=931135488, opC=0 '\000', myError=...,
Path=0x0) at /usr/src/debug/xrootd/xrootd/src/XrdOuc/XrdOucErrInfo.hh:90
#1  XrdXrootdProtocol::fsError (this=0x1ab24540, rc=931135488, opC=0 '\000',
myError=..., Path=0x0) at
/usr/src/debug/xrootd/xrootd/src/XrdXrootd/XrdXrootdXeq.cc:2367
#2  0x000000000041dfd5 in XrdXrootdProtocol::do_Write (this=0x1ab24540) at
/usr/src/debug/xrootd/xrootd/src/XrdXrootd/XrdXrootdXeq.cc:2244
#3  0x00000031af046b80 in XrdLink::DoIt (this=0x2aaac1dee4c8) at
/usr/src/debug/xrootd/xrootd/src/Xrd/XrdLink.cc:421
#4  0x00000031af04ae86 in XrdScheduler::Run (this=0x31af60a8c0) at
/usr/src/debug/xrootd/xrootd/src/Xrd/XrdScheduler.cc:287
#5  0x00000031af04b019 in XrdStartWorking (carg=0x1ab24540) at
/usr/src/debug/xrootd/xrootd/src/Xrd/XrdScheduler.cc:65
#6  0x00000031af01bfaf in XrdSysThread_Xeq (myargs=<value optimized out>) at
/usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.cc:67
#7  0x00000032a7e0677d in start_thread () from /lib64/libpthread.so.0
#8  0x00000032a76d3c1d in clone () from /lib64/libc.so.6

There is something weird here:
(gdb) frame 2
#2  0x000000000041dfd5 in XrdXrootdProtocol::do_Write (this=0x1ab24540) at
/usr/src/debug/xrootd/xrootd/src/XrdXrootd/XrdXrootdXeq.cc:2244
2244	                       return do_WriteNone();
(gdb) print this->FTab
$1 = (XrdXrootdFileTable *) 0x0
(gdb) print this->myFile
$2 = (XrdXrootdFile *) 0x2aab305650c0

If FTab is 0, do_WriteNone should never be called from line 2244.
So FTab must have been != 0 at that point and was re-initialized (or something
similar) afterwards ... hmmm ...
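
The state seen in gdb (no file table, but a non-null file pointer) can be
reproduced in a toy model. The sketch below is not the xrootd code and not a
proposed patch: ToyFile, ToyFileTable and ToyProtocol are hypothetical
stand-ins, with fileTable playing the role of FTab and currentFile the role of
myFile. It shows one defensive pattern, assuming the file pointer left over
from an earlier request can be stale: resolve the file from the table on every
request and drop the cached pointer when the lookup fails.

```cpp
// Hypothetical stand-ins; these are not the real xrootd classes.
struct ToyFile {
    int lastError = 0;
};

struct ToyFileTable {
    ToyFile slot;  // single open file, reachable through handle 0
    ToyFile* get(int handle) { return handle == 0 ? &slot : nullptr; }
};

struct ToyProtocol {
    ToyFileTable* fileTable = nullptr;  // plays the role of FTab
    ToyFile* currentFile = nullptr;     // plays the role of myFile (may be stale)

    // Returns 0 on success, -1 when the write does not refer to an open file.
    int write(int handle) {
        // Look the file up again for this request; never trust the cached pointer.
        ToyFile* f = fileTable ? fileTable->get(handle) : nullptr;
        currentFile = f;    // clears a stale pointer when the lookup fails
        if (!f) return -1;  // error path that does not touch any file object
        return 0;
    }
};
```

With fileTable set to nullptr and currentFile pointing at an old object, write()
returns -1 and resets currentFile, so no later error handling can dereference
the stale pointer.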




