Here's my latest mystery: http://glidemon.web.cern.ch/glidemon/show.php?log=http://vocms0109.cern.ch/mon/cms1590/141115_202118_crab3test-4:bbockelm_crab_xrd_multi2/job_out.758.0.txt

This segfaults in Thread 3, inside XrdCl::XRootDMsgHandler::~XRootDMsgHandler. To the best of my knowledge, all the code in this function looks right (and the message handler appears to work correctly). The segfault occurs at "2014-11-15 21:59:57", seemingly just after the File handler at 0xf4ca7180 is called. Curiously, this File was opened at "2014-11-15 21:07:45" (about 50 minutes earlier) with a timeout of 180 seconds.

I believe the file open that started at 21:07:45 ended in error at 21:09:09 on the server ingrid-se03.cism.ucl.ac.be. Note that the log does say:

== CMSSW:  [2014-11-15 21:09:09.459555 +0000][Error  ][XRootDTransport   ] Message 0x2ffe070, stream [2, 0] is a response that we're no longer interested in (timed out)

Here's my theory (although I haven't been able to find the bug in the code!): there's a logic error somewhere such that, when the above log line was emitted, the queued message handler for xrootd-cms.infn.it was not deleted (and its callback was never called!). Then, the next time the client logs into xrootd-cms.infn.it, the wrong message handler fires, leading to a double deletion.
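
To make the theory concrete, here is a toy C++ model of the suspected sequence. It is only a sketch under assumptions: `HandlerState`, `InQueue`, and `simulate` are hypothetical names invented for illustration, not XrdCl classes, and teardown is counted rather than performed so the program stays well-defined. The assumption being modelled is that a timed-out request tears its handler down once, and that a handler left in the queue will fire and tear itself down again if a later message matches its stream ID.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

// Hypothetical stand-in for a message handler; NOT the real XrdCl class.
// Teardown is counted instead of executed, so a "double delete" shows up
// as destroys == 2 rather than as undefined behaviour.
struct HandlerState {
    int callbacks = 0;  // how many times the user callback fired
    int destroys  = 0;  // how many times teardown was attempted
};

// Hypothetical queue of in-flight handlers, keyed by stream ID.
class InQueue {
public:
    void add(std::uint16_t sid, std::shared_ptr<HandlerState> h) {
        pending_[sid] = std::move(h);
    }

    // Drop the handler registered for this stream ID, if there is one.
    bool remove(std::uint16_t sid) { return pending_.erase(sid) > 0; }

    // A message arrived: the matching handler fires, then tears itself down.
    // Returns false when nobody is waiting for this stream ID any more.
    bool dispatch(std::uint16_t sid) {
        auto it = pending_.find(sid);
        if (it == pending_.end()) return false;
        std::shared_ptr<HandlerState> h = it->second;
        pending_.erase(it);
        h->callbacks++;
        h->destroys++;
        return true;
    }

private:
    std::map<std::uint16_t, std::shared_ptr<HandlerState>> pending_;
};

// Replay the suspected sequence. removeOnTimeout == true models the intended
// behaviour; false models the theorized logic error.
HandlerState simulate(bool removeOnTimeout) {
    InQueue queue;
    auto handler = std::make_shared<HandlerState>();
    const std::uint16_t sid = 2;  // arbitrary stream ID for the example
    queue.add(sid, handler);

    // The request times out.
    if (removeOnTimeout) {
        queue.remove(sid);     // intended: dequeue the handler...
        handler->callbacks++;  // ...and report the timeout via the callback
    }
    handler->destroys++;       // the timed-out request is torn down

    // Much later, a message with the same stream ID is delivered.
    queue.dispatch(sid);
    return *handler;
}
```

In this model, `simulate(true)` ends with one callback and one teardown, and the late message finds no handler (which would correspond to the "no longer interested" log line). `simulate(false)` leaves the stale entry in the queue, so the late message fires it and the teardown count reaches two, which is the double deletion described above.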

Thoughts?

---
Reply to this email directly or view it on GitHub:
https://github.com/xrootd/xrootd/issues/163
