I have another case, and it looks like a deadlock triggered by lock-order inversion in the poller implementation.

These are the two threads blocking each other:
 26 Thread 0x41cbc940 (LWP 2364)  0x0000003da6a0d524 in __lll_lock_wait () from /lib64/libpthread.so.0
 21 Thread 0x41dbd940 (LWP 2391)  0x0000003da6a0d524 in __lll_lock_wait () from /lib64/libpthread.so.0

(gdb) thread 26
[Switching to thread 26 (Thread 0x41cbc940 (LWP 2364))]#6  0x000000326f22286c in XrdSys::IOEvents::Poller::CbkTMO (this=0x2aaaaec1d180) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEvents.cc:548
548	        CbkXeq(cP, cP->dlType, 0, 0);
(gdb) where
#0  0x0000003da6a0d524 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003da6a08e35 in _L_lock_1127 () from /lib64/libpthread.so.0
#2  0x0000003da6a08d33 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x000000326f222637 in Lock (this=0x2aaaaec1d180, cP=0x2aaaacc91200, events=<value optimized out>, eNum=0, eTxt=0x0) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.hh:149
#4  Lock (this=0x2aaaaec1d180, cP=0x2aaaacc91200, events=<value optimized out>, eNum=0, eTxt=0x0) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.hh:197
#5  XrdSys::IOEvents::Poller::CbkXeq (this=0x2aaaaec1d180, cP=0x2aaaacc91200, events=<value optimized out>, eNum=0, eTxt=0x0) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEvents.cc:626
#6  0x000000326f22286c in XrdSys::IOEvents::Poller::CbkTMO (this=0x2aaaaec1d180) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEvents.cc:548
#7  0x000000326f2229bb in XrdSys::IOEvents::PollE::Begin (this=0x2aaaaec1d180, syncsem=<value optimized out>, retcode=<value optimized out>, eTxt=<value optimized out>)
    at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEventsPollE.icc:202
#8  0x000000326f221aa4 in XrdSys::IOEvents::BootStrap::Start (parg=0x415652b0) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEvents.cc:110
#9  0x000000326f21fc2f in XrdSysThread_Xeq (myargs=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.cc:86
#10 0x0000003da6a0673d in start_thread () from /lib64/libpthread.so.0
#11 0x0000003da62d44bd in clone () from /lib64/libc.so.6


(gdb) thread 21
[Switching to thread 21 (Thread 0x41dbd940 (LWP 2391))]#0  0x0000003da6a0d524 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) where
#0  0x0000003da6a0d524 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003da6a08e35 in _L_lock_1127 () from /lib64/libpthread.so.0
#2  0x0000003da6a08d33 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x000000326f22211c in Lock (this=0x2aaaaec1d180, cP=0x80) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.hh:149
#4  XrdSysMutexHelper (this=0x2aaaaec1d180, cP=0x80) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.hh:208
#5  XrdSys::IOEvents::Poller::TmoAdd (this=0x2aaaaec1d180, cP=0x80) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEvents.cc:965
#6  0x000000326f22300a in XrdSys::IOEvents::Channel::Enable (this=0x2aaaacc91200, events=<value optimized out>, timeout=1, eText=0x41dbb458) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEvents.cc:373
#7  0x00002aaaad663293 in XrdCl::PollerBuiltIn::EnableWriteNotification (this=<value optimized out>, socket=0x2aaaacf633d0, notify=<value optimized out>, timeout=1)
    at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClPollerBuiltIn.cc:404
#8  0x00002aaaad669423 in EnableUplink (this=0x2aaaacf93280, path=...) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClAsyncSocketHandler.hh:96
#9  XrdCl::Stream::EnableLink (this=0x2aaaacf93280, path=...) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClStream.cc:186
#10 0x00002aaaad669729 in XrdCl::Stream::Send (this=0x2aaaacf93280, msg=0x2aaab001d100, handler=0x2aaab0026888, stateful=true, expires=1366069376)
    at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClStream.cc:273
#11 0x00002aaaad665ede in XrdCl::Channel::Send (this=0x2aaaacf93140, msg=0x2aaab001d100, handler=0x2aaab0026888, stateful=true, expires=1366069376)
    at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClChannel.cc:266
#12 0x00002aaaad66539e in XrdCl::PostMaster::Send (this=<value optimized out>, url=<value optimized out>, msg=0x2aaab001d100, handler=0x2aaab0026888, stateful=true, expires=1366069376)
    at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClPostMaster.cc:169
#13 0x00002aaaad684f10 in XrdCl::MessageUtils::SendMessage (url=..., msg=0x2aaab001d100, handler=<value optimized out>, sendParams=...) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClMessageUtils.cc:109
#14 0x00002aaaad68e836 in XrdCl::FileStateHandler::SendOrQueue (this=0x2aaaae517ac0, url=<value optimized out>, msg=0x2aaab001d100, handler=0x2aaab002c5c0, sendParams=...)
    at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClFileStateHandler.cc:1129
#15 0x00002aaaad690716 in XrdCl::FileStateHandler::Stat (this=0x2aaaae517ac0, force=<value optimized out>, handler=0x41dbbbf0, timeout=0)
    at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClFileStateHandler.cc:496
#16 0x00002aaaad687904 in XrdCl::File::Stat (this=<value optimized out>, force=false, handler=0xffffffffffffffff, timeout=53752) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClFile.cc:103
#17 0x00002aaaad68858e in XrdCl::File::Stat (this=0x2aaaaf0256a0, force=true, response=@0x41dbbd10, timeout=0) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClFile.cc:114
#18 0x00002aaaad2db749 in XrdMqClient::RecvMessage (this=0x2aaaad627130) at /afs/cern.ch/work/a/apeters/eos-master/eos/mq/XrdMqClient.cc:304
#19 0x00002aaaad1a27bd in eos::mgm::Iostat::Receive (this=0x2aaaad626a48) at /afs/cern.ch/work/a/apeters/eos-master/eos/mgm/Iostat.cc:258
#20 0x000000326f21fc2f in XrdSysThread_Xeq (myargs=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.cc:86
#21 0x0000003da6a0673d in start_thread () from /lib64/libpthread.so.0
#22 0x0000003da62d44bd in clone () from /lib64/libc.so.6

Thread 26 is stuck here:
cbkMHelp.Lock(&(cP->chMutex));

Thread 21 is stuck here:
XrdSysMutexHelper mHelper(toMutex);

In thread 26, the owner of chMutex is LWP 2391, i.e. thread 21:
(gdb) print cP->chMutex
$28 = {<XrdSysMutex> = {cs = {__data = {__lock = 2, __count = 1, __owner = 2391, __nusers = 1, __kind = 1, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = "\002\000\000\000\001\000\000\000W\t\000\000\001\000\000\000\001", '\000' <repeats 22 times>, __align = 4294967298}}, <No data fields>}

In thread 21, the owner of toMutex is LWP 2364, i.e. thread 26:
(gdb) print this->toMutex
$29 = {<XrdSysMutex> = {cs = {__data = {__lock = 2, __count = 1, __owner = 2364, __nusers = 1, __kind = 1, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = "\002\000\000\000\001\000\000\000<\t\000\000\001\000\000\000\001", '\000' <repeats 22 times>, __align = 4294967298}}, <No data fields>}


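===> Deadlock between the two threads: thread 26 holds toMutex and waits for chMutex, while thread 21 holds chMutex and waits for toMutex.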
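
For illustration only, here is a minimal sketch of the usual remedy for this kind of inversion: make both paths acquire the two locks through a deadlock-avoiding primitive (or always in the same order). It is not the XRootD code and not a proposed patch. The `Locks` struct and the functions `enablePath`, `timeoutPath` and `runBoth` are hypothetical; only the names `chMutex` and `toMutex` are borrowed from the traces above, and plain `std::mutex` is assumed in place of `XrdSysMutex`.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the two locks seen in the backtraces.
struct Locks {
    std::mutex chMutex;   // plays the role of the per-channel lock
    std::mutex toMutex;   // plays the role of the poller timeout-queue lock
    int enables  = 0;     // work done by the Enable-like path
    int timeouts = 0;     // work done by the timeout-like path
};

// Enable-like path: names the channel lock first, then the timeout lock.
void enablePath(Locks &l) {
    std::unique_lock<std::mutex> ch(l.chMutex, std::defer_lock);
    std::unique_lock<std::mutex> to(l.toMutex, std::defer_lock);
    std::lock(ch, to);    // takes both locks without risk of deadlock
    ++l.enables;
}

// Timeout-like path: names the same two locks in the opposite order.
// With plain lock() calls this would be the inversion from the report;
// std::lock makes the argument order irrelevant.
void timeoutPath(Locks &l) {
    std::unique_lock<std::mutex> to(l.toMutex, std::defer_lock);
    std::unique_lock<std::mutex> ch(l.chMutex, std::defer_lock);
    std::lock(to, ch);
    ++l.timeouts;
}

// Run both paths concurrently and return the total number of completed calls.
int runBoth(int rounds) {
    assert(rounds >= 0);
    Locks l;
    std::thread a([&] { for (int i = 0; i < rounds; ++i) enablePath(l); });
    std::thread b([&] { for (int i = 0; i < rounds; ++i) timeoutPath(l); });
    a.join();
    b.join();
    return l.enables + l.timeouts;
}
```

In the real poller the first lock is already held when the second one is requested several frames deeper, so `std::lock` cannot be dropped in as-is. The equivalent there would be to release one lock before taking the other, or to fix a single acquisition order for chMutex and toMutex on every path.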


---
Reply to this email directly or view it on GitHub:
https://github.com/xrootd/xrootd/issues/4#issuecomment-16439047
