The process I'm looking at has a thread in D state (530), which makes it cumbersome to take a stack trace. I took a stack trace of each thread that is not in D state [1], and the general pattern seems to be a call from `XrdCl::PostMaster` that waits for a lock. The `XrdSysRWLock` as seen from 12625 (and from all the other threads) seems to indicate that 576 holds the lock for writing (`__writer = 576`)?

```
[root@b7s06p7796 ~]# /opt/rh/devtoolset-8/root/bin/gdb -p 12625
(gdb) bt
#0 0x00007fe4bf918184 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#1 0x00007fe4c38e3e45 in XrdSysRWLock::ReadLock (this=0x7fe49e0692c8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:419
#2 XrdSysRWLockHelper::XrdSysRWLockHelper (rd=true, l=..., this=<synthetic pointer>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:419
#3 XrdCl::PostMaster::Send (this=this@entry=0x7fe4bb00e2a0, url=..., msg=msg@entry=0x7fe47f428040, handler=handler@entry=0x7fe488c1a900, stateful=<optimized out>, expires=1674182914) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClPostMaster.cc:220
#4 0x00007fe4c391b794 in XrdCl::MessageUtils::SendMessage (url=..., msg=msg@entry=0x7fe47f428040, handler=<optimized out>, sendParams=..., lFileHandler=lFileHandler@entry=0x0) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClMessageUtils.cc:127
#5 0x00007fe4c39088b9 in XrdCl::FileSystemData::Send (fs=std::shared_ptr<XrdCl::FileSystemData> (use count 3, weak count 0) = {...}, msg=msg@entry=0x7fe47f428040, handler=<optimized out>, handler@entry=0x7fe4663fcc80, params=...)
    at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClFileSystem.cc:956
#6 0x00007fe4c38ff7c7 in XrdCl::FileSystem::Query (this=0x7fe488c0f040, queryCode=XrdCl::QueryCode::OpaqueFile, arg=..., handler=0x7fe4663fcc80, timeout=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClFileSystem.cc:1264
#7 0x00007fe4c38ff930 in XrdCl::FileSystem::Query (this=this@entry=0x7fe488c0f040, queryCode=queryCode@entry=XrdCl::QueryCode::OpaqueFile, arg=..., response=@0x7fe4663fce38: 0x0, timeout=timeout@entry=2) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClFileSystem.cc:1276
#8 0x0000000000544527 in backend::Query (this=0xacc7d8 <EosFuse::instance()::i+1624>, url=..., query_code=<optimized out>, arg=..., response=@0x7fe4663fce38: 0x0, rtimeout=2, noretry=true) at /builddir/build/BUILD/eos-5.1.5-1/fusex/backend/backend.cc:1140
#9 0x0000000000546e86 in backend::statvfs (this=0xacc7d8 <EosFuse::instance()::i+1624>, req=req@entry=0x7fe488c9c080, stbuf=stbuf@entry=0x7fe4663fd430) at /builddir/build/BUILD/eos-5.1.5-1/fusex/backend/backend.cc:989
#10 0x00000000004c9bec in metad::statvfs (this=<optimized out>, req=req@entry=0x7fe488c9c080, svfs=svfs@entry=0x7fe4663fd430) at /builddir/build/BUILD/eos-5.1.5-1/fusex/md/md.cc:1702
#11 0x000000000045913b in EosFuse::statfs (req=0x7fe488c9c080, ino=1) at /builddir/build/BUILD/eos-5.1.5-1/fusex/eosfuse.hh:205
#12 0x00007fe4c22ab73b in do_statfs () from /lib64/libfuse.so.2
#13 0x00007fe4c22aab6b in fuse_ll_process_buf () from /lib64/libfuse.so.2
#14 0x00007fe4c22a7401 in fuse_do_work () from /lib64/libfuse.so.2
#15 0x00007fe4bf914ea5 in start_thread () from /lib64/libpthread.so.0
#16 0x00007fe4bf63db0d in clone () from /lib64/libc.so.6
(gdb) f 1
#1 0x00007fe4c38e3e45 in XrdSysRWLock::ReadLock (this=0x7fe49e0692c8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:419
(gdb) p (XrdSysRWLock) *0x7fe49e0692c8
$5 = {lock = {__data = {__lock = 0, __nr_readers = 0, __readers_wakeup = 1415, __writer_wakeup = 2996, __nr_readers_queued = 37,
    __nr_writers_queued = 2, __writer = 576, __shared = 0, __pad1 = 0, __pad2 = 0, __flags = 0}, __size = "\000\000\000\000\000\000\000\000\207\005\000\000\264\v\000\000%\000\000\000\002\000\000\000@\002", '\000' <repeats 29 times>, __align = 0}}
```

```
[root@b7s06p7796 ~]# /opt/rh/devtoolset-8/root/bin/gdb -p 576
Missing separate debuginfos, use: debuginfo-install eos-fusex-core-5.1.5-1.el7.cern.x86_64
(gdb) bt
#0 0x00007fe4bf91bf40 in __pause_nocancel () from /lib64/libpthread.so.0
#1 0x00007fe4bf912bcc in __pthread_mutex_lock_full () from /lib64/libpthread.so.0
#2 0x00007fe4c38e5a7f in XrdSysMutex::Lock (this=0x7fe49480f1d8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:281
#3 XrdSysMutexHelper::XrdSysMutexHelper (mutex=..., this=<synthetic pointer>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:281
#4 XrdCl::TickGeneratorTask::Invalidate (this=0x7fe49480f1c0) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClChannel.cc:72
#5 XrdCl::Channel::~Channel (this=0x7fe49643a140, __in_chrg=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClChannel.cc:134
#6 0x00007fe4c38e28bd in XrdCl::PostMaster::ForceDisconnect (this=0x7fe4bb00e2a0, url=...)
    at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClPostMaster.cc:317
#7 0x00007fe4c38ebc6b in XrdCl::Stream::OnReadTimeout (this=0x7fe49647d1c0, substream=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClStream.cc:1064
#8 0x00007fe4c396887d in XrdCl::AsyncSocketHandler::OnReadTimeout (this=this@entry=0x7fe49646f300) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClAsyncSocketHandler.cc:690
#9 0x00007fe4c396aff7 in XrdCl::AsyncSocketHandler::Event (this=0x7fe49646f300, type=2 '\002') at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClAsyncSocketHandler.cc:239
#10 0x00007fe4c38e0368 in (anonymous namespace)::SocketCallBack::Event (this=0x7fe496415060, chP=<optimized out>, cbArg=<optimized out>, evFlags=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClPollerBuiltIn.cc:83
#11 0x00007fe4c34f37db in XrdSys::IOEvents::Poller::CbkXeq (this=0x7fe49e0c43a0, cP=0x7fe48746d420, events=2, eNum=<optimized out>, eTxt=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEvents.cc:721
#12 0x00007fe4c34f3d9e in XrdSys::IOEvents::Poller::CbkTMO (this=0x7fe49e0c43a0) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEvents.cc:626
#13 0x00007fe4c34f4ca1 in XrdSys::IOEvents::PollE::Begin (syncsem=<optimized out>, retcode=<optimized out>, eTxt=<optimized out>, this=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEventsPollE.icc:216
#14 XrdSys::IOEvents::PollE::Begin (this=0x7fe49e0c43a0, syncsem=<optimized out>, retcode=<optimized out>, eTxt=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEventsPollE.icc:196
#15 0x00007fe4c34f13bd in XrdSys::IOEvents::BootStrap::Start (parg=0x7fe49effd980) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEvents.cc:149
#16 0x00007fe4c34f9e27 in XrdSysThread_Xeq (myargs=0x7fe49e013a20) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.cc:86
#17 0x00007fe4bf914ea5 in start_thread () from /lib64/libpthread.so.0
#18 0x00007fe4bf63db0d in clone () from /lib64/libc.so.6
(gdb) f 2
#2 0x00007fe4c38e5a7f in XrdSysMutex::Lock (this=0x7fe49480f1d8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:281
281 {mutex.Lock();
(gdb) p (XrdSysMutex) *0x7fe49480f1d8
$1 = {cs = {__data = {__lock = -1052688063, __count = 1665941825, __owner = 909144128, __nusers = 926298934, __kind = 964243505, __spins = 13875, __elision = 12592, __list = {__prev = 0x68632e6e7265632e, __next = 0x333231313a}}, __size = "AAA\[log in to unmask]:1123\000\000", __align = 7155165658655834433}}
(gdb) f 3
#3 XrdSysMutexHelper::XrdSysMutexHelper (mutex=..., this=<synthetic pointer>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:281
281 {mutex.Lock();
(gdb) p mutex
$2 = (XrdSysMutex &) @0x7fe49480f1d8: {cs = {__data = {__lock = -1052688063, __count = 1665941825, __owner = 909144128, __nusers = 926298934, __kind = 964243505, __spins = 13875, __elision = 12592, __list = {__prev = 0x68632e6e7265632e, __next = 0x333231313a}}, __size = "AAA\[log in to unmask]:1123\000\000", __align = 7155165658655834433}}
(gdb) f 5
#5 XrdCl::Channel::~Channel (this=0x7fe49643a140, __in_chrg=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClChannel.cc:134
134 pTickGenerator->Invalidate();
(gdb) p (XrdCl::Channel) *0x7fe49643a140
$3 = {pUrl = {pHostId = "[log in to unmask]:1094", pProtocol = "root", pUserName = "AAAAAALU", pPassword = "", pHostName = "eoshome-f.cern.ch", pPort = 1094, pPath = "", pParams = std::map with 19 elements = {["eos.app"] = "fuse::bi", ["fuse.exe"] = "/usr/bin/python3.6", ["fuse.gid"] = "2821", ["fuse.pid"] = "26766", ["fuse.uid"] = "115670", ["fuse.v"] = "3", ["fuse.ver"] = "5.1.5", ["mgm.child"] = "/#curl#categorical.cpython-36.pyc", ["mgm.cid"] = "115670:2821:[log in to unmask]:home-f", ["mgm.clock"] = "0", ["mgm.cmd"] = "fuseX", ["mgm.inode"] = "04854728", ["mgm.op"] = "GET", ["mgm.pcmd"] = "getfusex", ["mgm.uuid"] = "eba881d6-97aa-11ed-8234-3cecef5d2b8e", ["xrd.k5ccname"] = "/pool/condor/dir_26688/ffedship.cc", ["xrd.wantprot"] = "krb5,unix",
    ["xrdcl.secgid"] = "2821", ["xrdcl.secuid"] = "115670"}, pURL = "root:[log in to unmask]:1094/?eos.app=fuse::bi&fuse.exe=/usr/bin/python3.6&fuse.gid=2821&fuse.pid=26766&fuse.uid=115670&fuse.v=3&fuse.ver=5.1.5&mgm.child=/#curl#categorical.cpython-36.pyc&mg"...}, pPoller = 0x7fe49e09f240, pTransport = 0x7fe49e0d1040, pTaskManager = 0x7fe49e09f180, pStream = 0x7fe49647d1c0, pMutex = {cs = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, pChannelData = {pHolder = 0x7fe496418070, pTypeInfo = 0x7fe4c3bf6298 <typeinfo for XrdCl::XRootDChannelInfo*>, pOwn = true}, pIncoming = {pHandlers = std::map with 0 elements, pMutex = {<XrdSysMutex> = {cs = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}}, <No data fields>}}, pTickGenerator = 0x7fe49480f1c0, pJobManager = 0x7fe49e0c4100}
```

[1] [stacktrace.crafted.gdb.txt](https://github.com/xrootd/xrootd/files/10466702/stacktrace.crafted.gdb.txt)

```
Thread 1 (Thread 0x7fe49d3fe700 (LWP 572)):
#0 0x00007fe4bf91839e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
#1 0x00007fe4c38e285c in XrdSysRWLock::WriteLock (this=0x7fe49e0692c8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:420
#2 XrdSysRWLockHelper::XrdSysRWLockHelper (rd=false, l=..., this=<synthetic pointer>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:420
#3 XrdCl::PostMaster::ForceDisconnect (this=0x7fe4bb00e2a0, url=...)
    at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClPostMaster.cc:309
#4 0x00007fe4c38ebc6b in XrdCl::Stream::OnReadTimeout (this=0x7fe49647dc40, substream=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClStream.cc:1064
#5 0x00007fe4c396887d in XrdCl::AsyncSocketHandler::OnReadTimeout (this=this@entry=0x7fe496470200) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClAsyncSocketHandler.cc:690
#6 0x00007fe4c396aff7 in XrdCl::AsyncSocketHandler::Event (this=0x7fe496470200, type=2 '\002') at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClAsyncSocketHandler.cc:239
#7 0x00007fe4c38e0368 in (anonymous namespace)::SocketCallBack::Event (this=0x7fe496415560, chP=<optimized out>, cbArg=<optimized out>, evFlags=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClPollerBuiltIn.cc:83
#8 0x00007fe4c34f37db in XrdSys::IOEvents::Poller::CbkXeq (this=0x7fe49e0c42c0, cP=0x7fe478c39f20, events=2, eNum=<optimized out>, eTxt=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEvents.cc:721
#9 0x00007fe4c34f3d9e in XrdSys::IOEvents::Poller::CbkTMO (this=0x7fe49e0c42c0) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEvents.cc:626
#10 0x00007fe4c34f4ca1 in XrdSys::IOEvents::PollE::Begin (syncsem=<optimized out>, retcode=<optimized out>, eTxt=<optimized out>, this=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEventsPollE.icc:216
#11 XrdSys::IOEvents::PollE::Begin (this=0x7fe49e0c42c0, syncsem=<optimized out>, retcode=<optimized out>, eTxt=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEventsPollE.icc:196
#12 0x00007fe4c34f13bd in XrdSys::IOEvents::BootStrap::Start (parg=0x7fe49effd980) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEvents.cc:149
#13 0x00007fe4c34f9e27 in XrdSysThread_Xeq (myargs=0x7fe49e0139e0) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.cc:86
#14 0x00007fe4bf914ea5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007fe4bf63db0d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7fe49c3fc700 (LWP 578)):
#0 0x00007fe4bf918184 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#1 0x00007fe4c38e3e45 in XrdSysRWLock::ReadLock (this=0x7fe49e0692c8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:419
#2 XrdSysRWLockHelper::XrdSysRWLockHelper (rd=true, l=..., this=<synthetic pointer>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:419
#3 XrdCl::PostMaster::Send (this=0x7fe4bb00e2a0, url=..., msg=0x7fe4931040c0, handler=handler@entry=0x7fe4837f9700, stateful=stateful@entry=true, expires=1674173233) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClPostMaster.cc:220
#4 0x00007fe4c3914567 in XrdCl::XRootDMsgHandler::RetryAtServer (this=0x7fe4837f9700, url=..., entryType=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClXRootDMsgHandler.cc:2169
#5 0x00007fe4c3914b05 in XrdCl::XRootDMsgHandler::WaitDone (this=0x7fe4837f9700) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClXRootDMsgHandler.cc:1120
#6 0x00007fe4c3914bed in (anonymous namespace)::WaitTask::Run (this=<optimized out>, now=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClXRootDMsgHandler.cc:69
#7 0x00007fe4c38fb3ac in XrdCl::TaskManager::RunTasks (this=0x7fe49e09f180) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClTaskManager.cc:222
#8 0x00007fe4c38fb509 in RunRunnerThread (arg=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClTaskManager.cc:38
#9 0x00007fe4bf914ea5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007fe4c63db0d in clone () from /lib64/libc.so.6
```
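
The `__writer = 576` reading from the first gdb session can be cross-checked against the raw `__size` bytes that gdb printed for the same lock. The following is only a minimal sketch: it assumes a little-endian x86_64 host and that the fields gdb lists under `__data` are eight consecutive 32-bit integers in that order. The helper names `decode_rwlock` and `le_bytes` are made up for illustration. The second helper does the same for the mutex at 0x7fe49480f1d8 that 576 is blocked on, whose `__size` already prints as text rather than as small integers.

```python
import struct

# Leading bytes of the pthread_rwlock_t at 0x7fe49e0692c8, transcribed from the
# `__size` string in the gdb output (octal escapes converted to hex; '%' is
# 0x25 and '@' is 0x40).
RAW = (
    b"\x00" * 8               # "\000" repeated 8 times
    + b"\x87\x05\x00\x00"     # "\207\005\000\000"
    + b"\xb4\x0b\x00\x00"     # "\264\v\000\000"
    + b"%\x00\x00\x00"
    + b"\x02\x00\x00\x00"
    + b"@\x02"
    + b"\x00" * 29            # '\000' <repeats 29 times>
)

# Field order as gdb printed it under `__data`. ASSUMPTION: each field is a
# 32-bit integer and they are laid out back to back, little-endian.
FIELDS = (
    "__lock", "__nr_readers", "__readers_wakeup", "__writer_wakeup",
    "__nr_readers_queued", "__nr_writers_queued", "__writer", "__shared",
)


def decode_rwlock(raw):
    """Unpack the first 32 bytes of the lock into a {field: value} dict."""
    values = struct.unpack_from("<iIIIIIii", raw)
    return dict(zip(FIELDS, values))


def le_bytes(value, size, signed=False):
    """Return the little-endian byte representation of an integer field."""
    return value.to_bytes(size, "little", signed=signed)


if __name__ == "__main__":
    for name, value in decode_rwlock(RAW).items():
        print(f"{name:22s} {value}")
    # Fields of the mutex at 0x7fe49480f1d8, values copied from the gdb output.
    print("__lock :", le_bytes(-1052688063, 4, signed=True))
    print("__prev :", le_bytes(0x68632E6E7265632E, 8))
    print("__next :", le_bytes(0x333231313A, 8))
```

If the layout assumption holds, the decoded `__writer` and queue counters should agree with what gdb printed under `__data`, and the mutex fields come out as printable characters, which matches the text-like `__size` string gdb shows for that mutex.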