The process I'm looking at has a thread in D state (530), which makes it cumbersome to take stack traces.
I took a stack trace of each thread that is not in D state [1], and the general pattern seems to be a call from XrdCl::PostMaster that waits for a lock.

The XrdSysRWLock as seen from 12625 (and from all the other threads) seems to indicate that 576 holds the write lock?

```
[root@b7s06p7796 ~]# /opt/rh/devtoolset-8/root/bin/gdb -p 12625
(gdb) bt
#0  0x00007fe4bf918184 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#1  0x00007fe4c38e3e45 in XrdSysRWLock::ReadLock (this=0x7fe49e0692c8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:419
#2  XrdSysRWLockHelper::XrdSysRWLockHelper (rd=true, l=..., this=<synthetic pointer>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:419
#3  XrdCl::PostMaster::Send (this=this@entry=0x7fe4bb00e2a0, url=..., msg=msg@entry=0x7fe47f428040, handler=handler@entry=0x7fe488c1a900, stateful=<optimized out>, 
    expires=1674182914) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClPostMaster.cc:220
#4  0x00007fe4c391b794 in XrdCl::MessageUtils::SendMessage (url=..., msg=msg@entry=0x7fe47f428040, handler=<optimized out>, sendParams=..., lFileHandler=lFileHandler@entry=0x0)
    at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClMessageUtils.cc:127
#5  0x00007fe4c39088b9 in XrdCl::FileSystemData::Send (fs=std::shared_ptr<XrdCl::FileSystemData> (use count 3, weak count 0) = {...}, msg=msg@entry=0x7fe47f428040, 
    handler=<optimized out>, handler@entry=0x7fe4663fcc80, params=...) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClFileSystem.cc:956
#6  0x00007fe4c38ff7c7 in XrdCl::FileSystem::Query (this=0x7fe488c0f040, queryCode=XrdCl::QueryCode::OpaqueFile, arg=..., handler=0x7fe4663fcc80, timeout=<optimized out>)
    at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClFileSystem.cc:1264
#7  0x00007fe4c38ff930 in XrdCl::FileSystem::Query (this=this@entry=0x7fe488c0f040, queryCode=queryCode@entry=XrdCl::QueryCode::OpaqueFile, arg=..., 
    response=@0x7fe4663fce38: 0x0, timeout=timeout@entry=2) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClFileSystem.cc:1276
#8  0x0000000000544527 in backend::Query (this=0xacc7d8 <EosFuse::instance()::i+1624>, url=..., query_code=<optimized out>, arg=..., response=@0x7fe4663fce38: 0x0, rtimeout=2, 
    noretry=true) at /builddir/build/BUILD/eos-5.1.5-1/fusex/backend/backend.cc:1140
#9  0x0000000000546e86 in backend::statvfs (this=0xacc7d8 <EosFuse::instance()::i+1624>, req=req@entry=0x7fe488c9c080, stbuf=stbuf@entry=0x7fe4663fd430)
    at /builddir/build/BUILD/eos-5.1.5-1/fusex/backend/backend.cc:989
#10 0x00000000004c9bec in metad::statvfs (this=<optimized out>, req=req@entry=0x7fe488c9c080, svfs=svfs@entry=0x7fe4663fd430)
    at /builddir/build/BUILD/eos-5.1.5-1/fusex/md/md.cc:1702
#11 0x000000000045913b in EosFuse::statfs (req=0x7fe488c9c080, ino=1) at /builddir/build/BUILD/eos-5.1.5-1/fusex/eosfuse.hh:205
#12 0x00007fe4c22ab73b in do_statfs () from /lib64/libfuse.so.2
#13 0x00007fe4c22aab6b in fuse_ll_process_buf () from /lib64/libfuse.so.2
#14 0x00007fe4c22a7401 in fuse_do_work () from /lib64/libfuse.so.2
#15 0x00007fe4bf914ea5 in start_thread () from /lib64/libpthread.so.0
#16 0x00007fe4bf63db0d in clone () from /lib64/libc.so.6
(gdb) f 1 
#1  0x00007fe4c38e3e45 in XrdSysRWLock::ReadLock (this=0x7fe49e0692c8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:419
(gdb) p (XrdSysRWLock) *0x7fe49e0692c8
$5 = {lock = {__data = {__lock = 0, __nr_readers = 0, __readers_wakeup = 1415, __writer_wakeup = 2996, __nr_readers_queued = 37, __nr_writers_queued = 2, __writer = 576, __shared = 0, __pad1 = 0, __pad2 = 0, __flags = 0}, 
    __size = "\000\000\000\000\000\000\000\000\207\005\000\000\264\v\000\000%\000\000\000\002\000\000\000@\002", '\000' <repeats 29 times>, __align = 0}}
```
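As a cross-check of the `__writer = 576` reading, the raw `__size` bytes from the dump above can be decoded by hand. This is only a sketch: the field order is copied from the gdb output, while the 32-bit little-endian field widths are an assumption about this glibc build, and the variable names are made up for the example.

```python
import struct

# `__size` exactly as gdb printed it (Python accepts the same octal escapes),
# followed by the 29 NUL bytes gdb summarised as '\000' <repeats 29 times>.
raw = (b"\000\000\000\000\000\000\000\000\207\005\000\000\264\v\000\000"
       b"%\000\000\000\002\000\000\000@\002") + b"\000" * 29

# Field order taken from the gdb output; 32-bit little-endian widths are assumed.
names = ("__lock", "__nr_readers", "__readers_wakeup", "__writer_wakeup",
         "__nr_readers_queued", "__nr_writers_queued", "__writer", "__shared")
fields = dict(zip(names, struct.unpack_from("<iIIIIIii", raw)))

for name, value in fields.items():
    print(f"{name:22} {value}")
```

The decoded values agree with what gdb shows: no active readers, 37 readers and 2 writers queued, and `__writer` equal to 576, which as far as I understand is the thread ID of the current write-lock holder in this glibc generation.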

```
[root@b7s06p7796 ~]# /opt/rh/devtoolset-8/root/bin/gdb -p 576
Missing separate debuginfos, use: debuginfo-install eos-fusex-core-5.1.5-1.el7.cern.x86_64
(gdb) bt
#0  0x00007fe4bf91bf40 in __pause_nocancel () from /lib64/libpthread.so.0
#1  0x00007fe4bf912bcc in __pthread_mutex_lock_full () from /lib64/libpthread.so.0
#2  0x00007fe4c38e5a7f in XrdSysMutex::Lock (this=0x7fe49480f1d8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:281
#3  XrdSysMutexHelper::XrdSysMutexHelper (mutex=..., this=<synthetic pointer>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:281
#4  XrdCl::TickGeneratorTask::Invalidate (this=0x7fe49480f1c0) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClChannel.cc:72
#5  XrdCl::Channel::~Channel (this=0x7fe49643a140, __in_chrg=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClChannel.cc:134
#6  0x00007fe4c38e28bd in XrdCl::PostMaster::ForceDisconnect (this=0x7fe4bb00e2a0, url=...) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClPostMaster.cc:317
#7  0x00007fe4c38ebc6b in XrdCl::Stream::OnReadTimeout (this=0x7fe49647d1c0, substream=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClStream.cc:1064
#8  0x00007fe4c396887d in XrdCl::AsyncSocketHandler::OnReadTimeout (this=this@entry=0x7fe49646f300) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClAsyncSocketHandler.cc:690
#9  0x00007fe4c396aff7 in XrdCl::AsyncSocketHandler::Event (this=0x7fe49646f300, type=2 '\002') at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClAsyncSocketHandler.cc:239
#10 0x00007fe4c38e0368 in (anonymous namespace)::SocketCallBack::Event (this=0x7fe496415060, chP=<optimized out>, cbArg=<optimized out>, evFlags=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClPollerBuiltIn.cc:83
#11 0x00007fe4c34f37db in XrdSys::IOEvents::Poller::CbkXeq (this=0x7fe49e0c43a0, cP=0x7fe48746d420, events=2, eNum=<optimized out>, eTxt=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEvents.cc:721
#12 0x00007fe4c34f3d9e in XrdSys::IOEvents::Poller::CbkTMO (this=0x7fe49e0c43a0) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEvents.cc:626
#13 0x00007fe4c34f4ca1 in XrdSys::IOEvents::PollE::Begin (syncsem=<optimized out>, retcode=<optimized out>, eTxt=<optimized out>, this=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEventsPollE.icc:216
#14 XrdSys::IOEvents::PollE::Begin (this=0x7fe49e0c43a0, syncsem=<optimized out>, retcode=<optimized out>, eTxt=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEventsPollE.icc:196
#15 0x00007fe4c34f13bd in XrdSys::IOEvents::BootStrap::Start (parg=0x7fe49effd980) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEvents.cc:149
#16 0x00007fe4c34f9e27 in XrdSysThread_Xeq (myargs=0x7fe49e013a20) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.cc:86
#17 0x00007fe4bf914ea5 in start_thread () from /lib64/libpthread.so.0
#18 0x00007fe4bf63db0d in clone () from /lib64/libc.so.6
(gdb) f 2
#2  0x00007fe4c38e5a7f in XrdSysMutex::Lock (this=0x7fe49480f1d8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:281
281                     {mutex.Lock();
(gdb) p (XrdSysMutex) *0x7fe49480f1d8
$1 = {cs = {__data = {__lock = -1052688063, __count = 1665941825, __owner = 909144128, __nusers = 926298934, __kind = 964243505, __spins = 13875, __elision = 12592, __list = {__prev = 0x68632e6e7265632e, __next = 0x333231313a}}, __size = "AAA\[log in to unmask]:1123\000\000", 
    __align = 7155165658655834433}}
(gdb) f 3
#3  XrdSysMutexHelper::XrdSysMutexHelper (mutex=..., this=<synthetic pointer>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:281
281                     {mutex.Lock();
(gdb) p mutex
$2 = (XrdSysMutex &) @0x7fe49480f1d8: {cs = {__data = {__lock = -1052688063, __count = 1665941825, __owner = 909144128, __nusers = 926298934, __kind = 964243505, __spins = 13875, __elision = 12592, __list = {__prev = 0x68632e6e7265632e, __next = 0x333231313a}}, 
    __size = "AAA\[log in to unmask]:1123\000\000", __align = 7155165658655834433}}
(gdb) f 5
#5  XrdCl::Channel::~Channel (this=0x7fe49643a140, __in_chrg=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClChannel.cc:134
134        pTickGenerator->Invalidate();
(gdb) p (XrdCl::Channel) *0x7fe49643a140
$3 = {pUrl = {pHostId = "[log in to unmask]:1094", pProtocol = "root", pUserName = "AAAAAALU", pPassword = "", pHostName = "eoshome-f.cern.ch", pPort = 1094, pPath = "", pParams = std::map with 19 elements = {["eos.app"] = "fuse::bi", ["fuse.exe"] = "/usr/bin/python3.6", ["fuse.gid"] = "2821", 
      ["fuse.pid"] = "26766", ["fuse.uid"] = "115670", ["fuse.v"] = "3", ["fuse.ver"] = "5.1.5", ["mgm.child"] = "/#curl#categorical.cpython-36.pyc", ["mgm.cid"] = "115670:2821:[log in to unmask]:home-f", ["mgm.clock"] = "0", ["mgm.cmd"] = "fuseX", ["mgm.inode"] = "04854728", ["mgm.op"] = "GET", 
      ["mgm.pcmd"] = "getfusex", ["mgm.uuid"] = "eba881d6-97aa-11ed-8234-3cecef5d2b8e", ["xrd.k5ccname"] = "/pool/condor/dir_26688/ffedship.cc", ["xrd.wantprot"] = "krb5,unix", ["xrdcl.secgid"] = "2821", ["xrdcl.secuid"] = "115670"}, 
    pURL = "root:[log in to unmask]:1094/?eos.app=fuse::bi&fuse.exe=/usr/bin/python3.6&fuse.gid=2821&fuse.pid=26766&fuse.uid=115670&fuse.v=3&fuse.ver=5.1.5&mgm.child=/#curl#categorical.cpython-36.pyc&mg"...}, pPoller = 0x7fe49e09f240, pTransport = 0x7fe49e0d1040, pTaskManager = 0x7fe49e09f180, 
  pStream = 0x7fe49647d1c0, pMutex = {cs = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, pChannelData = {pHolder = 0x7fe496418070, 
    pTypeInfo = 0x7fe4c3bf6298 <typeinfo for XrdCl::XRootDChannelInfo*>, pOwn = true}, pIncoming = {pHandlers = std::map with 0 elements, pMutex = {<XrdSysMutex> = {cs = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
          __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}}, <No data fields>}}, pTickGenerator = 0x7fe49480f1c0, pJobManager = 0x7fe49e0c4100}
```
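The mutex at 0x7fe49480f1d8 does not look like a live pthread mutex: its `__size` view contains readable text. As a hedged cross-check, the sketch below reinterprets the two `__list` pointers from the dump above as the bytes they occupy in memory (little-endian x86_64 is assumed; `as_ascii` is a helper name invented for this example).

```python
def as_ascii(value, width=8):
    """Return the little-endian bytes an integer occupies, minus trailing NULs."""
    return value.to_bytes(width, "little").rstrip(b"\x00")

# Values copied from the `__list` member in the gdb output
prev_text = as_ascii(0x68632e6e7265632e)
next_text = as_ascii(0x333231313a)

print(prev_text, next_text)
```

Both values decode to printable text that matches the tail of the `__size` string, so the memory that `TickGeneratorTask::Invalidate` tries to lock seems to hold string data rather than mutex state. That is only an inference from this dump; the trace alone does not show how the memory ended up in that state.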

[1] [stacktrace.crafted.gdb.txt](https://github.com/xrootd/xrootd/files/10466702/stacktrace.crafted.gdb.txt)

```
Thread 1 (Thread 0x7fe49d3fe700 (LWP 572)):
#0  0x00007fe4bf91839e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
#1  0x00007fe4c38e285c in XrdSysRWLock::WriteLock (this=0x7fe49e0692c8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:420
#2  XrdSysRWLockHelper::XrdSysRWLockHelper (rd=false, l=..., this=<synthetic pointer>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:420
#3  XrdCl::PostMaster::ForceDisconnect (this=0x7fe4bb00e2a0, url=...) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClPostMaster.cc:309
#4  0x00007fe4c38ebc6b in XrdCl::Stream::OnReadTimeout (this=0x7fe49647dc40, substream=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClStream.cc:1064
#5  0x00007fe4c396887d in XrdCl::AsyncSocketHandler::OnReadTimeout (this=this@entry=0x7fe496470200) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClAsyncSocketHandler.cc:690
#6  0x00007fe4c396aff7 in XrdCl::AsyncSocketHandler::Event (this=0x7fe496470200, type=2 '\002') at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClAsyncSocketHandler.cc:239
#7  0x00007fe4c38e0368 in (anonymous namespace)::SocketCallBack::Event (this=0x7fe496415560, chP=<optimized out>, cbArg=<optimized out>, evFlags=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClPollerBuiltIn.cc:83
#8  0x00007fe4c34f37db in XrdSys::IOEvents::Poller::CbkXeq (this=0x7fe49e0c42c0, cP=0x7fe478c39f20, events=2, eNum=<optimized out>, eTxt=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEvents.cc:721
#9  0x00007fe4c34f3d9e in XrdSys::IOEvents::Poller::CbkTMO (this=0x7fe49e0c42c0) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEvents.cc:626
#10 0x00007fe4c34f4ca1 in XrdSys::IOEvents::PollE::Begin (syncsem=<optimized out>, retcode=<optimized out>, eTxt=<optimized out>, this=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEventsPollE.icc:216
#11 XrdSys::IOEvents::PollE::Begin (this=0x7fe49e0c42c0, syncsem=<optimized out>, retcode=<optimized out>, eTxt=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEventsPollE.icc:196
#12 0x00007fe4c34f13bd in XrdSys::IOEvents::BootStrap::Start (parg=0x7fe49effd980) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysIOEvents.cc:149
#13 0x00007fe4c34f9e27 in XrdSysThread_Xeq (myargs=0x7fe49e0139e0) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.cc:86
#14 0x00007fe4bf914ea5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007fe4bf63db0d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7fe49c3fc700 (LWP 578)):
#0  0x00007fe4bf918184 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#1  0x00007fe4c38e3e45 in XrdSysRWLock::ReadLock (this=0x7fe49e0692c8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:419
#2  XrdSysRWLockHelper::XrdSysRWLockHelper (rd=true, l=..., this=<synthetic pointer>) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:419
#3  XrdCl::PostMaster::Send (this=0x7fe4bb00e2a0, url=..., msg=0x7fe4931040c0, handler=handler@entry=0x7fe4837f9700, stateful=stateful@entry=true, expires=1674173233) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClPostMaster.cc:220
#4  0x00007fe4c3914567 in XrdCl::XRootDMsgHandler::RetryAtServer (this=0x7fe4837f9700, url=..., entryType=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClXRootDMsgHandler.cc:2169
#5  0x00007fe4c3914b05 in XrdCl::XRootDMsgHandler::WaitDone (this=0x7fe4837f9700) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClXRootDMsgHandler.cc:1120
#6  0x00007fe4c3914bed in (anonymous namespace)::WaitTask::Run (this=<optimized out>, now=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClXRootDMsgHandler.cc:69
#7  0x00007fe4c38fb3ac in XrdCl::TaskManager::RunTasks (this=0x7fe49e09f180) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClTaskManager.cc:222
#8  0x00007fe4c38fb509 in RunRunnerThread (arg=<optimized out>) at /usr/src/debug/xrootd-5.5.4/src/XrdCl/XrdClTaskManager.cc:38
#9  0x00007fe4bf914ea5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007fe4bf63db0d in clone () from /lib64/libc.so.6
```
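Since going through the threads one by one is tedious, a small script can group them by the lock they are waiting on. The following is a rough sketch, not a finished tool: it assumes backtraces in the same text format as above (a `Thread ... (LWP n)` header followed by frames), and the function name and regular expressions are my own.

```python
import re
from collections import defaultdict

# Thread header, e.g. "Thread 1 (Thread 0x7fe49d3fe700 (LWP 572)):"
THREAD_RE = re.compile(r"^Thread \d+ \(Thread 0x[0-9a-f]+ \(LWP (\d+)\)\):")
# XrdSys lock wrapper frame; captures the method and the lock address
LOCK_RE = re.compile(
    r"in (XrdSys\w+::(?:ReadLock|WriteLock|Lock)) \(this=(0x[0-9a-f]+)\)")

def waiters_by_lock(text):
    """Group LWP ids by (lock address, wrapper method) seen in their backtrace."""
    groups = defaultdict(list)
    lwp = None
    for line in text.splitlines():
        header = THREAD_RE.match(line)
        if header:
            lwp = int(header.group(1))
            continue
        frame = LOCK_RE.search(line)
        if frame and lwp is not None:
            groups[(frame.group(2), frame.group(1))].append(lwp)
            lwp = None  # record only the innermost lock frame per thread
    return dict(groups)

# The first frames of the two backtraces shown above
sample = """\
Thread 1 (Thread 0x7fe49d3fe700 (LWP 572)):
#0  0x00007fe4bf91839e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
#1  0x00007fe4c38e285c in XrdSysRWLock::WriteLock (this=0x7fe49e0692c8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:420

Thread 1 (Thread 0x7fe49c3fc700 (LWP 578)):
#0  0x00007fe4bf918184 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#1  0x00007fe4c38e3e45 in XrdSysRWLock::ReadLock (this=0x7fe49e0692c8) at /usr/src/debug/xrootd-5.5.4/src/XrdSys/XrdSysPthread.hh:419
"""

result = waiters_by_lock(sample)
for (address, method), lwps in result.items():
    print(address, method, lwps)
```

Backtraces without a thread header (such as a plain `bt` after attaching to a single thread) are skipped by this sketch, so the input should come from something like the attached per-thread file.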


-- 
Reply to this email directly or view it on GitHub:
https://github.com/xrootd/xrootd/issues/1883