So, a command has been sent to the poller thread in frame 5. In frame 5 
the pollTid member of *this is the LWP of the associated poller. So, what 
is it doing? Clearly something else, because it's not responding to the 
command and everyone is stacking up on this poller.
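
One possible reading of the two backtraces quoted below is a wait cycle: 
thread 2194 is past the lock at XrdClStream.cc:703 (so it presumably holds 
the Stream mutex) and is waiting in SendCmd for the poller, while the thread 
running PollE::Begin on the same poller object (0x173e750) is itself stuck 
trying to take that mutex. The sketch below models only that shape with 
standard C++ primitives. It is not XRootD code: the names (stream_mutex, 
cmd_done, reproduces_wait_cycle) are made up for illustration, and timeouts 
stand in for the indefinite waits so that the program terminates.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical model, not XRootD code. Returns true when both sides of the
// modeled wait cycle fail to make progress.
bool reproduces_wait_cycle() {
    using namespace std::chrono_literals;

    std::timed_mutex stream_mutex;       // stands in for the Stream's mutex
    std::promise<void> worker_has_lock;  // worker tells poller the mutex is held
    std::promise<void> cmd_done;         // poller would acknowledge the command here
    std::future<void> lock_held = worker_has_lock.get_future();
    std::future<void> ack = cmd_done.get_future();
    bool poller_blocked = false;

    // "Poller": its event callback needs the stream mutex before the poller
    // can get back to servicing commands.
    std::thread poller([&] {
        lock_held.wait();
        if (stream_mutex.try_lock_for(50ms)) {
            stream_mutex.unlock();
            cmd_done.set_value();        // only reachable if the mutex is free
        } else {
            poller_blocked = true;       // the real thread would block forever
        }
    });

    // "Worker": takes the stream mutex, then waits for the poller to
    // acknowledge a command while still holding the mutex.
    bool worker_blocked = false;
    {
        std::lock_guard<std::timed_mutex> guard(stream_mutex);
        worker_has_lock.set_value();
        worker_blocked = (ack.wait_for(100ms) == std::future_status::timeout);
        poller.join();                   // keep the mutex until the poller gives up
    }
    return worker_blocked && poller_blocked;
}
```

With untimed waits in place of try_lock_for and wait_for, neither thread 
in this model would ever return, which is at least consistent with everyone 
stacking up behind the poller.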

On Mon, 27 Mar 2023, Brian P Bockelman wrote:

> Poked through the deadlock a bit; much of this ends in the XRootD client.  Here's an example:
>
> ```
> #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
> #1  0x00007f69f5f89eb6 in _L_lock_941 () from /lib64/libpthread.so.0
> #2  0x00007f69f5f89daf in __GI___pthread_mutex_lock ***@***.***=0x1e4600a0) at ../nptl/pthread_mutex_lock.c:113
> #3  0x00007f69f0765bc6 in Lock (this=0x1e4600a0) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.hh:222
> #4  XrdSysMutexHelper (mutex=..., this=0x7f69ea97ea48) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.hh:281
> #5  XrdCl::Stream::OnConnectError ***@***.***=0x1e460000, subStream=<optimized out>, status=...) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClStream.cc:703
> #6  0x00007f69f07ee4aa in XrdCl::AsyncSocketHandler::OnFaultWhileHandshaking ***@***.***=0xf40171e0, st=...) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClAsyncSocketHandler.cc:674
> #7  0x00007f69f07f06fd in XrdCl::AsyncSocketHandler::OnReadWhileHandshaking (this=0xf40171e0) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClAsyncSocketHandler.cc:518
> #8  0x00007f69f07f189d in XrdCl::AsyncSocketHandler::Event (this=0xf40171e0, type=5 '\005') at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClAsyncSocketHandler.cc:227
> #9  0x00007f69f075ba97 in (anonymous namespace)::SocketCallBack::Event (this=0x105d0b080, chP=<optimized out>, cbArg=<optimized out>, evFlags=<optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClPollerBuiltIn.cc:83
> #10 0x00007f69f6e18dc3 in XrdSys::IOEvents::Poller::CbkXeq ***@***.***=0x173e750, ***@***.***=0x13aa2f360, events=5, eNum=<optimized out>, eTxt=0x0) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEvents.cc:721
> #11 0x00007f69f6e19f43 in XrdSys::IOEvents::PollE::Dispatch ***@***.***=0x173e750, cP=0x13aa2f360, pollEv=<optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEventsPollE.icc:275
> #12 0x00007f69f6e1a100 in XrdSys::IOEvents::PollE::Begin (this=0x173e750, syncsem=<optimized out>, retcode=<optimized out>, eTxt=<optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEventsPollE.icc:230
> #13 0x00007f69f6e1691d in XrdSys::IOEvents::BootStrap::Start (parg=0x7f69f211b8b0) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEvents.cc:149
> #14 0x00007f69f6e1f347 in XrdSysThread_Xeq (myargs=0x2eb8420) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.cc:86
> #15 0x00007f69f5f87ea5 in start_thread (arg=0x7f69ea97f700) at pthread_create.c:307
> #16 0x00007f69f5cb0b0d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> (gdb) thread 2194
> [Switching to thread 2194 (Thread 0x7f69e9977700 (LWP 693))]
> #0  0x00007f69f5f8db3b in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7f69e99769c0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
> 43	      err = lll_futex_wait (futex, expected, private);
> (gdb) bt
> #0  0x00007f69f5f8db3b in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7f69e99769c0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
> #1  do_futex_wait ***@***.***=0x7f69e99769c0, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:223
> #2  0x00007f69f5f8dbcf in __new_sem_wait_slow (sem=0x7f69e99769c0, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:292
> #3  0x00007f69f5f8dc6b in __new_sem_wait (sem=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:28
> #4  0x00007f69f6e18132 in Wait (this=<optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.hh:509
> #5  XrdSys::IOEvents::Poller::SendCmd ***@***.***=0x173e750, cmd=...) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEvents.cc:1002
> #6  0x00007f69f6e181fa in XrdSys::IOEvents::PollE::Exclude (this=0x173e750, cP=0x13aa2f360, ***@***.***: false, dover=<optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEventsPollE.icc:309
> #7  0x00007f69f6e173ea in XrdSys::IOEvents::Channel::Delete (this=0x13aa2f360) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysIOEvents.cc:326
> #8  0x00007f69f075bf38 in XrdCl::PollerBuiltIn::RemoveSocket (this=0x1748000, socket=0xf85b2910) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClPollerBuiltIn.cc:344
> #9  0x00007f69f07ed7fb in XrdCl::AsyncSocketHandler::Close (this=0xf40171e0) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClAsyncSocketHandler.cc:191
> #10 0x00007f69f0765bf3 in XrdCl::Stream::OnConnectError ***@***.***=0x1e460000, ***@***.***=0, status=...) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClStream.cc:705
> #11 0x00007f69f07666db in XrdCl::Stream::ForceConnect (this=0x1e460000) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClStream.cc:349
> #12 0x00007f69f07667dd in (anonymous namespace)::StreamConnectorTask::Run (this=<optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClStream.cc:410
> #13 0x00007f69f0778f0c in XrdCl::TaskManager::RunTasks (this=0x17ac000) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClTaskManager.cc:222
> #14 0x00007f69f07790f9 in RunRunnerThread (arg=<optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClTaskManager.cc:38
> #15 0x00007f69f5f87ea5 in start_thread (arg=0x7f69e9977700) at pthread_create.c:307
> #16 0x00007f69f5cb0b0d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> ```
>
> Note there are two active `OnConnectError` callbacks on the same `XrdCl::Stream` object:
> ```
> #5  XrdCl::Stream::OnConnectError ***@***.***=0x1e460000, subStream=<optimized out>, status=...) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClStream.cc:703
> ```
> and
> ```
> #10 0x00007f69f0765bf3 in XrdCl::Stream::OnConnectError ***@***.***=0x1e460000, ***@***.***=0, status=...) at /usr/src/debug/xrootd/xrootd/src/XrdCl/XrdClStream.cc:705
> ```
>
> @simonmichal - is that supposed to happen?
>
> -- 
> Reply to this email directly or view it on GitHub:
> https://github.com/xrootd/xrootd/issues/1979#issuecomment-1485036585
> You are receiving this because you are subscribed to this thread.
>
> Message ID: ***@***.***>


-- 
Reply to this email directly or view it on GitHub:
https://github.com/xrootd/xrootd/issues/1979#issuecomment-1485048455
