Hi,

When using the built-in poller on OS X, a client thread (thread 1 below) that connects a socket asynchronously occasionally never gets called back. The client then hangs until the connect operation times out.

This doesn't happen with the libevent poller. Adding time delays in the client code makes the problem occur more often, so it is most likely a timing issue (maybe the poller is invoking the callback too early?).

Here are the stack traces:

```
Thread 6 (process 17596):
#0  0x00007fff8d9da112 in sem_wait ()
#1  0x00000001000736c1 in XrdSysSemaphore::Wait (this=0x100512950) at XrdSysPthread.hh:335
#2  0x00000001000da19f in XrdCl::SyncQueue<XrdCl::JobManager::JobHelper>::Get (this=0x100512a78) at XrdClSyncQueue.hh:66
#3  0x00000001000d9fa3 in XrdCl::JobManager::RunJobs (this=0x100512a60) at XrdClJobManager.cc:133
#4  0x00000001000d9cbd in RunRunnerThread (arg=0x100512a60) at XrdClJobManager.cc:33
#5  0x00007fff913497a2 in _pthread_start ()
#6  0x00007fff913361e1 in thread_start ()

Thread 5 (process 17596):
#0  0x00007fff8d9da112 in sem_wait ()
#1  0x00000001000736c1 in XrdSysSemaphore::Wait (this=0x100512950) at XrdSysPthread.hh:335
#2  0x00000001000da19f in XrdCl::SyncQueue<XrdCl::JobManager::JobHelper>::Get (this=0x100512a78) at XrdClSyncQueue.hh:66
#3  0x00000001000d9fa3 in XrdCl::JobManager::RunJobs (this=0x100512a60) at XrdClJobManager.cc:133
#4  0x00000001000d9cbd in RunRunnerThread (arg=0x100512a60) at XrdClJobManager.cc:33
#5  0x00007fff913497a2 in _pthread_start ()
#6  0x00007fff913361e1 in thread_start ()

Thread 4 (process 17596):
#0  0x00007fff8d9da112 in sem_wait ()
#1  0x00000001000736c1 in XrdSysSemaphore::Wait (this=0x100512950) at XrdSysPthread.hh:335
#2  0x00000001000da19f in XrdCl::SyncQueue<XrdCl::JobManager::JobHelper>::Get (this=0x100512a78) at XrdClSyncQueue.hh:66
#3  0x00000001000d9fa3 in XrdCl::JobManager::RunJobs (this=0x100512a60) at XrdClJobManager.cc:133
#4  0x00000001000d9cbd in RunRunnerThread (arg=0x100512a60) at XrdClJobManager.cc:33
#5  0x00007fff913497a2 in _pthread_start ()
#6  0x00007fff913361e1 in thread_start ()

Thread 3 (process 17596):
#0  0x00007fff8d9d9386 in __semwait_signal ()
#1  0x00007fff913d3800 in nanosleep ()
#2  0x00007fff913d368a in sleep ()
#3  0x000000010008c33f in XrdCl::TaskManager::RunTasks (this=0x100512980) at XrdClTaskManager.cc:238
#4  0x000000010008b49d in RunRunnerThread (arg=0x100512980) at XrdClTaskManager.cc:36
#5  0x00007fff913497a2 in _pthread_start ()
#6  0x00007fff913361e1 in thread_start ()

Thread 2 (process 17596):
#0  0x00007fff8d9d9f96 in poll ()
#1  0x00000001002e8be7 in XrdSys::IOEvents::PollPoll::Begin (this=0x100512d20, syncsem=0x7fff5fbfec98, retcode=@0x7fff5fbfec90, eTxt=0x7fff5fbfec88) at XrdSysIOEventsPollPoll.icc:191
#2  0x00000001002e6181 in XrdSys::IOEvents::BootStrap::Start (parg=0x7fff5fbfec80) at /Users/jussy/repos/xrootd/src/XrdSys/XrdSysIOEvents.cc:110
#3  0x00000001002e3f55 in XrdSysThread_Xeq (myargs=0x100512e60) at /Users/jussy/repos/xrootd/src/XrdSys/XrdSysPthread.cc:86
#4  0x00007fff913497a2 in _pthread_start ()
#5  0x00007fff913361e1 in thread_start ()

Thread 1 (process 17596):
#0  0x00007fff8d9da112 in sem_wait ()
#1  0x00000001000736c1 in XrdSysSemaphore::Wait (this=0x100511640) at XrdSysPthread.hh:335
#2  0x00000001000993f9 in XrdCl::SyncResponseHandler::WaitForResponse (this=0x7fff5fbff3c0) at XrdClMessageUtils.hh:85
#3  0x0000000100097eda in XrdCl::MessageUtils::WaitForResponse<XrdCl::StatInfo> (handler=0x7fff5fbff3c0, response=@0x7fff5fbff5d0) at XrdClMessageUtils.hh:135
#4  0x000000010009363b in XrdCl::FileSystem::Stat (this=0x100512530, path=@0x7fff5fbff600, response=@0x7fff5fbff5d0, timeout=0) at XrdClFileSystem.cc:709
#5  0x0000000100008344 in DoStat (fs=0x100512530, env=0x100512380, args=@0x7fff5fbff7e0) at XrdClFS.cc:740
#6  0x0000000100010ae7 in XrdCl::FSExecutor::Execute (this=0x1005124f0, commandline=@0x7fff5fbff930) at XrdClFSExecutor.cc:100
#7  0x000000010000ad0a in ExecuteCommand (ex=0x1005124f0, commandline=@0x7fff5fbff930) at XrdClFS.cc:1023
#8  0x000000010000b690 in ExecuteCommand (url=@0x7fff5fbff9c8, argc=2, argv=0x7fff5fbffad8) at XrdClFS.cc:1143
#9  0x000000010000ba0c in main (argc=4, argv=0x7fff5fbffac8) at XrdClFS.cc:1179
```
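To illustrate what thread 1 is doing, here is a minimal, self-contained sketch of the wait pattern. It is not the XrdCl code: `SyncHandler`, `WaitFor`, and `RunOnce` are hypothetical names, and a standard condition variable stands in for the semaphore. When the "poller" thread never delivers the callback, the caller only gets control back once the timeout expires, which matches the hang described above.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a synchronous response handler: the caller
// blocks in WaitFor() until another thread calls HandleResponse().
class SyncHandler {
public:
  // Called by the event-loop ("poller") thread when the operation completes.
  void HandleResponse() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    cond_.notify_one();
  }

  // Returns true if the callback arrived, false if the wait timed out.
  bool WaitFor(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    return cond_.wait_for(lock, timeout, [this] { return done_; });
  }

private:
  std::mutex mutex_;
  std::condition_variable cond_;
  bool done_ = false;
};

// Simulates one asynchronous operation. When deliver is false the poller
// thread never invokes the callback, so the caller waits out the timeout.
bool RunOnce(bool deliver, std::chrono::milliseconds timeout) {
  SyncHandler handler;
  std::thread poller([&handler, deliver] {
    if (deliver) handler.HandleResponse();
  });
  bool ok = handler.WaitFor(timeout);
  poller.join();
  return ok;
}
```

Calling `RunOnce(false, std::chrono::milliseconds(50))` returns `false` after roughly 50 ms, while `RunOnce(true, ...)` returns `true` as soon as the callback is delivered. The sketch only shows the symptom; it says nothing about why the built-in poller would skip the callback.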

Cheers,
Justin

---
Reply to this email directly or view it on GitHub:
https://github.com/xrootd/xrootd/issues/6
