Hi Brian,
Seems like this just started happening, did it not? If so, what else has
changed (e.g. Linux patches)? Anyway, that snippet of code was added to get
around a nasty Linux-only "feature". Could you add a cerr there to see if we
are actually getting to that code (display the two thread values)? There
should be no way that the code stops itself without something else going on.
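A minimal sketch of the kind of debug output meant here, written against plain pthreads so it stands alone. The helper names describeThreads and logThreads are hypothetical and not part of xrootd; in XrdLink the two values would be curTID and XrdSysThread::ID(). The cast assumes Linux, where pthread_t is an integer type.

```cpp
#include <pthread.h>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper (not xrootd code): format the bound thread ID, the
// calling thread ID, and whether pthreads considers them the same thread.
std::string describeThreads(pthread_t bound, pthread_t current)
{
    std::ostringstream out;
    // Assumption: pthread_t is an integer type (true on Linux), so the
    // cast is for display only.
    out << "XrdLink unbind: curTID=" << static_cast<unsigned long>(bound)
        << " self=" << static_cast<unsigned long>(current)
        << " same=" << (pthread_equal(bound, current) != 0);
    return out.str();
}

// Write the line to stderr; intended to sit just before the signal calls.
void logThreads(pthread_t bound)
{
    std::cerr << describeThreads(bound, pthread_self()) << std::endl;
}
```

Calling logThreads(curTID) inside the #ifdef __linux__ block, before the two Signal() calls, would show whether that branch is ever reached and with which thread values.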
Andy
-----Original Message-----
From: Brian Bockelman
Sent: Tuesday, August 23, 2011 1:26 PM
To: xrootd-dev
Subject: xrootd redirector repeatedly "crashing"
Hi,
Our global redirector stops responding every 30 minutes or so; it's
not actually crashing, but appears to be receiving SIGSTOP.
There's nothing on the system that would be sending this signal. However, I
see the following code in XrdLink:
    if (tBound)
       {tBound = 0;
#ifdef __linux__
        if (!XrdSysThread::Same(curTID, XrdSysThread::ID()))
           {XrdSysThread::Signal(curTID, SIGSTOP);
            XrdSysThread::Signal(curTID, SIGCONT);
           }
#endif
       }
Are we 100% sure that's the right thing, and there's no way that SIGSTOP is
delivered to the wrong thread?
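For context on the question, the Linux pthread_kill documentation notes that signal dispositions are process-wide: a "stop" action affects the whole process even when the signal is directed at a single thread. The following is a rough sketch for checking that behaviour on a given machine, using plain pthreads rather than XrdSysThread. The function names are hypothetical and this is not xrootd code; it assumes a Linux system where fork() and waitpid() with WUNTRACED are available.

```cpp
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Worker thread that does nothing but wait for signals.
static void *idleWorker(void *)
{
    for (;;) pause();
    return nullptr;
}

// Hypothetical check: fork a child, have the child direct SIGSTOP at one of
// its own threads, and report whether the parent sees the whole child
// process enter the stopped state.
bool stopSignalStopsWholeProcess()
{
    pid_t child = fork();
    if (child < 0) return false;

    if (child == 0) {
        pthread_t worker;
        if (pthread_create(&worker, nullptr, idleWorker, nullptr) != 0)
            _exit(2);
        // Direct SIGSTOP at the worker thread only.
        if (pthread_kill(worker, SIGSTOP) != 0)
            _exit(3);
        for (;;) pause();   // main thread waits; the parent cleans up
    }

    // WUNTRACED makes waitpid() return once the child has stopped.
    int status = 0;
    pid_t r = waitpid(child, &status, WUNTRACED);
    bool stopped = (r == child) && WIFSTOPPED(status)
                   && WSTOPSIG(status) == SIGSTOP;

    // Clean up the child regardless of the outcome.
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return stopped;
}
```

If this returns true, the stop was reported for the child process as a whole rather than for the worker thread alone, which is worth keeping in mind when reading the SIGSTOP/SIGCONT pair above.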
Brian