This looks like a plausible scenario. As you peruse the code you will find various comments that allude to attempts to fix these kinds of problems. The fixes are not straightforward, largely because epoll can generate error events even when you don't want them (i.e., there is no way to completely suppress all epoll events). So you can get into this kind of thread race without even knowing about it. That's why we have the fd firewall: when we get into this situation, the connections in the race are dropped to avoid more serious consequences. The decision to drop a connection is based on the fact that an fd was recycled by the OS while we still think it's in use and there are live objects associated with it. That's a last-ditch effort to keep the server from crashing; it works, but it obviously has side effects. On the other hand, since the client will simply try to reconnect, those side effects are rarely visible other than as added latency and serious log messages. I'll be very interested to see what your solution will be.
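
To make the point about unsuppressible events concrete, here is a minimal standalone sketch. It is not taken from the XRootD code; the function name eventsWithEmptyInterest is made up for illustration, and the sketch assumes a Linux system. It registers the read end of a pipe with an empty interest mask, closes the write end, and returns whatever epoll reports anyway.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cstdint>

// Hypothetical helper, not part of XRootD: returns the event mask that
// epoll reports for a pipe's read end registered with NO requested events,
// after the write end has been closed. Returns 0 on setup failure or timeout.
std::uint32_t eventsWithEmptyInterest() {
    int fds[2];
    if (pipe(fds) != 0) return 0;

    int ep = epoll_create1(0);
    if (ep < 0) { close(fds[0]); close(fds[1]); return 0; }

    epoll_event ev{};
    ev.events = 0;            // ask for nothing at all
    ev.data.fd = fds[0];

    std::uint32_t reported = 0;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == 0) {
        close(fds[1]);        // hang up the writer side
        fds[1] = -1;

        epoll_event out{};
        // Wait up to one second for an event we never subscribed to.
        if (epoll_wait(ep, &out, 1, 1000) == 1) reported = out.events;
    }

    if (fds[1] >= 0) close(fds[1]);
    close(fds[0]);
    close(ep);
    return reported;
}
```

On Linux the returned mask should include EPOLLHUP even though no events were requested, because EPOLLERR and EPOLLHUP are always reported for a registered fd. The only way to stop receiving them is to remove the fd from the epoll set, which is where the window for this kind of race opens.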

-- 
Reply to this email directly or view it on GitHub:
https://github.com/xrootd/xrootd/issues/1928#issuecomment-1441347371