Yes, it has been pretty well stress-tested with a test app running hundreds of threads that constantly post and wait on tens of semaphores. This ran for weeks and produced the expected results. Also, Matevz and Alja run their proxy stress tests with this code included, so I am fairly confident it's OK. The fact that ATLAS works fine, and that you saw this with the native implementation of semaphores, is a strong indicator that there is nothing wrong with this particular part of the code.

As for the stack traces: the code is doing what it is expected to do, i.e. waiting for an answer. There are two possibilities:

1) The answer arrived but was not matched with the waiting handler. This is extremely unlikely, because this code has been running in production for almost two years now and we haven't seen anything like this.
2) The answer does not arrive, in which case the call should eventually time out and return an error. So the question is: do you see these calls time out?
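
To make the second case concrete, here is a minimal sketch of a handler that blocks until an answer is delivered or a timeout expires. It is only an illustration of the pattern, not the actual client code: the class name `SyncResponseWaiter` and its methods `Deliver` and `WaitFor` are hypothetical, and it uses a standard mutex and condition variable rather than the semaphore implementation discussed above.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Hypothetical synchronous handler (names are assumptions, not real API):
// one thread waits for an answer, another thread delivers it.
class SyncResponseWaiter {
 public:
  // Called by the thread that received the answer from the network.
  void Deliver(std::string answer) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      answer_ = std::move(answer);
    }
    cv_.notify_one();
  }

  // Blocks until an answer has been delivered or the timeout expires.
  // Returns std::nullopt on timeout so the caller can report an error.
  std::optional<std::string> WaitFor(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    // The predicate guards against spurious wake-ups and against an
    // answer that was delivered before the wait started.
    if (!cv_.wait_for(lock, timeout,
                      [this] { return answer_.has_value(); })) {
      return std::nullopt;
    }
    return answer_;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::optional<std::string> answer_;
};
```

With a handler like this, a stack trace taken while the caller sits in `WaitFor` looks the same in both cases; only whether the call eventually returns with a timeout tells them apart.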





Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-DEV list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1