Hi Serge,

I've looked things over, and it looks like the Executive holds a shared
pointer that keeps the ResponseRequester object alive during
cancellation, so Finished(true) can go in QueryRequest::cancel(). I
think this might be better: it gets rid of a few lines of code and seems
sensible. A comment should be added indicating that _cleanup can behave
like "delete this", depending on which shared pointers are still held.

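// Note: resetting _retryFunc and _requester may release the last shared
// pointers keeping this QueryRequest alive, so _cleanup can behave like
// "delete this". Do not touch members after calling it.
void QueryRequest::_cleanup(bool shouldCancel) {
    bool ok = Finished(shouldCancel);
    if (!ok) {
        LOGF_ERROR("Error cleaning up QueryRequest");
    } else {
        LOGF_INFO("Request::Finished() with error (clean).");
    }
    _retryFunc.reset();
    _requester.reset();
}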
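
To make the lifetime argument concrete, here is a minimal sketch of the ownership chain as I understand it. Everything in it is a hypothetical stand-in (Requester, Request, Trace and runCancelDemo are not the real Qserv or XrdSsi classes), and it assumes the caller of cancel() holds its own shared pointer to the requester, as the Executive appears to do.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-ins for illustration only; not the real Qserv classes.
struct Trace {
    bool finished = false;   // set where Finished(true) would be called
    bool destroyed = false;  // set by the Request destructor
};

struct Requester {
    // Plays the role of cancelFunc: it owns the request through a
    // captured shared_ptr.
    std::function<void()> cancelFunc;
    void cancel() { if (cancelFunc) cancelFunc(); }
};

struct Request {
    explicit Request(Trace* t) : trace(t) {}
    ~Request() { trace->destroyed = true; }

    void cancel() {
        trace->finished = true;  // stands in for Finished(true)
        cleanup();
    }
    // Can behave like "delete this" if the caller holds no other
    // reference to the requester; here the caller does hold one.
    void cleanup() { requester.reset(); }

    std::shared_ptr<Requester> requester;
    Trace* trace;
};

struct DemoResult {
    bool aliveAfterCancel;
    bool finishedCalled;
    bool destroyedAfterRelease;
};

DemoResult runCancelDemo() {
    Trace trace;
    // Plays the role of the reference held by the Executive.
    auto executiveRef = std::make_shared<Requester>();
    {
        auto request = std::make_shared<Request>(&trace);
        request->requester = executiveRef;
        executiveRef->cancelFunc = [request]() { request->cancel(); };
    }  // the lambda inside cancelFunc is now the only owner of Request

    executiveRef->cancel();  // cleanup() runs while executiveRef is held
    DemoResult result;
    result.aliveAfterCancel = !trace.destroyed;
    result.finishedCalled = trace.finished;

    executiveRef.reset();    // last reference: Requester, then Request, die
    result.destroyedAfterRelease = trace.destroyed;
    return result;
}
```

In this sketch the request survives cleanup() because the caller's reference keeps the requester, and therefore the cancel function and the request, alive until cancel() returns. Destruction only happens once that outer reference is released. If cancel() were ever reached without such a reference, the reset inside cleanup() would destroy the object in the middle of its own member function, which is why the comment on _cleanup seems worth having.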

-John

On 05/08/15 23:24, Serge Monkewitz wrote:
>> On May 8, 2015, at 9:31 PM, John Gates <[log in to unmask]> wrote:
>>
>> Comments below:
>>
>> On 5/8/2015 5:43 PM, Serge Monkewitz wrote:
>>> This doesn’t address the (potential?) issue with Finished() not getting called.
>>>> Resetting the shared_ptrs at the point you do would cause the current object to be destroyed before calling Finished(true).
>>> Are you sure? The way I see it, the current object (QueryRequest) gets deleted when cancelFunc is destructed, and cancelFunc is held by _requester, from which the cancellation must be invoked. So I think that the requester, cancelFunc, and hence QueryRequest are guaranteed to be alive while cancel() is executing.
>> As soon as the reference count goes to zero, _requester is deleted, which calls canceller's destructor, which deletes this QueryRequest. The odds of QueryRequest having the last two references to _requester are pretty good. Further, I think cancel can be called by another thread and having QueryRequest get deleted before xrootd calls ProcessResponseData would be bad.
> Right, but my point is that the reference count for requester and cancelFunc cannot go to zero when calling _cleanup anywhere inside QueryRequest::cancel(), because one must be holding a reference to ResponseRequester to call ResponseRequester::cancel (which invokes cancelFunc, which calls QueryRequest::cancel). And if QueryRequest::cancel() is fixed to call XrdSsiRequest::Finished(true), xrootd will presumably know not to call ProcessResponseData...
>
>> I've found that setting a flag to cancel/terminate a thread in the target thread is usually the safest course of action. The behavior is much more predictable. I also like sticking all the termination code in one function as duplication of the code can cause issues that are difficult to detect.
> Yeah, I think a redesign may be in order. I don’t feel like I understand things well enough in this area to make concrete suggestions yet, but I think it’s currently way too hard to understand what’s going on here.
>
>> It looks like nobody is calling XrdSsiRequest::Finished(), at least not inside of QueryRequest. I think we should look into that.
> Well, it’s called in _finish/_errorFinish, but (as far as I can tell) not if cancel() has been previously invoked.
>
>>> However, I do think this issue might be there in other places. For example, _importStream calls _retryFunc, which loops back to Executive::_queryDispatch, eventually calling registerCancel() on the requester. I think that could cause the CancelFunc holding on to the QueryRequest to be destroyed, which could delete the QueryRequest while _importStream is executing.
>> The retry code scares me.
>>
>> Have a good weekend,
>> John
>
> By the way, I did manage to track down the remaining memory leaks in the testQDisp unit test, and the problem was that the mocks don’t fake enough of the real Ssi behavior (e.g. in some cases, ProvisionDone is not called on QueryResource, resulting in QueryResource leaks, etc...). So they aren’t real, though I’d still like to fix them.
>
> Hope you have a good weekend too!
> Serge
