On Fri, 4 Mar 2011, Lukasz Janyst wrote:

> OK, then. I will fix the bug in ROOT (the one pointed out by Fabrizio),
> add the setsockopt flags to support the Linux built-in keepalive, and in
> the longer term work on something for simulating keepalive with kXR_ping.
We may want to make the "ping" mechanism selectable via an option of some 
kind. Just a thought.

Andy
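
For reference, the setsockopt flags mentioned in the quoted text could be set roughly as below. This is only a minimal sketch in Python rather than the actual client code: the function name and the default timings are assumptions made for illustration, and the three TCP_KEEP* options are Linux-specific, so they are applied only where the platform exposes them.

```python
import socket


def enable_tcp_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on kernel-level TCP keepalive for an existing socket.

    The function name and default values are illustrative assumptions.
    """
    # Ask the kernel to probe an otherwise idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Linux tuning knobs: idle time before the first probe, interval
    # between probes, and number of unanswered probes before the
    # connection is declared dead. Skipped where not available.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

Note that this only covers the kernel keepalive; an application-level kXR_ping would still have to be sent by the client itself.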

>   Lukasz
>
> 2011/3/3 Andrew Hanushevsky <[log in to unmask]>:
>> Hi Lukasz,
>>
>> The client is always free to send a keep-alive to the server (a.k.a.
>> kXR_ping request). The timeout for that can be set very short since the
>> response occurs at the top-most layer in the server and turn-around should
>> be in the microsecond range + RTT. That will tell you if the connection is
>> alive. The server can send a ping to the client as well. But neither of
>> these actions will solve many router issues. That's why you need reasonable
>> timeouts. For instance (and why server-side pings are not all that useful),
>> a server-side ping will do little if the router already broke the
>> connection. Yes, the server-side connection *might* get closed but the
>> client will never know that.
>>
>> So, to better address the problem I would proceed as follows:
>> a) If you find out that a request has not been serviced in the normal
>> time-out window,
>> b) Issue a ping with a much shorter timeout.
>> c) If you don't get a response at that point, tear down the connection and
>> try again.
>>
>> For optimization purposes, I would keep track of the time between
>> reconnections. That should provide a window (eventually) of when you must
>> send a ping to keep the dumb router from dropping the connection.
>>
>> Andy
>>
>> -----Original Message----- From: Lukasz Janyst
>> Sent: Thursday, March 03, 2011 3:32 AM
>> To: Fabrizio Furano
>> Cc: Lukasz Janyst ; [log in to unmask] ; Gerardo Ganis ; Brian Bockelman
>> ; Andrew Hanushevsky ; [log in to unmask] ; [log in to unmask] ; Dirk
>> Duellmann ; [log in to unmask] ; [log in to unmask]
>> Subject: Re: [sr #119348] Root reports an error while unzipping the buckets
>> fetched via xroot
>>
>> Hi Fabrizio,
>>
>>  thanks for the info. As I say later in my comment, this would
>> indeed help, in the sense that the client would get the response
>> eventually, instead of hanging or crashing, but wouldn't eliminate the
>> real problem: how to handle misbehaving networks.
>>
>> Cheers,
>>  Lukasz
>>
>> 2011/3/3 Fabrizio Furano <[log in to unmask]>:
>>>
>>> Hi Lukasz,
>>>
>>>  something is fishy here:
>>>
>>>> Of course, on every request timeout I could assume that the connection is
>>>> just broken even though the socket is in a valid state and reconnect, no
>>>> problem about that.
>>>
>>>  ... in the sense that this is supposed to be the default, normal
>>> behavior.
>>> On every request timeout the connection must be considered broken, and
>>> completely wiped out. If it does not do it, then this is the issue to
>>> consider first, imo.
>>>
>>>  Fabrizio
>>>
>>>
>>
>>
>
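
Steps a) to c) in Andy's quoted message of 2011/3/3 could be sketched roughly as follows. This is a hedged illustration, not the XRootD client implementation: do_request, do_ping and reconnect are hypothetical callables supplied by the caller, each of the first two is assumed to raise TimeoutError when its timeout expires, and the timeout values and attempt limit are arbitrary.

```python
def request_with_ping_probe(do_request, do_ping, reconnect,
                            request_timeout=60.0, ping_timeout=2.0,
                            max_attempts=3):
    """Run a request; on timeout, probe with a short ping before reconnecting.

    All three callables are hypothetical stand-ins for client internals.
    """
    for _ in range(max_attempts):
        try:
            # Normal path: the request completes within the usual window.
            return do_request(request_timeout)
        except TimeoutError:
            pass  # a) not serviced within the normal time-out window

        try:
            # b) issue a ping with a much shorter timeout
            do_ping(ping_timeout)
        except TimeoutError:
            # c) no ping response either: tear down and try again
            reconnect()
        # If the ping was answered, the link is alive and the server is
        # merely slow, so the loop simply waits for the request again.

    raise TimeoutError("request not serviced after %d attempts" % max_attempts)
```

Recording the time between successive reconnect() calls would give the data needed for the optimization Andy describes, namely estimating how often a ping must be sent to keep the router from dropping the connection.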