On Fri, 4 Mar 2011, Lukasz Janyst wrote:

> OK, then. I will fix the bug in ROOT, the one pointed out by Fabrizio,
> add the setsockopt flags to support linux builtin keepalive, and in
> longer term work on something for simulating keepalive with kXR_ping.

We may want to make the "ping" mechanism selectable via an option of some
kind. Just a thought.

Andy

> Lukasz
>
> 2011/3/3 Andrew Hanushevsky <[log in to unmask]>:
>> Hi Lukasz,
>>
>> The client is always free to send a keep-alive to the server (a.k.a.
>> kXR_ping request). The timeout for that can be set very short since the
>> response occurs at the top-most layer in the server and turn-around should
>> be in the microsecond range + RTT. That will tell you if the connection is
>> alive. The server can send a ping to the client as well. But neither of
>> these actions will solve many router issues. That's why you need reasonable
>> timeouts. For instance (and why server-side pings are not all that useful),
>> a server-side ping will do little if the router already broke the
>> connection. Yes, the server-side connection *might* get closed but the
>> client will never know that.
>>
>> So, to better address the problem I would proceed as follows:
>> a) If you find out that a request has not been serviced in the normal
>> time-out window,
>> b) Issue a ping with a much shorter timeout.
>> c) If you don't get a response at that point, tear down the connection and
>> try again.
>>
>> For optimization purposes, I would keep track of the time between
>> reconnections. That should provide a window (eventually) of when you must
>> send a ping to keep the dumb router from dropping the connection.
>>
>> Andy
>>
>> -----Original Message----- From: Lukasz Janyst
>> Sent: Thursday, March 03, 2011 3:32 AM
>> To: Fabrizio Furano
>> Cc: Lukasz Janyst ; [log in to unmask] ; Gerardo Ganis ; Brian Bockelman
>> ; Andrew Hanushevsky ; [log in to unmask] ; [log in to unmask] ; Dirk
>> Duellmann ; [log in to unmask] ; [log in to unmask]
>> Subject: Re: [sr #119348] Root reports an error while unzipping the buckets
>> fetched via xroot
>>
>> Hi Fabrizio,
>>
>> thanks for the info. As I say later in my comment, this would
>> indeed help, in the sense that the client would get the response
>> eventually, instead of hanging or crashing, but wouldn't eliminate the
>> real problem: how to handle misbehaving networks.
>>
>> Cheers,
>> Lukasz
>>
>> 2011/3/3 Fabrizio Furano <[log in to unmask]>:
>>>
>>> Hi Lukasz,
>>>
>>> something is fishy here:
>>>
>>>> Of course, on every request timeout I could assume that the connection is
>>>> just broken even though the socket is in a valid state and reconnect, no
>>>> problem about that.
>>>
>>> ... in the sense that this is supposed to be the default, normal
>>> behavior.
>>> On every request timeout the connection must be considered broken, and
>>> completely wiped out. If it does not do it, then this is the issue to
>>> consider first, imo.
>>>
>>> Fabrizio
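
Editor's sketch: the procedure Andy outlines in steps (a) to (c), plus his suggestion to track the time between reconnections, might look roughly like the following. This is an illustration only, not xrootd or ROOT client code. The class name `ConnectionSupervisor`, the `send_ping` and `reconnect` callables, the 1-second ping timeout and the 0.5 safety factor are all assumptions made for the example; a real client would issue a kXR_ping request inside `send_ping`.

```python
import time


class ConnectionSupervisor:
    """Hypothetical helper: decide what to do when a request times out."""

    def __init__(self, send_ping, reconnect, ping_timeout=1.0,
                 clock=time.monotonic):
        # send_ping(timeout) -> bool: assumed to return True only if the
        # server answered the ping within `timeout` seconds.
        self.send_ping = send_ping
        # reconnect(): assumed to tear down the old connection and open a
        # fresh one.
        self.reconnect = reconnect
        self.ping_timeout = ping_timeout
        self.clock = clock
        self.last_connect = clock()
        self.shortest_lifetime = None  # shortest time between reconnections

    def on_request_timeout(self):
        """Call when a request was not serviced in the normal window (a)."""
        # (b) Probe the connection with a much shorter timeout.
        if self.send_ping(self.ping_timeout):
            return "alive"
        # (c) No answer: record how long the connection lasted, then
        # tear it down and start over.
        now = self.clock()
        lifetime = now - self.last_connect
        if self.shortest_lifetime is None or lifetime < self.shortest_lifetime:
            self.shortest_lifetime = lifetime
        self.reconnect()
        self.last_connect = now
        return "reconnected"

    def suggested_ping_interval(self, safety=0.5):
        """Rough guess at how often to ping so the router keeps the link."""
        if self.shortest_lifetime is None:
            return None  # no reconnection observed yet
        return self.shortest_lifetime * safety
```

Note that this follows Andy's ping-first variant. Under Fabrizio's stricter view, every request timeout would go straight to the teardown in step (c) without the probe.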