Hi Andreas,

I think the reason the list is not re-translated every time is that the 
client picks (in a random order) through the list when it reconnects. If 
the DNS itself returns the list in random order (as most do now), then the 
process may at best become inefficient (i.e., failing hosts being 
unnecessarily retried) and at worst never converge.

Most larger sites set up a DNS entry with multiple addresses so that they 
can automatically fail over in this mode of operation. As I read the 
code, this process works only if the list of addresses is stable (at least 
until all of them have been tried once). That is why the list is 
translated only once.
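
To illustrate the stability requirement, here is a minimal Python sketch. 
It is not the xrootd client code: the resolver, the function names, and 
the "walk the list by index" retry loop are all assumptions made for the 
example, and the rotating DNS answer is a deliberately contrived worst 
case rather than the random ordering real servers use.

```python
from collections import deque


def make_rotating_resolver(addresses):
    """Hypothetical DNS stand-in: every call returns the same addresses,
    rotated right by one more position than the previous call."""
    calls = 0

    def resolve():
        nonlocal calls
        answer = deque(addresses)
        answer.rotate(calls)  # rotate right by the number of earlier calls
        calls += 1
        return list(answer)

    return resolve


def hosts_tried(resolve, attempts, re_resolve):
    """Return the host contacted on each attempt.

    The client walks the list by position (attempt i uses entry i mod n).
    With re_resolve=False the list is translated once and reused; with
    re_resolve=True it is translated again before every retry.
    """
    cached = resolve()
    tried = []
    for i in range(attempts):
        addrs = resolve() if (re_resolve and i > 0) else cached
        tried.append(addrs[i % len(addrs)])
    return tried


stable = hosts_tried(make_rotating_resolver(["A", "B", "C"]), 3, False)
unstable = hosts_tried(make_rotating_resolver(["A", "B", "C"]), 3, True)
print(stable)    # every address is visited once
print(unstable)  # the same address is retried on every attempt
```

With the list translated once, three attempts cover all three addresses. 
With re-translation before each retry and this particular ordering, the 
client lands on the same host every time, which is the "never converge" 
case if that host is the one that is down.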

Three options exist: a) the easy one is to have an option (e.g. an envar) 
control the behaviour, with the old behaviour being the default; b) only 
retranslate the list after all entries have been tried (I think this is 
much harder); and c) retranslate only if the DNS call returned a single 
entry (I don't know if this is really what you are after, but it's the 
safest).
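
The three options can be sketched as a single decision function. This is 
only a rough Python illustration under stated assumptions: the variable 
name EXAMPLE_RETRANSLATE_POLICY and the policy names are hypothetical, 
not existing xrootd settings, and the bookkeeping that makes option (b) 
hard in the real client is reduced here to a simple set comparison.

```python
import os


def policy_from_env(env=os.environ):
    """Option (a): read the policy from a hypothetical environment
    variable, keeping the old behaviour ("never") as the default."""
    return env.get("EXAMPLE_RETRANSLATE_POLICY", "never")


def should_retranslate(policy, cached_addresses, tried_addresses):
    """Decide whether to resolve the DNS alias again before a retry."""
    if policy == "never":
        # Old behaviour: translate once, reuse the list forever.
        return False
    if policy == "after_full_pass":
        # Option (b): only after every cached entry has been tried.
        return set(tried_addresses) >= set(cached_addresses)
    if policy == "single_entry":
        # Option (c): only if the DNS call returned a single address.
        return len(cached_addresses) == 1
    raise ValueError(f"unknown policy: {policy}")


# Usage: an alias that resolved to one address, after one failed attempt.
print(should_retranslate(policy_from_env({}), ["10.0.0.1"], ["10.0.0.1"]))
print(should_retranslate("single_entry", ["10.0.0.1"], ["10.0.0.1"]))
```

Under option (c) a multi-address alias keeps the stable list, so the 
existing fail-over behaviour is untouched.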

I still don't quite understand how retranslating the DNS will allow you to 
be more fault tolerant unless the DNS knows to return an address that is 
working when the one it has is not working. Hint?

Andy

On Wed, 8 Jun 2011, Andreas-Joachim Peters wrote:

> Hi,
> I have the following request to change some basic behaviour of the xrootd
> client code.
>
> Currently when an XrdClient::Open was issued or XrdClientAdmin::Connect any
> DNS alias is only resolved once in the beginning and then stays forever as
> target in the loop which honours the settings for retry/reconnect etc.
>
> Andy ... are there any objections to changing this behaviour and resolving
> the alias again before each retry? I would need that behaviour to have an
> active/passive failover via DNS alias configured.
>
> The code change is trivial (two lines inserted).
>
> Cheers Andreas.
>