Hi Andreas,
I think the reason the list is not re-translated every time is that the
client picks (in a random order) through the list when it reconnects. If
the DNS itself returns the list in random order (as most do now), then the
process may, at best, become inefficient (i.e., failing hosts being
unnecessarily retried) and, at worst, never converge.
Most larger sites set up a DNS entry with multiple addresses so that they
can automatically fail over using this mode of operation. As I look at the
code, this process works only if the list of addresses is stable (at least
until all of them have been tried once). That's why the list is
translated only once.
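
A minimal sketch of the worst case described above, not the actual xrootd client code. It assumes a positional cursor instead of the client's random pick, and a hypothetical resolver (RotatingDNS) that rotates its answer by one place per query as a deterministic stand-in for a DNS that reorders its answers. All names here are made up for illustration.

```python
from collections import deque


class RotatingDNS:
    """Hypothetical resolver: same addresses, rotated one place per query."""

    def __init__(self, addresses):
        self._ring = deque(addresses)

    def resolve(self):
        snapshot = list(self._ring)
        self._ring.rotate(1)  # the next answer is shifted right by one
        return snapshot


def hosts_tried(dns, attempts, reresolve):
    """Return the host contacted on each attempt.

    The cursor advances by one position per attempt. If reresolve is
    true, the list is translated again before every retry.
    """
    addresses = dns.resolve()  # initial translation
    tried = []
    for attempt in range(attempts):
        if reresolve and attempt > 0:
            addresses = dns.resolve()
        tried.append(addresses[attempt % len(addresses)])
    return tried


stable = hosts_tried(RotatingDNS(["a", "b", "c"]), 3, reresolve=False)
moving = hosts_tried(RotatingDNS(["a", "b", "c"]), 3, reresolve=True)
print(stable)  # every host is tried once
print(moving)  # the same host is retried on every attempt
```

With a stable list the cursor covers every address; with this particular reordering it lands on the same host each time, which is the non-converging case. A truly random reordering would fail less regularly, but it gives no guarantee either.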
Three options exist: a) the easy one is to have an option (e.g. an envar)
control the behaviour, with the old behaviour being the default; b)
retranslate the list only after all entries have been tried (I think this
is much harder); and c) retranslate only if the DNS call returned a single
entry (I don't know if this is really what you are after, but it's the
safest).
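
The three options above could be expressed as one decision function, roughly as follows. This is only a sketch under assumptions: the function, the policy letters as arguments, and the environment variable name EXAMPLE_RETRANSLATE are all hypothetical and do not exist in xrootd.

```python
def should_retranslate(policy, addresses, tried, environ):
    """Decide whether to resolve the DNS alias again before the next retry.

    policy    -- "a", "b" or "c", matching the options in the text
    addresses -- addresses returned by the last translation
    tried     -- addresses already attempted since that translation
    environ   -- mapping of environment variables (e.g. os.environ)
    """
    if policy == "a":
        # Opt-in switch; the old behaviour (no retranslation) is the default.
        # EXAMPLE_RETRANSLATE is a made-up name for illustration.
        return environ.get("EXAMPLE_RETRANSLATE", "0") == "1"
    if policy == "b":
        # Only after every entry of the current list has been tried once.
        return set(addresses) <= set(tried)
    if policy == "c":
        # Only when the DNS call returned a single entry.
        return len(addresses) == 1
    raise ValueError("unknown policy: %r" % (policy,))


print(should_retranslate("a", ["h1", "h2"], [], {}))
print(should_retranslate("b", ["h1", "h2"], ["h1", "h2"], {}))
print(should_retranslate("c", ["h1"], [], {}))
```

Option b) is the one that needs extra state (the set of hosts already tried), which is probably why it is the harder change.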
I still don't quite understand how retranslating the DNS will allow you to
be more fault tolerant unless the DNS knows to return an address that is
working when the one it has is not working. Hint?
Andy
On Wed, 8 Jun 2011, Andreas-Joachim Peters wrote:
> Hi,
> I have the following request to change some basic behaviour of the xrootd
> client code.
>
> Currently, when XrdClient::Open or XrdClientAdmin::Connect is issued, any
> DNS alias is resolved only once at the beginning and then stays forever as
> the target in the loop which honours the settings for retry/reconnect etc.
>
> Andy ... are there any objections to changing this behaviour and resolving
> the alias again before each retry? I would need that behaviour to have an
> active/passive failover via DNS alias configured.
>
> The code change is trivial (two lines inserted).
>
> Cheers Andreas.
>