Ok,
it can be done even more easily, because the logic is structured like this:

<Resolve URLs>;
for (try=0; try< maxretry; ....)  {
   while <resolved url-left> {
     // try connect
     if (failed) <remove tried url>;
   }
}

So I propose to resolve inside the retry loop; all aliases will still be
tried in the inner loop:

for (try=0; try< maxretry; ....)  {
   <Resolve URLs>;
   while <resolved url-left> {
     // try connect
     if (failed) <remove tried url>;
   }
}
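
The proposed loop could be sketched in C++ roughly as follows. This is only a
minimal illustration under stated assumptions: connectWithRetry, Resolver and
Connector are hypothetical names introduced here and are not taken from the
xrootd client code, and the real retry/reconnect settings are reduced to a
single maxRetry count. It needs C++17 for std::optional.

```cpp
#include <deque>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical callback types; these are NOT part of the xrootd client API.
using Resolver  = std::function<std::vector<std::string>(const std::string&)>;
using Connector = std::function<bool(const std::string&)>;

// Returns the address that accepted the connection, or std::nullopt if
// every attempt failed.
std::optional<std::string> connectWithRetry(const std::string& alias,
                                            int maxRetry,
                                            const Resolver& resolve,
                                            const Connector& tryConnect)
{
    for (int attempt = 0; attempt < maxRetry; ++attempt) {
        // Resolve inside the retry loop so a changed DNS alias is picked up.
        const std::vector<std::string> resolved = resolve(alias);
        std::deque<std::string> remaining(resolved.begin(), resolved.end());

        // The list stays fixed until every resolved address was tried once.
        while (!remaining.empty()) {
            const std::string candidate = remaining.front();
            if (tryConnect(candidate)) {
                return candidate;
            }
            remaining.pop_front();  // failed: remove the tried address
        }
    }
    return std::nullopt;
}
```

Because the alias is resolved once per retry and not once per connection
attempt, every address returned by one lookup is tried before the next
lookup happens.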

Is that ok?

Cheers Andreas.

On Wed, Jun 8, 2011 at 11:00 AM, Andrew Hanushevsky
<[log in to unmask]> wrote:

> Hi Andreas,
>
> If you limit it to a list size of 1 then it would be perfect. Please submit
> the patch.
>
>
> Andy
>
> On Wed, 8 Jun 2011, Andreas-Joachim Peters wrote:
>
>> Hi Andy,
>> the DNS is manipulated so that it always points to the working machine (the
>> machine in production).
>>
>> If we could do it only in the case where the list size = 1, that would be
>> perfectly fine and the default case for real load balancing would still be
>> like before.
>>
>> Cheers Andreas.
>>
>> On Wed, Jun 8, 2011 at 5:02 AM, Andrew Hanushevsky <[log in to unmask]>
>> wrote:
>>
>>> Hi Andreas,
>>>
>>> I think the reason the list is not re-translated every time is that the
>>> client picks through the list (in a random order) when it reconnects. If
>>> the DNS itself returns the list in random order (as most do now), then the
>>> process may at best become inefficient (i.e., failing hosts being
>>> unnecessarily retried) and at worst never converge.
>>>
>>> Most larger sites set up a DNS entry with multiple addresses so that they
>>> can automatically fail over in this mode of operation. As I look at the
>>> code, this process works only if the list of addresses is stable (at least
>>> until all of them have been tried once). That is why the list is
>>> translated only once.
>>>
>>> Three options exist: a) the easy one is to have an option (e.g. envar)
>>> control the behaviour, with the old behaviour being the default; b) only
>>> retranslate the list after all entries have been tried (I think this is
>>> much harder); and c) retranslate only if the DNS call returned a single
>>> entry (I don't know if this is really what you are after, but it's the
>>> safest).
>>>
>>> I still don't quite understand how retranslating the DNS will allow you
>>> to be more fault tolerant unless the DNS knows to return an address that
>>> is working when the one it has is not working. Hint?
>>>
>>> Andy
>>>
>>>
>>> On Wed, 8 Jun 2011, Andreas-Joachim Peters wrote:
>>>
>>>> Hi,
>>>>
>>>> I have the following request to change some basic behaviour of the
>>>> xrootd client code.
>>>>
>>>> Currently, when XrdClient::Open or XrdClientAdmin::Connect is issued, any
>>>> DNS alias is resolved only once at the beginning and then stays the
>>>> target forever in the loop which honours the settings for
>>>> retry/reconnect etc.
>>>>
>>>> Andy ... are there any objections to changing this behaviour and
>>>> resolving the alias again before each retry? I would need that behaviour
>>>> in order to have an active/passive failover configured via DNS alias.
>>>>
>>>> The code change is trivial (two lines inserted).
>>>>
>>>> Cheers Andreas.
>>>>
>>>>
>>>>
>>