ciao Jan, 
you completely got my point. Getting 40 sites to restart a daemon to pick up a new IP (we are NEVER going to remove IPs; we just shut machines down if needed) is overkill, and impossible on any decent time scale ...

ciao ciao

tmo


On Wed, Jul 30, 2014 at 12:34 PM, Jan Iven <[log in to unmask]> wrote:
On 07/29/2014 09:42 PM, Andrew Hanushevsky wrote:
On 7/21/14, 3:05 AM, Tommaso Boccali wrote:
Ciao,
we are in the situation where we would like to add another redirector to
the DNS-RR setup (so a third one under the same DNS entry). If I
understand correctly, everyone (servers or local cmsd) using the DNS-RR
would need to be restarted, since they resolve hostname -> IPs at the
start. This is quite painful, clearly.
[..]

> DNS caching exists in xrootd only as of 4.0.0. We added it
because people were running into DNS issues when a lot of worker
machines would all of a sudden connect at the same time. While using the
nodnr option would completely avoid the DNS, some sites were annoyed
that it leaves the log with nothing but IP addresses. Anyway, the
default is 3 hours but you can set it to whatever you want (see
xrd.network cache)....

My impression is that the question was specifically about multihomed DNS aliases, in particular those used with the '+'-syntax ("all.manager meta any DNSALIAS.DOMAIN+").
It looks like xrootd/cmsd resolves the alias at startup, then keeps track of all machines behind such an alias via their IP address/individual hostname. I think this is independent of the generic hostname->IP caching in Xrootd4.

Is there a way to get the daemons to re-resolve the DNS alias at some stage? Ideally this would happen automatically, e.g. on connection failures, or once a day, etc.
The use case is of course a new machine being brought into service, usually without taking over the IP address of a previous machine: simple DNS aliasing is much easier than full-blown IP-level HA.
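
To make the request concrete, here is a minimal Python sketch of the behaviour being asked for: re-resolve the alias when the cached answer is older than some limit or after a connection failure, and only ever add addresses (matching the "we never remove IPs" case). This is an illustration, not how xrootd/cmsd is implemented; the names AliasTracker, resolve_alias, max_age and refresh_if_stale are hypothetical, and the one-day default is just the "once a day" figure from the question.

```python
import socket
import time
from typing import Callable, Iterable, Optional, Set


def resolve_alias(alias: str) -> Set[str]:
    """Return every IP address currently published behind a DNS alias."""
    infos = socket.getaddrinfo(alias, None, type=socket.SOCK_STREAM)
    return {info[4][0] for info in infos}


class AliasTracker:
    """Hypothetical helper: remembers the hosts behind one DNS alias."""

    def __init__(
        self,
        alias: str,
        resolver: Callable[[str], Iterable[str]] = resolve_alias,
        max_age: float = 86400.0,  # assumed refresh interval: once a day
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.alias = alias
        self.members: Set[str] = set()
        self._resolver = resolver
        self._max_age = max_age
        self._clock = clock
        self._last_refresh: Optional[float] = None

    def refresh(self) -> Set[str]:
        """Re-resolve the alias and return only the newly seen addresses."""
        try:
            current = set(self._resolver(self.alias))
        except OSError:
            # DNS lookup failed: keep the known members and retry later.
            return set()
        added = current - self.members
        self.members |= added  # addresses are added, never removed
        self._last_refresh = self._clock()
        return added

    def refresh_if_stale(self, connection_failed: bool = False) -> Set[str]:
        """Refresh when the cached answer is too old or a connection failed."""
        stale = (
            self._last_refresh is None
            or self._clock() - self._last_refresh >= self._max_age
        )
        if stale or connection_failed:
            return self.refresh()
        return set()
```

A daemon built this way would call refresh_if_stale() from its housekeeping loop and refresh_if_stale(connection_failed=True) from its error path, then open connections to whatever addresses come back, so a third redirector added to the alias would be picked up without a restart.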

For the specific case of federations, the clients (aka local redirectors) are in different administrative domains, so scheduling a full restart requires talking to lots of people - not just some SSHing.

Regards,
jan



--
Tommaso Boccali
INFN Pisa

