ok, thanks to all... just one last question to make sure I understood.

If I set up
host xrootd.infn.it
1.1.1.1
2.2.2.2

(two or in general > 1 answers)

and then do

TFile::Open("root://xrootd.infn.it//store/foo.root")

did I get correctly that root (or CMS SW) will try to use either 1.1.1.1 or 2.2.2.2, and if that is down will try with the other one automatically?

thanks a lot!

tom



On Fri, Mar 28, 2014 at 9:33 PM, Matevz Tadel <[log in to unmask]> wrote:
Hi,

Won't the client try other IPs if the first one fails?

I'd also recommend not to mess with DNS ... changes can take a really long time to propagate ... probably longer than it takes to fix the problem :) Or you want to mess with local DNS at each cluster?

In short, I'd expect that one RR-DNS entry will work ok for both cases:
- jobs falling back to xrootd;
- managers connecting to meta managers.

Matevz


On 03/28/14 13:20, Andrew Hanushevsky wrote:
Hi Tommaso,

Indeed, if this is what you plan to implement then you would need one entry that
is stable and another that is not. While I don't quite understand what you are
trying to accomplish I can say that anything that relies on the presence or
absence of an entry in DNS will likely not work. Why? Because imagine that there
are jobs running between the time a server dies and the time the dead server is
removed from DNS, which could be a substantial delay in the eyes of the jobs.
What happens to those jobs? Likely they will fail because they rely on the DNS
entries to be correct. So, any scheme to avoid dead servers really has to be
handled by the application not some external agent.
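
A minimal Python sketch of what handling this in the application could look like: resolve every address behind the name once, then try each in turn until one answers. This only illustrates the idea and is not how the XRootD client or CMSSW implements it. The helper names (resolve_all, connect_with_fallback) and the injectable connect parameter are hypothetical, introduced here for illustration.

```python
import socket


def resolve_all(host, port):
    """Return every distinct IP address the resolver reports for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def connect_with_fallback(addresses, port, timeout=5.0,
                          connect=socket.create_connection):
    """Try each address in order and return (address, connection)
    for the first one that accepts the connection.

    `connect` is injectable so the logic can be exercised without a network.
    """
    errors = {}
    for addr in addresses:
        try:
            return addr, connect((addr, port), timeout)
        except OSError as exc:
            # A dead server is skipped here, not removed from DNS.
            errors[addr] = exc
    raise ConnectionError(f"no address reachable: {errors}")
```

With both 1.1.1.1 and 2.2.2.2 left in DNS, a call such as connect_with_fallback(resolve_all("xrootd.infn.it", 1094), 1094) would skip whichever server is down, assuming the redirectors listen on port 1094 (the usual xrootd default).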

Andy

On Fri, 28 Mar 2014, Tommaso Boccali wrote:

ciao andrew, understood & it makes sense, thanks.
But then I need a different solution for stageout.
As you explained to me here,

all.manager meta all xrootd.infn.it+ 1213  (probably wrong since gmail is
playing with the text, but nevermind ... it is what you wrote ;)

needs xrootd.infn.it to be the list of all possible redirectors, regardless
of their state. Fine.

On the other hand, for CMSSW fallback I need to specify something like

if file /store/file.root not locally available --> try root://xrootd.infn.it//store/file.root

in this case, instead, I want xrootd.infn.it to resolve only to those
redirectors which are _currently_ ok, right?

then I am afraid I need 2 DNS aliases

- xrootd-fulllist.infn.it : the machines which are installed as redirector,
from which nagios never removes anything
- xrootd.infn.it : the subset of the previous with only currently working
redirectors, to be used in the fallback statement.

Is this correct?

thanks again!

tom



On Fri, Mar 28, 2014 at 8:41 PM, Andrew Hanushevsky
<[log in to unmask]> wrote:

Hi Tommaso,

See below...


On Fri, 28 Mar 2014, Tommaso Boccali wrote:

 Let's say we prepare 2 regional redirectors, 1.1.1.1 and 2.2.2.2, and we
put them in the DNS as xrootd.infn.it (no round robin: "host
xrootd.infn.it"
will return 2 IP addresses).
Since we want to use xrootd.infn.it as fallback, we plan to have a nagios
test which checks 1.1.1.1 and 2.2.2.2 periodically, and in case one is NOT
ok, it is removed from the DNS.

You should never remove anything from DNS; it will break all the
recoverability aspects of xrootd. Even if it's broken it should remain in
DNS. Hence, you don't need anything special. Just leave both servers in DNS
all the time.

> So, question was:

let's say that site ABCD needs to restart the xrootd local servers, which
are configured as

all.manager meta all *xrootd.infn.it <http://xrootd.infn.it>*+ 1213

Uhm, the above won't work. Perhaps you really wanted to say


all.manager meta all xrootd.infn.it+ 1213

what happens if AT THE RESTART MOMENT xrootd.infn.it only resolves  to
1.1.1.1 (since eventually 2.2.2.2 is broken)? And even more, what if 2
hours later 2.2.2.2 enters again the DNS resolution for

It won't re-resolve. That's why you always leave both addresses in DNS.

Andy




--
Tommaso Boccali
INFN Pisa


########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1




--
Tommaso Boccali
INFN Pisa

