Hi Tommaso,

Indeed, if this is what you plan to implement, then you would need one 
entry that is stable and another that is not. While I don't quite 
understand what you are trying to accomplish, I can say that anything that 
relies on the presence or absence of an entry in DNS will likely not work. 
Why? Because there may be jobs running between the time a server 
dies and the time the dead server is removed from DNS, which could be a 
substantial delay in the eyes of the jobs. What happens to those jobs? 
Likely they will fail because they rely on the DNS entries being correct. 
So any scheme to avoid dead servers really has to be handled by the 
application, not by some external agent.
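
To make "handled by the application" a bit more concrete, here is a minimal 
Python sketch of the idea: look up every address behind the alias and try 
each one in turn, skipping any that do not answer. It is only an 
illustration under stated assumptions, not xrootd or CMSSW code. The 
function names resolve_all and first_reachable are hypothetical, port 1094 
is assumed, and a plain TCP connect stands in for a real health check.

```python
import socket


def resolve_all(host, port):
    """Return the distinct IP addresses the DNS name currently maps to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:  # keep DNS order, drop duplicates
            addresses.append(ip)
    return addresses


def first_reachable(addresses, port, timeout=5.0,
                    connect=socket.create_connection):
    """Return the first address that accepts a TCP connection, else None.

    A dead server costs one timeout and is then skipped; nothing has to
    be removed from DNS for the caller to get past it.
    """
    for ip in addresses:
        try:
            conn = connect((ip, port), timeout=timeout)
        except OSError:
            continue  # this one is down or unreachable, try the next
        conn.close()
        return ip
    return None


if __name__ == "__main__":
    # Hypothetical usage: the alias always lists every redirector.
    candidates = resolve_all("xrootd.infn.it", 1094)
    print(first_reachable(candidates, 1094))
```

With this kind of logic on the application side, the alias can keep listing 
both redirectors all the time, and a job that starts while one of them is 
down just moves on to the other.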

Andy

On Fri, 28 Mar 2014, Tommaso Boccali wrote:

> Ciao Andrew, understood, and it makes sense, thanks.
> But then I need a different solution for stageout.
> As you explained to me here,
>
> all.manager meta all xrootd.infn.it+ 1213  (probably wrong since gmail is
> playing with the text, but nevermind ... it is what you wrote ;)
>
> needs xrootd.infn.it to be the list of all possible redirectors, regardless
> of their state. Fine.
>
> On the other hand, for CMSSW fallback I need to specify something like
>
> if file /store/file.root not locally available --> try
> root://xrootd.infn.it//store/file.root
>
> in this case, instead, I want xrootd.infn.it to resolve only to those
> redirectors which are _currently_ ok, right?
>
> then I am afraid I need 2 DNS aliases
>
> - xrootd-fulllist.infn.it : the machines which are installed as redirector,
> from which nagios never removes anything
> - xrootd.infn.it : the subset of the previous with only currently working
> redirectors, to be used in the fallback statement.
>
> Is this correct?
>
> thanks again!
>
> tom
>
>
>
> On Fri, Mar 28, 2014 at 8:41 PM, Andrew Hanushevsky
> <[log in to unmask]>wrote:
>
>> Hi Tommaso,
>>
>> See below...
>>
>>
>> On Fri, 28 Mar 2014, Tommaso Boccali wrote:
>>
>>> Let's say we prepare 2 regional redirectors, 1.1.1.1 and 2.2.2.2, and we
>>> put them in the DNS as xrootd.infn.it (no round robin: "host xrootd.infn.it"
>>> will return 2 IP addresses).
>>> Since we want to use xrootd.infn.it as fallback, we plan to have a nagios
>>> test which checks 1.1.1.1 and 2.2.2.2 periodically, and in case one is NOT
>>> ok, it is removed from the DNS.
>>>
>> You should never remove anything from DNS; it will break all the
>> recoverability aspects of xrootd. Even if a server is broken, it should remain in
>> DNS. Hence, you don't need anything special. Just leave both servers in DNS
>> all the time.
>>
>>> So, question was:
>>>
>>> let's say that site ABCD needs to restart the xrootd local servers, which
>>> are configured as
>>>
>>> all.manager meta all *xrootd.infn.it <http://xrootd.infn.it>*+ 1213
>>>
>> Uhm, the above won't work. Perhaps you really wanted to say
>>
>>
>> all.manager meta all xrootd.infn.it+ 1213
>>
>>> what happens if AT THE RESTART MOMENT xrootd.infn.it only resolves to
>>> 1.1.1.1 (since 2.2.2.2 is possibly broken)? And even more, what if 2
>>> hours later 2.2.2.2 enters again the DNS resolution for
>>>
>> It won't re-resolve. That's why you always leave both addresses in DNS.
>>
>> Andy
>>
>
>
>
> -- 
> Tommaso Boccali
> INFN Pisa
>
