On Friday, March 28, 2014, Andrew Hanushevsky wrote:
Hi Tommaso & Marian,

I apologize that we didn't get a prompt response back. Somehow this slipped through the cracks. So, let me intersperse my answers in the text below.

On Thu, 27 Mar 2014, Tommaso Boccali wrote:

ok, but what happens if a site restarts its cmsd when the redirector points
just to one redirector (because the other is off)?
when the second redirector comes back in the DNS list, will the site pick
it up or not?
When a site restarts its cmsd when one of the redirectors is off.

uhm, not sure I was explaining correctly myself in first place.

Let's say we prepare 2 regional redirectors, and, and we put them in the DNS as (no round robin: "host" will return 2 IP addresses).
Since we want to use as fallback, we plan to have a nagios test which checks and periodically, and in case one is NOT ok, it is removed from the DNS.

So, question was:
let's say that site ABCD needs to restart the xrootd local servers, which are configured as 

all.manager meta all 1213

what happens if AT THE RESTART MOMENT only resolves  to (since eventually is broken)? And even more, what if 2 hours later enters again the DNS resolution for (since it recovered and the nagios realized)? Has the cmsd running at site ABCD a way to recognize that the # of IPs served by has changed, and hence consider also as a manager?

When all is used, all traffic marked to go to the one that is offline is shifted to a particular redirector that is still working. When that redirector comes back online the traffic is shifted back as it would have been had the redirector been online at the time of restart.

When any is used, all traffic marked to go to that redirector, if any, is shifted to another working redirector. When that redirector comes back online the traffic is normally shifted back to that redirector if it was the redirector of choice (i.e. the one originally chosen to receive all of the traffic).
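To make the DNS side of the question concrete, here is a minimal sketch of how one could check from outside whether the set of addresses behind an alias has changed between two lookups. It only illustrates the monitoring idea and says nothing about what the cmsd does internally. The function names and the sample addresses (taken from the documentation range 192.0.2.0/24) are hypothetical.

```python
import socket


def resolve_all(host, port=1213):
    """Return the set of IP addresses the resolver currently gives for host."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address.
    return {info[4][0] for info in infos}


def diff_addresses(before, after):
    """Report which addresses appeared or disappeared between two lookups."""
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
    }


# Hypothetical snapshots: one redirector in the DNS at restart time,
# two once the second has recovered and been put back.
at_restart = {"192.0.2.10"}
two_hours_later = {"192.0.2.10", "192.0.2.20"}
print(diff_addresses(at_restart, two_hours_later))
```

In practice one would call resolve_all() on the alias at intervals and feed consecutive results to diff_addresses().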

On Thu, Mar 27, 2014 at 4:08 PM, Marian Zvada wrote:

We have the very same setup for CMS T2, which is currently in the commissioning process.
We have a DNS round-robin alias between FNAL and UNL and use the following in
the config:

all.manager meta all 1213
( and behind that)
Yes, this is the correct way of doing this. XRootD will take over the parcelling out of traffic, as DNS round-robin is (and always has been) problematic for the way XRootD needs to load balance requests.

=> I think this should be adopted for your setup as well,
though pointing to a different aliased host.

As for the "all" parameter, I already asked on the developers
list which is more beneficial, all || any, but haven't had a response so far. Here
it's documented, but I didn't fully understand what's recommended from an
operations point of view when using a DNS-RR setup:
Indeed, we should have had a better explanation. Could either of you post a problem ticket that requests that the documentation discuss why you would choose one over the other so it doesn't fall through the cracks?

In absence of that let me explain.

Choosing any is suitable when your interaction rate with xrootd is not overtaxing the system. You may be overtaxing the system if the CPU usage of the cmsd is getting high or its memory starts getting out of hand (say, the cmsd using more than 10% of the CPU or sitting on more than 2GB of memory -- though we haven't really benchmarked that). With any, all traffic goes to a single cmsd and the other redirectors listed are merely used as a fallback if the primary cmsd fails.

Choosing all is suitable when your interaction rate with xrootd is very high and you wish to distribute that load across all of the cmsd's that are available. This is suitable when all of the cmsd's are roughly of the same power (if not, you will get rather odd skew effects in responsiveness). Here, requests are parcelled out using a deterministic algorithm so that each request relative to a particular file always goes to the same cmsd, thus maximizing the use of the routing cache.
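The difference between the two modes can be sketched roughly as follows. This is only an illustration of the behaviour described above, not the algorithm the cmsd actually uses. The CRC32 hash, the function name, the redirector names and the assumption that the "redirector of choice" under any is simply the first one listed are all mine.

```python
import zlib


def pick_redirector(path, redirectors, mode, alive):
    """Pick a redirector for a file path under 'any' or 'all' semantics.

    redirectors: ordered list of configured redirectors (hypothetical names)
    alive:       set of redirectors currently reachable
    """
    working = [r for r in redirectors if r in alive]
    if not working:
        raise RuntimeError("no redirector available")

    if mode == "any":
        # Everything goes to one redirector; the others are only fallbacks.
        # Assumption: the first listed working redirector is the one chosen.
        return working[0]

    if mode == "all":
        # Deterministic spread: the same path always maps to the same
        # redirector, which keeps that redirector's routing cache useful.
        key = zlib.crc32(path.encode("utf-8"))
        preferred = redirectors[key % len(redirectors)]
        if preferred in alive:
            return preferred
        # Preferred redirector is offline: shift its traffic to a working
        # one. The mapping returns to 'preferred' once it is back in 'alive'.
        return working[key % len(working)]

    raise ValueError("mode must be 'any' or 'all'")


redirectors = ["rdr-a", "rdr-b"]  # hypothetical names
print(pick_redirector("/store/file1.root", redirectors, "any", {"rdr-a", "rdr-b"}))
print(pick_redirector("/store/file1.root", redirectors, "all", {"rdr-a", "rdr-b"}))
```

Because the preferred redirector is computed over the full configured list rather than over the working subset, traffic moves back to it automatically once it recovers, which is the behaviour described for all earlier in this thread.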

However, we stick so far with:
all.manager meta all 1213

Also, currently each of the top redirectors in the DNS-RR alias subscribes to
the global redirector:
all.manager meta any 1098
(see 'any' in this case)
Yes, I see that. So is there any reason why you used any here while you used all elsewhere? Not that it likely matters but I am curious why the difference.


Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-L list, click the following link:

Tommaso Boccali
