Hi Tommaso,

Hmmm, well I do understand. Unfortunately, this is a case of wanting to have your cake and eat it too (or some such silly comparison). If the two redirectors are not the same then you really can’t treat them the same. For instance, assume you used a static ENOENT redirect in each redirector to point to the global manager. If the client happened to use the cmsd with fewer registrants and the file wasn’t found, the client would be sent up the chain. The problem is that the redirector that sends the client up the chain also tells the meta manager to exclude it and its companion from further consideration. Why? Because they were declared equal even though they are not, and they have no way of knowing that. So this is why I say this is a classic case of wanting two mutually exclusive things.
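
To make the failure mode above concrete, here is a toy Python model. It is not xrootd code: the server names, the file path, and the `locate`/`open_via` helpers are all made up for illustration, and it ignores any other sites the global manager might know about.

```python
# Toy model only: server names and file placement are hypothetical.
REGISTRANTS = {
    "ba": {"srvA", "srvB"},   # redirector with more registrants
    "redic": {"srvC"},        # redirector with fewer registrants
}
FILES = {"/store/f.root": "srvA"}            # file held by a server only "ba" knows
PEERS = {"ba": {"redic"}, "redic": {"ba"}}   # declared equal to each other


def locate(path, entry, excluded=frozenset()):
    """Return the server holding `path` as seen by `entry`, else None."""
    if entry in excluded:
        return None
    server = FILES.get(path)
    return server if server in REGISTRANTS[entry] else None


def open_via(path, entry):
    """Client enters at `entry`. On a miss it is sent up the chain, and the
    meta manager excludes `entry` plus its declared-equal companion."""
    hit = locate(path, entry)
    if hit:
        return hit
    excluded = {entry} | PEERS[entry]
    for other in REGISTRANTS:
        hit = locate(path, other, excluded)
        if hit:
            return hit
    return None


via_ba = open_via("/store/f.root", "ba")        # found directly
via_redic = open_via("/store/f.root", "redic")  # miss, then both excluded
```

In this sketch the same file is found when the client lands on "ba" but is lost when it lands on "redic", because the miss causes both redirectors to be skipped on the way back down.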

So the question is what we can do while this migration is in progress. Well, that depends on where we are in the process. If xrootd.ba.infn.it (50 servers registered) is a superset of xrootd-redic.pi.infn.it (30 servers registered) then we would have a chance by creating a hierarchical scheme. But from what I gather the two redirectors have non-intersecting registrants, which makes them quite different.

If so, your only solution is to create a meta-manager for both of them in the interim and let clients start there. The meta-manager will then create a union of the registrants of both redirectors and things will, once again, work the way you want. You can have as many of these as you want but one level will do, though you will need to alter the config a bit from the normal setup. Of course, migrating out of that once people convert will be yet another issue, sigh.
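
As a rough illustration of the superset-versus-union distinction, the following Python sketch treats each redirector's registrants as a set. The server names are hypothetical placeholders, not the real registrants, and this is only a model of the idea, not of how cmsd tracks subscriptions.

```python
# Toy model only: the server names are hypothetical.
ba_servers = {"srv01", "srv02", "srv03"}   # registered at xrootd.ba.infn.it
redic_servers = {"srv04", "srv05"}         # registered at xrootd-redic.pi.infn.it

# A hierarchical scheme would need one set to contain the other.
ba_is_superset = ba_servers >= redic_servers

# The situation described in the thread: no overlap at all.
disjoint = ba_servers.isdisjoint(redic_servers)

# View of an interim meta-manager sitting above both redirectors.
meta_view = ba_servers | redic_servers

# Servers each redirector cannot see on its own.
missing_from_ba = meta_view - ba_servers
missing_from_redic = meta_view - redic_servers
```

Each redirector alone misses the other's servers, while the union seen from one level up misses none, which is why clients would start at the meta-manager during the migration.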

Does this make sense?

Andy

From: Tommaso Boccali
Sent: Tuesday, July 08, 2014 3:39 PM
To: xrootd/xrootd
Cc: Andrew Hanushevsky
Subject: Re: [xrootd] xrootd fallback only populates "tried" for one redirector (#124)

ciao Andy!

- I am pinging directly to xrootd-redic.pi.infn.it (and not to the DNS RR)
since I did not modify the other copy. did not want to risk too much here
(for the reason I am trying to explain below)

I have a doubt about the picture you sent via the link. It makes sense only
if all cmsd and xrootd are equal. In our setup (with ~40 sites connected),
we had before only xrootd.ba.infn.it as redir. Now we asked the sites to
move to xrootd-cms.infn.it, but after two weeks ~ 50% of the sites moved (a
lot of inertia, but it is difficult to go and twist arms ...).

so let's suppose
- xrootd.ba.infn.it has 50 servers registered in cmsd
- xrootd-redic.pi.infn.it has 30.

what would happen if
xrootd.ba.infn.it:1094 -> xrootd-redic.pi.infn.it:1213 (which can happen if
I put as manager for xrootd.ba.infn.it the DNS RR...) ?
the cmsd would not have all the servers registered, and hence can fail
on a file? it can even be ok if fail means go up to the global redirector,
and then down to xrootd.ba.infn.it:1213 (even if at this point a random EU
or US site would be used).

So, how does the picture change if the cmsds cannot be considered identical
in the number of connected servers?

I know the correct answer would be "force them to change", but .... ok, I
guess you understand ;)

thanks a lot

tom



On Tue, Jul 8, 2014 at 11:37 PM, xrootd-dev <[log in to unmask]>
wrote:

> Hi Tommaso,
>
>
>
> From: Tommaso Boccali
> Sent: Monday, July 07, 2014 11:33 PM
> To: xrootd/xrootd
> Cc: xrootd-dev
> Subject: Re: [xrootd] xrootd fallback only populates "tried" for one
> redirector (#124)
>
> Ciao, on the CMS-Eu redirectors I implemented as explained here:
> we have 2 redirectors (xrootd.ba.infn.it and xrootd-redic.pi.infn.it),
> under DNS RR xrootd-cms.infn.it
>
> Now redirectors' configs read
>
> all.role manager
>
> The known managers
> all.manager xrootd-cms.infn.it+ 1213
> all.manager meta any cms-xrd-global.cern.ch+ 1098
>
> which I understand is the advised solution. What I see is strange.
> if I do something like
>
> xrdcp -d 3 root://xrootd-redic.pi.infn.it//store/test/xrootd/T2_ES_IFCA/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root .
>
> >>>>Why did you go directly to a particular redirector? Why didn’t you use
> xrootd-cms.infn.it? It would seem that if you want access to one or the
> other you would use the higher level name.
>
> and I look into the logs, I see this goes
> xrootd-redic.pi.infn.it:1094 -> xrootd.ba.infn.it:1213 ALWAYS
> (so indeed xrootd-redic.pi.infn.it:1213 does not seem to be serving files
> at all).
> If I switch off xrootd.ba.infn.it:1213, xrootd-redic.pi.infn.it:1213
> starts to serve files, but as soon as I start again xrootd.ba.infn.it:1213,
> the latter gets all the traffic.
>
> >>>Well, I am still confused. As none of the examples use
> xrootd-cms.infn.it as the target I don’t know what is going on. Going
> directly to “redic” goes directly to that.
>
> I tried with things like
>
> all.manager any/all xrootd-cms.infn.it+ 1213
>
> (never really understood the difference any/all, my fault ...), but this
> does not seem to change the picture
>
> >>>There is now a good explanation of all/any in the manual (see link at
> the end). However, it has nothing to do with clients so specifying all or
> any will have no effect on client access.
>
> eventually, on xrootd-redic.pi.infn.it cmsd.log I see traffic (with
> grep serving cmsd.log) only when the other cmsd is down ... is this
> expected? is there a way to make sure
>
> all.manager xrootd-cms.infn.it+ 1213
>
> balances the cmsd calls between RR-DNS hosts?
>
> >>>DNS RR is immaterial here. The above directive simply says that your
> data servers have two managers (redic and ba) and they will subscribe to
> both. A client needs to access these redirectors via the higher level name
> (xrootd-cms) to get access to one or the other. The choice is random and
> the load would be uniformly distributed to each of the xrootd front-ends
> across all of the clients. In turn, those xrootd front-ends subscribe to
> the cmsd back ends (one on redic, the other on ba). If you specify “all”
> then the xrootd front-ends will distribute the load across the two cmsd
> back ends. The default, any, uses a simple fail-over model (as you saw).
> The concept is described here:
>
>
>
> http://xrootd.org/doc/dev4/cms_config.htm#_Toc384307771
>
>
>
> Andy
>
> —
> Reply to this email directly or view it on GitHub
> <https://github.com/xrootd/xrootd/issues/124#issuecomment-48402749>.
>



--
Tommaso Boccali
INFN Pisa

Reply to this email directly or view it on GitHub.




Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-DEV list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1