Ciao Andrew!
I think you do not need a certificate at this level, because by choice we
have left the redirectors without authentication.

So if you look to a command like

 xrdcp -d 10 root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root . >& log

(after a voms-proxy-destroy in my case) I still see the usual fixed order

-bash-3.2$ grep ShowUrls log
140417 01:12:15 001 Xrd: ShowUrls: The converted URLs count is 2
140417 01:12:15 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
140417 01:12:15 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
140417 01:12:15 001 Xrd: ShowUrls: The converted URLs count is 2
140417 01:12:15 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
140417 01:12:15 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
140417 01:12:20 001 Xrd: ShowUrls: The converted URLs count is 2
140417 01:12:20 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
140417 01:12:20 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
...

If this did not fail, you would then get an error when trying to access
the real file, but at least in my case I "die" before that point.
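
To check the "fixed order" claim on a whole log rather than by eye, the
ShowUrls lines can be grouped per resolution round and the resulting host
orderings counted. The following is only a rough sketch: the function name
host_orders and the shortened example path are made up for illustration, and
the regular expression assumes the ShowUrls line format shown in the excerpt
above (it tolerates a line break right after "root://").

```python
import re
from collections import Counter

# Matches "ShowUrls: URL n.<i>: root://<host>:<port>//<path>", allowing a
# line wrap right after "root://" as seen in mailed log excerpts.
URL_RE = re.compile(r"ShowUrls: URL n\.(\d+): root://\s*([^:/\s]+)")


def host_orders(log_text):
    """Count how often each ordering of hosts appears in the log.

    A new ordering starts every time a "URL n.1" entry is seen.
    """
    orders = Counter()
    current = []
    for index, host in URL_RE.findall(log_text):
        if index == "1" and current:
            orders[tuple(current)] += 1
            current = []
        current.append(host)
    if current:
        orders[tuple(current)] += 1
    return orders


# Two rounds in the format of the excerpt above; the path is shortened to a
# hypothetical //store/data/example.root for readability.
block = (
    "140417 01:12:15 001 Xrd: ShowUrls: The converted URLs count is 2\n"
    "140417 01:12:15 001 Xrd: ShowUrls: URL n.1: root://\n"
    "xrootd-redic.pi.infn.it:1094//store/data/example.root\n.\n"
    "140417 01:12:15 001 Xrd: ShowUrls: URL n.2: root://\n"
    "xrootd.ba.infn.it:1094//store/data/example.root\n.\n"
)
sample = block * 2

for order, count in host_orders(sample).items():
    print(count, " -> ".join(order))
```

If the client shuffled the addresses, more than one ordering would show up
in the counter; a single entry means the order never changed in that log.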

Of course for this to make sense I need to leave off one of the redirectors
(xrootd-redic.pi.infn.it).  Also, you can test the same behavior with

 xrdcp -d 10 root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root . >& log1 &

again, I get

-bash-3.2$ grep ShowUrls log1
140417 01:15:59 001 Xrd: ShowUrls: The converted URLs count is 2
140417 01:15:59 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
140417 01:15:59 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
140417 01:15:59 001 Xrd: ShowUrls: The converted URLs count is 2
140417 01:15:59 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
140417 01:15:59 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
140417 01:16:04 001 Xrd: ShowUrls: The converted URLs count is 2
140417 01:16:04 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
140417 01:16:04 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
...
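
For reference, the behavior I would expect from a comma-separated host list
is roughly the following: try the hosts in the order given and move on to
the next one when a connection fails. This is only an illustration of that
expectation, not how xrdcp or TFile::Open is actually implemented; the
function names split_hosts and first_reachable are hypothetical, and a plain
TCP connect stands in for the real xrootd handshake.

```python
import socket


def split_hosts(url, default_port=1094):
    """Split 'root://h1,h2//path' into ([(host, port), ...], '/path')."""
    prefix = "root://"
    if not url.startswith(prefix):
        raise ValueError("not a root:// URL: %r" % url)
    hostlist, _, path = url[len(prefix):].partition("//")
    hosts = []
    for item in hostlist.split(","):
        host, _, port = item.strip().partition(":")
        hosts.append((host, int(port) if port else default_port))
    return hosts, "/" + path


def first_reachable(hosts, connect=socket.create_connection, timeout=5.0):
    """Return the first (host, port) that accepts a connection, else None.

    `connect` is injectable so the logic can be exercised without a network.
    """
    for host, port in hosts:
        try:
            conn = connect((host, port), timeout)
        except OSError:
            # This host is down or unreachable: fall through to the next one.
            continue
        conn.close()
        return host, port
    return None
```

With xrootd-redic.pi.infn.it OFF and xrootd.ba.infn.it ON, a client behaving
like this sketch would end up on xrootd.ba.infn.it regardless of the order
of the list, which is not what the logs above show.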


I will try to leave the redirector OFF for the night, in case you want to
try. I hope this will not cause big side effects  :(

tom





On Thu, Apr 17, 2014 at 12:33 AM, Andrew Hanushevsky <[log in to unmask]> wrote:

>   Hi Tommaso,
>
> So I need a certificate to reproduce your test from here? I also can
> supply you with access to xrdcopy if you happen to have AFS installed.
>
> Andy
>
>  *From:* Tommaso Boccali <[log in to unmask]>
> *Sent:* Tuesday, April 15, 2014 10:02 PM
> *To:* Andrew Hanushevsky <[log in to unmask]>
> *Cc:* [log in to unmask]
> *Subject:* Re: problem with aliased redirectors
>
>  ciao Andrew!
> I have problems checking with xrdcopy: since it is not distributed with
> the CMS software, I have to find a way. For the moment, here is another
> hint that something is not OK in the randomization in xrdcp:
>
> I tried (with xrootd.ba.infn.it ON and xrootd-redic.pi.infn.it OFF)
>
>  xrdcp -d 10 root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root .
>
> so explicitly putting the list of servers on the command line.
> This always fails (xrootd-redic.pi.infn.it is always tried, 8 times,
> and the other is never reached).
>
> Instead
>
>  xrdcp -d 10 root://xrootd.ba.infn.it,xrootd-redic.pi.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root .
>
> always works at the first attempt.
>
> In any case, I think we basically care about the behavior of TFile::Open()
> from our SW, not direct copy commands
>
>
> This for example should not fail:
>
>  root [5] TFile* ii = TFile::Open("root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root")
>
> 140416 06:58:05 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
> 140416 06:58:05 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
> Error in <TXNetFile::CreateXClient>: open attempt failed on root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root
>
> (does not seem to give a second try to the other server)
>
> and this seems even worse:
>
>  root [7] TFile* ii = TFile::Open("root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root")
>
> 140416 06:59:11 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
> 140416 06:59:11 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
> Error in <TXNetFile::CreateXClient>: open attempt failed on root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root
>
> so not even a second attempt is tried ....
>
> this instead works
>
>  root [1] TFile* ii = TFile::Open("root://xrootd.ba.infn.it,xrootd-redic.pi.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root")
>
> 140416 07:01:18 001 Xrd: GoToAnotherServer: Going to: t2-cms-xrootd01.desy.de:1094
> 140416 07:01:18 001 Xrd: GoToAnotherServer: Going to: dcache-cms-xrootd.desy.de:1094
> 140416 07:01:18 001 Xrd: GoToAnotherServer: Going to: 131.169.191.230:20982
>
> tommaso
>
>
>
>
> On Tue, Apr 15, 2014 at 11:08 PM, Andrew Hanushevsky <[log in to unmask]> wrote:
>
>>   Hi Tommaso,
>>
>> DNS round-robin, while it looks good in small scale tests, rarely works
>> all that well. The reason is that DNS round-robins whenever a look-up is
>> made regardless of the reason for the lookup. With a lot of clients that may
>> very well lead to suboptimal ordering. So, the xrootd client gets all of
>> the addresses and uses an algorithm that better spreads the access.
>>
>> Why xrdcp didn’t go after the second entry is mysterious, but I
>> would say it’s a bug. Could you try the same test again but use xrdcopy?
>> That’s the new version of the client.
>>
>> Andy
>>
>>  *From:* Tommaso Boccali <[log in to unmask]>
>> *Sent:* Tuesday, April 15, 2014 3:48 AM
>> *To:* [log in to unmask]
>> *Subject:* Re: problem with aliased redirectors
>>
>>  As additional info, the DNS seems to do its RR job well. From the same
>> machine:
>>
>>  -bash-3.2$ host xrootd-cms.infn.it
>> xrootd-cms.infn.it has address 193.205.76.83
>> xrootd-cms.infn.it has address 90.147.66.75
>> -bash-3.2$ host xrootd-cms.infn.it
>> xrootd-cms.infn.it has address 90.147.66.75
>> xrootd-cms.infn.it has address 193.205.76.83
>> -bash-3.2$ host xrootd-cms.infn.it
>> xrootd-cms.infn.it has address 90.147.66.75
>> xrootd-cms.infn.it has address 193.205.76.83
>> -bash-3.2$ host xrootd-cms.infn.it
>> xrootd-cms.infn.it has address 90.147.66.75
>> xrootd-cms.infn.it has address 193.205.76.83
>> -bash-3.2$ host xrootd-cms.infn.it
>> xrootd-cms.infn.it has address 193.205.76.83
>> xrootd-cms.infn.it has address 90.147.66.75
>>
>> So the order returned is random each time, in case xrootd needs to
>> depend on this.
>>
>> BUT: inside the xrdcp log, the order always seems to be the same (*).
>>
>> Is some caching done inside xrdcp that kills the RR?
>>
>> tom
>>
>> *:
>>
>>  -bash-3.2$ grep DNS log
>> 140415 12:32:21 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
>> 140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
>> 140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
>> 140415 12:32:21 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
>> 140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
>> 140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
>> 140415 12:32:26 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
>> 140415 12:32:26 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
>> 140415 12:32:26 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
>> 140415 12:32:31 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
>> 140415 12:32:31 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
>> 140415 12:32:31 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
>> 140415 12:32:36 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
>> 140415 12:32:36 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
>> 140415 12:32:36 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
>> 140415 12:32:41 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
>> 140415 12:32:41 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
>> 140415 12:32:41 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
>> 140415 12:32:46 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
>> 140415 12:32:46 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
>> 140415 12:32:46 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
>> 140415 12:32:51 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
>> 140415 12:32:51 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
>> 140415 12:32:51 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
>> 140415 12:32:56 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
>> 140415 12:32:56 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
>> 140415 12:32:56 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
>>
>>  On 15 Apr 2014, at 12:33, Tommaso Boccali <[log in to unmask]>
>> wrote:
>>
>>  Ciao,
>> following a previous discussion, we have set up a DNS-aliased xrootd
>> redirector,
>>
>> which is
>>
>>  -bash-3.2$ host xrootd-cms.infn.it
>> xrootd-cms.infn.it has address 90.147.66.75
>> xrootd-cms.infn.it has address 193.205.76.83
>>
>> I was running some crash tests, and I do not understand the result.
>>
>> So: I switched off the redirector 193.205.76.83, while keeping it in
>> the alias, and I issued a
>>
>> xrdcp -d 10 root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root .
>>
>> I was assuming that the client would recognize the alias and, if the
>> first host was not available, try a second one.
>>
>> In the log ( https://www.dropbox.com/s/zmp9uyreqm4qwhg/xrootd.log )
>> I see that the client does recognize the situation:
>>
>>
>> 140415 12:25:11 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
>> 140415 12:25:11 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
>> 140415 12:25:11 001 Xrd: ShowUrls: The converted URLs count is 2
>> 140415 12:25:11 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
>> 140415 12:25:11 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
>>
>> but then
>>
>> 140415 12:25:46 001 Xrd: Open: Trying to connect to xrootd-redic.pi.infn.it:1094. Connect try 8
>> 140415 12:25:46 001 Xrd: XrdClientConn: Trying to connect to 193.205.76.83:1094
>> 140415 12:25:46 001 Xrd: Connect: Creating a logical connection...
>> 140415 12:25:46 001 Xrd: Connect: Physical connection not found. Creating a new one...
>> 140415 12:25:46 001 Xrd: Connect: Connecting to [xrootd-redic.pi.infn.it:1094]
>> 140415 12:25:46 001 Xrd: ClientSock::TryConnect_low: Trying to connect to xrootd-redic.pi.infn.it(193.205.76.83):1094 Windowsize=0 Timeout=120
>> 140415 12:25:46 001 Xrd: ClientSock::TryConnect_low: Connection to xrootd-redic.pi.infn.it:1094 failed. (-1)
>> 140415 12:25:46 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
>> 140415 12:25:46 001 Xrd: PhyConnection: Disconnecting socket...
>> 140415 12:25:46 001 Xrd: Connect: Connect(xrootd-redic.pi.infn.it, 1094) returned -1
>> 140415 12:25:46 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
>> 140415 12:25:46 001 Xrd: Open: Disconnecting.
>> 140415 12:25:46 001 Xrd: Cache: Cache Status --------------------------
>> 140415 12:25:46 001 Xrd: Cache: --------------------------------------
>> fTotalByteCount = 0
>> Last server error 10000 ('')
>> Error accessing path/file for root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root
>>
>> so no attempt is made on the other. What is wrong here? All in all it
>> tries 8 times to connect to the SAME server, and 0 times to the other ...
>>
>>
>> thanks a lot
>>
>> tom
>>
>> --
>> Tommaso Boccali
>> INFN Pisa
>>
>> ------------------------------
>>
>> Use REPLY-ALL to reply to list
>>
>> To unsubscribe from the XROOTD-L list, click the following link:
>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>>
>
>
>
> --
> Tommaso Boccali
> INFN Pisa
>



-- 
Tommaso Boccali
INFN Pisa