Hi Tommaso,

So I need a certificate to reproduce your test from here? I also can supply you with access to xrdcopy if you happen to have AFS installed.

Andy

From: Tommaso Boccali
Sent: Tuesday, April 15, 2014 10:02 PM
To: Andrew Hanushevsky
Cc: [log in to unmask]
Subject: Re: problem with aliased redirectors

ciao Andrew!

I have problems checking with xrdcopy, since that is not distributed with CMS software, I have to find a way. For the moment, another hint something is not ok in the randomization in xrdcp:

I tried (with xrootd.ba.infn.it ON and xrootd-redic.pi.infn.it OFF)

so putting explicitly the list of servers in the command line.

So, this always fails (xrootd-redic.pi.infn.it is always tried, 8 times, and the other never reached).

Instead

always works at the first attempt.

In any case, I think we basically care about the behavior of TFile::Open() from our SW, not direct copy commands.

This for example should not fail:

root [5] TFile* ii = TFile::Open("root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root")
140416 06:58:05 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
140416 06:58:05 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
Error in <TXNetFile::CreateXClient>: open attempt failed on root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root

(does not seem to give a second try to the other server)

and this seems even worse:

root [7] TFile* ii = TFile::Open("root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root")
140416 06:59:11 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
140416 06:59:11 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
Error in <TXNetFile::CreateXClient>: open attempt failed on root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root

so not even a second attempt is tried ....

this instead works

root [1] TFile* ii = TFile::Open("root://xrootd.ba.infn.it,xrootd-redic.pi.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root")
140416 07:01:18 001 Xrd: GoToAnotherServer: Going to: t2-cms-xrootd01.desy.de:1094
140416 07:01:18 001 Xrd: GoToAnotherServer: Going to: dcache-cms-xrootd.desy.de:1094
140416 07:01:18 001 Xrd: GoToAnotherServer: Going to: 131.169.191.230:20982

tommaso

On Tue, Apr 15, 2014 at 11:08 PM, Andrew Hanushevsky <[log in to unmask]> wrote:
Hi Tommaso,

DNS round-robin, while it looks good in small scale tests, rarely works all that well. The reason is that DNS round-robins whenever a look-up is made regardless of the reason for the lookup. With a lot of clients that may very well lead to suboptimal ordering. So, the xrootd client gets all of the addresses and uses an algorithm that better spreads the access.

Why xrdcp didn’t go after the second entry is mysterious, but I would say it’s a bug. Could you try the same test again but use xrdcopy? That’s the new version of the client.

Andy

From: Tommaso Boccali
Sent: Tuesday, April 15, 2014 3:48 AM
To: [log in to unmask]
Subject: Re: problem with aliased redirectors

as additional info, the DNS seems to do well its RR job: from the same machine

-bash-3.2$ host xrootd-cms.infn.it
xrootd-cms.infn.it has address 193.205.76.83
xrootd-cms.infn.it has address 90.147.66.75
-bash-3.2$ host xrootd-cms.infn.it
xrootd-cms.infn.it has address 90.147.66.75
xrootd-cms.infn.it has address 193.205.76.83
-bash-3.2$ host xrootd-cms.infn.it
xrootd-cms.infn.it has address 90.147.66.75
xrootd-cms.infn.it has address 193.205.76.83
-bash-3.2$ host xrootd-cms.infn.it
xrootd-cms.infn.it has address 90.147.66.75
xrootd-cms.infn.it has address 193.205.76.83
-bash-3.2$ host xrootd-cms.infn.it
xrootd-cms.infn.it has address 193.205.76.83
xrootd-cms.infn.it has address 90.147.66.75

So each time the order returned is random, in case xrootd would need to depend on this.

BUT: inside the xrdcp log, the order always seems to be the same (*)

is some caching done inside xrdcp killing the RR?

tom

*:
-bash-3.2$ grep DNS log
140415 12:32:21 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
140415 12:32:21 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
140415 12:32:26 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
140415 12:32:26 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
140415 12:32:26 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
140415 12:32:31 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
140415 12:32:31 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
140415 12:32:31 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
140415 12:32:36 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
140415 12:32:36 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
140415 12:32:36 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
140415 12:32:41 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
140415 12:32:41 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
140415 12:32:41 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
140415 12:32:46 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
140415 12:32:46 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
140415 12:32:46 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
140415 12:32:51 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
140415 12:32:51 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
140415 12:32:51 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
140415 12:32:56 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
140415 12:32:56 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
140415 12:32:56 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75

On 15 Apr 2014, at 12:33, Tommaso Boccali <[log in to unmask]> wrote:
Ciao,

as from a previous discussion, we have setup an aliased DNS xrootd redirector, which is

-bash-3.2$ host xrootd-cms.infn.it
xrootd-cms.infn.it has address 90.147.66.75
xrootd-cms.infn.it has address 193.205.76.83

I was playing with some crash tests, and I do not get the result.

So: I switched off the redirector 193.205.76.83, while keeping it into the alias, and I issued a

xrdcp -d 10 root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root .

I was assuming that the client would have recognized the alias, and eventually tried a second host if the first was not available.

In the log ( https://www.dropbox.com/s/zmp9uyreqm4qwhg/xrootd.log ) I see eventually the client recognizes the situation:

140415 12:25:11 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
140415 12:25:11 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
140415 12:25:11 001 Xrd: ShowUrls: The converted URLs count is 2
140415 12:25:11 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
140415 12:25:11 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.

but then

140415 12:25:46 001 Xrd: Open: Trying to connect to xrootd-redic.pi.infn.it:1094. Connect try 8
140415 12:25:46 001 Xrd: XrdClientConn: Trying to connect to 193.205.76.83:1094
140415 12:25:46 001 Xrd: Connect: Creating a logical connection...
140415 12:25:46 001 Xrd: Connect: Physical connection not found. Creating a new one...
140415 12:25:46 001 Xrd: Connect: Connecting to [xrootd-redic.pi.infn.it:1094]
140415 12:25:46 001 Xrd: ClientSock::TryConnect_low: Trying to connect to xrootd-redic.pi.infn.it(193.205.76.83):1094 Windowsize=0 Timeout=120
140415 12:25:46 001 Xrd: ClientSock::TryConnect_low: Connection to xrootd-redic.pi.infn.it:1094 failed. (-1)
140415 12:25:46 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
140415 12:25:46 001 Xrd: PhyConnection: Disconnecting socket...
140415 12:25:46 001 Xrd: Connect: Connect(xrootd-redic.pi.infn.it, 1094) returned -1
140415 12:25:46 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
140415 12:25:46 001 Xrd: Open: Disconnecting.
140415 12:25:46 001 Xrd: Cache: Cache Status --------------------------
140415 12:25:46 001 Xrd: Cache: -------------------------------------- fTotalByteCount = 0
Last server error 10000 ('')
Error accessing path/file for root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root

so no attempt is done on the other. What is wrong here? all in all it tries 8 times to connect to the SAME server, and 0 times to the other ...

thanks a lot

tom

--
Tommaso Boccali
INFN Pisa
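
The behaviour expected in the original report (move on to the second host behind the alias when the first one refuses the connection, instead of retrying the same host 8 times) can be sketched as follows. This is only an illustration in Python, not the xrootd client's code: the function name `connect_first_available`, the 5 second timeout and the injectable `connect` argument are assumptions made for the example; only the host names and port 1094 come from the thread.

```python
import socket


def connect_first_available(hosts, port=1094, timeout=5.0,
                            connect=socket.create_connection):
    """Try each host once, in order, and return (host, connection)
    for the first one that accepts the connection.

    `connect` is injectable (hypothetical parameter) so the logic can be
    exercised without touching the network.
    """
    errors = []
    for host in hosts:
        try:
            return host, connect((host, port), timeout)
        except OSError as exc:
            # Remember the failure and fall through to the next host
            # rather than retrying the one that just failed.
            errors.append((host, exc))
    raise ConnectionError(f"all hosts failed: {errors}")
```

With xrootd-redic.pi.infn.it switched off and xrootd.ba.infn.it up, a loop of this shape would reach the second host after a single failed attempt on the first.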
Use REPLY-ALL to reply to list
To unsubscribe from the XROOTD-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
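
Andy's remark that the client collects all of the addresses behind the alias and orders them itself, rather than relying on the order of the DNS reply, can be illustrated roughly as below. The thread does not say which algorithm xrootd uses, so the plain shuffle here is a stand-in, and the names `resolve_all` and `client_side_order` and the injectable `resolver` and `rng` arguments are assumptions made for this sketch.

```python
import random
import socket


def resolve_all(alias, port=1094, resolver=socket.getaddrinfo):
    """Return the distinct IPv4 addresses behind a DNS alias,
    in the order the resolver reported them."""
    infos = resolver(alias, port, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def client_side_order(addresses, rng=random):
    """Return a shuffled copy, so the order in which hosts are tried
    does not depend on the DNS reply (stand-in for the real algorithm)."""
    shuffled = list(addresses)
    rng.shuffle(shuffled)
    return shuffled
```

Under this kind of scheme, `host xrootd-cms.infn.it` returning a different order on each call says nothing about the order in which the client will try the two redirectors.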