Ciao Andrew!

I have problems testing with xrdcopy: it is not distributed with the CMS software, so I have to find a way to get it.

For the moment, here is another hint that something is wrong with the randomization in xrdcp. With xrootd.ba.infn.it ON and xrootd-redic.pi.infn.it OFF, I tried

xrdcp -d 10 root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root .

that is, putting the list of servers explicitly on the command line. This always fails: xrootd-redic.pi.infn.it is always tried, 8 times, and the other server is never reached. Instead,

xrdcp -d 10 root://xrootd.ba.infn.it,xrootd-redic.pi.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root .

always works at the first attempt.
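To put numbers on "always tried, 8 times", a small helper along these lines can count the connection attempts per host in the debug output. This is only a sketch under assumptions: it expects the `-d 10` output to have been saved to a file, it relies on attempts appearing as `Open: Trying to connect to HOST:PORT. Connect try N` lines (the form seen in the log quoted further down), and `count_connect_attempts` is a made-up name, not an xrootd tool.

```shell
#!/bin/bash
# Count connection attempts per host in an xrdcp debug log read from stdin.
# Assumption: attempts are logged as
#   ... Xrd: Open: Trying to connect to HOST:PORT. Connect try N
# count_connect_attempts is a hypothetical helper name.
count_connect_attempts() {
    # Keep only the HOST:PORT part of each matching line,
    # then count identical entries and print "HOST:PORT COUNT".
    sed -n 's/.*Open: Trying to connect to \([^ ]*\)\. Connect try [0-9]*.*/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage, assuming the debug output was saved to a file called "log":
#   count_connect_attempts < log
```

If failover worked as expected, both redirectors should show up in the output; a single host with a count of 8 would confirm that the second entry is never tried.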
In any case, I think what we really care about is the behavior of TFile::Open() from our software, not of direct copy commands. This, for example, should not fail:

root [5] TFile* ii = TFile::Open("root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root")
140416 06:58:05 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
140416 06:58:05 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
Error in <TXNetFile::CreateXClient>: open attempt failed on root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root

(it does not seem to give the other server a second try). And this seems even worse:

root [7] TFile* ii = TFile::Open("root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root")
140416 06:59:11 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
140416 06:59:11 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
Error in <TXNetFile::CreateXClient>: open attempt failed on root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root

so not even a second attempt is made.

This instead works:

root [1] TFile* ii = TFile::Open("root://xrootd.ba.infn.it,xrootd-redic.pi.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root")
140416 07:01:18 001 Xrd: GoToAnotherServer: Going to: t2-cms-xrootd01.desy.de:1094
140416 07:01:18 001 Xrd: GoToAnotherServer: Going to: dcache-cms-xrootd.desy.de:1094
140416 07:01:18 001 Xrd: GoToAnotherServer: Going to: 131.169.191.230:20982

tommaso

On Tue, Apr 15, 2014 at 11:08 PM, Andrew Hanushevsky <[log in to unmask]> wrote:
> Hi Tommaso,
>
> DNS round-robin, while it looks good in small scale tests, rarely works
> all that well. The reason is that DNS round-robins whenever a look-up is
> made regardless of the reason for the lookup. With a lot of clients that may
> very well lead to suboptimal ordering. So, the xrootd client gets all of
> the addresses and uses an algorithm that better spreads the access.
>
> As for why xrdcp didn’t go after the second entry is mysterious but I
> would say it’s a bug. Could you try the same test again but use xrdcopy?
> That’s the new version of the client.
>
> Andy
>
> *From:* Tommaso Boccali <[log in to unmask]>
> *Sent:* Tuesday, April 15, 2014 3:48 AM
> *To:* [log in to unmask]
> *Subject:* Re: problem with aliased redirectors
>
> as additional info, the DNS seems to do well its RR job: from the same
> machine
>
> -bash-3.2$ host xrootd-cms.infn.it
> xrootd-cms.infn.it has address 193.205.76.83
> xrootd-cms.infn.it has address 90.147.66.75
> -bash-3.2$ host xrootd-cms.infn.it
> xrootd-cms.infn.it has address 90.147.66.75
> xrootd-cms.infn.it has address 193.205.76.83
> -bash-3.2$ host xrootd-cms.infn.it
> xrootd-cms.infn.it has address 90.147.66.75
> xrootd-cms.infn.it has address 193.205.76.83
> -bash-3.2$ host xrootd-cms.infn.it
> xrootd-cms.infn.it has address 90.147.66.75
> xrootd-cms.infn.it has address 193.205.76.83
> -bash-3.2$ host xrootd-cms.infn.it
> xrootd-cms.infn.it has address 193.205.76.83
> xrootd-cms.infn.it has address 90.147.66.75
>
> So each time the order returned is random, in case xrootd would need to
> depend on this
>
> BUT: inside xrdcp log, the order seems always to be the same (*)
>
> is some caching done inside xrdcp killing the RR?
>
> tom
>
> *:
>
> -bash-3.2$ grep DNS log
> 140415 12:32:21 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:21 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:26 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:26 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:26 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:31 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:31 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:31 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:36 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:36 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:36 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:41 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:41 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:41 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:46 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:46 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:46 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:51 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:51 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:51 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:56 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:56 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:56 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
>
> On 15 Apr 2014, at 12:33, Tommaso Boccali <[log in to unmask]> wrote:
>
> Ciao,
> as from a previous discussion, we have setup an aliased DNS xrootd
> redirector,
>
> which is
>
> -bash-3.2$ host xrootd-cms.infn.it
> xrootd-cms.infn.it has address 90.147.66.75
> xrootd-cms.infn.it has address 193.205.76.83
>
> I was playing with some crash tests, and I do not get the result.
>
> So: I switched off the redirector 193.205.76.83, while keeping it into the
> alias, and I issued a
>
> xrdcp -d 10 root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
>
> I was assuming that the client would have recognized the alias, and
> eventually tried a second host if the first was not available.
>
> In the log ( https://www.dropbox.com/s/zmp9uyreqm4qwhg/xrootd.log )
> I see eventually the client recognizes the situation:
>
>
> 140415 12:25:11 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:25:11 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:25:11 001 Xrd: ShowUrls: The converted URLs count is 2
> 140415 12:25:11 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root .
> 140415 12:25:11 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root .
>
> but then
>
> 140415 12:25:46 001 Xrd: Open: Trying to connect to xrootd-redic.pi.infn.it:1094. Connect try 8
> 140415 12:25:46 001 Xrd: XrdClientConn: Trying to connect to 193.205.76.83:1094
> 140415 12:25:46 001 Xrd: Connect: Creating a logical connection...
> 140415 12:25:46 001 Xrd: Connect: Physical connection not found. Creating a new one...
> 140415 12:25:46 001 Xrd: Connect: Connecting to [xrootd-redic.pi.infn.it:1094]
> 140415 12:25:46 001 Xrd: ClientSock::TryConnect_low: Trying to connect to xrootd-redic.pi.infn.it(193.205.76.83):1094 Windowsize=0 Timeout=120
> 140415 12:25:46 001 Xrd: ClientSock::TryConnect_low: Connection to xrootd-redic.pi.infn.it:1094 failed. (-1)
> 140415 12:25:46 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
> 140415 12:25:46 001 Xrd: PhyConnection: Disconnecting socket...
> 140415 12:25:46 001 Xrd: Connect: Connect(xrootd-redic.pi.infn.it, 1094) returned -1
> 140415 12:25:46 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
> 140415 12:25:46 001 Xrd: Open: Disconnecting.
> 140415 12:25:46 001 Xrd: Cache: Cache Status --------------------------
> 140415 12:25:46 001 Xrd: Cache: --------------------------------------
> fTotalByteCount = 0
> Last server error 10000 ('')
> Error accessing path/file for root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root
>
> so no attempt is done on the other. What is wrong here? all in all it
> tries 8 times to connect to the SAME server, and 0 times to the other ...
>
>
> thanks a lot
>
> tom
>
> --
> Tommaso Boccali
> INFN Pisa
>
> ------------------------------
>
> Use REPLY-ALL to reply to list
>
> To unsubscribe from the XROOTD-L list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>

--
Tommaso Boccali
INFN Pisa