Ciao Andrew!
I cannot easily test with xrdcopy, since it is not distributed with the
CMS software; I will have to find a way. For the moment, here is another hint
that something is wrong with the randomization in xrdcp:

I tried (with xrootd.ba.infn.it ON and xrootd-redic.pi.infn.it OFF)

xrdcp -d 10 root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root .

That is, I put the list of servers explicitly on the command line.
This always fails: xrootd-redic.pi.infn.it is tried 8 times every time,
and the other server is never reached.
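
To double-check the number of attempts per host, the debug output could be tallied with a small script along these lines. This is only a sketch: the regular expression assumes the "Open: Trying to connect to HOST:PORT. Connect try N" record format that appears in the xrdcp -d log quoted further down, with each record on a single unwrapped line. The name count_attempts and the sample text are made up for illustration.

```python
import re
from collections import Counter

# Assumed record format (one log record per line, not wrapped):
#   ... Xrd: Open: Trying to connect to <host>:<port>. Connect try <n>
ATTEMPT = re.compile(r"Xrd: Open: Trying to connect to (\S+):(\d+)\. Connect try (\d+)")


def count_attempts(log_text):
    """Return a Counter mapping each host to the number of connect attempts logged."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ATTEMPT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Synthetic sample (not real output): eight attempts against the same host.
sample = "\n".join(
    "140415 12:25:46 001 Xrd: Open: Trying to connect to "
    f"xrootd-redic.pi.infn.it:1094. Connect try {n}"
    for n in range(1, 9)
)

counts = count_attempts(sample)
for host in ("xrootd-redic.pi.infn.it", "xrootd.ba.infn.it"):
    print(host, counts[host])
```

On a real log, a host that is listed in the URL but has a count of zero is one the client never tried.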

Instead

xrdcp -d 10 root://xrootd.ba.infn.it,xrootd-redic.pi.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root .

always works at the first attempt.

In any case, I think what we really care about is the behavior of
TFile::Open() from our software, not direct copy commands.


This, for example, should not fail:

root [5] TFile* ii = TFile::Open("root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root")

140416 06:58:05 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
140416 06:58:05 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
Error in <TXNetFile::CreateXClient>: open attempt failed on root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root

(does not seem to give a second try to the other server)
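
The behavior one would expect from a comma-separated host list can be sketched as follows. This is not how the xrootd client or TFile::Open() is implemented. The names split_hosts, open_with_failover and fake_opener are hypothetical, and the opener is passed in as a function so the sketch runs without any network access.

```python
def split_hosts(url):
    """Split 'root://h1,h2//path' into (['h1', 'h2'], '/path')."""
    prefix = "root://"
    if not url.startswith(prefix):
        raise ValueError("not a root:// URL: " + url)
    host_list, sep, path = url[len(prefix):].partition("//")
    if not sep:
        raise ValueError("no '//' between host list and path: " + url)
    hosts = [h.strip() for h in host_list.split(",") if h.strip()]
    return hosts, "/" + path


def open_with_failover(url, opener):
    """Try each listed host in order; return (host, handle) for the first that opens."""
    hosts, path = split_hosts(url)
    errors = []
    for host in hosts:
        single_url = f"root://{host}/{path}"
        try:
            return host, opener(single_url)
        except OSError as exc:
            # Remember the failure and move on to the next host in the list.
            errors.append((host, exc))
    raise OSError(f"all hosts failed: {errors}")


def fake_opener(single_url):
    """Stand-in for a real open call: pretends the Pisa redirector is down."""
    if "xrootd-redic.pi.infn.it" in single_url:
        raise ConnectionRefusedError("simulated: redirector switched off")
    return "opened:" + single_url


host, handle = open_with_failover(
    "root://xrootd-redic.pi.infn.it,xrootd.ba.infn.it//store/x.root", fake_opener
)
print(host, handle)
```

With the first host down, a client behaving this way would fall through to xrootd.ba.infn.it instead of giving up after the first failure.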

and this seems even worse:

root [7] TFile* ii = TFile::Open("root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root")

140416 06:59:11 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
140416 06:59:11 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
Error in <TXNetFile::CreateXClient>: open attempt failed on root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root

So here a second attempt is not even made.

This, instead, works:

root [1] TFile* ii = TFile::Open("root://xrootd.ba.infn.it,xrootd-redic.pi.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root")

140416 07:01:18 001 Xrd: GoToAnotherServer: Going to: t2-cms-xrootd01.desy.de:1094
140416 07:01:18 001 Xrd: GoToAnotherServer: Going to: dcache-cms-xrootd.desy.de:1094
140416 07:01:18 001 Xrd: GoToAnotherServer: Going to: 131.169.191.230:20982

tommaso




On Tue, Apr 15, 2014 at 11:08 PM, Andrew Hanushevsky <[log in to unmask]> wrote:

>   Hi Tommaso,
>
> DNS round-robin, while it looks good in small-scale tests, rarely works
> all that well. The reason is that DNS round-robins whenever a look-up is
> made, regardless of the reason for the look-up. With a lot of clients that
> may very well lead to suboptimal ordering. So, the xrootd client gets all of
> the addresses and uses an algorithm that better spreads the access.
>
> As for why xrdcp didn’t go after the second entry, that is mysterious, but I
> would say it’s a bug. Could you try the same test again but use xrdcopy?
> That’s the new version of the client.
>
> Andy
>
>  *From:* Tommaso Boccali <[log in to unmask]>
> *Sent:* Tuesday, April 15, 2014 3:48 AM
> *To:* [log in to unmask]
> *Subject:* Re: problem with aliased redirectors
>
> As additional info, the DNS seems to do its RR job well. From the same
> machine:
>
>  -bash-3.2$ host xrootd-cms.infn.it
> xrootd-cms.infn.it has address 193.205.76.83
> xrootd-cms.infn.it has address 90.147.66.75
> -bash-3.2$ host xrootd-cms.infn.it
> xrootd-cms.infn.it has address 90.147.66.75
> xrootd-cms.infn.it has address 193.205.76.83
> -bash-3.2$ host xrootd-cms.infn.it
> xrootd-cms.infn.it has address 90.147.66.75
> xrootd-cms.infn.it has address 193.205.76.83
> -bash-3.2$ host xrootd-cms.infn.it
> xrootd-cms.infn.it has address 90.147.66.75
> xrootd-cms.infn.it has address 193.205.76.83
> -bash-3.2$ host xrootd-cms.infn.it
> xrootd-cms.infn.it has address 193.205.76.83
> xrootd-cms.infn.it has address 90.147.66.75
>
> So the order returned changes from one lookup to the next, in case xrootd
> needs to depend on this.
>
> BUT: inside the xrdcp log, the order always seems to be the same (*).
>
> Is some caching done inside xrdcp that kills the RR?
>
> tom
>
> *:
>
>  -bash-3.2$ grep DNS log
> 140415 12:32:21 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:21 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:21 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:26 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:26 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:26 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:31 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:31 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:31 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:36 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:36 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:36 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:41 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:41 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:41 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:46 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:46 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:46 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:51 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:51 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:51 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:32:56 001 Xrd: ConvertDNSAlias: resolving xrootd-cms.infn.it
> 140415 12:32:56 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:32:56 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
>
>  On 15 Apr 2014, at 12:33, Tommaso Boccali <[log in to unmask]>
> wrote:
>
>  Ciao,
> Following a previous discussion, we have set up a DNS-aliased xrootd
> redirector, which is:
>
>  -bash-3.2$ host xrootd-cms.infn.it
> xrootd-cms.infn.it has address 90.147.66.75
> xrootd-cms.infn.it has address 193.205.76.83
>
> I was running some failure tests, and I do not understand the result.
>
> I switched off the redirector 193.205.76.83 while keeping it in the
> alias, and issued:
>
> xrdcp -d 10 root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root .
>
> I assumed the client would recognize the alias and eventually try the
> second host if the first was not available.
>
> In the log ( https://www.dropbox.com/s/zmp9uyreqm4qwhg/xrootd.log )
> I see that the client does eventually recognize the situation:
>
>
> 140415 12:25:11 001 Xrd: ConvertDNSAlias: found host xrootd-redic.pi.infn.it with addr 193.205.76.83
> 140415 12:25:11 001 Xrd: ConvertDNSAlias: found host xrootd.ba.infn.it with addr 90.147.66.75
> 140415 12:25:11 001 Xrd: ShowUrls: The converted URLs count is 2
> 140415 12:25:11 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
> 140415 12:25:11 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root.
>
> but then
>
> 140415 12:25:46 001 Xrd: Open: Trying to connect to xrootd-redic.pi.infn.it:1094. Connect try 8
> 140415 12:25:46 001 Xrd: XrdClientConn: Trying to connect to 193.205.76.83:1094
> 140415 12:25:46 001 Xrd: Connect: Creating a logical connection...
> 140415 12:25:46 001 Xrd: Connect: Physical connection not found. Creating a new one...
> 140415 12:25:46 001 Xrd: Connect: Connecting to [xrootd-redic.pi.infn.it:1094]
> 140415 12:25:46 001 Xrd: ClientSock::TryConnect_low: Trying to connect to xrootd-redic.pi.infn.it(193.205.76.83):1094 Windowsize=0 Timeout=120
> 140415 12:25:46 001 Xrd: ClientSock::TryConnect_low: Connection to xrootd-redic.pi.infn.it:1094 failed. (-1)
> 140415 12:25:46 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
> 140415 12:25:46 001 Xrd: PhyConnection: Disconnecting socket...
> 140415 12:25:46 001 Xrd: Connect: Connect(xrootd-redic.pi.infn.it, 1094) returned -1
> 140415 12:25:46 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
> 140415 12:25:46 001 Xrd: Open: Disconnecting.
> 140415 12:25:46 001 Xrd: Cache: Cache Status --------------------------
> 140415 12:25:46 001 Xrd: Cache: --------------------------------------
> fTotalByteCount = 0
> Last server error 10000 ('')
> Error accessing path/file for root://xrootd-cms.infn.it//store/data/Run2013A/MinimumBias/RECO/PromptReco-v1/000/212/188/00000/6C246B92-C67B-E211-BE02-003048D2BC62.root
>
> So no attempt is made on the other server. What is wrong here? All in all,
> it tries 8 times to connect to the SAME server and 0 times to the other.
>
>
> thanks a lot
>
> tom
>
> --
> Tommaso Boccali
> INFN Pisa
>
> ------------------------------
>
> Use REPLY-ALL to reply to list
>
> To unsubscribe from the XROOTD-L list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1



-- 
Tommaso Boccali
INFN Pisa
