Ciao Andy, the main difference from what I get is that your trace shows both
> 140423 16:43:46 5000 Xrd: ShowUrls: URL n.1: root://xrootd.ba.infn.it:1094//tmp/motd.
> 140423 16:43:46 5000 Xrd: ShowUrls: URL n.2: root://xrootd-redic.pi.infn.it:1094//tmp/motd.

and

> 140423 16:43:46 5000 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//tmp/motd.
> 140423 16:43:46 5000 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//tmp/motd.

so the order is not fixed (as it should be, since we have a round-robin (RR) redirector).

In the meantime I tried the same from a clean machine, without NSCD installed, and I get the same as you (i.e. it works):

140424 08:41:54 30982 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
140424 08:41:54 30982 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
…
140424 08:41:59 30982 Xrd: ShowUrls: URL n.1: root://xrootd.ba.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
140424 08:41:59 30982 Xrd: ShowUrls: URL n.2: root://xrootd-redic.pi.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.

so the order is not fixed, and sooner or later it works.

I am starting to believe it really is a DNS caching problem (even though “host xrootd-cms.infn.it” shows RR behaviour; maybe only some calls are cached?).

Still, I had understood that the Xrootd client itself randomizes access to DNS-aliased hosts.
So the real question is:
if the answer from DNS is consistently

> 140423 16:43:46 5000 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//tmp/motd.
> 140423 16:43:46 5000 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//tmp/motd.


would the client ever try xrootd.ba.infn.it (via its internal randomization of the order)?
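
To make the question concrete, here is a minimal Python sketch of the behaviour one would expect from client-side randomization with failover: take all addresses behind the alias, shuffle them, and try each one until a TCP connection succeeds. This is only an illustration and not the XrdClient code; `resolve_all` and `connect_any` are hypothetical names, and using `socket.getaddrinfo` as the resolver is an assumption.

```python
import random
import socket


def resolve_all(alias, port=1094):
    """Return the distinct IPv4 addresses the system resolver gives for alias, in resolver order."""
    infos = socket.getaddrinfo(alias, port, socket.AF_INET, socket.SOCK_STREAM)
    seen = []
    for info in infos:
        addr = info[4][0]  # sockaddr is (ip, port) for AF_INET
        if addr not in seen:
            seen.append(addr)
    return seen


def connect_any(addresses, port=1094, timeout=5.0, rng=random,
                connect=socket.create_connection):
    """Try every address once, in shuffled order (hypothetical helper).

    Returns (address, connection) for the first address that accepts the
    connection, and raises ConnectionError only if all of them fail.
    """
    order = list(addresses)
    rng.shuffle(order)  # client-side randomization, independent of DNS order
    last_error = None
    for addr in order:
        try:
            return addr, connect((addr, port), timeout)
        except OSError as exc:
            last_error = exc  # remember the failure and move on to the next address
    raise ConnectionError(f"no address reachable out of {order}") from last_error
```

With this logic, `connect_any(resolve_all("xrootd-cms.infn.it"))` would reach the working redirector even if the resolver always returned the dead one first, because the second address is tried before giving up.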

By the way, I get exactly the same (i.e. not working) behaviour on lxplus:

[tboccali@lxplus0164 tboccali]$ xrdcp -d 1 root://xrootd-cms.infn.it///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root .
…
(consistently)
140424 08:45:16 7573 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
140424 08:45:16 7573 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
…

140424 08:45:41 7573 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
Last server error 10000 ('')

where I have

[tboccali@lxplus0164 tboccali]$ ps -ef |grep nscd
tboccali  8039  7389  1 08:46 pts/14   00:00:00 grep nscd
nscd     28362     1  0 Apr11 ?        00:07:29 /usr/sbin/nscd


configured as (/etc/nscd.conf)
        enable-cache            hosts           yes

(so yes, DNS caching is enabled…)

As for the other question:
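
A quick way to see whether a given host always gets the same first address for the alias is to resolve it repeatedly and count which address comes first. The sketch below assumes Python is available on the node; `first_answers` and `looks_pinned` are hypothetical helper names. A pinned order is only a hint: it may come from NSCD, but possibly also from address sorting inside the resolver library itself.

```python
import collections
import socket


def first_answers(alias, tries=20, port=1094, lookup=socket.getaddrinfo):
    """Resolve alias `tries` times and count which address is returned first."""
    counts = collections.Counter()
    for _ in range(tries):
        infos = lookup(alias, port, socket.AF_INET, socket.SOCK_STREAM)
        counts[infos[0][4][0]] += 1  # first entry, sockaddr tuple, IP string
    return counts


def looks_pinned(counts):
    """True if every lookup returned the same first address."""
    return len(counts) == 1


if __name__ == "__main__":
    result = first_answers("xrootd-cms.infn.it")
    print(dict(result), "pinned" if looks_pinned(result) else "rotating")
```

Running it on a node with NSCD and on a clean node, and comparing the two counts, would show whether the fixed order correlates with the cache.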
- on lxplus I do not have a root login file
- the full env dump is here

https://www.dropbox.com/s/4aoqbbynpyfrn6b/env_lxplus


thanks a lot

tom 


On 24 Apr 2014, at 03:58, Andrew Hanushevsky wrote:

> Hi Tommaso,
>  
> I tried this in my aliased config under 3.2.2 (I don’t have 3.2.4 handy) and it worked exactly as it was supposed to. I then tried it using xrootd-cms.infn.it and it worked equally well (see the debug output below). So I am at a loss as to why it works for me but not for you. Perhaps you have some strange xroot client envar set in the rootrc file or in your login rc file? Could you print out all your environment variables (i.e. printenv)?
>  
> Andy
>  
> 140423 16:43:45 5000 Xrd: Create: (C) 2004-2010 by the Xrootd group. XrdClient $Revision$ - Xrootd version: v3.2.2
> 140423 16:43:46 5000 Xrd: ShowUrls: The converted URLs count is 2
> 140423 16:43:46 5000 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//tmp/motd.
> 140423 16:43:46 5000 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//tmp/motd.
> 140423 16:43:46 5000 Xrd: ShowUrls: The converted URLs count is 2
> 140423 16:43:46 5000 Xrd: ShowUrls: URL n.1: root://xrootd.ba.infn.it:1094//tmp/motd.
> 140423 16:43:46 5000 Xrd: ShowUrls: URL n.2: root://xrootd-redic.pi.infn.it:1094//tmp/motd.
> 140423 16:43:46 5000 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
> 140423 16:43:46 5000 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
> 140423 16:43:46 5000 Xrd: Open: Connection attempt failed. Sleeping 5 seconds.
> 140423 16:43:51 5000 Xrd: ShowUrls: The converted URLs count is 2
> 140423 16:43:51 5000 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094//tmp/motd.
> 140423 16:43:51 5000 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094//tmp/motd.
> 140423 16:43:52 5000 Xrd: Open: Access to server granted.
> 140423 16:43:52 5000 Xrd: Open: Opening the remote file /tmp/motd
> 140423 16:43:52 5000 Xrd: Open: File open in progress.
> 140423 16:43:52 5007 Xrd: CheckErrorStatus: Server [xrootd-cms.infn.it] declared: No servers are available to read the file.(error code: 3011)
> Last server error 3011 ('No servers are available to read the file.')
> Error accessing path/file for root://xrootd-cms.infn.it//tmp/motd
>  
> From: Tommaso Boccali
> Sent: Wednesday, April 23, 2014 7:49 AM
> To: Andrew Hanushevsky
> Cc: [log in to unmask]
> Subject: Re: problem with aliased redirectors
>  
> (sorry for the flooding, last mail today ;)
> … but I could get the same with a newer CMSSW bundle:
>  
> lrwxrwxrwx 1 498 497 53 Feb 14 14:46 /cvmfs/cms.cern.ch/slc5_amd64_gcc481/cms/cmssw/CMSSW_7_0_0/external/slc5_amd64_gcc481/bin/xrdcp -> ../../../../../../external/xrootd/3.2.4-cms/bin/xrdcp
>  
> 140423 16:46:03 12350 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
> 140423 16:46:03 12350 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
> 140423 16:46:03 12350 Xrd: ShowUrls: The converted URLs count is 2
> 140423 16:46:03 12350 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
> 140423 16:46:03 12350 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
> 140423 16:46:03 12350 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
> 140423 16:46:03 12350 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
> 140423 16:46:03 12350 Xrd: Open: Connection attempt failed. Sleeping 5 seconds.
> 140423 16:46:08 12350 Xrd: ShowUrls: The converted URLs count is 2
> 140423 16:46:08 12350 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
> 140423 16:46:08 12350 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
> 140423 16:46:08 12350 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
> 140423 16:46:08 12350 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
> 140423 16:46:08 12350 Xrd: Open: Connection attempt failed. Sleeping 5 seconds
> …
> 140423 16:46:38 12350 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
> 140423 16:46:38 12350 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
> Last server error 10000 ('')
> Error accessing path/file for root://xrootd-cms.infn.it///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root
>  
> On 23 Apr 2014, at 16:38, Tommaso Boccali wrote:
> 
>> Ah sorry, I forgot your question about versions ..
>> we use the xrootd client bundled with CMSSW releases ... for Run1 data I see
>>  
>> lrwxrwxrwx 1 498 497 54 Mar  1 00:27 xrdcp -> ../../../../../../external/xrootd/3.1.0-cms2/bin/xrdcp
>>  
>> so quite a bit older than yours. As for ROOT, that also comes from the release, and it is
>>  
>> lrwxrwxrwx 1 498 497 53 Mar  1 00:27 root.exe -> ../../../../../../lcg/root/5.32.00-cms21/bin/root.exe
>>  
>> eventually, these are difficult to change, since they come with the full sw stack....
>>  
>> ciao ciao
>>  
>> tom
>>  
>> 
>> 
>> On Wed, Apr 23, 2014 at 4:20 PM, Tommaso Boccali wrote:
>> Ok , I came back and was able to do some more tests:
>> - DNS seems fine:
>> 
>> -bash-3.2$ dig xrootd-cms.infn.it
>> 
>> ; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> xrootd-cms.infn.it
>> ;; global options:  printcmd
>> ;; Got answer:
>> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34947
>> ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 4, ADDITIONAL: 6
>> 
>> ;; QUESTION SECTION:
>> ;xrootd-cms.infn.it.            IN      A
>> 
>> ;; ANSWER SECTION:
>> xrootd-cms.infn.it.     69424   IN      A       90.147.66.75
>> xrootd-cms.infn.it.     69424   IN      A       193.205.76.83
>> …
>> 
>> so they are the same records as yours.
>> 
>> - I can still consistently get a failure with:
>> 
>> xrdcp -d 1 root://xrootd-cms.infn.it///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root .
>> 
>> …
>> 140423 16:13:49 001 Xrd: ShowUrls: The converted URLs count is 2
>> 140423 16:13:49 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
>> 140423 16:13:49 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
>> 140423 16:13:49 001 Xrd: ShowUrls: The converted URLs count is 2
>> 140423 16:13:49 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
>> 140423 16:13:49 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
>> 140423 16:13:49 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
>> 140423 16:13:49 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
>> 140423 16:13:49 001 Xrd: Open: Connection attempt failed. Sleeping 5 seconds.
>> 140423 16:13:54 001 Xrd: ShowUrls: The converted URLs count is 2
>> 140423 16:13:54 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
>> 140423 16:13:54 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
>> 140423 16:13:54 001 Xrd: Connect: can't open connection to [xrootd-redic.pi.infn.it:1094]
>> 140423 16:13:54 001 Xrd: XrdNetFile: Error creating logical connection to xrootd-redic.pi.infn.it:1094
>> 140423 16:13:54 001 Xrd: Open: Connection attempt failed. Sleeping 5 seconds.
>> 140423 16:13:59 001 Xrd: ShowUrls: The converted URLs count is 2
>> 140423 16:13:59 001 Xrd: ShowUrls: URL n.1: root://xrootd-redic.pi.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
>> 140423 16:13:59 001 Xrd: ShowUrls: URL n.2: root://xrootd.ba.infn.it:1094///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root.
>> …
>> 
>> 
>> But: these machines, as well as the CERN ones (at the very least), do use NSCD (so they cache the DNS answers). Could this be the problem? I was guessing it would not be, since my understanding was that Xrootd itself randomizes the entries it gets from DNS …
>> If yes, it would be tricky: we basically cannot ask sites to disable NSCD, since it is used to decrease load …
>> 
>> PS: maybe you can try directly
>> 
>> 
>> xrdcp -d 1 root://xrootd-cms.infn.it///store/test/xrootd/T2_IT_Pisa/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root .
>> 
>> it should work for anyone up to the point where you are redirected (redirectors are w/o authentication, servers are….)
>> 
>> to see if it works for you
>> 
>> 
>> ciao ciao
>> 
>> tom
>> 
>> 
>> On 19 Apr 2014, at 05:02, Andrew Hanushevsky wrote:
>> 
>> > Hi Tommaso,
>> >
>> > OK, I created a setup similar to yours. Specifically, in the DNS I have:
>> >
>> > abh-rdr.slac.stanford.edu. 600  IN      A       134.79.120.146 // noric35
>> > abh-rdr.slac.stanford.edu. 600  IN      A       134.79.120.152 // noric40
>> >
>> > I started a redirector on noric35 and left noric40 with nothing. I then did an xrdcp (using the old client) and it worked as expected (see the trace below). Some things to consider:
>> >
>> > 1) Was your DNS set up as mine was (i.e. using A records)?
>> > 2) What was the release you were using? I used what is now R4 and 3.2.6 (a much older release) with identical results. The recovery code was not changed in the 3.2.x or the 3.3.x series. So, no surprise.
>> >
>> > My only assumption is that the DNS record was not set up in the way the client expected (though the trace seems to counter that notion). So, at the moment I am at a loss. Anyway, to cut down on the tracing chatter and make the output more readable, just use -d 1, which is good enough.
>> >
>> > Andy
>> >
>> > ./xrdcp -f -d 1 root://abh-rdr.slac.stanford.edu//tmp/motd /tmp/motd
>> > 140418 19:38:24 28376 Xrd: main: (C) 2004-2011 by the XRootD collaboration. Version: v20140318-ef1e4ab
>> > 140418 19:38:24 28376 Xrd: Create: (C) 2004-2010 by the Xrootd group. XrdClient $Revision$ - Xrootd version: v20140321-cdb721c
>> > 140418 19:38:24 28376 Xrd: ShowUrls: The converted URLs count is 2
>> > 140418 19:38:24 28376 Xrd: ShowUrls: URL n.1: root://noric40.slac.stanford.edu:1094//tmp/motd.
>> > 140418 19:38:24 28376 Xrd: ShowUrls: URL n.2: root://noric35.slac.stanford.edu:1094//tmp/motd.
>> > 140418 19:38:24 28376 Xrd: ShowUrls: The converted URLs count is 2
>> > 140418 19:38:24 28376 Xrd: ShowUrls: URL n.1: root://noric35.slac.stanford.edu:1094//tmp/motd.
>> > 140418 19:38:24 28376 Xrd: ShowUrls: URL n.2: root://noric40.slac.stanford.edu:1094//tmp/motd.
>> > 140418 19:38:24 28376 Xrd: Connect: can't open connection to [noric40.slac.stanford.edu:1094]
>> > 140418 19:38:24 28376 Xrd: XrdNetFile: Error creating logical connection to noric40.slac.stanford.edu:1094
>> > 140418 19:38:24 28376 Xrd: Open: Connection attempt failed. Sleeping 5 seconds.
>> > 140418 19:38:29 28376 Xrd: ShowUrls: The converted URLs count is 2
>> > 140418 19:38:29 28376 Xrd: ShowUrls: URL n.1: root://noric40.slac.stanford.edu:1094//tmp/motd.
>> > 140418 19:38:29 28376 Xrd: ShowUrls: URL n.2: root://noric35.slac.stanford.edu:1094//tmp/motd.
>> > 140418 19:38:29 28376 Xrd: Open: Access to server granted.
>> > 140418 19:38:29 28376 Xrd: Open: Opening the remote file /tmp/motd
>> > 140418 19:38:29 28376 Xrd: Open: File open in progress.
>> > 140418 19:38:29 28379 Xrd: HandleServerError: Received redirection to [rhel6-64b.slac.stanford.edu:1094]. Token=[]]. Opaque=[].
>> > 140418 19:38:29 28376 Xrd: main: root://abh-rdr.slac.stanford.edu//tmp/motd --> /tmp/motd
>> > 140418 19:38:29 28381 Xrd: Read: Hole in the cache: offs=0, len=1536
>> > [xrootd] Total 0.00 MB  |====================| 100.00 % [inf MB/s]
>> >
>> > ########################################################################
>> > Use REPLY-ALL to reply to list
>> >
>> > To unsubscribe from the XROOTD-L list, click the following link:
>> > https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>> 
>> 
>> 
>>  
>> -- 
>> Tommaso Boccali
>> INFN Pisa
>> 
>> 
> 
>  
> 
> 

