Hi Pete,

I didn't think so. That message has been around forever. What did change was
how the condition is handled, so it's likely that the message comes out more
"frequently" than it did before. What would be interesting is to figure out
why that message comes out at all. That would require careful coordination 
between the two logs and looking at the load at the time it occurs.

Andy
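
A rough sketch of how the two logs might be lined up, for illustration only. It assumes the olbd log lines start with the same `YYMMDD HH:MM:SS` timestamp as the xrootd log lines quoted below, which this thread does not confirm. The function names and the 10-second window are hypothetical choices, not part of xrootd.

```python
import re
from datetime import datetime, timedelta

# Matches the xrootd log line shape quoted in this thread, e.g.
# "050328 02:14:08 097 odc_Locate: <client> got no response from <host> path=..."
NO_RESPONSE = re.compile(
    r"^(?P<stamp>\d{6} \d{2}:\d{2}:\d{2}) \d+ odc_Locate: "
    r"(?P<client>\S+) got no response from (?P<host>\S+)"
)
# Assumption: every log line of interest starts with "YYMMDD HH:MM:SS ".
STAMP = re.compile(r"^(\d{6} \d{2}:\d{2}:\d{2}) ")


def parse_stamp(text):
    """Turn 'YYMMDD HH:MM:SS' into a datetime."""
    return datetime.strptime(text, "%y%m%d %H:%M:%S")


def find_no_response(lines):
    """Return (time, client, host) for each 'got no response' line."""
    events = []
    for line in lines:
        m = NO_RESPONSE.match(line)
        if m:
            events.append((parse_stamp(m["stamp"]), m["client"], m["host"]))
    return events


def lines_near(lines, when, window_seconds=10):
    """Return the lines whose timestamp is within the window around 'when'."""
    window = timedelta(seconds=window_seconds)
    nearby = []
    for line in lines:
        m = STAMP.match(line)
        if m and abs(parse_stamp(m.group(1)) - when) <= window:
            nearby.append(line)
    return nearby
```

Feeding the xrootd log to `find_no_response` and then passing each event time to `lines_near` with the other log would show what the other side was doing around each occurrence.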

----- Original Message ----- 
From: "Peter Elmer" <[log in to unmask]>
To: "abh" <[log in to unmask]>
Cc: <[log in to unmask]>
Sent: Monday, March 28, 2005 11:45 AM
Subject: Re: "xxx got no response" message in redirector logs


>  Hi Andy,
>
>  Ok, I see from the logs that it does happen occasionally (i.e. a handful
> of times over the past week on both redirectors at SLAC), but it sounds
> like it is handled gracefully. This is a new message that you added at
> some point since the last production build (20040907-0403), is that correct?
>
>                                   Pete
>
>
> On Mon, Mar 28, 2005 at 11:41:58AM -0800, abh wrote:
>> Hi Pete,
>>
>> When xrootd sends a request to the olbd it expects a reply in about 5
>> seconds (tunable via a config parameter). If no reply is received, the
>> message comes out and the client is asked to wait and come back later.
>> This gives the xrootd time to find another olbd, if necessary.
>>
>> Andy
>>
>> ----- Original Message ----- 
>> From: "Peter Elmer" <[log in to unmask]>
>> To: <[log in to unmask]>
>> Sent: Monday, March 28, 2005 7:46 AM
>> Subject: "xxx got no response" message in redirector logs
>>
>>
>> > Hi Andy,
>> >
>> > I just played the game of removing all normal messages from the redirector
>> >logs to see what remains. On one of the redirectors at SLAC there was one
>> >message which I didn't recognize:
>> >
>> >bbr-olb04> cat /var/adm/xrootd/logs/xrdlog | grep -v "asked to wait" | grep -v redirected | grep -v login | grep -v " disc " | grep -v "xrootd protocol anchor trim done"
>> >050328 00:00:00 059 (c) 2004 Stanford University/SLAC xrd version 20050321-0425_dbg
>> >050328 02:14:08 097 odc_Locate: polci.3440:169@noma0080 got no response from bbr-rdr04 path=/store/PR/R12/AllEvents/0002/42/12.3.4e/AllEvents_00024239_12.3.4eV00_C14.2.0bV01.01.root
>> >bbr-olb04>
>> >
>> > The full history for this client process is in (a) below. It _looks_ like
>> >the client was subsequently properly redirected for that file (and he came
>> >back to ask for another file 21 seconds later, so the application didn't
>> >exit with some error either). What does the "got no response from bbr-rdr04"
>> >message mean?
>> >
>> >                                  Pete
>> >
>> >(a)
>> >
>> >bbr-olb04> grep "polci.3440:169@noma0080" /var/adm/xrootd/logs/xrdlog
>> >050328 01:15:09 091 XrootdXeq: polci.3440:169@noma0080 login
>> >050328 01:15:09 091 odc_Locate: polci.3440:169@noma0080 redirected to kan024.slac.stanford.edu:1094 by bbr-rdr04 path=/store/PRskims/R12/14.4.0c/BRecoToDDstar/01/BRecoToDDstar_0164.01.root
>> >050328 01:15:10 091 odc_Locate: polci.3440:169@noma0080 redirected to kan038.slac.stanford.edu:1094 by bbr-rdr04 path=/store/PRskims/R12/14.4.0c/BRecoToDDstar/01/BRecoToDDstar_0164.02HUBC.root
>> >050328 01:15:10 091 odc_Locate: polci.3440:169@noma0080 asked to wait 5 by bbr-rdr04 path=/store/PR/R12/AllEvents/0002/42/12.4.0g/AllEvents_00024299_12.4.0gV01_C14.2.0bV00.01.root
>> >050328 01:15:15 091 odc_Locate: polci.3440:169@noma0080 redirected to kan031.slac.stanford.edu:1094 by bbr-rdr04 path=/store/PR/R12/AllEvents/0002/42/12.4.0g/AllEvents_00024299_12.4.0gV01_C14.2.0bV00.01.root
>> >050328 02:14:08 097 odc_Locate: polci.3440:169@noma0080 got no response from bbr-rdr04 path=/store/PR/R12/AllEvents/0002/42/12.3.4e/AllEvents_00024239_12.3.4eV00_C14.2.0bV01.01.root
>> >050328 02:14:13 029 odc_Locate: polci.3440:169@noma0080 redirected to kan035.slac.stanford.edu:1094 by bbr-rdr04 path=/store/PR/R12/AllEvents/0002/42/12.3.4e/AllEvents_00024239_12.3.4eV00_C14.2.0bV01.01.root
>> >050328 04:20:34 097 odc_Locate: polci.3440:169@noma0080 redirected to kan005.slac.stanford.edu:1094 by bbr-rdr04 path=/store/PR/R12/AllEvents/0002/43/12.4.0g/AllEvents_00024343_12.4.0gV01_C14.2.0bV01.01.root
>> >050328 04:53:36 032 XrootdXeq: polci.3440:169@noma0080 disc 3:38:27
>> >bbr-olb04>
>> >
>
>
>
> -------------------------------------------------------------------------
> Peter Elmer     E-mail: [log in to unmask]      Phone: +41 (22) 767-4644
> Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
> -------------------------------------------------------------------------
>