  Hi Manfred and Rolf,

  Two other small things, I see:

fzk-babar2> ps -ef | grep xrootd
root      3349  3082  0 Nov10 pts/0    00:00:11 /opt/xrootd/bin/xrootd -r -l /var/log/babar2.xrdlog -c /opt/xrootd/etc/redirector.cf
root      3361  3082  0 Nov10 pts/0    00:00:12 /opt/xrootd/bin/olbd -m -l /var/log/babar2.olblog -c /opt/xrootd/etc/redirector.cf
elmer     8970  8225  0 10:02 pts/3    00:00:00 grep xrootd
fzk-babar2> less /opt/xrootd/etc/redirector.cf

  For security reasons, the daemon should not run as root, but as some 
generic babar account which would normally own the data in the unix file
owner sense (e.g. whatever account you use for importing babar data). I 
believe Andy has disallowed running as root in later versions of xrootd/olbd,
too.
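
  As a rough sketch, a check like the following could flag the case shown
in the ps output above. The function name root_owned is made up for
illustration; it only filters "user command" pairs and changes nothing.

```shell
#!/bin/sh
# Sketch: report xrootd/olbd processes that are owned by root.
# root_owned is a hypothetical helper name, not part of xrootd.

# Reads "user command" pairs on stdin, prints the command name of every
# xrootd or olbd process whose owner is root.
root_owned() {
    awk '$1 == "root" && ($2 == "xrootd" || $2 == "olbd") { print $2 }'
}

# Typical use (prints nothing when neither daemon runs as root):
#   ps -e -o user= -o comm= | root_owned
```

  Any output means a daemon should be restarted under the unprivileged
data-owning account instead.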

  Also the xrootd protocol allows clients to survive server crashes (or
restarts), but you need to set something up to automatically restart the
xrootd should it crash for some reason. Checking and restarting the
server every 10 minutes should be within the period in which the client will
keep retrying to connect (eventually it times out and just gives up).
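
  A minimal watchdog sketch for this, assuming pgrep is available and that
the script is run from the crontab of the unprivileged account. The script
path in the crontab comment and the helper names are hypothetical; the
daemon command lines are the ones from the ps output above.

```shell
#!/bin/sh
# Watchdog sketch: restart a daemon if no process with that name exists.
# Meant to be run from cron every 10 minutes, for example
# (script path is hypothetical):
#   */10 * * * * /opt/xrootd/bin/xrd_watchdog.sh

# Return 0 if a process with exactly this name is running.
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# restart_if_down NAME COMMAND [ARGS...]
# Starts COMMAND in the background only when NAME is not running.
restart_if_down() {
    name="$1"
    shift
    if is_running "$name"; then
        return 0
    fi
    echo "$(date '+%y%m%d %H:%M:%S') $name not running, restarting"
    "$@" &
}

# Command lines as seen in the ps output; uncomment to use.
# restart_if_down olbd   /opt/xrootd/bin/olbd   -m -l /var/log/babar2.olblog -c /opt/xrootd/etc/redirector.cf
# restart_if_down xrootd /opt/xrootd/bin/xrootd -r -l /var/log/babar2.xrdlog -c /opt/xrootd/etc/redirector.cf
```

  The log files would need to be writable by whatever account runs this.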

                                   Pete



On Thu, Nov 11, 2004 at 10:06:24AM +0100, Peter Elmer wrote:
>   Hi Manfred,
> 
> On Thu, Nov 11, 2004 at 09:52:28AM +0100, Manfred Alef wrote:
> > the redirector server was upgraded to SL 3.03. Now we could
> > start olbd from xrootd's RHEL RPM without any problem.
> 
>   Ok, that is strange. The only guess I have is that it could have been the 
> xinetd stuff interfering with starting by hand (if that is what you did).
> 
>   In any case, I looked at the logs for the xrootd/olbd on babar2 and there
> is still a problem. Normally the xrootd should connect to the olbd on
> the same machine, but there are errors in the xrootd log file:
> 
> 041110 09:43:54 3349 odc_Manager: Connected to l01-001-122
> 041110 09:43:54 3349 odc_GetLine: Unable to reading request ; connection reset by peer
> 041110 09:43:54 3349 odc_Manager: Unable to receive msg from l01-001-122; connection reset by peer
> 
> and in the olbd log file:
> 
> 041110 09:43:54 3361 olb_Accept: Unable to accept connection from l01-001-122.gridka.de; permission denied
> 
>   Looking at the config file, I see:
> 
> olb.allow host l01-001-122      # babar2.fzk.de
> olb.allow host f01-001-121
> olb.allow host f01-001-122
> 
>   I think you may need to specify the full hostname, including domain, i.e.
> 
> olb.allow host l01-001-122.gridka.de     # babar2.fzk.de
> olb.allow host f01-001-121.gridka.de
> olb.allow host f01-001-122.gridka.de
> 
>   Does that work?
> 
>                                    Pete
> 
> 
> > Peter Elmer wrote:
> > >   Hi Manfred and Rolf,
> > > 
> > >   Sorry for the late reply. (You picked a somewhat awkward time to try this
> > > since Andy is away and I'm just back from vacation in a series of 
> > > meetings/transits this past week!)
> > > 
> > >   I'll give this a try to see if I can reproduce it. I see, however, that
> > > you restarted things yesterday:
> > > 
> > > root      3349  3082  0 Nov10 pts/0    00:00:11 /opt/xrootd/bin/xrootd -r -l /var/log/babar2.xrdlog -c /opt/xrootd/etc/redirector.cf
> > > root      3361  3082  0 Nov10 pts/0    00:00:11 /opt/xrootd/bin/olbd -m -l /var/log/babar2.olblog -c /opt/xrootd/etc/redirector.cf
> > > 
> > > and I don't see anything in the log files about "Unable to bind socket; 
> > > address already in use". There are other problems related to the dataservers
> > > connecting to the redirector, I think, but I'll look at those now.
> > > 
> > >   One thing that I recall is that Jos and I looked at setting up xinetd
> > > style restarts of the server. That wasn't still there, was it? (I don't
> > > see it now, but presumably it would have interfered with separate attempts
> > > to start the daemons by hand.)
> > > 
> > >                                    Pete
> > > 
> > > On Fri, Nov 05, 2004 at 01:00:36PM +0100, Manfred Alef wrote:
> > > 
> > >>Hi Pete,
> > >>
> > >>the config files are from
> > >>http://xrootd.slac.stanford.edu/examples/multserver/index.html.
> > >>
> > >>Best regards
> > >>Manfred
> > >>
> > >>babar2 # cat redirector.cf
> > >>#
> > >># redirector.cf
> > >>#
> > >># xrootd
> > >>#+xrootd.fslib /opt/xrootd/lib/libXrdOfs.so
> > >>xrootd.fslib /usr/local/xrootd/lib/i386_linux24/libXrdOfs.so
> > >>xrootd.export /data
> > >>odc.manager l01-001-122 3121
> > >>odc.trace redirect
> > >># olbd
> > >>olb.port 3121
> > >>#+olb.allow host kanrdr.slac.stanford.edu
> > >>#+olb.allow host kan001.slac.stanford.edu
> > >>#+olb.allow host kan002.slac.stanford.edu
> > >>olb.allow host l01-001-122    # babar2.fzk.de
> > >>olb.allow host f01-001-121
> > >>olb.allow host f01-001-122
> > >>babar2 #
> > >>
> > >>[root@f01-001-122 etc]# cat dataserver.cf
> > >>#
> > >># dataserver.cf
> > >>#
> > >># xrootd
> > >>#+xrootd.fslib /opt/xrootd/lib/libXrdOfs.so
> > >>xrootd.fslib /usr/local/xrootd/lib/i386_linux24/libXrdOfs.so
> > >>xrootd.export /data
> > >>oss.readonly
> > >>odc.manager l01-001-122 3121
> > >># olbd
> > >>olb.port 3121
> > >>olb.subscribe l01-001-122 3121
> > >>[root@f01-001-122 etc]#
> > >>
> > >>
> > >>
> > >>Peter Elmer wrote:
> > >>
> > >>>  [CC the xrootd mailing list]
> > >>>
> > >>>  Hi Rolf,
> > >>>
> > >>>  Do you have the config files you are using to try to start xrootd and
> > >>>the olbd (on the redirector and the file servers)?
> > >>>
> > >>>                                   Pete
> > >>>
> > >>>On Fri, Nov 05, 2004 at 11:33:47AM +0100, Manfred Alef wrote:
> > >>>
> > >>>
> > >>>>Hi Pete,
> > >>>>
> > >>>>I am sitting here at GridKa together with Manfred Alef and we are trying 
> > >>>>to install xrootd on two of the fileservers and on babar2, a login 
> > >>>>machine which will also be the redirector.
> > >>>>We use the current production version and had no problems starting xrootd 
> > >>>>and olbd on one of the fileservers.  However, when we try to start the 
> > >>>>olbd on the redirector, it exits with exit code 1.  The logfile is 
> > >>>>attached.  We made sure nothing else is going on on the machine (reboot) 
> > >>>>and also removed any old socket we could find in /tmp/.olb/
> > >>>>Do you have an idea what could go wrong or what else we could try?
> > >>>>
> > >>>>Cheers, Rolf
> > >>>>
> > >>>>---------------------------------------------------------------
> > >>>>41105 10:44:31 32156 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> > >>>>041105 10:44:31 32156 olb_Bind: Unable to bind socket; address already in use
> > >>>>041105 10:44:31 32156 olb_Config: Manager initialization failed.
> > >>>>041105 10:46:15 32191 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> > >>>>041105 10:46:15 32191 olb_Bind: Unable to bind socket; address already in use
> > >>>>041105 10:46:15 32191 olb_Config: Manager initialization failed.
> > >>>>041105 10:48:49 32248 Schedule scheduling midnight runner in 47471 seconds
> > >>>>041105 10:48:49 32248 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> > >>>>041105 10:48:49 32248 olb_Bind: Unable to bind socket; address already in use
> > >>>>041105 10:48:49 32248 olb_Config: Manager initialization failed.
> > >>>>041105 11:10:37 3175 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> > >>>>041105 11:10:37 3175 olb_Bind: Unable to bind socket; address already in use
> > >>>>041105 11:10:37 3175 olb_Config: Manager initialization failed.
> > >>>>041105 11:18:16 3332 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> > >>>>041105 11:18:16 3332 olb_Bind: Unable to bind socket; address already in use
> > >>>>041105 11:18:16 3332 olb_Config: Manager initialization failed.
> > >>>>----------------------------------------------------------------
> > >>>
> > >>>
> > >>>
> > >>>



-------------------------------------------------------------------------
Peter Elmer     E-mail: [log in to unmask]      Phone: +41 (22) 767-4644
Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
-------------------------------------------------------------------------