  Hi Manfred and Rolf,

  It doesn't look like xrootd/olbd are running on babar2 right now. Rolf, 
will you be starting it? I would have thought that xinetd would have started
it, though.  Ah, does xinetd itself need to be restarted in order to see
the changes in /etc/xinetd.d/?

  One other small thing: it looks like different config files are being
used for the xrootd and the olbd:

fzk-babar2> grep server_args /etc/xinetd.d/xrootd /etc/xinetd.d/olbd 
/etc/xinetd.d/xrootd:   server_args     = -r -l /var/log/xrootd -c /opt/xrootd/etc/redirector.cf
/etc/xinetd.d/olbd:     server_args     = -m -l /var/log/olbd -c /opt/xrootd/etc/xrootd_redirector.cf

Was that intentional? In general you should be able to use a single config
file for both the xrootd and olbd on any given single machine.
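
To check this mechanically, here is a minimal sketch that pulls the -c argument out of each xinetd service file and compares the two. It assumes the "server_args = ..." layout shown in the grep output above; the function names config_of and same_config are just illustrative, not part of xrootd or xinetd.

```shell
#!/bin/sh
# Print the value following "-c" on the server_args line of an xinetd service file.
config_of() {
    awk '$1 == "server_args" {
             for (i = 1; i < NF; i++)
                 if ($i == "-c") { print $(i + 1); exit }
         }' "$1"
}

# Succeed only if both service files name the same (non-empty) config file.
same_config() {
    a=$(config_of "$1")
    b=$(config_of "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example, using the paths from the grep output above:
#   same_config /etc/xinetd.d/xrootd /etc/xinetd.d/olbd || echo "config files differ"
```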

                                   Pete

On Thu, Nov 11, 2004 at 10:46:38AM +0100, Manfred Alef wrote:
> Hi Pete,
> 
> I have enabled the old xinetd.d files. You may now be able
> to start olbd and xrootd as babaradm.
> 
> The xinetd.d daemons use the old configuration file
> /opt/xrootd/etc/xrootd_redirector.cf.
> 
> I have killed the daemons running as root.
> 
> Regards
> Manfred
> 
> 
> Peter Elmer wrote:
> >   Hi Manfred and Rolf,
> > 
> >   Two other small things, I see:
> > 
> > fzk-babar2> ps -ef | grep xrootd
> > root      3349  3082  0 Nov10 pts/0    00:00:11 /opt/xrootd/bin/xrootd -r -l /var/log/babar2.xrdlog -c /opt/xrootd/etc/redirector.cf
> > root      3361  3082  0 Nov10 pts/0    00:00:12 /opt/xrootd/bin/olbd -m -l /var/log/babar2.olblog -c /opt/xrootd/etc/redirector.cf
> > elmer     8970  8225  0 10:02 pts/3    00:00:00 grep xrootd
> > fzk-babar2> less /opt/xrootd/etc/redirector.cf
> > 
> >   For security reasons, the daemon should not run as root, but as some 
> > generic babar account which would normally own the data in the unix file
> > owner sense (e.g. whatever account you use for importing babar data). I 
> > believe Andy has disallowed running as root in later versions of xrootd/olbd, 
> > too.
> > 
> >   Also the xrootd protocol allows clients to survive server crashes (or
> > restarts), but you need to set something up to automatically restart the
> > xrootd should it crash for some reason. Checking and restarting the
> > server every 10 minutes should be within the period in which the client will
> > keep retrying to connect (eventually it times out and just gives up).
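
A periodic check of this kind could be sketched roughly as follows. This is only one possible approach, not something shipped with xrootd: it assumes the daemon stays in the foreground when started this way (so that $! is its real PID), and the pidfile location and script path are hypothetical.

```shell
#!/bin/sh
# Start COMMAND unless the PID recorded in PIDFILE is still alive.
# Usage: ensure_running PIDFILE COMMAND [ARGS...]
ensure_running() {
    pidfile=$1
    shift
    if [ -s "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        return 0                    # still alive, nothing to do
    fi
    "$@" >/dev/null 2>&1 &          # (re)start in the background
    echo $! > "$pidfile"            # remember the new PID for the next check
    echo "restarted: $1"
}

# Hypothetical use from a script run by cron as the unprivileged babar account:
#   ensure_running /var/tmp/xrootd.pid /opt/xrootd/bin/xrootd -r -l /var/log/xrootd -c /opt/xrootd/etc/redirector.cf
# with a crontab entry such as:
#   */10 * * * * /path/to/check_xrootd.sh
```

Running it from the crontab of the data-owning account rather than root's keeps the restart consistent with the advice about not running the daemons as root.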
> > 
> >                                    Pete
> > 
> > 
> > 
> > On Thu, Nov 11, 2004 at 10:06:24AM +0100, Peter Elmer wrote:
> > 
> >>  Hi Manfred,
> >>
> >>On Thu, Nov 11, 2004 at 09:52:28AM +0100, Manfred Alef wrote:
> >>
> >>>the redirector server was upgraded to SL 3.03. Now we could
> >>>start olbd from xrootd's RHEL RPM without any problem.
> >>
> >>  Ok, that is strange. The only guess I have is that it could have been the 
> >>xinetd stuff interfering with starting by hand (if that is what you did).
> >>
> >>  In any case, I looked at the logs for the xrootd/olbd on babar2 and there
> >>is still a problem. Normally the xrootd should connect to the olbd on
> >>the same machine, but there are errors in the xrootd log file:
> >>
> >>041110 09:43:54 3349 odc_Manager: Connected to l01-001-122
> >>041110 09:43:54 3349 odc_GetLine: Unable to reading request ; connection reset by peer
> >>041110 09:43:54 3349 odc_Manager: Unable to receive msg from l01-001-122; connection reset by peer
> >>
> >>and in the olbd log file:
> >>
> >>041110 09:43:54 3361 olb_Accept: Unable to accept connection from l01-001-122.gridka.de; permission denied
> >>
> >>  Looking at the config file, I see:
> >>
> >>olb.allow host l01-001-122      # babar2.fzk.de
> >>olb.allow host f01-001-121
> >>olb.allow host f01-001-122
> >>
> >>  I think you may need to specify the full hostname, including domain, i.e.
> >>
> >>olb.allow host l01-001-122.gridka.de     # babar2.fzk.de
> >>olb.allow host f01-001-121.gridka.de
> >>olb.allow host f01-001-122.gridka.de
> >>
> >>  Does that work?
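
A quick way to spot such short entries is sketched below. It assumes the "olb.allow host NAME" layout shown above and treats any NAME without a dot as lacking a domain; the function name short_allow_hosts is made up for illustration.

```shell
#!/bin/sh
# Print every active "olb.allow host NAME" entry whose NAME contains no dot,
# i.e. entries that are probably missing the domain part.
# Commented-out lines (e.g. "#+olb.allow ...") are skipped because their
# first field is not exactly "olb.allow".
short_allow_hosts() {
    awk '$1 == "olb.allow" && $2 == "host" && $3 !~ /\./ { print $3 }' "$1"
}

# Example:
#   short_allow_hosts /opt/xrootd/etc/redirector.cf
```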
> >>
> >>                                   Pete
> >>
> >>
> >>
> >>>Peter Elmer wrote:
> >>>
> >>>>  Hi Manfred and Rolf,
> >>>>
> >>>>  Sorry for the late reply. (You picked a somewhat awkward time to try this
> >>>>since Andy is away and I'm just back from vacation in a series of 
> >>>>meetings/transits this past week!)
> >>>>
> >>>>  I'll give this a try to see if I can reproduce it. I see, however, that
> >>>>you restarted things yesterday:
> >>>>
> >>>>root      3349  3082  0 Nov10 pts/0    00:00:11 /opt/xrootd/bin/xrootd -r -l /var/log/babar2.xrdlog -c /opt/xrootd/etc/redirector.cf
> >>>>root      3361  3082  0 Nov10 pts/0    00:00:11 /opt/xrootd/bin/olbd -m -l /var/log/babar2.olblog -c /opt/xrootd/etc/redirector.cf
> >>>>
> >>>>and I don't see anything in the log files about "Unable to bind socket; 
> >>>>address already in use". There are other problems related to the dataservers
> >>>>connecting to the redirector, I think, but I'll look at those now.
> >>>>
> >>>>  One thing that I recall is that Jos and I looked at setting up xinetd
> >>>>style restarts of the server. That wasn't still there, was it? (I don't
> >>>>see it now, but presumably it would have interfered with separate attempts
> >>>>to start the daemons by hand.)
> >>>>
> >>>>                                   Pete
> >>>>
> >>>>On Fri, Nov 05, 2004 at 01:00:36PM +0100, Manfred Alef wrote:
> >>>>
> >>>>
> >>>>>Hi Pete,
> >>>>>
> >>>>>the config files are from
> >>>>>http://xrootd.slac.stanford.edu/examples/multserver/index.html.
> >>>>>
> >>>>>Best regards
> >>>>>Manfred
> >>>>>
> >>>>>babar2 # cat redirector.cf
> >>>>>#
> >>>>># redirector.cf
> >>>>>#
> >>>>># xrootd
> >>>>>#+xrootd.fslib /opt/xrootd/lib/libXrdOfs.so
> >>>>>xrootd.fslib /usr/local/xrootd/lib/i386_linux24/libXrdOfs.so
> >>>>>xrootd.export /data
> >>>>>odc.manager l01-001-122 3121
> >>>>>odc.trace redirect
> >>>>># olbd
> >>>>>olb.port 3121
> >>>>>#+olb.allow host kanrdr.slac.stanford.edu
> >>>>>#+olb.allow host kan001.slac.stanford.edu
> >>>>>#+olb.allow host kan002.slac.stanford.edu
> >>>>>olb.allow host l01-001-122    # babar2.fzk.de
> >>>>>olb.allow host f01-001-121
> >>>>>olb.allow host f01-001-122
> >>>>>babar2 #
> >>>>>
> >>>>>[root@f01-001-122 etc]# cat dataserver.cf
> >>>>>#
> >>>>># dataserver.cf
> >>>>>#
> >>>>># xrootd
> >>>>>#+xrootd.fslib /opt/xrootd/lib/libXrdOfs.so
> >>>>>xrootd.fslib /usr/local/xrootd/lib/i386_linux24/libXrdOfs.so
> >>>>>xrootd.export /data
> >>>>>oss.readonly
> >>>>>odc.manager l01-001-122 3121
> >>>>># olbd
> >>>>>olb.port 3121
> >>>>>olb.subscribe l01-001-122 3121
> >>>>>[root@f01-001-122 etc]#
> >>>>>
> >>>>>
> >>>>>
> >>>>>Peter Elmer wrote:
> >>>>>
> >>>>>
> >>>>>> [CC the xrootd mailing list]
> >>>>>>
> >>>>>> Hi Rolf,
> >>>>>>
> >>>>>> Do you have the config files you are using to try to start xrootd and
> >>>>>>the olbd (on the redirector and the file servers)?
> >>>>>>
> >>>>>>                                  Pete
> >>>>>>
> >>>>>>On Fri, Nov 05, 2004 at 11:33:47AM +0100, Manfred Alef wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>Hi Pete,
> >>>>>>>
> >>>>>>>I am sitting here at GridKa together with Manfred Alef and we are trying 
> >>>>>>>to install xrootd on two of the fileservers and on babar2, a login 
> >>>>>>>machine which will also be the redirector.
> >>>>>>>We use the current production version and had no problems starting xrootd 
> >>>>>>>and olbd on one of the fileservers.  However, when we try to start the 
> >>>>>>>olbd on the redirector, it exits with exit code 1.  The logfile is 
> >>>>>>>attached.  We made sure nothing else is going on on the machine (reboot) 
> >>>>>>>and also removed any old socket we could find in /tmp/.olb/
> >>>>>>>Do you have an idea what could go wrong or what else we could try?
> >>>>>>>
> >>>>>>>Cheers, Rolf
> >>>>>>>
> >>>>>>>---------------------------------------------------------------
> >>>>>>>041105 10:44:31 32156 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> >>>>>>>041105 10:44:31 32156 olb_Bind: Unable to bind socket; address already in use
> >>>>>>>041105 10:44:31 32156 olb_Config: Manager initialization failed.
> >>>>>>>041105 10:46:15 32191 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> >>>>>>>041105 10:46:15 32191 olb_Bind: Unable to bind socket; address already in use
> >>>>>>>041105 10:46:15 32191 olb_Config: Manager initialization failed.
> >>>>>>>041105 10:48:49 32248 Schedule scheduling midnight runner in 47471 seconds
> >>>>>>>041105 10:48:49 32248 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> >>>>>>>041105 10:48:49 32248 olb_Bind: Unable to bind socket; address already in use
> >>>>>>>041105 10:48:49 32248 olb_Config: Manager initialization failed.
> >>>>>>>041105 11:10:37 3175 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> >>>>>>>041105 11:10:37 3175 olb_Bind: Unable to bind socket; address already in use
> >>>>>>>041105 11:10:37 3175 olb_Config: Manager initialization failed.
> >>>>>>>041105 11:18:16 3332 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> >>>>>>>041105 11:18:16 3332 olb_Bind: Unable to bind socket; address already in use
> >>>>>>>041105 11:18:16 3332 olb_Config: Manager initialization failed.
> >>>>>>>----------------------------------------------------------------
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> > 
> > 
> > 
> > -------------------------------------------------------------------------
> > Peter Elmer     E-mail: [log in to unmask]      Phone: +41 (22) 767-4644
> > Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
> > -------------------------------------------------------------------------
> > 
> 



-------------------------------------------------------------------------
Peter Elmer     E-mail: [log in to unmask]      Phone: +41 (22) 767-4644
Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
-------------------------------------------------------------------------