Hi Manfred and Rolf,

It doesn't look like xrootd/olbd are running on babar2 right now. Rolf,
will you be starting it? I would have thought that xinetd would have started
it, though. Ah, does xinetd itself need to be restarted in order to see the
changes in /etc/xinetd.d/?

One other small thing: it looks like different config files are being used
for the xrootd and the olbd:

fzk-babar2> grep server_args /etc/xinetd.d/xrootd /etc/xinetd.d/olbd
/etc/xinetd.d/xrootd: server_args = -r -l /var/log/xrootd -c /opt/xrootd/etc/redirector.cf
/etc/xinetd.d/olbd: server_args = -m -l /var/log/olbd -c /opt/xrootd/etc/xrootd_redirector.cf

Was that intentional? In general you should be able to use a single config
file for both the xrootd and olbd on any given single machine.

Pete

On Thu, Nov 11, 2004 at 10:46:38AM +0100, Manfred Alef wrote:
> Hi Pete,
>
> I have enabled the old xinetd.d files. You may now be able
> to start olbd and xrootd as babaradm.
>
> The xinetd.d daemons use the old configuration file
> /opt/xrootd/etc/xrootd_redirector.cf.
>
> I have killed the daemons running as root.
>
> Regards
> Manfred
>
>
> Peter Elmer wrote:
> > Hi Manfred and Rolf,
> >
> > Two other small things, I see:
> >
> > fzk-babar2> ps -ef | grep xrootd
> > root 3349 3082 0 Nov10 pts/0 00:00:11 /opt/xrootd/bin/xrootd -r -l /var/log/babar2.xrdlog -c /opt/xrootd/etc/redirector.cf
> > root 3361 3082 0 Nov10 pts/0 00:00:12 /opt/xrootd/bin/olbd -m -l /var/log/babar2.olblog -c /opt/xrootd/etc/redirector.cf
> > elmer 8970 8225 0 10:02 pts/3 00:00:00 grep xrootd
> > fzk-babar2> less /opt/xrootd/etc/redirector.cf
> >
> > For security reasons, the daemon should not run as root, but as some
> > generic babar account which would normally own the data in the unix file
> > owner sense (e.g. whatever account you use for importing babar data). I
> > believe Andy has disallowed running as root in later versions of xrootd/olbd,
> > too.
> >
> > Also the xrootd protocol allows clients to survive server crashes (or
> > restarts), but you need to set something up to automatically restart the
> > xrootd should it crash for some reason. Checking and restarting the
> > server every 10 minutes should be within the period in which the client will
> > keep retrying to connect (eventually it times out and just gives up).
> >
> > Pete
> >
> >
> >
> > On Thu, Nov 11, 2004 at 10:06:24AM +0100, Peter Elmer wrote:
> >
> >> Hi Manfred,
> >>
> >>On Thu, Nov 11, 2004 at 09:52:28AM +0100, Manfred Alef wrote:
> >>
> >>>the redirector server was upgraded to SL 3.03. Now we could
> >>>start olbd from xrootd's RHEL RPM without any problem.
> >>
> >> Ok, that is strange. The only guess I have is that it could have been the
> >>xinetd stuff interfering with starting by hand (if that is what you did).
> >>
> >> In any case, I looked at the logs for the xrootd/olbd on babar2 and there
> >>is still a problem. Normally the xrootd should connect to the olbd on
> >>the same machine, but there are errors in the xrootd log file:
> >>
> >>041110 09:43:54 3349 odc_Manager: Connected to l01-001-122
> >>041110 09:43:54 3349 odc_GetLine: Unable to reading request ; connection reset by peer
> >>041110 09:43:54 3349 odc_Manager: Unable to receive msg from l01-001-122; connection reset by peer
> >>
> >>and in the olbd log file:
> >>
> >>041110 09:43:54 3361 olb_Accept: Unable to accept connection from l01-001-122.gridka.de; permission denied
> >>
> >> Looking at the config file, I see:
> >>
> >>olb.allow host l01-001-122 # babar2.fzk.de
> >>olb.allow host f01-001-121
> >>olb.allow host f01-001-122
> >>
> >> I think you may need to specify the full hostname, including domain, i.e.
> >>
> >>olb.allow host l01-001-122.gridka.de # babar2.fzk.de
> >>olb.allow host f01-001-121.gridka.de
> >>olb.allow host f01-001-122.gridka.de
> >>
> >> Does that work?
> >>
> >> Pete
> >>
> >>
> >>
> >>>Peter Elmer wrote:
> >>>
> >>>> Hi Manfred and Rolf,
> >>>>
> >>>> Sorry for the late reply. (You picked a somewhat awkward time to try this
> >>>>since Andy is away and I'm just back from vacation in a series of
> >>>>meetings/transits this past week!)
> >>>>
> >>>> I'll give this a try to see if I can reproduce it. I see, however, that
> >>>>you restarted things yesterday:
> >>>>
> >>>>root 3349 3082 0 Nov10 pts/0 00:00:11 /opt/xrootd/bin/xrootd -r -l /var/log/babar2.xrdlog -c /opt/xrootd/etc/redirector.cf
> >>>>root 3361 3082 0 Nov10 pts/0 00:00:11 /opt/xrootd/bin/olbd -m -l /var/log/babar2.olblog -c /opt/xrootd/etc/redirector.cf
> >>>>
> >>>>and I don't see anything in the log files about "Unable to bind socket;
> >>>>address already in use". There are other problems related to the dataservers
> >>>>connecting to the redirector, I think, but I'll look at those now.
> >>>>
> >>>> One thing that I recall is that Jos and I looked at setting up xinetd
> >>>>style restarts of the server. That wasn't still there, was it? (I don't
> >>>>see it now, but presumably it would have interfered with separate attempts
> >>>>to start the daemons by hand.)
> >>>>
> >>>> Pete
> >>>>
> >>>>On Fri, Nov 05, 2004 at 01:00:36PM +0100, Manfred Alef wrote:
> >>>>
> >>>>
> >>>>>Hi Pete,
> >>>>>
> >>>>>the config files are from
> >>>>>http://xrootd.slac.stanford.edu/examples/multserver/index.html.
> >>>>>
> >>>>>Best regards
> >>>>>Manfred
> >>>>>
> >>>>>babar2 # cat redirector.cf
> >>>>>#
> >>>>># redirector.cf
> >>>>>#
> >>>>># xrootd
> >>>>>#+xrootd.fslib /opt/xrootd/lib/libXrdOfs.so
> >>>>>xrootd.fslib /usr/local/xrootd/lib/i386_linux24/libXrdOfs.so
> >>>>>xrootd.export /data
> >>>>>odc.manager l01-001-122 3121
> >>>>>odc.trace redirect
> >>>>># olbd
> >>>>>olb.port 3121
> >>>>>#+olb.allow host kanrdr.slac.stanford.edu
> >>>>>#+olb.allow host kan001.slac.stanford.edu
> >>>>>#+olb.allow host kan002.slac.stanford.edu
> >>>>>olb.allow host l01-001-122 # babar2.fzk.de
> >>>>>olb.allow host f01-001-121
> >>>>>olb.allow host f01-001-122
> >>>>>babar2 #
> >>>>>
> >>>>>[root@f01-001-122 etc]# cat dataserver.cf
> >>>>>#
> >>>>># dataserver.cf
> >>>>>#
> >>>>># xrootd
> >>>>>#+xrootd.fslib /opt/xrootd/lib/libXrdOfs.so
> >>>>>xrootd.fslib /usr/local/xrootd/lib/i386_linux24/libXrdOfs.so
> >>>>>xrootd.export /data
> >>>>>oss.readonly
> >>>>>odc.manager l01-001-122 3121
> >>>>># olbd
> >>>>>olb.port 3121
> >>>>>olb.subscribe l01-001-122 3121
> >>>>>[root@f01-001-122 etc]#
> >>>>>
> >>>>>
> >>>>>
> >>>>>Peter Elmer wrote:
> >>>>>
> >>>>>
> >>>>>> [CC the xrootd mailing list]
> >>>>>>
> >>>>>> Hi Rolf,
> >>>>>>
> >>>>>> Do you have the config files you are using to try to start xrootd and
> >>>>>>the olbd (on the redirector and the file servers)?
> >>>>>>
> >>>>>> Pete
> >>>>>>
> >>>>>>On Fri, Nov 05, 2004 at 11:33:47AM +0100, Manfred Alef wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>Hi Pete,
> >>>>>>>
> >>>>>>>I am sitting here at GridKa together with Manfred Alef and we are trying
> >>>>>>>to install xrootd on two of the fileservers and on babar2, a login
> >>>>>>>machine which will also be the redirector.
> >>>>>>>We use the current production version and had no problems starting xrootd
> >>>>>>>and olbd on one of the fileservers. However, when we try to start the
> >>>>>>>olbd on the redirector, it exits with exit code 1. The logfile is
> >>>>>>>attached.
> >>>>>>>We made sure nothing else is going on on the machine (reboot)
> >>>>>>>and also removed any old socket we could find in /tmp/.olb/.
> >>>>>>>Do you have an idea what could go wrong or what else we could try?
> >>>>>>>
> >>>>>>>Cheers, Rolf
> >>>>>>>
> >>>>>>>---------------------------------------------------------------
> >>>>>>>041105 10:44:31 32156 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> >>>>>>>041105 10:44:31 32156 olb_Bind: Unable to bind socket; address already in use
> >>>>>>>041105 10:44:31 32156 olb_Config: Manager initialization failed.
> >>>>>>>041105 10:46:15 32191 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> >>>>>>>041105 10:46:15 32191 olb_Bind: Unable to bind socket; address already in use
> >>>>>>>041105 10:46:15 32191 olb_Config: Manager initialization failed.
> >>>>>>>041105 10:48:49 32248 Schedule scheduling midnight runner in 47471 seconds
> >>>>>>>041105 10:48:49 32248 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> >>>>>>>041105 10:48:49 32248 olb_Bind: Unable to bind socket; address already in use
> >>>>>>>041105 10:48:49 32248 olb_Config: Manager initialization failed.
> >>>>>>>041105 11:10:37 3175 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> >>>>>>>041105 11:10:37 3175 olb_Bind: Unable to bind socket; address already in use
> >>>>>>>041105 11:10:37 3175 olb_Config: Manager initialization failed.
> >>>>>>>041105 11:18:16 3332 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> >>>>>>>041105 11:18:16 3332 olb_Bind: Unable to bind socket; address already in use
> >>>>>>>041105 11:18:16 3332 olb_Config: Manager initialization failed.
> >>>>>>>----------------------------------------------------------------
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >
> >
> >
> > -------------------------------------------------------------------------
> > Peter Elmer     E-mail: [log in to unmask]     Phone: +41 (22) 767-4644
> > Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
> > -------------------------------------------------------------------------
> >
>
-------------------------------------------------------------------------
Peter Elmer     E-mail: [log in to unmask]     Phone: +41 (22) 767-4644
Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
-------------------------------------------------------------------------
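
Editorial note, not part of the original exchange: the config-file mismatch
seen in the grep output at the top of this thread could be checked with a
few lines of Python. This is only a minimal sketch. The helper names
parse_server_args and config_path are hypothetical, the two argument strings
are copied from that grep output, and the code assumes the config path is
the separate token right after -c, as it is in both of those lines.

```python
import shlex


def parse_server_args(xinetd_text):
    """Return the value of the first 'server_args = ...' line, or None."""
    for line in xinetd_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "server_args":
            return value.strip()
    return None


def config_path(server_args):
    """Return the token following '-c' in a server_args string, or None."""
    tokens = shlex.split(server_args)
    for i, token in enumerate(tokens[:-1]):
        if token == "-c":
            return tokens[i + 1]
    return None


# server_args values copied from the grep output in this thread.
XROOTD_ARGS = "-r -l /var/log/xrootd -c /opt/xrootd/etc/redirector.cf"
OLBD_ARGS = "-m -l /var/log/olbd -c /opt/xrootd/etc/xrootd_redirector.cf"

if __name__ == "__main__":
    xrootd_cf = config_path(XROOTD_ARGS)
    olbd_cf = config_path(OLBD_ARGS)
    if xrootd_cf == olbd_cf:
        print("xrootd and olbd share", xrootd_cf)
    else:
        print("config mismatch:", xrootd_cf, "vs", olbd_cf)
```

To run it against the real files, one could read /etc/xinetd.d/xrootd and
/etc/xinetd.d/olbd, pass each file's text to parse_server_args, and compare
the two config_path results.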