Hi Pete,

I have enabled the old xinetd.d files. You may now be able to start
olbd and xrootd as babaradm. The xinetd.d daemons use the old
configuration file /opt/xrootd/etc/xrootd_redirector.cf.

I have killed the daemons running as root.

Regards
Manfred

Peter Elmer wrote:
> Hi Manfred and Rolf,
>
> Two other small things, I see:
>
> fzk-babar2> ps -ef | grep xrootd
> root 3349 3082 0 Nov10 pts/0 00:00:11 /opt/xrootd/bin/xrootd -r -l /var/log/babar2.xrdlog -c /opt/xrootd/etc/redirector.cf
> root 3361 3082 0 Nov10 pts/0 00:00:12 /opt/xrootd/bin/olbd -m -l /var/log/babar2.olblog -c /opt/xrootd/etc/redirector.cf
> elmer 8970 8225 0 10:02 pts/3 00:00:00 grep xrootd
> fzk-babar2> less /opt/xrootd/etc/redirector.cf
>
> For security reasons, the daemon should not run as root, but as some
> generic babar account which would normally own the data in the unix file
> owner sense (e.g. whatever account you use for importing babar data). I
> believe Andy has disallowed running as root in later versions of xrootd/olbd,
> too.
>
> Also the xrootd protocol allows clients to survive server crashes (or
> restarts), but you need to set something up to automatically restart the
> xrootd should it crash for some reason. Checking and restarting the
> server every 10 minutes should be within the period in which the client will
> keep retrying to connect (eventually it times out and just gives up).
>
> Pete
>
>
>
> On Thu, Nov 11, 2004 at 10:06:24AM +0100, Peter Elmer wrote:
>
>> Hi Manfred,
>>
>>On Thu, Nov 11, 2004 at 09:52:28AM +0100, Manfred Alef wrote:
>>
>>>the redirector server was upgraded to SL 3.03. Now we could
>>>start olbd from xrootd's RHEL RPM without any problem.
>>
>> Ok, that is strange. The only guess I have is that it could have been the
>>xinetd stuff interfering with starting by hand (if that is what you did).
>>
>> In any case, I looked at the logs for the xrootd/olbd on babar2 and there
>>is still a problem.
>>Normally the xrootd should connect to the olbd on
>>the same machine, but there are errors in the xrootd log file:
>>
>>041110 09:43:54 3349 odc_Manager: Connected to l01-001-122
>>041110 09:43:54 3349 odc_GetLine: Unable to reading request ; connection reset by peer
>>041110 09:43:54 3349 odc_Manager: Unable to receive msg from l01-001-122; connection reset by peer
>>
>>and in the olbd log file:
>>
>>041110 09:43:54 3361 olb_Accept: Unable to accept connection from l01-001-122.gridka.de; permission denied
>>
>> Looking at the config file, I see:
>>
>>olb.allow host l01-001-122 # babar2.fzk.de
>>olb.allow host f01-001-121
>>olb.allow host f01-001-122
>>
>> I think you may need to specify the full hostname, including domain, i.e.
>>
>>olb.allow host l01-001-122.gridka.de # babar2.fzk.de
>>olb.allow host f01-001-121.gridka.de
>>olb.allow host f01-001-122.gridka.de
>>
>> Does that work?
>>
>> Pete
>>
>>
>>
>>>Peter Elmer wrote:
>>>
>>>> Hi Manfred and Rolf,
>>>>
>>>> Sorry for the late reply. (You picked a somewhat awkward time to try this
>>>>since Andy is away and I'm just back from vacation in a series of
>>>>meetings/transits this past week!)
>>>>
>>>> I'll give this a try to see if I can reproduce it. I see, however, that
>>>>you restarted things yesterday:
>>>>
>>>>root 3349 3082 0 Nov10 pts/0 00:00:11 /opt/xrootd/bin/xrootd -r -l /var/log/babar2.xrdlog -c /opt/xrootd/etc/redirector.cf
>>>>root 3361 3082 0 Nov10 pts/0 00:00:11 /opt/xrootd/bin/olbd -m -l /var/log/babar2.olblog -c /opt/xrootd/etc/redirector.cf
>>>>
>>>>and I don't see anything in the log files about "Unable to bind socket;
>>>>address already in use". There are other problems related to the dataservers
>>>>connecting to the redirector, I think, but I'll look at those now.
>>>>
>>>> One thing that I recall is that Jos and I looked at setting up xinetd
>>>>style restarts of the server. That wasn't still there, was it?
>>>>(I don't see it now, but presumably it would have interfered with separate
>>>>attempts to start the daemons by hand.)
>>>>
>>>> Pete
>>>>
>>>>On Fri, Nov 05, 2004 at 01:00:36PM +0100, Manfred Alef wrote:
>>>>
>>>>
>>>>>Hi Pete,
>>>>>
>>>>>the config files are from
>>>>>http://xrootd.slac.stanford.edu/examples/multserver/index.html.
>>>>>
>>>>>Best regards
>>>>>Manfred
>>>>>
>>>>>babar2 # cat redirector.cf
>>>>>#
>>>>># redirector.cf
>>>>>#
>>>>># xrootd
>>>>>#+xrootd.fslib /opt/xrootd/lib/libXrdOfs.so
>>>>>xrootd.fslib /usr/local/xrootd/lib/i386_linux24/libXrdOfs.so
>>>>>xrootd.export /data
>>>>>odc.manager l01-001-122 3121
>>>>>odc.trace redirect
>>>>># olbd
>>>>>olb.port 3121
>>>>>#+olb.allow host kanrdr.slac.stanford.edu
>>>>>#+olb.allow host kan001.slac.stanford.edu
>>>>>#+olb.allow host kan002.slac.stanford.edu
>>>>>olb.allow host l01-001-122 # babar2.fzk.de
>>>>>olb.allow host f01-001-121
>>>>>olb.allow host f01-001-122
>>>>>babar2 #
>>>>>
>>>>>[root@f01-001-122 etc]# cat dataserver.cf
>>>>>#
>>>>># dataserver.cf
>>>>>#
>>>>># xrootd
>>>>>#+xrootd.fslib /opt/xrootd/lib/libXrdOfs.so
>>>>>xrootd.fslib /usr/local/xrootd/lib/i386_linux24/libXrdOfs.so
>>>>>xrootd.export /data
>>>>>oss.readonly
>>>>>odc.manager l01-001-122 3121
>>>>># olbd
>>>>>olb.port 3121
>>>>>olb.subscribe l01-001-122 3121
>>>>>[root@f01-001-122 etc]#
>>>>>
>>>>>
>>>>>
>>>>>Peter Elmer wrote:
>>>>>
>>>>>
>>>>>> [CC the xrootd mailing list]
>>>>>>
>>>>>> Hi Rolf,
>>>>>>
>>>>>> Do you have the config files you are using to try to start xrootd and
>>>>>>the olbd (on the redirector and the file servers)?
>>>>>>
>>>>>> Pete
>>>>>>
>>>>>>On Fri, Nov 05, 2004 at 11:33:47AM +0100, Manfred Alef wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>>Hi Pete,
>>>>>>>
>>>>>>>I am sitting here at GridKa together with Manfred Alef and we are trying
>>>>>>>to install xrootd on two of the fileservers and on babar2, a login
>>>>>>>machine which will also be the redirector.
>>>>>>>We use the current production version and had no problems starting xrootd
>>>>>>>and olbd on one of the fileservers. However, when we try to start the
>>>>>>>olbd on the redirector, it exits with exit code 1. The logfile is
>>>>>>>attached. We made sure nothing else is going on on the machine (reboot)
>>>>>>>and also removed any old socket we could find in /tmp/.olb/
>>>>>>>Do you have an idea what could go wrong or what else we could try?
>>>>>>>
>>>>>>>Cheers, Rolf
>>>>>>>
>>>>>>>---------------------------------------------------------------
>>>>>>>41105 10:44:31 32156 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
>>>>>>>041105 10:44:31 32156 olb_Bind: Unable to bind socket; address already in use
>>>>>>>041105 10:44:31 32156 olb_Config: Manager initialization failed.
>>>>>>>041105 10:46:15 32191 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
>>>>>>>041105 10:46:15 32191 olb_Bind: Unable to bind socket; address already in use
>>>>>>>041105 10:46:15 32191 olb_Config: Manager initialization failed.
>>>>>>>041105 10:48:49 32248 Schedule scheduling midnight runner in 47471 seconds
>>>>>>>041105 10:48:49 32248 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
>>>>>>>041105 10:48:49 32248 olb_Bind: Unable to bind socket; address already in use
>>>>>>>041105 10:48:49 32248 olb_Config: Manager initialization failed.
>>>>>>>041105 11:10:37 3175 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
>>>>>>>041105 11:10:37 3175 olb_Bind: Unable to bind socket; address already in use
>>>>>>>041105 11:10:37 3175 olb_Config: Manager initialization failed.
>>>>>>>041105 11:18:16 3332 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
>>>>>>>041105 11:18:16 3332 olb_Bind: Unable to bind socket; address already in use
>>>>>>>041105 11:18:16 3332 olb_Config: Manager initialization failed.
>>>>>>>----------------------------------------------------------------
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>
>
>
> -------------------------------------------------------------------------
> Peter Elmer    Phone: +41 (22) 767-4644
> Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
> -------------------------------------------------------------------------
>
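
The hostname question raised in the thread (short names in the olb.allow
lines versus the fully qualified name in the olbd message "Unable to accept
connection from l01-001-122.gridka.de; permission denied") can be checked
mechanically. What follows is only a rough Python sketch, not part of
xrootd: the function name unqualified_allow_hosts and the rule "a host
without a dot counts as unqualified" are assumptions made for illustration,
and whether olbd actually requires the full name is the open question Pete
asks above.

```python
def unqualified_allow_hosts(config_text):
    """Return hosts from 'olb.allow host' lines that contain no domain part.

    Hypothetical helper for illustration; it only inspects text and knows
    nothing about how olbd itself resolves or matches host names.
    """
    short_names = []
    for raw_line in config_text.splitlines():
        # Drop trailing comments such as '# babar2.fzk.de'; lines that are
        # commented out entirely (e.g. '#+olb.allow ...') become empty.
        line = raw_line.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "olb.allow" and fields[1] == "host":
            host = fields[2]
            if "." not in host:
                short_names.append(host)
    return short_names


if __name__ == "__main__":
    sample = """\
olb.port 3121
#+olb.allow host kanrdr.slac.stanford.edu
olb.allow host l01-001-122 # babar2.fzk.de
olb.allow host f01-001-121.gridka.de
"""
    for host in unqualified_allow_hosts(sample):
        print("no domain in olb.allow entry:", host)
```

Run against the redirector.cf quoted above, a check like this would list all
three olb.allow entries, which are the lines Pete suggests rewriting with
the .gridka.de domain.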