On Thursday 11 November 2004 13:14, Peter Elmer wrote:
> Hi Manfred and Rolf,
>
> It doesn't look like xrootd/olbd are running on babar2 right now. Rolf,
> will you be starting it?

I am teaching today. I won't have time until 1830.

Sorry,
Rolf

> I would have thought that xinetd would have
> started it, though. Ah, does xinetd itself need to be restarted in order
> to see the changes in /etc/xinetd/?
>
> One other small thing: it looks like different config files are being
> used for the xrootd and the olbd:
>
> fzk-babar2> grep server_args /etc/xinetd.d/xrootd /etc/xinetd.d/olbd
> /etc/xinetd.d/xrootd: server_args = -r -l /var/log/xrootd -c /opt/xrootd/etc/redirector.cf
> /etc/xinetd.d/olbd: server_args = -m -l /var/log/olbd -c /opt/xrootd/etc/xrootd_redirector.cf
>
> Was that intentional? In general you should be able to use a single config
> file for both the xrootd and olbd on any given single machine.
>
> Pete
>
> On Thu, Nov 11, 2004 at 10:46:38AM +0100, Manfred Alef wrote:
> > Hi Pete,
> >
> > I have enabled the old xinetd.d files. You may now be able
> > to start olbd and xrootd as babaradm.
> >
> > The xinetd.d daemons use the old configuration file
> > /opt/xrootd/etc/xrootd_redirector.cf.
> >
> > I have killed the daemons running as root.
> >
> > Regards
> > Manfred
> >
> > Peter Elmer wrote:
> > > Hi Manfred and Rolf,
> > >
> > > Two other small things, I see:
> > >
> > > fzk-babar2> ps -ef | grep xrootd
> > > root 3349 3082 0 Nov10 pts/0 00:00:11 /opt/xrootd/bin/xrootd -r -l /var/log/babar2.xrdlog -c /opt/xrootd/etc/redirector.cf
> > > root 3361 3082 0 Nov10 pts/0 00:00:12 /opt/xrootd/bin/olbd -m -l /var/log/babar2.olblog -c /opt/xrootd/etc/redirector.cf
> > > elmer 8970 8225 0 10:02 pts/3 00:00:00 grep xrootd
> > > fzk-babar2> less /opt/xrootd/etc/redirector.cf
> > >
> > > For security reasons, the daemon should not run as root, but as some
> > > generic babar account which would normally own the data in the unix
> > > file owner sense (e.g. whatever account you use for importing babar
> > > data). I believe Andy has disallowed running as root in later versions
> > > of xrootd/olbd, too.
> > >
> > > Also the xrootd protocol allows clients to survive server crashes (or
> > > restarts), but you need to set something up to automatically restart
> > > the xrootd should it crash for some reason. Checking and restarting the
> > > server every 10 minutes should be within the period in which the client
> > > will keep retrying to connect (eventually it times out and just gives
> > > up).
> > >
> > > Pete
> > >
> > > On Thu, Nov 11, 2004 at 10:06:24AM +0100, Peter Elmer wrote:
> > >> Hi Manfred,
> > >>
> > >>On Thu, Nov 11, 2004 at 09:52:28AM +0100, Manfred Alef wrote:
> > >>>the redirector server was upgraded to SL 3.03. Now we could
> > >>>start olbd from xrootd's RHEL RPM without any problem.
> > >>
> > >> Ok, that is strange. The only guess I have is that it could have been
> > >> the xinetd stuff interfering with starting by hand (if that is what
> > >> you did).
> > >>
> > >> In any case, I looked at the logs for the xrootd/olbd on babar2 and
> > >> there is still a problem.
> > >> Normally the xrootd should connect to the
> > >> olbd on the same machine, but there are errors in the xrootd log file:
> > >>
> > >>041110 09:43:54 3349 odc_Manager: Connected to l01-001-122
> > >>041110 09:43:54 3349 odc_GetLine: Unable to reading request ; connection reset by peer
> > >>041110 09:43:54 3349 odc_Manager: Unable to receive msg from l01-001-122; connection reset by peer
> > >>
> > >>and in the olbd log file:
> > >>
> > >>041110 09:43:54 3361 olb_Accept: Unable to accept connection from l01-001-122.gridka.de; permission denied
> > >>
> > >> Looking at the config file, I see:
> > >>
> > >>olb.allow host l01-001-122 # babar2.fzk.de
> > >>olb.allow host f01-001-121
> > >>olb.allow host f01-001-122
> > >>
> > >> I think you may need to specify the full hostname, including domain,
> > >> i.e.
> > >>
> > >>olb.allow host l01-001-122.gridka.de # babar2.fzk.de
> > >>olb.allow host f01-001-121.gridka.de
> > >>olb.allow host f01-001-122.gridka.de
> > >>
> > >> Does that work?
> > >>
> > >> Pete
> > >>
> > >>>Peter Elmer wrote:
> > >>>> Hi Manfred and Rolf,
> > >>>>
> > >>>> Sorry for the late reply. (You picked a somewhat awkward time to
> > >>>> try this since Andy is away and I'm just back from vacation in a
> > >>>> series of meetings/transits this past week!)
> > >>>>
> > >>>> I'll give this a try to see if I can reproduce it. I see, however,
> > >>>> that you restarted things yesterday:
> > >>>>
> > >>>>root 3349 3082 0 Nov10 pts/0 00:00:11 /opt/xrootd/bin/xrootd -r -l /var/log/babar2.xrdlog -c /opt/xrootd/etc/redirector.cf
> > >>>>root 3361 3082 0 Nov10 pts/0 00:00:11 /opt/xrootd/bin/olbd -m -l /var/log/babar2.olblog -c /opt/xrootd/etc/redirector.cf
> > >>>>
> > >>>>and I don't see anything in the log files about "Unable to bind
> > >>>> socket; address already in use".
> > >>>> There are other problems related to
> > >>>> the dataservers connecting to the redirector, I think, but I'll look
> > >>>> at those now.
> > >>>>
> > >>>> One thing that I recall is that Jos and I looked at setting up
> > >>>> xinetd style restarts of the server. That wasn't still there, was
> > >>>> it? (I don't see it now, but presumably it would have interfered
> > >>>> with separate attempts to start the daemons by hand.)
> > >>>>
> > >>>> Pete
> > >>>>
> > >>>>On Fri, Nov 05, 2004 at 01:00:36PM +0100, Manfred Alef wrote:
> > >>>>>Hi Pete,
> > >>>>>
> > >>>>>the config files are from
> > >>>>>http://xrootd.slac.stanford.edu/examples/multserver/index.html.
> > >>>>>
> > >>>>>Best regards
> > >>>>>Manfred
> > >>>>>
> > >>>>>babar2 # cat redirector.cf
> > >>>>>#
> > >>>>># redirector.cf
> > >>>>>#
> > >>>>># xrootd
> > >>>>>#+xrootd.fslib /opt/xrootd/lib/libXrdOfs.so
> > >>>>>xrootd.fslib /usr/local/xrootd/lib/i386_linux24/libXrdOfs.so
> > >>>>>xrootd.export /data
> > >>>>>odc.manager l01-001-122 3121
> > >>>>>odc.trace redirect
> > >>>>># olbd
> > >>>>>olb.port 3121
> > >>>>>#+olb.allow host kanrdr.slac.stanford.edu
> > >>>>>#+olb.allow host kan001.slac.stanford.edu
> > >>>>>#+olb.allow host kan002.slac.stanford.edu
> > >>>>>olb.allow host l01-001-122 # babar2.fzk.de
> > >>>>>olb.allow host f01-001-121
> > >>>>>olb.allow host f01-001-122
> > >>>>>babar2 #
> > >>>>>
> > >>>>>[root@f01-001-122 etc]# cat dataserver.cf
> > >>>>>#
> > >>>>># dataserver.cf
> > >>>>>#
> > >>>>># xrootd
> > >>>>>#+xrootd.fslib /opt/xrootd/lib/libXrdOfs.so
> > >>>>>xrootd.fslib /usr/local/xrootd/lib/i386_linux24/libXrdOfs.so
> > >>>>>xrootd.export /data
> > >>>>>oss.readonly
> > >>>>>odc.manager l01-001-122 3121
> > >>>>># olbd
> > >>>>>olb.port 3121
> > >>>>>olb.subscribe l01-001-122 3121
> > >>>>>[root@f01-001-122 etc]#
> > >>>>>
> > >>>>>Peter Elmer wrote:
> > >>>>>> [CC the xrootd mailing list]
> > >>>>>>
> > >>>>>> Hi Rolf,
> > >>>>>>
> > >>>>>> Do you have the config files you are
> > >>>>>> using to try to start xrootd
> > >>>>>> and the olbd (on the redirector and the file servers)?
> > >>>>>>
> > >>>>>> Pete
> > >>>>>>
> > >>>>>>On Fri, Nov 05, 2004 at 11:33:47AM +0100, Manfred Alef wrote:
> > >>>>>>>Hi Pete,
> > >>>>>>>
> > >>>>>>>I am sitting here at GridKa together with Manfred Alef and we are
> > >>>>>>> trying to install xrootd on two of the fileservers and on babar2,
> > >>>>>>> a login machine which will also be the redirector.
> > >>>>>>>We use the current production version and had no problems starting
> > >>>>>>> xrootd and olbd on one of the fileservers. However, when we try
> > >>>>>>> to start the olbd on the redirector, it exits with exit code 1.
> > >>>>>>> The logfile is attached. We made sure nothing else is going on
> > >>>>>>> on the machine (reboot) and also removed any old socket we could
> > >>>>>>> find in /tmp/.olb/. Do you have an idea what could go wrong or
> > >>>>>>> what else we could try?
> > >>>>>>>
> > >>>>>>>Cheers, Rolf
> > >>>>>>>
> > >>>>>>>---------------------------------------------------------------
> > >>>>>>>041105 10:44:31 32156 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> > >>>>>>>041105 10:44:31 32156 olb_Bind: Unable to bind socket; address already in use
> > >>>>>>>041105 10:44:31 32156 olb_Config: Manager initialization failed.
> > >>>>>>>041105 10:46:15 32191 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> > >>>>>>>041105 10:46:15 32191 olb_Bind: Unable to bind socket; address already in use
> > >>>>>>>041105 10:46:15 32191 olb_Config: Manager initialization failed.
> > >>>>>>>041105 10:48:49 32248 Schedule scheduling midnight runner in 47471 seconds
> > >>>>>>>041105 10:48:49 32248 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> > >>>>>>>041105 10:48:49 32248 olb_Bind: Unable to bind socket; address already in use
> > >>>>>>>041105 10:48:49 32248 olb_Config: Manager initialization failed.
> > >>>>>>>041105 11:10:37 3175 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> > >>>>>>>041105 11:10:37 3175 olb_Bind: Unable to bind socket; address already in use
> > >>>>>>>041105 11:10:37 3175 olb_Config: Manager initialization failed.
> > >>>>>>>041105 11:18:16 3332 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Manager
> > >>>>>>>041105 11:18:16 3332 olb_Bind: Unable to bind socket; address already in use
> > >>>>>>>041105 11:18:16 3332 olb_Config: Manager initialization failed.
> > >>>>>>>----------------------------------------------------------------
> > >
> > > -------------------------------------------------------------------------
> > > Peter Elmer     Phone: +41 (22) 767-4644
> > > Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
> > > -------------------------------------------------------------------------

--
contacts: http://www.physi.uni-heidelberg.de/~dubitzky
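
Pete's suggestion in the thread, checking every 10 minutes and restarting xrootd/olbd if they have died, could be sketched as a small shell function run from cron under the non-root account. This is only a rough sketch under assumptions: the function name check_and_restart, the use of pgrep, and the script path in the comments are illustrative and not from the thread; the daemon paths and options are copied from the xinetd server_args quoted above.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: start a daemon only if no process with
# exactly that name is currently running.
# Usage: check_and_restart NAME START_COMMAND [ARGS...]
check_and_restart() {
    name=$1
    shift
    # pgrep -x matches the exact process name; exit status 0 means a match.
    if pgrep -x "$name" >/dev/null 2>&1; then
        echo "$name is running"
        return 0
    fi
    echo "$name is not running, starting it"
    # Run the start command; its exit status becomes the function's status.
    "$@"
}

# Example calls (assumed, commented out so the sketch has no side effects).
# If the daemons stay in the foreground, the commands would need a trailing "&".
# check_and_restart olbd   /opt/xrootd/bin/olbd   -m -l /var/log/olbd   -c /opt/xrootd/etc/redirector.cf
# check_and_restart xrootd /opt/xrootd/bin/xrootd -r -l /var/log/xrootd -c /opt/xrootd/etc/redirector.cf
#
# Example crontab entry for the babaradm account (script path is hypothetical):
# */10 * * * * /path/to/xrootd_watchdog.sh
```

If xinetd is already responsible for starting the daemons, a cron job like this would presumably be redundant, and running both could interfere in the way discussed earlier in the thread.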