On Fri, 18 Feb 2005, Peter Elmer wrote:

> Hi Gregory,
>
> Just to be clear, it is the olbd that seems to be crashing, not the xrootd.

Yes, sorry, I meant olbd, not xrootd.

> Note that the "olb.allow" directives adding the two new servers (if they
> can't be specified with wildcards) also need to be added to the _redirector_
> config file since that is the entity that will accept the connection from
> (and provide access to) the two new dataservers. That might be the problem,
> but it is a guess.
>
> (That said, the olbd shouldn't just crash if it doesn't succeed in
> connecting to the redirector olbd. That sounds like a bug.)

I tried adding the olb.allow lines on the redirector as well, but it still crashes.

-- gregory

>
> Pete
>
> On Fri, Feb 18, 2005 at 09:52:18AM +0100, Gregory Schott wrote:
>> Hello,
>>
>> Does someone have an idea as to why xrootd would crash on the SL3 redirector
>> when adding the SL3 GPFS to the already running RH72 NAS boxes pool? I
>> installed on GPFS the September SL3 binaries. I have the following olb
>> logfile on the dataserver (the crash is silent on the redirector).
>>
>> 050218 09:38:17 10389 olb_Config: (c) 2004 SLAC olbd version 20040907-0403
>> initializing as Server
>> 050218 09:38:17 10389 setupServer Config: thread 3063385008 assigned to
>> ping monitor
>> 050218 09:38:17 10389 olb_Config: Server initialization completed.
>> 050218 09:38:17 10389 main Main: Thread 3052895152 handling notification
>> traffic.
>> 050218 09:38:17 10389 olb_Start: Waiting for primary server to login.
>> 050218 09:38:17 10389 main Main: Thread 3042405296 handling admin traffic.
>> 050218 09:38:17 10389 Admin_Login Initial admin request: 'login p 10388
>> port 1094'
>> 050218 09:38:17 10389 olb_Admin_Login: Primary server 10388 logged in
>> 050218 09:38:17 10389 AddManager Manager: Added babar2 to config; id=0
>> 050218 09:38:17 10389 FreeSpace Updated fs info; old=0K new=0K tot=0K
>> 050218 09:38:17 10389 olb_Server: Logged into babar2
>> 050218 09:38:17 10389 olb_GetLine: Unable to reading request ; connection
>> reset by peer
>> 050218 09:38:17 10389 Receive Null line from babar2
>> 050218 09:38:17 10389 olb_Server: Unable to read response from babar2;
>> connection reset by peer
>> 050218 09:38:17 10389 Remove_Manager Removed babar2 manager 0.1 FD=10
>> 050218 09:38:32 10389 olb_Connect: Unable to connect to babar2; connection
>> refused
>> 050218 09:38:42 10389 olb_Connect: Unable to connect to babar2; connection
>> refused
>> 050218 09:38:53 10389 olb_Connect: Unable to connect to babar2; connection
>> refused
>>
>> And the core file says:
>>
>> gdb /opt/xrootd/bin/olbd core.31310
>> GNU gdb Red Hat Linux (6.1post-1.20040607.17rh)
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>> This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols
>> found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".
>>
>> Core was generated by `/opt/xrootd//bin/olbd -m -l /tmp/babar2.olblog -c
>> config/redirector.cf'.
>> Program terminated with signal 11, Segmentation fault.
>> Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done.
>> Loaded symbols for /lib/libnsl.so.1
>> Reading symbols from /lib/tls/libpthread.so.0...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/tls/libpthread.so.0
>> Reading symbols from /lib/tls/librt.so.1...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/tls/librt.so.1
>> Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
>> Loaded symbols for /lib/libdl.so.2
>> Reading symbols from /usr/lib/libstdc++.so.5...(no debugging symbols
>> found)...done.
>> Loaded symbols for /usr/lib/libstdc++.so.5
>> Reading symbols from /lib/tls/libm.so.6...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/tls/libm.so.6
>> Reading symbols from /lib/tls/libc.so.6...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/tls/libc.so.6
>> Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libgcc_s.so.1
>> Reading symbols from /lib/ld-linux.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/ld-linux.so.2
>> Reading symbols from /lib/libnss_files.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libnss_files.so.2
>> Reading symbols from /lib/libnss_dns.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libnss_dns.so.2
>> Reading symbols from /lib/libresolv.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libresolv.so.2
>> #0 0x08067adb in XrdOucSecurity::Authorize ()
>> (gdb) backtrace
>> #0 0x08067adb in XrdOucSecurity::Authorize ()
>> #1 0x0806673b in XrdOucNetwork::do_Accept ()
>> #2 0x08065dec in XrdOucNetwork::Accept ()
>> #3 0x08058024 in main ()
>>
>> I also added two lines on the config file for the 2 new GPFS (see full
>> dataserver config file below):
>>
>> olb.allow host f01-010-110.gridka.de
>> olb.allow host f01-005-115.gridka.de
>>
>>
>> Regards,
>> Gregory
>>
>>
>> #
>> # dataserver.cf
>> #
>>
>> # The Open Distributed Cache Section
>> #
>> odc.manager babar2 3121
>>
>> # The Open Load Balancer Section
>> #
>> olb.allow host l01-001-122.gridka.de
>> olb.allow host f01-001-1*.gridka.de
>> olb.allow host f01-010-110.gridka.de
>> olb.allow host f01-005-115.gridka.de
>> olb.port 3121
>> olb.path r /store
>> olb.sched cpu 100
>> olb.subscribe babar2 3121
>> olb.wait
>>
>> # The Open File System Section
>> #
>> ofs.redirect remote if l01-001-122.gridka.de
>> ofs.redirect target
>> #ofs.redirect target if f01-001-121.gridka.de
>> #ofs.redirect target if f01-001-1*.gridka.de
>>
>> # The Open Storage System Section (cache & localroot are used by olb)
>> #
>> oss.alloc * * 80
>> oss.fdlimit * max
>> oss.localroot /home/xrootd/disk/kanga/EventStore/
>> #oss.path /data/read r/o
>>
>> # The XRD Section
>> #
>> xrd.protocol xrootd *
>>
>> # The XROOTD Section
>> #
>> xrootd.fslib /home/xrootd/software/20040907-0403/lib/libXrdOfs.so
>> xrootd.export /store
>> xrootd.export /prod
>>
>> # Switch on debugging output
>> #
>> odc.trace redirect
>> xrd.trace all
>> xrootd.trace all
>> olb.trace all
>> oss.trace all
>>
>>
>> -------------- Dr. Gregory Schott --------------
>> Institut fuer Experimentelle Kernphysik (IEKP)
>> Universitaet Karlsruhe - Postfach 3640
>> 76021 Karlsruhe (Germany)
>> tel.: +49-(0)724782-3537
>> fax.: +49-(0)724782-3414
>> -----------------------------------------------
>>
>
>
>
> -------------------------------------------------------------------------
> Peter Elmer     Phone: +41 (22) 767-4644
> Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
> -------------------------------------------------------------------------
>

-------------- Dr. Gregory Schott --------------
Institut fuer Experimentelle Kernphysik (IEKP)
Universitaet Karlsruhe - Postfach 3640
76021 Karlsruhe (Germany)
tel.: +49-(0)724782-3537
fax.: +49-(0)724782-3414
-----------------------------------------------