  [Add back Gregory and the mailing list, since Andy accidentally 
   dropped them.]

  Gregory, please see Andy's request below...


On Fri, Feb 18, 2005 at 02:43:44AM -0800, Andrew Hanushevsky wrote:
> Hi Pete,
> 
> 
> On Fri, 18 Feb 2005, Peter Elmer wrote:
> >   (That said, the olbd shouldn't just crash if it doesn't succeed in
> > connecting to the redirector olbd. That sounds like a bug.)
> Agreed. Gregory, could you please compile with "--build=debug" so that we
> can at least see where it's crashing?
> 
> Andy
> 
> >                                    Pete
> >
> > On Fri, Feb 18, 2005 at 09:52:18AM +0100, Gregory Schott wrote:
> > > Hello,
> > >
> > >   Does anyone have an idea as to why xrootd would crash on the SL3 redirector
> > > when adding the SL3 GPFS to the already running RH72 NAS boxes pool? I
> > > installed the September SL3 binaries on GPFS. I have the following olb
> > > logfile on the dataserver (the crash is silent on the redirector).
> > >
> > > 050218 09:38:17 10389 olb_Config: (c) 2004 SLAC olbd version 20040907-0403
> > > initializing as Server
> > > 050218 09:38:17 10389 setupServer Config: thread 3063385008 assigned to
> > > ping monitor
> > > 050218 09:38:17 10389 olb_Config: Server initialization completed.
> > > 050218 09:38:17 10389 main Main: Thread 3052895152 handling notification
> > > traffic.
> > > 050218 09:38:17 10389 olb_Start: Waiting for primary server to login.
> > > 050218 09:38:17 10389 main Main: Thread 3042405296 handling admin traffic.
> > > 050218 09:38:17 10389 Admin_Login Initial admin request: 'login p 10388
> > > port 1094'
> > > 050218 09:38:17 10389 olb_Admin_Login: Primary server 10388 logged in
> > > 050218 09:38:17 10389 AddManager Manager: Added babar2 to config; id=0
> > > 050218 09:38:17 10389 FreeSpace Updated fs info; old=0K new=0K tot=0K
> > > 050218 09:38:17 10389 olb_Server: Logged into babar2
> > > 050218 09:38:17 10389 olb_GetLine: Unable to reading request ; connection
> > > reset by peer
> > > 050218 09:38:17 10389 Receive Null line from babar2
> > > 050218 09:38:17 10389 olb_Server: Unable to read response from babar2;
> > > connection reset by peer
> > > 050218 09:38:17 10389 Remove_Manager Removed babar2 manager 0.1 FD=10
> > > 050218 09:38:32 10389 olb_Connect: Unable to connect to babar2; connection
> > > refused
> > > 050218 09:38:42 10389 olb_Connect: Unable to connect to babar2; connection
> > > refused
> > > 050218 09:38:53 10389 olb_Connect: Unable to connect to babar2; connection
> > > refused
> > >
> > > And the core file says:
> > >
> > > gdb /opt/xrootd/bin/olbd core.31310
> > > GNU gdb Red Hat Linux (6.1post-1.20040607.17rh)
> > > Copyright 2004 Free Software Foundation, Inc.
> > > GDB is free software, covered by the GNU General Public License, and you are
> > > welcome to change it and/or distribute copies of it under certain
> > > conditions.
> > > Type "show copying" to see the conditions.
> > > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > > This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols
> > > found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".
> > >
> > > Core was generated by `/opt/xrootd//bin/olbd -m -l /tmp/babar2.olblog -c
> > > config/redirector.cf'.
> > > Program terminated with signal 11, Segmentation fault.
> > > Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done.
> > > Loaded symbols for /lib/libnsl.so.1
> > > Reading symbols from /lib/tls/libpthread.so.0...(no debugging symbols
> > > found)...done.
> > > Loaded symbols for /lib/tls/libpthread.so.0
> > > Reading symbols from /lib/tls/librt.so.1...(no debugging symbols
> > > found)...done.
> > > Loaded symbols for /lib/tls/librt.so.1
> > > Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
> > > Loaded symbols for /lib/libdl.so.2
> > > Reading symbols from /usr/lib/libstdc++.so.5...(no debugging symbols
> > > found)...done.
> > > Loaded symbols for /usr/lib/libstdc++.so.5
> > > Reading symbols from /lib/tls/libm.so.6...(no debugging symbols
> > > found)...done.
> > > Loaded symbols for /lib/tls/libm.so.6
> > > Reading symbols from /lib/tls/libc.so.6...(no debugging symbols
> > > found)...done.
> > > Loaded symbols for /lib/tls/libc.so.6
> > > Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols
> > > found)...done.
> > > Loaded symbols for /lib/libgcc_s.so.1
> > > Reading symbols from /lib/ld-linux.so.2...(no debugging symbols
> > > found)...done.
> > > Loaded symbols for /lib/ld-linux.so.2
> > > Reading symbols from /lib/libnss_files.so.2...(no debugging symbols
> > > found)...done.
> > > Loaded symbols for /lib/libnss_files.so.2
> > > Reading symbols from /lib/libnss_dns.so.2...(no debugging symbols
> > > found)...done.
> > > Loaded symbols for /lib/libnss_dns.so.2
> > > Reading symbols from /lib/libresolv.so.2...(no debugging symbols
> > > found)...done.
> > > Loaded symbols for /lib/libresolv.so.2
> > > #0  0x08067adb in XrdOucSecurity::Authorize ()
> > > (gdb) backtrace
> > > #0  0x08067adb in XrdOucSecurity::Authorize ()
> > > #1  0x0806673b in XrdOucNetwork::do_Accept ()
> > > #2  0x08065dec in XrdOucNetwork::Accept ()
> > > #3  0x08058024 in main ()
> > >
> > > I also added two lines to the config file for the 2 new GPFS (see full
> > > dataserver config file below):
> > >
> > > olb.allow host f01-010-110.gridka.de
> > > olb.allow host f01-005-115.gridka.de
> > >
> > >
> > > Regards,
> > >   Gregory
> > >
> > >
> > > #
> > > # dataserver.cf
> > > #
> > >
> > > # The Open Distributed Cache Section
> > > #
> > > odc.manager babar2 3121
> > >
> > > # The Open Load Balancer Section
> > > #
> > > olb.allow host l01-001-122.gridka.de
> > > olb.allow host f01-001-1*.gridka.de
> > > olb.allow host f01-010-110.gridka.de
> > > olb.allow host f01-005-115.gridka.de
> > > olb.port 3121
> > > olb.path r /store
> > > olb.sched cpu 100
> > > olb.subscribe babar2 3121
> > > olb.wait
> > >
> > > # The Open File System Section
> > > #
> > > ofs.redirect remote if l01-001-122.gridka.de
> > > ofs.redirect target
> > > #ofs.redirect target if f01-001-121.gridka.de
> > > #ofs.redirect target if f01-001-1*.gridka.de
> > >
> > > # The Open Storage System Section (cache & localroot are used by olb)
> > > #
> > > oss.alloc * * 80
> > > oss.fdlimit * max
> > > oss.localroot /home/xrootd/disk/kanga/EventStore/
> > > #oss.path /data/read r/o
> > >
> > > # The XRD Section
> > > #
> > > xrd.protocol xrootd *
> > >
> > > # The XROOTD Section
> > > #
> > > xrootd.fslib /home/xrootd/software/20040907-0403/lib/libXrdOfs.so
> > > xrootd.export /store
> > > xrootd.export /prod
> > >
> > > # Switch on debugging output
> > > #
> > > odc.trace redirect
> > > xrd.trace all
> > > xrootd.trace all
> > > olb.trace all
> > > oss.trace all
> > >
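
Since the backtrace ends in XrdOucSecurity::Authorize, one thing that can be checked independently of the crash is which hostnames the "olb.allow host" lines in the quoted config would admit. The following is only a minimal sketch, assuming the "*" in those lines behaves like a shell-style wildcard; the real matching rules inside olbd may differ, and the helper names (parse_allow_hosts, is_allowed) and the unlisted test hostname are hypothetical, not part of xrootd.

```python
from fnmatch import fnmatchcase


def parse_allow_hosts(config_text):
    """Collect patterns from 'olb.allow host <pattern>' lines.

    Commented-out lines (e.g. '#olb.allow ...') are skipped because their
    first field is not exactly 'olb.allow'.
    """
    patterns = []
    for line in config_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "olb.allow" and fields[1] == "host":
            patterns.append(fields[2])
    return patterns


def is_allowed(hostname, patterns):
    """Return True if hostname matches any pattern.

    ASSUMPTION: shell-style, case-insensitive wildcard matching; this is
    an illustration, not the olbd implementation.
    """
    name = hostname.lower()
    return any(fnmatchcase(name, pattern.lower()) for pattern in patterns)


# The olb.allow lines as they appear in the quoted dataserver.cf.
CONFIG = """\
olb.allow host l01-001-122.gridka.de
olb.allow host f01-001-1*.gridka.de
olb.allow host f01-010-110.gridka.de
olb.allow host f01-005-115.gridka.de
"""

patterns = parse_allow_hosts(CONFIG)

# The last hostname is a made-up example of a host that is not listed.
for host in ("f01-001-121.gridka.de",
             "f01-010-110.gridka.de",
             "unlisted.example.invalid"):
    print(host, is_allowed(host, patterns))
```

Running the same check against the redirector's config (not shown in this thread) would indicate whether the two new GPFS hosts are covered there as well.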



-------------------------------------------------------------------------
Peter Elmer     E-mail: [log in to unmask]      Phone: +41 (22) 767-4644
Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
-------------------------------------------------------------------------