Hi Gregory,

This looks like a problem in the DNS setup. It's certainly crashing in
the IP address to DNS name lookup. On the redirector machine, try running
nslookup on the IP address of the data server, and on the addresses of
the new servers you added, to see whether a reverse lookup succeeds for
each of them.
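
If checking several hosts by hand gets tedious, the same forward and reverse lookup can be scripted. The following is only a minimal sketch using Python's standard socket module; the function name check_host and its injectable forward/reverse parameters are hypothetical helpers made up for illustration, not part of xrootd or olbd.

```python
import socket


def check_host(name, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve name -> IP -> name and report where the chain breaks.

    forward and reverse default to the system resolver; they are
    parameters only so the function can be exercised without a network.
    """
    result = {"name": name, "ip": None, "reverse": None, "error": None}
    try:
        # Forward lookup: host name to IPv4 address.
        result["ip"] = forward(name)
        # Reverse lookup: gethostbyaddr returns (hostname, aliases, addresses).
        result["reverse"] = reverse(result["ip"])[0]
    except OSError as exc:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        result["error"] = str(exc)
    return result
```

Run on the redirector, something like check_host("f01-010-110.gridka.de") and check_host("f01-005-115.gridka.de") would show, for each new server, whether the address maps back to a name. A non-empty "error" with "ip" filled in means the forward lookup worked but the reverse one did not, which is the case suspected here.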

Andy

On Fri, 18 Feb 2005, Gregory Schott wrote:

>
> Hello,
>
>    Does s/o have an idea as of why xrootd would crash on the SL3 redirector
> when adding the SL3 GPFS to the already running RH72 NAS boxes pool? I
> installed on GPFS the september SL3 binaries. I have the following olb
> logfile on the dataserver (the crash is silent on the redirector).
>
> 050218 09:38:17 10389 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Server
> 050218 09:38:17 10389 setupServer Config: thread 3063385008 assigned to ping monitor
> 050218 09:38:17 10389 olb_Config: Server initialization completed.
> 050218 09:38:17 10389 main Main: Thread 3052895152 handling notification traffic.
> 050218 09:38:17 10389 olb_Start: Waiting for primary server to login.
> 050218 09:38:17 10389 main Main: Thread 3042405296 handling admin traffic.
> 050218 09:38:17 10389 Admin_Login Initial admin request: 'login p 10388 port 1094'
> 050218 09:38:17 10389 olb_Admin_Login: Primary server 10388 logged in
> 050218 09:38:17 10389 AddManager Manager: Added babar2 to config; id=0
> 050218 09:38:17 10389 FreeSpace Updated fs info; old=0K new=0K tot=0K
> 050218 09:38:17 10389 olb_Server: Logged into babar2
> 050218 09:38:17 10389 olb_GetLine: Unable to reading request ; connection reset by peer
> 050218 09:38:17 10389 Receive Null line from babar2
> 050218 09:38:17 10389 olb_Server: Unable to read response from babar2; connection reset by peer
> 050218 09:38:17 10389 Remove_Manager Removed babar2 manager 0.1 FD=10
> 050218 09:38:32 10389 olb_Connect: Unable to connect to babar2; connection refused
> 050218 09:38:42 10389 olb_Connect: Unable to connect to babar2; connection refused
> 050218 09:38:53 10389 olb_Connect: Unable to connect to babar2; connection refused
>
> And the core file says:
>
> gdb /opt/xrootd/bin/olbd core.31310
> GNU gdb Red Hat Linux (6.1post-1.20040607.17rh)
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols
> found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".
>
> Core was generated by `/opt/xrootd//bin/olbd -m -l /tmp/babar2.olblog -c config/redirector.cf'.
> Program terminated with signal 11, Segmentation fault.
> Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done.
> Loaded symbols for /lib/libnsl.so.1
> Reading symbols from /lib/tls/libpthread.so.0...(no debugging symbols found)...done.
> Loaded symbols for /lib/tls/libpthread.so.0
> Reading symbols from /lib/tls/librt.so.1...(no debugging symbols found)...done.
> Loaded symbols for /lib/tls/librt.so.1
> Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib/libdl.so.2
> Reading symbols from /usr/lib/libstdc++.so.5...(no debugging symbols found)...done.
> Loaded symbols for /usr/lib/libstdc++.so.5
> Reading symbols from /lib/tls/libm.so.6...(no debugging symbols found)...done.
> Loaded symbols for /lib/tls/libm.so.6
> Reading symbols from /lib/tls/libc.so.6...(no debugging symbols found)...done.
> Loaded symbols for /lib/tls/libc.so.6
> Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done.
> Loaded symbols for /lib/libgcc_s.so.1
> Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib/ld-linux.so.2
> Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib/libnss_files.so.2
> Reading symbols from /lib/libnss_dns.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib/libnss_dns.so.2
> Reading symbols from /lib/libresolv.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib/libresolv.so.2
> #0  0x08067adb in XrdOucSecurity::Authorize ()
> (gdb) backtrace
> #0  0x08067adb in XrdOucSecurity::Authorize ()
> #1  0x0806673b in XrdOucNetwork::do_Accept ()
> #2  0x08065dec in XrdOucNetwork::Accept ()
> #3  0x08058024 in main ()
>
> I also added two lines on the config file for the 2 new GPFS (see full
> dataserver config file below):
>
> olb.allow host f01-010-110.gridka.de
> olb.allow host f01-005-115.gridka.de
>
>
> Regards,
>    Gregory
>
>
> #
> # dataserver.cf
> #
>
> # The Open Distributed Cache Section
> #
> odc.manager babar2 3121
>
> # The Open Load Balancer Section
> #
> olb.allow host l01-001-122.gridka.de
> olb.allow host f01-001-1*.gridka.de
> olb.allow host f01-010-110.gridka.de
> olb.allow host f01-005-115.gridka.de
> olb.port 3121
> olb.path r /store
> olb.sched cpu 100
> olb.subscribe babar2 3121
> olb.wait
>
> # The Open File System Section
> #
> ofs.redirect remote if l01-001-122.gridka.de
> ofs.redirect target
> #ofs.redirect target if f01-001-121.gridka.de
> #ofs.redirect target if f01-001-1*.gridka.de
>
> # The Open Storage System Section (cache & localroot are used by olb)
> #
> oss.alloc * * 80
> oss.fdlimit * max
> oss.localroot /home/xrootd/disk/kanga/EventStore/
> #oss.path /data/read r/o
>
> # The XRD Section
> #
> xrd.protocol xrootd *
>
> # The XROOTD Section
> #
> xrootd.fslib /home/xrootd/software/20040907-0403/lib/libXrdOfs.so
> xrootd.export /store
> xrootd.export /prod
>
> # Switch on debugging output
> #
> odc.trace redirect
> xrd.trace all
> xrootd.trace all
> olb.trace all
> oss.trace all
>
>
> -------------- Dr. Gregory Schott --------------
>   Institut fuer Experimentelle Kernphysik (IEKP)
>       Universitaet Karlsruhe - Postfach 3640
>             76021 Karlsruhe  (Germany)
>              tel.: +49-(0)724782-3537
>              fax.: +49-(0)724782-3414
>             e-mail: [log in to unmask]
> -----------------------------------------------
>
>