Hi Andy and Gregory,

  Is it possible that we are running into something specific with GPFS
here? Recall that GPFS itself attempts to provide a global filesystem-like
setup via these multiple servers (and may do this via DNS load balancing
or some other similar scheme).

                                   Pete

On Fri, Feb 18, 2005 at 02:41:11AM -0800, Andrew Hanushevsky wrote:
> Hi Gregory,
>
> This looks like some problem in the DNS setup. It's certainly crashing in
> the IP address to DNS name lookup. Try using nslookup on the IP address
> associated with the data server as well as the new ones you added to see
> if you can do a reverse lookup. You should do that on the redirector
> machine.
>
> Andy
>
> On Fri, 18 Feb 2005, Gregory Schott wrote:
>
> >
> > Hello,
> >
> > Does someone have an idea as to why xrootd would crash on the SL3 redirector
> > when adding the SL3 GPFS to the already running RH72 NAS boxes pool? I
> > installed on GPFS the September SL3 binaries. I have the following olb
> > logfile on the dataserver (the crash is silent on the redirector).
> >
> > 050218 09:38:17 10389 olb_Config: (c) 2004 SLAC olbd version 20040907-0403 initializing as Server
> > 050218 09:38:17 10389 setupServer Config: thread 3063385008 assigned to ping monitor
> > 050218 09:38:17 10389 olb_Config: Server initialization completed.
> > 050218 09:38:17 10389 main Main: Thread 3052895152 handling notification traffic.
> > 050218 09:38:17 10389 olb_Start: Waiting for primary server to login.
> > 050218 09:38:17 10389 main Main: Thread 3042405296 handling admin traffic.
> > 050218 09:38:17 10389 Admin_Login Initial admin request: 'login p 10388 port 1094'
> > 050218 09:38:17 10389 olb_Admin_Login: Primary server 10388 logged in
> > 050218 09:38:17 10389 AddManager Manager: Added babar2 to config; id=0
> > 050218 09:38:17 10389 FreeSpace Updated fs info; old=0K new=0K tot=0K
> > 050218 09:38:17 10389 olb_Server: Logged into babar2
> > 050218 09:38:17 10389 olb_GetLine: Unable to reading request ; connection reset by peer
> > 050218 09:38:17 10389 Receive Null line from babar2
> > 050218 09:38:17 10389 olb_Server: Unable to read response from babar2; connection reset by peer
> > 050218 09:38:17 10389 Remove_Manager Removed babar2 manager 0.1 FD=10
> > 050218 09:38:32 10389 olb_Connect: Unable to connect to babar2; connection refused
> > 050218 09:38:42 10389 olb_Connect: Unable to connect to babar2; connection refused
> > 050218 09:38:53 10389 olb_Connect: Unable to connect to babar2; connection refused
> >
> > And the core file says:
> >
> > gdb /opt/xrootd/bin/olbd core.31310
> > GNU gdb Red Hat Linux (6.1post-1.20040607.17rh)
> > Copyright 2004 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you are
> > welcome to change it and/or distribute copies of it under certain conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB. Type "show warranty" for details.
> > This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols
> > found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".
> >
> > Core was generated by `/opt/xrootd//bin/olbd -m -l /tmp/babar2.olblog -c config/redirector.cf'.
> > Program terminated with signal 11, Segmentation fault.
> > Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done.
> > Loaded symbols for /lib/libnsl.so.1
> > Reading symbols from /lib/tls/libpthread.so.0...(no debugging symbols found)...done.
> > Loaded symbols for /lib/tls/libpthread.so.0
> > Reading symbols from /lib/tls/librt.so.1...(no debugging symbols found)...done.
> > Loaded symbols for /lib/tls/librt.so.1
> > Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
> > Loaded symbols for /lib/libdl.so.2
> > Reading symbols from /usr/lib/libstdc++.so.5...(no debugging symbols found)...done.
> > Loaded symbols for /usr/lib/libstdc++.so.5
> > Reading symbols from /lib/tls/libm.so.6...(no debugging symbols found)...done.
> > Loaded symbols for /lib/tls/libm.so.6
> > Reading symbols from /lib/tls/libc.so.6...(no debugging symbols found)...done.
> > Loaded symbols for /lib/tls/libc.so.6
> > Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done.
> > Loaded symbols for /lib/libgcc_s.so.1
> > Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
> > Loaded symbols for /lib/ld-linux.so.2
> > Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...done.
> > Loaded symbols for /lib/libnss_files.so.2
> > Reading symbols from /lib/libnss_dns.so.2...(no debugging symbols found)...done.
> > Loaded symbols for /lib/libnss_dns.so.2
> > Reading symbols from /lib/libresolv.so.2...(no debugging symbols found)...done.
> > Loaded symbols for /lib/libresolv.so.2
> > #0 0x08067adb in XrdOucSecurity::Authorize ()
> > (gdb) backtrace
> > #0 0x08067adb in XrdOucSecurity::Authorize ()
> > #1 0x0806673b in XrdOucNetwork::do_Accept ()
> > #2 0x08065dec in XrdOucNetwork::Accept ()
> > #3 0x08058024 in main ()
> >
> > I also added two lines to the config file for the 2 new GPFS hosts (see full
> > dataserver config file below):
> >
> > olb.allow host f01-010-110.gridka.de
> > olb.allow host f01-005-115.gridka.de
> >
> >
> > Regards,
> > Gregory
> >
> >
> > #
> > # dataserver.cf
> > #
> >
> > # The Open Distributed Cache Section
> > #
> > odc.manager babar2 3121
> >
> > # The Open Load Balancer Section
> > #
> > olb.allow host l01-001-122.gridka.de
> > olb.allow host f01-001-1*.gridka.de
> > olb.allow host f01-010-110.gridka.de
> > olb.allow host f01-005-115.gridka.de
> > olb.port 3121
> > olb.path r /store
> > olb.sched cpu 100
> > olb.subscribe babar2 3121
> > olb.wait
> >
> > # The Open File System Section
> > #
> > ofs.redirect remote if l01-001-122.gridka.de
> > ofs.redirect target
> > #ofs.redirect target if f01-001-121.gridka.de
> > #ofs.redirect target if f01-001-1*.gridka.de
> >
> > # The Open Storage System Section (cache & localroot are used by olb)
> > #
> > oss.alloc * * 80
> > oss.fdlimit * max
> > oss.localroot /home/xrootd/disk/kanga/EventStore/
> > #oss.path /data/read r/o
> >
> > # The XRD Section
> > #
> > xrd.protocol xrootd *
> >
> > # The XROOTD Section
> > #
> > xrootd.fslib /home/xrootd/software/20040907-0403/lib/libXrdOfs.so
> > xrootd.export /store
> > xrootd.export /prod
> >
> > # Switch on debugging output
> > #
> > odc.trace redirect
> > xrd.trace all
> > xrootd.trace all
> > olb.trace all
> > oss.trace all
> >
> >
> > -------------- Dr. Gregory Schott --------------
> > Institut fuer Experimentelle Kernphysik (IEKP)
> > Universitaet Karlsruhe - Postfach 3640
> > 76021 Karlsruhe (Germany)
> > tel.: +49-(0)724782-3537
> > fax.: +49-(0)724782-3414
> > -----------------------------------------------
> >
> >

-------------------------------------------------------------------------
Peter Elmer     Phone: +41 (22) 767-4644
Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
-------------------------------------------------------------------------
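
P.S. The reverse-lookup check that Andy suggests running on the redirector (nslookup on each data server's IP address) could also be scripted. What follows is only a minimal sketch in Python, not part of xrootd or olbd: the helper names reverse_lookup, host_allowed and check are made up for illustration, and the shell-style wildcard match is an assumption about how a pattern such as f01-001-1*.gridka.de is meant to be read, not a description of olbd's actual matching code. The patterns are copied from the olb.allow lines in the quoted dataserver.cf.

```python
import fnmatch
import socket

# Patterns copied from the olb.allow lines in the quoted dataserver.cf.
ALLOW_PATTERNS = [
    "l01-001-122.gridka.de",
    "f01-001-1*.gridka.de",
    "f01-010-110.gridka.de",
    "f01-005-115.gridka.de",
]


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the primary host name for ip, or None if the reverse lookup fails.

    The resolver argument exists only so the check can be exercised
    without a live DNS server.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except (socket.herror, socket.gaierror):
        return None
    return hostname


def host_allowed(hostname, patterns=ALLOW_PATTERNS):
    """Shell-style wildcard match (an assumption, not olbd's real logic)."""
    name = hostname.lower()
    return any(fnmatch.fnmatchcase(name, p.lower()) for p in patterns)


def check(ip, resolver=socket.gethostbyaddr):
    """Summarise in one line what the redirector would see for this IP."""
    name = reverse_lookup(ip, resolver)
    if name is None:
        return f"{ip}: no reverse DNS entry"
    verdict = "matches" if host_allowed(name) else "does not match"
    return f"{ip}: {name} {verdict} an olb.allow pattern"
```

Calling print(check(ip)) on the redirector for each data server address would flag an IP with no reverse record, or one whose reverse name differs from what the olb.allow lines expect (for example because of DNS load balancing, as Pete suggests).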