Hello Andy,

> The reason is that xrootd cannot translate the NAS IP address back to the
> DNS name. Are you sure the NAS boxes are properly registered in your name
> server? The next version will give you an error message when this happens
> instead of crashing.

No, I am not sure. Do you know how I can check that?
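
(For reference, a reverse lookup of this kind can be tested from the data server itself. The following is only a minimal sketch, assuming Python is available there. The helper names `reverse_lookup` and `forward_matches` are made up for illustration and are not part of xrootd. It asks the system resolver for the name behind an address, then checks that the name maps back to the same address.)

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the DNS name registered for ip, or None if the lookup fails."""
    try:
        name, _aliases, _addrs = resolver(ip)
        return name
    except (socket.herror, socket.gaierror):
        # No PTR record (or resolver failure): the address cannot be
        # translated back to a name.
        return None


def forward_matches(ip, name, resolver=socket.gethostbyname_ex):
    """Return True if name resolves back to a list of addresses containing ip."""
    try:
        _name, _aliases, addrs = resolver(name)
    except (socket.herror, socket.gaierror):
        return False
    return ip in addrs
```

Calling `reverse_lookup()` with the address of a NAS box should return its registered name. A result of `None` would suggest that the reverse (PTR) entry is missing from the name server.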

However, is that something you would expect with the new version but not
with the older ones? The older xrootd version still works fine for me.

-- gregory

> Andy
>
> On Tue, 22 Mar 2005, Gregory Schott wrote:
>
>> Hello,
>>
>>    While I could xrdcp a file when running version 20050110-1339, now that I
>> am running version 20050321-0425 the xrd process crashes on the dataservers
>> (RH7.3 and SL3).
>>    I have appended some logfiles below. The xrd and olb logfiles don't say
>> anything special, and xrdcp reports an error.
>>    The core file on SL3 says: (see below)
>>
>> -- Gregory
>>
>>
>> # xrd logfile
>>
>> 050322 16:30:23 001 (c) 2004 Stanford University/SLAC xrd version
>> 20050321-0425_dbg
>> 050322 16:30:23 001 xrd@f01-001-116 initialization started.
>> 050322 16:30:23 001 Using configuration file config/dataserver.cf
>> 050322 16:30:23 001 Optimizing for 256 connections; maximum is 1024
>> 050322 16:30:23 001 XrdSched: Set min_Workers=4 max_Workers=32
>> 050322 16:30:23 001 XrdSched: Set stk_Workers=26 max_Workidl=780
>> 050322 16:30:23 001 XrdSched: scheduling underused thread monitor in 780
>> seconds
>> 050322 16:30:23 001 XrdSched: Now have 1 workers
>> 050322 16:30:23 001 XrdLink: Allocating 16 link objects at a time
>> 050322 16:30:23 001 XrdPoll: Starting poller 0
>> 050322 16:30:23 001 XrdPoll: Starting poller 1
>> 050322 16:30:23 001 XrdPoll: Starting poller 2
>> 050322 16:30:23 001 XrdProtocol: loading protocol xrootd
>> 050322 16:30:23 001 (c) 2004 Stanford University/SLAC XRootd (eXtended Root
>> Daemon).
>> 050322 16:30:23 001 XrootdAioReq: Max aio/req=8; aio/srv=4096; Quantum=65536
>> 050322 16:30:23 001 XrootdAioReq: Adding 30 aioreq objects.
>> 050322 16:30:23 001 XrootdAio: Adding 24 aio objects; 4096 pending.
>> 050322 16:30:23 001 XRootd seclib not specified; strong authentication disabled
>> 050322 16:30:23 001 XrootdProtocol: Loading filesystem library
>> /home/xrootd/software/current/lib/libXrdOfs.so
>> 050322 16:30:23 001 ofs_Init: (c) 2005 Stanford University/SLAC, Ofs Version
>> 20050321-0425_dbg
>> 050322 16:30:23 001 ofs_Config: File system initialization started.
>> 050322 16:30:23 001 ofs_Config: redirect remote ignored; not applicable host.
>> 050322 16:30:23 001 odc_Config: Target redirection initialization started
>> 050322 16:30:23 001 odc_Config: Target redirection initialization completed.
>> 050322 16:30:23 001 ofs_Config: File system initialization completed.
>> config/dataserver.cf ofs configuration:
>> ofs.authorize
>> ofs.redirect target
>> ofs.fdscan     9 120 1200
>> ofs.maxdelay   60
>> ofs.trace      0
>> 050322 16:30:23 001 oss_Init: (c) 2004, Stanford University, oss Version
>> 20050321-0425_dbg
>> 050322 16:30:23 001 oss_config: Storage system initialization started.
>> 050322 16:30:23 001 oss_AioInit: started AIO read signal thread; tid=8201
>> 050322 16:30:23 24756 odc_olb: Connected to olb via /tmp/.olb/olbd.admin
>> 050322 16:30:23 001 oss_AioInit: started AIO write signal thread; tid=9226
>> 050322 16:30:23 001 oss_config: Storage system initialization completed.
>> config/dataserver.cf oss configuration:
>> oss.alloc        0 0 80
>> oss.cachescan    600
>> oss.compdetect   *
>> oss.fdlimit      512 1024
>> oss.maxdbsize    0
>> oss.localroot /home/xrootd/disk/kanga/EventStore/
>> oss.trace        fff
>> oss.xfr          1 9437184 30 10800
>> oss.memfile off  max 527738880
>> oss.path / r/w  nocheck nodread nomig nomkeep nomlock nommap norcreate nostage
>> 050322 16:30:23 001 XrdSched: scheduling xrootd protocol anchor in 3600 seconds
>> 050322 16:30:23 001 Prep log directory not specified; prepare tracking
>> disabled.
>> 050322 16:30:23 001 Exporting /prod
>> 050322 16:30:23 001 Exporting /store
>> 050322 16:30:23 001 XRootd protocol version 2.3.0 build 20050321-0425
>> successfully loaded.
>> 050322 16:30:23 001 xrd@f01-001-116:1094 initialization completed.
>>
>> # olb logfile
>>
>> 050322 16:30:23 001 olb_Config: (c) 2004 SLAC olbd version 20050321-0425_dbg
>> initializing as Server
>> 050322 16:30:23 001 olb_Config: Server initialization completed.
>> 050322 16:30:23 24748 olb_Start: Waiting for primary server to login.
>> 050322 16:30:23 24758 Admin_Login Initial admin request: 'login p 24742 port
>> 1094'
>> 050322 16:30:23 24758 olb_Admin_Login: Primary server 24742 logged in
>> 050322 16:30:23 001 AddManager Manager: Added babar2 to config; id=0
>> 050322 16:30:23 001 FreeSpace Updated fs info; old=0K new=0K tot=0K
>> 050322 16:30:23 001 olb_Server: Logged into babar2
>> 050322 16:31:01 001 Receive From babar2: 1@0 ping
>> 050322 16:31:04 24758 Admin_Login received admin request: ''
>> 050322 16:31:04 24758 olb_Login: Primary server 24742 logged out
>>
>> # xrdcp output (using version 20050316-1316)
>>
>> 050322 16:22:46 001 Xrd: GetDomainToMatch GetHostName(f01-001-116.gridka.de)
>> returned name=f01-001-116.gridka.de
>> 050322 16:22:46 001 Xrd: GetDomainToMatch GetDomain(f01-001-116.gridka.de) -->
>> gridka.de
>> 050322 16:22:46 001 Xrd: CheckHostDomain Resolved [f01-001-116.gridka.de]'s
>> domain name into [gridka.de]
>> 050322 16:22:46 001 Xrd: CheckHostDomain Access granted to the domain of
>> [f01-001-116.gridka.de].
>> 050322 16:22:46 001 Xrd: GetDomainToMatch GetHostName(f01-001-116.gridka.de)
>> returned name=f01-001-116.gridka.de
>> 050322 16:22:46 001 Xrd: GetDomainToMatch GetDomain(f01-001-116.gridka.de) -->
>> gridka.de
>> 050322 16:22:46 001 Xrd: CheckHostDomain Resolved [f01-001-116.gridka.de]'s
>> domain name into [gridka.de]
>> 050322 16:22:46 001 Xrd: CheckHostDomain Access granted to the domain of
>> [f01-001-116.gridka.de].
>> 050322 16:22:46 001 Xrd: CreateTXNf Trying to connect to
>> f01-001-116.gridka.de:1094. Connect try 1
>> 050322 16:22:46 001 Xrd: Connect Creating a logical connection...
>> 050322 16:22:46 001 Xrd: Connect Physical connection not found. Creating a new
>> one...
>> 050322 16:22:46 001 Xrd: Connect Connecting to [f01-001-116.gridka.de:1094]
>> 050322 16:22:46 001 Xrd: ClientSock::TryConnect Trying to connect
>> tof01-001-116.gridka.de(10.65.1.116):1094 Timeout=60
>> 050322 16:22:46 001 Xrd: Connect Connected to [f01-001-116.gridka.de:1094]
>> 050322 16:22:46 001 Xrd: Connect New physical connection to server
>> f01-001-116.gridka.de:1094 succesfully created.
>> 050322 16:22:46 001 Xrd: Connect LogConn: size:1 count: 1PhyConn: size:1 count:
>> 1
>> 050322 16:22:46 001 Xrd: Connect Connect(f01-001-116.gridka.de, 1094) returned
>> 0
>> 050322 16:22:46 001 Xrd: CreateTXNf The logical connection id is 0. This will
>> be the streamid for this client
>> 050322 16:22:46 001 Xrd: CreateTXNf Working url is f01-001-116.gridka.de:1094//
>> 050322 16:22:46 001 Xrd: DoHandShake HandShake step 1: Sending 20 bytes to the
>> server [f01-001-116.gridka.de:1094]
>> 050322 16:22:46 001 Xrd: DoHandShake HandShake step 2: Reading 4 bytes from
>> server [f01-001-116.gridka.de:1094].
>> 050322 16:22:48 001 Xrd: ClientSock::RecvRaw Disconnection detected reading 4
>> bytes from socket 4 (server[f01-001-116.gridka.de:1094]). Revents=25
>> 050322 16:22:48 001 Xrd: ReadRaw Read error on f01-001-116.gridka.de:1094.
>> errno=22
>> 050322 16:22:48 001 Xrd: ReadRaw Disconnection reported
>> onf01-001-116.gridka.de:1094
>> 050322 16:22:48 001 Xrd: DoHandShake Error reading 4 bytes from server
>> [f01-001-116.gridka.de:1094].
>> 050322 16:22:48 001 Xrd: StartReader Starting reader thread...
>> 050322 16:22:48 000 Xrd: SocketReaderThread Reader Thread starting.
>> 050322 16:22:48 000 Xrd: ReadRaw Socket is disconnected.
>> 050322 16:22:48 001 Xrd: GetAccessToSrv HandShake failed with server
>> [f01-001-116.gridka.de:1094]
>> 050322 16:22:48 001 Xrd: CreateTXNf Access to server failed
>> 050322 16:22:48 001 Xrd: CreateTXNf Disconnecting.
>> 050322 16:22:48 001 Xrd: Disconnect Destroying nonexistent logconn 0
>> 050322 16:22:48 001 Xrd: Create Connection attempt failed. Sleeping 10 seconds.
>>
>> # core file
>>
>> -bash-2.05b$ gdb software/current/bin/xrootd  core.14044
>> GNU gdb Red Hat Linux (6.1post-1.20040607.17rh)
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you
>> are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for
>> details.
>> This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols
>> found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".
>>
>> Core was generated by `/home/xrootd/software/current/bin/xrootd -p 1094 -l
>> /tmp/f01-010-110.xrdlog -c'.
>> Program terminated with signal 11, Segmentation fault.
>> Reading symbols from /lib/libnsl.so.1...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libnsl.so.1
>> Reading symbols from /lib/tls/libpthread.so.0...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/tls/libpthread.so.0
>> Reading symbols from /lib/tls/librt.so.1...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/tls/librt.so.1
>> Reading symbols from /lib/libdl.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libdl.so.2
>> Reading symbols from /usr/lib/libstdc++.so.5...(no debugging symbols
>> found)...done.
>> Loaded symbols for /usr/lib/libstdc++.so.5
>> Reading symbols from /lib/tls/libm.so.6...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/tls/libm.so.6
>> Reading symbols from /lib/tls/libc.so.6...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/tls/libc.so.6
>> Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libgcc_s.so.1
>> Reading symbols from /lib/ld-linux.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/ld-linux.so.2
>> Reading symbols from /lib/libnss_files.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libnss_files.so.2
>> Reading symbols from
>> /home/xrootd/software/20050316-1316/lib/libXrdOfs.so...(no debugging
>> symbols found)...done.
>> Loaded symbols for /home/xrootd/software/current/lib/libXrdOfs.so
>> Reading symbols from /lib/libnss_dns.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libnss_dns.so.2
>> Reading symbols from /lib/libresolv.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libresolv.so.2
>> #0  0xb7407198 in strcmp () from /lib/tls/libc.so.6
>> (gdb) backtrace
>> #0  0xb7407198 in strcmp () from /lib/tls/libc.so.6
>> #1  0x08079903 in XrdNet::Trim ()
>> #2  0x0806d165 in XrdLink::Alloc ()
>> #3  0x08078cd1 in XrdInet::Accept ()
>> #4  0x0806f58b in main ()
>>
>