Hi,

I'm out of ideas on where to debug further. We've hit a very odd case in which the xrdcp client receives SIGSEGV. The full backtrace, with some additional information, is below [*].

xrootd-client version is:

Name        : xrootd-client
Version     : 4.8.3
Release     : 1.osg34.el7

xrootd-server version is:

Name        : xrootd-server
Version     : 4.8.1
Release     : 1.osg34.el7

The server was verified to be an installation identical to the other xrootd nodes at the site, configuration included. Fetching the file from other nodes at the site works well. We can reproduce the "Segmentation fault" fairly quickly by running the command manually (see backtrace below).

In addition to the backtrace, I'm posting the output with an increased debug level ($ xrdcp -d 3 -f root://...):
http://t2.unl.edu/store/zvada/andy/xrdcp-segfault.txt

We are out of ideas about what to debug next. Our best guess is that the node sends a malformed message to the client, which then segfaults. We haven't seen this happen elsewhere, though.

Thanks for any help in advance.

-Marian

[*]

(gdb) r
Starting program: /usr/bin/xrdcp -d 2 -f root://xrootd.accre.vanderbilt.edu:1094//store/mc/SAM/GenericTTbar/AODSIM/CMSSW_9_2_6_91X_mcRun1_realistic_v2-v1/00000/A64CCCF2-5C76-E711-B359-0CC47A78A3F8.root /dev/null
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[2018-06-21 16:26:15.408510 -0500][Debug  ][Utility           ] CopyProcess: 2 jobs to prepare
[2018-06-21 16:26:15.408619 -0500][Debug  ][Utility           ] Creating a classic copy job, from root://xrootd.accre.vanderbilt.edu:1094//store/mc/SAM/GenericTTbar/AODSIM/CMSSW_9_2_6_91X_mcRun1_realistic_v2-v1/00000/A64CCCF2-5C76-E711-B359-0CC47A78A3F8.root to file://localhost/dev/null
[2018-06-21 16:26:15.408650 -0500][Debug  ][Utility           ] Monitor library name not set. No monitoring
[2018-06-21 16:26:15.408721 -0500][Debug  ][Poller            ] Available pollers: built-in
[2018-06-21 16:26:15.408728 -0500][Debug  ][Poller            ] Attempting to create a poller according to preference: built-in
[2018-06-21 16:26:15.408734 -0500][Debug  ][Poller            ] Creating poller: built-in
[2018-06-21 16:26:15.408745 -0500][Debug  ][Poller            ] Creating and starting the built-in poller...
[New Thread 0x7ffff4a25700 (LWP 11113)]
[2018-06-21 16:26:15.413698 -0500][Debug  ][Poller            ] Using 1 poller threads
[2018-06-21 16:26:15.413726 -0500][Debug  ][TaskMgr           ] Starting the task manager...
[New Thread 0x7ffff4224700 (LWP 11114)]
[2018-06-21 16:26:15.414280 -0500][Debug  ][TaskMgr           ] Task manager started
[2018-06-21 16:26:15.414297 -0500][Debug  ][JobMgr            ] Starting the job manager...
[New Thread 0x7ffff3a23700 (LWP 11115)]
[New Thread 0x7ffff3222700 (LWP 11116)]
[New Thread 0x7ffff2a21700 (LWP 11117)]
[2018-06-21 16:26:15.416346 -0500][Debug  ][JobMgr            ] Job manager started, 3 workers
[2018-06-21 16:26:15.416364 -0500][Debug  ][TaskMgr           ] Registering task: "FileTimer task" to be run at: [2018-06-21 16:26:15 -0500]
[2018-06-21 16:26:15.416378 -0500][Debug  ][Utility           ] Opening root://xrootd.accre.vanderbilt.edu:1094//store/mc/SAM/GenericTTbar/AODSIM/CMSSW_9_2_6_91X_mcRun1_realistic_v2-v1/00000/A64CCCF2-5C76-E711-B359-0CC47A78A3F8.root for reading
[2018-06-21 16:26:15.416421 -0500][Debug  ][File              ] [0x61b6b0@root://xrootd.accre.vanderbilt.edu:1094//store/mc/SAM/GenericTTbar/AODSIM/CMSSW_9_2_6_91X_mcRun1_realistic_v2-v1/00000/A64CCCF2-5C76-E711-B359-0CC47A78A3F8.root] Sending an open command
[2018-06-21 16:26:15.416479 -0500][Debug  ][PostMaster        ] Creating new channel to: xrootd.accre.vanderbilt.edu:1094 1 stream(s)
[2018-06-21 16:26:15.416499 -0500][Debug  ][PostMaster        ] [xrootd.accre.vanderbilt.edu:1094 #0] Stream parameters: Network Stack: IPAuto, Connection Window: 120, ConnectionRetry: 5, Stream Error Widnow: 1800
[2018-06-21 16:26:15.417059 -0500][Debug  ][TaskMgr           ] Registering task: "TickGeneratorTask for: xrootd.accre.vanderbilt.edu:1094" to be run at: [2018-06-21 16:26:30 -0500]
[2018-06-21 16:26:15.418326 -0500][Debug  ][PostMaster        ] [xrootd.accre.vanderbilt.edu:1094] Found 1 address(es): [::ffff:129.59.197.121]:1094
[2018-06-21 16:26:15.418417 -0500][Debug  ][AsyncSock         ] [xrootd.accre.vanderbilt.edu:1094 #0.0] Attempting connection to [::ffff:129.59.197.121]:1094
[2018-06-21 16:26:15.418473 -0500][Debug  ][Poller            ] Adding socket 0x621290 to the poller
[2018-06-21 16:26:15.459293 -0500][Debug  ][AsyncSock         ] [xrootd.accre.vanderbilt.edu:1094 #0.0] Async connection call returned
[2018-06-21 16:26:15.459368 -0500][Debug  ][XRootDTransport   ] [xrootd.accre.vanderbilt.edu:1094 #0.0] Sending out the initial hand shake + kXR_protocol
[2018-06-21 16:26:15.500320 -0500][Debug  ][XRootDTransport   ] [xrootd.accre.vanderbilt.edu:1094 #0.0] Got the server hand shake response (type: manager [], protocol version 310)
[2018-06-21 16:26:15.500380 -0500][Debug  ][XRootDTransport   ] [xrootd.accre.vanderbilt.edu:1094 #0.0] kXR_protocol successful (type: manager [], protocol version 310)
[2018-06-21 16:26:15.501518 -0500][Debug  ][XRootDTransport   ] [xrootd.accre.vanderbilt.edu:1094 #0.0] Sending out kXR_login request, username: zvada, cgi: ?xrd.cc=us&xrd.tz=-6&xrd.appname=xrdcp&xrd.info=&xrd.hostname=hcc-marian.unl.edu&xrd.rn=v4.8.3, dual-stack: true, private IPv4: false, private IPv6: false
[2018-06-21 16:26:15.542406 -0500][Debug  ][XRootDTransport   ] [xrootd.accre.vanderbilt.edu:1094 #0.0] Logged in, session: 6e4c0000093a000037000000164e0000
[2018-06-21 16:26:15.542456 -0500][Debug  ][XRootDTransport   ] [xrootd.accre.vanderbilt.edu:1094 #0.0] Authentication is required: &P=gsi,v:10300,c:ssl,ca:70d35895.0|d690e530.0
[2018-06-21 16:26:15.542470 -0500][Debug  ][XRootDTransport   ] [xrootd.accre.vanderbilt.edu:1094 #0.0] Sending authentication data
[2018-06-21 16:26:15.558881 -0500][Debug  ][XRootDTransport   ] [xrootd.accre.vanderbilt.edu:1094 #0.0] Trying to authenticate using gsi
[2018-06-21 16:26:15.692276 -0500][Debug  ][XRootDTransport   ] [xrootd.accre.vanderbilt.edu:1094 #0.0] Sending more authentication data for gsi
[2018-06-21 16:26:15.738889 -0500][Debug  ][XRootDTransport   ] [xrootd.accre.vanderbilt.edu:1094 #0.0] Authenticated with gsi.
[2018-06-21 16:26:15.738954 -0500][Debug  ][PostMaster        ] [xrootd.accre.vanderbilt.edu:1094 #0] Stream 0 connected.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff2a21700 (LWP 11117)]
operator<< <char, std::char_traits<char>, std::allocator<char> > (__str=<error reading variable: Cannot access memory at address 0x0>, __os=...) at /usr/include/c++/4.8.2/bits/basic_string.h:2758
warning: Source file is more recent than executable.
2758	      return __ostream_insert(__os, __str.data(), __str.size());
Missing separate debuginfos, use: debuginfo-install libstdc++-4.8.5-28.el7_5.1.x86_64
(gdb) bt
#0  operator<< <char, std::char_traits<char>, std::allocator<char> > (__str=<error reading variable: Cannot access memory at address 0x0>, __os=...) at /usr/include/c++/4.8.2/bits/basic_string.h:2758
#1  XrdCl::XRootDMsgHandler::Process (this=0x6215d0, msg=<optimized out>) at /usr/src/debug/xrootd-4.8.3/src/XrdCl/XrdClXRootDMsgHandler.cc:480
#2  0x00007ffff7b2c6ee in XrdCl::Stream::HandleIncMsgJob::Run (this=0x7fffec0556d0, arg=<optimized out>) at /usr/src/debug/xrootd-4.8.3/src/XrdCl/XrdClStream.hh:284
#3  0x00007ffff7b90daf in XrdCl::JobManager::RunJobs (this=0x61a5e0) at /usr/src/debug/xrootd-4.8.3/src/XrdCl/XrdClJobManager.cc:148
#4  0x00007ffff7b91009 in RunRunnerThread (arg=<optimized out>) at /usr/src/debug/xrootd-4.8.3/src/XrdCl/XrdClJobManager.cc:33
#5  0x00007ffff6620dc5 in start_thread (arg=0x7ffff2a21700) at pthread_create.c:308
#6  0x00007ffff6b4473d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) frame 2
#2  0x00007ffff7b2c6ee in XrdCl::Stream::HandleIncMsgJob::Run (this=0x7fffec0556d0, arg=<optimized out>) at /usr/src/debug/xrootd-4.8.3/src/XrdCl/XrdClStream.hh:284
284	            pHandler->Process( msg );
(gdb) frame 1
#1  XrdCl::XRootDMsgHandler::Process (this=0x6215d0, msg=<optimized out>) at /usr/src/debug/xrootd-4.8.3/src/XrdCl/XrdClXRootDMsgHandler.cc:480
480	        o << urlComponents[0];
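
For illustration only, here is a minimal sketch of the failure mode that frame 1 seems to point at. It assumes that urlComponents is a std::vector<std::string> that ended up empty, so that urlComponents[0] refers to address 0x0 (which would match the "Cannot access memory at address 0x0" in frame 0). The backtrace shows only the line `o << urlComponents[0];`, so the container type, the delimiter, and the helper names splitPayload and writeFirstComponent are hypothetical and not taken from the XrdCl source.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not the XrdCl implementation): split a payload on a
// delimiter, skipping empty tokens. An empty or delimiter-only payload
// yields an empty vector.
std::vector<std::string> splitPayload(const std::string &payload, char delim)
{
  std::vector<std::string> parts;
  std::istringstream in(payload);
  std::string token;
  while (std::getline(in, token, delim))
    if (!token.empty())
      parts.push_back(token);
  return parts;
}

// Guarded variant of `o << urlComponents[0];`. Indexing an empty vector is
// undefined behaviour, so the empty case is reported to the caller instead
// of being dereferenced.
bool writeFirstComponent(std::ostream &o,
                         const std::vector<std::string> &urlComponents)
{
  if (urlComponents.empty())
    return false;
  o << urlComponents[0];
  return true;
}
```

With these hypothetical helpers, writeFirstComponent(o, splitPayload("", '?')) returns false and writes nothing, where the unguarded form would read through a null pointer. Whether the real message from this node actually produces an empty list is exactly what remains to be confirmed.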


