With xrootd 5.3.4 (running on EOS 5.0.9 storage nodes), we hit the following problem, which makes the EOS FST service crash and restart in a loop. I'll let @simonmichal comment with more details on the problem.

```text
220114 03:15:56 time=1642126556.999168 func=DoTpcTransfer            level=INFO  logid=e86c642c-74df-11ec-8906-0cc47a69735c unit=f
[log in to unmask]:1095 tid=00007fa7d19b4700 source=XrdFstOfsFile:3480             tident=1.14663:53@eospilot-ns-00 sec=
    uid=1 gid=1 name=nobody geo="" msg="tcp write" offset=0
=================================================================
==12363==ERROR: AddressSanitizer: heap-use-after-free on address 0x6140000f6f50 at pc 0x7fa829eac3bc bp 0x7fa80ddc35c0 sp 0x7fa80ddc35b0
READ of size 8 at 0x6140000f6f50 thread T443
    #0 0x7fa829eac3bb in XrdCl::AsyncSocketHandler::Event(unsigned char, XrdCl::Socket*) /usr/src/debug/xrootd-5.3.4/src/XrdCl/XrdClAsyncSocketHandler.cc:257
    #1 0x7fa829c8af6b  (/opt/eos/xrootd/lib64/libXrdCl.so.3+0x36cf6b)
    #2 0x7fa83882d666 in XrdSys::IOEvents::Poller::CbkXeq(XrdSys::IOEvents::Channel*, int, int, char const*) (/opt/eos/xrootd/lib64/libXrdUtils.so.3+0x97666)
    #3 0x7fa83882ed5b in XrdSys::IOEvents::Poller::CbkTMO() (/opt/eos/xrootd/lib64/libXrdUtils.so.3+0x98d5b)
    #4 0x7fa83882f40d in XrdSys::IOEvents::Poller::TmoGet() (/opt/eos/xrootd/lib64/libXrdUtils.so.3+0x9940d)
    #5 0x7fa838831a67 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) /usr/src/debug/xrootd-5.3.4/src/XrdSys/XrdSysIOEventsPollE.icc:212
    #6 0x7fa838831a67 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) /usr/src/debug/xrootd-5.3.4/src/XrdSys/XrdSysIOEventsPollE.icc:196
    #7 0x7fa8388276bf in XrdSys::IOEvents::BootStrap::Start(void*) /usr/src/debug/xrootd-5.3.4/src/XrdSys/XrdSysIOEvents.cc:133
    #8 0x7fa83883fe29 in XrdSysThread_Xeq /usr/src/debug/xrootd-5.3.4/src/XrdSys/XrdSysPthread.cc:86
    #9 0x7fa837955ea4 in start_thread (/lib64/libpthread.so.0+0x7ea4)
    #10 0x7fa83767eb0c in clone (/lib64/libc.so.6+0xfeb0c)

0x6140000f6f50 is located 272 bytes inside of 408-byte region [0x6140000f6e40,0x6140000f6fd8)
freed by thread T443 here:
    #0 0x7fa839420a35 in operator delete(void*, unsigned long) (/usr/lib64/libasan.so.5+0x10fa35)
    #1 0x7fa829caf566 in XrdCl::SubStreamData::~SubStreamData() /usr/src/debug/xrootd-5.3.4/src/XrdCl/XrdClStream.cc:78
    #2 0x7fa829caf566 in XrdCl::Stream::~Stream() /usr/src/debug/xrootd-5.3.4/src/XrdCl/XrdClStream.cc:150

previously allocated by thread T68 here:
    #0 0x7fa83941f36f in operator new(unsigned long) (/usr/lib64/libasan.so.5+0x10e36f)
    #1 0x7fa829cbeddf in XrdCl::Stream::Initialize() /usr/src/debug/xrootd-5.3.4/src/XrdCl/XrdClStream.cc:164
    #2 0x7fa829ca5842 in XrdCl::Channel::Channel(XrdCl::URL const&, XrdCl::Poller*, XrdCl::TransportHandler*, XrdCl::TaskManager*, XrdCl::JobManager*, XrdCl::URL const&) /usr/src/debug/xrootd-5.3.4/src/XrdCl/XrdClChannel.cc:120

Thread T443 created by T239 here:
    #0 0x7fa83934b471 in pthread_create (/usr/lib64/libasan.so.5+0x3a471)
    #1 0x7fa83884067a in XrdSysThread::Run(unsigned long*, void* (*)(void*), void*, int, char const*) (/opt/eos/xrootd/lib64/libXrdUtils.so.3+0xaa67a)

Thread T239 created by T51 here:
    #0 0x7fa83934b471 in pthread_create (/usr/lib64/libasan.so.5+0x3a471)
    #1 0x7fa83884067a in XrdSysThread::Run(unsigned long*, void* (*)(void*), void*, int, char const*) (/opt/eos/xrootd/lib64/libXrdUtils.so.3+0xaa67a)

Thread T51 created by T0 here:
    #0 0x7fa83934b471 in pthread_create (/usr/lib64/libasan.so.5+0x3a471)
    #1 0x7fa82b739c64 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (/lib64/libEosCommon.so.5+0x70fc64)
    #2 0x62000000507f  (<unknown module>)

Thread T68 created by T5 here:
    #0 0x7fa83934b471 in pthread_create (/usr/lib64/libasan.so.5+0x3a471)
    #1 0x7fa83884067a in XrdSysThread::Run(unsigned long*, void* (*)(void*), void*, int, char const*) (/opt/eos/xrootd/lib64/libXrdUtils.so.3+0xaa67a)

Thread T5 created by T0 here:
    #0 0x7fa83934b471 in pthread_create (/usr/lib64/libasan.so.5+0x3a471)
    #1 0x7fa83884067a in XrdSysThread::Run(unsigned long*, void* (*)(void*), void*, int, char const*) (/opt/eos/xrootd/lib64/libXrdUtils.so.3+0xaa67a)



SUMMARY: AddressSanitizer: heap-use-after-free /usr/src/debug/xrootd-5.3.4/src/XrdCl/XrdClAsyncSocketHandler.cc:257 in XrdCl::AsyncSocketHandler::Event(unsigned char, XrdCl::Socket*)
Shadow bytes around the buggy address:
  0x0c2880016d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2880016da0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2880016db0: 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa
  0x0c2880016dc0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c2880016dd0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c2880016de0: fd fd fd fd fd fd fd fd fd fd[fd]fd fd fd fd fd
  0x0c2880016df0: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
  0x0c2880016e00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c2880016e10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2880016e20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2880016e30: 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==12363==ABORTING
220114 03:16:03 24278 Starting on Linux 3.10.0-1160.31.1.el7.x86_64
220114 03:16:03 24278 /opt/eos/xrootd/bin/xrootd -n fst -c /etc/xrd.cf.fst -l /var/log/eos/xrdlog.fst -Rdaemon
```
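
In the report above, both the faulting read (in `XrdCl::AsyncSocketHandler::Event`) and the `delete` (in `XrdCl::SubStreamData::~SubStreamData`, reached from `XrdCl::Stream::~Stream`) are attributed to the same thread, T443. One general way such a report can arise is a callback that tears down its own owner and then keeps using its members. The trace alone does not establish that this is what happens in XRootD. The sketch below only illustrates that general pattern and one common guard against it (pinning the object with a `shared_ptr` for the duration of the callback). `Handler`, `Owner`, `DemoResult`, `on_timeout` and `run_demo` are hypothetical names and are not taken from the XRootD sources.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for an event handler; not XRootD code.
struct Handler : std::enable_shared_from_this<Handler> {
  int events_seen = 0;
  std::function<void()> on_timeout;  // may release the owner's reference to us

  int Event() {
    // Pin this object so it outlives the callback even if the owner is
    // destroyed while the callback runs.
    std::shared_ptr<Handler> self = shared_from_this();
    if (on_timeout) on_timeout();
    // Without `self`, this member access would touch freed memory.
    return ++events_seen;
  }
};

// Hypothetical owner that holds the handler, loosely analogous to an object
// that deletes its handlers in its destructor.
struct Owner {
  std::shared_ptr<Handler> handler = std::make_shared<Handler>();
};

struct DemoResult {
  int events_seen;        // value observed inside Event() after teardown
  bool handler_released;  // true once the last reference is gone
};

DemoResult run_demo() {
  auto owner = std::make_unique<Owner>();
  std::weak_ptr<Handler> probe = owner->handler;
  Handler* raw = owner->handler.get();

  // The callback destroys the owner while Event() is still executing.
  raw->on_timeout = [&owner] { owner.reset(); };
  assert(!probe.expired());

  int seen = raw->Event();  // safe only because Event() pins itself
  return DemoResult{seen, probe.expired()};
}
```

Calling `run_demo()` exercises the teardown-inside-callback path: the handler stays alive until `Event()` returns and is released immediately afterwards.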




Installed package versions on the affected node:

[root@p05151113071960 ~]# rpm -qa | grep xroot
xrootd-5.3.4-1.el7.x86_64
xrootd-libs-5.3.4-1.el7.x86_64
xrootd-server-libs-5.3.4-1.el7.x86_64
eos-xrootd-debuginfo-5.3.4-1.el7.cern.asan.x86_64
xrootd-client-libs-5.3.4-1.el7.x86_64
xrootd-server-5.3.4-1.el7.x86_64
xrootd-client-5.3.4-1.el7.x86_64
eos-xrootd-5.3.4-1.el7.cern.asan.x86_64
xrootd-selinux-5.3.4-1.el7.noarch
xrootd-debuginfo-5.3.4-1.el7.x86_64
[root@p05151113071960 ~]# rpm -qa | grep eos
eos-client-5.0.9-1.el7.cern.asan.x86_64
eos-grpc-1.41.0-1.el7.x86_64
eos-xrootd-debuginfo-5.3.4-1.el7.cern.asan.x86_64
eos-folly-deps-2019.11.11.00-1.el7.cern.x86_64
eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64
eos-libmicrohttpd-0.9.38-eos.el7.cern.x86_64
eos-xrootd-5.3.4-1.el7.cern.asan.x86_64
eos-folly-2019.11.11.00-1.el7.cern.x86_64
eosscripts-2.38-2.el7.cern.noarch
eos-server-5.0.9-1.el7.cern.asan.x86_64

-- 
Reply to this email directly or view it on GitHub:
https://github.com/xrootd/xrootd/issues/1590