Hello all,

I am new to xrootd and am evaluating a test setup in a 10 Gbit network.
When I copy a 1 GB test file in a loop, xrootd 4.8.1 stalls every 4 to 7 copies.
The copy path is RAM -> network -> RAM, to exclude disk I/O.
Sometimes xrootd recovers on its own; sometimes the copy job has to be killed.
Restarting xrootd brings everything back to normal until the next stall.

I would expect xrootd not to stall even under these circumstances, although I agree this is a somewhat artificial use case.

Best
Heiko

xrootd.cf, Ver 4.8.1:

all.export /xrootd
set xrdr=REDIRECTOR
set inventory=/var/log/xrootd/inventory
all.manager $(xrdr):3121
if $(xrdr) && named cns
    all.export $(inventory)
    xrd.port 1095
else if $(xrdr)
    all.role manager
    ofs.forward 3way $(xrdr):1095 mv rm rmdir trunc
    xrd.port 1094
else
    all.role server
    ofs.notify closew create mkdir mv rm rmdir trunc | /usr/bin/XrdCnsd -d -D 2 -i 90 -b $(xrdr):1095:$(inventory)
    ofs.notifymsg create $TID create $FMODE $LFN?$CGI
    ofs.notifymsg closew $TID closew $LFN $FSIZE
fi

The brute force test:

for ((i=0;i<=100;i++)); do
    rm -f /mnt/ramdisk/test.dat
    xrdcp -d 3 -v root://REDIRECTOR//xrootd/test.dat /mnt/ramdisk/test.dat
    rm -f /mnt/ramdisk/test.dat
    sleep 1
done

xrdcp debug output:

[2018-04-04 14:42:59.221301 +0200][Debug ][File ] [0x24e27a0@file://localhost/mnt/ramdisk/test.dat?oss.asize=1073741824] Sending a write command for handle 0xb to localhost
[2018-04-04 14:42:59.228163 +0200][Dump ][Utility ] URL: file://localhost/mnt/ramdisk/test.dat?oss.asize=1073741824
[2018-04-04 14:42:59.228163 +0200][Dump ][Utility ] Protocol: file
[2018-04-04 14:42:59.228163 +0200][Dump ][Utility ] User Name:
[2018-04-04 14:42:59.228163 +0200][Dump ][Utility ] Password:
[2018-04-04 14:42:59.228163 +0200][Dump ][Utility ] Host Name: localhost
[2018-04-04 14:42:59.228163 +0200][Dump ][Utility ] Port: 1094
[2018-04-04 14:42:59.228163 +0200][Dump ][Utility ] Path: /mnt/ramdisk/test.dat
[2018-04-04 14:42:59.228229 +0200][Debug ][File ] [0x24dd0d0@root://REDIRECTOR:1094//xrootd/test.dat] Sending a read command for handle 0x0 to 192.168.16.120:1094
[2018-04-04 14:42:59.228233 +0200][Dump ][File ] [0x24e27a0@file://localhost/mnt/ramdisk/test.dat?oss.asize=1073741824] Got state response for message kXR_write (handle: 0x0b000000, offset: 503316480, size: 16777216)
[2018-04-04 14:42:59.228254 +0200][Dump ][XRootD ] [192.168.16.120:1094] Sending message kXR_read (handle: 0x00000000, offset: 570425344, size: 16777216)
[2018-04-04 14:42:59.228272 +0200][Dump ][PostMaster ] [192.168.16.120:1094 #0] Sending message kXR_read (handle: 0x00000000, offset: 570425344, size: 16777216) (0x24dd9e0) through substream 0 expecting answer at 0
[2018-04-04 14:42:59.228305 +0200][Dump ][AsyncSock ] [192.168.16.120:1094 #0.0] Wrote a message: kXR_read (handle: 0x00000000, offset: 570425344, size: 16777216) (0x24dd9e0), 32 bytes
[2018-04-04 14:42:59.228329 +0200][Dump ][AsyncSock ] [192.168.16.120:1094 #0.0] Successfully sent message: kXR_read (handle: 0x00000000, offset: 570425344, size: 16777216) (0x24dd9e0).
[2018-04-04 14:42:59.228340 +0200][Dump ][XRootD ] [192.168.16.120:1094] Message kXR_read (handle: 0x00000000, offset: 570425344, size: 16777216) has been successfully sent.
[2018-04-04 14:42:59.228353 +0200][Dump ][PostMaster ] [192.168.16.120:1094 #0.0] All messages consumed, disable uplink
[2018-04-04 14:42:59.750894 +0200][Dump ][TaskMgr ] Running task: "FileTimer task"
[2018-04-04 14:42:59.750934 +0200][Dump ][TaskMgr ] Will rerun task "FileTimer task" at [2018-04-04 14:43:14 +0200]
[2018-04-04 14:43:13.464015 +0200][Dump ][XRootDTransport ] [REDIRECTOR:1094 #0.0] Stream inactive since 15 seconds, TTL: 1200, allocated SIDs: 0, open files: 0
[2018-04-04 14:43:13.464039 +0200][Dump ][XRootDTransport ] [REDIRECTOR:1094 #0.0] Stream inactive since 15 seconds, stream timeout: 60, allocated SIDs: 0, wait barrier: 2018-04-04 14:42:58 +0200
[2018-04-04 14:43:13.751694 +0200][Dump ][TaskMgr ] Running task: "TickGeneratorTask for: REDIRECTOR:1094"
[2018-04-04 14:43:13.751737 +0200][Dump ][TaskMgr ] Will rerun task "TickGeneratorTask for: REDIRECTOR:1094" at [2018-04-04 14:43:28 +0200]
[2018-04-04 14:43:13.751753 +0200][Dump ][TaskMgr ] Running task: "TickGeneratorTask for: 192.168.16.120:1094"
[2018-04-04 14:43:13.751764 +0200][Dump ][TaskMgr ] Will rerun task "TickGeneratorTask for: 192.168.16.120:1094" at [2018-04-04 14:43:28 +0200]
[2018-04-04 14:43:14.751830 +0200][Dump ][TaskMgr ] Running task: "FileTimer task"
[2018-04-04 14:43:14.751849 +0200][Dump ][TaskMgr ] Will rerun task "FileTimer task" at [2018-04-04 14:43:29 +0200]
[2018-04-04 14:43:28.752586 +0200][Dump ][TaskMgr ] Running task: "TickGeneratorTask for: REDIRECTOR:1094"
[2018-04-04 14:43:28.752656 +0200][Dump ][TaskMgr ] Will rerun task "TickGeneratorTask for: REDIRECTOR:1094" at [2018-04-04 14:43:43 +0200]
[2018-04-04 14:43:28.752691 +0200][Dump ][TaskMgr ] Running task: "TickGeneratorTask for: 192.168.16.120:1094"
[2018-04-04 14:43:28.752727 +0200][Dump ][TaskMgr ] Will rerun task "TickGeneratorTask for: 192.168.16.120:1094" at [2018-04-04 14:43:43 +0200]
[2018-04-04 14:43:28.785950 +0200][Dump ][XRootDTransport ] [REDIRECTOR:1094 #0.0] Stream inactive since 30 seconds, TTL: 1200, allocated SIDs: 0, open files: 0
[2018-04-04 14:43:28.786026 +0200][Dump ][XRootDTransport ] [REDIRECTOR:1094 #0.0] Stream inactive since 30 seconds, stream timeout: 60, allocated SIDs: 0, wait barrier: 2018-04-04 14:42:58 +0200
[2018-04-04 14:43:29.752822 +0200][Dump ][TaskMgr ] Running task: "FileTimer task"
[2018-04-04 14:43:29.752892 +0200][Dump ][TaskMgr ] Will rerun task "FileTimer task" at [2018-04-04 14:43:44 +0200]
[2018-04-04 14:43:40.846051 +0200][Dump ][XRootDTransport ] [192.168.16.120:1094 #0.0] Stream inactive since 15 seconds, TTL: 300, allocated SIDs: 4, open files: 1
[2018-04-04 14:43:40.846125 +0200][Dump ][XRootDTransport ] [192.168.16.120:1094 #0.0] Stream inactive since 15 seconds, stream timeout: 60, allocated SIDs: 4, wait barrier: 2018-04-04 14:42:59 +0200
[2018-04-04 14:43:43.753676 +0200][Dump ][TaskMgr ] Running task: "TickGeneratorTask for: REDIRECTOR:1094"
[2018-04-04 14:43:43.753760 +0200][Dump ][TaskMgr ] Will rerun task "TickGeneratorTask for: REDIRECTOR:1094" at [2018-04-04 14:43:58 +0200]
[2018-04-04 14:43:43.753775 +0200][Dump ][TaskMgr ] Running task: "TickGeneratorTask for: 192.168.16.120:1094"
[2018-04-04 14:43:43.753786 +0200][Dump ][TaskMgr ] Will rerun task "TickGeneratorTask for: 192.168.16.120:1094" at [2018-04-04 14:43:58 +0200]
[2018-04-04 14:43:43.854343 +0200][Dump ][XRootDTransport ] [REDIRECTOR:1094 #0.0] Stream inactive since 45 seconds, TTL: 1200, allocated SIDs: 0, open files: 0
[2018-04-04 14:43:43.854399 +0200][Dump ][XRootDTransport ] [REDIRECTOR:1094 #0.0] Stream inactive since 45 seconds, stream timeout: 60, allocated SIDs: 0, wait barrier: 2018-04-04 14:42:58 +0200
[2018-04-04 14:43:44.753880 +0200][Dump ][TaskMgr ] Running task: "FileTimer task"
[2018-04-04 14:43:44.753958 +0200][Dump ][TaskMgr ] Will rerun task "FileTimer task" at [2018-04-04 14:43:59 +0200]

--
-----------------------------------------------------------------------
Heiko Schröter
Institute of Environmental Physics (IUP)   phone: ++49-(0)421-218-62092
Institute of Remote Sensing (IFE)          fax:   ++49-(0)421-218-62070
University of Bremen (FB1)
P.O. Box 330440
Otto-Hahn-Allee 1
28359 Bremen
Germany
-----------------------------------------------------------------------
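
P.S. Since a stalled copy sometimes has to be killed by hand, here is a minimal sketch of how a single copy attempt in the brute force test could be run under a time limit. The helper name copy_with_limit and the 120-second limit in the usage comment are assumptions for illustration only and were not part of the test that produced the log. The sketch relies on timeout from GNU coreutils, which exits with status 124 when the limit is reached.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the original test): run one command under
# a time limit and report whether it finished, failed, or hit the limit.
# Usage: copy_with_limit LIMIT_SECONDS COMMAND [ARGS...]
copy_with_limit() {
    local limit=$1
    shift
    # timeout(1) from GNU coreutils kills the command after LIMIT seconds
    # and then exits with status 124.
    timeout "$limit" "$@"
    local rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "stalled"
    elif [ "$rc" -eq 0 ]; then
        echo "ok"
    else
        echo "failed($rc)"
    fi
    return "$rc"
}

# Possible use inside the test loop (120 s is an assumed limit):
#   copy_with_limit 120 xrdcp -d 3 -v root://REDIRECTOR//xrootd/test.dat /mnt/ramdisk/test.dat
```

Counting the "stalled" lines over the 100 iterations would then give the stall frequency without manual intervention.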