Hi Andy,

On 12/8/22 22:41, Andrew Hanushevsky wrote:
> Hi Bockjoo,
>
> Remind me what DFS you are using (gpfs, lustre, etc)?

It's lustre.

> Does the redirector log (xrootd and cmsd) show anything unusual?

No. The only thing of note is about TLS in the xrootd.log (xrootd is configured for both gsi and TLS/ztn, and I was testing with export XrdSecPROTOCOL=gsi):

221209 03:48:13 36369 XrdLinkXeq: TLS connection from anon.0:[log in to unmask] failed; broken pipe
221209 03:48:13 36369 XrootdXeq: Unable to enable TLS for anon.0:[log in to unmask]
221209 03:48:13 36369 XrootdXeq: anon.0:[log in to unmask] disc 0:00:00
221209 03:48:13 36369 multiuser_UserSentry: No security entity object provided

There is nothing about vocms036 or the file in the cmsd.log.

> As for hanging at that line, that isn't necessarily where it happens
> as that is just a message and execution continues, though obviously
> without getting a response for a long time. When you say "time to
> time" what is the periodicity?

Roughly one in every 5 tries, the xrdfs command takes this long to execute.

> I presume the servers are running on bare metal (i.e. not VM's nor
> containers), yes?
>

Correct.

> There can be any number of reasons why this is happening. Once you
> answer the above we can construct an appropriate trace on the
> redirector to narrow down where it is actually happening.
>
> Andy

I am not sure if this info is useful, but with export XRD_LOGLEVEL=Dump, I am seeing repeated 'Running task: "FileTimer task"' entries:

[2022-12-09 10:02:34.464248 +0100][Debug ][TaskMgr ] Registering task: "StreamConnectorTask for cmsio7.rc.ufl.edu:1094" to be run at: [2022-12-09 10:04:33 +0100]
[2022-12-09 10:02:34.858015 +0100][Dump ][TaskMgr ] Running task: "FileTimer task"
[2022-12-09 10:02:34.858070 +0100][Dump ][TaskMgr ] Will rerun task "FileTimer task" at [2022-12-09 10:02:49 +0100]
[2022-12-09 10:02:48.859526 +0100][Dump ][TaskMgr ] Running task: "TickGeneratorTask for: root://cmsio7.rc.ufl.edu:1094"
[2022-12-09 10:02:48.859624 +0100][Dump ][TaskMgr ] Will rerun task "TickGeneratorTask for: root://cmsio7.rc.ufl.edu:1094" at [2022-12-09 10:03:03 +0100]
[2022-12-09 10:02:49.859739 +0100][Dump ][TaskMgr ] Running task: "FileTimer task"
[2022-12-09 10:02:49.859797 +0100][Dump ][TaskMgr ] Will rerun task "FileTimer task" at [2022-12-09 10:03:04 +0100]

Thanks,
Bockjoo

>
>
> On Thu, 8 Dec 2022, Bockjoo Kim wrote:
>
>> Hi,
>>
>> I have an issue with my redirector (XRootD 5.5.0) with the command:
>>
>> /usr/bin/xrdfs cmsio7.rc.ufl.edu:1094 stat
>> //store/mc/SAM/GenericTTbar/AODSIM/CMSSW_9_2_6_91X_mcRun1_realistic_v2-v1/00000/A64CCCF2-5C76-E711-B359-0CC47A78A3F8.root
>>
>> Usually, the command returns in less than 5 seconds, but from time to
>> time, it hangs here:
>>
>> [2022-12-09 00:07:54.714793 +0100][Debug ][TaskMgr ]
>> Registering task: "StreamConnectorTask for cmsio7.rc.ufl.edu:1094" to
>> be run at: [2022-12-09 00:09:54 +0100]
>>
>> and eventually returns in more than 2 minutes.
>>
>> Other USCMS Tier2 sites do not have this issue.
>>
>> There are 7 servers under the redirector and they do not have this
>> issue.
>>
>> I thought the cmsd directive, cms.dfs might have something to do with
>> this, so I tried a different option.
>>
>> But that didn't change anything.
>>
>> My current config uses this option for the cms.dfs directive:
>>
>> cms.dfs limit 0 lookup distrib mdhold 0 redirect immed retries 2
>>
>>
>> What could be the source of the problem?
>>
>> Thanks,
>>
>> Bockjoo
>>
>> ########################################################################
>> Use REPLY-ALL to reply to list
>>
>> To unsubscribe from the XROOTD-L list, click the following link:
>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>> ########################################################################
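
P.S. One way to put a number on the "every 5 or so tries" observation would be to time repeated runs of the same xrdfs stat command and count how many exceed a threshold. The following is only a minimal sketch and was not part of the exchange above: the helper names (time_command, run_trials, summarize), the 5-second threshold, and the 300-second timeout are illustrative assumptions. The host and file path are the ones quoted in the original message.

```python
import subprocess
import time

# Command as quoted in the thread (host and path taken verbatim from it).
XRDFS_STAT = [
    "/usr/bin/xrdfs", "cmsio7.rc.ufl.edu:1094", "stat",
    "//store/mc/SAM/GenericTTbar/AODSIM/"
    "CMSSW_9_2_6_91X_mcRun1_realistic_v2-v1/00000/"
    "A64CCCF2-5C76-E711-B359-0CC47A78A3F8.root",
]


def time_command(cmd, timeout=300):
    """Run cmd once; return (elapsed_seconds, return_code).

    return_code is None if the command did not finish within timeout
    (the 300 s default is an assumption, chosen to exceed the ~2 min hang).
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        code = result.returncode
    except subprocess.TimeoutExpired:
        code = None
    return time.monotonic() - start, code


def run_trials(cmd, attempts, runner=time_command, pause=0.0):
    """Run cmd `attempts` times and return the list of elapsed times."""
    durations = []
    for _ in range(attempts):
        elapsed, _code = runner(cmd)
        durations.append(elapsed)
        if pause:
            time.sleep(pause)
    return durations


def summarize(durations, threshold=5.0):
    """Return (slow_runs, total_runs, slowest_seconds).

    A run counts as slow when it takes longer than `threshold` seconds;
    5 s mirrors the "usually less than 5 seconds" remark in the thread.
    """
    slow = [d for d in durations if d > threshold]
    return len(slow), len(durations), max(durations, default=0.0)


# Example (not executed here, since it needs xrdfs and network access):
#   slow, total, worst = summarize(run_trials(XRDFS_STAT, attempts=20))
#   print(f"{slow}/{total} runs slower than 5 s, slowest {worst:.1f} s")
```

Running this from a client host, with and without XRD_LOGLEVEL=Dump set, would show whether the slow runs recur at a steady rate or cluster in time, which may help when deciding what to trace on the redirector.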