Hi Matevz,

Yes, we're using 4.0.0 now.

Regards,
Andrew.
________________________________________
From: Matevz Tadel
Sent: Friday, June 27, 2014 6:09 PM
To: Andrew Hanushevsky
Cc: xrootd-dev; Lahiff, Andrew (STFC,RAL,PPD)
Subject: Re: Stalls at outgoing proxy

Thanks Andy,

This is actually the standard proxy; RAL was running 4.0.0-rc1 the last time we talked about it.

Andrew, have you upgraded to 4.0.0 yet?

Matevz

On 06/27/14 10:01, Andrew Hanushevsky wrote:
> Hi Matevz,
>
> No need to turn on debugging here. This particular stall occurs because a file
> is being opened and the OFS has found that the file is already open or being
> opened by another client. So, it tries to piggy-back the new open on that handle
> to avoid actually doing another physical open. The problem is that the other
> client has not yet released the handle for use; likely being hung up in the
> proxy code trying to do the open or perhaps a close. The latter problem I
> thought was solved by the disk caching proxy by doing the closes in the
> background to avoid holding on to the handle lock for long periods of time.
>
> This is not a fatal problem; the client will eventually open the file. The ofs
> layer uses this as congestion control when there is a lot of open/close
> contention for the same file. I suppose you can trace opens and closes to get
> a better feeling of how long this takes:
>
> ofs.trace open close
>
> Assuming this is a disk caching proxy there may be tracing options for that to
> see what happens during the open/close sequence.
>
> Andy
>
> On Fri, 27 Jun 2014, Matevz Tadel wrote:
>
>> Hi,
>>
>> At RAL, they see the following on their outgoing proxy servers (repeating for
>> about a minute before the file-open times out at the application side):<<FNORD
>>
>> When our xrootd proxy cluster is busy, there are sometimes messages like this
>> in the logs:
>>
>> 140626 16:53:25 24465 ofs_Stall: Stall 3: File 2EF5AF84-D65A-E311-AB3F-02163E00A0E1.root is being staged; estimated time to completion 3 seconds for /store/mc/Fall13/QCD_Pt-5to10_Tune4C_13TeV_pythia8/GEN-SIM/POSTLS162_V1_castor-v1/10000/2EF5AF84-D65A-E311-AB3F-02163E00A0E1.root
>>
>> 140626 16:53:25 24465 pcms054.6545:147@lcg1353 XrootdProtocol: stalling client for 3 sec
>> 140626 16:53:25 24465 pcms054.6545:147@lcg1353 ofs_close: use=0 fn=dummy
>>
>> FNORD
>>
>> This probably means that the remote file cannot be opened for some reason
>> (like being delayed by external redirector/server)? Would there be a special
>> error if the socket cannot be opened (due to fd or firewall limits ... or
>> some other internal limits)? Note that this only happens when the proxies are
>> already under heavy load.
>>
>> What options should they set to debug this?
>>
>> pss.memcache debug ???
>> xrd.trace conn
>> xrootd.trace redirect
>>
>> Matevz
>>
>> ########################################################################
>> Use REPLY-ALL to reply to list
>>
>> To unsubscribe from the XROOTD-DEV list, click the following link:
>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1
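
To get a feeling for how long these stalls last per file, as suggested above, the `ofs_Stall` messages could be summarized with a small script. The following is only a rough sketch: the regular expression is inferred solely from the one sample line quoted in this thread, it assumes each log record sits on a single line (the mail client wrapped them here), and the name `summarize_stalls` is made up for illustration. Other xrootd versions or configurations may format these messages differently.

```python
import re
from datetime import datetime

# Pattern inferred from the single ofs_Stall sample in this thread (an assumption,
# not a documented log format): "yymmdd HH:MM:SS tid ofs_Stall: Stall N: File ..."
STALL_RE = re.compile(
    r"^(?P<date>\d{6}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<tid>\d+) "
    r"ofs_Stall: Stall (?P<delay>\d+): File (?P<name>\S+) is being staged; "
    r"estimated time to completion \d+ seconds for (?P<path>\S+)\s*$"
)


def summarize_stalls(lines):
    """Return {path: (count, first_seen, last_seen, total_delay_seconds)}.

    Lines that do not match the ofs_Stall pattern are ignored.
    """
    stats = {}
    for line in lines:
        m = STALL_RE.match(line)
        if m is None:
            continue
        when = datetime.strptime(m["date"] + " " + m["time"], "%y%m%d %H:%M:%S")
        delay = int(m["delay"])
        path = m["path"]
        if path in stats:
            count, first, last, total = stats[path]
            stats[path] = (count + 1, min(first, when), max(last, when), total + delay)
        else:
            stats[path] = (1, when, when, delay)
    return stats


if __name__ == "__main__":
    import sys

    # Usage: python stalls.py < xrootd.log
    for path, (count, first, last, total) in summarize_stalls(sys.stdin).items():
        span = (last - first).total_seconds()
        print(f"{count} stalls over {span:.0f}s (requested delay {total}s): {path}")
```

A file that shows many stalls spread over tens of seconds would be consistent with the handle being held for a long time by another open or close, which is the contention described above.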