Hi Fabrizio,

Yes, I think you should not use the flag (of course you should provide for
an option to turn it on). I understand the problem Alvise had. His client
crashed and never closed the connection. So, the server thought the file
was still opened for write and refused to let him rewrite it. It is also
possible for this to happen if you close a connection and immediately try
to open a new one. I think having the open to force the copy is probably
sufficient.

Andy

----- Original Message -----
From: "Fabrizio Furano" <[log in to unmask]>
To: "Wilko Kroeger" <[log in to unmask]>
Cc: "Peter Elmer" <[log in to unmask]>; <[log in to unmask]>; "Andrew Hanushevsky" <[log in to unmask]>
Sent: Monday, March 07, 2005 2:20 AM
Subject: Re: crashes in xrdcp

> Hi,
>
> about the kxr_force, I remember that I put it into the flags to override
> some situation in which a write retry could not succeed because the former
> server had not already understood that the previous connection was down.
> That was part of a bunch of little problems spotted by Alvise in his
> tests.
>
> Andy, do you agree for me to cut that flag off?
>
> Fabrizio
>
> Wilko Kroeger wrote:
>> Hello Pete
>>
>> Ok, thanks. I will try out the head.
>> If possible we should also fix the problem that the kXR_force
>> is used. It seems to me quite dangerous that two clients
>> can write to the same file or one could over write piece
>> of an existing file.
>>
>> Cheers,
>> Wilko
>>
>>
>> On Thu, 3 Mar 2005, Peter Elmer wrote:
>>
>>
>>> Hi Wilko,
>>>
>>> Just for the record, Fabrizio just wrote (as part of a CVS commit):
>>>
>>>On Thu, Mar 03, 2005 at 07:33:42PM +0000, Fabrizio Furano wrote:
>>>
>>>>Hi again,
>>>
>>><...>
>>>
>>>> With this one I am no longer able to make xrdcp crash under heavy load
>>>>in the client/server machine.
>>>>I am still investigating on the occasional
>>>>cpu eating, but it seems that that's more difficult, since in my tests,
>>>>the problem disappears when enabling the client side log, and for some
>>>>strange reason I am not able to spot it by attaching gdb to the process.
>>>>
>>>>Fabrizio
>>>
>>> Pete
>>>
>>>
>>>On Mon, Feb 28, 2005 at 12:26:48AM -0800, Wilko Kroeger wrote:
>>>
>>>>Hello Fabrizio
>>>>
>>>>I run the xrdcp test again and I can reproduce crashes in xrdcp
>>>>(some times it take 30-60 mins).
>>>>I used the xrootd version 20050226-0825 and xrdcp is running on a RHEL3
>>>>machine. I read the same file over and over:
>>>> xrdcp -DIDebugLevel 2 root://${xrdhost}:2094///prod/test/small.test -
>>>> > /dev/null
>>>>
>>>>The size of the small.test file is:
>>>>
>>>>>ls -l small.test
>>>>
>>>>rw-r--r-- 1 wilko ec 31457280 Feb 27 18:09
>>>>/u1/wilko/kanga/prod/test/small.test
>>>>which is 30 MB (30*1024*1024)
>>>>
>>>>I used debugLevel 1 and 2.
>>>>
>>>>You can find the core file and the debug output files in:
>>>>~wilko/bbdev/work/xrootd/core/20050227_2233_d1/
>>>>~wilko/bbdev/work/xrootd/core/20050227_2302_d1/
>>>>~wilko/bbdev/work/xrootd/core/20050227_2314_d2/
>>>>~wilko/bbdev/work/xrootd/core/20050227_2350_d2/
>>>>
>>>>each directory contains a core file and the debug output file
>>>>(wk_log...). The ending d1 or d2 means debuglevel 1 or 2.
>>>>
>>>>With debug option = 1, gdb shows:
>>>>#0 0x0018b17c in memcpy () from /lib/tls/libc.so.6
>>>>#1 0x0806edbc in XrdClientReadCacheItem::GetPartialInterval(void
>>>>const*,
>>>> long long, long long) (this=0x9f107d0, buffer=0xb5750d08,
>>>> begin_offs=31457280, end_offs=31714559) at XrdClientReadCache.hh:93
>>>>
>>>>whereas with debugLevel=2, gdb shows:
>>>>
>>>>#0 0x00a4e027 in _int_free () from /lib/tls/libc.so.6
>>>>#1 0x00a4d018 in free () from /lib/tls/libc.so.6
>>>>#2 0x0806d984 in ~XrdClientReadCacheItem (this=0x96b3db8) at
>>>> XrdClientReadCache.cc:40
>>>>
>>>>
>>>>On the xrootd site I see the error:
>>>>050227 23:54:39 064 XrdLink: Unable to receive from
>>>>wilko.30110:17@tori0001;
>>>> connection reset by peer
>>>>050227 23:54:39 064 XrootdXeq: wilko.30110:17@tori0001 disc 1:02:03
>>>>(link
>>>> read error)
>>>>
>>>>(the corresponding client crash was around 23:50)
>>>>
>>>>
>>>>Thanks for looking into this,
>>>>
>>>>Wilko
>>>>
>>>
>>>
>>>
>>>-------------------------------------------------------------------------
>>>Peter Elmer E-mail: [log in to unmask] Phone: +41 (22) 767-4644
>>>Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
>>>-------------------------------------------------------------------------
>>>
>
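
A note on the debugLevel 1 backtrace quoted above: begin_offs=31457280 is exactly the size of small.test (30*1024*1024), so the requested interval appears to start at end of file. Whether that is the actual cause of the memcpy crash is not established in this thread. The following is only a minimal Python sketch of clamping a requested interval to the bytes a cache item really holds before copying. It is not the XrdClientReadCacheItem code; the function name, its parameters, and the assumption that end offsets are inclusive are all hypothetical.

```python
def get_partial_interval(item_begin, item_data, begin_offs, end_offs):
    """Return the part of [begin_offs, end_offs] that the cached item holds.

    item_begin -- file offset of the first cached byte (hypothetical name)
    item_data  -- the cached bytes
    begin_offs, end_offs -- requested file offsets, assumed inclusive
    """
    item_end = item_begin + len(item_data) - 1   # last cached file offset
    lo = max(begin_offs, item_begin)             # clamp start to the item
    hi = min(end_offs, item_end)                 # clamp end to the item
    if lo > hi:
        # No overlap (for example, a request that starts at EOF): copy nothing.
        return b""
    return item_data[lo - item_begin : hi - item_begin + 1]


FILE_SIZE = 30 * 1024 * 1024          # size of small.test from the report
ITEM_BEGIN = FILE_SIZE - 1024         # hypothetical item caching the last 1 KiB
CACHED = bytes(1024)

# The offsets from the backtrace start at EOF, so nothing overlaps.
at_eof = get_partial_interval(ITEM_BEGIN, CACHED, 31457280, 31714559)

# A request that straddles EOF is cut down to the bytes that exist.
straddling = get_partial_interval(ITEM_BEGIN, CACHED, FILE_SIZE - 10, FILE_SIZE + 100)
```

With the offsets from the backtrace the sketch returns an empty result instead of copying, and a request straddling the end of the file is reduced to the 10 bytes that are actually cached.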