Hello Pete

Ok, thanks. I will try out the head.
If possible, we should also fix the problem that kXR_force
is used. It seems quite dangerous to me that two clients
can write to the same file, or that one could overwrite pieces
of an existing file.
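
To illustrate the concern, here is a rough sketch using plain local POSIX
file calls. It is only an analogy and says nothing about how xrootd or
kXR_force actually behave; the function names create_exclusive and
write_forced are made up for this example. An exclusive create refuses a
second writer, while an unconditional open lets it overwrite part of the
existing file.

```python
import os
import tempfile


def create_exclusive(path, data):
    """Create path and write data; fail if the file already exists."""
    # O_CREAT | O_EXCL makes the create atomic: a second caller gets an error
    # instead of silently sharing the file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def write_forced(path, data):
    """Open path for writing whether or not it exists (hypothetical 'forced' open)."""
    # No O_EXCL and no O_TRUNC: existing bytes are overwritten from offset 0.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def demo():
    """Return (second_create_refused, final_file_content)."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "small.test")
        create_exclusive(path, b"AAAAAAAA")  # first writer creates the file
        try:
            create_exclusive(path, b"BBBB")  # second exclusive create must fail
            second_create_refused = False
        except FileExistsError:
            second_create_refused = True
        write_forced(path, b"BBBB")  # forced open clobbers the first 4 bytes
        with open(path, "rb") as f:
            content = f.read()
    return second_create_refused, content


if __name__ == "__main__":
    print(demo())
```

In this sketch the second exclusive create is refused, but the forced
write goes through and replaces the start of the first writer's data.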

Cheers,
  Wilko


On Thu, 3 Mar 2005, Peter Elmer wrote:

>   Hi Wilko,
>
>   Just for the record, Fabrizio just wrote (as part of a CVS commit):
>
> On Thu, Mar 03, 2005 at 07:33:42PM +0000, Fabrizio Furano wrote:
> > Hi again,
> <...>
> > >  With this one I am no longer able to make xrdcp crash under heavy load
> > > on the client/server machine. I am still investigating the occasional
> > > CPU eating, but that seems more difficult, since in my tests
> > > the problem disappears when enabling the client side log, and for some
> > > strange reason I am not able to spot it by attaching gdb to the process.
> >
> > Fabrizio
>
>                                    Pete
>
>
> On Mon, Feb 28, 2005 at 12:26:48AM -0800, Wilko Kroeger wrote:
> > Hello Fabrizio
> >
> > I ran the xrdcp test again and I can reproduce crashes in xrdcp
> > (sometimes it takes 30-60 mins).
> > I used the xrootd version 20050226-0825 and xrdcp is running on a RHEL3
> > machine. I read the same file over and over:
> >   xrdcp -DIDebugLevel 2 root://${xrdhost}:2094///prod/test/small.test - > /dev/null
> >
> > The size of the small.test file is:
> > > ls -l small.test
> > rw-r--r--   1 wilko  ec  31457280 Feb 27 18:09 /u1/wilko/kanga/prod/test/small.test
> > which is 30 MB (30*1024*1024)
> >
> > I used debugLevel 1 and 2.
> >
> > You can find the core file and the debug output files in:
> > ~wilko/bbdev/work/xrootd/core/20050227_2233_d1/
> > ~wilko/bbdev/work/xrootd/core/20050227_2302_d1/
> > ~wilko/bbdev/work/xrootd/core/20050227_2314_d2/
> > ~wilko/bbdev/work/xrootd/core/20050227_2350_d2/
> >
> > Each directory contains a core file and the debug output file
> > (wk_log...). The suffix d1 or d2 means debug level 1 or 2.
> >
> > With debug option = 1, gdb shows:
> > #0  0x0018b17c in memcpy () from /lib/tls/libc.so.6
> > #1  0x0806edbc in XrdClientReadCacheItem::GetPartialInterval(void const*,
> >     long long, long long) (this=0x9f107d0, buffer=0xb5750d08,
> >     begin_offs=31457280, end_offs=31714559) at XrdClientReadCache.hh:93
> >
> > whereas with debugLevel=2, gdb shows:
> >
> > #0  0x00a4e027 in _int_free () from /lib/tls/libc.so.6
> > #1  0x00a4d018 in free () from /lib/tls/libc.so.6
> > #2  0x0806d984 in ~XrdClientReadCacheItem (this=0x96b3db8) at
> >     XrdClientReadCache.cc:40
> >
> >
> > On the xrootd side I see the error:
> > 050227 23:54:39 064 XrdLink: Unable to receive from wilko.30110:17@tori0001;
> >        connection reset by peer
> > 050227 23:54:39 064 XrootdXeq: wilko.30110:17@tori0001 disc 1:02:03 (link
> >        read error)
> >
> > (the corresponding client crash was around 23:50)
> >
> >
> > Thanks for looking into this,
> >
> > Wilko
> >
>
>
>
> -------------------------------------------------------------------------
> Peter Elmer     E-mail: [log in to unmask]      Phone: +41 (22) 767-4644
> Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
> -------------------------------------------------------------------------
>