Posted by: Marco Cattaneo <cattanem>
Related to: [ROOT bugs #87880] wrong TBasket when reading via xrootd
URL: <http://savannah.cern.ch/bugs/?87880>

Follow-up Comment:
In which case I guess we need a patch release of Root 5.30 ASAP, so that we can integrate this with our applications?

Submitted by: clemencic
Originator Email:
Bug / Feature: Bug report
Category:
Priority: 5 - Normal
Severity: 5 - Blocker
Status: Fixed
Privacy: Public
Assigned to: ljanyst
Open/Closed: Closed
Release: 5.30/00
Discussion Lock:
Operating System: GNU/Linux

-----Reply from Lukasz Janyst <ljanyst> on 2011-11-03 12:53 (Europe/Paris)-----
It's the client.

-----Reply from Marco Clemencic <clemencic> on 2011-11-03 12:51 (Europe/Paris)-----
Hi Lukasz,
thanks a lot for the fix.
It's not too clear to me if the problem is on the client or server side. In any case, how can we get the fix deployed?
Thanks
Marco

-----Reply from Lukasz Janyst <ljanyst> on 2011-11-03 12:34 (Europe/Paris)-----
XRootD has some limits on the lengths and number of chunks per vector read. If one of those limits is exceeded then the XrdClient::ReadV method will try to split the chunks themselves and send them in many actual readv requests if necessary. This is what happened in this case:
* the number of chunks after splitting them not to exceed max chunk size was 513
* maximum number of chunks per readv is 512, so the client issued one readv and one read request
* the readv request was unpacked correctly into the buffer supplied to the client but the read request didn't take into account that a readv request was unpacked before overwriting the beginning of the buffer

http://xrootd.cern.ch/cgi-bin/cgit.cgi/xrootd/commit/?id=d10c528900539891037566b5d26c26be1c662132

-----Reply from Lukasz Janyst <ljanyst> on 2011-11-03 10:26 (Europe/Paris)-----
It's a bug in the vector read algorithm of xrootd. The attached file reproduces the issue in terms of pure xroot api.
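
For readers following the analysis in the 12:34 reply above, here is a minimal sketch of that failure mode as a plain-Python toy model. It is not XRootD code. The function names, the in-memory "file", and the choice to leave the tail of the buffer unfilled in the buggy variant are all assumptions made for illustration. Only the figures 513 and 512 come from the thread.

```python
# Toy model of the failure described in the thread; NOT XRootD code.
# Every name here (MAX_CHUNKS_PER_READV, read_chunks, make_case) is hypothetical.

MAX_CHUNKS_PER_READV = 512  # per-readv chunk limit quoted in the thread


def read_chunks(data, chunks, buggy):
    """Copy (offset, length) chunks of `data` into one contiguous buffer.

    The first MAX_CHUNKS_PER_READV chunks model the single readv request;
    any remaining chunks model the follow-up plain read request(s).
    """
    total = sum(length for _, length in chunks)
    buf = bytearray(total)

    # Step 1: the readv part is unpacked from the start of the buffer.
    pos = 0
    for offset, length in chunks[:MAX_CHUNKS_PER_READV]:
        buf[pos:pos + length] = data[offset:offset + length]
        pos += length

    # Step 2: leftover chunks are fetched with plain reads.
    # Buggy variant: ignores the bytes step 1 already wrote and starts
    # again at 0, overwriting the beginning of the buffer.
    write_at = 0 if buggy else pos
    for offset, length in chunks[MAX_CHUNKS_PER_READV:]:
        buf[write_at:write_at + length] = data[offset:offset + length]
        write_at += length

    return bytes(buf)


def make_case(n_chunks=513, chunk_len=4):
    """Build a fake file and n_chunks back-to-back chunks covering it."""
    data = bytes((i * 7 + 3) % 251 for i in range(n_chunks * chunk_len))
    chunks = [(i * chunk_len, chunk_len) for i in range(n_chunks)]
    return data, chunks


if __name__ == "__main__":
    data, chunks = make_case()
    print("fixed matches file:", read_chunks(data, chunks, buggy=False) == data)
    print("buggy matches file:", read_chunks(data, chunks, buggy=True) == data)
```

With 513 chunks, the buggy variant returns a buffer whose first bytes hold the contents of the last chunk, while the fixed variant reproduces the file. With 512 chunks or fewer the two variants agree, which in this model is why the corruption shows up only for sufficiently large vector reads.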
(file #22190, file #22191)

-----Reply from Marco Cattaneo <cattanem> on 2011-11-02 16:45 (Europe/Paris)-----
To comment#8
Yes, that is also what we observe: we have circumstantial evidence that switching off the cache circumvents the problem.

-----Reply from Lukasz Janyst <ljanyst> on 2011-11-02 16:32 (Europe/Paris)-----
On the other hand with TTreeCache disabled all three protocols come out with the right thing.

-----Reply from Lukasz Janyst <ljanyst> on 2011-11-02 15:50 (Europe/Paris)-----
rfio:// is consistent to castor:// but different than root:// sorry for the mistake.

-----Reply from Lukasz Janyst <ljanyst> on 2011-11-02 15:48 (Europe/Paris)-----
I have hooked in the crc32 calculation for this particular buffer and it indeed comes different for the two protocols. I have also checked rfio:// and get the reading consistent to the access through root://. This confirms that there is a problem on xrootd side, either in TXNetFile or xrootd itself.

-----Reply from Lukasz Janyst <ljanyst> on 2011-11-02 14:49 (Europe/Paris)-----
I am looking into this and will let you know as soon as I know something.

-----Reply from Marco Cattaneo <cattanem> on 2011-11-02 14:35 (Europe/Paris)-----
Is there any progress with this? We need a fix....

-----Reply from Marco Clemencic <clemencic> on 2011-10-28 22:07 (Europe/Paris)-----
Hi Lukasz,
I do not know exactly the details, but you can find it here:
https://svnweb.cern.ch/trac/lhcb/browser/Online/trunk/Online/RootCnv
If you need something else, let me know.
Thanks
Marco

-----Reply from Lukasz Janyst <ljanyst> on 2011-10-24 16:39 (Europe/Paris)-----
Marco, can you point me to the place where you keep the code interacting with ROOT? I meant the part where the event selector touches ROOT API.

-----Reply from Lukasz Janyst <ljanyst> on 2011-10-24 14:45 (Europe/Paris)-----
I can reproduce the problem and will be looking at it.
Lukasz

-----Original Message-----
Hi,
we have a problem (most probably) with xrootd (bug #87105).
In some cases we get strange warnings of the type

Warning in <TBasket::ReadBasketBuffers>: basket:_Event_Bhadron_Phys_B02DstDstKSDDBeauty2CharmLine_Particle2VertexRelations.DataObject.m_version has fNevBuf=26400 but fEntryOffset=0, pos=509266101, len=26563, fNbytes=311, fObjlen=26400, trying to repair

The problem doesn't occur if we use rootd instead of xrootd.

At least in the case I've been testing, it turns out that we are getting the wrong basket. I crafted a minimal example (attached) that exposes the problem, showing the different behavior between "root://" and "castor://" PFNs. It reads a few events and prints the branch name for a good event and for a bad event.

The output you should get is:

$ ./run_me.sh
...
Prepare the environment
ROOTSYS=/afs/cern.ch/sw/lcg/app/releases/ROOT/5.30.00/x86_64-slc5-gcc43-dbg/root
****** Using xrootd (PFN:root://) ******
No source file named TBranch.cxx.
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaac7000