Hi Chris, I just fixed almost everything. For the last item, you are right also there. Before exiting, the dtor of the connectionmgr does a garbage collection, which only destroys the connections idle from more than x seconds. The younger ones are left there, so this is that mem leak. Not serious, as you point out, but I'll figure out a fix for this also in the next few days. Fabrizio Chris Jones wrote: > Gentlemen, > Since we now have the XrdClient code working with CLEO's > infrastructure code, it was easy to run Purify on the code. It found a > number of 'off the ends of array writes' (some reads) and a few memory > leaks. Below I have included the output of Purify and my comments > about the exact place of the problem. > > Chris Jones > Cornell University > > ------------------------------------------------------------------------ > ------------------------------------------------------------------------ > ---- > > Purify > ======================================================================== > ====== > ABW: Array bounds write: > * This is occurring while in thread 6: > memset [rtlib.o] > bool XrdClientMessage::CreateData() [XrdClientMessage.cc:90] > int XrdClientMessage::ReadRaw(XrdClientPhyConnection*) > [XrdClientMessage.cc:160] > > XrdClientMessage*XrdClientPhyConnection::BuildMessage(bool,bool) > [XrdClientMessage.hh:55] > SocketReaderThread [XrdClientPhyConnection.cc:49] > _thread_start [libthread.so.1] > * Writing 1 byte to 0x21a9b5 in the heap. > * Address 0x21a9b5 is 1 byte past end of a malloc'd block at 0x21a9b0 > of 5 bytes. > * This block was allocated from thread 6: > malloc [rtlib.o] > bool XrdClientMessage::CreateData() [XrdClientMessage.cc:83] > int XrdClientMessage::ReadRaw(XrdClientPhyConnection*) > [XrdClientMessage.cc:160] > > XrdClientMessage*XrdClientPhyConnection::BuildMessage(bool,bool) > [XrdClientMessage.hh:55] > SocketReaderThread [XrdClientPhyConnection.cc:49] > _thread_start [libthread.so.1] > > **Comment** > XrdClientMessage.cc lines 83 (where memory allocated) to 90 (where > memset called) > > fData = malloc(fHdr.dlen+1); > if (!fData) { > Error("XrdClientMessage::CreateData","Fatal ERROR *** > malloc failed." > " Probable system resources exhausted."); > abort(); > } > char *tmpPtr = (char *)fData; > memset((void*)(tmpPtr+fHdr.dlen+1), 0, 1); > > memset is writing to 1 beyond the end. So line 90 should be changed to > memset((void*)(tmpPtr+fHdr.dlen), 0, 1); > or line 83 should be set to > fData = malloc(fHdr.dlen+2); > > > > Purify > ======================================================================== > ====== > ABR: Array bounds read: > * This is occurring while in: > strlen [rtlib.o] > void > ParseRedir(XrdClientMessage*,int&,XrdClientString&,XrdClientString&) > [XrdClientString.hh:67] > XrdClientConn::ESrvErrorHandlerRetval XrdClientConn:: > HandleServerError(XReqErrorType&,XrdClientMessage*,ClientRequest*) > [XrdClientConn.cc:1398] > XrdClientMessage*XrdClientConn:: > ReadPartialAnswer(XReqErrorType&,unsigned&,ClientRequest*,bool,void**,Xr > dClientConn::EThreeStateReadHandler&) [XrdClientConn.cc:753] > > XrdClientMessage*XrdClientConn::ClientServerCmd(ClientRequest*,const > void*,void**,void*,bool) [XrdClientConn.cc:238] > bool XrdClientConn::SendGenCommand(ClientRequest*,const > void*,void**,void*,bool,char*,ServerResponseHeader*) > [XrdClientConn.cc:300] > * Reading 24 bytes from 0x28ba5c in the heap (1 byte at 0x28ba73 > illegal). > * Address 0x28ba5c is 4 bytes into a malloc'd block at 0x28ba58 of 27 > bytes. > * This block was allocated from thread 6: > malloc [rtlib.o] > bool XrdClientMessage::CreateData() [XrdClientMessage.cc:83] > int XrdClientMessage::ReadRaw(XrdClientPhyConnection*) > [XrdClientMessage.cc:160] > > XrdClientMessage*XrdClientPhyConnection::BuildMessage(bool,bool) > [XrdClientMessage.hh:55] > SocketReaderThread [XrdClientPhyConnection.cc:49] > _thread_start [libthread.so.1] > > > **Comment** > > probably this comes from XrdClientConn.cc line 55 > host = redirdata->host; > where redirdata->host is not null terminated since it is obtained from > the server and the server just says how long the buffer is. > > Purify > ======================================================================== > ====== > ABW: Array bounds write: > * This is occurring while in: > bool XrdClientStringMatcher::SingleMatches(char*,char*) > [XrdClientStringMatcher.cc:57] > bool XrdClientStringMatcher::Matches(char*) > [XrdClientStringMatcher.cc:123] > bool XrdClientConn:: > CheckHostDomain(XrdClientString,XrdClientString,XrdClientString) > [XrdClientConn.cc:415] > XrdClientConn::ESrvErrorHandlerRetval XrdClientConn:: > HandleServerError(XReqErrorType&,XrdClientMessage*,ClientRequest*) > [XrdClientConn.cc:1425] > XrdClientMessage*XrdClientConn:: > ReadPartialAnswer(XReqErrorType&,unsigned&,ClientRequest*,bool,void**,Xr > dClientConn::EThreeStateReadHandler&) [XrdClientConn.cc:753] > > XrdClientMessage*XrdClientConn::ClientServerCmd(ClientRequest*,const > void*,void**,void*,bool) [XrdClientConn.cc:238] > * Writing 1 byte to 0x250c3f in the heap. > * Address 0x250c3f is 1 byte before start of malloc'd block at > 0x250c40 of 1 byte. > * This block was allocated from: > malloc [rtlib.o] > strdup [libc.so.1] > bool XrdClientStringMatcher::SingleMatches(char*,char*) > [XrdClientStringMatcher.cc:50] > bool XrdClientStringMatcher::Matches(char*) > [XrdClientStringMatcher.cc:123] > bool XrdClientConn:: > CheckHostDomain(XrdClientString,XrdClientString,XrdClientString) > [XrdClientConn.cc:415] > XrdClientConn::ESrvErrorHandlerRetval XrdClientConn:: > HandleServerError(XReqErrorType&,XrdClientMessage*,ClientRequest*) > [XrdClientConn.cc:1425] > > **Comment** > > This time, Purify seems to be complaining about the line above which it > is reporting (i.e. line 55) > > if (starend) > plainexp[strlen(plainexp)-1] = '\0'; > > but the memory was allocated from > > if (starbeg) > plainexp = strdup(expr+1); > > but strlen(expr) == 1 and therefore strdup created a string of length 0 > (but of memory size 1 since it added the '\0'). So therefore > strlen(plainexp)-1 == -1 and you write to 1 byte before the start of > the string. > > > > > > Purify > ======================================================================== > ====== > > MLK: 11520 bytes leaked in 20 blocks > * This memory was allocated from: > malloc [rtlib.o] > XrdOucHash<_pthread_cond>::XrdOucHash(int,int,int) > [XrdOucHash.icc:36] > XrdClientInputBuffer::XrdClientInputBuffer() > [XrdClientVector.hh:34] > XrdClientPhyConnection:: > XrdClientPhyConnection(XrdClientAbsUnsolMsgHandler*) > [XrdClientPhyConnection.cc:63] > short XrdClientConnectionMgr::Connect(XrdClientUrlInfo) > [XrdClientConnMgr.cc:315] > short XrdClientConn::Connect(XrdClientUrlInfo) > [XrdClientString.hh:85] > * Block of 576 bytes (20 times); last block at 0x28ead0 > > MLK: 6400 bytes leaked in 20 blocks > * This memory was allocated from: > malloc [rtlib.o] > c2n6Fi_Pv___1 [libCrun.so.1] > void*operator new(unsigned) [rtlib.o] > short XrdClientConnectionMgr::Connect(XrdClientUrlInfo) > [XrdClientConnMgr.cc:315] > short XrdClientConn::Connect(XrdClientUrlInfo) > [XrdClientString.hh:85] > XReqErrorType > XrdClientConn::GoToAnotherServer(XrdClientUrlInfo) [XrdClientString.hh:85] > * Block of 320 bytes (20 times); last block at 0x28bac8 > [Plus other leaks that all come from XrdClientPhyConnection] > > **Comment** > > It looks like I am getting memory leaks for each connection (I only did > 20 data reads which matches the '20 times' for the above leaks [why > I'm getting a new connection each read I don't understand]). Although > I don't believe purify's identification of line 315 in > XrdClientConnMgr.cc, I do think the only new in that routine might be > the leak > > > phyconn = new XrdClientPhyConnection(this); > > there is definitely a leak at line 274 of that routine at the > > if ( phyconn && phyconn->Connect(RemoteServ) ) { //line 262 > .... > } else > return -1; //leak if pyconn->Connect failed > > > As for the other instances of XrdClientPhyConnection, they only go away > if the GarbageCollection was correct. Perhaps the time out constraint > on the GarbageCollection code is never met before the program stops (so > it really isn't a bad leak) OR there is a problem with the > GarbageCollection.