Andy

I confirm that uncommenting these two lines helped. So, do you recommend to keep that and forget about my commit 68b3ddc8f ?

Thanks
Jacek

> On Aug 27, 2015, at 10:46 PM, Salnikov, Andrei A. <[log in to unmask]> wrote:
>
> I think that conclusion for DM-2930 was that if worker (xrootd) is
> restarted then mysql server needs to be restarted as well. This is
> not how we do things today in integration test, we only restart workers
> and not mysql.
>
> We should probably re-discuss how to make things more reliable, in
> the meantime you can fix it by un-commenting "TRUNCATE TABLE ..."
> statement in core/modules/wdb/ChunkResource.cc (lines 324-325).
>
> Cheers,
> Andy
>
>
> From: [log in to unmask] On Behalf Of Becla, Jacek
> Sent: Thursday, August 27, 2015 8:55 PM
> To: qserv-l <[log in to unmask]>
> Subject: Re: [QSERV-L] memoryLockDb troubles
>
> For the record, I have a fix in DM-3618
>
> John, FYI, AndyS is restarting xrootd in wmgr because apparently that is the only way to force xrootd to refresh chunk inventory (at the moment)
>
> Jacek
>
>
> On Aug 27, 2015, at 7:39 PM, Becla, Jacek <[log in to unmask]> wrote:
>
> John
>
> Your latest code is giving me troubles. Integration test killed xrootd, the tail of the log is similar to what you observed intermittently:
>
> [2015-08-27T21:32:13.084-0500] [0x7f92c2df8720] INFO root (build/xrdsvc/SsiService.cc:142) - Cleaning up scratchDb: qservScratch.
> [2015-08-27T21:32:13.086-0500] [0x7f92c2df8720] WARN root (build/wdb/ChunkResource.cc:302) - memLockStatus LOCKED_OTHER wrong uid. Expected 29962 got 28572 err=
> [2015-08-27T21:32:13.086-0500] [0x7f92c2df8720] WARN root (build/wdb/ChunkResource.cc:316) - Memory tables were not released cleanly! LockStatus=LOCKED_OTHER
> [2015-08-27T21:32:13.086-0500] [0x7f92c2df8720] DEBUG root (build/wdb/ChunkResource.cc:272) - execLockSql CREATE DATABASE IF NOT EXISTS q_memoryLockDb;CREATE TABLE IF NOT EXISTS q_memoryLockDb.memoryLockTbl ( keyId INT UNIQUE, uid INT ) ENGINE = MEMORY;
> [2015-08-27T21:32:13.086-0500] [0x7f92c2df8720] DEBUG root (build/wdb/ChunkResource.cc:272) - execLockSql INSERT INTO q_memoryLockDb.memoryLockTbl (keyId, uid) VALUES(1, 29962 )
> [2015-08-27T21:32:13.086-0500] [0x7f92c2df8720] ERROR root (build/wdb/ChunkResource.cc:373) - Lock failed, exiting. query=INSERT INTO q_memoryLockDb.memoryLockTbl (keyId, uid) VALUES(1, 29962 ) err=Error 1062: Duplicate entry '1' for key 'keyId' Unable to execute query: INSERT INTO q_memoryLockDb.memoryLockTbl (keyId, uid) VALUES(1, 29962 )
>
>
> I saved full log here:
>
> /home/becla/qserv-run/2015_08/var/log/worker/xrootd.log
>
> I hope we will be able to resolve it very soon, if we won’t, backup plan:
> a) back off all changes
> b) disable the code that is causing xrootd to die while you investigate
>
> I’ll try to debug it tonight
>
> Jacek
>
> Use REPLY-ALL to reply to list
> To unsubscribe from the QSERV-L list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
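
For readers following the thread, the failure in the log can be reproduced in miniature. The sketch below is not Qserv code: it assumes an in-memory SQLite connection as a stand-in for a mysqld that keeps running across worker restarts, and the helper names `make_server` and `acquire_lock` are hypothetical. The table layout and the two uids (28572 for the earlier worker, 29962 for the restarted one) are taken from the log. It only illustrates why a leftover row makes the INSERT fail, and why clearing the table first (the role the "TRUNCATE TABLE ..." statement appears to play) lets the restarted worker take the lock.

```python
import sqlite3


def make_server():
    """Stand-in for a long-running mysqld: one connection that outlives workers."""
    conn = sqlite3.connect(":memory:")
    # Same columns as q_memoryLockDb.memoryLockTbl in the log (assumed equivalent).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memoryLockTbl (keyId INT UNIQUE, uid INT)"
    )
    return conn


def acquire_lock(conn, uid, truncate_first=False):
    """Try to claim the single lock row (keyId = 1) for this worker uid."""
    if truncate_first:
        # SQLite has no TRUNCATE TABLE; DELETE without WHERE plays that role here.
        conn.execute("DELETE FROM memoryLockTbl")
    try:
        conn.execute(
            "INSERT INTO memoryLockTbl (keyId, uid) VALUES (1, ?)", (uid,)
        )
        return True
    except sqlite3.IntegrityError:
        # Same failure class as MySQL error 1062 (duplicate entry for a unique key).
        return False


server = make_server()
first_start = acquire_lock(server, 28572)    # original worker takes the lock
restart_plain = acquire_lock(server, 29962)  # restarted worker, stale row still there
restart_truncate = acquire_lock(server, 29962, truncate_first=True)
owner = server.execute(
    "SELECT uid FROM memoryLockTbl WHERE keyId = 1"
).fetchone()[0]
print(first_start, restart_plain, restart_truncate, owner)
```

Clearing the table unconditionally also removes the protection against two live workers sharing one database, which may be why the thread suggests re-discussing the approach rather than treating the truncate as final.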