Thanks for such a quick turnaround! I think I’ll keep the generalization that allows restarting any service, but will rely on TRUNCATE instead of restarting mysql.

Jacek
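
For readers of the archive, the lock failure discussed in the quoted thread can be reproduced in miniature. This is only a rough sketch, not the code in ChunkResource.cc: it assumes Python's sqlite3 module as a stand-in for the MySQL server, uses DELETE FROM in place of TRUNCATE TABLE (SQLite has no TRUNCATE statement), and the helper names acquire_lock and current_owner are hypothetical. The table layout and the two uid values are taken from the log excerpt in the thread.

```python
import sqlite3

# Same columns as q_memoryLockDb.memoryLockTbl in the log excerpt.
# Assumption: SQLite stands in for MySQL, so there is no ENGINE = MEMORY clause.
CREATE_SQL = "CREATE TABLE IF NOT EXISTS memoryLockTbl (keyId INT UNIQUE, uid INT)"
INSERT_SQL = "INSERT INTO memoryLockTbl (keyId, uid) VALUES (1, ?)"


def acquire_lock(conn, uid, truncate_first=False):
    """Hypothetical helper: try to insert the single lock row for `uid`.

    Returns True if the row was inserted, False if the unique constraint
    on keyId rejected it because a row from another owner is still present.
    """
    conn.execute(CREATE_SQL)
    if truncate_first:
        # Stand-in for TRUNCATE TABLE, which SQLite does not support.
        conn.execute("DELETE FROM memoryLockTbl")
    try:
        conn.execute(INSERT_SQL, (uid,))
    except sqlite3.IntegrityError:
        return False
    return True


def current_owner(conn):
    """Hypothetical helper: return the uid stored in the lock row, or None."""
    row = conn.execute("SELECT uid FROM memoryLockTbl WHERE keyId = 1").fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    # isolation_level=None puts the connection in autocommit mode.
    conn = sqlite3.connect(":memory:", isolation_level=None)

    # The first worker takes the lock and is then restarted without releasing it.
    print("first worker locked:", acquire_lock(conn, 28572))

    # The restarted worker has a new uid and finds the stale row.
    print("restarted worker locked:", acquire_lock(conn, 29962))

    # Clearing the table before the INSERT removes the stale row.
    print("locked after cleanup:", acquire_lock(conn, 29962, truncate_first=True))
    print("owner:", current_owner(conn))
```

Without the cleanup, the second INSERT hits the unique constraint on keyId, which mirrors the "Duplicate entry" error in the log. Clearing the table first lets the restarted worker take the lock. Restarting mysql has the same effect because MEMORY tables lose their rows when the server restarts, which is why the two fixes were alternatives.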


> On Aug 27, 2015, at 11:09 PM, Salnikov, Andrei A. <[log in to unmask]> wrote:
> 
> I have just finished reviewing DM-3618 so it's up to you to decide what
> you like better. Personally I'd prefer not to restart mysql (it's a slow
> process with innodb tables).
> 
> Cheers,
> Andy
> 
> 
> From: Becla, Jacek
> Sent: Thursday, August 27, 2015 11:06 PM
> To: Salnikov, Andrei A. <[log in to unmask]>
> Cc: qserv-l <[log in to unmask]>
> Subject: Re: [QSERV-L] memoryLockDb troubles
> 
> Andy
> 
> I confirm that uncommenting these two lines helped. So, do you recommend keeping that
> and forgetting about my commit 68b3ddc8f?
> 
> Thanks
> Jacek
> 
> 
> 
> On Aug 27, 2015, at 10:46 PM, Salnikov, Andrei A. <[log in to unmask]> wrote:
> 
> I think that the conclusion for DM-2930 was that if the worker (xrootd) is
> restarted then the mysql server needs to be restarted as well. This is
> not how we do things today in the integration test; we only restart workers
> and not mysql.
> 
> We should probably re-discuss how to make things more reliable. In
> the meantime you can fix it by un-commenting the "TRUNCATE TABLE ..."
> statement in core/modules/wdb/ChunkResource.cc (lines 324-325).
> 
> Cheers,
> Andy
> 
> 
> From: [log in to unmask] On Behalf Of Becla, Jacek
> Sent: Thursday, August 27, 2015 8:55 PM
> To: qserv-l <[log in to unmask]>
> Subject: Re: [QSERV-L] memoryLockDb troubles
> 
> For the record, I have a fix in DM-3618
> 
> John, FYI, AndyS is restarting xrootd in wmgr because apparently that is the only way to force xrootd to refresh chunk inventory (at the moment)
> 
> Jacek
> 
> 
> 
> On Aug 27, 2015, at 7:39 PM, Becla, Jacek <[log in to unmask]> wrote:
> 
> John
> 
> Your latest code is giving me trouble. The integration test killed xrootd; the tail of the log is similar to what you observed intermittently:
> 
> [2015-08-27T21:32:13.084-0500] [0x7f92c2df8720] INFO  root (build/xrdsvc/SsiService.cc:142) - Cleaning up scratchDb: qservScratch.
> [2015-08-27T21:32:13.086-0500] [0x7f92c2df8720] WARN  root (build/wdb/ChunkResource.cc:302) - memLockStatus LOCKED_OTHER wrong uid. Expected 29962 got 28572 err=
> [2015-08-27T21:32:13.086-0500] [0x7f92c2df8720] WARN  root (build/wdb/ChunkResource.cc:316) - Memory tables were not released cleanly! LockStatus=LOCKED_OTHER
> [2015-08-27T21:32:13.086-0500] [0x7f92c2df8720] DEBUG root (build/wdb/ChunkResource.cc:272) - execLockSql CREATE DATABASE IF NOT EXISTS q_memoryLockDb;CREATE TABLE IF NOT EXISTS q_memoryLockDb.memoryLockTbl ( keyId INT UNIQUE, uid INT ) ENGINE = MEMORY;
> [2015-08-27T21:32:13.086-0500] [0x7f92c2df8720] DEBUG root (build/wdb/ChunkResource.cc:272) - execLockSql INSERT INTO q_memoryLockDb.memoryLockTbl (keyId, uid) VALUES(1, 29962 )
> [2015-08-27T21:32:13.086-0500] [0x7f92c2df8720] ERROR root (build/wdb/ChunkResource.cc:373) - Lock failed, exiting. query=INSERT INTO q_memoryLockDb.memoryLockTbl (keyId, uid) VALUES(1, 29962 ) err=Error 1062: Duplicate entry '1' for key 'keyId' Unable to execute query: INSERT INTO q_memoryLockDb.memoryLockTbl (keyId, uid) VALUES(1, 29962 )
> 
> 
> I saved full log here:
> 
> /home/becla/qserv-run/2015_08/var/log/worker/xrootd.log
> 
> I hope we will be able to resolve it very soon. If not, the backup plan is:
> a) back out all changes
> b) disable the code that is causing xrootd to die while you investigate
> 
> I’ll try to debug it tonight
> 
> Jacek
> 
> Use REPLY-ALL to reply to list
> To unsubscribe from the QSERV-L list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
> 
> 