Serge et al.,
I am trying to implement code to avoid race conditions in ZooKeeper.
I figured I'd create an ephemeral node "/LOCKS/<dbName>" and proceed
with the sensitive operations only after creating that node successfully.
However, I have the impression ZooKeeper will allow multiple jobs
to create the same node, even in synchronous mode. The test I'm using
follows. If I run it concurrently (just two instances),
I typically see a collision (output pasted after the code).
Could you have a look and think about it, and let's discuss tomorrow?
Thanks!
===================================
import os
import socket
import time
from random import randint

from kazoo.client import KazooClient
from kazoo.exceptions import NodeExistsError, NoNodeError


def sleepABit():
    v = randint(1,100) / 1000.0
    print "sleep ", v
    time.sleep(v)


def createIt(zk, k, v):
    while True:
        try:
            print "create ", v
            zk.create(k, v, ephemeral=True, makepath=True)
        except:
            print "create failed"
            sleepABit()
        finally:
            print "create ok"
            return


k = "/LOCKS/x"
zk = KazooClient(hosts="127.0.0.1:12181")
zk.start()
for i in range(0,100):
    v = (str(socket.gethostbyname(socket.gethostname())) + '_' +
         str(os.getpid()) + '_' + str(i))
    createIt(zk, k, v)
    sleepABit()
    d, s = zk.get(k)
    print "got ", d
    print "delete"
    zk.delete(k)
    print "---"
=====================
create 141.142.225.179_8831_29
create ok
sleep 0.04
got 141.142.225.179_8839_16
delete
Traceback (most recent call last):
  File "quickTest.py", line 45, in <module>
    zk.delete(k)
  File "/usr/local/home/becla/qserv/1/stack/Linux64/kazoo/1.3.1/lib/python/kazoo-1.3.1-py2.6.egg/kazoo/client.py", line 1159, in delete
    return self.delete_async(path, version).get()
  File "/usr/local/home/becla/qserv/1/stack/Linux64/kazoo/1.3.1/lib/python/kazoo-1.3.1-py2.6.egg/kazoo/handlers/threading.py", line 107, in get
    raise self._exception
kazoo.exceptions.NoNodeError: ((), {})
See? The job with pid 8831 successfully created the node,
but the other job (pid 8839) managed to create it as well
before the job with pid 8831 deleted it.
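
One thing worth ruling out before concluding that ZooKeeper allowed a double create: in createIt() the "finally:" clause runs whether or not zk.create() raised, so "create ok" is printed and the function returns even after a failed create. A job whose create failed may then go on to delete the node held by the other job, which would leave that other job with a NoNodeError like the one in the traceback. The sketch below isolates just that control flow. It does not talk to ZooKeeper: FakeZk and the local NodeExistsError are hypothetical stand-ins (not kazoo classes) so the example runs on its own, and max_tries is an illustrative parameter.

```python
class NodeExistsError(Exception):
    """Local stand-in for kazoo.exceptions.NodeExistsError (assumption: no kazoo needed)."""


class FakeZk(object):
    """Hypothetical in-memory client: create() fails if the path already exists."""

    def __init__(self):
        self.nodes = {}

    def create(self, path, value):
        if path in self.nodes:
            raise NodeExistsError(path)
        self.nodes[path] = value


def create_with_finally(zk, k, v):
    # Same control flow as createIt() in the test: "finally" also runs
    # after a failure, so this returns after the first attempt either way.
    while True:
        try:
            zk.create(k, v)
        except NodeExistsError:
            pass
        finally:
            return "create ok"


def create_with_else(zk, k, v, max_tries=3):
    # "else" runs only when create() raised nothing.
    for _ in range(max_tries):
        try:
            zk.create(k, v)
        except NodeExistsError:
            continue  # the real test would sleep here before retrying
        else:
            return "create ok"
    return "gave up"


zk = FakeZk()
zk.create("/LOCKS/x", "other_job")  # another job already holds the node

result_finally = create_with_finally(zk, "/LOCKS/x", "me")
result_else = create_with_else(zk, "/LOCKS/x", "me")
owner = zk.nodes["/LOCKS/x"]

print(result_finally)  # reports success although the create failed
print(result_else)     # reports that the node was never acquired
print(owner)           # the node still belongs to the other job
```

If this is what is happening, replacing "finally:" with "else:" in createIt() (and catching NodeExistsError rather than using a bare except) should make the test loop until the create really succeeds. Separately, kazoo ships a lock recipe (zk.Lock(path, identifier), usable as a context manager) that might be a better fit than hand-rolled ephemeral nodes, though I have not tried it in this setup.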