It turns out kazoo always uses async under the hood; the synchronous
calls are thin wrappers around the async ones:

http://kazoo.readthedocs.org/en/latest/_modules/kazoo/client.html#KazooClient.get

I tried calling sync(path), but it didn't help.
I believe that is only used to sync data between
zookeeper servers anyway (and I am testing all this
on a single, local server).
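
One thing that may be worth ruling out before blaming zookeeper: in the
createIt helper of the test quoted below, the "finally" clause runs whether
or not create() raised, so the helper prints "create ok" and returns even
after a failed create. Here is a minimal sketch of that control flow next to
a try/except/else variant. FakeClient, the local NodeExistsError class and
the max_tries parameter are hypothetical stand-ins so that it runs without a
server; this is not kazoo code.

```python
class NodeExistsError(Exception):
    """Stand-in for kazoo.exceptions.NodeExistsError (assumption: no server needed)."""


class FakeClient(object):
    """Hypothetical in-memory stand-in for KazooClient; only create() is modelled."""

    def __init__(self):
        self.nodes = {}

    def create(self, path, value, ephemeral=False, makepath=False):
        if path in self.nodes:
            raise NodeExistsError(path)
        self.nodes[path] = value
        return path


def create_with_finally(zk, k, v):
    """Same control flow as createIt in the quoted test."""
    while True:
        try:
            zk.create(k, v, ephemeral=True, makepath=True)
        except NodeExistsError:
            pass  # the quoted test prints "create failed" and sleeps here
        finally:
            # Runs on success AND on failure; the return ends the loop either way.
            return "create ok"


def create_with_else(zk, k, v, max_tries=3):
    """Report success only when create() did not raise."""
    for _ in range(max_tries):
        try:
            zk.create(k, v, ephemeral=True, makepath=True)
        except NodeExistsError:
            continue  # a real client would sleep a bit before retrying
        else:
            return "create ok"
    return "gave up"


zk = FakeClient()
zk.create("/LOCKS/x", "other_job")  # another job already holds the node
result_finally = create_with_finally(zk, "/LOCKS/x", "me")
result_else = create_with_else(zk, "/LOCKS/x", "me")
print(result_finally)
print(result_else)
```

If this is what is happening, a job whose create failed would still go on
to get and delete a node that belongs to the other job.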

J.





On 05/13/2014 09:25 PM, Jacek Becla wrote:
> Serge et al.,
>
> I am trying to implement code to avoid race conditions in zookeeper.
> I figured I'd create an ephemeral node "/LOCKS/<dbName>" and proceed
> with sensitive things only after creating that node successfully.
> However, I have the impression zookeeper will allow multiple jobs
> to create the same node, even in synchronous mode. Below is the test
> I'm using. If I run it concurrently (just two instances),
> I typically see a collision (output pasted below).
>
> Can you have a look and think about it? Let's discuss tomorrow.
> Thanks!
>
>
> ===================================
>
> import os
> import socket
> import time
> from random import randint
>
>
> from kazoo.client import KazooClient
> from kazoo.exceptions import NodeExistsError, NoNodeError
>
> def sleepABit():
>       v = randint(1,100) / 1000.0
>       print "sleep ", v
>       time.sleep(v)
>
>
> def createIt(zk, k, v):
>       while True:
>           try:
>               print "create ", v
>               zk.create(k, v, ephemeral=True, makepath=True)
>           except:
>               print "create failed"
>               sleepABit()
>           finally:
>               print "create ok"
>               return
>
>
> k = "/LOCKS/x"
> zk = KazooClient(hosts="127.0.0.1:12181")
> zk.start()
>
> for i in range(0,100):
>       v = (str(socket.gethostbyname(socket.gethostname())) + '_' +
>            str(os.getpid()) + '_' + str(i))
>
>       createIt(zk, k, v)
>
>       sleepABit()
>
>       d, s = zk.get(k)
>       print "got ", d
>
>       print "delete"
>       zk.delete(k)
>
>       print "---"
>
>
> =====================
>
> create  141.142.225.179_8831_29
> create ok
> sleep  0.04
> got  141.142.225.179_8839_16
> delete
> Traceback (most recent call last):
>     File "quickTest.py", line 45, in <module>
>       zk.delete(k)
>     File
> "/usr/local/home/becla/qserv/1/stack/Linux64/kazoo/1.3.1/lib/python/kazoo-1.3.1-py2.6.egg/kazoo/client.py",
> line 1159, in delete
>       return self.delete_async(path, version).get()
>     File
> "/usr/local/home/becla/qserv/1/stack/Linux64/kazoo/1.3.1/lib/python/kazoo-1.3.1-py2.6.egg/kazoo/handlers/threading.py",
> line 107, in get
>       raise self._exception
> kazoo.exceptions.NoNodeError: ((), {})
>
>
> See? The job with pid 8831 successfully created the node,
> but the other job (pid 8839) managed to create it OK as well
> before the job with pid 8831 deleted it.
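
The trace above is also what one would expect if a job deletes a node it did
not create. Here is a minimal sketch of a guarded release, which deletes the
node only if it still holds the value this job wrote and passes the version
from get() to delete(). FakeClient, Stat and the local exception classes are
hypothetical stand-ins so that it runs without a server; the release helper
is not part of kazoo.

```python
from collections import namedtuple

Stat = namedtuple("Stat", ["version"])  # stand-in for the stat returned by get()


class NoNodeError(Exception):
    """Stand-in for kazoo.exceptions.NoNodeError."""


class BadVersionError(Exception):
    """Stand-in for kazoo.exceptions.BadVersionError."""


class FakeClient(object):
    """Hypothetical in-memory stand-in for KazooClient (get/delete only)."""

    def __init__(self, nodes):
        self.nodes = dict(nodes)  # path -> (data, version)

    def get(self, path):
        if path not in self.nodes:
            raise NoNodeError(path)
        data, version = self.nodes[path]
        return data, Stat(version)

    def delete(self, path, version=-1):
        if path not in self.nodes:
            raise NoNodeError(path)
        if version != -1 and self.nodes[path][1] != version:
            raise BadVersionError(path)
        del self.nodes[path]


def release(zk, path, my_value):
    """Delete the lock node only if it still holds the value this job wrote."""
    try:
        data, stat = zk.get(path)
    except NoNodeError:
        return False  # nothing to release
    if data != my_value:
        return False  # someone else's node: leave it alone
    try:
        zk.delete(path, version=stat.version)
    except (NoNodeError, BadVersionError):
        return False  # node changed between get and delete
    return True


zk = FakeClient({"/LOCKS/x": ("141.142.225.179_8839_16", 0)})
released_by_other = release(zk, "/LOCKS/x", "141.142.225.179_8831_29")
released_by_owner = release(zk, "/LOCKS/x", "141.142.225.179_8839_16")
print(released_by_other)
print(released_by_owner)
```

With the values from the trace, the job with pid 8831 would leave the node
alone and only the job with pid 8839 would delete it.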
>
> ########################################################################
> Use REPLY-ALL to reply to list
>
> To unsubscribe from the QSERV-L list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
>
