Managing Your Application Cluster with ZooKeeper in Python

http://www.zlovezl.cn/articles/40/

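Introduction:

  ZooKeeper is a distributed coordination service that began as a subproject of Apache Hadoop and is now a top-level Apache project. It addresses data-management problems that distributed applications run into again and again, such as unified naming, state synchronization, cluster management, and management of distributed application configuration.

For a more detailed introduction, see this article.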

Installing zkpython:

Python has a package called zkpython, which is built on top of ZooKeeper's C client, so the C client has to be installed first. The steps are as follows:

    # First, download ZooKeeper and build its C client
    wget http://labs.renren.com/apache-mirror//zookeeper/zookeeper-3.3.3/zookeeper-3.3.3.tar.gz
    tar xzvf zookeeper-3.3.3.tar.gz
    cd zookeeper-3.3.3/src/c/
    ./configure
    make
    make install

    # Then download and install zkpython
    wget http://pypi.python.org/packages/source/z/zkpython/zkpython-0.4.tar.gz#md5=3de220615aaddf57f1462b78d32477f9
    tar xzvf zkpython-0.4.tar.gz
    cd zkpython-0.4
    python setup.py install

That completes the zkpython installation.
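A quick way to confirm the installation is to try importing the module. The sketch below assumes zkpython exposes its bindings as a module named `zookeeper` (the demo later in this article uses that name). The helper `zkpython_available` is a hypothetical name introduced here for illustration.

```python
def zkpython_available():
    """Return True if the `zookeeper` module from zkpython can be imported."""
    try:
        import zookeeper  # noqa: F401  (imported only to test availability)
    except ImportError:
        # Raised when the module was never built or installed, and also when
        # the C client's shared library cannot be loaded at import time.
        return False
    return True


print("zkpython installed: %s" % zkpython_available())
```

If this prints `False` right after a successful `python setup.py install`, one thing worth checking is whether the C client library installed by `make install` is on the dynamic loader's search path.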

A simple demo:

Now let's write a simple demo. (The zkclient.py used in the demo: https://github.com/piglei/zkpython_example/blob/master/zkclient.py)

    # coding: utf-8
    import logging
    import time
    from os.path import basename, join

    from zkclient import ZKClient, zookeeper, watchmethod

    logging.basicConfig(
        level=logging.DEBUG,
        format="[%(asctime)s] %(levelname)-8s %(message)s"
    )
    log = logging


    class GJZookeeper(object):

        ZK_HOST = "localhost:2181"
        ROOT = "/app"
        WORKERS_PATH = join(ROOT, "workers")
        MASTERS_NUM = 1
        TIMEOUT = 10000

        def __init__(self, verbose=True):
            self.VERBOSE = verbose
            self.masters = []
            self.is_master = False
            self.path = None

            self.zk = ZKClient(self.ZK_HOST, timeout=self.TIMEOUT)
            self.say("login ok!")
            # init
            self.__init_zk()
            # register
            self.register()

        def __init_zk(self):
            """
            create the zookeeper node if not exist
            """
            nodes = (self.ROOT, self.WORKERS_PATH)
            for node in nodes:
                if not self.zk.exists(node):
                    try:
                        self.zk.create(node, "")
                    except Exception:
                        # another worker may have created the node in the meantime
                        pass

        @property
        def is_slave(self):
            return not self.is_master

        def register(self):
            """
            register a node for this worker
            """
            # EPHEMERAL: the node disappears when this session ends
            # SEQUENCE: zookeeper appends an increasing counter to the name
            self.path = self.zk.create(self.WORKERS_PATH + "/worker", "1",
                                       flags=zookeeper.EPHEMERAL | zookeeper.SEQUENCE)
            self.path = basename(self.path)
            self.say("register ok! I'm %s" % self.path)
            # check who is the master
            self.get_master()

        def get_master(self):
            """
            get children, and check who is the smallest child
            """
            @watchmethod
            def watcher(event):
                self.say("child changed, try to get master again.")
                # calling get_master again also sets a new watcher
                self.get_master()

            children = self.zk.get_children(self.WORKERS_PATH, watcher)
            children.sort()
            self.say("%s's children: %s" % (self.WORKERS_PATH, children))

            # check if I'm master
            self.masters = children[:self.MASTERS_NUM]
            if self.path in self.masters:
                self.is_master = True
                self.say("I've become master!")
            else:
                self.say("%s is masters, I'm slave" % self.masters)

        def say(self, msg):
            """
            print messages to screen
            """
            if self.VERBOSE:
                if self.path:
                    log.info("[ %s(%s) ] %s" % (self.path, "master" if self.is_master else "slave", msg))
                else:
                    log.info(msg)


    def main():
        gj_zookeeper = GJZookeeper()


    if __name__ == "__main__":
        main()
        # keep the process alive so the zookeeper session stays open
        time.sleep(1000)

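  All this simple demo does is create an ephemeral, sequential child node under the /app/workers node in ZooKeeper ( flags=zookeeper.EPHEMERAL | zookeeper.SEQUENCE ). After each create, the worker checks whether its own node is among the MASTERS_NUM smallest children (MASTERS_NUM is 1 in the example, i.e. a single master). If it is, the worker runs as the master; otherwise it runs as a slave.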
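The election rule itself needs no ZooKeeper connection, so it can be sketched as plain Python. This is an illustrative sketch only: `pick_masters` and `role_of` are hypothetical helper names that do not appear in the demo, and the child names are made up to match the demo's naming pattern.

```python
def pick_masters(children, masters_num=1):
    """Return the masters_num smallest child names, as get_master() does."""
    # ZooKeeper zero-pads the sequence suffix to ten digits, so a plain
    # string sort puts the nodes in creation order.
    return sorted(children)[:masters_num]


def role_of(my_node, children, masters_num=1):
    """Return "master" or "slave" for my_node, given the current children."""
    if my_node in pick_masters(children, masters_num):
        return "master"
    return "slave"


children = ["worker0000000023", "worker0000000022", "worker0000000024"]
print(pick_masters(children))                                # ['worker0000000022']
print(role_of("worker0000000023", children))                 # slave
print(role_of("worker0000000023", children, masters_num=2))  # master
```

Raising `masters_num` is all it takes to run several masters at once, which is why the demo keeps MASTERS_NUM as a class-level constant.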

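  With this setup, when the master dies, its connection to ZooKeeper is broken as well. Once the specified TIMEOUT has elapsed, the child node the master created under workers is deleted. That triggers the watcher each slave set earlier, so every slave checks again whether it is now the master and, if it is, takes over, which completes the failover.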
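The failover sequence can be traced without a running ZooKeeper server by using a small in-memory stand-in. Everything in this sketch is hypothetical and is not the zkpython or zkclient API: `FakeWorkersNode` imitates /app/workers, `expire` imitates a session timing out, and watchers fire only once, as ZooKeeper watches do, so `check_master` re-registers itself on every call.

```python
class FakeWorkersNode(object):
    """In-memory stand-in for /app/workers (not the real ZooKeeper API)."""

    def __init__(self, first_seq=0):
        self._next_seq = first_seq
        self._children = set()
        self._watchers = []

    def create_worker(self):
        """Mimic an EPHEMERAL | SEQUENCE create and return the child name."""
        name = "worker%010d" % self._next_seq
        self._next_seq += 1
        self._children.add(name)
        self._fire()
        return name

    def expire(self, name):
        """Mimic session expiry: the ephemeral child is deleted."""
        self._children.discard(name)
        self._fire()

    def get_children(self, watcher=None):
        """Return sorted children; the watcher fires once on the next change."""
        if watcher is not None:
            self._watchers.append(watcher)
        return sorted(self._children)

    def _fire(self):
        # Watches are one-shot: empty the list before calling the watchers,
        # so each one has to re-register to hear about later changes.
        pending, self._watchers = self._watchers, []
        for watcher in pending:
            watcher()


class Worker(object):
    MASTERS_NUM = 1

    def __init__(self, node):
        self.node = node
        self.is_master = False
        self.name = node.create_worker()
        self.check_master()

    def check_master(self):
        # Passing check_master as the watcher re-arms the watch on each call.
        children = self.node.get_children(self.check_master)
        self.is_master = self.name in children[:self.MASTERS_NUM]


node = FakeWorkersNode(first_seq=22)
first = Worker(node)
second = Worker(node)
print("%s master=%s" % (first.name, first.is_master))    # worker0000000022 master=True
print("%s master=%s" % (second.name, second.is_master))  # worker0000000023 master=False

node.expire(first.name)  # the master's session times out
print("%s master=%s" % (second.name, second.is_master))  # worker0000000023 master=True
```

In the real demo the deletion is performed by the ZooKeeper server once the session times out, and the callback is delivered by the client library rather than called inline as it is here.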

Demo output:

    # First instance
    Connected in 20 ms, handle is 0
    [2011-09-09 12:40:43,702] INFO     login ok!
    Node /app/workers/worker created in 4 ms
    [2011-09-09 12:40:43,708] INFO     [ worker0000000022(slave) ] register ok! I'm worker0000000022
    [2011-09-09 12:40:43,709] INFO     [ worker0000000022(slave) ] /app/workers's children: ['worker0000000022']
    [2011-09-09 12:40:43,709] INFO     [ worker0000000022(master) ] I've become master!

    # Now start a second instance
    Connected in 64 ms, handle is 0
    [2011-09-09 12:43:08,334] INFO     login ok!
    Node /app/workers/worker created in 11 ms
    [2011-09-09 12:43:08,346] INFO     [ worker0000000023(slave) ] register ok! I'm worker0000000023
    [2011-09-09 12:43:08,347] INFO     [ worker0000000023(slave) ] /app/workers's children: ['worker0000000022', 'worker0000000023']
    [2011-09-09 12:43:08,347] INFO     [ worker0000000023(slave) ] ['worker0000000022'] is masters, I'm slave

    # Kill the master; this is what the second instance logs next
    [2011-09-09 12:44:06,016] INFO     [ worker0000000023(slave) ] child changed, try to get master again.
    [2011-09-09 12:44:06,017] INFO     [ worker0000000023(slave) ] /app/workers's children: ['worker0000000023']
    [2011-09-09 12:44:06,017] INFO     [ worker0000000023(master) ] I've become master!