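Part 5, Architecture: Chapter 14, MongoDB Replica Sets Architecture (Automatic Failover and Read/Write Splitting in Practice)
Note: parts of this article are drawn from the MongoDB in practice articles written by 红丸.
1. Introduction
MongoDB uses asynchronous replication across multiple machines to provide failover and redundancy. At any given moment only one machine accepts writes, which is how MongoDB maintains its data-consistency guarantee. Reads, however, can be directed to the secondaries (Slaves) instead of the primary (see the previous two chapters on Replica Set members for details).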
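As a rough illustration of read/write splitting from the client side, the sketch below builds a connection string that asks a driver to prefer secondaries for reads. `replicaSet` and `readPreference` are standard MongoDB connection string options; the helper name `buildReplicaSetUri` and the sample hosts (the three local ports used later in this chapter) are assumptions made for the example, not part of the original article.

```javascript
// Build a MongoDB connection string for a replica set.
// buildReplicaSetUri is a hypothetical helper written for this sketch.
function buildReplicaSetUri(hosts, dbName, setName, readPreference) {
  if (!Array.isArray(hosts) || hosts.length === 0) {
    throw new Error("at least one host is required");
  }
  const query = [
    "replicaSet=" + encodeURIComponent(setName),
    "readPreference=" + encodeURIComponent(readPreference),
  ].join("&");
  // Listing every member lets the driver find the set even if one host is down.
  return "mongodb://" + hosts.join(",") + "/" + dbName + "?" + query;
}

const uri = buildReplicaSetUri(
  ["localhost:28010", "localhost:28011", "localhost:28012"],
  "test",
  "rs1",
  "secondaryPreferred"
);
console.log(uri);
```

With `secondaryPreferred`, writes still go to the primary while reads go to a secondary when one is available, which is the split this chapter tests by hand in the shell.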
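MongoDB offers two approaches to high availability:
- Master-Slave replication: start one server with the --master option and another with the --slave and --source options, and the two stay in sync. Recent versions of MongoDB no longer recommend this approach. The official documentation carries the following notice: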
IMPORTANT
Replica sets replace master-slave replication for most use cases. If possible, use replica sets rather than master-slave replication for all new production deployments. This documentation remains to support legacy deployments and for archival purposes only.
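In other words, Replica Sets have replaced Master-Slave replication for most use cases.
- Replica Set: introduced in MongoDB 1.6 as a more capable successor to the earlier replication features. It adds automatic failover and automatic recovery of member nodes and keeps the data on every member consistent (because replication is asynchronous, a secondary may briefly lag the primary), which greatly reduces maintenance effort. Auto-sharding explicitly does not support replication pairs, so Replica Sets, with fully automatic failover, are the recommended choice.
- Environment preparation
- Steps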
Create the data directories:
[root@localhost mongodb]# mkdir -p r0
[root@localhost mongodb]# mkdir -p r1
[root@localhost mongodb]# mkdir -p r2
Create the log directory:
[root@localhost mongodb]# mkdir -p log
Create the key files. Each instance's --keyFile option takes the full path to a file holding the private key that identifies the cluster; if the key file contents differ between instances, the instances will not work together properly.
[root@localhost mongodb]# mkdir -p key
[root@localhost mongodb]# echo "this is rs1 super secret key">key/r0
[root@localhost mongodb]# echo "this is rs1 super secret key">key/r1
[root@localhost mongodb]# echo "this is rs1 super secret key">key/r2
[root@localhost mongodb]# chmod 600 key/r*
[root@localhost mongodb]#
Start the three instances:
[root@localhost bin]# ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r0 --fork --port 28010 --dbpath=/usr/local/mongodb/r0 --logpath=/usr/local/mongodb/log/r0.log --logappend
about to fork child process, waiting until server is ready for connections.
forked process: 2545
[root@localhost bin]# ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r1 --fork --port 28011 --dbpath=/usr/local/mongodb/r1 --logpath=/usr/local/mongodb/log/r1.log --logappend
about to fork child process, waiting until server is ready for connections.
forked process: 2596
[root@localhost bin]# ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r2 --fork --port 28012 --dbpath=/usr/local/mongodb/r2 --logpath=/usr/local/mongodb/log/r2.log --logappend
about to fork child process, waiting until server is ready for connections.
forked process: 2602
Note: the three instances listen on ports 28010, 28011 and 28012, and their data directories are r0, r1 and r2 respectively.
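The three start commands differ only in the member name and port, so they can be generated rather than typed. The following is a minimal sketch that reproduces the same command lines as strings; it uses only the mongod options shown above, and the helper name `mongodCommand` is hypothetical.

```javascript
// Produce the mongod command line for member r<index> of a replica set.
// mongodCommand is a hypothetical helper; the flags are the ones used in this chapter.
function mongodCommand(setName, baseDir, index, port) {
  const name = "r" + index;
  return [
    "./mongod",
    "--replSet", setName,
    "--keyFile=" + baseDir + "/key/" + name,
    "--fork",
    "--port", String(port),
    "--dbpath=" + baseDir + "/" + name,
    "--logpath=" + baseDir + "/log/" + name + ".log",
    "--logappend",
  ].join(" ");
}

// Members r0, r1, r2 on consecutive ports starting at 28010.
const commands = [0, 1, 2].map(function (i) {
  return mongodCommand("rs1", "/usr/local/mongodb", i, 28010 + i);
});
commands.forEach(function (c) { console.log(c); });
```

The script only prints the commands; running them is left to the operator, which keeps the sketch free of side effects.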
[root@localhost bin]# ./mongo --port 28010
MongoDB shell version: 2.6.6
connecting to: 127.0.0.1:28010/test
> config_rs1={_id:"rs1",members:[{_id:0,host:'localhost:28010',priority:1},{_id:1,host:'localhost:28011'},{_id:2,host:'localhost:28012'}]}
{
    "_id" : "rs1",
    "members" : [
        {
            "_id" : 0,
            "host" : "localhost:28010",
            "priority" : 1
        },
        {
            "_id" : 1,
            "host" : "localhost:28011"
        },
        {
            "_id" : 2,
            "host" : "localhost:28012"
        }
    ]
}
>
Note: the configuration gives the host and port of each node. priority:1 on port 28010 is meant to make that member the primary. Because 1 is also the default priority, giving 28010 a real preference in elections requires a value higher than that of the other members.
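To make the role of `priority` concrete, here is a small sketch that builds a configuration document of the same shape and gives one member a higher priority than the default of 1. The builder function `buildReplSetConfig` and the priority value 2 are assumptions chosen for illustration; in the mongo shell the resulting object would be passed to rs.initiate().

```javascript
// Build a replica set configuration document from a list of hosts.
// buildReplSetConfig is a hypothetical helper; priorities is an optional
// map from host to priority (members without an entry keep the default).
function buildReplSetConfig(setName, hosts, priorities) {
  const seen = new Set();
  const members = hosts.map(function (host, i) {
    if (seen.has(host)) {
      throw new Error("duplicate host: " + host);
    }
    seen.add(host);
    const member = { _id: i, host: host };
    const p = priorities && priorities[host];
    if (p !== undefined) {
      member.priority = p;
    }
    return member;
  });
  return { _id: setName, members: members };
}

const config = buildReplSetConfig(
  "rs1",
  ["localhost:28010", "localhost:28011", "localhost:28012"],
  { "localhost:28010": 2 } // assumed value: higher than the default of 1
);
console.log(JSON.stringify(config, null, 2));
```

Rejecting duplicate hosts up front is a convenience of this sketch; each member still gets a unique `_id` from its position in the list.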
> rs.initiate(config_rs1);
{
    "info" : "Config now saved locally. Should come online in about a minute.",
    "ok" : 1
}
>
Check the status of the replica set:
rs1:OTHER> rs.status();
{
    "set" : "rs1",
    "date" : ISODate("2015-01-16T03:10:41Z"),
    "myState" : 2,
    "members" : [
        {
            "_id" : 0,
            "name" : "localhost:28010",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 260,
            "optime" : Timestamp(1421377833, 1),
            "optimeDate" : ISODate("2015-01-16T03:10:33Z"),
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "localhost:28011",
            "health" : 1,
            "state" : 5,
            "stateStr" : "STARTUP2",
            "uptime" : 8,
            "optime" : Timestamp(0, 0),
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2015-01-16T03:10:39Z"),
            "lastHeartbeatRecv" : ISODate("2015-01-16T03:10:39Z"),
            "pingMs" : 0,
            "lastHeartbeatMessage" : "initial sync need a member to be primary or secondary to do our initial sync"
        },
        {
            "_id" : 2,
            "name" : "localhost:28012",
            "health" : 1,
            "state" : 5,
            "stateStr" : "STARTUP2",
            "uptime" : 8,
            "optime" : Timestamp(0, 0),
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2015-01-16T03:10:39Z"),
            "lastHeartbeatRecv" : ISODate("2015-01-16T03:10:40Z"),
            "pingMs" : 0,
            "lastHeartbeatMessage" : "initial sync need a member to be primary or secondary to do our initial sync"
        }
    ],
    "ok" : 1
}
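Note: the node named localhost:28010 reports a stateStr of SECONDARY. Why? The lastHeartbeatMessage fields of the other two members explain it: they are still waiting for a primary or secondary from which to run their initial sync. The rs.initiate call that loaded the Replica Set configuration also reported "Config now saved locally. Should come online in about a minute." The set is therefore still initializing. Checking the status again a little later gives: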
rs1:PRIMARY> rs.status();
{
    "set" : "rs1",
    "date" : ISODate("2015-01-16T03:13:09Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "localhost:28010",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 408,
            "optime" : Timestamp(1421377833, 1),
            "optimeDate" : ISODate("2015-01-16T03:10:33Z"),
            "electionTime" : Timestamp(1421377841, 1),
            "electionDate" : ISODate("2015-01-16T03:10:41Z"),
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "localhost:28011",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 156,
            "optime" : Timestamp(1421377833, 1),
            "optimeDate" : ISODate("2015-01-16T03:10:33Z"),
            "lastHeartbeat" : ISODate("2015-01-16T03:13:09Z"),
            "lastHeartbeatRecv" : ISODate("2015-01-16T03:13:07Z"),
            "pingMs" : 1,
            "syncingTo" : "localhost:28010"
        },
        {
            "_id" : 2,
            "name" : "localhost:28012",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 156,
            "optime" : Timestamp(1421377833, 1),
            "optimeDate" : ISODate("2015-01-16T03:10:33Z"),
            "lastHeartbeat" : ISODate("2015-01-16T03:13:08Z"),
            "lastHeartbeatRecv" : ISODate("2015-01-16T03:13:09Z"),
            "pingMs" : 0,
            "syncingTo" : "localhost:28010"
        }
    ],
    "ok" : 1
}
The Replica Set is now fully initialized and every node is healthy. The member with state=1 is the PRIMARY and the members with state=2 are SECONDARY nodes. Both SECONDARY nodes sync their data from port 28010, as the syncingTo field shows.
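Reading a status document by eye gets tedious, so the sketch below shows one way to reduce it to the facts discussed here: who is primary, which members are secondaries and where they sync from, and which members are unhealthy. It runs on a plain object shaped like the rs.status() output, trimmed to the fields it needs; the function name `summarizeStatus` and the trimmed sample are assumptions for illustration.

```javascript
// Summarize a document shaped like rs.status() output.
// State codes follow the output above: 1 = PRIMARY, 2 = SECONDARY.
function summarizeStatus(status) {
  const primary = status.members.find(function (m) { return m.state === 1; });
  const secondaries = status.members
    .filter(function (m) { return m.state === 2; })
    .map(function (m) {
      return { name: m.name, syncingTo: m.syncingTo || null };
    });
  const unhealthy = status.members
    .filter(function (m) { return m.health !== 1; })
    .map(function (m) { return m.name; });
  return {
    primary: primary ? primary.name : null,
    secondaries: secondaries,
    unhealthy: unhealthy,
  };
}

// Trimmed copy of a healthy three-member status (hypothetical sample data).
const sampleStatus = {
  set: "rs1",
  members: [
    { name: "localhost:28010", health: 1, state: 1 },
    { name: "localhost:28011", health: 1, state: 2, syncingTo: "localhost:28010" },
    { name: "localhost:28012", health: 1, state: 2, syncingTo: "localhost:28010" },
  ],
};

const summary = summarizeStatus(sampleStatus);
console.log(JSON.stringify(summary, null, 2));
```

A member that is still starting up (for example state 5, STARTUP2) appears in none of the three lists, which is itself a useful signal that initialization has not finished.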
rs1:PRIMARY> rs.isMaster();
{
    "setName" : "rs1",
    "setVersion" : 1,
    "ismaster" : true,
    "secondary" : false,
    "hosts" : [
        "localhost:28010",
        "localhost:28012",
        "localhost:28011"
    ],
    "primary" : "localhost:28010",
    "me" : "localhost:28010",
    "maxBsonObjectSize" : 16777216,
    "maxMessageSizeBytes" : 48000000,
    "maxWriteBatchSize" : 1000,
    "localTime" : ISODate("2015-01-16T02:38:58.479Z"),
    "maxWireVersion" : 2,
    "minWireVersion" : 0,
    "ok" : 1
}
rs1:PRIMARY>
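3.1. The oplog (replication operation log)
A MongoDB Replica Set records write operations in a log called the oplog, which was covered earlier in this series. oplog.rs is a fixed-size capped collection in the local database that holds the Replica Set's operation log. By default on 64-bit MongoDB the oplog is fairly large, about 5% of the free disk space. Its size can be changed with the mongod option --oplogSize, which takes a value in megabytes.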
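The key property of a capped collection is that once it is full, new entries overwrite the oldest ones. The sketch below models that behaviour with a small in-memory log. It is a simplification under stated assumptions: the class `CappedLog` is hypothetical, and it caps by entry count, whereas a real capped collection is capped by size in bytes.

```javascript
// Toy model of a capped, insertion-ordered log.
// Assumption: the cap is a number of entries, not bytes as in MongoDB.
class CappedLog {
  constructor(maxEntries) {
    if (!Number.isInteger(maxEntries) || maxEntries < 1) {
      throw new Error("maxEntries must be a positive integer");
    }
    this.maxEntries = maxEntries;
    this.entries = [];
  }

  // Append an entry; drop the oldest one when the cap is exceeded.
  append(entry) {
    this.entries.push(entry);
    if (this.entries.length > this.maxEntries) {
      this.entries.shift();
    }
  }

  oldest() {
    return this.entries[0];
  }

  newest() {
    return this.entries[this.entries.length - 1];
  }
}

const log = new CappedLog(3);
for (let ts = 1; ts <= 5; ts++) {
  log.append({ ts: ts, op: "i", ns: "test.student" });
}
console.log(log.entries.map(function (e) { return e.ts; }));
```

After five appends into a log of three, only the entries with ts 3, 4 and 5 remain. This is why oplog size matters: a secondary that falls further behind than the oldest surviving entry can no longer catch up from the oplog alone.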
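rs1:PRIMARY> use local
switched to db local
rs1:PRIMARY> show collections
me
oplog.rs
startup_log
system.indexes
system.replset
rs1:PRIMARY>
rs1:PRIMARY> db.oplog.rs.find();
{ "ts" : Timestamp(1421375729, 1), "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } }
rs1:PRIMARY>
Field descriptions:
- ts: the timestamp of the operation
- op: the operation type (i = insert, u = update, d = delete, n = no-op)
- ns: the namespace, that is, the collection the operation applies to
- o: the document describing the operation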
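rs1:PRIMARY> db.printReplicationInfo();
configured oplog size:   990MB
log length start to end: 0secs (0hrs)
oplog first event time:  Fri Jan 16 2015 10:35:29 GMT+0800 (CST)
oplog last event time:   Fri Jan 16 2015 10:35:29 GMT+0800 (CST)
now:                     Fri Jan 16 2015 10:45:18 GMT+0800 (CST)
rs1:PRIMARY>
Field descriptions:
- configured oplog size: the configured size of the oplog
- log length start to end: the time span covered by the oplog
- oplog first event time: the time of the first entry in the oplog
- oplog last event time: the time of the last entry in the oplog
- now: the current time
rs1:PRIMARY> db.printSlaveReplicationInfo();
source: localhost:28011
    syncedTo: Thu Jan 01 1970 08:00:00 GMT+0800 (CST)
    1421375729 secs (394826.59 hrs) behind the primary
source: localhost:28012
    syncedTo: Thu Jan 01 1970 08:00:00 GMT+0800 (CST)
    1421375729 secs (394826.59 hrs) behind the primary
rs1:PRIMARY>
Field descriptions:
- source: the host and port of the secondary
- syncedTo: the point in time the secondary has synced to, followed by how far it lags behind the primary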
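The very large lag in this output is simply the difference between the primary's latest optime and a secondary optime of 0 (the Unix epoch), which is what a member reports before it has synced anything. A minimal sketch of that arithmetic follows; the function name `replicationLag` is hypothetical and the inputs are optimes in whole seconds.

```javascript
// Compute how far a secondary is behind the primary, given both optimes
// as seconds since the Unix epoch. Hours are formatted to two decimals.
function replicationLag(primaryOptimeSecs, secondaryOptimeSecs) {
  const secs = primaryOptimeSecs - secondaryOptimeSecs;
  return { secs: secs, hours: (secs / 3600).toFixed(2) };
}

// Primary optime taken from the oplog entry shown earlier: Timestamp(1421375729, 1).
// A secondary that has not synced yet reports an optime of 0.
const lag = replicationLag(1421375729, 0);
console.log(lag.secs + " secs (" + lag.hours + " hrs) behind the primary");
```

Once the secondaries finish their initial sync, both optimes match and the same calculation yields a lag of 0 seconds.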
rs1:PRIMARY> db.system.replset.find();
{ "_id" : "rs1", "version" : 1, "members" : [ { "_id" : 0, "host" : "localhost:28010" }, { "_id" : 1, "host" : "localhost:28011" }, { "_id" : 2, "host" : "localhost:28012" } ] }
rs1:PRIMARY>
This collection holds the Replica Set's configuration. The same information can be viewed by running rs.conf() on any member instance.
3.3. Testing the Replica Set
[root@localhost bin]# ./mongo --port 28010
MongoDB shell version: 2.6.6
connecting to: 127.0.0.1:28010/test
rs1:PRIMARY> db.student.insert({name:"zhangsan",age:20});
WriteResult({ "nInserted" : 1 })
rs1:PRIMARY> db.student.find();
{ "_id" : ObjectId("54b87ca7f663c819d621d590"), "name" : "zhangsan", "age" : 20 }
rs1:PRIMARY>
[root@localhost bin]# ./mongo --port 28011
MongoDB shell version: 2.6.6
connecting to: 127.0.0.1:28011/test
rs1:SECONDARY> show collections
2015-01-16T11:22:54.517+0800 error: { "$err" : "not master and slaveOk=false", "code" : 13435 } at src/mongo/shell/query.js:131
rs1:SECONDARY> db.getMongo().setSlaveOk();
rs1:SECONDARY> show collections;
student
system.indexes
rs1:SECONDARY>
Queries can now be run on this node.
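Note that once the shell is connected to a mongod, the prompt starts with rs1:SECONDARY or rs1:PRIMARY, showing that the session is on a SECONDARY node or the PRIMARY node of the rs1 replica set.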
rs1:SECONDARY> db.student.find();
{ "_id" : ObjectId("54b883fb7bd891605d9c300f"), "name" : "zhangsan", "age" : 20 }
rs1:SECONDARY>
The same operations on port 28012:
[root@localhost bin]# ./mongo --port 28012
MongoDB shell version: 2.6.6
connecting to: 127.0.0.1:28012/test
rs1:SECONDARY> show collections
2015-01-16T11:27:04.747+0800 error: { "$err" : "not master and slaveOk=false", "code" : 13435 } at src/mongo/shell/query.js:131
rs1:SECONDARY> db.getMongo().setSlaveOk();
rs1:SECONDARY> show collections
student
system.indexes
rs1:SECONDARY> db.student.find();
{ "_id" : ObjectId("54b883fb7bd891605d9c300f"), "name" : "zhangsan", "age" : 20 }
rs1:SECONDARY>
Now attempt a write on port 28011:
[root@localhost bin]# ./mongo --port 28011
MongoDB shell version: 2.6.6
connecting to: 127.0.0.1:28011/test
rs1:SECONDARY> db.student.insert({name:"lisi",age:20});
WriteResult({ "writeError" : { "code" : undefined, "errmsg" : "not master" } })
The shell reports that this node is not the master and rejects the write, which matches the Replica Set principles explained in the previous two chapters.
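The rules observed in these sessions can be condensed into a small decision function: writes succeed only on the PRIMARY, and reads on a SECONDARY succeed only after slaveOk has been set. The sketch below is a model of that behaviour, not MongoDB's implementation; the function name `checkOperation` is hypothetical, while the error messages are the ones shown in the shell output above.

```javascript
// Decide whether an operation is accepted by a member in the given state.
// Only the two states exercised in this chapter are modelled.
function checkOperation(kind, memberState, slaveOk) {
  if (memberState !== "PRIMARY" && memberState !== "SECONDARY") {
    throw new Error("unsupported member state: " + memberState);
  }
  if (kind === "write") {
    return memberState === "PRIMARY"
      ? { ok: true }
      : { ok: false, errmsg: "not master" };
  }
  if (kind === "read") {
    if (memberState === "PRIMARY" || slaveOk) {
      return { ok: true };
    }
    return { ok: false, errmsg: "not master and slaveOk=false" };
  }
  throw new Error("unknown operation kind: " + kind);
}

console.log(checkOperation("read", "SECONDARY", false));
console.log(checkOperation("read", "SECONDARY", true));
console.log(checkOperation("write", "SECONDARY", true));
console.log(checkOperation("write", "PRIMARY", false));
```

Setting slaveOk changes only the read path; a write sent to a secondary is rejected either way, which is what keeps a single writer at any moment.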
bye
To test automatic failover, find the mongod process that serves as the primary (port 28010) and stop it with kill -2 (SIGINT):
[root@localhost bin]# ps aux|grep mongod
root  6658 0.8 3.7 3175956 37508 ?     Sl 11:06 0:12 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r0 --fork --port 28010 --dbpath=/usr/local/mongodb/r0 --logpath=/usr/local/mongodb/log/r0.log --logappend
root  7461 0.7 3.7 3144172 37764 ?     Sl 11:06 0:11 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r1 --fork --port 28011 --dbpath=/usr/local/mongodb/r1 --logpath=/usr/local/mongodb/log/r1.log --logappend
root 28166 0.6 3.8 3144152 38520 ?     Sl 11:10 0:08 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r2 --fork --port 28012 --dbpath=/usr/local/mongodb/r2 --logpath=/usr/local/mongodb/log/r2.log --logappend
root 30833 0.0 0.0  103244   832 pts/1 S+ 11:31 0:00 grep mongod
[root@localhost bin]# kill -2 6658
[root@localhost bin]# ps aux|grep mongod
root  7461 0.7 3.7 3158520 37960 ?     Sl 11:06 0:11 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r1 --fork --port 28011 --dbpath=/usr/local/mongodb/r1 --logpath=/usr/local/mongodb/log/r1.log --logappend
root 28166 0.6 3.8 3154396 38616 ?     Sl 11:10 0:08 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r2 --fork --port 28012 --dbpath=/usr/local/mongodb/r2 --logpath=/usr/local/mongodb/log/r2.log --logappend
root 30869 0.0 0.0  103244   832 pts/1 S+ 11:31 0:00 grep mongod
[root@localhost bin]#
Now connect to the mongod on port 28011 and check the replica set status:
rs1:PRIMARY> rs.status()
{
    "set" : "rs1",
    "date" : ISODate("2015-01-16T03:33:15Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "localhost:28010",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(1421378555, 1),
            "optimeDate" : ISODate("2015-01-16T03:22:35Z"),
            "lastHeartbeat" : ISODate("2015-01-16T03:33:14Z"),
            "lastHeartbeatRecv" : ISODate("2015-01-16T03:31:50Z"),
            "pingMs" : 0
        },
        {
            "_id" : 1,
            "name" : "localhost:28011",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 1608,
            "optime" : Timestamp(1421378555, 1),
            "optimeDate" : ISODate("2015-01-16T03:22:35Z"),
            "electionTime" : Timestamp(1421379114, 1),
            "electionDate" : ISODate("2015-01-16T03:31:54Z"),
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "localhost:28012",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 1358,
            "optime" : Timestamp(1421378555, 1),
            "optimeDate" : ISODate("2015-01-16T03:22:35Z"),
            "lastHeartbeat" : ISODate("2015-01-16T03:33:15Z"),
            "lastHeartbeatRecv" : ISODate("2015-01-16T03:33:13Z"),
            "pingMs" : 0,
            "lastHeartbeatMessage" : "syncing to: localhost:28011",
            "syncingTo" : "localhost:28011"
        }
    ],
    "ok" : 1
}
rs1:PRIMARY>
The member on port 28010 now has state 8, described as not reachable, and a health of 0. The member on port 28011 has moved to state 1, PRIMARY. The set therefore now consists of 28011 as the primary and 28012 as a secondary syncing from it, with 28010 down.
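The failover succeeded because two of the three members were still up, and a replica set can elect a primary only while a majority of its voting members can reach each other. The sketch below shows that rule as plain arithmetic; the function names are hypothetical, and it assumes every member has one vote, as in the three-member set used here.

```javascript
// Smallest number of members that forms a majority of the set.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// A primary can be elected only if a majority of voting members is reachable.
function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= majorityNeeded(votingMembers);
}

console.log(majorityNeeded(3));      // 2
console.log(canElectPrimary(3, 2));  // true: one member down, as in this test
console.log(canElectPrimary(3, 1));  // false: two members down
```

With three members the set tolerates the loss of one. If a second member were stopped as well, the survivor could not win an election and would remain or become a secondary.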
rs1:PRIMARY> use test
switched to db test
rs1:PRIMARY> db.student.insert({name:"lisi",age:20});
WriteResult({ "nInserted" : 1 })
rs1:PRIMARY> db.student.find();
{ "_id" : ObjectId("54b883fb7bd891605d9c300f"), "name" : "zhangsan", "age" : 20 }
{ "_id" : ObjectId("54b8876aad5e04c1fe460154"), "name" : "lisi", "age" : 20 }
rs1:PRIMARY>