RHCS (Part 3): Testing the Quorum Mechanism (Phases 3 and 4, Conclusions)

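Phase 3: this time we test an abnormal exit

Ways to simulate a crash:

Method 1: suspend the virtual machine

Method 2: echo c > /proc/sysrq-trigger

 

Restore all nodes (omitted)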

 

[root@web1 ~]# cman_tool status

……

Nodes: 4

Expected votes: 8

Total votes: 8

Node votes: 2

Quorum: 5 

……
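The Quorum value above is consistent with the majority rule stated in the conclusions of this article, Quorum = Expected votes / 2 + 1 with integer division. A minimal sketch of that arithmetic follows; the function name quorum is our own and is not part of cman.

```shell
# quorum EXPECTED_VOTES
# Majority threshold using integer division (hypothetical helper, not a cman command).
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# 4 nodes x 2 votes each = 8 expected votes
quorum 8
```

For the 8 expected votes shown above this prints 5, matching the Quorum line.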

 

=====Step 1: simulate a failure on node web3=====

[root@web1 ~]# ssh root@web3 'echo c > /proc/sysrq-trigger'

Sep 22 22:40:18 web1 openais[4253]: [TOTEM] The token was lost in the OPERATIONAL state.

#Before this point no message about web3 leaving was received; compare with the log from the clean shutdown in Phase 2.

Sep 22 22:40:18 web1 openais[4253]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).

Sep 22 22:40:18 web1 openais[4253]: [TOTEM] Transmit multicast socket send buffer size

……

Sep 22 22:40:30 web1 openais[4253]: [SYNC ] This node is within the primary component and will provide service.

Sep 22 22:40:30 web1 openais[4253]: [TOTEM] entering OPERATIONAL state.

Sep 22 22:40:30 web1 openais[4253]: [CLM  ] got nodejoin message 192.168.1.201

Sep 22 22:40:30 web1 openais[4253]: [CLM  ] got nodejoin message 192.168.1.202

Sep 22 22:40:30 web1 openais[4253]: [CLM  ] got nodejoin message 192.168.1.204

Sep 22 22:40:30 web1 openais[4253]: [CPG  ] got joinlist message from node 1

Sep 22 22:40:30 web1 openais[4253]: [CPG  ] got joinlist message from node 2

Sep 22 22:40:30 web1 openais[4253]: [CPG  ] got joinlist message from node 4

Sep 22 22:40:35 web1 fenced[4272]: fencing node "web3.rocker.com"

Sep 22 22:40:35 web1 fenced[4272]: fence "web3.rocker.com" failed

Sep 22 22:40:40 web1 fenced[4272]: fencing node "web3.rocker.com"

Sep 22 22:40:40 web1 fenced[4272]: fence "web3.rocker.com" failed

Sep 22 22:40:45 web1 fenced[4272]: fencing node "web3.rocker.com"

Sep 22 22:40:45 web1 fenced[4272]: fence "web3.rocker.com" failed

#We use a manual fence device: when the cluster finds that web3 has lost contact, it asks the administrator to fence web3.

 

First, let's look at the node status

[root@web2 ~]# clustat

Cluster Status for mycluster @ Mon Sep 22 22:42:33 2014

Member Status: Quorate #the cluster is available

 

 Member Name                                     ID   Status

 ------ ----                                     ----------

 web1.rocker.com                                     1 Online,rgmanager

 web2.rocker.com                                     2 Online,Local, rgmanager

 web3.rocker.com                                     3 Offline

 web4.rocker.com                                     4 Online,rgmanager

 

 Service Name                           Owner (Last)                           State        

 ------- ----                           ----- ------                           -----        

 service:myservice                      web3.rocker.com                        started  

#But the resource has not been relocated yet!

 

Check the quorum

[root@web2 ~]# cman_tool status

……

Nodes: 3

Expected votes: 8

Total votes: 6

Node votes: 2

Quorum: 5  #the difference is here

#Total votes changed, which means web3's votes are no longer counted, yet Quorum did not change.

……
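The check the cluster applies here can be read straight off this output: it stays quorate as long as Total votes is at least Quorum. As a rough sketch, the helper below (is_quorate is a name invented for this example) reads cman_tool status text on standard input and applies that comparison; it assumes the "Total votes:" and "Quorum:" line format shown above.

```shell
# is_quorate: read `cman_tool status` text on stdin and print "quorate"
# when Total votes >= Quorum, otherwise "inquorate".
# Hypothetical helper; it only parses the two lines shown in this article.
is_quorate() {
    awk -F': *' '
        $1 == "Total votes" { total = $2 + 0 }
        $1 == "Quorum"      { quorum = $2 + 0 }   # "5 Activity blocked" still converts to 5
        END { if (total >= quorum) print "quorate"; else print "inquorate" }
    '
}

# Values taken from the output above
printf 'Total votes: 6\nQuorum: 5\n' | is_quorate
```

On a live node the same function could be fed with `cman_tool status | is_quorate`.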

 

Test it

[Screenshot: wKioL1QjxfHCvALsAAEAkF1pO0Q309.jpg]

 

=====Step 2: manually fence node web3=====

[root@web2 ~]# fence_ack_manual -n web3.rocker.com

 

Warning:  If the node "web3.rocker.com" has not been manually fenced

(i.e. power cycled or disconnected from shared storage devices)

the GFS file system may become corrupted and all its data

unrecoverable!  Please verify that the node shown above has

been reset or disconnected from storage.

 

Are you certain you want to continue? [yN] y

can't open /tmp/fence_manual.fifo: No such file or directory

 

[root@web2 ~]# touch /tmp/fence_manual.fifo

[root@web2 ~]# fence_ack_manual -n web3.rocker.com -e

 

Warning:  If the node "web3.rocker.com" has not been manually fenced

(i.e. power cycled or disconnected from shared storage devices)

the GFS file system may become corrupted and all its data

unrecoverable!  Please verify that the node shown above has

been reset or disconnected from storage.

 

Are you certain you want to continue? [yN] y

done

 

#fence succeeded

#tail /var/log/messages

Sep 22 22:52:25 web1 fenced[4272]: fence "web3.rocker.com" overridden by administrator intervention

 

Check the node status again

[root@web1 ~]# clustat

Cluster Status for mycluster @ Mon Sep 22 22:54:47 2014

Member Status: Quorate

 

 Member Name                                     ID   Status

 ------ ----                                     ----------

 web1.rocker.com                                     1 Online,Local, rgmanager

 web2.rocker.com                                     2 Online,rgmanager

 web3.rocker.com                                     3 Offline

 web4.rocker.com                                     4 Online,rgmanager

 

 Service Name                           Owner (Last)                           State        

 ------- ----                           ----- ------                           -----        

 service:myservice                      web2.rocker.com                        started

#The resource has been relocated

Test

[Screenshot: wKioL1QjxfDirjVlAABKP7ZBFd0417.jpg]

 

Check the quorum

[root@web1 ~]# cman_tool status

……

Nodes: 3

Expected votes: 8

Total votes: 6

Node votes: 2

Quorum: 5  #Quorum is unchanged, because Expected votes did not change either

……

 

=====Step 3: keep kicking nodes out! This time, suspend the web2 virtual machine=====

Sep 22 22:57:14 web1 openais[4253]: [TOTEM] The token was lost in the OPERATIONAL state.

Sep 22 22:57:14 web1 openais[4253]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).

Sep 22 22:57:14 web1 openais[4253]: [TOTEM] Transmit multicast socket send buffer size (221184 bytes).

Sep 22 22:57:14 web1 openais[4253]: [TOTEM] entering GATHER state from 2.

……

Sep 22 22:57:26 web1 openais[4253]: [CMAN ] quorum lost, blocking activity

……

Sep 22 22:57:26 web1 openais[4253]: [TOTEM] entering OPERATIONAL state.

Sep 22 22:57:26 web1 openais[4253]: [CLM  ] got nodejoin message 192.168.1.201

Sep 22 22:57:26 web1 ccsd[4247]: Cluster is not quorate.  Refusing connection.

Sep 22 22:57:26 web1 openais[4253]: [CLM  ] got nodejoin message 192.168.1.204

Sep 22 22:57:26 web1 ccsd[4247]: Error while processing connect: Connection refused

Sep 22 22:57:26 web1 openais[4253]: [CPG  ] got joinlist message from node 1

Sep 22 22:57:26 web1 ccsd[4247]: Invalid descriptor specified (-111).

Sep 22 22:57:26 web1 openais[4253]: [CPG  ] got joinlist message from node 4

Sep 22 22:57:26 web1 ccsd[4247]: Someone may be attempting something evil.

Sep 22 22:57:26 web1 ccsd[4247]: Error while processing get: Invalid request descriptor

#Finally, there it is! The cluster is suspended.

 

Error shown in the shell on web1:

Message from syslogd@ at Mon Sep 22 22:57:26 2014 ...

web1 clurgmgrd[4312]: <emerg> #1: Quorum Dissolved

 

[root@web1 ~]# clustat

Service states unavailable: Operation requires quorum

Cluster Status for mycluster @ Mon Sep 22 22:58:05 2014

Member Status: Inquorate #the cluster is suspended!

 

 Member Name                                     ID   Status

 ------ ----                                     ----------

 web1.rocker.com                                     1 Online,Local

 web2.rocker.com                                     2 Offline

 web3.rocker.com                                     3 Offline

 web4.rocker.com                                     4 Online

#The resource was not relocated

 

[root@web1 ~]# cman_tool status

……

Nodes: 2

Expected votes: 8

Total votes: 4

Node votes: 2

Quorum: 5 Activity blocked

#Because Quorum > Total votes, the cluster is suspended

……
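The three cman_tool status outputs of this phase follow one pattern: each crash removes 2 votes from Total votes while Quorum stays at 5. The sketch below replays that sequence for a cluster without a qdisk; crash_sequence is a hypothetical helper written for this article, and it assumes Quorum is fixed at the value computed with all nodes up.

```shell
# crash_sequence NODES NODE_VOTES
# Crash nodes one at a time; Quorum keeps the value computed for the full cluster.
crash_sequence() {
    nodes=$1; node_votes=$2
    quorum=$(( nodes * node_votes / 2 + 1 ))
    alive=$nodes
    while [ "$alive" -ge 1 ]; do
        total=$(( alive * node_votes ))
        if [ "$total" -ge "$quorum" ]; then state=quorate; else state=inquorate; fi
        echo "nodes=$alive total=$total quorum=$quorum $state"
        alive=$(( alive - 1 ))
    done
}

crash_sequence 4 2
```

For 4 nodes with 2 votes each, the line for nodes=2 reads total=4 quorum=5 inquorate, which is the state reached in Step 3.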

************************************************************************ 

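Phase 4: configure qdisk

To handle exactly this situation, we need to configure a qdisk (we already did so earlier, when creating the cluster with system-config-cluster) and start the qdiskd service.

 

On the ss node, split /dev/sdb into two partitions (omitted)

 

On the ss node, edit the configuration file /etc/tgt/targets.conf and add a target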

<target iqn.2008-09.com.example:rocker.use.target>

   backing-store /dev/sdb

</target>

 

Start the iSCSI target service on ss:

[root@ss ~]# service tgtd start

Starting SCSI target daemon: Starting target framework daemon

 

On all web nodes, start the iSCSI initiator and discover the target

[root@web1 ~]# for i in web1 web2 web3 web4; do ssh root@$i 'service iscsi start ; iscsiadm -m discovery -t sendtargets -p 192.168.1.205 ; iscsiadm -m node -p 192.168.1.205 -l'; done

 

Check the device name assigned after login

[root@web1 ~]# dmesg

sd 1:0:0:1: Attached scsi disk sdb

sd 1:0:0:1: Attached scsi generic sg6 type 0

 

[root@web1 ~]# fdisk /dev/sdb -l

 

Disk /dev/sdb: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

 

   Device Boot      Start         End      Blocks  Id  System

/dev/sdb1               1          63      506016  83  Linux

/dev/sdb2              64         126      506047+ 83  Linux

 

Create the qdisk partition:

This only needs to be done on one of the web nodes

[root@web1 ~]# mkqdisk -c /dev/sdb1 -l myqdisk

mkqdisk v0.6.0

Writing new quorum disk label 'myqdisk' to /dev/sdb1.

WARNING: About to destroy all data on /dev/sdb1; proceed [N/y] ? y

Initializing status block for node 1...

……

Initializing status block for node 16...

 

Start the qdiskd service on all nodes

[root@web1 ~]# for i in web1 web2 web3 web4; do ssh root@$i 'service qdiskd start'; done

Starting the Quorum Disk Daemon:[ OK  ]

Starting the Quorum Disk Daemon:[ OK  ]

Starting the Quorum Disk Daemon:[ OK  ]

Starting the Quorum Disk Daemon:[ OK  ]

 

Check the quorum
[root@web1 ~]# cman_tool status

……

Nodes: 4

Expected votes: 8

Quorum device votes: 2  #the qdisk now contributes votes

Total votes: 10

Node votes: 2

Quorum: 6 

……

 

=====Step 1: web3 crashes=====

 

[root@web3 ~]# echo c>/proc/sysrq-trigger

 

[root@web1 ~]# clustat

Cluster Status for mycluster @ Tue Sep 23 08:55:41 2014

Member Status: Quorate #the cluster is available

 

 Member Name                         ID   Status

 ------ ----                         ---- ------

 web1.rocker.com                         1 Online, Local, rgmanager

 web2.rocker.com                         2 Online, rgmanager

 web3.rocker.com                         3 Offline

 web4.rocker.com                         4 Online, rgmanager

 /dev/disk/by-id/scsi-1IET_00010001-p   0 Online, Quorum Disk

#the qdisk is up

 

 Service Name               Owner (Last)               State        

 ------- ----               ----- ------               -----        

 service:myservice          web2.rocker.com            started

#The resource has been relocated

 

[root@web1 ~]# cman_tool status

……

Nodes: 3

Expected votes: 8

Quorum device votes: 2

Total votes: 8

Node votes: 2

Quorum: 6 

……

 

=====Step 2: web2 crashes=====

 

[root@web2 ~]# echo c>/proc/sysrq-trigger

 

[root@web1 ~]# clustat

Cluster Status for mycluster @ Tue Sep 23 08:58:25 2014

Member Status: Quorate

 

 Member Name                         ID   Status

 ------ ----                         ---- ------

 web1.rocker.com                         1 Online, Local,rgmanager

 web2.rocker.com                         2 Offline

 web3.rocker.com                         3 Offline

 web4.rocker.com                         4 Online, rgmanager

 /dev/disk/by-id/scsi-1IET_00010001-p    0 Online, Quorum Disk

 

 Service Name               Owner (Last)               State        

 ------- ----               ----- ------               -----        

 service:myservice          web1.rocker.com            started

 

[root@web1 ~]# cman_tool status

……

Nodes: 2

Expected votes: 8

Quorum device votes: 2

Total votes: 6

Node votes: 2

Quorum: 6             #this time the cluster is not suspended

……

 

=====Step 3: web1 crashes=====

 

[root@web1 ~]# echo c>/proc/sysrq-trigger

 

web4 clurgmgrd[4505]: <emerg> #1: Quorum Dissolved

[root@web4 ~]# clustat

Service states unavailable: Operation requires quorum

Cluster Status for mycluster @ Tue Sep 23 09:02:03 2014

Member Status: Inquorate  #the cluster is suspended

 

 Member Name                         ID   Status

 ------ ----                         ---- ------

 web1.rocker.com                         1 Offline

 web2.rocker.com                         2 Offline

 web3.rocker.com                         3 Offline

 web4.rocker.com                         4 Online, Local

 /dev/disk/by-id/scsi-1IET_00010001-p    0 Online

#The resource was not relocated

 

[root@web4 ~]# cman_tool status

……

Nodes: 1

Expected votes: 8

Quorum device votes: 2

Total votes: 4

Node votes: 2

Quorum: 6 Activity blocked   #suspended

……
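Phases 3 and 4 differ only in how many crashes the cluster absorbs before Total votes drops below Quorum: one without the qdisk, two with it. The sketch below counts that number; max_failures is a hypothetical helper, and it assumes Quorum stays at (all node votes + qdisk votes) / 2 + 1 while nodes crash, as observed in the outputs above.

```shell
# max_failures NODES NODE_VOTES QDISK_VOTES
# Number of node crashes the cluster survives while Quorum keeps its initial value.
max_failures() {
    nodes=$1; node_votes=$2; qdisk_votes=$3
    quorum=$(( (nodes * node_votes + qdisk_votes) / 2 + 1 ))
    failed=0
    # Always keep at least one node alive; stop when one more crash
    # would leave fewer votes than Quorum.
    while [ "$failed" -lt $(( nodes - 1 )) ] &&
          [ $(( (nodes - failed - 1) * node_votes + qdisk_votes )) -ge "$quorum" ]; do
        failed=$(( failed + 1 ))
    done
    echo "$failed"
}

max_failures 4 2 0   # Phase 3: no qdisk
max_failures 4 2 2   # Phase 4: qdisk with 2 votes
```

The two calls print 1 and 2, matching the point at which each phase reported "Activity blocked".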

 

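5. Conclusions:

1) With a clean shutdown, whether or not a resource has to be relocated, the node that shut down is removed from the cluster automatically and Quorum is recalculated, so the cluster is not suspended merely because fewer than half of the nodes remain;

2) With an abnormal crash, when the cluster detects that a node has lost contact it calls on fencing to isolate it; however, Quorum keeps its old value while Total vote drops, so the cluster is suspended once the remaining nodes no longer hold a majority of the votes;

3) Once a qdisk is configured, it acts as a reserve for the vote count by adding its votes to Total vote.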

 

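On the relationship between Quorum, vote, and Total vote:

For example, take 4 nodes with 2 votes each and a qdisk with 2 votes.

Total vote = node votes + qdisk vote; here Total vote = 4X2+2 = 10

Expected vote = the Total vote with all nodes healthy, i.e. all node votes + qdisk vote (note that cman_tool status above prints Expected votes: 8 and lists the qdisk votes on a separate line)

Quorum = expected vote/2+1; here Quorum = 10/2+1 = 6

When a node is shut down cleanly, the cluster recalculates both Total vote and Quorum. For example, if node3 is shut down, the cluster recalculates Total vote = 2X3+2 = 8 and Quorum = 8/2+1 = 5

When a node loses contact with the cluster because of an abnormal shutdown, Total vote is recalculated but Quorum is not. For example, if node2 crashes here, the cluster notifies the fence device to isolate it and then relocates the resource within the Failover Domain. Total vote = 2X3+2 = 8, but Quorum stays at 6. Once further crashes push the recalculated Total vote below Quorum (here after two more nodes die: 2X1+2 = 4 < 6), the whole cluster is suspended.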
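The contrast between the two cases can be put into a few lines of arithmetic. This is only a sketch of the rule as described in this article; quorum_after is a hypothetical helper, not a cman command.

```shell
# quorum_after MODE NODES NODE_VOTES QDISK_VOTES
# MODE=clean: one node shut down cleanly, Quorum is recomputed from the new Total vote.
# MODE=crash: one node crashed, Quorum keeps the value of the full cluster.
quorum_after() {
    mode=$1; nodes=$2; node_votes=$3; qdisk_votes=$4
    full=$(( nodes * node_votes + qdisk_votes ))
    total=$(( full - node_votes ))
    case $mode in
        clean) quorum=$(( total / 2 + 1 )) ;;
        crash) quorum=$(( full / 2 + 1 )) ;;
        *) echo "usage: quorum_after clean|crash NODES NODE_VOTES QDISK_VOTES" >&2
           return 1 ;;
    esac
    echo "total=$total quorum=$quorum"
}

quorum_after clean 4 2 2
quorum_after crash 4 2 2
```

With the example values (4 nodes, 2 votes each, qdisk with 2 votes) the first call prints total=8 quorum=5 and the second total=8 quorum=6.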

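By this reasoning, raising the qdisk vote should let the cluster keep working with only one server left. Note that qdisk vote=4 is not quite enough under the formula above: Quorum = (4X2+4)/2+1 = 7, while one node plus the qdisk gives only 2+4 = 6. It takes at least qdisk vote=5, where Quorum = 13/2+1 = 7 (integer division) and one node plus the qdisk gives 2+5 = 7. This was not tested here.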
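This inference can be checked with the Quorum formula above before trying it on a real cluster. The sketch below searches for the smallest qdisk vote count at which a single surviving node plus the qdisk still reaches Quorum; min_qdisk_votes is a hypothetical helper, and the result is arithmetic only, not something tested in this article.

```shell
# min_qdisk_votes NODES NODE_VOTES
# Smallest qdisk vote count q such that NODE_VOTES + q >= Quorum,
# where Quorum = (NODES * NODE_VOTES + q) / 2 + 1 (integer division).
min_qdisk_votes() {
    nodes=$1; node_votes=$2
    q=0
    while :; do
        quorum=$(( (nodes * node_votes + q) / 2 + 1 ))
        if [ $(( node_votes + q )) -ge "$quorum" ]; then
            echo "$q"
            return 0
        fi
        q=$(( q + 1 ))
    done
}

min_qdisk_votes 4 2
```

For 4 nodes with 2 votes each this prints 5: with qdisk vote=4, Quorum is 7 but one node plus the qdisk holds only 6 votes, whereas with qdisk vote=5 both values are 7.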

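Some sources say that a cluster on a GFS file system will hang once only one node is left. I do not understand the principle behind this, nor how to set up an experiment for it; any pointers are welcome.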

This article comes from the "Rocker" blog; please keep this attribution: http://rocker.blog.51cto.com/6314218/1558158
