
Building a Linux High-Availability Cluster with RHCS

Basic environment preparation

Environment topology diagram

Basic Linux service settings

Disable iptables

# /etc/init.d/iptables stop

# chkconfig iptables off

# chkconfig --list | grep iptables

Disable SELinux

# setenforce 0

# vi /etc/selinux/config

# This file controls the state of SELinux on the system.

# SELINUX= can take one of these three values:

# enforcing - SELinux security policy is enforced.

# permissive - SELinux prints warnings instead of enforcing.

# disabled - No SELinux policy is loaded.

SELINUX=disabled

# SELINUXTYPE= can take one of these two values:

# targeted - Only targeted network daemons are protected.

# strict - Full SELinux protection.

SELINUXTYPE=targeted
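Note that setenforce 0 only switches the running system to permissive mode; SELINUX=disabled takes effect at the next reboot. As a minimal sketch for checking what the file will apply after a reboot, the helper below (a hypothetical function name, not part of any SELinux tool) reads the SELINUX= line from a config file:

```shell
# Print the SELINUX= value from an SELinux config file.
# Comment lines are skipped because the pattern is anchored at line start,
# and SELINUXTYPE= does not match "SELINUX=".
# selinux_configured_mode is a hypothetical helper name for this sketch.
selinux_configured_mode() {
    conf="${1:-/etc/selinux/config}"
    sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "$conf" | tail -n 1
}
```

Running selinux_configured_mode with no argument reads /etc/selinux/config, while getenforce reports the mode the running kernel is using right now.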

Disable NetworkManager

# /etc/init.d/NetworkManager stop

# chkconfig NetworkManager off

# chkconfig --list | grep NetworkManager

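Mutual SSH trust between the two nodes

Generate the key pairs on both nodes:

# mkdir ~/.ssh

# chmod 700 ~/.ssh

# ssh-keygen -t rsa

(press Enter three times to accept the default file and an empty passphrase)

# ssh-keygen -t dsa

(press Enter three times again)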

Run on N1PMCSAP01

# cat ~/.ssh/*.pub >> ~/.ssh/authorized_keys

# ssh N1PMCSAP02 'cat ~/.ssh/*.pub' >> ~/.ssh/authorized_keys

(answer yes to the host key prompt, then enter the password for N1PMCSAP02)

# scp ~/.ssh/authorized_keys N1PMCSAP02:~/.ssh/authorized_keys

Storage multipath configuration

Use the autoscan.sh script to rescan the IBM storage paths

Look up the WWIDs of the storage LUNs

NAME        WWID                                 Capacity   Paths
Dataqdisk   360050763808101269800000000000003    5 GB       4
Data        360050763808101269800000000000001    328 GB     4

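Create the multipath configuration file

# vi /etc/multipath.conf

The blacklist devnode defaults to ^sd[a] and must be changed to ^sd[i]. In this environment the root partition is on sdi, so this rule keeps the local disk out of multipathing.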
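The original configuration file was shown only as a screenshot, so the following is a sketch of what such a file might contain, not the author's actual file. It uses the WWIDs from the table above and the sdi blacklist rule; the alias names (data matches the /dev/mapper/data device seen later, dataqdisk is a guess) and the defaults section are assumptions. The function only prints the text and writes nothing under /etc.

```shell
# Print a minimal multipath.conf sketch to stdout.
# ASSUMPTIONS: the alias names and the defaults section are illustrative;
# only the WWIDs and the ^sd[i] blacklist rule come from the article.
render_multipath_conf() {
    cat <<'EOF'
defaults {
    user_friendly_names yes
}
blacklist {
    devnode "^sd[i]"
}
multipaths {
    multipath {
        wwid  360050763808101269800000000000003
        alias dataqdisk
    }
    multipath {
        wwid  360050763808101269800000000000001
        alias data
    }
}
EOF
}
```

After reviewing the output, it could be redirected to a scratch file and compared with the live /etc/multipath.conf before anything is replaced.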

After the configuration is complete, restart the multipathd service

# /etc/init.d/multipathd restart

Use the multipath -v2 command to refresh the storage paths

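NIC bonding

NIC bonding topology diagram

Edit the module configuration file

# vi /etc/modules.conf

Create the bonding interface configuration files

Bond configuration on N1PMCSAP01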

[root@N1PMCSAP01 /]# vi /etc/sysconfig/network-scripts/ifcfg-eth1

DEVICE=eth1

# HWADDR=A0:36:9F:DA:DA:CD

TYPE=Ethernet

UUID=3ca5c4fe-44cd-4c50-b3f1-8082e1c1c94d

ONBOOT=yes

NM_CONTROLLED=no

BOOTPROTO=none

MASTER=bond0

SLAVE=yes

[root@N1PMCSAP01 /]# vi /etc/sysconfig/network-scripts/ifcfg-eth3

DEVICE=eth3

# HWADDR=A0:36:9F:DA:DA:CB

TYPE=Ethernet

UUID=1d47913a-b11c-432c-b70f-479a05da2c71

ONBOOT=yes

NM_CONTROLLED=no

BOOTPROTO=none

MASTER=bond0

SLAVE=yes

[root@N1PMCSAP01 /]# vi /etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0

# HWADDR=A0:36:9F:DA:DA:CC

TYPE=Ethernet

UUID=a099350a-8dfa-4d3f-b444-a08f9703cdc2

ONBOOT=yes

NM_CONTROLLED=no

BOOTPROTO=static

IPADDR=10.51.66.11

NETMASK=255.255.248.0

GATEWAY=10.51.71.254

Bond configuration on N1PMCSAP02

[root@N1PMCSAP02 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth1

DEVICE=eth1

# HWADDR=A0:36:9F:DA:DA:D1

TYPE=Ethernet

UUID=8e0abf44-360a-4187-ab65-42859d789f57

ONBOOT=yes

NM_CONTROLLED=no

BOOTPROTO=none

MASTER=bond0

SLAVE=yes

[root@N1PMCSAP02 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth3

DEVICE=eth3

# HWADDR=A0:36:9F:DA:DA:B1

TYPE=Ethernet

UUID=d300f10b-0474-4229-b3a3-50d95e6056c8

ONBOOT=yes

NM_CONTROLLED=no

BOOTPROTO=none

MASTER=bond0

SLAVE=yes

[root@N1PMCSAP02 ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0

# HWADDR=A0:36:9F:DA:DA:D0

TYPE=Ethernet

UUID=2288f4e1-6743-4faa-abfb-e83ec4f9443c

ONBOOT=yes

NM_CONTROLLED=no

BOOTPROTO=static

IPADDR=10.51.66.12

NETMASK=255.255.248.0

GATEWAY=10.51.71.254
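With a 255.255.248.0 netmask it is not obvious at a glance that the gateway 10.51.71.254 is on the same subnet as the bond addresses. The sketch below works through the arithmetic; ip_to_int and network_of are hypothetical helper names introduced for this illustration.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print the network address for an IP address and a netmask.
network_of() {
    ip=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    net=$(( ip & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))"
}
```

network_of 10.51.66.11 255.255.248.0 and network_of 10.51.71.254 255.255.248.0 both print 10.51.64.0, so the gateway is reachable from both bond0 addresses. Once the interfaces are restarted, cat /proc/net/bonding/bond0 shows the bonding mode and the state of each slave.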

Host name configuration

Configure the hosts file on both N1PMCSAP01 and N1PMCSAP02

# vi /etc/hosts

Local RHEL yum repository configuration

# more /etc/yum.repos.d/rhel-source.repo

[rhel_6_iso]

name=local iso

baseurl=file:///media

gpgcheck=1

gpgkey=file:///media/RPM-GPG-KEY-redhat-release

[HighAvailability]

name=HighAvailability

baseurl=file:///media/HighAvailability

gpgcheck=1

gpgkey=file:///media/RPM-GPG-KEY-redhat-release

[LoadBalancer]

name=LoadBalancer

baseurl=file:///media/LoadBalancer

gpgcheck=1

gpgkey=file:///media/RPM-GPG-KEY-redhat-release

[ResilientStorage]

name=ResilientStorage

baseurl=file:///media/ResilientStorage

gpgcheck=1

gpgkey=file:///media/RPM-GPG-KEY-redhat-release

[ScalableFilesystem]

name=ScalableFileSystem

baseurl=file:///media/ScalableFileSystem

gpgcheck=1

gpgkey=file:///media/RPM-GPG-KEY-redhat-release

File system formatting

[root@N1PMCSAP01 /]# pvdisplay

connect() failed on local socket: No such file or directory

Internal cluster locking initialisation failed.

WARNING: Falling back to local file-based locking.

Volume Groups with the clustered attribute will be inaccessible.

--- Physical volume ---

PV Name /dev/sdi2

VG Name VolGroup

PV Size 556.44 GiB / not usable 3.00 MiB

Allocatable yes (but full)

PE Size 4.00 MiB

Total PE 142448

Free PE 0

Allocated PE 142448

PV UUID 0fSZ8Q-Ay1W-ef2n-9ve2-RxzM-t3GV-u4rrQ2

--- Physical volume ---

PV Name /dev/mapper/data

VG Name vg_data

PV Size 328.40 GiB / not usable 1.60 MiB

Allocatable yes (but full)

PE Size 4.00 MiB

Total PE 84070

Free PE 0

Allocated PE 84070

PV UUID kJvd3t-t7V5-MULX-7Kj6-OI2f-vn3r-QXN8tr

[root@N1PMCSAP01 /]# vgdisplay

System ID

Format lvm2

Metadata Areas 1

Metadata Sequence No 4

VG Access read/write

VG Status resizable

MAX LV 0

Cur LV 3

Open LV 3

Max PV 0

Cur PV 1

Act PV 1

VG Size 556.44 GiB

PE Size 4.00 MiB

Total PE 142448

Alloc PE / Size 142448 / 556.44 GiB

Free PE / Size 0 / 0

VG UUID 6q2td7-AxWX-4K4K-8vy6-ngRs-IIdP-peMpCU

--- Volume group ---

VG Name vg_data

System ID

Format lvm2

Metadata Areas 1

Metadata Sequence No 3

VG Access read/write

VG Status resizable

MAX LV 0

Cur LV 1

Open LV 1

Max PV 0

Cur PV 1

Act PV 1

VG Size 328.40 GiB

PE Size 4.00 MiB

Total PE 84070

Alloc PE / Size 84070 / 328.40 GiB

Free PE / Size 0 / 0

VG UUID GfMy0O-QcmQ-pkt4-zf1i-yKpu-6c2i-JUoSM2

[root@N1PMCSAP01 /]# lvdisplay

--- Logical volume ---

LV Path /dev/vg_data/lv_data

LV Name lv_data

VG Name vg_data

LV UUID 1AMJnu-8UnC-mmGb-7s7N-P0Wg-eeOj-pXrHV6

LV Write Access read/write

LV Creation host, time N1PMCSAP01, 2017-05-26 11:23:04 -0400

LV Status available

# open 1

LV Size 328.40 GiB

Current LE 84070

Segments 1

Allocation inherit

Read ahead sectors auto

- currently set to 256

Block device 253:5
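The output above shows the finished layout but not the commands that produced it. As a hedged sketch, a layout like this could be built with the standard LVM tools as follows. The function only prints the commands so they can be reviewed first; the ext4 filesystem type is an assumption (the article never names it), and /u01 is the mount point that appears in the cluster configuration later on.

```shell
# Print (not run) the commands that would create the vg_data layout.
# ASSUMPTION: ext4 is a guess; the article does not state the filesystem type.
# lvm_plan is a hypothetical helper name for this sketch.
lvm_plan() {
    dev="${1:-/dev/mapper/data}"   # multipath device from the pvdisplay output
    fstype="${2:-ext4}"            # assumed filesystem type
    echo "pvcreate $dev"
    echo "vgcreate vg_data $dev"
    echo "lvcreate -l 100%FREE -n lv_data vg_data"
    echo "mkfs.$fstype /dev/vg_data/lv_data"
    echo "mkdir -p /u01"
}
```

Because the cluster mounts the file system as a managed resource, the volume would normally stay out of /etc/fstab on both nodes.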

RHCS software component installation

# yum -y install cman ricci rgmanager luci

Set the password for the ricci user

# passwd ricci

ricci password: redhat

RHCS cluster setup

LUCI graphical interface

Log in with a browser at https://10.51.56.11:8084/

User name root, password redhat (the default password has not been changed)

Cluster creation

Click Manage Cluster in the upper-left corner to create the cluster

Node Name         Node ID   Votes   Ricci user   Ricci password   Hostname
N1PMCSAP01-PRIV   1         1       ricci        redhat           N1PMCSAP01-PRIV
N1PMCSAP02-PRIV   2         1       ricci        redhat           N1PMCSAP02-PRIV

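Fence Device configuration

Because the cluster has only two nodes, a split-brain situation can occur. When a host fails, the fence mechanism decides which host is removed from the cluster. The recommended practice is to fence through the management port integrated on the motherboard: IBM calls it the IMM, HPE calls it iLO, and Dell calls it iDRAC. Fencing forces the server to power-cycle and go through POST again, which releases the resources it was holding. In the photo, the RJ45 port in the upper-left corner is the IMM port.

Fence device parameters for N1PMCSAP01: user name USERID, password PASSW0RD

Fence device parameters for N1PMCSAP02: user name USERID, password PASSW0RD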

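Failover Domain configuration

Cluster resource configuration

An application is normally defined together with the resources it depends on, for example an IP address, a storage device, and a service script (the application itself).

The attributes of each resource are set in the cluster resource configuration. In the screenshot, 10.51.66.1 is the IP resource, which defines the IP address of the clustered application.

lv_data is the cluster disk resource, which defines the mount point and the location of the underlying block device.

McsCluster is the application start script resource, which defines the location of the start script.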

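Service Groups configuration

A Service Group defines one application or a group of applications, together with all the resources the application needs and their start priority. From this page the administrator can also manually move the service to another host and set the recovery policy.

Other advanced settings

To keep the nodes from fencing each other back and forth while joining the cluster, Post Join Delay is set to 3 seconds here.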
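In cluster.conf this setting is stored as the post_join_delay attribute of the fence_daemon element. The configuration excerpt in this article does not include that line, so the sample value in the usage note is an assumption based on the 3 seconds mentioned above. A small sketch for reading the value back from a file (post_join_delay_of is a hypothetical helper name):

```shell
# Print the post_join_delay value found in a cluster.conf-style file, if any.
# Prints nothing when the file has no fence_daemon element with that attribute.
post_join_delay_of() {
    sed -n 's/.*<fence_daemon[^>]*post_join_delay="\([0-9]*\)".*/\1/p' "$1"
}
```

For a file that contains a line such as <fence_daemon post_join_delay="3"/>, the function prints 3.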

View the cluster configuration file

Finally, the complete cluster configuration can be viewed with the following command

# cat /etc/cluster/cluster.conf

[root@N1PMCSAP01 ~]# vi /etc/cluster/cluster.conf

<fence>

<method name="N1PMCSAP01_Method">

<device name="N1PMCSAP01_FD"/>

</method>

</fence>

</clusternode>

<clusternode name="N1PMCSAP02-PRIV" nodeid="2">

<fence>

<method name="N1PMCSAP02_Method">

<device name="N1PMCSAP02_FD"/>

</method>

</fence>

</clusternode>

</clusternodes>

<cman expected_votes="1" two_node="1"/>

<rm>

<failoverdomains>

<failoverdomain name="N1PMCSAP-FD" nofailback="1" ordered="1">

<failoverdomainnode name="N1PMCSAP01-PRIV" priority="1"/>

<failoverdomainnode name="N1PMCSAP02-PRIV" priority="2"/>

</failoverdomain>

</failoverdomains>

<resources>

<ip address="10.51.66.1" sleeptime="10"/>

<fs device="/dev/vg_data/lv_data" force_unmount="1" fsid="25430" mountpoint="/u01" name="lv_data" self_fence="1"/>

<script file="/home/mcs/cluster/McsCluster" name="McsCluster"/>

</resources>

<service domain="N1PMCSAP-FD" name="APP" recovery="disable">

<fs ref="lv_data">

<ip ref="10.51.66.1"/>

</fs>

<script ref="McsCluster"/>

</service>

</rm>

<fencedevices>

<fencedevice agent="fence_imm" ipaddr="10.51.188.177" login="USERID" name="N1PMCSAP01_FD" passwd="PASSW0RD"/>

<fencedevice agent="fence_imm" ipaddr="10.51.188.178" login="USERID" name="N1PMCSAP02_FD" passwd="PASSW0RD"/>

</fencedevices>

</cluster>

This configuration file is kept identical on both hosts, N1PMCSAP01 and N1PMCSAP02.
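One way to confirm that the two copies really are identical is to stream the peer's file over SSH and compare it byte for byte with the local one. This is a sketch that relies on the SSH trust configured earlier; matches_stdin is a hypothetical helper name.

```shell
# Compare a local file with a copy arriving on stdin.
# Returns success only when the contents are identical.
matches_stdin() {
    cmp -s - "$1"
}

# Usage on N1PMCSAP01 (kept as a comment so nothing runs on load):
#   ssh N1PMCSAP02 cat /etc/cluster/cluster.conf \
#       | matches_stdin /etc/cluster/cluster.conf && echo "in sync"
```

If the files differ, the comparison fails silently with a non-zero exit status, which makes it easy to use in a monitoring script.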

Using the RHCS cluster

Check the cluster status

Use the clustat command to view the running status

[root@N1PMCSAP01 /]# clustat -l

Cluster Status for N1PMCSAP @ Fri May 26 14:13:25 2017

Member Status: Quorate

Member Name ID Status

------ ---- ---- ------

N1PMCSAP01-PRIV 1 Online, Local, rgmanager

N1PMCSAP02-PRIV 2 Online, rgmanager

Service Information

------- -----------

Service Name : service:APP

Current State : started (112)

Flags : none (0)

Owner : N1PMCSAP01-PRIV

Last Owner : N1PMCSAP01-PRIV

Last Transition : Fri May 26 13:55:45 2017

Current State shows that the service is running

Owner is the node that is currently running the service
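For scripted checks, the two fields described above can be pulled out of the clustat -l output. The following is a sketch that assumes the "Field : value" layout shown in the sample output; service_state_owner is a hypothetical helper name.

```shell
# Read `clustat -l` output on stdin and print "<state> <owner>".
# "Last Owner" is ignored because the Owner pattern is anchored at the
# start of the field (leading spaces allowed).
service_state_owner() {
    awk -F' *: *' '
        $1 ~ /Current State/ { split($2, s, " "); state = s[1] }
        $1 ~ /^ *Owner/      { owner = $2; sub(/ +$/, "", owner) }
        END { print state, owner }
    '
}

# Usage on a cluster node (kept as a comment so nothing runs on load):
#   clustat -l | service_state_owner
```

Fed the sample output above, the function prints the state word started followed by N1PMCSAP01-PRIV.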

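Manual failover

[root@N1PMCSAP01 /]# clusvcadm -r APP -m N1PMCSAP02-PRIV

# Relocate the APP service group from node 1 to node 2 (-r takes the service group name, -m the target cluster member)

[root@N1PMCSAP01 /]# clusvcadm -d APP

# Manually stop (disable) the APP service

[root@N1PMCSAP01 /]# clusvcadm -e APP

# Manually start (enable) the APP service

[root@N1PMCSAP01 /]# clusvcadm -M APP -m N1PMCSAP02-PRIV

# Live-migrate the service to N1PMCSAP02-PRIV; -M applies to virtual machine services only, while the preferred owner is determined by the priorities in the failover domain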

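Automatic failover

When a server suffers a hardware failure, or its peer can no longer be reached over the heartbeat network, the RHCS cluster automatically reboots the failed server and moves the resources to the healthy server. After the hardware has been repaired, the administrator can choose whether to move the service back.

Note:

Do not unplug the heartbeat network of both servers at the same time; doing so causes split-brain.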

This article comes from the “袁伟烨IT技术” blog; please keep this attribution: http://popeyeywy.blog.51cto.com/745223/1930942
