Building a Linux High-Availability Cluster with RHCS
Base Environment Preparation
Environment Topology Diagram |
Basic Linux Service Settings |
Disable iptables:
# /etc/init.d/iptables stop
# chkconfig iptables off
# chkconfig --list | grep iptables
Disable SELinux:
# setenforce 0
# vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Only targeted network daemons are protected.
# strict - Full SELinux protection.
SELINUXTYPE=targeted
Disable NetworkManager:
# /etc/init.d/NetworkManager stop
# chkconfig NetworkManager off
# chkconfig --list | grep NetworkManager
SSH Mutual Trust Between the Two Nodes |
# mkdir ~/.ssh
# chmod 700 ~/.ssh
# ssh-keygen -t rsa
(press Enter three times to accept the defaults)
# ssh-keygen -t dsa
(press Enter three times to accept the defaults)
Run on N1PMCSAP01:
# cat ~/.ssh/*.pub >> ~/.ssh/authorized_keys
# ssh N1PMCSAP02 'cat ~/.ssh/*.pub' >> ~/.ssh/authorized_keys
(answer yes to the host-key prompt, then enter the password for N1PMCSAP02)
# scp ~/.ssh/authorized_keys N1PMCSAP02:~/.ssh/authorized_keys
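Once the keys have been exchanged, it is worth confirming that each node can log in to the other without a password prompt. The sketch below assumes that both host names resolve and that root is used on both sides, as in the steps above. check_trust is a hypothetical helper name, not part of OpenSSH.

```shell
# check_trust is a hypothetical helper: it attempts a non-interactive
# login to every host passed as an argument and reports the result.
check_trust() {
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "$host: OK"
        else
            echo "$host: FAILED"
        fi
    done
}
```

Running `check_trust N1PMCSAP01 N1PMCSAP02` on each node should report OK for both hosts if the trust is set up correctly.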
Storage Multipath Configuration |
Use the autoscan.sh script to rescan the IBM storage paths.
Check the WWIDs of the underlying storage LUNs:
Name | WWID | Capacity | Paths |
Dataqdisk | 360050763808101269800000000000003 | 5GB | 4 |
Data | 360050763808101269800000000000001 | 328GB | 4 |
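Create the multipath configuration file:
# vi /etc/multipath.conf
The blacklist devnode defaults to ^sd[a]; change it to ^sd[i]. In this environment the root partition is on sdi, so this entry excludes the local disk from multipathing.
After the configuration is complete, restart the multipathd service:
# /etc/init.d/multipathd restart
Refresh the storage paths with the multipath -v2 command.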
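The body of /etc/multipath.conf is not reproduced in this article. A minimal sketch consistent with the WWID table and the blacklist change described above might look like the following. The alias data matches the /dev/mapper/data device that appears later in the LVM output; the alias dataqdisk is an assumption, as is the absence of any other tuning sections.

```
blacklist {
    # Local disk that holds the root partition in this environment
    devnode "^sd[i]"
}

multipaths {
    multipath {
        # 328GB data LUN, exposed as /dev/mapper/data
        wwid  360050763808101269800000000000001
        alias data
    }
    multipath {
        # 5GB quorum disk LUN; this alias name is an assumption
        wwid  360050763808101269800000000000003
        alias dataqdisk
    }
}
```

After a restart of multipathd, `multipath -ll` should list each alias with its four paths.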
NIC Bonding |
NIC bonding topology diagram
Edit /etc/modules.conf:
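The contents of the module configuration are not shown in the source. The fragment below is a sketch of what such a file commonly contains. The mode=1 (active-backup) and miimon=100 values are assumptions for illustration, not settings confirmed for this environment. On RHEL 6 the same lines are usually placed in a file under /etc/modprobe.d/ (for example bonding.conf) rather than in /etc/modules.conf.

```
# Load the bonding driver for the bond0 interface
alias bond0 bonding
# Assumed values: active-backup mode, link check every 100 ms
options bond0 mode=1 miimon=100
```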
Create the NIC bonding configuration files.
N1PMCSAP01 bond configuration
[root@N1PMCSAP01 /]# vi /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
# HWADDR=A0:36:9F:DA:DA:CD
TYPE=Ethernet
UUID=3ca5c4fe-44cd-4c50-b3f1-8082e1c1c94d
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
[root@N1PMCSAP01 /]# vi /etc/sysconfig/network-scripts/ifcfg-eth3
DEVICE=eth3
# HWADDR=A0:36:9F:DA:DA:CB
TYPE=Ethernet
UUID=1d47913a-b11c-432c-b70f-479a05da2c71
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
[root@N1PMCSAP01 /]# vi /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
# HWADDR=A0:36:9F:DA:DA:CC
TYPE=Ethernet
UUID=a099350a-8dfa-4d3f-b444-a08f9703cdc2
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=static
IPADDR=10.51.66.11
NETMASK=255.255.248.0
GATEWAY=10.51.71.254
N1PMCSAP02 bond configuration
[root@N1PMCSAP02 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
# HWADDR=A0:36:9F:DA:DA:D1
TYPE=Ethernet
UUID=8e0abf44-360a-4187-ab65-42859d789f57
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
[root@N1PMCSAP02 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth3
DEVICE=eth3
# HWADDR=A0:36:9F:DA:DA:B1
TYPE=Ethernet
UUID=d300f10b-0474-4229-b3a3-50d95e6056c8
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
[root@N1PMCSAP02 ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
# HWADDR=A0:36:9F:DA:DA:D0
TYPE=Ethernet
UUID=2288f4e1-6743-4faa-abfb-e83ec4f9443c
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=static
IPADDR=10.51.66.12
NETMASK=255.255.248.0
GATEWAY=10.51.71.254
Hosts File Configuration |
Configure the hosts file on both N1PMCSAP01 and N1PMCSAP02:
# vi /etc/hosts
RHEL Local Yum Repository Configuration |
more /etc/yum.repos.d/rhel-source.repo
[rhel_6_iso]
name=local iso
baseurl=file:///media
gpgcheck=1
gpgkey=file:///media/RPM-GPG-KEY-redhat-release
[HighAvailability]
name=HighAvailability
baseurl=file:///media/HighAvailability
gpgcheck=1
gpgkey=file:///media/RPM-GPG-KEY-redhat-release
[LoadBalancer]
name=LoadBalancer
baseurl=file:///media/LoadBalancer
gpgcheck=1
gpgkey=file:///media/RPM-GPG-KEY-redhat-release
[ResilientStorage]
name=ResilientStorage
baseurl=file:///media/ResilientStorage
gpgcheck=1
gpgkey=file:///media/RPM-GPG-KEY-redhat-release
[ScalableFilesystem]
name=ScalableFileSystem
baseurl=file:///media/ScalableFileSystem
gpgcheck=1
gpgkey=file:///media/RPM-GPG-KEY-redhat-release
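All five repositories point at /media, so the RHEL 6 installation media must be mounted there before yum can use them. As a quick check, the sketch below compares the repository IDs from the file above with what `yum repolist` reports. verify_repos is a hypothetical helper, and it assumes that each line of the repolist output starts with the repository ID.

```shell
# verify_repos is a hypothetical helper: it checks that every repo id
# defined in rhel-source.repo is listed by "yum repolist".
verify_repos() {
    listing=$(yum repolist 2>/dev/null)
    for repo in rhel_6_iso HighAvailability LoadBalancer ResilientStorage ScalableFilesystem; do
        if echo "$listing" | grep -q "^$repo[[:space:]]"; then
            echo "$repo: enabled"
        else
            echo "$repo: missing"
        fi
    done
}
```

If a repository is reported as missing, running `yum clean all` after mounting the media and repeating the check is a reasonable first step.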
File System Formatting |
[root@N1PMCSAP01 /]# pvdisplay
connect() failed on local socket: No such file or directory
Internal cluster locking initialisation failed.
WARNING: Falling back to local file-based locking.
Volume Groups with the clustered attribute will be inaccessible.
--- Physical volume ---
PV Name /dev/sdi2
VG Name VolGroup
PV Size 556.44 GiB / not usable 3.00 MiB
Allocatable yes (but full)
PE Size 4.00 MiB
Total PE 142448
Free PE 0
Allocated PE 142448
PV UUID 0fSZ8Q-Ay1W-ef2n-9ve2-RxzM-t3GV-u4rrQ2
--- Physical volume ---
PV Name /dev/mapper/data
VG Name vg_data
PV Size 328.40 GiB / not usable 1.60 MiB
Allocatable yes (but full)
PE Size 4.00 MiB
Total PE 84070
Free PE 0
Allocated PE 84070
PV UUID kJvd3t-t7V5-MULX-7Kj6-OI2f-vn3r-QXN8tr
[root@N1PMCSAP01 /]# vgdisplay
--- Volume group ---
VG Name VolGroup
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 4
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 3
Open LV 3
Max PV 0
Cur PV 1
Act PV 1
VG Size 556.44 GiB
PE Size 4.00 MiB
Total PE 142448
Alloc PE / Size 142448 / 556.44 GiB
Free PE / Size 0 / 0
VG UUID 6q2td7-AxWX-4K4K-8vy6-ngRs-IIdP-peMpCU
--- Volume group ---
VG Name vg_data
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 3
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 1
Open LV 1
Max PV 0
Cur PV 1
Act PV 1
VG Size 328.40 GiB
PE Size 4.00 MiB
Total PE 84070
Alloc PE / Size 84070 / 328.40 GiB
Free PE / Size 0 / 0
VG UUID GfMy0O-QcmQ-pkt4-zf1i-yKpu-6c2i-JUoSM2
[root@N1PMCSAP01 /]# lvdisplay
--- Logical volume ---
LV Path /dev/vg_data/lv_data
LV Name lv_data
VG Name vg_data
LV UUID 1AMJnu-8UnC-mmGb-7s7N-P0Wg-eeOj-pXrHV6
LV Write Access read/write
LV Creation host, time N1PMCSAP01, 2017-05-26 11:23:04 -0400
LV Status available
# open 1
LV Size 328.40 GiB
Current LE 84070
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:5
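The output above shows the finished layout but not the commands that produced it. A sketch of how vg_data and lv_data could be created on the multipath device follows. The device and volume names are taken from the output above; the ext4 file system type is an assumption, since the source does not name it. Because these commands destroy existing data, the hypothetical run helper only prints them unless DRY_RUN=0 is set.

```shell
# Hypothetical helper: prints each command, and executes it only when
# DRY_RUN=0 is set, because these steps destroy existing data.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

create_data_lv() {
    run pvcreate /dev/mapper/data || return 1
    run vgcreate vg_data /dev/mapper/data || return 1
    # Use all free extents, matching the fully allocated VG shown above.
    run lvcreate -l 100%FREE -n lv_data vg_data || return 1
    # ext4 is an assumption; the source does not name the file system type.
    run mkfs.ext4 /dev/vg_data/lv_data
}
```

Calling `create_data_lv` prints the four commands for review; `DRY_RUN=0 create_data_lv` executes them, and should be done on one node only because the LUN is shared.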
RHCS Software Component Installation |
# yum -y install cman ricci rgmanager luci
Set the password for the ricci user:
# passwd ricci
ricci password: redhat
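luci manages the nodes through the ricci daemon, so ricci has to be running on both nodes before the cluster can be created in the web interface. The sketch below uses the RHEL 6 SysV tools already used in this article. start_and_enable is a hypothetical helper name.

```shell
# Hypothetical helper for RHEL 6 SysV init: start each named service
# now and enable it at boot.
start_and_enable() {
    for svc in "$@"; do
        service "$svc" start && chkconfig "$svc" on
    done
}
```

For example, `start_and_enable ricci` on both nodes, and `start_and_enable luci` on the node that hosts the web interface.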
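RHCS Cluster Configuration
LUCI Graphical Interface |
Log in with a browser at https://10.51.56.11:8084/
User name: root; password: redhat (the default password was left unchanged)
Cluster Creation |
Click Manage Cluster in the upper-left corner to create the cluster.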
Node Name | NODE ID | Votes | Ricci user | Ricci password | HOSTNAME |
N1PMCSAP01-PRIV | 1 | 1 | ricci | redhat | N1PMCSAP01-PRIV |
N1PMCSAP02-PRIV | 2 | 1 | ricci | redhat | N1PMCSAP02-PRIV |
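Fence Device Configuration |
Because the cluster has only two nodes, split-brain can occur. When a host fails, the fence mechanism decides which host is removed from the cluster. The best practice for fencing is to use the management port integrated on the motherboard: IBM calls it IMM, HPE calls it iLO, and Dell calls it iDRAC. Its purpose is to force the server to power-cycle and go through POST again, which releases the resources it holds. In the figure, the RJ45 port in the upper-left corner is the IMM port.
N1PMCSAP01 fence device parameters: user name USERID, password PASSW0RD
N1PMCSAP02 fence device parameters: user name USERID, password PASSW0RD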
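Before relying on fencing, it helps to confirm from each node that the IMM of the other node answers with these credentials. The sketch below queries the power status only and does not reboot anything. It assumes the fence_imm agent accepts the usual IPMI-style options (-a address, -l login, -p password, -o action); imm_status is a hypothetical helper name, and the IMM addresses are the ones that appear in cluster.conf later in this article.

```shell
# Hypothetical helper: query the power status of an IMM through the
# fence agent, without rebooting anything.
# Usage: imm_status <imm-ip-address>
imm_status() {
    fence_imm -a "$1" -l USERID -p PASSW0RD -o status
}
```

For example, `imm_status 10.51.188.178` run on N1PMCSAP01 checks the IMM of N1PMCSAP02.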
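Failover Domain Configuration |
Cluster Resource Configuration |
In general, an application should include the resources it depends on, such as an IP address, storage, and a service script (the application itself).
The attributes of each resource must be set in the cluster resource configuration. In the figure, 10.51.66.1 is the IP resource, which defines the IP address of the clustered application.
lv_data is the cluster disk resource, which defines the mount point and the location of the physical block device.
McsCluster is the application start script resource, which defines the location of the start script.
Service Groups Configuration |
A Service Group defines one application or a group of applications, including all the resources the application needs and their start priority. In this interface the administrator can manually switch the host on which the service runs and set the recovery policy.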
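Other Advanced Settings |
To prevent the nodes from fencing each other back and forth while joining the cluster, Post Join Delay is set to 3 seconds here.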
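In /etc/cluster/cluster.conf this setting is stored as an attribute of the fence_daemon element, a direct child of the cluster element. The fragment below is a sketch of how it would typically appear; the excerpt of cluster.conf shown in this article does not include this line, so its exact form here is an assumption.

```xml
<!-- Wait 3 seconds after a node joins the fence domain before fencing -->
<fence_daemon post_join_delay="3"/>
```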
Viewing the Cluster Configuration File |
Finally, the complete cluster configuration can be viewed from the command line:
# cat /etc/cluster/cluster.conf
[root@N1PMCSAP01 ~]# vi /etc/cluster/cluster.conf
<fence>
<method name="N1PMCSAP01_Method">
<device name="N1PMCSAP01_FD"/>
</method>
</fence>
</clusternode>
<clusternode name="N1PMCSAP02-PRIV" nodeid="2">
<fence>
<method name="N1PMCSAP02_Method">
<device name="N1PMCSAP02_FD"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman expected_votes="1" two_node="1"/>
<rm>
<failoverdomains>
<failoverdomain name="N1PMCSAP-FD" nofailback="1" ordered="1">
<failoverdomainnode name="N1PMCSAP01-PRIV" priority="1"/>
<failoverdomainnode name="N1PMCSAP02-PRIV" priority="2"/>
</failoverdomain>
</failoverdomains>
<resources>
<ip address="10.51.66.1" sleeptime="10"/>
<fs device="/dev/vg_data/lv_data" force_unmount="1" fsid="25430" mountpoint="/u01" name="lv_data" self_fence="1"/>
<script file="/home/mcs/cluster/McsCluster" name="McsCluster"/>
</resources>
<service domain="N1PMCSAP-FD" name="APP" recovery="disable">
<fs ref="lv_data">
<ip ref="10.51.66.1"/>
</fs>
<script ref="McsCluster"/>
</service>
</rm>
<fencedevices>
<fencedevice agent="fence_imm" ipaddr="10.51.188.177" login="USERID" name="N1PMCSAP01_FD" passwd="PASSW0RD"/>
<fencedevice agent="fence_imm" ipaddr="10.51.188.178" login="USERID" name="N1PMCSAP02_FD" passwd="PASSW0RD"/>
</fencedevices>
</cluster>
This configuration file is kept identical on both hosts, N1PMCSAP01 and N1PMCSAP02.
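If the file is edited by hand instead of through luci, one way to keep both copies identical is to validate it and then let cman distribute it. The sketch below uses ccs_config_validate and cman_tool from the RHEL 6 cman package. It assumes the config_version attribute in the file has already been incremented and that ricci is running on both nodes. sync_cluster_conf is a hypothetical helper name.

```shell
# Hypothetical helper: validate the local cluster.conf, then ask cman
# to activate the new version and propagate it to the other node.
sync_cluster_conf() {
    # Check /etc/cluster/cluster.conf against the schema first.
    ccs_config_validate || return 1
    # Load the new configuration version cluster-wide.
    cman_tool version -r
}
```

Validating first means a malformed file is rejected before it can be pushed to the second node.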
Using the RHCS Cluster
Viewing Cluster Status |
Use the clustat command to view the running status:
[root@N1PMCSAP01 /]# clustat -l
Cluster Status for N1PMCSAP @ Fri May 26 14:13:25 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
N1PMCSAP01-PRIV 1 Online, Local, rgmanager
N1PMCSAP02-PRIV 2 Online, rgmanager
Service Information
------- -----------
Service Name : service:APP
Current State : started (112)
Flags : none (0)
Owner : N1PMCSAP01-PRIV
Last Owner : N1PMCSAP01-PRIV
Last Transition : Fri May 26 13:55:45 2017
Current State shows that the service is running.
Owner is the node that is currently running the service.
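Manual Failover |
[root@N1PMCSAP01 /]# clusvcadm -r APP -m N1PMCSAP02-PRIV
# Relocate the service group APP from node 1 to node 2 (-r takes the service group name, -m the cluster member name as listed by clustat)
[root@N1PMCSAP01 /]# clusvcadm -d APP
# Manually disable (stop) the APP service
[root@N1PMCSAP01 /]# clusvcadm -e APP
# Manually enable (start) the APP service
[root@N1PMCSAP01 /]# clusvcadm -M APP -m N1PMCSAP02-PRIV
# Migrate APP to N1PMCSAP02-PRIV. Note that -M is the migrate operation of clusvcadm, intended for virtual machine services; it does not change the owner priority, which is set in the failover domain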
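Automatic Failover |
When server hardware fails and the heartbeat network can no longer reach the peer, the RHCS cluster automatically reboots the failed server and moves the resources to the healthy server. After the hardware has been repaired, the administrator can choose whether to fail the service back.
Note:
Do not unplug the heartbeat network of both servers at the same time; doing so causes split-brain.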
This article comes from the "袁伟烨IT技术" blog; please retain this source: http://popeyeywy.blog.51cto.com/745223/1930942