
Troubleshooting an Oracle 11g Cluster Failure

Symptoms:

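The cluster no longer appears to fail over its IP addresses: all session connections are concentrated on a single node, and when that node goes down, clients are not automatically reconnected to the other node.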

Investigation:

Run the cluster status check on node RAC1 (note the resources whose STATE is OFFLINE while their TARGET is ONLINE):

grid@rac01:[/home/grid]crsctl stat res -t

--------------------------------------------------------------------------------

NAME           TARGET  STATE       SERVER                  STATE_DETAILS      

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.ARDATA.dg

               ONLINE  ONLINE      rac01                                       

               ONLINE  ONLINE      rac02                                       

ora.CLDATA.dg

               ONLINE  ONLINE      rac01                                       

               ONLINE  ONLINE      rac02                                       

ora.LISTENER.lsnr

               ONLINE  ONLINE      rac01                                       

               ONLINE  ONLINE      rac02                                       

ora.USDATA.dg

               ONLINE  ONLINE      rac01                                       

               ONLINE  ONLINE      rac02                                       

ora.USDATA01.dg

               ONLINE  OFFLINE     rac01                                        

               ONLINE  ONLINE      rac02                                       

ora.asm

               ONLINE  ONLINE      rac01                    Started            

               ONLINE  ONLINE      rac02                    Started            

ora.gsd

               OFFLINE OFFLINE      rac01                                       

               OFFLINE OFFLINE      rac02                                       

ora.net1.network

               ONLINE  ONLINE      rac01                                       

               ONLINE  ONLINE      rac02                                       

ora.ons

               ONLINE  ONLINE      rac01                                       

               ONLINE  ONLINE      rac02                                        

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

     1        ONLINE ONLINE       rac02                                       

ora.cvu

     1        ONLINE  ONLINE      rac02                                       

ora.oc4j

     1        ONLINE  ONLINE      rac02                                       

ora.rac01.vip

     1        ONLINE  ONLINE      rac01                                       

ora.rac02.vip

     1        ONLINE  ONLINE      rac02                                       

ora.racdb.db

     1        ONLINE  OFFLINE                              Instance Shutdown  

     2        ONLINE  ONLINE      rac02                    Open               

ora.scan1.vip

     1        ONLINE  ONLINE      rac02                 
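Scanning this output by eye for mismatches is error-prone, so the check can be sketched in Python. This is a minimal sketch, not an Oracle-provided tool: the function name find_offline_resources and the SAMPLE text are hypothetical, and it assumes the tabular layout of `crsctl stat res -t` shown above has been captured as a string. It reports every resource whose TARGET is ONLINE but whose STATE is not.

```python
def find_offline_resources(crsctl_output):
    """Return (resource, state, rest_of_line) for rows with TARGET ONLINE but STATE not ONLINE."""
    problems = []
    resource = None
    for line in crsctl_output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        # A line such as "ora.USDATA01.dg" names the resource for the rows that follow.
        if tokens[0].startswith("ora."):
            resource = tokens[0]
            continue
        # Rows under "Cluster Resources" start with an instance number; drop it.
        if tokens[0].isdigit():
            tokens = tokens[1:]
        # Skip headers, separators, and anything that is not a TARGET/STATE row.
        if resource is None or len(tokens) < 2 or tokens[0] not in ("ONLINE", "OFFLINE"):
            continue
        target, state = tokens[0], tokens[1]
        if target == "ONLINE" and state != "ONLINE":
            # The remainder holds the server name and/or the state details.
            problems.append((resource, state, " ".join(tokens[2:])))
    return problems


# Hypothetical excerpt of the output above, used only for illustration.
SAMPLE = """\
ora.USDATA01.dg
               ONLINE  OFFLINE     rac01
               ONLINE  ONLINE      rac02
ora.gsd
               OFFLINE OFFLINE      rac01
ora.racdb.db
     1        ONLINE  OFFLINE                              Instance Shutdown
     2        ONLINE  ONLINE      rac02                    Open
"""

for entry in find_offline_resources(SAMPLE):
    print(entry)
```

On this excerpt the sketch flags ora.USDATA01.dg on rac01 and instance 1 of ora.racdb.db. It ignores ora.gsd because its TARGET is OFFLINE as well, meaning it is not expected to be running.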



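This suggests that the USDATA01 disk group is not working properly on rac01. An attempt to start the USDATA01 disk group still fails with errors: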


grid@rac01:[/home/grid]srvctl start diskgroup -g USDATA01

PRCR-1079 : Failed to start resource ora.USDATA01.dg

CRS-5017: The resource action "ora.USDATA01.dg start" encountered the following error: 

ORA-15032: not all alterations performed

ORA-15017: diskgroup "USDATA01" cannot be mounted

ORA-15063: ASM discovered an insufficient number of disks for diskgroup "USDATA01"

. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0/grid/log/rac01/agent/crsd/oraagent_grid/oraagent_grid.log".

CRS-2674: Start of 'ora.USDATA01.dg' on 'rac01' failed
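The error stack above mixes PRCR, CRS, and ORA codes. To look them up one by one, they can be pulled out in order with a short Python sketch. The function name extract_error_codes and the ERROR_TEXT excerpt are hypothetical, and the pattern only assumes the PREFIX-number form seen in this output.

```python
import re


def extract_error_codes(text):
    """Return the PRCR-/CRS-/ORA- codes in the order they appear in the text."""
    return re.findall(r"\b(?:PRCR|CRS|ORA)-\d+", text)


# Hypothetical excerpt of the srvctl output above, used only for illustration.
ERROR_TEXT = """\
PRCR-1079 : Failed to start resource ora.USDATA01.dg
CRS-5017: The resource action "ora.USDATA01.dg start" encountered the following error:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "USDATA01" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "USDATA01"
CRS-2674: Start of 'ora.USDATA01.dg' on 'rac01' failed
"""

print(extract_error_codes(ERROR_TEXT))
```

In this stack the most specific message is ORA-15063: ASM on rac01 did not find enough disks to mount USDATA01.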


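Listing the ASM disk groups shows that USDATA01 is not recognized on this node (it is missing from the lsdg output). Node RAC1 is therefore in an abnormal state because it cannot recognize this disk group.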

grid@rac01:[/home/grid]asmcmd

ASMCMD> lsdg

State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name

MOUNTED  EXTERN  N         512   4096  1048576    666603   332494                0          332494              0             N  ARDATA/

MOUNTED  EXTERN  N         512   4096  1048576     51207    50811                0           50811              0             Y  CLDATA/

MOUNTED  EXTERN  N         512   4096  1048576    512009   124302                0          124302              0             N  USDATA/
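The comparison made here, between the disk groups registered with the cluster and those actually mounted on the node, can be sketched in Python. This is a minimal sketch under stated assumptions: the function names and sample strings are hypothetical, the lsdg output is assumed to have the column layout shown above (name in the last column with a trailing slash), and registered disk groups are assumed to appear as ora.NAME.dg lines in the `crsctl stat res -t` output.

```python
def mounted_diskgroups(lsdg_output):
    """Return the names of disk groups that lsdg reports as MOUNTED."""
    names = set()
    for line in lsdg_output.splitlines():
        tokens = line.split()
        if tokens and tokens[0] == "MOUNTED":
            # The last column is the name, printed with a trailing "/".
            names.add(tokens[-1].rstrip("/"))
    return names


def registered_diskgroups(crsctl_output):
    """Return disk group names taken from ora.<NAME>.dg resource lines."""
    names = set()
    for line in crsctl_output.splitlines():
        name = line.strip()
        if name.startswith("ora.") and name.endswith(".dg"):
            names.add(name[len("ora."):-len(".dg")])
    return names


# Hypothetical excerpts of the outputs above, used only for illustration.
LSDG = """\
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576    666603   332494                0          332494              0             N  ARDATA/
MOUNTED  EXTERN  N         512   4096  1048576     51207    50811                0           50811              0             Y  CLDATA/
MOUNTED  EXTERN  N         512   4096  1048576    512009   124302                0          124302              0             N  USDATA/
"""
CRSCTL = "ora.ARDATA.dg\nora.CLDATA.dg\nora.LISTENER.lsnr\nora.USDATA.dg\nora.USDATA01.dg\n"

missing = registered_diskgroups(CRSCTL) - mounted_diskgroups(LSDG)
print(sorted(missing))
```

With these excerpts the only registered disk group that is not mounted on the node is USDATA01, which matches the finding above.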






This article is from the blog "张建忠的技术专栏"; please be sure to retain this source attribution: http://cloudzjz.blog.51cto.com/4099948/1609314
