首页 > 代码库 > RAC环境下控制文件损坏重建过程

RAC环境下控制文件损坏重建过程

处理过程参考了:
https://blogs.oracle.com/Database4CN/entry/%E5%A6%82%E4%BD%95%E9%87%8D%E5%BB%BArac%E7%9A%84%E6%8E%A7%E5%88%B6%E6%96%87%E4%BB%B6
 
 
问题现象:
 
现场有学校提报 登录PL/SQL连接数据库是报错“ORA-12541: TNS:无监听程序 ”;排查日志,发现
 
Tue Nov 25 14:46:58 2014
Thread 2 advanced to log sequence 18098 (LGWR switch)
  Current log# 4 seq# 18098 mem# 0: +DATADG/urpdb/onlinelog/group_4.266.821439273
Tue Nov 25 14:46:58 2014
Archived Log entry 50394 added for thread 2 sequence 18097 ID 0xb0471722 dest 1:
Tue Nov 25 14:54:49 2014
Read from controlfile member ‘+DATADG/urpdb/controlfile/current.260.821439141‘ has found a corrupted block (blk# 35, cf seq# 0)
Hex dump of (file 0, block 35) in trace file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc
Corrupt block relative dba: 0x00000023 (file 0, block 35)
Completely zero block found during control file block read
Hex dump of (file 0, block 35) in trace file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc
Corrupt block relative dba: 0x00000023 (file 0, block 35)
Completely zero block found during control file block read
Errors in file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc:
ORA-00202: control file: ‘+DATADG/urpdb/controlfile/current.260.821439141‘
Errors in file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc  (incident=15289):
ORA-00227: corrupt block detected in control file: (block 35, # blocks 1)
ORA-00202: control file: ‘+DATADG/urpdb/controlfile/current.260.821439141‘

Incident details in: /u01/app/oracle/diag/rdbms/urpdb/urpdb2/incident/incdir_15289/urpdb2_lmon_10377_i15289.trc
Tue Nov 25 14:54:51 2014
Dumping diagnostic data in directory=[cdmp_20141125145451], requested by (instance=2, osid=10377 (LMON)), summary=[incident=15289].
Errors in file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc:
ORA-00227: corrupt block detected in control file: (block 35, # blocks 1)
ORA-00202: control file: ‘+DATADG/urpdb/controlfile/current.260.821439141‘
LMON (ospid: 10377): terminating the instance due to error 227
System state dump requested by (instance=2, osid=10377 (LMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_diag_10357.trc
Instance terminated by LMON, pid = 10377
 
从日志可以看出,控制文件损坏了,导致数据库实例终止。
 
 
 
解决办法:
 
因为日志文件、数据文件都是好的,只是唯一的一份控制文件损坏了,只能通过重建控制文件方法解决;
 
在RAC上重建控制文件与单实例有一些小区别:
 在重建控制文件前必须设置cluster_database=false,而且只启动一个实例来执行操作,否则会报错
ORA-01503: CREATE CONTROLFILE failed
ORA-12720: operation requires database is in EXCLUSIVE mode

重建完控制文件后,需要再将cluster_database设为true.
 
 
处理步骤:
 
1.首先生成重建控制文件的脚本:
SQL> alter database backup controlfile to trace;
 
2.数据库的Alert log中会详细输出这个文件的路径和名字:

Tue Nov 25 22:32:04 2014
alter database backup controlfile to trace
Backup controlfile written to trace file /u01/app/oracle/diag/rdbms/urpdb/urpdb1/trace/urpdb1_ora_6233.trc
Completed: alter database backup controlfile to trace

 
 
3.启动一个实例

SQL> startup nomount;    

 

4.修改cluster_database属性
SQL> alter system set cluster_database=false scope=spfile;

SQL> shutdown immediate

SQL> startup nomount;
 
 
5.从urpdb1_ora_6233.trc 中获取创建控制文件脚本(noresetlogs的)执行
 
SQL> CREATE CONTROLFILE REUSE DATABASE "URPDB" NORESETLOGS  ARCHIVELOG
    MAXLOGFILES 192
    MAXLOGMEMBERS 3
    MAXDATAFILES 1024
    MAXINSTANCES 32
    MAXLOGHISTORY 18688
LOGFILE
  GROUP 1 ‘+DATADG/urpdb/onlinelog/group_1.261.821439143‘  SIZE 50M BLOCKSIZE 512,
  GROUP 2 ‘+DATADG/urpdb/onlinelog/group_2.262.821439143‘  SIZE 50M BLOCKSIZE 512,
  GROUP 3 ‘+DATADG/urpdb/onlinelog/group_3.265.821439273‘  SIZE 50M BLOCKSIZE 512,
  GROUP 4 ‘+DATADG/urpdb/onlinelog/group_4.266.821439273‘  SIZE 50M BLOCKSIZE 512
DATAFILE
  ‘+DATADG/urpdb/datafile/system.256.821439061‘,
  ‘+DATADG/urpdb/datafile/sysaux.257.821439061‘,
  ‘+DATADG/urpdb/datafile/undotbs1.258.821439061‘,
  ‘+DATADG/urpdb/datafile/users.259.821439061‘,
  ‘+DATADG/urpdb/datafile/undotbs2.264.821439205‘,
  ‘+DATADG/urpdb/datafile/tsp_imp_d.dbf‘,
  ‘+DATADG/urpdb/datafile/tsp_dpstar.dbf‘,
  ‘+DATADG/urpdb/datafile/ts_portal_d.dbf‘,
  ‘+DATADG/urpdb/datafile/ts_w5_d.dbf‘,
  ‘+DATADG/urpdb/datafile/ts_wap_d.dbf‘,
  ‘+DATADG/urpdb/datafile/ts_oa_d.dbf‘,
  ‘+DATADG/urpdb/datafile/tsp_sastar.dbf‘,
  ‘+DATADG/urpdb/datafile/tsp_portal.dbf‘
CHARACTER SET AL32UTF8
;
 
 
6.确认redo log,thread#是正确的
SQL> select * from v$log; 
 
 
7.恢复数据库、启动数据库、恢复临时文件(这些步骤,在urpdb1_ora_6233.trc 写的很清楚
SQL> recover database;

Media recovery complete.

SQL> alter system  archive log all;

SQL> alter database open;


SQL> ALTER TABLESPACE TEMP ADD TEMPFILE ‘+DATADG/urpdb/tempfile/temp.263.821439145‘ REUSE;

SQL> ALTER TABLESPACE TSP_IMP_TEMP ADD TEMPFILE ‘+DATADG/urpdb/tempfile/tsp_imp_temp.dbf‘ REUSE;
SQL> ALTER TABLESPACE TSP_DPSTAR_TEMP ADD TEMPFILE ‘+DATADG/urpdb/tempfile/tsp_dpstar_temp.dbf‘ REUSE;
SQL> ALTER TABLESPACE TS_PORTAL_TEMP ADD TEMPFILE ‘+DATADG/urpdb/datafile/ts_portal_temp.dbf‘ REUSE;
SQL> ALTER TABLESPACE TS_W5_TEMP ADD TEMPFILE ‘+DATADG/urpdb/datafile/ts_w5_temp.dbf‘ REUSE;
SQL> ALTER TABLESPACE TS_WAP_TEMP ADD TEMPFILE ‘+DATADG/urpdb/datafile/ts_wap_temp.dbf‘ REUSE;
 
 
8.还原cluster_database属性
 
SQL> alter system set cluster_database=true scope=spfile;


SQL> shutdown immediate

 
 
9.重启集群服务
./crsctl start cluser   问题解决。
 
 
10.登录另外一台服务器查看数据库状态,如有问题进行相应处理。
 
 
总结:
 
  1.开启数据库控制文件自动备份很重要:
         本数据库由于没有打开控制文件自动备份功能,只能采用重建方法。如果开启了,采用该控制文件,进行recover database using backup controlfile会更简单一些(有待测试验证)

 

RAC环境下控制文件损坏重建过程