
Recovering a Corrupt Block Without RMAN Backup Sets

The test environment has no usable RMAN backup sets, but it does have a hot backup of the datafile. Here is the test:

--Create the test user and test table
[oracle@ora10g ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.1.0 - Production on 16 16:01:02 2014

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> create user zlm identified by zlm;

User created.

SQL> alter user zlm default tablespace zlm;

User altered.

SQL> grant dba to zlm;

Grant succeeded.

SQL> conn zlm/zlm
Connected.
SQL> create table corrupt_test (id number(10),name varchar2(15));

Table created.

SQL> insert into corrupt_test values(1,'aaron8219');

1 row created.

SQL> commit;

Commit complete.

SQL> set lin 130
SQL> col segment_name for a20
SQL> col tablespace_name for a20
SQL> select segment_name,tablespace_name from dba_segments where segment_name='CORRUPT_TEST';

SEGMENT_NAME         TABLESPACE_NAME
-------------------- --------------------
CORRUPT_TEST         ZLM

SQL> col name for a45
SQL> select a.segment_name,a.tablespace_name,b.file#,b.name from dba_segments a,v$datafile b where a.header_file=b.file# and a.segment_name='CORRUPT_TEST';

SEGMENT_NAME         TABLESPACE_NAME           FILE# NAME
-------------------- -------------------- ---------- ---------------------------------------------
CORRUPT_TEST         ZLM                           6 /u01/app/oracle/oradata/ora10g/zlm01.dbf

RMAN backups had been taken on this database earlier, so the backup sets are deleted first.

[oracle@ora10g ~]$ rman target /

Recovery Manager: Release 10.2.0.1.0 - Production on 16 16:06:47 2014

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

connected to target database: ORA10G (DBID=4175411955)

RMAN> list backupset;

using target database control file instead of recovery catalog

List of Backup Sets
===================

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
286     Full    880.50M    DISK        00:01:35     2014-11-12     
        BP Key: 286   Status: AVAILABLE  Compressed: NO  Tag: TAG20141112T141548
        Piece Name: /u01/app/oracle/flash_recovery_area/ORA10G/backupset/2014_11_12/o1_mf_nnndf_TAG20141112T141548_b65yrnkg_.bkp
  List of Datafiles in backup set 286
  File LV Type Ckp SCN    Ckp Time   Name
  ---- -- ---- ---------- ---------- ----
  1       Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/system01.dbf
  2       Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/undotbs01.dbf
  3       Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/sysaux01.dbf
  4       Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/users01.dbf
  5       Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/example01.dbf
  6       Full 1202813    2014-11-12 

BS Key  Size       Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
302     42.17M     DISK        00:00:27     2014-11-21     
        BP Key: 302   Status: AVAILABLE  Compressed: YES  Tag: ARC_BAK
        Piece Name: /u01/orabackup/backupsets/ora10g-4175411955_20141121_864227317_351.arc

  List of Archived Logs in backup set 302
  Thrd Seq     Low SCN    Low Time   Next SCN   Next Time
  ---- ------- ---------- ---------- ---------- ---------
  1    39      1234835    2014-11-18 1247748    2014-11-21
  1    40      1247748    2014-11-21 1249682    2014-11-21
  1    41      1249682    2014-11-21 1250181    2014-11-21
  1    42      1250181    2014-11-21 1258063    2014-11-21
  1    43      1258063    2014-11-21 1260208    2014-11-21

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
303     Full    164.91M    DISK        00:01:52     2014-11-21     
        BP Key: 303   Status: AVAILABLE  Compressed: YES  Tag: DB_BAK
        Piece Name: /u01/orabackup/backupsets/ora10g-4175411955_20141121_864227354_352.db
  List of Datafiles in backup set 303
  File LV Type Ckp SCN    Ckp Time   Name
  ---- -- ---- ---------- ---------- ----
  1       Full 1260226    2014-11-21 /u01/app/oracle/oradata/ora10g/system01.dbf
  2       Full 1260226    2014-11-21 /u01/app/oracle/oradata/ora10g/undotbs01.dbf
  3       Full 1260226    2014-11-21 /u01/app/oracle/oradata/ora10g/sysaux01.dbf
  4       Full 1260226    2014-11-21 /u01/app/oracle/oradata/ora10g/users01.dbf
  5       Full 1260226    2014-11-21 /u01/app/oracle/oradata/ora10g/example01.dbf
  6       Full 1260226    2014-11-21 /u01/app/oracle/oradata/ora10g/zlm01.dbf

BS Key  Size       Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
304     19.50K     DISK        00:00:01     2014-11-21     
        BP Key: 304   Status: AVAILABLE  Compressed: YES  Tag: ARC_BAK
        Piece Name: /u01/orabackup/backupsets/ora10g-4175411955_20141121_864227471_353.arc

  List of Archived Logs in backup set 304
  Thrd Seq     Low SCN    Low Time   Next SCN   Next Time
  ---- ------- ---------- ---------- ---------- ---------
  1    44      1260208    2014-11-21 1260277    2014-11-21

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
305     Full    7.23M      DISK        00:00:01     2014-11-21     
        BP Key: 305   Status: AVAILABLE  Compressed: NO  Tag: TAG20141121T151114
        Piece Name: /u01/orabackup/backupsets/ora10g-c-4175411955-20141121-04.ctl
  Control File Included: Ckp SCN: 1260283      Ckp time: 2014-11-21
  SPFILE Included: Modification time: 2014-11-21

RMAN> exit


Recovery Manager complete.
[oracle@ora10g ~]$ cd /u01/orabackup/backupsets/
[oracle@ora10g backupsets]$ ll -lrth
total 215M
-rw-r----- 1 oracle oinstall  43M Nov 21 15:09 ora10g-4175411955_20141121_864227317_351.arc
-rw-r----- 1 oracle oinstall 165M Nov 21 15:11 ora10g-4175411955_20141121_864227354_352.db
-rw-r----- 1 oracle oinstall  20K Nov 21 15:11 ora10g-4175411955_20141121_864227471_353.arc
-rw-r----- 1 oracle oinstall 7.3M Nov 21 15:11 ora10g-c-4175411955-20141121-04.ctl

--Delete the RMAN backup sets
[oracle@ora10g backupsets]$ rm -f *
[oracle@ora10g backupsets]$ rman target /

Recovery Manager: Release 10.2.0.1.0 - Production on 16 16:07:59 2014

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

connected to target database: ORA10G (DBID=4175411955)

RMAN> crosscheck backup;

using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=154 devtype=DISK
crosschecked backup piece: found to be 'AVAILABLE'
backup piece handle=/u01/app/oracle/flash_recovery_area/ORA10G/backupset/2014_11_12/o1_mf_nnndf_TAG20141112T141548_b65yrnkg_.bkp recid=286 stamp=863446548
crosschecked backup piece: found to be 'EXPIRED'
backup piece handle=/u01/orabackup/backupsets/ora10g-4175411955_20141121_864227317_351.arc recid=302 stamp=864227318
crosschecked backup piece: found to be 'EXPIRED'
backup piece handle=/u01/orabackup/backupsets/ora10g-4175411955_20141121_864227354_352.db recid=303 stamp=864227356
crosschecked backup piece: found to be 'EXPIRED'
backup piece handle=/u01/orabackup/backupsets/ora10g-4175411955_20141121_864227471_353.arc recid=304 stamp=864227472
crosschecked backup piece: found to be 'EXPIRED'
backup piece handle=/u01/orabackup/backupsets/ora10g-c-4175411955-20141121-04.ctl recid=305 stamp=864227475
Crosschecked 5 objects


RMAN> delete noprompt expired backupset;

using channel ORA_DISK_1

List of Backup Pieces
BP Key  BS Key  Pc# Cp# Status      Device Type Piece Name
------- ------- --- --- ----------- ----------- ----------
302     302     1   1   EXPIRED     DISK        /u01/orabackup/backupsets/ora10g-4175411955_20141121_864227317_351.arc
303     303     1   1   EXPIRED     DISK        /u01/orabackup/backupsets/ora10g-4175411955_20141121_864227354_352.db
304     304     1   1   EXPIRED     DISK        /u01/orabackup/backupsets/ora10g-4175411955_20141121_864227471_353.arc
305     305     1   1   EXPIRED     DISK        /u01/orabackup/backupsets/ora10g-c-4175411955-20141121-04.ctl
deleted backup piece
backup piece handle=/u01/orabackup/backupsets/ora10g-4175411955_20141121_864227317_351.arc recid=302 stamp=864227318
deleted backup piece
backup piece handle=/u01/orabackup/backupsets/ora10g-4175411955_20141121_864227354_352.db recid=303 stamp=864227356
deleted backup piece
backup piece handle=/u01/orabackup/backupsets/ora10g-4175411955_20141121_864227471_353.arc recid=304 stamp=864227472
deleted backup piece
backup piece handle=/u01/orabackup/backupsets/ora10g-c-4175411955-20141121-04.ctl recid=305 stamp=864227475
Deleted 4 EXPIRED objects

The backup sets generated by the RMAN script are now gone. List the backups again:

RMAN> list backup;


List of Backup Sets
===================

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
286     Full    880.50M    DISK        00:01:35     2014-11-12     
        BP Key: 286   Status: AVAILABLE  Compressed: NO  Tag: TAG20141112T141548
        Piece Name: /u01/app/oracle/flash_recovery_area/ORA10G/backupset/2014_11_12/o1_mf_nnndf_TAG20141112T141548_b65yrnkg_.bkp
  List of Datafiles in backup set 286
  File LV Type Ckp SCN    Ckp Time   Name
  ---- -- ---- ---------- ---------- ----
  1       Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/system01.dbf
  2       Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/undotbs01.dbf
  3       Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/sysaux01.dbf
  4       Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/users01.dbf
  5       Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/example01.dbf
  6       Full 1202813    2014-11-12 

One backup set remains, a full database backup stored in the FRA. Delete it as well:

RMAN> host;

[oracle@ora10g backupsets]$ cd /u01/app/oracle/flash_recovery_area/ORA10G/backupset/
[oracle@ora10g backupset]$ ll
total 4
drwxr-x--- 2 oracle oinstall 4096 Nov 12 14:15 2014_11_12
[oracle@ora10g backupset]$ rm -rf *
[oracle@ora10g backupset]$ exit
exit
host command complete

RMAN> crosscheck backup;

using channel ORA_DISK_1
crosschecked backup piece: found to be 'EXPIRED'
backup piece handle=/u01/app/oracle/flash_recovery_area/ORA10G/backupset/2014_11_12/o1_mf_nnndf_TAG20141112T141548_b65yrnkg_.bkp recid=286 stamp=863446548
Crosschecked 1 objects


RMAN> delete noprompt expired backup;

using channel ORA_DISK_1

List of Backup Pieces
BP Key  BS Key  Pc# Cp# Status      Device Type Piece Name
------- ------- --- --- ----------- ----------- ----------
286     286     1   1   EXPIRED     DISK        /u01/app/oracle/flash_recovery_area/ORA10G/backupset/2014_11_12/o1_mf_nnndf_TAG20141112T141548_b65yrnkg_.bkp
deleted backup piece
backup piece handle=/u01/app/oracle/flash_recovery_area/ORA10G/backupset/2014_11_12/o1_mf_nnndf_TAG20141112T141548_b65yrnkg_.bkp recid=286 stamp=863446548
Deleted 1 EXPIRED objects


RMAN> list backup summary;


RMAN> list backup;


RMAN> exit

The database now has no RMAN backups at all. On with the test:

--Put the test tablespace into hot backup mode
SQL> alter tablespace zlm begin backup;

Tablespace altered.

SQL> select * from v$backup;

FILE# STATUS                CHANGE# TIME
----- ------------------ ---------- ----------
    1 NOT ACTIVE                  0
    2 NOT ACTIVE                  0
    3 NOT ACTIVE                  0
    4 NOT ACTIVE                  0
    5 NOT ACTIVE                  0
    6 ACTIVE                1317685 2014-11-26

6 rows selected.

After hot backup mode is enabled, the status of file 6 changes from NOT ACTIVE to ACTIVE.

SQL> select name,checkpoint_change# from v$datafile;

NAME                                          CHECKPOINT_CHANGE#
--------------------------------------------- ------------------
/u01/app/oracle/oradata/ora10g/system01.dbf              1306748
/u01/app/oracle/oradata/ora10g/undotbs01.dbf             1306748
/u01/app/oracle/oradata/ora10g/sysaux01.dbf              1306748
/u01/app/oracle/oradata/ora10g/users01.dbf               1306748
/u01/app/oracle/oradata/ora10g/example01.dbf             1306748
/u01/app/oracle/oradata/ora10g/zlm01.dbf                 1319387

6 rows selected.

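The checkpoint SCN of file 6 is also higher than that of the other files, because BEGIN BACKUP performs a checkpoint on this tablespace's datafiles alone. From then on the checkpoint SCN in the datafile header stays frozen until END BACKUP, while data blocks keep being written to the file. A copy taken in this state is therefore inconsistent and cannot be used on its own; it can only be made consistent by applying redo from the archived logs.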

--Take an OS-level hot backup of datafile 6
SQL> !cp $ORACLE_BASE/oradata/zlm01.dbf /u01/zlm01_bak.dbf
cp: cannot stat `/u01/app/oracle/oradata/zlm01.dbf': No such file or directory

SQL> !cp $ORACLE_BASE/oradata/ora10g/zlm01.dbf /u01/zlm01_bak.dbf

--End hot backup mode
SQL> alter tablespace zlm end backup;

Tablespace altered.


SQL> select * from v$backup;

     FILE# STATUS                CHANGE# TIME
---------- ------------------ ---------- ----------
         1 NOT ACTIVE                  0
         2 NOT ACTIVE                  0
         3 NOT ACTIVE                  0
         4 NOT ACTIVE                  0
         5 NOT ACTIVE                  0
         6 NOT ACTIVE            1319387 2014-11-26

6 rows selected.

The status of file 6 is back to NOT ACTIVE, which means the hot backup has ended.
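The transcript above shows one failed cp caused by a mistyped path while the tablespace was already in backup mode. As a rough sketch, the begin backup / copy / end backup sequence can be generated by a small helper that checks the datafile path first. print_hot_backup_plan is a hypothetical name of my own, not an Oracle tool, and the function only prints the statements; it does not connect to a database.

```shell
#!/bin/sh
# Sketch: print the statements for a user-managed hot backup of one tablespace.
# print_hot_backup_plan is a hypothetical helper; nothing is executed against
# a database, the output is meant to be reviewed and run by hand in SQL*Plus.
print_hot_backup_plan() {
    tablespace=$1   # tablespace to back up, e.g. zlm
    datafile=$2     # datafile to copy
    copy=$3         # destination of the OS-level copy

    # Refuse to emit anything if the source path is wrong, so the tablespace
    # is never put into backup mode for a copy that cannot succeed.
    if [ ! -f "$datafile" ]; then
        echo "datafile not found: $datafile" >&2
        return 1
    fi

    printf 'alter tablespace %s begin backup;\n' "$tablespace"
    printf '!cp %s %s\n' "$datafile" "$copy"
    printf 'alter tablespace %s end backup;\n' "$tablespace"
}
```

For the environment in this post the call would be print_hot_backup_plan zlm /u01/app/oracle/oradata/ora10g/zlm01.dbf /u01/zlm01_bak.dbf.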

SQL> select name,checkpoint_change# from v$datafile;

NAME                                          CHECKPOINT_CHANGE#
--------------------------------------------- ------------------
/u01/app/oracle/oradata/ora10g/system01.dbf              1306748
/u01/app/oracle/oradata/ora10g/undotbs01.dbf             1306748
/u01/app/oracle/oradata/ora10g/sysaux01.dbf              1306748
/u01/app/oracle/oradata/ora10g/users01.dbf               1306748
/u01/app/oracle/oradata/ora10g/example01.dbf             1306748
/u01/app/oracle/oradata/ora10g/zlm01.dbf                 1319387

6 rows selected.

The checkpoint SCN of the datafile is still the same as before; it has not changed yet.

SQL> select header_block from dba_segments where segment_name='CORRUPT_TEST';

HEADER_BLOCK
------------
          11

The dba_segments view shows that the segment header of the test table is block 11 of file 6.

--Simulate a corrupt block
SQL> !
[oracle@ora10g backupsets]$ dd of=/u01/app/oracle/oradata/ora10g/zlm01.dbf bs=8192 conv=notrunc seek=12 <<EOF
> corruption
> EOF
0+1 records in
0+1 records out
11 bytes (11 B) copied, 0.000168204 seconds, 65.4 kB/s

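seek=12 tells dd to skip 12 blocks of 8192 bytes in the output file before it starts writing, so the junk string "corruption" overwrites the first bytes of block 12. I chose block 12 because I did not want to damage the segment header (block 11); block 12 is the block right after it. conv=notrunc leaves the rest of the file untouched. With the start of the block overwritten, Oracle will report the block as corrupt the next time it reads it from disk.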
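To see exactly where dd writes, the same technique can be tried on a throwaway file instead of a real datafile. The following is a minimal sketch; the scratch file and its 16-block size are arbitrary choices for illustration.

```shell
#!/bin/sh
# Sketch: reproduce the dd technique on a scratch file, never on a real datafile.
scratch=$(mktemp)
block_size=8192
target_block=12

# Build a 16-block file of zeros.
dd if=/dev/zero of="$scratch" bs="$block_size" count=16 2>/dev/null

# Overwrite the start of block 12 in place; conv=notrunc keeps the file length.
printf 'corruption\n' | dd of="$scratch" bs="$block_size" conv=notrunc seek="$target_block" 2>/dev/null

# The junk starts at byte offset seek * bs.
offset=$((target_block * block_size))
echo "byte offset: $offset"

# Read the first 10 bytes of block 12 back.
found=$(dd if="$scratch" bs="$block_size" skip="$target_block" count=1 2>/dev/null | head -c 10)
echo "block $target_block starts with: $found"

# The file length is unchanged: 16 blocks of 8192 bytes.
size=$(( $(wc -c < "$scratch") ))
echo "file size: $size"

rm -f "$scratch"
```

The write lands at byte 12 * 8192 = 98304, and the file keeps its original length of 131072 bytes, which is why only one block of the datafile is affected.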

[oracle@ora10g backupsets]$ sqlplus /nolog

SQL*Plus: Release 10.2.0.1.0 - Production on 16 15:52:41 2014

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

SQL> conn zlm/zlm
Connected.
SQL> select * from corrupt_test;

        ID NAME
---------- ---------------
         1 aaron8219

At this point the data block holding the rows of corrupt_test is still cached in memory, so the query still returns the row.

SQL> alter system flush buffer_cache;

System altered.

SQL> select * from corrupt_test;
select * from corrupt_test
              *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 6, block # 12)
ORA-01110: data file 6: '/u01/app/oracle/oradata/ora10g/zlm01.dbf'

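Once the buffer cache is flushed, however, the block has to be read from disk again, and the query fails with ORA-01578. The error says that block 12 of file 6 is corrupt, which is exactly the block overwritten earlier.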
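The file and block numbers in the ORA-01578 message are exactly the arguments that RMAN's blockrecover command takes. As a small sketch, they can be pulled out of the message with sed; ora01578_to_blockrecover is a hypothetical helper name, and the pattern assumes the message wording shown above.

```shell
#!/bin/sh
# Sketch: turn an ORA-01578 message read from stdin into a blockrecover command.
# ora01578_to_blockrecover is a hypothetical helper; lines that do not match
# the message format shown in this post produce no output.
ora01578_to_blockrecover() {
    sed -n 's/.*ORA-01578:.*(file # \([0-9][0-9]*\), block # \([0-9][0-9]*\)).*/blockrecover datafile \1 block \2;/p'
}

echo 'ORA-01578: ORACLE data block corrupted (file # 6, block # 12)' | ora01578_to_blockrecover
```

For the error in this test it prints blockrecover datafile 6 block 12;.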

SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
[oracle@ora10g backupsets]$ rman target /

Recovery Manager: Release 10.2.0.1.0 - Production on 16 16:30:19 2014

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

connected to target database: ORA10G (DBID=4175411955)

RMAN> blockrecover datafile 6 block 12;

Starting blockrecover at 2014-11-26
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=159 devtype=DISK

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of blockrecover command at 11/26/2014 16:30:51
RMAN-06026: some targets not found - aborting restore
RMAN-06023: no backup or copy of datafile 6 found to restore

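Running blockrecover directly does not work here. First, there is no usable backup set. Second, the control file has no record of where a usable backup can be found. So the hot backup copy taken earlier has to be cataloged into the control file first.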

RMAN> catalog datafilecopy '/u01/zlm01_bak.dbf';

cataloged datafile copy
datafile copy filename=/u01/zlm01_bak.dbf recid=17 stamp=864664486

RMAN> blockrecover datafile 6 block 12;

Starting blockrecover at 2014-11-26
using channel ORA_DISK_1

channel ORA_DISK_1: restoring block(s) from datafile copy /u01/zlm01_bak.dbf

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished blockrecover at 2014-11-26

RMAN> exit


Recovery Manager complete.

With the copy cataloged, the second blockrecover completes media recovery without errors.
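The two RMAN commands used above can also be written to a command file so that they can be reviewed before running. This is only a sketch: write_blockrecover_cmdfile is a hypothetical helper, and the script stops at writing the file without invoking RMAN.

```shell
#!/bin/sh
# Sketch: write the catalog + blockrecover steps into an RMAN command file.
# write_blockrecover_cmdfile is a hypothetical helper; it does not run RMAN.
write_blockrecover_cmdfile() {
    copy=$1       # path of the hot backup copy
    file_no=$2    # datafile number from ORA-01578
    block_no=$3   # block number from ORA-01578
    out=$4        # command file to create

    cat > "$out" <<EOF
catalog datafilecopy '$copy';
blockrecover datafile $file_no block $block_no;
EOF
}

cmdfile=$(mktemp)
write_blockrecover_cmdfile /u01/zlm01_bak.dbf 6 12 "$cmdfile"
cat "$cmdfile"
```

The resulting file could then be passed to RMAN, for example with rman target / cmdfile=<path of the file>.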

[oracle@ora10g backupsets]$ sqlplus zlm/zlm

SQL*Plus: Release 10.2.0.1.0 - Production on 16 16:35:23 2014

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> select * from corrupt_test;

        ID NAME
---------- ---------------
         1 aaron8219

SQL> 

As you can see, the data that had been lost is back.

Summary:

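Lost data can be recovered from a hot backup copy when no RMAN backup set exists, but this is not something to rely on. In production we are very unlikely to take regular hot backups of a particular datafile, and we cannot know in advance when a corrupt block will appear or in which file. Regular full RMAN backups are therefore essential: as long as a backup set and the archived logs generated since it was taken are available, a corrupt block can be recovered. When blockrecover datafile xxx block xxx is run, Oracle restores the block directly from the RMAN backup set, with no extra catalog step and little manual intervention.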

