Exadata Adaptive Scrubbing Schedule

1. Why the "Hard Disk Scrub and Repair" feature was introduced

Exadata 11.2.3.3.0 introduced the "Automatic Hard Disk Scrub and Repair" feature, mainly to address the "oil migration" problem on Hitachi hard disks, which can lead to I/O errors.

Typically a disk starts out with only a few bad sectors, but more are likely to appear over time.

 

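2. Shortcomings of the Fixed Scrubbing Schedule

By default, "Hard Disk Scrub and Repair" runs once every two weeks. However, once a disk starts to develop bad sectors, more are likely to follow shortly afterwards. So when a scrub finds bad sectors on a disk, that disk should be scrubbed more often, and the default two-week interval is no longer sufficient.

As the figure below shows, the first run of the "Fixed Scrubbing Schedule" finds 11 bad blocks on disk HDD10 and repairs them. In week 1, however, another 40 bad blocks appear right away, and by the time the "Fixed Scrubbing Schedule" runs again in week 2 there are already 75 bad blocks.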

[Figure: bad blocks on HDD10 under the Fixed Scrubbing Schedule]

 

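3. What is the Adaptive Scrubbing Schedule

Starting with Exadata 12.1.2.3.0, if the current scrubbing job finds bad blocks on a disk, the storage software automatically creates a separate Scrubbing Schedule for that disk alone (by default, once a week). If a later scrubbing job finds no more bad blocks on that disk, the per-disk schedule is cancelled and the disk returns to the Fixed Scrubbing Schedule.

As the figure below shows, the first run of the "Fixed Scrubbing Schedule" finds 11 bad blocks on disk HDD10 and repairs them. Between week 0 and week 1 another 30 bad blocks appear, and the new Adaptive Scrubbing run in week 1 repairs them. No new bad blocks appear between week 1 and week 2, so when the "Fixed Scrubbing Schedule" runs again in week 2 it finds none and the Adaptive Scrubbing schedule is dropped. No scrubbing takes place in week 3, and the next run is the "Fixed Scrubbing Schedule" in week 4.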

[Figure: bad blocks on HDD10 under the Adaptive Scrubbing Schedule]
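
The HDD10 timeline described above can be replayed with a small simulation. This is only a sketch of the behaviour as described in this post, not Oracle's implementation: the function name `scrub_timeline`, its parameters, and the single-disk model are all assumptions made for illustration.

```python
from typing import List, Tuple


def scrub_timeline(
    new_bad_blocks: List[int],
    fixed_interval: int = 2,
    followup_interval: int = 1,
) -> List[Tuple[int, str, int]]:
    """Return (week, kind, repaired) for every scrub that runs on one disk.

    new_bad_blocks[w] is a hypothetical count of bad blocks that have
    appeared on the disk since the previous week (index 0 = week 0).
    Intervals are in weeks; followup_interval=0 disables follow-ups.
    """
    events = []
    pending = 0           # bad blocks present but not yet repaired
    next_followup = None  # week of the next adaptive scrub, if one is scheduled

    for week, appeared in enumerate(new_bad_blocks):
        pending += appeared
        if week % fixed_interval == 0:
            kind = "fixed"        # the normal schedule is always honored
        elif week == next_followup:
            kind = "adaptive"     # follow-up scrub for this disk only
        else:
            continue              # no scrub this week

        events.append((week, kind, pending))
        # A scrub that finds bad blocks schedules a follow-up;
        # a clean scrub cancels any follow-up.
        if pending and followup_interval > 0:
            next_followup = week + followup_interval
        else:
            next_followup = None
        pending = 0               # assume the scrub repairs everything it finds

    return events


# Week 0: 11 bad blocks, week 1: 30 more, then nothing new.
print(scrub_timeline([11, 30, 0, 0, 0]))
# [(0, 'fixed', 11), (1, 'adaptive', 30), (2, 'fixed', 0), (4, 'fixed', 0)]
```

The output matches the narrative: an adaptive scrub in week 1, a clean fixed scrub in week 2, nothing in week 3, and the next fixed scrub in week 4.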

 

4. About the Adaptive Scrubbing Schedule

  • If the current scrubbing job finds bad blocks on a cell disk (CD), a follow-up scrubbing job starts one week later.
  • The follow-up scrubbing job scrubs only the CDs on which the previous scrubbing job found bad blocks.
  • The normal scrubbing schedule is always honored.
  • The default follow-up interval is weekly.
  • If the follow-up interval is longer than the normal interval, the follow-up scrubbing schedule is ignored (e.g. the normal interval is daily and the follow-up interval is weekly).
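
The interval rules in this list can be written down as a small helper. This is a hedged sketch: `effective_followup_days` is a hypothetical name, and the handling of equal intervals is an assumption, since the list only covers the case where the follow-up interval is longer than the normal one. The "0 disables follow-up" case comes from the configuration section of this post.

```python
from typing import Optional


def effective_followup_days(
    normal_interval_days: int,
    followup_interval_days: int = 7,  # default follow-up interval is weekly
) -> Optional[int]:
    """Return the follow-up interval that takes effect, or None if none does."""
    if normal_interval_days <= 0:
        raise ValueError("normal interval must be positive")
    if followup_interval_days < 0:
        raise ValueError("follow-up interval must not be negative")

    if followup_interval_days == 0:
        return None  # a value of 0 disables the follow-up schedule
    if followup_interval_days > normal_interval_days:
        return None  # longer than the normal interval: ignored
    # Assumption: equal intervals are not covered by the text; this sketch
    # lets the follow-up interval stand in that case.
    return followup_interval_days


print(effective_followup_days(14))      # biweekly normal, weekly follow-up
print(effective_followup_days(1, 7))    # daily normal, weekly follow-up
```

The first call returns `7` and the second returns `None`, which corresponds to the "normal interval is daily, follow-up interval is weekly" example.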

 

5. Changing the follow-up interval of the Adaptive Scrubbing Schedule

By default, the follow-up interval is weekly.

 

- Set the follow-up interval to 3 days:

cellcli -e alter cell hardDiskScrubFollowupIntervalInDays=3

- Disable the follow-up schedule:

cellcli -e alter cell hardDiskScrubFollowupIntervalInDays=0
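
When the interval is set from a script, it can help to validate the value before building the command. The sketch below only constructs the argument list for the command shown above and does not execute anything. The helper name `followup_interval_command` is hypothetical, and rejecting negative values is an assumption of this sketch, not documented CellCLI behaviour.

```python
from typing import List


def followup_interval_command(days: int) -> List[str]:
    """Build the argv for setting the follow-up interval on the local cell."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(days, bool) or not isinstance(days, int):
        raise TypeError("days must be an integer")
    if days < 0:
        raise ValueError("days must be 0 (disable) or a positive number")
    return [
        "cellcli",
        "-e",
        f"alter cell hardDiskScrubFollowupIntervalInDays={days}",
    ]


print(" ".join(followup_interval_command(3)))
# cellcli -e alter cell hardDiskScrubFollowupIntervalInDays=3
```

The returned list could be handed to `subprocess.run` on a storage cell. That step is left out here because it changes cell configuration.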

 
