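Oracle wait event: DFS lock handle
During performance stress testing, the results failed to meet the pass criteria. I pulled a one-hour AWR report from the environment and found a large number of wait events. The database is RAC, version 11.2.0.4.0.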
| | Snap Id | Snap Time | Sessions | Cursors/Session | Instances |
|---|---|---|---|---|---|
| Begin Snap: | 1607 | 21-Oct-14 20:00:03 | 560 | 67.9 | 2 |
| End Snap: | 1608 | 21-Oct-14 21:00:11 | 573 | 12.4 | 2 |
| Elapsed: | 60.13 (mins) | | | | |
| DB Time: | 2,090.75 (mins) | | | | |
| Event | Waits | Total Wait Time (sec) | Wait Avg (ms) | % DB time | Wait Class |
|---|---|---|---|---|---|
| rdbms ipc reply | 32,876,281 | 44.9K | 1 | 35.8 | Other |
| DB CPU | | 21.3K | | 17.0 | |
| direct path read | 435,808 | 18.8K | 43 | 15.0 | User I/O |
| DFS lock handle | 4,204,866 | 7977.9 | 2 | 6.4 | Other |
| log file sync | 8,541 | 252.7 | 30 | .2 | Commit |
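1. The top wait event is rdbms ipc reply. Its description reads: "The rdbms ipc reply Oracle metric event is used to wait for a reply from one of the background processes." In other words, sessions are waiting for a background process such as LGWR or DBWR to answer a request (it is rdbms ipc message, not this event, that marks background processes sitting idle while they wait for work). The DFS lock handle wait event looks more suspicious. The official description is: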
The session waits for the lock handle of a global lock request. The lock handle identifies a global lock. With this lock handle, other operations can be performed on this global lock (to identify the global lock in future operations such as conversions or release). The global lock is maintained by the DLM.
Roughly, this is the wait recorded while a session has not yet obtained the handle of a global lock.
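2. Looking at how others have handled this online, the usual causes cited are a sequence cache that is too small and excessive CPU load on the database server. I made the corresponding adjustments and monitored both, but neither solved the problem. While the performance test was running, I ran the following query: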
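-- P1 packs the lock type (two ASCII characters in the upper two bytes)
-- and the requested mode (lower two bytes).
select chr(bitand(p1,-16777216)/16777215) || chr(bitand(p1, 16711680)/65535) "Lock",
       to_char(bitand(p1, 65535)) "Mode",
       p2, p3, seconds_in_wait
from v$session_wait
where event = 'DFS lock handle';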
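The bit arithmetic in this query can be mirrored outside the database, which helps when a raw P1 value turns up in a trace or an ASH export. The following is a minimal Python sketch, assuming P1 is laid out as two ASCII characters in the upper two bytes and the mode in the lower two bytes; the function name `decode_dfs_p1` is hypothetical and not part of any Oracle tooling.

```python
from typing import Tuple


def decode_dfs_p1(p1: int) -> Tuple[str, int]:
    """Split a 'DFS lock handle' P1 value into (lock type, mode).

    Assumed layout: bits 24-31 and 16-23 hold the two ASCII characters
    of the lock type; bits 0-15 hold the requested lock mode.
    """
    first_char = chr((p1 >> 24) & 0xFF)   # highest byte
    second_char = chr((p1 >> 16) & 0xFF)  # second-highest byte
    mode = p1 & 0xFFFF                    # low 16 bits
    return first_char + second_char, mode


if __name__ == "__main__":
    # 0x42420005: 'B' (0x42), 'B' (0x42), mode 5
    print(decode_dfs_p1(0x42420005))
```

Shifting and masking gives the same result as the `bitand` and division form in the SQL, without relying on `chr` discarding a fractional part.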
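The query showed a BB lock, which is described as "2PC distributed transaction branch across RAC instances"; the related DX lock is described as "Serializes tightly coupled distributed transaction branches".
Roughly, this means the branches of a distributed transaction were spanning both RAC instances. I therefore changed the WebLogic connection to use only one RAC node and reran the test. The results were as follows: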
| | Snap Id | Snap Time | Sessions | Cursors/Session | Instances |
|---|---|---|---|---|---|
| Begin Snap: | 1680 | 24-Oct-14 12:00:13 | 864 | 9.5 | 2 |
| End Snap: | 1681 | 24-Oct-14 13:00:17 | 863 | 9.9 | 2 |
| Elapsed: | 60.07 (mins) | | | | |
| DB Time: | 80.28 (mins) | | | | |
| Event | Waits | Total Wait Time (sec) | Wait Avg (ms) | % DB time | Wait Class |
|---|---|---|---|---|---|
| DB CPU | | 2335.6 | | 48.5 | |
| rdbms ipc reply | 5,326,201 | 645.6 | 0 | 13.4 | Other |
| gc buffer busy acquire | 39,052 | 226.7 | 6 | 4.7 | Cluster |
| DFS lock handle | 672,757 | 225.8 | 0 | 4.7 | Other |
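To put the two reports side by side, the figures can be reduced to a few ratios. The sketch below uses only numbers taken from the two AWR reports; the helper names are hypothetical, and "average active sessions" is taken here as DB Time divided by elapsed time.

```python
def average_active_sessions(db_time_min: float, elapsed_min: float) -> float:
    """DB Time divided by wall-clock elapsed time (both in minutes)."""
    return db_time_min / elapsed_min


def avg_wait_ms(total_wait_sec: float, waits: int) -> float:
    """Mean wait per occurrence, in milliseconds."""
    return total_wait_sec * 1000.0 / waits


# Figures copied from the two AWR reports (both instances connected vs. one node).
before = {"db_time": 2090.75, "elapsed": 60.13, "dfs_waits": 4204866, "dfs_sec": 7977.9}
after = {"db_time": 80.28, "elapsed": 60.07, "dfs_waits": 672757, "dfs_sec": 225.8}

if __name__ == "__main__":
    for label, r in (("before", before), ("after", after)):
        aas = average_active_sessions(r["db_time"], r["elapsed"])
        dfs = avg_wait_ms(r["dfs_sec"], r["dfs_waits"])
        print(f"{label}: AAS={aas:.2f}, DFS lock handle avg wait={dfs:.2f} ms")
    # How many times smaller DB Time became after pinning to one node
    print(f"DB Time ratio: {before['db_time'] / after['db_time']:.1f}x")
```

By these numbers, DB Time fell by a factor of roughly 26 and the average DFS lock handle wait dropped from about 2 ms to well under 1 ms, although the event still accounts for 4.7% of DB time.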
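3. How can this be resolved for good? Put simply, DFS lock handle waits show up when the same object is subject to DML from different instances; the ideal is for each instance to work only on its own objects. This is a trade-off. If WebLogic connects to the instances dynamically, there is no guarantee that each instance handles only its own objects, but the setup tolerates failure: if another instance goes down, the application keeps working. Pinning connections to a single instance has the opposite strengths and weaknesses. Another view is that the Metalink notes on DFS lock handle all describe bugs; it is not yet clear whether upgrading the database would improve matters.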