首页 > 代码库 > 分区扫描执行计划分析简介

分区扫描执行计划分析简介

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

SQL> create table tab01 partition by range(object_id)
  2  (partition p1 values less than(10000),
  3  partition p2 values less than(20000),
  4  partition p3 values less than(30000),
  5  partition p_max values less than(maxvalue))
  6  as select * from dba_objects;

SQL> create index index_tab01_name on tab01(object_name) local;

PARTITION RANGE ALL是所有分区扫描,比如需要扫描所有的分区才能满足查询
SQL> select * from tab01 where object_name=‘TT‘;

Execution Plan
----------------------------------------------------------
Plan hash value: 3187188282

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                  |     3 |   258 |     8   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ALL               |                  |     3 |   258 |     8   (0)| 00:00:01 |     1 |     4 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| TAB01            |     3 |   258 |     8   (0)| 00:00:01 |     1 |     4 |
|*  3 |    INDEX RANGE SCAN                | INDEX_TAB01_NAME |     3 |       |     5   (0)| 00:00:01 |     1 |     4 |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("OBJECT_NAME"=‘TT‘)

我们观察到执行计划中跟非分区表多出了pstart和pstop这两项,这个表示扫描该表的所有分区,也就是分区:1到分区:4.

Cbo下的分区还有一个特别重要的特性:cbo会根据where过滤条件中的分区键值判断只扫描需要访问的分区,对于不需要的分区不会访问,这个特性叫做partition purging
PARTITION RANGE SINGLE是单个分区扫描,比如查询中可以看见很清楚的过滤条件表示只需要扫描某一个分区即可
SQL>  select * from tab01 where object_id=5000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2393410319

------------------------------------------------------------------------------------------------
| Id  | Operation              | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |       |     1 |    79 |    57   (2)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|       |     1 |    79 |    57   (2)| 00:00:01 |     2 |     2 |
|*  2 |   TABLE ACCESS FULL    | TAB01 |     1 |    79 |    57   (2)| 00:00:01 |     2 |     2 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID"=5000)


SQL>  select * from tab01 where object_id=5000 and object_name=‘OP‘;

Execution Plan
----------------------------------------------------------
Plan hash value: 3913001591

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                  |     1 |    79 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE            |                  |     1 |    79 |     2   (0)| 00:00:01 |     2 |     2 |
|*  2 |   TABLE ACCESS BY LOCAL INDEX ROWID| TAB01            |     1 |    79 |     2   (0)| 00:00:01 |     2 |     2 |
|*  3 |    INDEX RANGE SCAN                | INDEX_TAB01_NAME |     2 |       |     1   (0)| 00:00:01 |     2 |     2 |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID"=5000)
   3 - access("OBJECT_NAME"=‘OP‘)

这里的pstart和pstop都是2,表示只扫描了一个分区。

PARTITION RANGE INLIST:当分区关键字存在多个可选值时,比较明显的是查询中有in(n1,n2 。。。nmax)和par_key=n1 or par_key=n2,注意单个可选值并不会出现PARTITION RANGE INLIST而是出现partition range  single
SQL> select * from tab01 where object_id in (5000,5500);

Execution Plan
----------------------------------------------------------
Plan hash value: 2325940683

------------------------------------------------------------------------------------------------
| Id  | Operation              | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |       |     2 |   172 |    57   (2)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE INLIST|       |     2 |   172 |    57   (2)| 00:00:01 |KEY(I) |KEY(I) |
|*  2 |   TABLE ACCESS FULL    | TAB01 |     2 |   172 |    57   (2)| 00:00:01 |KEY(I) |KEY(I) |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID"=5000 OR "OBJECT_ID"=5500)

注意这里的执行计划是partition range inlist,而且pstart和pstop都是用的key(I),不过这个并不代表只扫描了一个分区,如下:
SQL> select * from tab01 where object_id in (5000,20000);

Execution Plan
----------------------------------------------------------
Plan hash value: 2325940683

------------------------------------------------------------------------------------------------
| Id  | Operation              | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |       |     2 |   172 |    88   (2)| 00:00:02 |       |       |
|   1 |  PARTITION RANGE INLIST|       |     2 |   172 |    88   (2)| 00:00:02 |KEY(I) |KEY(I) |
|*  2 |   TABLE ACCESS FULL    | TAB01 |     2 |   172 |    88   (2)| 00:00:02 |KEY(I) |KEY(I) |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID"=5000 OR "OBJECT_ID"=20000)

该查询会去扫描两个分区,而这里还是partition range inlist ,pstart和pstop都是KEY(I),所以这里并不表示只扫描了一个分区,至于这个key(I)在官档中并没有找到一个确切的说明,小鱼觉得这个应该是个变量的形式,所以访问的分区当然也可能出现多个。

partition by iterator:分区迭代扫描,当cbo需要扫描多个连续的分区时就会出现partition by iterator分区迭代,pstart和pstop为需要扫描的分区
SQL> select * from tab01 where object_id>10000;

Execution Plan
----------------------------------------------------------
Plan hash value: 232993693

--------------------------------------------------------------------------------------------------
| Id  | Operation                | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |       | 40772 |  3424K|   153   (2)| 00:00:02 |       |       |
|   1 |  PARTITION RANGE ITERATOR|       | 40772 |  3424K|   153   (2)| 00:00:02 |     2 |     4 |
|*  2 |   TABLE ACCESS FULL      | TAB01 | 40772 |  3424K|   153   (2)| 00:00:02 |     2 |     4 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID">10000)

初步来看partition range iterator应该是多个分区的迭代扫描,而上述测试用例也是如此,扫描的分区为pstart:2到pstop :4,一共三个分区。

但是小鱼发现优化器在下列查询时也会采取partition range iterator,请看下例:
SQL>  select * from tab01 where object_id<5000;

Execution Plan
----------------------------------------------------------
Plan hash value: 232993693

--------------------------------------------------------------------------------------------------
| Id  | Operation                | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |       |  4660 |   391K|    60   (2)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR|       |  4660 |   391K|    60   (2)| 00:00:01 |     1 |     2 |
|*  2 |   TABLE ACCESS FULL      | TAB01 |  4660 |   391K|    60   (2)| 00:00:01 |     1 |     2 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID"<5000)

按理说object_id<5000只需要扫描分区p1即可,而这里cbo确给出了partition range iterator执行计划,并且pstart为1,pstop为2,cbo根据统计信息得出需要扫描partition :1和:2两个分区。

这里小鱼想的是否能够通过动态采样让cbo选择只扫描一个分区,从而出现partition range single的执行计划,不过这里小鱼即使采取动态采样level 10,依然cbo给出的执行计划依然是partition range iterator,从这里看出cbo并不是绝对的智能,当统计信息越准确它给出的执行计划就越准,但是还是可能出现不完全执行计划。
SQL> select /*+dynamic_sampling(tab01 10)*/* from tab01 where object_id<2000;

Execution Plan
----------------------------------------------------------
Plan hash value: 232993693

--------------------------------------------------------------------------------------------------
| Id  | Operation                | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |       |  1953 |   164K|    60   (2)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR|       |  1953 |   164K|    60   (2)| 00:00:01 |     1 |     2 |
|*  2 |   TABLE ACCESS FULL      | TAB01 |  1953 |   164K|    60   (2)| 00:00:01 |     1 |     2 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID"<2000)

Note
-----
   - dynamic sampling used for this statement

但是如果我们缩小查询的范围,果然就出现了我们想要的partition range single执行计划,而且pstart和pstop都是1,这里表示优化器确实只准备扫描一个分区。
SQL> select * from tab01 where object_id<200;

Execution Plan
----------------------------------------------------------
Plan hash value: 2393410319

------------------------------------------------------------------------------------------------
| Id  | Operation              | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |       |   189 | 11529 |     4   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|       |   189 | 11529 |     4   (0)| 00:00:01 |     1 |     1 |
|*  2 |   TABLE ACCESS FULL    | TAB01 |   189 | 11529 |     4   (0)| 00:00:01 |     1 |     1 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID"<200)

PARTITION RANGE EMPTY这个表示cbo根据统计信息和sql语句,得出该表所对应的分区不存在任何符合查询的结果集,就会显示partition range empty。

例如sql业务逻辑存在问题:
SQL> select * from tab01 where object_id<2000 and object_id>20000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2001509460

------------------------------------------------------------------------------------------------
| Id  | Operation              | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |       |     1 |    86 |     0   (0)|          |       |       |
|*  1 |  FILTER                |       |       |       |            |          |       |       |
|   2 |   PARTITION RANGE EMPTY|       |     1 |    86 |     2   (0)| 00:00:01 |INVALID|INVALID|
|*  3 |    TABLE ACCESS FULL   | TAB01 |     1 |    86 |     2   (0)| 00:00:01 |INVALID|INVALID|
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(NULL IS NOT NULL)
   3 - filter("OBJECT_ID"<2000 AND "OBJECT_ID">20000)

再例如指定的查询根据表的定义确定没有满足查询的分区:
SQL> create table tab02 partition by range(object_id)
  2      (partition p1 values less than(10000),
  3      partition p2 values less than(20000),
  4      partition p3 values less than(30000))
  5      as select * from dba_objects where object_id<30000;

Table created.

SQL> select * from tab02 where object_id>30000;

Execution Plan
----------------------------------------------------------
Plan hash value: 559767399

-----------------------------------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 |   177 |    28   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE EMPTY|       |     1 |   177 |    28   (0)| 00:00:01 |INVALID|INVALID|
|*  2 |   TABLE ACCESS FULL   | TAB02 |     1 |   177 |    28   (0)| 00:00:01 |INVALID|INVALID|
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID">30000)

Note
-----
   - dynamic sampling used for this statement

看出这里的pstart和pstop都是invalid的。

Partition range or:表示where条件中存在了or运算,并且cbo得出这些运算需要跨越多个不连续的分区时
SQL> select * from tab01 where object_id<5000 or (object_id>5000 and object_id<8000);

Execution Plan
----------------------------------------------------------
Plan hash value: 1971037408

--------------------------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |  7196 |   604K|    60   (2)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE OR|       |  7196 |   604K|    60   (2)| 00:00:01 |KEY(OR)|KEY(OR)|
|*  2 |   TABLE ACCESS FULL| TAB01 |  7196 |   604K|    60   (2)| 00:00:01 |KEY(OR)|KEY(OR)|
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID"<5000 OR "OBJECT_ID"<8000 AND "OBJECT_ID">5000)

可能有些朋友觉得这个其实只需要扫描一个分区即可,因为这个where object_id<5000 or (object_id>5000 and object_id<8000)过滤条件,但是cbo并不能如此的智能,此时cbo根据其统计信息和sql语句得出其需要计算扫描时可能需要跨越不连续的多个分区,当然也可能不需要,所以这里也自然也就出现了PARTITION RANGE OR了,而这个key(OR)应该跟partition range inlist中的key(I)一样,也是一个变化的值,所以这里扫描的分区数量具有不确切性。

同样动态采样也是如此:
SQL> select /*+dynamic_sampling(tab01 10)*/* from tab01 where object_id<5000 or (object_id>5000 and object_id<8000);

Execution Plan
----------------------------------------------------------
Plan hash value: 1971037408

--------------------------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |  7772 |   652K|    60   (2)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE OR|       |  7772 |   652K|    60   (2)| 00:00:01 |KEY(OR)|KEY(OR)|
|*  2 |   TABLE ACCESS FULL| TAB01 |  7772 |   652K|    60   (2)| 00:00:01 |KEY(OR)|KEY(OR)|
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID"<5000 OR "OBJECT_ID"<8000 AND "OBJECT_ID">5000)

Note
-----
   - dynamic sampling used for this statement

上面小鱼简单介绍了range分区扫描的几种执行路径,有些oracle的官档中都没有确切的说明,当然随着我们对cbo的认识,有些问题会随着反复学习和接触新的东西慢慢解开。

新年上班已经一周多了,渐渐也要抓紧了,这年任务还是很大的:ocm考试、深入oracle和常用的高可用、mysql和一些nosql、os的东西等,当然还有最头疼的english!

原文:http://www.dbaxiaoyu.com/archives/1780

分区扫描执行计划分析简介