首页 > 代码库 > oracle查询转换_view merge

oracle查询转换_view merge

oracle对于子查询的支持做的很好,oracle optimizer会对inline view进行query transfomation,即视图合并,不过也经常带来意想不到的问题。
下面是一个inline view的merge的例子:

1,  创建临时表

1 create table test1 as select * from dba_objects;
2 create table test2 as select * from dba_objects;

2,  以下查询语句

select *
  from test1 t1,
       (select owner,
               max(object_id) maxo,
               min(object_id) mino,
               avg(object_id) avgo
          from test2
         group by owner) t2
 where t1.owner = t2.owner
   and t1.object_id > t2.avgo;

3,  下面看下语句的执行计划:
  

--------------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       | 90420 |    18M|       |  1540   (1)| 00:00:19 |
|*  1 |  FILTER              |       |       |       |       |            |          |
|   2 |   HASH GROUP BY      |       | 90420 |    18M|       |  1540   (1)| 00:00:19 |
|*  3 |    HASH JOIN         |       | 90420 |    18M|  3712K|  1535   (1)| 00:00:19 |
|   4 |     TABLE ACCESS FULL| TEST2 | 90420 |  2649K|       |   246   (1)| 00:00:03 |
|   5 |     TABLE ACCESS FULL| TEST1 | 90420 |    16M|       |   247   (2)| 00:00:03 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."OBJECT_ID">SUM("OBJECT_ID")/COUNT("OBJECT_ID"))
   3 - access("T1"."OWNER"="OWNER")

这里,oracle对inline view进行了视图合并,其执行过程是:

  1. 对test2,test1进行hash join。
  2. 对结果进行group by分组
  3. 然后过滤object_id大于均值的记录

看下其执行的情况:

26864 rows selected.
Elapsed: 00:11:53.64
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       2172  consistent gets
       1081  physical reads
          0  redo size
    2304647  bytes sent via SQL*Net to client
      20182  bytes received via SQL*Net from client
       1792  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      26864  rows processed

两个表的记录数:66957。结果集数:26864。
而整体的运行时间接近12分钟。
下面调整一下查询的执行计划,使用no_merge hint使用优化器不进行查询装换,对inline view不进行合并:

  

select /*+ no_merge(t2)*/*
  from test1 t1,
       (select owner,
               max(object_id) maxo,
               min(object_id) mino,
               avg(object_id) avgo
          from test2
         group by owner) t2
 where t1.owner = t2.owner
   and t1.object_id > t2.avgo;

26864 rows selected.
Elapsed: 00:00:01.47
Execution Plan
----------------------------------------------------------
Plan hash value: 1122024653
--------------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |  4521 |  1028K|       |  1600   (1)| 00:00:20 |
|*  1 |  HASH JOIN           |       |  4521 |  1028K|  6008K|  1600   (1)| 00:00:20 |
|   2 |   VIEW               |       | 90420 |  4944K|       |   251   (3)| 00:00:04 |
|   3 |    HASH GROUP BY     |       | 90420 |  2649K|       |   251   (3)| 00:00:04 |
|   4 |     TABLE ACCESS FULL| TEST2 | 90420 |  2649K|       |   246   (1)| 00:00:03 |
|   5 |   TABLE ACCESS FULL  | TEST1 | 90420 |    15M|       |   247   (2)| 00:00:03 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."OWNER"="T2"."OWNER")
       filter("T1"."OBJECT_ID">"T2"."AVGO")
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       3934  consistent gets
          0  physical reads
          0  redo size
    1602672  bytes sent via SQL*Net to client
      20182  bytes received via SQL*Net from client
       1792  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      26864  rows processed

使用no_merge hint,其执行过程是:先对test2先进行group by分组,然后对中间结果集进行hash驱动来join。
对比不同的执行计划,两个执行的时间却相差甚远。

现在打开10046 event,来看一下执行的具体花费的时间和计算的cost:

  

********************************************************************************
select /*+ no_merge(t2)*/*
  from test1 t1,
       (select owner,
               max(object_id) maxo,
               min(object_id) mino,
               avg(object_id) avgo
          from test2
         group by owner) t2
 where t1.owner = t2.owner
   and t1.object_id > t2.avgo
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     1792      0.70       0.68          0       3934          0       26864
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1794      0.70       0.68          0       3934          0       26864
Rows     Row Source Operation
-------  ---------------------------------------------------
  26864  HASH JOIN  (cr=3934 pr=0 pw=0 time=569062 us)
    121   VIEW  (cr=1086 pr=0 pw=0 time=216305 us)
    121    HASH GROUP BY (cr=1086 pr=0 pw=0 time=216052 us)
  66953     TABLE ACCESS FULL TEST2 (cr=1086 pr=0 pw=0 time=67093 us)
  66957   TABLE ACCESS FULL TEST1 (cr=2848 pr=0 pw=0 time=69776 us)
Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
  26864   HASH JOIN
    121    VIEW
    121     HASH (GROUP BY)
  66953      TABLE ACCESS (FULL) OF TEST2 (TABLE)
  66957    TABLE ACCESS (FULL) OF TEST1 (TABLE)
********************************************************************************

下面看另外一个merge后的执行情况:

********************************************************************************
select *
  from test1 t1,
       (select owner,
               max(object_id) maxo,
               min(object_id) mino,
               avg(object_id) avgo
          from test2
         group by owner) t2
 where t1.owner = t2.owner
   and t1.object_id > t2.avgo

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     1792    530.60     518.52          0       2172          0       26864
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1794    530.60     518.52          0       2172          0       26864
Rows     Row Source Operation
-------  ---------------------------------------------------
  26864  FILTER  (cr=2172 pr=0 pw=0 time=518383093 us)
  66957   HASH GROUP BY (cr=2172 pr=0 pw=0 time=518245046 us)
147500423    HASH JOIN  (cr=2172 pr=0 pw=0 time=147797490 us)
  66953     TABLE ACCESS FULL TEST2 (cr=1086 pr=0 pw=0 time=67029 us)
  66957     TABLE ACCESS FULL TEST1 (cr=1086 pr=0 pw=0 time=67029 us)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
  26864   FILTER
  66957    HASH (GROUP BY)
147500423     HASH JOIN
  66953      TABLE ACCESS (FULL) OF TEST2 (TABLE)
  66957      TABLE ACCESS (FULL) OF TEST1 (TABLE)
********************************************************************************

可以很明显的看出,时间花费在什么地方了。在进行hash group by和filter的过程中花费了大量的时间。
因为test1和test2进行hash join的中间结果有147,500,423条。

  

总结:
无论是inline view的merge还是其它的可选执行计划,最优的那个访问路径一定是具备了以下的条件:

  1. 优先对查询应用过滤性条件,尽可能多的减少中间结果集。
  2. 尽可能的只扫描期望得到的那一部分数据,减少不必要的开销。