首页 > 代码库 > 如何获得正确的基数估计值
如何获得正确的基数估计值
原文:http://www.oracle.com/technetwork/issue-archive/2013/13-jan/o13asktom-1886639.html
I have a question about joining with collections and cardinality estimation. The optimizer is always estimating that 8,168 rows are coming back from my collection, and because of that, it is using inefficient plans. The estimate of 8,168 is more than two orders of magnitude more than the real cardinality. How can I solve this problem?
我有一个关于集合和基数估计。优化器针对我的集合基数估计总是8168,就因为这,它老师使用低效率计划。8168是实际基数的2个数量级!
我该如何解决这个问题?
This is a long-running issue with pipelined functions and collections during optimization. The optimizer in general doesn’t have any information about the cardinality (the number of rows) being returned by the collection. It has to guess—and that guess is based on the block size (default statistics are block-size-driven). So, for a database with an 8 K block size, the guess is about 8,000. And because it is unlikely that your collection has about 8,000 elements (probably more like 8 or 80 in most cases), you can definitely arrive at a suboptimal plan.
这的确是个长期提出的问题:关于管道函数和集合的优化。优化器通常对于集合的基数信息毫无所知。它必须去猜!怎么猜?是基于块大小。
所以,对于8k大小数据块的数据库,猜出来的基数就是8000左右!但是通常集合的基数并不是8000,其更可能是8或80。
下面,我提供4种方法来获得正确的集合基数:
1)The cardinality hint (undocumented) 基数提示(无文件证明)
2)The OPT_ESTIMATE hint (undocumented) OPT_ESTIMATE提示(无文件证明)
3)Dynamic sampling (Oracle Database 11g Release 1 and later) 动态采样(Oracle 11g第一版或更高)
4)Oracle Database’s Cardinality Feedback feature (Oracle Database 11g Release 2 and later) Oracle基数反馈特性(Oracle 11g第2版或更高)
--1)用于字符串分割的管道函数
SQL> create or replace type str2tblType as table of varchar2(30)
/
Type created.
SQL> create or replace
function str2tbl( p_str in varchar2, p_delim in varchar2 default ‘,‘ )
return str2tblType
PIPELINED
as
l_str long default p_str || p_delim;
l_n number;
begin
loop
l_n := instr( l_str, p_delim );
exit when (nvl(l_n,0) = 0);
pipe row( ltrim(rtrim(substr(l_str,1,l_n-1))) );
l_str := substr( l_str, l_n+1 );
end loop;
end;
/
Function created.
SQL> variable x varchar2(15)
SQL> exec :x := ‘1,2,3,a,b,c‘
PL/SQL procedure successfully completed.
SQL> select * from table(str2tbl(:x));
COLUMN_VALUE
——————————————————————————————————————
1
2
3
a
b
c
6 rows selected.
SQL> select * from table(dbms_xplan.display_cursor);
PLAN_TABLE_OUTPUT
————————————————————————————————————————————————————————
SQL_ID ddk1tv9s5pzq5, child number 0
————————————————————————————————————————————————————————
select * from table(str2tbl(:x))
Plan hash value: 2407808827
———————————————————————————————————————————————————————————————————————————
|Id|Operation |Name |Rows|Bytes|Cost (%CPU)|Time |
———————————————————————————————————————————————————————————————————————————
| 0|SELECT STATEMENT | | | | 29 (100)| |
| 1| COLLECTION ITERATOR PICKLER...|STR2TBL|8168|16336| 29 (0)|00:00:01|
--使用CARDINALITY 提示
SQL> select * from table(dbms_xplan.display_cursor);
PLAN_TABLE_OUTPUT
————————————————————————————————————————————————————————
SQL_ID bd2f8rh30z3ww, child number 0
————————————————————————————————————————————————————————
select /*+ cardinality(sq 10) */ * from table(str2tbl(:x)) sq
Plan hash value: 2407808827
———————————————————————————————————————————————————————————————————————————
|Id|Operation |Name |Rows|Bytes|Cost (%CPU)|Time |
———————————————————————————————————————————————————————————————————————————
| 0|SELECT STATEMENT | | | | 29 (100)| |
| 1| COLLECTION ITERATOR PICKLER...|STR2TBL| 10| 20| 29 (0)|00:00:01|
--这地方,我们给优化器提示一个10使其认为管道函数的结果基数为10
--2)使用OPT_ESTIMATE提示
--提供3个参数:对象类型;对象名;一个缩放因素,优化器将拿它认为的基数乘以这个因素
--所以,先算出这个大概的因素
SQL> select 10/8168 from dual;
10/8168
————————————————
.00122429
select /*+ opt_estimate(table, sq, scale_rows=0.00122429) */ *
from table(str2tbl(:x)) sq
Plan hash value: 2407808827
———————————————————————————————————————————————————————————————————————————
|Id|Operation |Name |Rows|Bytes|Cost (%CPU)|Time |
———————————————————————————————————————————————————————————————————————————
| 0|SELECT STATEMENT | | | | 29 (100)| |
| 1| OLLECTION ITERATOR PICKLER...|STR2TBL| 10| 20| 29 (0)|00:00:01|
--3)使用DYNAMIC SAMPLING 提示
select /*+ dynamic_sampling( sq, 2 ) */ * from table( str2tbl(:x,‘,‘) ) sq
Plan hash value: 2407808827
———————————————————————————————————————————————————————————————————————————
|Id|Operation |Name |Rows|Bytes|Cost (%CPU)|Time |
———————————————————————————————————————————————————————————————————————————
| 0|SELECT STATEMENT | | | | 11 (100)| |
| 1| COLLECTION ITERATOR PICKLER...|STR2TBL| 6| 12| 11 (0)|00:00:01|
———————————————————————————————————————————————————————————————————————————
Note
———————
dynamic sampling used for this statement (level=2)
--4)使用Cardinality Feedback
--这地方我们要将查询变一下形:
with sq
as (
select /*+ materialize */ *
from table( str2tbl( :x ) )
)
select *
from sq
Plan hash value: 630596523
—————————————————————————————————————————————————————————————————————————————
|Id|Operation |Name |Rows|Bytes|Cost (%CPU)|Time |
—————————————————————————————————————————————————————————————————————————————
| 0|SELECT STATEMENT | | | | 32 (100)| |
| 1| TEMP TABLE TRANSFORMATION | | | | | |
| 2| LOAD AS SELECT | | | | | |
| 3| COLLECTION ITERATOR PICKLER...|STR2TBL|8168|16336| 29 (0)|00:00:01|
| 4| VIEW | |8168| 135K| 3 (0)|00:00:01|
| 5| TABLE ACCESS FULL |SYS_...|8168|16336| 3 (0)|00:00:01|
—————————————————————————————————————————————————————————————————————————————
18 rows selected.
--whoop,第一次还没生效呢,没关系, one more try
with sq as (select /*+ materialize */ * from table( str2tbl( :x ) )
) select * from sq
Plan hash value: 630596523
—————————————————————————————————————————————————————————————————————————————
|Id|Operation |Name |Rows|Bytes|Cost (%CPU)|Time |
—————————————————————————————————————————————————————————————————————————————
| 0|SELECT STATEMENT | | | 32 (100)| |
| 1| TEMP TABLE TRANSFORMATION | | | | |
| 2| LOAD AS SELECT | | | | |
| 3| COLLECTION ITERATOR PICKLER...|STR2TBL|8168|16336| 29 (0)|00:00:01|
| 4| VIEW | 6| 102 | 3 (0)|00:00:01|
| 5| TABLE ACCESS FULL |SYS_...| 6| 12| 3 (0)|00:00:01|
—————————————————————————————————————————————————————————————————————————————
Note
———————
- cardinality feedback used for this statement
22 rows selected.
--这一次OK了
--揭晓一下基数反馈的原理,下面直接引用TOM的话:
Cardinality Feedback works by having the optimizer change its cardinality estimates after executing a query for the first time and observing that the actual cardinalities were very far off from the estimated cardinalities. That is, the optimizer starts to learn from its mistakes. If it executes a query and discovers that the real row counts are far off from the estimated counts, it will reoptimize the query, using the newly discovered values.
I have a question about joining with collections and cardinality estimation. The optimizer is always estimating that 8,168 rows are coming back from my collection, and because of that, it is using inefficient plans. The estimate of 8,168 is more than two orders of magnitude more than the real cardinality. How can I solve this problem?
我有一个关于集合和基数估计。优化器针对我的集合基数估计总是8168,就因为这,它老师使用低效率计划。8168是实际基数的2个数量级!
我该如何解决这个问题?
This is a long-running issue with pipelined functions and collections during optimization. The optimizer in general doesn’t have any information about the cardinality (the number of rows) being returned by the collection. It has to guess—and that guess is based on the block size (default statistics are block-size-driven). So, for a database with an 8 K block size, the guess is about 8,000. And because it is unlikely that your collection has about 8,000 elements (probably more like 8 or 80 in most cases), you can definitely arrive at a suboptimal plan.
这的确是个长期提出的问题:关于管道函数和集合的优化。优化器通常对于集合的基数信息毫无所知。它必须去猜!怎么猜?是基于块大小。
所以,对于8k大小数据块的数据库,猜出来的基数就是8000左右!但是通常集合的基数并不是8000,其更可能是8或80。
下面,我提供4种方法来获得正确的集合基数:
1)The cardinality hint (undocumented) 基数提示(无文件证明)
2)The OPT_ESTIMATE hint (undocumented) OPT_ESTIMATE提示(无文件证明)
3)Dynamic sampling (Oracle Database 11g Release 1 and later) 动态采样(Oracle 11g第一版或更高)
4)Oracle Database’s Cardinality Feedback feature (Oracle Database 11g Release 2 and later) Oracle基数反馈特性(Oracle 11g第2版或更高)
--1)用于字符串分割的管道函数
SQL> create or replace type str2tblType as table of varchar2(30)
/
Type created.
SQL> create or replace
function str2tbl( p_str in varchar2, p_delim in varchar2 default ‘,‘ )
return str2tblType
PIPELINED
as
l_str long default p_str || p_delim;
l_n number;
begin
loop
l_n := instr( l_str, p_delim );
exit when (nvl(l_n,0) = 0);
pipe row( ltrim(rtrim(substr(l_str,1,l_n-1))) );
l_str := substr( l_str, l_n+1 );
end loop;
end;
/
Function created.
SQL> variable x varchar2(15)
SQL> exec :x := ‘1,2,3,a,b,c‘
PL/SQL procedure successfully completed.
SQL> select * from table(str2tbl(:x));
COLUMN_VALUE
——————————————————————————————————————
1
2
3
a
b
c
6 rows selected.
SQL> select * from table(dbms_xplan.display_cursor);
PLAN_TABLE_OUTPUT
————————————————————————————————————————————————————————
SQL_ID ddk1tv9s5pzq5, child number 0
————————————————————————————————————————————————————————
select * from table(str2tbl(:x))
Plan hash value: 2407808827
———————————————————————————————————————————————————————————————————————————
|Id|Operation |Name |Rows|Bytes|Cost (%CPU)|Time |
———————————————————————————————————————————————————————————————————————————
| 0|SELECT STATEMENT | | | | 29 (100)| |
| 1| COLLECTION ITERATOR PICKLER...|STR2TBL|8168|16336| 29 (0)|00:00:01|
--使用CARDINALITY 提示
SQL> select * from table(dbms_xplan.display_cursor);
PLAN_TABLE_OUTPUT
————————————————————————————————————————————————————————
SQL_ID bd2f8rh30z3ww, child number 0
————————————————————————————————————————————————————————
select /*+ cardinality(sq 10) */ * from table(str2tbl(:x)) sq
Plan hash value: 2407808827
———————————————————————————————————————————————————————————————————————————
|Id|Operation |Name |Rows|Bytes|Cost (%CPU)|Time |
———————————————————————————————————————————————————————————————————————————
| 0|SELECT STATEMENT | | | | 29 (100)| |
| 1| COLLECTION ITERATOR PICKLER...|STR2TBL| 10| 20| 29 (0)|00:00:01|
--这地方,我们给优化器提示一个10使其认为管道函数的结果基数为10
--2)使用OPT_ESTIMATE提示
--提供3个参数:对象类型;对象名;一个缩放因素,优化器将拿它认为的基数乘以这个因素
--所以,先算出这个大概的因素
SQL> select 10/8168 from dual;
10/8168
————————————————
.00122429
select /*+ opt_estimate(table, sq, scale_rows=0.00122429) */ *
from table(str2tbl(:x)) sq
Plan hash value: 2407808827
———————————————————————————————————————————————————————————————————————————
|Id|Operation |Name |Rows|Bytes|Cost (%CPU)|Time |
———————————————————————————————————————————————————————————————————————————
| 0|SELECT STATEMENT | | | | 29 (100)| |
| 1| OLLECTION ITERATOR PICKLER...|STR2TBL| 10| 20| 29 (0)|00:00:01|
--3)使用DYNAMIC SAMPLING 提示
select /*+ dynamic_sampling( sq, 2 ) */ * from table( str2tbl(:x,‘,‘) ) sq
Plan hash value: 2407808827
———————————————————————————————————————————————————————————————————————————
|Id|Operation |Name |Rows|Bytes|Cost (%CPU)|Time |
———————————————————————————————————————————————————————————————————————————
| 0|SELECT STATEMENT | | | | 11 (100)| |
| 1| COLLECTION ITERATOR PICKLER...|STR2TBL| 6| 12| 11 (0)|00:00:01|
———————————————————————————————————————————————————————————————————————————
Note
———————
dynamic sampling used for this statement (level=2)
--4)使用Cardinality Feedback
--这地方我们要将查询变一下形:
with sq
as (
select /*+ materialize */ *
from table( str2tbl( :x ) )
)
select *
from sq
Plan hash value: 630596523
—————————————————————————————————————————————————————————————————————————————
|Id|Operation |Name |Rows|Bytes|Cost (%CPU)|Time |
—————————————————————————————————————————————————————————————————————————————
| 0|SELECT STATEMENT | | | | 32 (100)| |
| 1| TEMP TABLE TRANSFORMATION | | | | | |
| 2| LOAD AS SELECT | | | | | |
| 3| COLLECTION ITERATOR PICKLER...|STR2TBL|8168|16336| 29 (0)|00:00:01|
| 4| VIEW | |8168| 135K| 3 (0)|00:00:01|
| 5| TABLE ACCESS FULL |SYS_...|8168|16336| 3 (0)|00:00:01|
—————————————————————————————————————————————————————————————————————————————
18 rows selected.
--whoop,第一次还没生效呢,没关系, one more try
with sq as (select /*+ materialize */ * from table( str2tbl( :x ) )
) select * from sq
Plan hash value: 630596523
—————————————————————————————————————————————————————————————————————————————
|Id|Operation |Name |Rows|Bytes|Cost (%CPU)|Time |
—————————————————————————————————————————————————————————————————————————————
| 0|SELECT STATEMENT | | | 32 (100)| |
| 1| TEMP TABLE TRANSFORMATION | | | | |
| 2| LOAD AS SELECT | | | | |
| 3| COLLECTION ITERATOR PICKLER...|STR2TBL|8168|16336| 29 (0)|00:00:01|
| 4| VIEW | 6| 102 | 3 (0)|00:00:01|
| 5| TABLE ACCESS FULL |SYS_...| 6| 12| 3 (0)|00:00:01|
—————————————————————————————————————————————————————————————————————————————
Note
———————
- cardinality feedback used for this statement
22 rows selected.
--这一次OK了
--揭晓一下基数反馈的原理,下面直接引用TOM的话:
Cardinality Feedback works by having the optimizer change its cardinality estimates after executing a query for the first time and observing that the actual cardinalities were very far off from the estimated cardinalities. That is, the optimizer starts to learn from its mistakes. If it executes a query and discovers that the real row counts are far off from the estimated counts, it will reoptimize the query, using the newly discovered values.
关于基数反馈请参考:http://www.oracle.com/technetwork/issue-archive/2010/10-sep/o50asktom-165477.html
--------------------------------
Dylan Presents.
如何获得正确的基数估计值
声明:以上内容来自用户投稿及互联网公开渠道收集整理发布,本网站不拥有所有权,未作人工编辑处理,也不承担相关法律责任,若内容有误或涉及侵权可进行投诉: 投诉/举报 工作人员会在5个工作日内联系你,一经查实,本站将立刻删除涉嫌侵权内容。