欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  IT编程

记录一次没有收集直方图优化器选择全表扫描导致CPU耗尽

程序员文章站 2022-06-23 21:26:59
场景:数据库升级第二天,操作系统CPU使用率接近100%。 查看ash报告: 再看TOP SQL 具体SQL: select count(1) as chipinCount, sum(bets) as sumBets from t_chipin_temp where status in (0) an ......

 

场景:数据库升级第二天,操作系统CPU使用率接近100%。


查看ash报告:

记录一次没有收集直方图优化器选择全表扫描导致CPU耗尽

 

再看TOP SQL

 记录一次没有收集直方图优化器选择全表扫描导致CPU耗尽

 

 

具体SQL:

select count(1) as chipinCount, sum(bets) as sumBets from t_chipin_temp where status in (0) and lottype='gp'


看到这个sql的执行计划走全表扫描。再查看相关列上是否有索引,结果是有索引的。那么问题来了,既然有索引,而且升级之后对该表格还进行过基本的统计信息收集,那么为什么优化器没有走索引。

记录一次没有收集直方图优化器选择全表扫描导致CPU耗尽

 

 

 

收集该sql的10053事件

SQL> select child_number from v$sql where sql_id='57tdrj6a4ync4';

CHILD_NUMBER
  ------------
        0

SQL> execute DBMS_SQLDIAG.DUMP_TRACE(p_sql_id=>'57tdrj6a4ync4', p_child_number=>0, p_component=>'Optimizer', p_file_id=>'SQL_TRACE_10053');

PL/SQL procedure successfully completed.

SQL> SELECT value FROM v$diag_info WHERE name='Default Trace File';

VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl11g/ORCL/trace/ORCL_ora_32070_SQL_TRACE_10053.trc

 

 

 vi  /u01/app/oracle/diag/rdbms/orcl11g/ORCL/trace/ORCL_ora_32070_SQL_TRACE_10053.trc

Access path analysis for T_CHIPIN_TEMP
***************************************
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for T_CHIPIN_TEMP[T_CHIPIN_TEMP]
Column (#8): STATUS(
AvgLen: 4 NDV: 4 Nulls: 0 Density: 0.250000 Min: -2 Max: 4
Column (#44): lottype(
AvgLen: 3 NDV: 2 Nulls: 0 Density: 0.500000
ColGroup (#4, Index) TX_CHIPIN_TMP_TIMCODE2
Col#: 5 11 31 33 CorStregth: -1.00
ColGroup (#2, Index) IDX_T_CHIPIN_TEMP
Col#: 13 34 44 CorStregth: -1.00
ColGroup (#3, Index) TX_CHIPIN_TMP_TIME2
Col#: 8 20 44 CorStregth: -1.00
ColGroup (#5, Index) TX_CHIPIN_TMP_AWARD2
Col#: 15 35 36 CorStregth: -1.00
ColGroup (#6, Index) IX_CHIPIN_TEMP_STALOC2
Col#: 8 44 CorStregth: 2.67
ColGroup (#1, Index) IDX_S_P
Col#: 8 31 CorStregth: -1.00
ColGroup Usage:: PredCnt: 2 Matches Full: #6 Partial: Sel: 0.3420
Table: T_CHIPIN_TEMP Alias: T_CHIPIN_TEMP
Card: Original: 5671105.00 Rounded: 1939622 Computed: 1939621.71 Non Adjusted: 1939621.71
           原始行数                    近似值                       精确值                            非修正值

Access Path: TableScan
Cost: 62160.58 Resp: 62160.58 Degree: 0
Cost_io: 61966.00 Cost_cpu: 7711574370
Resp_io: 61966.00 Resp_cpu: 7711574370
kkofmx: index filter:"T_CHIPIN_TEMP"."lottype"=:B1

kkofmx: index filter:"T_CHIPIN_TEMP"."STATUS"=:B1

kkofmx: index filter:"T_CHIPIN_TEMP"."lottype"=:B1

 

Access Path: index (RangeScan)
Index: IDX_S_P
resc_io: 918156.00 resc_cpu: 8328403927
ix_sel: 0.250000 ix_sel_with_filters: 0.250000
Cost: 918366.14 Resp: 918366.14 Degree: 1

Access Path: index (skip-scan)
SS scan sel: 0.500000 SS filter sel: 0.500000 ANDV (#skips): 564.000000
SS io: 14684.000000 vs. table scan io: 61966.000000
Skip Scan chosen
Access Path: index (SkipScan)
Index: IDX_T_CHIPIN_TEMP
resc_io: 226050.00 resc_cpu: 5276879027
ix_sel: 0.500000 ix_sel_with_filters: 0.500000
Cost: 226183.15 Resp: 226183.15 Degree: 1
ColGroup Usage:: PredCnt: 2 Matches Full: #6 Partial: Sel: 0.3420
ColGroup Usage:: PredCnt: 2 Matches Full: #6 Partial: Sel: 0.3420

Access Path: index (AllEqRange)
Index: IX_CHIPIN_TEMP_STALOC2
resc_io: 128169.00 resc_cpu: 3259690463
ix_sel: 0.350930 ix_sel_with_filters: 0.350930
Cost: 128251.25 Resp: 128251.25 Degree: 1
ColGroup Usage:: PredCnt: 2 Matches Full: #6 Partial: Sel: 0.3420
Access Path: index (skip-scan)
SS scan sel: 0.344312 SS filter sel: 0.344312 ANDV (#skips): 1579520.000000
SS io: 20452.000000 vs. table scan io: 61966.000000
Skip Scan chosen
Access Path: index (SkipScan)
Index: TX_CHIPIN_TMP_TIME2
resc_io: 1157882.00 resc_cpu: 10592729810
ix_sel: 0.344312 ix_sel_with_filters: 0.344312
Cost: 1158149.27 Resp: 1158149.27 Degree: 1
****** trying bitmap/domain indexes ******
****** finished trying bitmap/domain indexes ******
******** Begin index join costing ********
****** trying bitmap/domain indexes ******
ColGroup Usage:: PredCnt: 2 Matches Full: #6 Partial: Sel: 0.3420
ColGroup Usage:: PredCnt: 2 Matches Full: #6 Partial: Sel: 0.3420
ColGroup Usage:: PredCnt: 2 Matches Full: #6 Partial: Sel: 0.3420
Access Path: index (AllEqRange)
Index: IX_CHIPIN_TEMP_STALOC2
resc_io: 4610.00 resc_cpu: 420754238
ix_sel: 0.350930 ix_sel_with_filters: 0.350930
Cost: 4620.62 Resp: 4620.62 Degree: 0
Bitmap nodes:
Used IX_CHIPIN_TEMP_STALOC2
Cost = 4627.223361, sel = 0.342018
****** finished trying bitmap/domain indexes ******
******** End index join costing ********
Best:: AccessPath: TableScan
Cost: 62160.58 Degree: 1 Resp: 62160.58 Card: 1939621.71 Bytes: 0

***************************************

 

 

10053中看到因为没有直方图存在,所以这里的Column (#8) Density = 0.25 ,Column (#44) Density = 0.5 是从 1/ NDV 算得的,统计信息显示的status in (0)  数据行占总行数的1/4, lottype='gp' 数据行占总行数的1/2, 所以优化器选择做全表扫描.

 

SQL> select count(1) as chipinCount, sum(bets) as sumBets from t_chipin_temp where status in (0) and lottype='gp' ;

CHIPINCOUNT SUMBETS
-----------         ---------
   3                   20


以上符合status in (0) and lottype='gp'的只有3行,且status, lottype列上建有复合索引,索引是IX_CHIPIN_TEMP_STALOC2,同时也使用了dbms_stats包收集表和索引上的统计信息,照理说CBO应该选择 INDEX RANGE SCAN,而避免全表扫描,但实时上优化器opitimizer仍然全表扫描。

 

经过一番查阅,发现其原因是在收集统计信息时并没有收集到直方图,只要收集了直方图,那么优化器就会了解到status in (0) and lottype='gp' 条件仅有少量的行,优化器会走索引。

 

于是重新收集统计信息加上直方图:


execute dbms_stats.gather_table_stats(ownname => 'USER',tabname => 'T_CHIPIN_TEMP' ,method_opt => 'FOR ALL COLUMNS SIZE 2' ,cascade => true);

alter system flush shared_pool;

 

执行完以上操作后,系统CPU很快降了下来。

再看执行计划,便是 INDEX RANGE SCAN 了。

 

记录一次没有收集直方图优化器选择全表扫描导致CPU耗尽

 

 

 

再看收集直方图之后10053的内容:


Access path analysis for T_CHIPIN_TEMP
***************************************
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for T_CHIPIN_TEMP[T_CHIPIN_TEMP]
Column (#8):
NewDensity:0.000180, OldDensity:0.000000 BktCnt:5558, PopBktCnt:5558, PopValCnt:3, NDV:3
Column (#8): STATUS(
AvgLen: 4 NDV: 3 Nulls: 0 Density: 0.000180 Min: -2 Max: 4
Histogram: Freq #Bkts: 3 UncompBkts: 5558 EndPtVals: 3
Column (#44):
NewDensity:0.000090, OldDensity:0.000000 BktCnt:5559, PopBktCnt:5558, PopValCnt:1, NDV:2
Column (#44): lottype(
AvgLen: 3 NDV: 2 Nulls: 0 Density: 0.000090
Histogram: Freq #Bkts: 2 UncompBkts: 5559 EndPtVals: 2
ColGroup (#5, Index) TX_CHIPIN_TMP_TIMCODE2
Col#: 5 11 31 33 CorStregth: -1.00
ColGroup (#6, Index) TX_CHIPIN_TMP_AWARD2
Col#: 15 35 36 CorStregth: -1.00
ColGroup (#4, Index) TX_CHIPIN_TMP_TIME2
Col#: 8 20 44 CorStregth: -1.00
ColGroup (#3, Index) IDX_T_CHIPIN_TEMP
Col#: 13 34 44 CorStregth: -1.00
ColGroup (#1, Index) IX_CHIPIN_TEMP_STALOC2
Col#: 8 44 CorStregth: 2.00
ColGroup (#2, Index) IDX_S_P
Col#: 8 31 CorStregth: -1.00
ColGroup Usage:: PredCnt: 2 Matches Full: Partial:
Table: T_CHIPIN_TEMP Alias: T_CHIPIN_TEMP
Card: Original: 5682253.000000 Rounded: 1 Computed: 0.18 Non Adjusted: 0.18
Access Path: TableScan
Cost: 62159.09 Resp: 62159.09 Degree: 0
Cost_io: 61966.00 Cost_cpu: 7652553546
Resp_io: 61966.00 Resp_cpu: 7652553546
kkofmx: index filter:"T_CHIPIN_TEMP"."lottype"='gp'

kkofmx: index filter:"T_CHIPIN_TEMP"."STATUS"=1

kkofmx: index filter:"T_CHIPIN_TEMP"."lottype"='gp'

Access Path: index (RangeScan)
Index: IDX_S_P
resc_io: 667.00 resc_cpu: 6041476
ix_sel: 0.000180 ix_sel_with_filters: 0.000180
Cost: 667.15 Resp: 667.15 Degree: 1

Access Path: index (skip-scan)
SS scan sel: 0.000180 SS filter sel: 0.000180 ANDV (#skips): 564.000000
SS io: 14379.000000 vs. table scan io: 61966.000000
Skip Scan chosen
Access Path: index (SkipScan)
Index: IDX_T_CHIPIN_TEMP
resc_io: 14450.00 resc_cpu: 104195010
ix_sel: 0.000180 ix_sel_with_filters: 0.000180
Cost: 14452.63 Resp: 14452.63 Degree: 1
ColGroup Usage:: PredCnt: 2 Matches Full: Partial:
ColGroup Usage:: PredCnt: 2 Matches Full: Partial:
Access Path: index (AllEqRange)
Index: IX_CHIPIN_TEMP_STALOC2
resc_io: 4.00 resc_cpu: 29696
ix_sel: 0.000000 ix_sel_with_filters: 0.000000
Cost: 4.00 Resp: 4.00 Degree: 1
ColGroup Usage:: PredCnt: 2 Matches Full: Partial:
Access Path: index (skip-scan)
SS scan sel: 0.000000 SS filter sel: 0.000000 ANDV (#skips): 1582976.000000
SS io: 20631.000000 vs. table scan io: 61966.000000
Skip Scan chosen
Access Path: index (SkipScan)
Index: TX_CHIPIN_TMP_TIME2
resc_io: 20634.00 resc_cpu: 146945003
ix_sel: 0.000000 ix_sel_with_filters: 0.000000
Cost: 20637.71 Resp: 20637.71 Degree: 1
****** trying bitmap/domain indexes ******
****** finished trying bitmap/domain indexes ******
******** Begin index join costing ********
****** trying bitmap/domain indexes ******
ColGroup Usage:: PredCnt: 2 Matches Full: Partial:
ColGroup Usage:: PredCnt: 2 Matches Full: Partial:
ColGroup Usage:: PredCnt: 2 Matches Full: Partial:
Access Path: index (AllEqRange)
Index: IX_CHIPIN_TEMP_STALOC2
resc_io: 3.00 resc_cpu: 21564
ix_sel: 0.000000 ix_sel_with_filters: 0.000000
Cost: 3.00 Resp: 3.00 Degree: 0
Bitmap nodes:
Used IX_CHIPIN_TEMP_STALOC2
Cost = 3.000545, sel = 0.000000
****** finished trying bitmap/domain indexes ******
******** End index join costing ********
Best:: AccessPath: IndexRange
Index: IX_CHIPIN_TEMP_STALOC2
Cost: 4.00 Degree: 1 Resp: 4.00 Card: 0.18 Bytes: 0


从10053可以看出手动指定收集直方图后CBO优化器才能做出正确的选择

 

注:dbms_stats的自动决定直方图的收集与否及收集的桶数受到col_usage$表影响,如果某张表的列存在于col_usage$中,oracle就认为该列存在收集直方图的必要。

  Histograms are not gathered on columns for which there is no predicate information captured in the  col_usage$ view. Col_usage$ is populated with the column information only when queries are executed with columns referred in predicate information.