欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  数据库

一次EVENT 10200寻找Buffer Get之旅,最终以狗血案例而告终

程序员文章站 2022-06-05 20:38:36
...

很多案件的最终结局其实都是很狗血的,这不我也深深的经历了一遭。整个过程感觉就像小时候大家伙玩传奇一样,从迷宫一层一层的下,下到最底层之后发现,妹的,居然不是大Boss,居然是个小怪。浪费了我大量的精力。首先介绍一下背景,11g一条语句比10g一条语句

很多案件的最终结局其实都是很狗血的,这不我也深深的经历了一遭。整个过程感觉就像小时候大家伙玩>一样,从迷宫一层一层的下,下到最底层之后发现,妹的,居然不是大Boss,居然是个小怪。浪费了我大量的精力。首先介绍一下背景,11g一条语句比10g一条语句的cost下降了很多,但是buffer get确增加了很多。相关表的数据量其实没多少差异。所以当时我就在思考,这多出来的buffer get是那些块,涉及的对象是那些。

================================》10g执行计划
Execution Plan 
---------------------------------------------------------- 
Plan hash value: 2721392943 
------------------------------------------------------------------------------------------------------------------------------------- 
| Id  | Operation                                | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop | 
------------------------------------------------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT                         |                          |     1 |       |   225   (1)| 00:00:03 |       |       | 
|   1 |  SORT AGGREGATE                          |                          |     1 |       |            |          |       |       | 
|   2 |   VIEW                                   |                          |    21 |       |   225   (1)| 00:00:03 |       |       | 
|   3 |    HASH GROUP BY                         |                          |    21 |  1701 |   225   (1)| 00:00:03 |       |       | 
|   4 |     VIEW                                 |                          |    21 |  1701 |   224   (1)| 00:00:03 |       |       | 
|   5 |      UNION-ALL                           |                          |       |       |            |          |       |       | 
|   6 |       NESTED LOOPS OUTER                 |                          |    20 |   860 |   191   (1)| 00:00:03 |       |       | 
|   7 |        TABLE ACCESS BY GLOBAL INDEX ROWID| UR_USER_INFO             |     6 |   186 |     8   (0)| 00:00:01 | ROWID | ROWID | 
|*  8 |         INDEX RANGE SCAN                 | IDX_USERINFO_IX          |     6 |       |     3   (0)| 00:00:01 |       |       | 
|   9 |        PARTITION RANGE ALL               |                          |     3 |    36 |    30   (0)| 00:00:01 |     1 |    18 | 
|  10 |         TABLE ACCESS BY LOCAL INDEX ROWID| OR_ORDERLINE_INFO        |     3 |    36 |    30   (0)| 00:00:01 |     1 |    18 | 
|* 11 |          INDEX RANGE SCAN                | IDX_OR_ORDERLINE_INFO_IN |     3 |       |    29   (0)| 00:00:01 |     1 |    18 | 
|  12 |       NESTED LOOPS OUTER                 |                          |     1 |    43 |    33   (0)| 00:00:01 |       |       | 
|  13 |        PARTITION RANGE ALL               |                          |     1 |    34 |    32   (0)| 00:00:01 |     1 |    18 | 
|* 14 |         TABLE ACCESS BY LOCAL INDEX ROWID| OR_ORDERLINE_INFO        |     1 |    34 |    32   (0)| 00:00:01 |     1 |    18 | 
|* 15 |          INDEX RANGE SCAN                | IDX_OR_ORDERLINE_INFO_CI |     6 |       |    30   (0)| 00:00:01 |     1 |    18 | 
|  16 |        PARTITION RANGE ITERATOR          |                          |     1 |     9 |     1   (0)| 00:00:01 |   KEY |   KEY | 
|* 17 |         INDEX UNIQUE SCAN                | PK_UR_USER_INFO          |     1 |     9 |     1   (0)| 00:00:01 |   KEY |   KEY | 
------------------------------------------------------------------------------------------------------------------------------------- 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
   8 - access("USR"."CUST_ID"=TO_NUMBER(:CUST_ID)) 
  11 - access("USR"."ID_NO"="OL"."ID_NO"(+)) 
  14 - filter("OL"."DATA_STATUS"='00') 
  15 - access("OL"."CUST_ID"=TO_NUMBER(:CUST_ID)) 
  17 - access("USR"."ID_NO"(+)="OL"."ID_NO") 
Statistics 
---------------------------------------------------------- 
          0  recursive calls 
          0  db block gets 
        582  consistent gets 
          0  physical reads 
          0  redo size 
        657  bytes sent via SQL*Net to client 
        492  bytes received via SQL*Net from client 
          2  SQL*Net roundtrips to/from client 
          0  sorts (memory) 
          0  sorts (disk) 
          1  rows processed
================================》11g执行计划
Execution Plan 
---------------------------------------------------------- 
Plan hash value: 2721392943 
------------------------------------------------------------------------------------------------------------------------------------- 
| Id  | Operation                                | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop | 
------------------------------------------------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT                         |                          |     1 |       |    55   (2)| 00:00:01 |       |       | 
|   1 |  SORT AGGREGATE                          |                          |     1 |       |            |          |       |       | 
|   2 |   VIEW                                   |                          |     3 |       |    55   (2)| 00:00:01 |       |       | 
|   3 |    HASH GROUP BY                         |                          |     3 |   231 |    55   (2)| 00:00:01 |       |       | 
|   4 |     VIEW                                 |                          |     3 |   231 |    54   (0)| 00:00:01 |       |       | 
|   5 |      UNION-ALL                           |                          |       |       |            |          |       |       | 
|   6 |       NESTED LOOPS OUTER                 |                          |     2 |    86 |    36   (0)| 00:00:01 |       |       | 
|   7 |        TABLE ACCESS BY GLOBAL INDEX ROWID| UR_USER_INFO             |     2 |    62 |     4   (0)| 00:00:01 | ROWID | ROWID | 
|*  8 |         INDEX RANGE SCAN                 | IDX_USERINFO_IX          |     2 |       |     2   (0)| 00:00:01 |       |       | 
|   9 |        PARTITION RANGE ALL               |                          |     1 |    12 |    16   (0)| 00:00:01 |     1 |    18 | 
|  10 |         TABLE ACCESS BY LOCAL INDEX ROWID| OR_ORDERLINE_INFO        |     1 |    12 |    16   (0)| 00:00:01 |     1 |    18 | 
|* 11 |          INDEX RANGE SCAN                | IDX_OR_ORDERLINE_INFO_IN |     2 |       |    14   (0)| 00:00:01 |     1 |    18 | 
|  12 |       NESTED LOOPS OUTER                 |                          |     1 |    43 |    18   (0)| 00:00:01 |       |       | 
|  13 |        PARTITION RANGE ALL               |                          |     1 |    34 |    17   (0)| 00:00:01 |     1 |    18 | 
|* 14 |         TABLE ACCESS BY LOCAL INDEX ROWID| OR_ORDERLINE_INFO        |     1 |    34 |    17   (0)| 00:00:01 |     1 |    18 | 
|* 15 |          INDEX RANGE SCAN                | IDX_OR_ORDERLINE_INFO_CI |     3 |       |    15   (0)| 00:00:01 |     1 |    18 | 
|  16 |        PARTITION RANGE ITERATOR          |                          |     1 |     9 |     1   (0)| 00:00:01 |   KEY |   KEY | 
|* 17 |         INDEX UNIQUE SCAN                | PK_UR_USER_INFO          |     1 |     9 |     1   (0)| 00:00:01 |   KEY |   KEY | 
------------------------------------------------------------------------------------------------------------------------------------- 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
   8 - access("USR"."CUST_ID"=TO_NUMBER(:CUST_ID)) 
  11 - access("USR"."ID_NO"="OL"."ID_NO"(+)) 
  14 - filter("OL"."DATA_STATUS"='00') 
  15 - access("OL"."CUST_ID"=TO_NUMBER(:CUST_ID)) 
  17 - access("USR"."ID_NO"(+)="OL"."ID_NO") 
Statistics 
---------------------------------------------------------- 
          0  recursive calls 
          0  db block gets 
        837  consistent gets 
          0  physical reads 
          0  redo size 
        676  bytes sent via SQL*Net to client 
        520  bytes received via SQL*Net from client 
          2  SQL*Net roundtrips to/from client 
          0  sorts (memory) 
          0  sorts (disk) 
          1  rows processed

事先得说明一下,11g的执行计划里面涉及的表和索引我都重新收集了统计信息。从这个执行计划中可以看出,我们的consistent gets从582增加到了837,而执行计划的cost其实是下降了,从225下降到55。所以我们从多出的这200多个consistent gets着手。如果我们要追踪这些块,这里需要使用10200 event。

1.做10200 event,从trace里面找信息

*** 2014-05-19 15:55:43.558
*** ACTION NAME:() 2014-05-19 15:55:43.556
*** MODULE NAME:(sqlplus@w25k03da (TNS V1-V3)) 2014-05-19 15:55:43.556
*** SERVICE NAME:(SYS$USERS) 2014-05-19 15:55:43.556
*** SESSION ID:(5146.581) 2014-05-19 15:55:43.556
Consistent read started for block 34 : 11c999f4
  env: (scn: 0x0af1.fa66895f  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  statement num=0  parent xid: xid: 0x0000.000.00000000  scn: 0x0000.00000000 8sch: scn: 0x0000.00000000)
CR exa ret 2 on:  3800103c0  scn: 0xffff.ffffffff  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  scn: 0xffff.ffffffff  sfl: 0
Consistent read finished for block 34 : 11c999f4
Consistent read started for block 34 : 7ece819f
  env: (scn: 0x0af1.fa66895f  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  statement num=0  parent xid: xid: 0x0000.000.00000000  scn: 0x0000.00000000 8sch: scn: 0x0000.00000000)
CR exa ret 2 on:  3800103c0  scn: 0xffff.ffffffff  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  scn: 0xffff.ffffffff  sfl: 0
Consistent read finished for block 34 : 7ece819f
*** 2014-05-19 15:35:19.883
ktrgtc2(): started for block  objd: 0x00020c7c
  env [0x9fffffffbf36cf94]: (scn: 0x0aef.60c6b2af  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  statement num=0  parent xid: 0x0000.000.00000000  st-scn: 0x0000.00000000  hi-scn: 0x0000.00000000  ma-scn: 0x0000.00000000  flg: 0x00000660)
ktrexc(): returning 2 on:  6000000000177a70  cr-scn: 0xffff.ffffffff  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  cl-scn: 0xffff.ffffffff  sfl: 0
ktrgtc2(): completed for block  objd: 0x00020c7c
ktrgtc2(): started for block  objd: 0x00020c7c
  env [0x9fffffffbf36cf94]: (scn: 0x0aef.60c6b2af  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  statement num=0  parent xid: 0x0000.000.00000000  st-scn: 0x0000.00000000  hi-scn: 0x0000.00000000  ma-scn: 0x0000.00000000  flg: 0x00000660)
ktrexc(): returning 2 on:  6000000000177a70  cr-scn: 0xffff.ffffffff  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  cl-scn: 0xffff.ffffffff  sfl: 0
ktrgtc2(): completed for block  objd: 0x00020c7c
ktrget2(): started for block   objd: 0x00020c7c
env [0x9fffffffbf36cf94]: (scn: 0x0aef.60c6b2af  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  statement num=0  parent xid: 0x0000.000.00000000  st-scn: 0x0000.00000000  hi-scn: 0x0000.00000000  ma-scn: 0x0000.00000000  flg: 0x00000660)
showing 0xc000000a65ec1b18 694591c4 (2) dscn ffffffff:ffff bcrp ffff:ffffffff, bestcrp (nil)
ktrexf(): returning 9 on:  6000000000177a70  cr-scn: 0xffff.ffffffff  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  cl-scn: 0xffff.ffffffff  sfl: 0
new dscn ffff:ffffffff ret=9
ktrgcm(): completed for block   objd: 0x00020c7c
ktrget3(): completed for  block  objd: 0x00020c7c

上面可以看到,在10g中,我们10200的一致读的信息是:Consistent read started for block 34 : 11c999f4,而我们11g的一致读的信息是ktrgtc2(): started for block objd: 0x00020c7c.可以看到11g比10g的trace更加的智能,它直接告诉了你object_id号,这可以帮助我们快速定位到相关的对象。而10g呢?它给出的只是一个DBA的地址,我们要通过相关函数先去转换成文件号和块号,然后通过文件号和块号去dba_extents去查具体对象名。

2.根据trace的地址找到真实的object

================================》10g Trace
Consistent read started for block 34 : 11c999f4
Consistent read started for block 34 : 7ece819f
Consistent read started for block 34 : 4ac285db
Consistent read started for block 34 : 8347e921
Consistent read started for block 12 : 0c06184e
Consistent read started for block 36 : 25c92364
Consistent read started for block 36 : 2bc7db6d
Consistent read started for block 36 : 2a01835b
Consistent read started for block 36 : 25c9237c
Consistent read started for block 36 : 25c9237f
Consistent read started for block 36 : 25c92384
Consistent read started for block 36 : 25c9898c
Consistent read started for block 36 : 25c9899c
Consistent read started for block 36 : 25c9899e
SELECT DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE (TO_NUMBER ('11C999F4', 'XXXXXXXX'))AS FILE_NO,DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK (TO_NUMBER ('11C999F4', 'XXXXXXXX')) AS BLOCK_NO FROM DUAL;
SELECT SEGMENT_NAME FROM DBA_EXTENTS A WHERE FILE_ID=71   AND  629236   BETWEEN BLOCK_ID AND BLOCK_ID+BLOCKS-1; 
================================》11g Trace
ktrgtc2(): started for block   objd: 0x00020c7c 
ktrgtc2(): started for block   objd: 0x00020c7c 
ktrget2(): started for block   objd: 0x00020c7c
ktrget2(): started for block   objd: 0x0001b36a
ktrgtc2(): started for block   objd: 0x0002149d 
ktrget2(): started for block   objd: 0x0002149d
ktrget2(): started for block   objd: 0x0002149e
ktrget2(): started for block   objd: 0x0002149e
ktrget2(): started for block   objd: 0x0002149f
ktrget2(): started for block   objd: 0x000214a0
ktrget2(): started for block   objd: 0x000214a1
ktrget2(): started for block   objd: 0x000214a1
select to_number('20c7c','XXXXX') from dual;
select object_name from dba_objects where object_id='134268';

通过上述查询及统计,我们定位出,10g里面查询索引IDX_OR_ORDERLINE_INFO_IN,读取块514次,而在11g里面查询索引IDX_OR_ORDERLINE_INFO_IN读取783次,所以这个索引应该有问题,我们来看看索引的情况。

================================》10g
SQL> select INDEX_NAME,PARTITION_NAME,BLEVEL,LEAF_BLOCKS,NUM_ROWS from DBA_IND_PARTITIONS where INDEX_NAME='IDX_OR_ORDERLINE_INFO_IN';
INDEX_NAME                     PARTITION_NAME                     BLEVEL LEAF_BLOCKS   NUM_ROWS
------------------------------ ------------------------------ ---------- ----------- ----------
IDX_OR_ORDERLINE_INFO_IN       PART_01                                 2        1106      52833
IDX_OR_ORDERLINE_INFO_IN       PART_02                                 1           3        381
IDX_OR_ORDERLINE_INFO_IN       PART_03                                 0           1          4
IDX_OR_ORDERLINE_INFO_IN       PART_04                                 0           1         10
IDX_OR_ORDERLINE_INFO_IN       PART_05                                 1           3        206
IDX_OR_ORDERLINE_INFO_IN       PART_06                                 1         165       3766
IDX_OR_ORDERLINE_INFO_IN       PART_07                                 0           1          4
IDX_OR_ORDERLINE_INFO_IN       PART_08                                 0           1          9
IDX_OR_ORDERLINE_INFO_IN       PART_09                                 0           1         14
IDX_OR_ORDERLINE_INFO_IN       PART_10                                 1         496      10939
IDX_OR_ORDERLINE_INFO_IN       PART_11                                 1         448      10377
IDX_OR_ORDERLINE_INFO_IN       PART_12                                 0           1         54
IDX_OR_ORDERLINE_INFO_IN       PART_13                                 0           1         13
IDX_OR_ORDERLINE_INFO_IN       PART_14                                 0           1          4
IDX_OR_ORDERLINE_INFO_IN       PART_15                                 1          16       4431
IDX_OR_ORDERLINE_INFO_IN       PART_16                                 1          88       3380
IDX_OR_ORDERLINE_INFO_IN       PART_17                                 1         292       5994
IDX_OR_ORDERLINE_INFO_IN       PART_18                                 0           0          0
================================》11g
SQL> select INDEX_NAME,PARTITION_NAME,BLEVEL,LEAF_BLOCKS,NUM_ROWS from DBA_IND_PARTITIONS where INDEX_NAME='IDX_OR_ORDERLINE_INFO_IN';
INDEX_NAME                     PARTITION_NAME                     BLEVEL LEAF_BLOCKS   NUM_ROWS GLO
------------------------------ ------------------------------ ---------- ----------- ---------- ---
IDX_OR_ORDERLINE_INFO_IN       PART_01                                 1         152      54315 YES
IDX_OR_ORDERLINE_INFO_IN       PART_02                                 1           2        426 YES
IDX_OR_ORDERLINE_INFO_IN       PART_03                                 0           1         23 YES
IDX_OR_ORDERLINE_INFO_IN       PART_04                                 0           1         78 YES
IDX_OR_ORDERLINE_INFO_IN       PART_05                                 1           2        621 YES
IDX_OR_ORDERLINE_INFO_IN       PART_06                                 1          15       5248 YES
IDX_OR_ORDERLINE_INFO_IN       PART_07                                 0           1         66 YES
IDX_OR_ORDERLINE_INFO_IN       PART_08                                 0           1        244 YES
IDX_OR_ORDERLINE_INFO_IN       PART_09                                 0           1         18 YES
IDX_OR_ORDERLINE_INFO_IN       PART_10                                 1          38      13537 YES
IDX_OR_ORDERLINE_INFO_IN       PART_11                                 1          42      15065 YES
IDX_OR_ORDERLINE_INFO_IN       PART_12                                 0           1        107 YES
IDX_OR_ORDERLINE_INFO_IN       PART_13                                 0           1         17 YES
IDX_OR_ORDERLINE_INFO_IN       PART_14                                 0           1         33 YES
IDX_OR_ORDERLINE_INFO_IN       PART_15                                 1          13       4431 YES
IDX_OR_ORDERLINE_INFO_IN       PART_16                                 1          12       4081 YES
IDX_OR_ORDERLINE_INFO_IN       PART_17                                 1          28       9846 YES
IDX_OR_ORDERLINE_INFO_IN       PART_18                                 0           0          0 YES

因为11g是新创建的索引,我们可以看到其实它的leaf_blocks其实比较少,而相同的10g是很多的。并且PART_01这个分区在10g上索引的层级是2级。那么按照正常思维来看,10g需要扫描更多的索引块才能获取到相关数据才对。但是我们看到的情况正好相反。那么问题究竟出在哪儿?

此时一个同事提醒我,把语句真实的运行了一把,运行出来后,顿时恍然大悟,这就是一个狗血的案例。简直是一个血的教训啊。因为环境的原因,10g上和11g上有些数据差异,但是数据量的差异其实是很小的(这点从统计信息上可以看到)。而11g统计信息也是重新收集了。问题就出在数据的差异导致11g查询出来的结果集是29条,而10g查询出来的是19条,查询的结果集多了,当然buffer get就多了。这就是最大的原因。

SQL> select a.ID_NO as ID_NO,
                  min(a.DATA_STATUS) as DATA_STATUS,
                  min(a.IS_OPEN) as IS_OPEN,
             from (select usr.id_no AS ID_NO,
                          decode(ol.data_status, null, '99', '01') AS DATA_STATUS,
                          nvl(ol.id_no, 0) as IS_OPEN,
                          usr.phone_no as SERVICE_NO
                     from UR_USER_INFO usr, OR_ORDERLINE_INFO ol
                    where usr.cust_id = :CUST_ID
                      and usr.ID_NO = ol.ID_NO(+)
                   union all
                   select ol.id_no AS ID_NO,
                          ol.data_status AS DATA_STATUS,
                          nvl(usr.id_no, 1) as IS_OPEN,
                          ol.service_no as SERVICE_NO
                     from UR_USER_INFO usr, OR_ORDERLINE_INFO ol
                    where ol.cust_id = :CUST_ID
                      and usr.ID_NO(+) = ol.ID_NO
                      and ol.DATA_STATUS = '00' and rownum select a.ID_NO as ID_NO,
                min(a.DATA_STATUS) as DATA_STATUS,
                min(a.IS_OPEN) as IS_OPEN,
           from (select usr.id_no AS ID_NO,
                        decode(ol.data_status, null, '99', '01') AS DATA_STATUS,
                        nvl(ol.id_no, 0) as IS_OPEN,
                        usr.phone_no as SERVICE_NO
                   from UR_USER_INFO usr, OR_ORDERLINE_INFO ol
                  where usr.cust_id = :CUST_ID
                    and usr.ID_NO = ol.ID_NO(+)
                 union all
                 select ol.id_no AS ID_NO,
                        ol.data_status AS DATA_STATUS,
                        nvl(usr.id_no, 1) as IS_OPEN,
                        ol.service_no as SERVICE_NO
                   from UR_USER_INFO usr, OR_ORDERLINE_INFO ol
                  where ol.cust_id = :CUST_ID
                    and usr.ID_NO(+) = ol.ID_NO
                    and ol.DATA_STATUS = '00') a
          group by a.id_no;
     ID_NO DA    IS_OPEN 
---------- -- ---------- 
1.6310E+13 99          0 
1.6310E+13 99          0 
1.6310E+13 99          0 
1.6001E+13 99          0 
1.6310E+13 99          0 
1.6310E+13 99          0 
1.6310E+13 99          0 
1.6310E+13 99          0 
1.6310E+13 99          0 
1.6310E+13 99          0 
1.6310E+13 99          0 
1.6001E+13 99          0 
1.6310E+13 99          0 
1.6001E+13 99          0 
1.6001E+13 99          0 
1.6001E+13 99          0 
1.6310E+13 99          0 
1.6310E+13 99          0 
1.6310E+13 99          0 
19 rows selected.

总结:不要太过依赖于执行计划出现的那些信息,有时候需要把语句真实运行一把。