oracle中not exists对外层查询的影响详解
程序员文章站
2022-05-10 22:57:57
前言
最近同事发现了一个问题,在12c中跑的buffer get很高,但是在10g中跑的buffer很低。怀疑是不是12c的优化器有问题。
这个10g的环境和12c的环...
前言
最近同事发现了一个问题,在12c中跑的buffer get很高,但是在10g中跑的buffer很低。怀疑是不是12c的优化器有问题。
这个10g的环境和12c的环境,数据量大致一样,只是有很少部分的不同,但是就是这个很少部分不同,造成了not exists中的子查询返回不同的值,进而对外层查询产生不同的影响。
我们来用如下的代码模拟一下。
初始化数据:
--10g drop table t1; drop table t2; create table t1 (id number,name varchar2(20),dep_id varchar2(10)); create table t2 (id number,name varchar2(20),dep_id varchar2(10)); insert into t1 select rownum,'a','kk' from dual connect by level <=3000000; insert into t2 select rownum,'a','kk' from dual connect by level <=1000000; insert into t2 select rownum,'a','mm' from dual; commit; --12c drop table t1; drop table t2; create table t1 (id number,name varchar2(20),dep_id varchar2(10)); create table t2 (id number,name varchar2(20),dep_id varchar2(10)); insert into t1 select rownum,'a','kk' from dual connect by level <=3000000; insert into t2 select rownum,'a','kk' from dual connect by level <=1000000; commit;
我们看到,12c的数据和10g只是有很少的差别,t1表12c和10g都一样,t2表在12c只是少了一行数据。
--10g sql> select dep_id,count(*) from t1 group by dep_id; dep_id count(*) -------------------- ---------- kk 3000000 sql> select dep_id,count(*) from t2 group by dep_id; dep_id count(*) -------------------- ---------- mm 1 kk 1000000 sql> --12c sql> select dep_id,count(*) from t1 group by dep_id; dep_id count(*) -------------------- ---------- kk 3000000 sql> select dep_id,count(*) from t2 group by dep_id; dep_id count(*) -------------------- ---------- kk 1000000 sql>
我们将要执行的sql语句是:
select count(*) from t1, t2 where t1.id = t2.id and t1.dep_id = 'kk' and not exists (select 1 from t1, t2 where t1.id = t2.id and t2.dep_id = 'mm');
我们先来看执行情况的差距,10g的bufferget小,12c多:
--10g sql> select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm'); count(*) ---------- 0 sql> select* from table(dbms_xplan.display_cursor(null,null,'allstats last')); plan_table_output ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- sql_id 22t5mb43w55pr, child number 0 ------------------------------------- select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm') plan hash value: 3404612428 ------------------------------------------------------------------------------------------------------------------ | id | operation | name | starts | e-rows | a-rows | a-time | buffers | omem | 1mem | used-mem | ------------------------------------------------------------------------------------------------------------------ | 0 | select statement | | 1 | | 1 |00:00:00.02 | 2086 | | | | | 1 | sort aggregate | | 1 | 1 | 1 |00:00:00.02 | 2086 | | | | |* 2 | filter | | 1 | | 0 |00:00:00.02 | 2086 | | | | |* 3 | hash join | | 0 | 901k| 0 |00:00:00.01 | 0 | 39m| 5518k| | | 4 | table access full| t2 | 0 | 901k| 0 |00:00:00.01 | 0 | | | | |* 5 | table access full| t1 | 0 | 2555k| 0 |00:00:00.01 | 0 | | | | |* 6 | hash join | | 1 | 23 | 1 |00:00:00.02 | 2086 | 1517k| 1517k| 612k (0)| |* 7 | table access full| t2 | 1 | 23 | 1 |00:00:00.02 | 2082 | | | | | 8 | table access full| t1 | 1 | 2555k| 1 |00:00:00.01 | 4 | | | | ------------------------------------------------------------------------------------------------------------------ predicate information (identified by operation id): --------------------------------------------------- 2 - filter( is null) 3 - access("t1"."id"="t2"."id") 5 - filter("t1"."dep_id"='kk') 6 - access("t1"."id"="t2"."id") 7 - filter("t2"."dep_id"='mm') note ----- - dynamic sampling used for this statement 34 rows selected. sql> --12c sql> select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm'); count(*) ---------- 1000000 sql> select* from table(dbms_xplan.display_cursor(null,null,'allstats last')); plan_table_output ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ sql_id 22t5mb43w55pr, child number 0 ------------------------------------- select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm') plan hash value: 1692274438 -------------------------------------------------------------------------------------------------------------------- | id | operation | name | starts | e-rows | a-rows | a-time | buffers | omem | 1mem | used-mem | -------------------------------------------------------------------------------------------------------------------- | 0 | select statement | | 1 | | 1 |00:00:00.79 | 10662 | | | | | 1 | sort aggregate | | 1 | 1 | 1 |00:00:00.79 | 10662 | | | | |* 2 | filter | | 1 | | 1000k|00:00:00.74 | 10662 | | | | |* 3 | hash join | | 1 | 1215k| 1000k|00:00:00.52 | 8579 | 43m| 6111k| 42m (0)| | 4 | table access full | t2 | 1 | 1215k| 1000k|00:00:00.01 | 2083 | | | | |* 5 | table access full | t1 | 1 | 2738k| 3000k|00:00:00.07 | 6496 | | | | |* 6 | hash join right semi| | 1 | 35 | 0 |00:00:00.02 | 2083 | 1245k| 1245k| 461k (0)| |* 7 | table access full | t2 | 1 | 23 | 0 |00:00:00.02 | 2083 | | | | | 8 | table access full | t1 | 0 | 2738k| 0 |00:00:00.01 | 0 | | | | -------------------------------------------------------------------------------------------------------------------- predicate information (identified by operation id): --------------------------------------------------- 2 - filter( is null) 3 - access("t1"."id"="t2"."id") 5 - filter("t1"."dep_id"='kk') 6 - access("t1"."id"="t2"."id") 7 - filter("t2"."dep_id"='mm') note ----- - dynamic statistics used: dynamic sampling (level=2) 35 rows selected. sql> sql>
可以看到第23,24行,在10g中运行时,buffers是0,而在12c中,即78,79行,buffer是2083+6496。
也就是说在10g中,外层查询不进行t1和t2的扫描,直接返回结果了,而在12c中,外层查询还要进行t1表和t2表层扫描才返回结果。
这其实不是10g和12c的差别,而是not exists的返回数据对外层的影响。子查询要返回0行记录,才满足not exist的条件,从而返回外层查询结果。
在10g中,子查询返回了一行记录
--10g sql> select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm'; 1 ---------- 1 sql>
不满足not exists(即0行才满足),所以,也就不用在外层继续查询了。直接返回记录0行。
在12c中,子查询返回0行记录,满足not exist的条件,所以还需要在外层查询中继续查询。
--12c sql> select count(*) from t1,t2 where t1.id=t2.id and t2.dep_id='kk'; count(*) ---------- 1000000 sql> set line 1000 sql> set pages 1000 sql> col plan_table_output for a250 sql> sql> sql> select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='kk'); count(*) ---------- 0 sql> select* from table(dbms_xplan.display_cursor(null,null,'allstats last')); plan_table_output --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- sql_id c5hj2p2jt1fxf, child number 0 ------------------------------------- select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='kk') plan hash value: 1692274438 -------------------------------------------------------------------------------------------------------------------- | id | operation | name | starts | e-rows | a-rows | a-time | buffers | omem | 1mem | used-mem | -------------------------------------------------------------------------------------------------------------------- | 0 | select statement | | 1 | | 1 |00:00:00.28 | 2087 | | | | | 1 | sort aggregate | | 1 | 1 | 1 |00:00:00.28 | 2087 | | | | |* 2 | filter | | 1 | | 0 |00:00:00.28 | 2087 | | | | |* 3 | hash join | | 0 | 1215k| 0 |00:00:00.01 | 0 | 69m| 7428k| | | 4 | table access full | t2 | 0 | 1215k| 0 |00:00:00.01 | 0 | | | | |* 5 | table access full | t1 | 0 | 2738k| 0 |00:00:00.01 | 0 | | | | |* 6 | hash join right semi| | 1 | 2738k| 1 |00:00:00.28 | 2087 | 43m| 6111k| 42m (0)| |* 7 | table access full | t2 | 1 | 1215k| 1000k|00:00:00.12 | 2083 | | | | | 8 | table access full | t1 | 1 | 2738k| 1 |00:00:00.01 | 4 | | | | -------------------------------------------------------------------------------------------------------------------- predicate information (identified by operation id): --------------------------------------------------- 2 - filter( is null) 3 - access("t1"."id"="t2"."id") 5 - filter("t1"."dep_id"='kk') 6 - access("t1"."id"="t2"."id") 7 - filter("t2"."dep_id"='kk') note ----- - dynamic statistics used: dynamic sampling (level=2) 35 rows selected. sql>
可以看到第38,39行的buffer为0.
总结
以上就是这篇文章的全部内容了,希望本文的内容对大家的学习或者工作能带来一定的帮助,如果有疑问大家可以留言交流。