欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  数据库

包含union all的view子查询无法展开表连接的模拟

程序员文章站 2022-05-29 19:14:18
...

之前小鱼写过一些关于子查询无法展开的文章,其中主要涉及到以下几点: 1关联的列没有not null约束,在子查询中使用not in去查找(oracle 10g无法展开为子查询,oracle 11g由于有新的hash join算法已经可以),如下 SQL> select a.* from ta01 a where a.obj

之前小鱼写过一些关于子查询无法展开的文章,其中主要涉及到以下几点:
1关联的列没有not null约束,在子查询中使用not in去查找(oracle 10g无法展开为子查询,oracle 11g由于有新的hash join算法已经可以),如下

SQL> select a.* from ta01 a where a.object_id not in (select b.object_id from ta02 b)
Ta01和ta02表的object_id列上没有not null的约束

2 子查询中包含主表的关联列的谓词过滤(oracle 10g也无法展开为子查询,oracle 11g可以展开为子查询),如下

SQL> select a.* from ta01 a where a.object_id in (select b.object_id from ta02 b where a.object_type=’TABLE’);

3 子查询中包含有下列形式的or的查询(oracle 10g同样无法展开为子查询,oracle 11g可以展开)

SQL> select a.* from ta01 a where a.object_id in (select b.object_id from ta02 b
where object_type='TABLE') or a.object_id in (select c.object_id from ta02 c wh
ere c.object_type='INDEX');

4 子查询包含有主表和子表列的like关联,比如下列sql语句(oracle 10g和11g都无法展开为子查询)

SQL> select a.* from ta01 a where a.object_id in (select b.object_id from ta02 b
where a.data_object_id like b.data_object_id);

http://www.eygle.com/archives/2013/04/2013_oracle_dtcc_dbsnake.html

eygle这篇文章里面又提到了一个比较常见的子查询无法展开的case,借鉴作者的案例进行一点简单的模拟来进行验证。

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for 64-bit Windows: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> create table ta01 as select * from dba_objects;

Table created.

SQL> create table ta02 as select * from dba_objects;

Table created.

SQL> create table tb01 as select * from dba_objects;

Table created.

SQL> create table tb02 as select * from dba_objects;

Table created.

SQL> create view ta_view as
2 select object_id,data_object_id,object_type from ta01 union all
3 select object_id,data_object_id,object_type from ta02;

View created.

SQL> create view tb_view as
2 select object_id,data_object_id,object_type from tb01 union all
3 select object_id,data_object_id,object_type from tb02;

View created.

SQL> select a.* from ta_view a where a.object_id in (select b.object_id from tb_
view b where b.object_type='TABLE');

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
------------------------------------------------------------

Plan hash value: 3623909176

--------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

--------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 3738K| 131M| 617 (1)| 00:00:08 |

|* 1 | FILTER | | | | | |

| 2 | VIEW | TA_VIEW | 88401 | 3194K| 308 (1)| 00:00:04 |

| 3 | UNION-ALL | | | | | |

| 4 | TABLE ACCESS FULL| TA01 | 42509 | 1535K| 154 (1)| 00:00:02 |

| 5 | TABLE ACCESS FULL| TA02 | 45892 | 1658K| 154 (1)| 00:00:02 |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
------------------------------------------------------------
| 6 | VIEW | TB_VIEW | 42 | 1008 | 309 (1)| 00:00:04 |

| 7 | UNION-ALL | | | | | |

|* 8 | TABLE ACCESS FULL| TB01 | 17 | 408 | 154 (1)| 00:00:02 |

|* 9 | TABLE ACCESS FULL| TB02 | 25 | 600 | 154 (1)| 00:00:02 |

--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter( EXISTS (SELECT /*+ */ 0 FROM ( (SELECT /*+ */
"OBJECT_ID" "OBJECT_ID","DATA_OBJECT_ID" "DATA_OBJECT_ID","OBJECT_
TYPE"

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
------------------------------------------------------------
"OBJECT_TYPE" FROM SYS."TB01" "TB01" WHERE "OBJECT_ID"=:B1 AND
"OBJECT_TYPE"='TABLE') UNION ALL (SELECT /*+ */ "OBJECT_ID"
"OBJECT_ID","DATA_OBJECT_ID" "DATA_OBJECT_ID","OBJECT_TYPE"
"OBJECT_TYPE" FROM SYS."TB02" "TB02" WHERE "OBJECT_ID"=:B2 AND
"OBJECT_TYPE"='TABLE')) "B"))
8 - filter("OBJECT_ID"=:B1 AND "OBJECT_TYPE"='TABLE')
9 - filter("OBJECT_ID"=:B1 AND "OBJECT_TYPE"='TABLE')

Note
-----
- dynamic sampling used for this statement

33 rows selected.

Statistics
----------------------------------------------------------
146 recursive calls
0 db block gets
136371691 consistent gets
0 physical reads
0 redo size
83259 bytes sent via SQL*Net to client
2868 bytes received via SQL*Net from client
218 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3247 rows processed

这里由于主查询和子查询的表都是两个表相关联的视图,这个导致in之后的括号内的子查询并不是一个常量的集合,所以oracle这里没有对该子查询展开为表连接。

而我们可以改写为下列的sql语句,但是这里好像结果集不一样,这个是因为我们选取的表不是1对1引起的,这个不在我们考虑的范围内。
SQL> SELECT a.*
2 FROM ta_view a,
3 (SELECT object_id
4 FROM tb_view
5 WHERE object_type = 'TABLE') b
6 WHERE a.object_id = b.object_id;

6494 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 48097912

--------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

--------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 3328K| 158M| 640 (5)| 00:00:08 |

|* 1 | HASH JOIN | | 3328K| 158M| 640 (5)| 00:00:08 |

| 2 | VIEW | TB_VIEW | 3306 | 42978 | 309 (1)| 00:00:04 |

| 3 | UNION-ALL | | | | | |

|* 4 | TABLE ACCESS FULL| TB01 | 1756 | 24584 | 154 (1)| 00:00:02 |

|* 5 | TABLE ACCESS FULL| TB02 | 1550 | 21700 | 154 (1)| 00:00:02 |

| 6 | VIEW | TA_VIEW | 100K| 3637K| 308 (1)| 00:00:04 |

| 7 | UNION-ALL | | | | | |

| 8 | TABLE ACCESS FULL| TA01 | 50333 | 786K| 154 (1)| 00:00:02 |

| 9 | TABLE ACCESS FULL| TA02 | 50334 | 786K| 154 (1)| 00:00:02 |

--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("A"."OBJECT_ID"="OBJECT_ID")
4 - filter("OBJECT_TYPE"='TABLE')
5 - filter("OBJECT_TYPE"='TABLE')

Statistics
----------------------------------------------------------
1083 recursive calls
0 db block gets
3412 consistent gets
2782 physical reads
196 redo size
139022 bytes sent via SQL*Net to client
5244 bytes received via SQL*Net from client
434 SQL*Net roundtrips to/from client
31 sorts (memory)
0 sorts (disk)
6494 rows processed

当然如果业务也是1对多,那么这个改写是不符合业务逻辑的,那么是否就不能进行改写了,其实也不是这样的,我们只需要在构造的这个表中加上一个distinct去掉b表的重复的值,这样不就是类似为半连接了吗。
SQL> SELECT a.*
2 FROM ta_view a,
3 (SELECT distinct object_id
4 FROM tb_view
5 WHERE object_type = 'TABLE') b
6 WHERE a.object_id = b.object_id;

3247 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1607119383

--------------------------------------------------------------------------------
--
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
|
--------------------------------------------------------------------------------
--
| 0 | SELECT STATEMENT | | 3328K| 158M| 642 (5)| 00:00:08
|
|* 1 | HASH JOIN | | 3328K| 158M| 642 (5)| 00:00:08
|
| 2 | VIEW | | 3306 | 42978 | 310 (2)| 00:00:04
|
| 3 | HASH UNIQUE | | 3306 | 42978 | 310 (2)| 00:00:04
|
| 4 | VIEW | TB_VIEW | 3306 | 42978 | 309 (1)| 00:00:04
|
| 5 | UNION-ALL | | | | |
|
|* 6 | TABLE ACCESS FULL| TB01 | 1756 | 24584 | 154 (1)| 00:00:02
|
|* 7 | TABLE ACCESS FULL| TB02 | 1550 | 21700 | 154 (1)| 00:00:02
|
| 8 | VIEW | TA_VIEW | 100K| 3637K| 308 (1)| 00:00:04
|
| 9 | UNION-ALL | | | | |
|
| 10 | TABLE ACCESS FULL | TA01 | 50333 | 786K| 154 (1)| 00:00:02
|
| 11 | TABLE ACCESS FULL | TA02 | 50334 | 786K| 154 (1)| 00:00:02
|
--------------------------------------------------------------------------------
--

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("A"."OBJECT_ID"="B"."OBJECT_ID")
6 - filter("OBJECT_TYPE"='TABLE')
7 - filter("OBJECT_TYPE"='TABLE')

Statistics
----------------------------------------------------------
15 recursive calls
0 db block gets
3007 consistent gets
0 physical reads
0 redo size
83259 bytes sent via SQL*Net to client
2868 bytes received via SQL*Net from client
218 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3247 rows processed

关于我们常见的select a.* from a,b where a.id=b.id和select a.* from a where a.id in (select b.id from b)这两种查询,在业务层面有不同的是如果b表没有重复的id,那么这个查询业务逻辑是相同的,而如果b表有重复的,子查询是hash join的半连接方式,会对b.id进行去重,所以如果b.id存在多个重复值,可能会引起查询结果不一致。

并不是说包含union all的veiw会引起子查询无法展开为表连接,如上单个union all的view是不会影响子查询展开为表连接的。
SQL> select a.* from ta01 a where a.object_id in (select b.object_id from tb_vie
w b where b.object_type='TABLE');

1623 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1785931674

--------------------------------------------------------------------------------
--
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
|
--------------------------------------------------------------------------------
--
| 0 | SELECT STATEMENT | | 1 | 190 | 464 (2)| 00:00:06
|
|* 1 | HASH JOIN RIGHT SEMI | | 1 | 190 | 464 (2)| 00:00:06
|
| 2 | VIEW | VW_NSO_1 | 4229 | 54977 | 309 (1)| 00:00:04
|
| 3 | VIEW | TB_VIEW | 4229 | 99K| 309 (1)| 00:00:04
|
| 4 | UNION-ALL | | | | |
|
|* 5 | TABLE ACCESS FULL| TB01 | 1692 | 40608 | 154 (1)| 00:00:02
|
|* 6 | TABLE ACCESS FULL| TB02 | 2537 | 60888 | 154 (1)| 00:00:02
|
| 7 | TABLE ACCESS FULL | TA01 | 42509 | 7347K| 154 (1)| 00:00:02
|
--------------------------------------------------------------------------------
--

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("A"."OBJECT_ID"="$nso_col_1")
5 - filter("OBJECT_TYPE"='TABLE')
6 - filter("OBJECT_TYPE"='TABLE')

Note
-----
- dynamic sampling used for this statement

Statistics
----------------------------------------------------------
53 recursive calls
0 db block gets
3166 consistent gets
0 physical reads
0 redo size
85130 bytes sent via SQL*Net to client
1680 bytes received via SQL*Net from client
110 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
1623 rows processed

而如果我们换到oracle 11.2.0.1的版本,再来看看包含两个union all的view的子查询是否能够展开
SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 – Production

SQL> select a.* from ta_view a where a.object_id in (select b.object_id from tb_view b where b.object_type='TABLE');

7629 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1656093151

-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6779K| 323M| 1206 (4)| 00:00:15 |
|* 1 | HASH JOIN | | 6779K| 323M| 1206 (4)| 00:00:15 |
| 2 | VIEW | VW_NSO_1 | 4191 | 54483 | 586 (1)| 00:00:08 |
| 3 | HASH UNIQUE | | 4191 | 54483 | 586 (1)| 00:00:08 |
| 4 | VIEW | TB_VIEW | 4191 | 54483 | 585 (1)| 00:00:08 |
| 5 | UNION-ALL | | | | | |
|* 6 | TABLE ACCESS FULL| TB01 | 2036 | 48864 | 293 (1)| 00:00:04 |
|* 7 | TABLE ACCESS FULL| TB02 | 2155 | 51720 | 292 (1)| 00:00:04 |
| 8 | VIEW | TA_VIEW | 161K| 5845K| 585 (1)| 00:00:08 |
| 9 | UNION-ALL | | | | | |
| 10 | TABLE ACCESS FULL | TA01 | 82194 | 2969K| 292 (1)| 00:00:04 |
| 11 | TABLE ACCESS FULL | TA02 | 79581 | 2875K| 292 (1)| 00:00:04 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("A"."OBJECT_ID"="OBJECT_ID")
6 - filter("OBJECT_TYPE"='TABLE')
7 - filter("OBJECT_TYPE"='TABLE')

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
63 recursive calls
0 db block gets
5917 consistent gets
3072 physical reads
0 redo size
200965 bytes sent via SQL*Net to client
6108 bytes received via SQL*Net from client
510 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
7629 rows processed

这里oracle选择了将子查询展开为了表连接,优化器确实在做着很多的改变,这些改变需要我们通过文档 测试用例和实际的案例来验证,可以这么说脱离了oracle的版本来谈性能优化 sql调优其实往往是不严谨的。

最近注册了朋友介绍的一个论坛 http://www.bi168.cn/forum.php ,之前也注册了很多论坛比如itpub、otn、ask maclean等等,都没有怎么过多的去关注论坛的东西,记得maclean liu说过每天能遇见的案例是有限的,但是网络的同行遇见的case确是很多的,如果能够将别人遇见的模拟 理解 解决那么这个就是自己的。