MySQL的子查询及相关优化学习教程
一、子查询
1、where型子查询
(把内层查询结果当作外层查询的比较条件)
#不用order by 来查询最新的商品 select goods_id,goods_name from goods where goods_id = (select max(goods_id) from goods);
#取出每个栏目下最新的产品(goods_id唯一) select cat_id,goods_id,goods_name from goods where goods_id in(select max(goods_id) from goods group by cat_id);
2、from型子查询
(把内层的查询结果供外层再次查询)
#用子查询查出挂科两门及以上的同学的平均成绩
思路:
#先查出哪些同学挂科两门以上 select name,count(*) as gk from stu where score < 60 having gk >=2; #以上查询结果,我们只要名字就可以了,所以再取一次名字 select name from (select name,count(*) as gk from stu having gk >=2) as t; #找出这些同学了,那么再计算他们的平均分 select name,avg(score) from stu where name in (select name from (select name,count(*) as gk from stu having gk >=2) as t) group by name;
3、exists型子查询
(把外层查询结果拿到内层,看内层的查询是否成立)
#查询哪些栏目下有商品,栏目表category,商品表goods select cat_id,cat_name from category where exists(select * from goods where goods.cat_id = category.cat_id);
二、优化
从句式的形式看,子查询分为特殊格式子查询和非特殊格式子查询,特殊格式的子查询中又包括in、all、any、some、exists等类型的子查询,对于有的类型的子查询,mysql有的支持优化,有的不支持,具体情况如下。
示例一,mysql不支持对exists类型的子查询的优化:
exists类型的相关子查询,查询执行计划如下:
mysql> explain extended select * from t1 where exists (select 1 from t2 where t1.a1= t2.a2 and t2.a2>10);
+----+--------------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+--------------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | dependent subquery | t2 | all | null | using where | +----+--------------------+-------+------+------+-------------+ 2 rows in set, 2 warnings (0.00 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` where exists(/* select#2 */ select 1 from `test`.`t2` where ((`test`.`t1`.`a1` = `test`.`t2`.`a2`) and (`test`.`t2`.`a2` > 10)) )
从查询执行计划看,子查询存在,mysql没有进一步做子查询的优化工作。
另外的一个exists类型的相关子查询,查询执行计划如下:
mysql> explain extended select * from t1 where exists (select 1 from t2 where t1.b1= t2.b2 and t1.a1=10);
+----+--------------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+--------------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | dependent subquery | t2 | all | null | using where | +----+--------------------+-------+------+------+-------------+ 2 rows in set, 3 warnings (0.02 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` where exists(/* select#2 */ select 1 from `test`.`t2` where ((`test`.`t1`.`b1` = `test`.`t2`.`b2`) and (`test`.`t1`.`a1` = 10)) )
从查询执行计划看,子查询存在,mysql没有进一步做子查询的优化工作。
示例二,mysql不支持对not exists类型的子查询的优化:
not exists类型的相关子查询,查询执行计划如下:
mysql> explain extended select * from t1 where not exists (select 1 from t2 where t1.a1= t2.a2 and t2.a2>10);
+----+--------------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+--------------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | dependent subquery | t2 | all | null | using where | +----+--------------------+-------+------+------+-------------+ 2 rows in set, 2 warnings (0.00 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` where (not(exists( /* select#2 */ select 1 from `test`.`t2` where ((`test`.`t1`.`a1` = `test`.`t2`.`a2`) and (`test`.`t2`.`a2` > 10)))) )
从查询执行计划看,子查询存在,mysql没有进一步做子查询的优化工作。
另外的一个not exists类型的相关子查询,查询执行计划如下:
mysql> explain extended select * from t1 where not exists (select 1 from t2 where t1.b1= t2.b2 and t1.a1=10);
+----+--------------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+--------------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | dependent subquery | t2 | all | null | using where | +----+--------------------+-------+------+------+-------------+ 2 rows in set, 3 warnings (0.00 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` where (not(exists( /* select#2 */ select 1 from `test`.`t2` where ((`test`.`t1`.`b1` = `test`.`t2`.`b2`) and (`test`.`t1`.`a1` = 10)))) )
从查询执行计划看,子查询存在,mysql没有进一步做子查询的优化工作。
示例三,mysql支持对in类型的子查询的优化,按也有不支持的情况存在:
in非相关子查询,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 in (select a2 from t2 where t2.a2>10);
+----+--------------+-------------+------+------+----------------------------------------------------+ | id | select_type | table | type | key | extra | +----+--------------+-------------+------+------+----------------------------------------------------+ | 1 | simple | <subquery2> | all | null | null | | 1 | simple | t1 | all | null | using where; using join buffer (block nested loop) | | 2 | materialized | t2 | all | null | using where | +----+--------------+-------------+------+------+----------------------------------------------------+ 3 rows in set, 1 warning (0.00 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` semi join (`test`.`t2`) where ((`test`.`t1`.`a1` = `<subquery2>`.`a2`) and (`test`.`t2`.`a2` > 10))
从查询执行计划看,表t2被物化后,与表t1执行了半连接(semi join)。尽管有“subquery2”这样的内容看起来是子查询,但是表t2已经被上拉到表t1层执行了半连接,所以mysql支持in子查询优化为半连接操作。
另外一个in非相关子查询,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 in (select a2 from t2 where t2.a2=10);
+----+--------------+-------------+------+------+----------------------------------------------------+ | id | select_type | table | type | key | extra | +----+--------------+-------------+------+------+----------------------------------------------------+ | 1 | simple | <subquery2> | all | null | using where | | 1 | simple | t1 | all | null | using where; using join buffer (block nested loop) | | 2 | materialized | t2 | all | null | using where | +----+--------------+-------------+------+------+----------------------------------------------------+ 3 rows in set, 1 warning (0.02 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` semi join (`test`.`t2`) where ((`<subquery2>`.`a2` = 10) and (`test`.`t1`.`a1` = 10) and (`test`.`t2`.`a2` = 10))
从查询执行计划看,子查询不存在,表t1和t2直接做了块嵌套循环半连接(block nested loop),把子查询上拉到父查询中用嵌套循环半连接完成in操作。另外,由于子查询上拉,使得增加连接条件“a1=a2”,而原先的条件“a2=10”可以利用常量传递优化技术,使得“a1=a2=10”,所以查询执行计划中,两个索引扫描的条件分别为:a1 = 10、a2 = 10。
另外一个in非相关子查询,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 in (select a2 from t2 where t1.a1=10);
+----+-------------+-------+------+------------------------------------------------------------------+ | id | select_type | table | type | extra | +----+-------------+-------+------+------------------------------------------------------------------+ | 1 | simple | t2 | all | using where; start temporary | | 1 | simple | t1 | all | using where; end temporary; using join buffer (block nested loop)| +----+-------------+-------+------+------------------------------------------------------------------+ 2 rows in set, 2 warnings (0.00 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` semi join (`test`.`t2`) where ((`test`.`t2`.`a2` = 10) and (`test`.`t1`.`a1` = 10))
从查询执行计划看,子子查询不存在,表t1和t2直接做了块嵌套循环连接(block nested loop),但属于半连接操作(semi join),把子查询上拉到父查询中用嵌套循环半连接完成in操作。
示例四,mysql支持对not in类型的子查询的优化
not in非相关子查询,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 not in (select a2 from t2 where t2.a2>10);
+----+-------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+-------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | subquery | t2 | all | null | using where | +----+-------------+-------+------+------+-------------+ 2 rows in set, 1 warning (0.02 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` where (not(<in_optimizer>( `test`.`t1`.`a1`,`test`.`t1`.`a1` in ( <materialize> (/* select#2 */ select `test`.`t2`.`a2` from `test`.`t2` where (`test`.`t2`.`a2` > 10) having 1 ), <primary_index_lookup>( `test`.`t1`.`a1` in <temporary table> on <auto_key> where ((`test`.`t1`.`a1` = `materialized-subquery`.`a2`)) ) ) )) )
从查询执行计划看,表t2做了子查询(subquery)。而子查询被物化(materialize)。所以,mysql对于not in子查询采用了物化的优化方式,但不支持子查询的消除。
另外一个not in非相关子查询,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 not in (select a2 from t2 where t2.a2=10);
+----+-------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+-------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | subquery | t2 | all | null | using where | +----+-------------+-------+------+------+-------------+ 2 rows in set, 1 warning (0.00 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,`test`.`t1`.`b1` as `b1` from `test`.`t1` where (not(<in_optimizer>( `test`.`t1`.`a1`,`test`.`t1`.`a1` in ( <materialize> (/* select#2 */ select `test`.`t2`.`a2` from `test`.`t2` where (`test`.`t2`.`a2` = 10) having 1 ), <primary_index_lookup>( `test`.`t1`.`a1` in <temporary table> on <auto_key> where ((`test`.`t1`.`a1` = `materialized-subquery`.`a2`)) ) ) )) )
从查询执行计划看,表t2做了子查询(subquery)。而子查询被物化(materialize)。所以,mysql对于not in子查询采用了物化的优化方式,但不支持子查询的消除。
示例五,mysql支持对all类型的子查询的优化:
不相关的all子查询,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 >all (select a2 from t2 where t2.a2>10);
+----+-------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+-------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | subquery | t2 | all | null | using where | +----+-------------+-------+------+------+-------------+ 2 rows in set, 1 warning (0.00 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,`test`.`t1`.`b1` as `b1` from `test`.`t1` where <not>((`test`.`t1`.`a1` <= <max>( /* select#2 */ select `test`.`t2`.`a2` from `test`.`t2` where (`test`.`t2`.`a2` > 10) ) ))
从查询执行计划看,出现了子查询(subquery),但是,子查询被“<= <max>”操作符限制,而子查询中的被查询列a2上存在唯一索引,所以可以利用索引求最值,所以mysql支持“>all”式的子查询优化,子查询只被执行一次即可求得最大值。
不相关的all子查询,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 =all (select a2 from t2 where t2.a2=10);
+----+--------------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+--------------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | dependent subquery | t2 | all | null | using where | +----+--------------------+-------+------+------+-------------+ 2 rows in set, 1 warning (0.00 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,`test`.`t1`.`b1` as `b1` from `test`.`t1` where <not>(<in_optimizer>( `test`.`t1`.`a1`,<exists>( /* select#2 */ select 1 from `test`.`t2` where ((`test`.`t2`.`a2` = 10) and <if>(outer_field_is_not_null, ((<cache>(`test`.`t1`.`a1`) <> 10) or <cache>(isnull(10))), true ) ) having <if>(outer_field_is_not_null, <is_not_null_test>(`test`.`t2`.`a2`), true) ) ))
从查询执行计划看,出现了子查询(subquery),但是被查询优化器处理后的语句中包含“exists”,这表明mysql对于“=all”式的子查询优化用“exists strategy”方式优化,所以mysql支持“=all”式的子查询优化。
不相关的all子查询,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 <all (select a2 from t2 where t2.a2=10);
+----+-------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+-------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | subquery | t2 | all | null | using where | +----+-------------+-------+------+------+-------------+ 2 rows in set, 1 warning (0.00 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,`test`.`t1`.`b1` as `b1` from `test`.`t1` where <not>((`test`.`t1`.`a1` >= <min> (/* select#2 */ select `test`.`t2`.`a2` from `test`.`t2` where (`test`.`t2`.`a2` = 10) ) ))
从查询执行计划看,出现了子查询(subquery),但是,子查询被“>= <min>”操作符限制,而子查询中的被查询列a2上存在唯一索引,所以可以利用索引求最值,所以mysql支持“<all”式的子查询优化,子查询只被执行一次即可求得最小值。
示例六,mysql支持对some类型的子查询的优化:
使用了“>some”式子的子查询被优化,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 >some (select a2 from t2 where t2.a2>10);
+----+-------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+-------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | subquery | t2 | all | null | using where | +----+-------------+-------+------+------+-------------+ 2 rows in set, 1 warning (0.05 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` where <nop>((`test`.`t1`.`a1` > ( /* select#2 */ select min(`test`.`t2`.`a2`) from `test`.`t2` where (`test`.`t2`.`a2` > 10) )))
从查询执行计划看,出现了子查询(subquery),但是,子查询被“min”函数限制,而子查询中的被查询列a2上存在唯一索引,所以可以利用索引求最值,所以mysql支持“>some”式的子查询优化,子查询只被执行一次即可求得最大值。
使用了“=some”式子的子查询被优化,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 =some (select a2 from t2 where t2.a2=10);
+----+--------------+-------------+------+------+----------------------------------------------------+ | id | select_type | table | type | key | extra | +----+--------------+-------------+------+------+----------------------------------------------------+ | 1 | simple | <subquery2> | all | null | using where | | 1 | simple | t1 | all | null | using where; using join buffer (block nested loop) | | 2 | materialized | t2 | all | null | using where | +----+--------------+-------------+------+------+----------------------------------------------------+ 3 rows in set, 1 warning (0.01 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` semi join (`test`.`t2`) where ((`<subquery2>`.`a2` = 10) and (`test`.`t1`.`a1` = 10) and (`test`.`t2`.`a2` = 10))
从查询执行计划看,没有出现了子查询,表t2被物化,与表t1进行了半连接。
使用了“<some”式子的子查询被优化,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 <some (select a2 from t2 where t2.a2=10);
+----+-------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+-------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | subquery | t2 | all | null | using where | +----+-------------+-------+------+------+-------------+ 2 rows in set, 1 warning (0.00 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` where <nop>( ( `test`.`t1`.`a1` < (/* select#2 */ select max(`test`.`t2`.`a2`) from `test`.`t2` where (`test`.`t2`.`a2` = 10) ) ) )
从查询执行计划看,出现了子查询(subquery),但是,子查询被“max”函数限制,而子查询中的被查询列a2上存在唯一索引,所以可以利用索引求最值,所以mysql支持“<some”式的子查询优化,子查询只被执行一次即可求得最大值。
示例七,mysql支持对any类型的子查询的优化:
使用了“>any”式子的子查询被优化,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 >any (select a2 from t2 where t2.a2>10);
+----+-------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+-------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | subquery | t2 | all | null | using where | +----+-------------+-------+------+------+-------------+ 2 rows in set, 1 warning (0.00 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` where <nop>( ( `test`.`t1`.`a1` > (/* select#2 */ select min(`test`.`t2`.`a2`) from `test`.`t2` where (`test`.`t2`.`a2` > 10) ) ) )
从查询执行计划看,出现了子查询(subquery),但是,子查询被“min”函数限制,而子查询中的被查询列a2上存在唯一索引,所以可以利用索引求最值,所以mysql支持“>any”式的子查询优化,子查询只被执行一次即可求得最小值。
使用了“=any”式子的子查询被优化,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 =any (select a2 from t2 where t2.a2>10);
+----+--------------+-------------+------+------+----------------------------------------------------+ | id | select_type | table | type | key | extra | +----+--------------+-------------+------+------+----------------------------------------------------+ | 1 | simple | <subquery2> | all | null | null | | 1 | simple | t1 | all | null | using where; using join buffer (block nested loop) | | 2 | materialized | t2 | all | null | using where | +----+--------------+-------------+------+------+----------------------------------------------------+ 3 rows in set, 1 warning (0.02 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` semi join (`test`.`t2`) where ((`test`.`t1`.`a1` = `<subquery2>`.`a2`) and (`test`.`t2`.`a2` > 10))
从查询执行计划看,没有出现了子查询,表t2被物化,与表t1进行了半连接。
使用了“<any”式子的子查询被优化,查询执行计划如下:
mysql> explain extended select * from t1 where t1.a1 <any (select a2 from t2 where t2.a2>10);
+----+-------------+-------+------+------+-------------+ | id | select_type | table | type | key | extra | +----+-------------+-------+------+------+-------------+ | 1 | primary | t1 | all | null | using where | | 2 | subquery | t2 | all | null | using where | +----+-------------+-------+------+------+-------------+ 2 rows in set, 1 warning (0.00 sec)
被查询优化器处理后的语句为:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`, `test`.`t1`.`b1` as `b1` from `test`.`t1` where <nop>( ( `test`.`t1`.`a1` < (/* select#2 */ select max(`test`.`t2`.`a2`) from `test`.`t2` where (`test`.`t2`.`a2` > 10) ) ) )
从查询执行计划看,出现了子查询(subquery),但是,子查询被“max”函数限制,而子查询中的被查询列a2上存在唯一索引,所以可以利用索引求最值,所以mysql支持“<any”式的子查询优化,子查询只被执行一次即可求得最大值。
下一篇: MySQL的常用命令集锦