
MySQL Subqueries and Related Optimizations: A Tutorial

程序员文章站 2024-02-26 20:44:40

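I. Subqueries
1. WHERE-type subqueries
(The result of the inner query is used as a comparison condition in the outer query.)

# Find the newest product without using order by
select goods_id,goods_name from goods where goods_id = (select max(goods_id) from goods);
# Fetch the newest product in each category (goods_id is unique)
select cat_id,goods_id,goods_name from goods where goods_id in(select max(goods_id) from goods group by cat_id);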
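
As a quick check of the two queries above, here is a minimal sketch that runs them with Python's built-in sqlite3 module. SQLite stands in for MySQL here, and the goods rows are made-up sample data, so treat it as an illustration rather than a MySQL session.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table goods (goods_id integer primary key, cat_id integer, goods_name text)"
)
# Hypothetical sample rows: three categories, goods_id grows with insertion order.
conn.executemany(
    "insert into goods values (?, ?, ?)",
    [
        (1, 1, "phone A"),
        (2, 1, "phone B"),
        (3, 2, "laptop A"),
        (4, 3, "camera A"),
        (5, 2, "laptop B"),
    ],
)

# Newest product overall: the scalar subquery returns the single largest goods_id.
newest = conn.execute(
    "select goods_id, goods_name from goods "
    "where goods_id = (select max(goods_id) from goods)"
).fetchall()

# Newest product per category: the subquery returns one max(goods_id) per cat_id.
newest_per_cat = conn.execute(
    "select cat_id, goods_id, goods_name from goods "
    "where goods_id in (select max(goods_id) from goods group by cat_id) "
    "order by cat_id"
).fetchall()

print(newest)
print(newest_per_cat)
```

The second query is only correct because goods_id is unique, so a max(goods_id) from one category can never match a row in another category.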

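2. FROM-type subqueries
(The result of the inner query is treated as a table that the outer query queries again.)
# Use a subquery to find the average score of each student who failed two or more courses
Approach:

# First find the students who failed two or more courses
select name,count(*) as gk from stu where score < 60 group by name having gk >=2;
# From that result we only need the names, so select the name column from it
select name from (select name,count(*) as gk from stu where score < 60 group by name having gk >=2) as t;
# With those students identified, compute their average scores
select name,avg(score) from stu where name in (select name from (select name,count(*) as gk from stu where score < 60 group by name having gk >=2) as t) group by name;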
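
The three steps can be sketched as a runnable example with Python's sqlite3 module (SQLite as a stand-in for MySQL). The subject column and all rows are hypothetical. Note that the inner query filters on score < 60 and groups by name, so failed courses are counted per student; this sketch writes having count(*) >= 2 instead of using the gk alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per student and course.
conn.execute("create table stu (name text, subject text, score integer)")
conn.executemany(
    "insert into stu values (?, ?, ?)",
    [
        ("zhang", "math", 50), ("zhang", "chinese", 40), ("zhang", "english", 90),
        ("li", "math", 55), ("li", "chinese", 45), ("li", "english", 50),
        ("wang", "math", 90), ("wang", "chinese", 58),
        ("zhao", "math", 80), ("zhao", "chinese", 70),
    ],
)

# Step 1: count failed courses per student and keep those with two or more.
failed = conn.execute(
    "select name, count(*) as gk from stu "
    "where score < 60 group by name having count(*) >= 2 order by name"
).fetchall()

# Steps 2 and 3: use that result as a derived table, then average ALL scores
# (not only the failed ones) of the students it names.
averages = conn.execute(
    "select name, avg(score) from stu where name in ("
    "  select name from ("
    "    select name, count(*) as gk from stu "
    "    where score < 60 group by name having count(*) >= 2"
    "  ) as t"
    ") group by name order by name"
).fetchall()

print(failed)
print(averages)
```

With this data, wang fails only one course and zhao fails none, so only li and zhang appear in the result.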

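3. EXISTS-type subqueries
(Each row of the outer query is passed to the inner query, and the row is kept if the inner query returns at least one row.)

# Find the categories that contain products; the category table is category and the product table is goods
select cat_id,cat_name from category where exists(select * from goods where goods.cat_id = category.cat_id);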
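
A minimal sketch of the exists query, again using Python's sqlite3 module as a stand-in for MySQL with made-up rows. The category "tablets" has no products and should be filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table category (cat_id integer primary key, cat_name text)")
conn.execute(
    "create table goods (goods_id integer primary key, cat_id integer, goods_name text)"
)
# Hypothetical data: category 3 has no goods.
conn.executemany(
    "insert into category values (?, ?)",
    [(1, "phones"), (2, "laptops"), (3, "tablets")],
)
conn.executemany(
    "insert into goods values (?, ?, ?)",
    [(1, 1, "phone A"), (2, 1, "phone B"), (3, 2, "laptop A")],
)

# For each category row, the correlated subquery checks whether at least one
# goods row points at it; the row is kept only if such a goods row exists.
non_empty = conn.execute(
    "select cat_id, cat_name from category "
    "where exists (select * from goods where goods.cat_id = category.cat_id) "
    "order by cat_id"
).fetchall()

print(non_empty)
```

Category 1 has two products but appears once: exists only asks whether a match is present, not how many there are.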

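II. Optimization
Syntactically, subqueries divide into special-form and ordinary subqueries. The special-form subqueries include the in, all, any, some, and exists types. MySQL can optimize some of these types and not others, as the following examples show.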

 

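Example 1: in the version tested here, MySQL does not optimize exists-type subqueries:

The query execution plan for a correlated exists-type subquery is as follows: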

mysql> explain extended select * from t1 where exists (select 1 from t2 where t1.a1= t2.a2 and t2.a2>10);

+----+--------------------+-------+------+------+-------------+
| id | select_type    | table | type | key | extra    |

+----+--------------------+-------+------+------+-------------+

| 1 | primary      | t1  | all | null | using where |

| 2 | dependent subquery | t2  | all | null | using where |

+----+--------------------+-------+------+------+-------------+

2 rows in set, 2 warnings (0.00 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

  `test`.`t1`.`b1` as `b1`

from `test`.`t1`

where exists(/* select#2 */

  select 1

  from `test`.`t2`

  where ((`test`.`t1`.`a1` = `test`.`t2`.`a2`) and (`test`.`t2`.`a2` > 10))

)

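The execution plan shows that the subquery is still present (select_type is dependent subquery); MySQL performed no further optimization on it.

Another correlated exists-type subquery, with its query execution plan: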

mysql> explain extended select * from t1 where exists (select 1 from t2 where t1.b1= t2.b2 and t1.a1=10);

+----+--------------------+-------+------+------+-------------+

| id | select_type    | table | type | key | extra    |

+----+--------------------+-------+------+------+-------------+

| 1 | primary      | t1  | all | null | using where |

| 2 | dependent subquery | t2  | all | null | using where |

+----+--------------------+-------+------+------+-------------+

2 rows in set, 3 warnings (0.02 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

  `test`.`t1`.`b1` as `b1`

from `test`.`t1`

where exists(/* select#2 */

  select 1

  from `test`.`t2`

  where ((`test`.`t1`.`b1` = `test`.`t2`.`b2`) and (`test`.`t1`.`a1` = 10))

)

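The execution plan shows that the subquery is still present (select_type is dependent subquery); MySQL performed no further optimization on it.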

 

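Example 2: in the version tested here, MySQL does not optimize not exists-type subqueries:

The query execution plan for a correlated not exists-type subquery is as follows: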

mysql> explain extended select * from t1 where not exists (select 1 from t2 where t1.a1= t2.a2 and t2.a2>10);

+----+--------------------+-------+------+------+-------------+

| id | select_type    | table | type | key | extra    |

+----+--------------------+-------+------+------+-------------+

| 1 | primary      | t1  | all | null | using where |

| 2 | dependent subquery | t2  | all | null | using where |

+----+--------------------+-------+------+------+-------------+

2 rows in set, 2 warnings (0.00 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

  `test`.`t1`.`b1` as `b1`

from `test`.`t1`

where (not(exists(

  /* select#2 */ select 1

  from `test`.`t2`

  where ((`test`.`t1`.`a1` = `test`.`t2`.`a2`) and (`test`.`t2`.`a2` > 10))))

)

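The execution plan shows that the subquery is still present (select_type is dependent subquery); MySQL performed no further optimization on it.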

 

Another correlated not exists-type subquery, with its query execution plan:

mysql> explain extended select * from t1 where not exists (select 1 from t2 where t1.b1= t2.b2 and t1.a1=10);

+----+--------------------+-------+------+------+-------------+

| id | select_type    | table | type | key | extra    |

+----+--------------------+-------+------+------+-------------+

| 1 | primary      | t1  | all | null | using where |

| 2 | dependent subquery | t2  | all | null | using where |

+----+--------------------+-------+------+------+-------------+

2 rows in set, 3 warnings (0.00 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

  `test`.`t1`.`b1` as `b1`

from `test`.`t1`

where (not(exists(

  /* select#2 */ select 1

  from `test`.`t2`

  where ((`test`.`t1`.`b1` = `test`.`t2`.`b2`) and (`test`.`t1`.`a1` = 10))))

)

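The execution plan shows that the subquery is still present (select_type is dependent subquery); MySQL performed no further optimization on it.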

 

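Example 3: MySQL supports optimizing in-type subqueries, although some cases are not supported:

A non-correlated in subquery, with its query execution plan: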

mysql> explain extended select * from t1 where t1.a1 in (select a2 from t2 where t2.a2>10);

+----+--------------+-------------+------+------+----------------------------------------------------+

| id | select_type | table    | type | key | extra  |

+----+--------------+-------------+------+------+----------------------------------------------------+

| 1 | simple    | <subquery2> | all | null | null  |

| 1 | simple    | t1     | all | null | using where; using join buffer (block nested loop) |

| 2 | materialized | t2     | all | null | using where  |

+----+--------------+-------------+------+------+----------------------------------------------------+

3 rows in set, 1 warning (0.00 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

  `test`.`t1`.`b1` as `b1`

from `test`.`t1` semi join (`test`.`t2`)

where ((`test`.`t1`.`a1` = `<subquery2>`.`a2`) and (`test`.`t2`.`a2` > 10))

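The execution plan shows that table t2 was materialized and then semi-joined with table t1. Although the "<subquery2>" entry looks like a subquery, t2 has been pulled up to the level of t1 and executed as a semi join, so MySQL supports optimizing an in subquery into a semi-join operation.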
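
To make the semi-join idea concrete, the sketch below compares the in query with an ordinary inner join and with a join against the de-duplicated subquery result. It uses Python's sqlite3 module as a stand-in for MySQL; the t2 columns id2 and b2 and all rows are assumptions, and the rewrite is written by hand here, whereas in MySQL the optimizer performs it internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables modeled on t1/t2 from the article; a2 = 11 occurs twice.
conn.executescript("""
create table t1 (id1 integer primary key, a1 integer, b1 integer);
create table t2 (id2 integer primary key, a2 integer, b2 integer);
insert into t1 values (1, 5, 1), (2, 11, 2), (3, 12, 3), (4, 20, 4);
insert into t2 values (1, 11, 1), (2, 11, 2), (3, 12, 3), (4, 3, 4);
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# The in subquery: each t1 row is returned at most once.
in_ids = ids(
    "select id1 from t1 where a1 in (select a2 from t2 where a2 > 10) order by id1"
)

# A plain inner join is NOT equivalent: duplicates in t2 duplicate t1 rows.
join_ids = ids(
    "select id1 from t1 join t2 on t1.a1 = t2.a2 where t2.a2 > 10 order by id1"
)

# Joining against the distinct subquery result mimics "materialize, then
# semi join": the derived table m plays the role of <subquery2>.
semi_ids = ids(
    "select id1 from t1 "
    "join (select distinct a2 from t2 where a2 > 10) as m on t1.a1 = m.a2 "
    "order by id1"
)

print(in_ids, join_ids, semi_ids)
```

The duplicated row in join_ids is the reason a semi join, rather than an ordinary join, is needed to replace an in subquery.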

 

Another non-correlated in subquery, with its query execution plan:

mysql> explain extended select * from t1 where t1.a1 in (select a2 from t2 where t2.a2=10);

+----+--------------+-------------+------+------+----------------------------------------------------+

| id | select_type | table    | type | key | extra  |

+----+--------------+-------------+------+------+----------------------------------------------------+

| 1 | simple    | <subquery2> | all | null | using where  |

| 1 | simple    | t1     | all | null | using where; using join buffer (block nested loop) |

| 2 | materialized | t2     | all | null | using where  |

+----+--------------+-------------+------+------+----------------------------------------------------+

3 rows in set, 1 warning (0.02 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

  `test`.`t1`.`b1` as `b1`

from `test`.`t1` semi join (`test`.`t2`)

where ((`<subquery2>`.`a2` = 10) and (`test`.`t1`.`a1` = 10) and (`test`.`t2`.`a2` = 10))

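The execution plan shows that the subquery was pulled up into the parent query: t2 was materialized as "<subquery2>" and semi-joined with t1 through the join buffer (block nested loop) to evaluate the in operation. Pulling the subquery up also adds the join condition "a1=a2"; combined with the original condition "a2=10", constant propagation yields "a1=a2=10", so the two table scans in the plan are filtered by a1 = 10 and a2 = 10 respectively.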

 

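An in subquery that is correlated (its where clause references t1.a1 from the outer query), with its query execution plan: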

mysql> explain extended select * from t1 where t1.a1 in (select a2 from t2 where t1.a1=10);

+----+-------------+-------+------+------------------------------------------------------------------+

| id | select_type | table | type | extra      |

+----+-------------+-------+------+------------------------------------------------------------------+

| 1 | simple   | t2  | all | using where; start temporary      |

| 1 | simple   | t1  | all | using where; end temporary; using join buffer (block nested loop)|

+----+-------------+-------+------+------------------------------------------------------------------+

2 rows in set, 2 warnings (0.00 sec)


After processing by the query optimizer, the statement becomes:
/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

  `test`.`t1`.`b1` as `b1`

from `test`.`t1` semi join (`test`.`t2`)

where ((`test`.`t2`.`a2` = 10) and (`test`.`t1`.`a1` = 10))

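The execution plan shows that the subquery is gone: tables t1 and t2 are joined directly with a block nested loop, but as a semi join (the start temporary and end temporary entries mark the step that removes duplicate matches). The subquery was pulled up into the parent query, and the in operation is evaluated as a nested-loop semi join.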

 

Example 4: MySQL supports optimizing not in-type subqueries:

A non-correlated not in subquery, with its query execution plan:

mysql> explain extended select * from t1 where t1.a1 not in (select a2 from t2 where t2.a2>10);

+----+-------------+-------+------+------+-------------+

| id | select_type | table | type | key | extra    |

+----+-------------+-------+------+------+-------------+

| 1 | primary   | t1  | all | null | using where |

| 2 | subquery  | t2  | all | null | using where |

+----+-------------+-------+------+------+-------------+

2 rows in set, 1 warning (0.02 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

`test`.`t1`.`b1` as `b1`

from `test`.`t1`

where (not(<in_optimizer>(

  `test`.`t1`.`a1`,`test`.`t1`.`a1` in (

    <materialize> (/* select#2 */

      select `test`.`t2`.`a2`

      from `test`.`t2`

      where (`test`.`t2`.`a2` > 10)

      having 1

    ),

    <primary_index_lookup>(

      `test`.`t1`.`a1` in <temporary table> on <auto_key>

      where ((`test`.`t1`.`a1` = `materialized-subquery`.`a2`))

    )

   )

  ))

)

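The execution plan shows that table t2 is executed as a subquery (subquery), and the rewritten statement shows that the subquery is materialized (materialize). So MySQL optimizes a not in subquery by materialization, but it does not eliminate the subquery.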
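
The materialization strategy can be sketched in a few lines: run the non-correlated subquery once, keep its result in a keyed structure, and probe that structure for each outer row. The Python set below plays the role of the temporary table with its automatic key; this is an illustration of the idea under the assumption that neither column contains NULL, not MySQL's actual implementation. SQLite (via sqlite3) and the sample rows are stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables modeled on t1/t2 from the article, with no NULL values.
conn.executescript("""
create table t1 (id1 integer primary key, a1 integer, b1 integer);
create table t2 (id2 integer primary key, a2 integer, b2 integer);
insert into t1 values (1, 5, 1), (2, 11, 2), (3, 12, 3), (4, 20, 4);
insert into t2 values (1, 11, 1), (2, 11, 2), (3, 12, 3), (4, 3, 4);
""")

# Step 1: execute the subquery exactly once and materialize its result.
materialized = {a2 for (a2,) in conn.execute("select a2 from t2 where a2 > 10")}

# Step 2: scan t1 and probe the materialized set for every row.
t1_rows = conn.execute("select id1, a1 from t1 order by id1").fetchall()
lookup_ids = [id1 for id1, a1 in t1_rows if a1 not in materialized]

# Reference result from the database engine itself.
sql_ids = [
    row[0]
    for row in conn.execute(
        "select id1 from t1 "
        "where a1 not in (select a2 from t2 where a2 > 10) order by id1"
    )
]

print(lookup_ids, sql_ids)
```

The subquery still exists as a separate step, which matches the article's observation that it is materialized rather than eliminated.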

 

Another non-correlated not in subquery, with its query execution plan:

mysql> explain extended select * from t1 where t1.a1 not in (select a2 from t2 where t2.a2=10);

+----+-------------+-------+------+------+-------------+

| id | select_type | table | type | key | extra    |

+----+-------------+-------+------+------+-------------+

| 1 | primary   | t1  | all | null | using where |

| 2 | subquery  | t2  | all | null | using where |

+----+-------------+-------+------+------+-------------+

2 rows in set, 1 warning (0.00 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,`test`.`t1`.`b1` as `b1`

from `test`.`t1`

where (not(<in_optimizer>(

  `test`.`t1`.`a1`,`test`.`t1`.`a1` in (

    <materialize> (/* select#2 */

      select `test`.`t2`.`a2`

      from `test`.`t2`

      where (`test`.`t2`.`a2` = 10)

      having 1

    ),

    <primary_index_lookup>(

      `test`.`t1`.`a1` in <temporary table> on <auto_key>

      where ((`test`.`t1`.`a1` = `materialized-subquery`.`a2`))

    )

  )

  ))

)

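The execution plan shows that table t2 is executed as a subquery (subquery), and the rewritten statement shows that the subquery is materialized (materialize). So MySQL optimizes a not in subquery by materialization, but it does not eliminate the subquery.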

 

Example 5: MySQL supports optimizing all-type subqueries:

A non-correlated all subquery, with its query execution plan:

mysql> explain extended select * from t1 where t1.a1 >all (select a2 from t2 where t2.a2>10);

+----+-------------+-------+------+------+-------------+

| id | select_type | table | type | key | extra    |

+----+-------------+-------+------+------+-------------+

| 1 | primary   | t1  | all | null | using where |

| 2 | subquery  | t2  | all | null | using where |

+----+-------------+-------+------+------+-------------+

2 rows in set, 1 warning (0.00 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,`test`.`t1`.`b1` as `b1`

from `test`.`t1`

where <not>((`test`.`t1`.`a1` <= <max>(

  /* select#2 */

  select `test`.`t2`.`a2`

  from `test`.`t2`

  where (`test`.`t2`.`a2` > 10)

  )

))

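The execution plan shows a subquery (subquery), but the rewritten statement constrains it with the "<= <max>" operator. Column a2, which the subquery selects, has a unique index, so the extreme value can be found through the index. MySQL therefore supports optimizing ">all" subqueries: the subquery is executed only once to obtain the maximum value.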
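
The ">all" to max rewrite can be checked with a small sketch. SQLite has no all operator, so the all semantics are evaluated by brute force in Python and compared with the max form of the query. The sample rows are hypothetical, and the equivalence is only claimed here for a subquery that returns at least one row and no NULL values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables modeled on t1/t2 from the article, with no NULL values.
conn.executescript("""
create table t1 (id1 integer primary key, a1 integer, b1 integer);
create table t2 (id2 integer primary key, a2 integer, b2 integer);
insert into t1 values (1, 5, 1), (2, 11, 2), (3, 12, 3), (4, 20, 4);
insert into t2 values (1, 11, 1), (2, 11, 2), (3, 12, 3), (4, 3, 4);
""")

# Result of the subquery "select a2 from t2 where a2 > 10".
subq = [a2 for (a2,) in conn.execute("select a2 from t2 where a2 > 10")]
t1_rows = conn.execute("select id1, a1 from t1 order by id1").fetchall()

# Brute force: a1 > all (subquery) means a1 is greater than every a2.
all_ids = [id1 for id1, a1 in t1_rows if all(a1 > a2 for a2 in subq)]

# Rewritten form: compare against the maximum, computed by a single subquery.
max_ids = [
    row[0]
    for row in conn.execute(
        "select id1 from t1 "
        "where a1 > (select max(a2) from t2 where a2 > 10) order by id1"
    )
]

print(all_ids, max_ids)
```

Being greater than every value is the same as being greater than the largest one, which is why one evaluation of the subquery is enough.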

 

Another non-correlated all subquery, with its query execution plan:

mysql> explain extended select * from t1 where t1.a1 =all (select a2 from t2 where t2.a2=10);

+----+--------------------+-------+------+------+-------------+

| id | select_type    | table | type | key | extra    |

+----+--------------------+-------+------+------+-------------+

| 1 | primary      | t1  | all | null | using where |

| 2 | dependent subquery | t2  | all | null | using where |

+----+--------------------+-------+------+------+-------------+

2 rows in set, 1 warning (0.00 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,`test`.`t1`.`b1` as `b1`

from `test`.`t1`

where <not>(<in_optimizer>(

  `test`.`t1`.`a1`,<exists>(

    /* select#2 */ select 1 from `test`.`t2`

    where ((`test`.`t2`.`a2` = 10) and

      <if>(outer_field_is_not_null,

        ((<cache>(`test`.`t1`.`a1`) <> 10) or <cache>(isnull(10))),

        true

      )

    )

    having <if>(outer_field_is_not_null, <is_not_null_test>(`test`.`t2`.`a2`), true)

  )

))

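The execution plan shows a subquery (here marked dependent subquery), and the rewritten statement contains "exists". This indicates that MySQL optimizes "=all" subqueries with the "exists strategy", so MySQL supports optimizing "=all" subqueries.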
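
The shape of the rewritten statement, a negated exists with a "<>" comparison inside, can be reproduced by hand. The sketch below evaluates "=all" by brute force in Python (SQLite has no all operator) and compares it with a hand-written not exists query. It simplifies MySQL's rewrite by assuming that no NULL values are involved, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables modeled on t1/t2 from the article, with no NULL values.
conn.executescript("""
create table t1 (id1 integer primary key, a1 integer, b1 integer);
create table t2 (id2 integer primary key, a2 integer, b2 integer);
insert into t1 values (1, 5, 1), (2, 10, 2), (3, 12, 3);
insert into t2 values (1, 10, 1), (2, 10, 2), (3, 3, 3);
""")

# Result of the subquery "select a2 from t2 where a2 = 10".
subq = [a2 for (a2,) in conn.execute("select a2 from t2 where a2 = 10")]
t1_rows = conn.execute("select id1, a1 from t1 order by id1").fetchall()

# Brute force: a1 = all (subquery) means a1 equals every a2 returned.
all_ids = [id1 for id1, a1 in t1_rows if all(a1 == a2 for a2 in subq)]

# Exists-style rewrite: no subquery row may differ from a1.
not_exists_ids = [
    row[0]
    for row in conn.execute(
        "select id1 from t1 where not exists ("
        "  select 1 from t2 where t2.a2 = 10 and t1.a1 <> t2.a2"
        ") order by id1"
    )
]

print(all_ids, not_exists_ids)
```

Because the rewritten subquery refers to t1.a1, it has to be evaluated per outer row, which is consistent with the dependent subquery label in the plan.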

 

A third non-correlated all subquery, with its query execution plan:

mysql> explain extended select * from t1 where t1.a1 <all (select a2 from t2 where t2.a2=10);

+----+-------------+-------+------+------+-------------+

| id | select_type | table | type | key | extra    |

+----+-------------+-------+------+------+-------------+

| 1 | primary   | t1  | all | null | using where |

| 2 | subquery  | t2  | all | null | using where |

+----+-------------+-------+------+------+-------------+

2 rows in set, 1 warning (0.00 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,`test`.`t1`.`b1` as `b1`

from `test`.`t1`

where <not>((`test`.`t1`.`a1` >= <min>

  (/* select#2 */

    select `test`.`t2`.`a2`

    from `test`.`t2`

    where (`test`.`t2`.`a2` = 10)

  )

))

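The execution plan shows a subquery (subquery), but the rewritten statement constrains it with the ">= <min>" operator. Column a2, which the subquery selects, has a unique index, so the extreme value can be found through the index. MySQL therefore supports optimizing "<all" subqueries: the subquery is executed only once to obtain the minimum value.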

 

Example 6: MySQL supports optimizing some-type subqueries:

A subquery using ">some" is optimized; its query execution plan is as follows:

mysql> explain extended select * from t1 where t1.a1 >some (select a2 from t2 where t2.a2>10);

+----+-------------+-------+------+------+-------------+

| id | select_type | table | type | key | extra    |

+----+-------------+-------+------+------+-------------+

| 1 | primary   | t1  | all | null | using where |

| 2 | subquery  | t2  | all | null | using where |

+----+-------------+-------+------+------+-------------+

2 rows in set, 1 warning (0.05 sec)

After processing by the query optimizer, the statement becomes:

 /* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

   `test`.`t1`.`b1` as `b1`

from `test`.`t1`

where <nop>((`test`.`t1`.`a1` > (

  /* select#2 */

  select min(`test`.`t2`.`a2`)

  from `test`.`t2`

  where (`test`.`t2`.`a2` > 10)

)))

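The execution plan shows a subquery (subquery), but the rewritten statement constrains it with the "min" function. Column a2, which the subquery selects, has a unique index, so the extreme value can be found through the index. MySQL therefore supports optimizing ">some" subqueries: the subquery is executed only once to obtain the minimum value.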

 

A subquery using "=some" is optimized; its query execution plan is as follows:

mysql> explain extended select * from t1 where t1.a1 =some (select a2 from t2 where t2.a2=10);

+----+--------------+-------------+------+------+----------------------------------------------------+

| id | select_type | table    | type | key | extra  |

+----+--------------+-------------+------+------+----------------------------------------------------+

| 1 | simple    | <subquery2> | all | null | using where  |

| 1 | simple    | t1     | all | null | using where; using join buffer (block nested loop) |

| 2 | materialized | t2     | all | null | using where  |

+----+--------------+-------------+------+------+----------------------------------------------------+

3 rows in set, 1 warning (0.01 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

`test`.`t1`.`b1` as `b1`

from `test`.`t1` semi join (`test`.`t2`)

where ((`<subquery2>`.`a2` = 10) and (`test`.`t1`.`a1` = 10) and (`test`.`t2`.`a2` = 10))

The execution plan no longer contains a subquery step: table t2 was materialized and semi-joined with table t1.

 

A subquery using "<some" is optimized; its query execution plan is as follows:

mysql> explain extended select * from t1 where t1.a1 <some (select a2 from t2 where t2.a2=10);

+----+-------------+-------+------+------+-------------+

| id | select_type | table | type | key | extra    |

+----+-------------+-------+------+------+-------------+

| 1 | primary   | t1  | all | null | using where |

| 2 | subquery  | t2  | all | null | using where |

+----+-------------+-------+------+------+-------------+

2 rows in set, 1 warning (0.00 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

  `test`.`t1`.`b1` as `b1`

from `test`.`t1`

where <nop>(

  (

    `test`.`t1`.`a1` < (/* select#2 */

      select max(`test`.`t2`.`a2`)

      from `test`.`t2`

      where (`test`.`t2`.`a2` = 10)

    )

  )

)

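The execution plan shows a subquery (subquery), but the rewritten statement constrains it with the "max" function. Column a2, which the subquery selects, has a unique index, so the extreme value can be found through the index. MySQL therefore supports optimizing "<some" subqueries: the subquery is executed only once to obtain the maximum value.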
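
The min and max rewrites for ">some" and "<some" can be checked the same way. In MySQL, some is a synonym of any, so the check applies to both keywords. SQLite supports neither, so the sketch evaluates the some semantics by brute force in Python and compares them with the min and max forms. The rows are hypothetical, and the subquery is assumed to be non-empty and free of NULL values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables modeled on t1/t2 from the article, with no NULL values.
conn.executescript("""
create table t1 (id1 integer primary key, a1 integer, b1 integer);
create table t2 (id2 integer primary key, a2 integer, b2 integer);
insert into t1 values (1, 5, 1), (2, 11, 2), (3, 12, 3), (4, 20, 4);
insert into t2 values (1, 11, 1), (2, 11, 2), (3, 12, 3), (4, 3, 4);
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# Result of the subquery "select a2 from t2 where a2 > 10".
subq = [a2 for (a2,) in conn.execute("select a2 from t2 where a2 > 10")]
t1_rows = conn.execute("select id1, a1 from t1 order by id1").fetchall()

# a1 > some (subquery): greater than at least one value, i.e. greater than the minimum.
gt_some = [id1 for id1, a1 in t1_rows if any(a1 > a2 for a2 in subq)]
gt_min = ids(
    "select id1 from t1 "
    "where a1 > (select min(a2) from t2 where a2 > 10) order by id1"
)

# a1 < some (subquery): less than at least one value, i.e. less than the maximum.
lt_some = [id1 for id1, a1 in t1_rows if any(a1 < a2 for a2 in subq)]
lt_max = ids(
    "select id1 from t1 "
    "where a1 < (select max(a2) from t2 where a2 > 10) order by id1"
)

print(gt_some, gt_min, lt_some, lt_max)
```

Note the pairing: ">some" needs the minimum and "<some" needs the maximum, the opposite of the ">all" and "<all" cases.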

 

Example 7: MySQL supports optimizing any-type subqueries:

A subquery using ">any" is optimized; its query execution plan is as follows:

mysql> explain extended select * from t1 where t1.a1 >any (select a2 from t2 where t2.a2>10);

+----+-------------+-------+------+------+-------------+

| id | select_type | table | type | key | extra    |

+----+-------------+-------+------+------+-------------+

| 1 | primary   | t1  | all | null | using where |

| 2 | subquery  | t2  | all | null | using where |

+----+-------------+-------+------+------+-------------+

2 rows in set, 1 warning (0.00 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

  `test`.`t1`.`b1` as `b1`

from `test`.`t1`

where <nop>(

  (

    `test`.`t1`.`a1` > (/* select#2 */

      select min(`test`.`t2`.`a2`)

      from `test`.`t2`

      where (`test`.`t2`.`a2` > 10)

    )

  )

)

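The execution plan shows a subquery (subquery), but the rewritten statement constrains it with the "min" function. Column a2, which the subquery selects, has a unique index, so the extreme value can be found through the index. MySQL therefore supports optimizing ">any" subqueries: the subquery is executed only once to obtain the minimum value.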

 

A subquery using "=any" is optimized; its query execution plan is as follows:

mysql> explain extended select * from t1 where t1.a1 =any (select a2 from t2 where t2.a2>10);

+----+--------------+-------------+------+------+----------------------------------------------------+

| id | select_type | table    | type | key | extra  |

+----+--------------+-------------+------+------+----------------------------------------------------+

| 1 | simple    | <subquery2> | all | null | null  |

| 1 | simple    | t1     | all | null | using where; using join buffer (block nested loop) |

| 2 | materialized | t2     | all | null | using where  |

+----+--------------+-------------+------+------+----------------------------------------------------+

3 rows in set, 1 warning (0.02 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

  `test`.`t1`.`b1` as `b1`

from `test`.`t1` semi join (`test`.`t2`)

where ((`test`.`t1`.`a1` = `<subquery2>`.`a2`) and (`test`.`t2`.`a2` > 10))

The execution plan no longer contains a subquery step: table t2 was materialized and semi-joined with table t1.

 

A subquery using "<any" is optimized; its query execution plan is as follows:

mysql> explain extended select * from t1 where t1.a1 <any (select a2 from t2 where t2.a2>10);

+----+-------------+-------+------+------+-------------+

| id | select_type | table | type | key | extra    |

+----+-------------+-------+------+------+-------------+

| 1 | primary   | t1  | all | null | using where |

| 2 | subquery  | t2  | all | null | using where |

+----+-------------+-------+------+------+-------------+

2 rows in set, 1 warning (0.00 sec)

After processing by the query optimizer, the statement becomes:

/* select#1 */ select `test`.`t1`.`id1` as `id1`,`test`.`t1`.`a1` as `a1`,

  `test`.`t1`.`b1` as `b1`

from `test`.`t1`

where <nop>(

  (

    `test`.`t1`.`a1` < (/* select#2 */

      select max(`test`.`t2`.`a2`)

      from `test`.`t2`

      where (`test`.`t2`.`a2` > 10)

    )

  )

)

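The execution plan shows a subquery (subquery), but the rewritten statement constrains it with the "max" function. Column a2, which the subquery selects, has a unique index, so the extreme value can be found through the index. MySQL therefore supports optimizing "<any" subqueries: the subquery is executed only once to obtain the maximum value.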