欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

MySQL--Like %XXX% 优化测试

程序员文章站 2022-03-03 20:01:31
...

生产系统中遇见了一个like %XXX%的查询,工作中遇见过很多,但是没有实际测试过对它的优化。整理资料,总结常见的优化方法。

1、环境介绍

MySQL 8.0.16



2、对象信息介绍

MySQL [pion]>
MySQL [pion]> select count(*) from p_channel;
+----------+
| count(*) |
+----------+
|  7559141 |
+----------+
1 row in set (0.78 sec)

3、常见方法总结

(1)POSITION('substr' IN `field`)
(2)LOCATE('substr',str,pos)
(3)INSTR(`str`,'substr')
(4)mysql5.7以上的Generated Column功能
        (1)STORED
        (2)VIRTUAL
(5)添加一个新列存储该字段的反向值,进行sql修改
        (1)修改业务完成该功能
        (2)触发器完成该功能(不建议)

4、实际案例

业务SQL:
SELECT SQL_NO_CACHE  `id` AS `value`, `channel_name` AS `label`
FROM `p_channel`
WHERE `channel_name` LIKE '%qm-qtt4800383%'
ORDER BY `id` DESC
LIMIT 100

+----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
| id | select_type | table     | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                            |
+----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
|  1 | SIMPLE      | p_channel | NULL       | index | NULL          | PRIMARY | 4       | NULL |  100 |    11.11 | Using where; Backward index scan |
+----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------+

执行时间:
4061ms/3937ms/3906ms/3853ms
参考SQL:
SELECT SQL_NO_CACHE  `id` AS `value`, `channel_name` AS `label`
FROM `p_channel`
WHERE `channel_name` LIKE 'qm-qtt4800383%'
ORDER BY `id` DESC
LIMIT 100;

执行计划:
+----+-------------+-----------+------------+-------+---------------------+--------------+---------+------+------+----------+------------------------------------------+
| id | select_type | table     | partitions | type  | possible_keys       | key          | key_len | ref  | rows | filtered | Extra                                    |
+----+-------------+-----------+------------+-------+---------------------+--------------+---------+------+------+----------+------------------------------------------+
|  1 | SIMPLE      | p_channel | NULL       | range | app_id,channel_name | channel_name | 194     | NULL |    1 |   100.00 | Using where; Using index; Using filesort |
+----+-------------+-----------+------------+-------+---------------------+--------------+---------+------+------+----------+------------------------------------------+

执行时间:
25ms/27ms/22ms/25ms

从上述的执行计划可以看出,业务SQL并没有使用channel_name上的索引,使用了PRIMARY,Extra中因为order by desc的原由有了Backward index scan(降序索引), 通过对比参考SQL,我们能够确定当SQL使用索引的时候,查询的速度是非常快的,索引显现的非常重要。

5、测试案例

## SQL改写,将排序放到最外层

select SQL_NO_CACHE * from (
SELECT `id` AS `value`, `channel_name` AS `label`
FROM `p_channel` force index (channel_name) 
WHERE `channel_name` LIKE '%qm-qtt4800383%') a 
ORDER BY a.`value` DESC
LIMIT 100

+----+-------------+-----------+------------+-------+---------------+--------------+---------+------+---------+----------+------------------------------------------+
| id | select_type | table     | partitions | type  | possible_keys | key          | key_len | ref  | rows    | filtered | Extra                                    |
+----+-------------+-----------+------------+-------+---------------+--------------+---------+------+---------+----------+------------------------------------------+
|  1 | SIMPLE      | p_channel | NULL       | index | NULL          | channel_name | 194     | NULL | 7075362 |    11.11 | Using where; Using index; Using filesort |
+----+-------------+-----------+------------+-------+---------------+--------------+---------+------+---------+----------+------------------------------------------+

执行时间:
2308ms/2268ms/2259ms/2324ms
## POSITION('substr' IN `field`)

select SQL_NO_CACHE * from (
SELECT `id` AS `value`, `channel_name` AS `label`
FROM `p_channel`  force index (channel_name) 
WHERE POSITION('qm-qtt4800383' IN  `channel_name`) > 0) a 
ORDER BY a.`value` DESC
LIMIT 100

执行时间:
1824ms/1842ms/1826ms/1820ms
## LOCATE('substr',str,pos)

select SQL_NO_CACHE * from (
SELECT `id` AS `value`, `channel_name` AS `label`
FROM `p_channel` force index (channel_name) 
WHERE LOCATE('qm-qtt4800383', `channel_name`) > 0) a 
ORDER BY a.`value` DESC
LIMIT 100

执行时间:
1827ms/1837ms/1824ms/1817ms
## INSTR(`str`,'substr')

select SQL_NO_CACHE * from (
SELECT `id` AS `value`, `channel_name` AS `label`
FROM `p_channel` force index (channel_name) 
WHERE INSTR(`channel_name`,'qm-qtt4800383' ) > 0) a 
ORDER BY a.`value` DESC
LIMIT 100

执行时间:
1830ms/1833ms/1832ms/1819ms

从上述4个案例中,我们能够观察到通过改写SQL,或者使用MySQL内置函数,我们能够有一定的优化,但是SLQ的之际的执行时间还是在1s-2s左右,这样的优化,对大量查询而言相当于没有优化。

## Generated Column For STORED

添加虚拟列:
ALTER TABLE p_channel ADD COLUMN channel_eman2 varchar(64) GENERATED ALWAYS AS (REVERSE(channel_name)) stored;

添加索引:
ALTER TABLE p_channel add index ind_c_eman2(channel_eman2);

改写SQL:
SELECT SQL_NO_CACHE  `id` AS `value`, `channel_name` AS `label`
FROM `p_channel`
WHERE `channel_name` LIKE 'qm-qtt4800383%'
union all
SELECT SQL_NO_CACHE  `id` AS `value`, `channel_name` AS `label`
FROM `p_channel`
WHERE `channel_eman2` LIKE ('3830084ttq-mq%');

执行计划:
+----+-------------+-----------+------------+-------+---------------------+--------------+---------+------+------+----------+--------------------------+
| id | select_type | table     | partitions | type  | possible_keys       | key          | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-----------+------------+-------+---------------------+--------------+---------+------+------+----------+--------------------------+
|  1 | PRIMARY     | p_channel | NULL       | range | app_id,channel_name | channel_name | 194     | NULL |    1 |   100.00 | Using where; Using index |
|  2 | UNION       | p_channel | NULL       | range | ind_c_eman2         | ind_c_eman2  | 195     | NULL |    1 |   100.00 | Using index condition    |
+----+-------------+-----------+------------+-------+---------------------+--------------+---------+------+------+----------+--------------------------+

执行时间:
24ms/24ms/23ms/27ms

删除虚拟列:
ALTER TABLE p_channel drop column channel_eman2;
## Generated Column For VIRTUAL

添加虚拟列:
ALTER TABLE p_channel ADD COLUMN channel_eman1 varchar(64) GENERATED ALWAYS AS (REVERSE(channel_name));

添加索引:
ALTER TABLE p_channel add index ind_c_eman1(channel_eman1);

改写SQL: 
SELECT SQL_NO_CACHE  `id` AS `value`, `channel_name` AS `label`
FROM `p_channel`
WHERE `channel_name` LIKE 'qm-qtt4800383%'
union all
SELECT SQL_NO_CACHE  `id` AS `value`, `channel_name` AS `label`
FROM `p_channel`
WHERE `channel_eman1` LIKE ('3830084ttq-mq%');

执行计划:
+----+-------------+-----------+------------+-------+---------------------+--------------+---------+------+------+----------+--------------------------+
| id | select_type | table     | partitions | type  | possible_keys       | key          | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-----------+------------+-------+---------------------+--------------+---------+------+------+----------+--------------------------+
|  1 | PRIMARY     | p_channel | NULL       | range | app_id,channel_name | channel_name | 194     | NULL |    1 |   100.00 | Using where; Using index |
|  2 | UNION       | p_channel | NULL       | range | ind_c_eman1         | ind_c_eman1  | 195     | NULL |    1 |   100.00 | Using index condition    |
+----+-------------+-----------+------------+-------+---------------------+--------------+---------+------+------+----------+--------------------------+

删除虚拟列:
ALTER TABLE p_channel drop column channel_eman1;

执行时间:
24ms/24ms/25ms/26ms
## 新增物理反向字段

添加字段:
ALTER TABLE p_channel ADD COLUMN channel_eman3 varchar(64);

更新数据:
update p_channel set  channel_eman3 = REVERSE(channel_name);

添加索引:
ALTER TABLE p_channel add index ind_c_eman3(channel_eman3);

改写SQL
SELECT SQL_NO_CACHE  `id` AS `value`, `channel_name` AS `label`
FROM `p_channel`
WHERE `channel_name` LIKE 'qm-qtt4800383%'
union all
SELECT SQL_NO_CACHE  `id` AS `value`, `channel_name` AS `label`
FROM `p_channel`
WHERE `channel_eman3` LIKE ('3830084ttq-mq%');

+----+-------------+-----------+------------+-------+---------------------+--------------+---------+------+------+----------+--------------------------+
| id | select_type | table     | partitions | type  | possible_keys       | key          | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-----------+------------+-------+---------------------+--------------+---------+------+------+----------+--------------------------+
|  1 | PRIMARY     | p_channel | NULL       | range | app_id,channel_name | channel_name | 194     | NULL |    1 |   100.00 | Using where; Using index |
|  2 | UNION       | p_channel | NULL       | range | ind_c_eman2         | ind_c_eman2  | 195     | NULL |    1 |   100.00 | Using index condition    |
+----+-------------+-----------+------------+-------+---------------------+--------------+---------+------+------+----------+--------------------------+

执行时间:
20ms/23ms/22ms/26ms

删除索引:
ALTER TABLE p_channel drop column channel_eman3;

从上述三种情形可以看出,SQL的优化效果非常可观。但需要注意的是使用Generated Column,存在一定的限制。如下所示:

Table 15.19 Online DDL Support for Generated Column Operations

Operation Instant In Place Rebuilds Table Permits Concurrent DML Only Modifies Metadata
Adding a STORED column No No Yes No No
Modifying STORED column order No No Yes No No
Dropping a STORED column No Yes Yes Yes No
Adding a VIRTUAL column Yes Yes No Yes Yes
Modifying VIRTUAL column order No No Yes No No
Dropping a VIRTUAL column Yes Yes No Yes Yes
参考资料:
https://www.jianshu.com/p/9f83eebc8606
https://cloud.tencent.com/developer/article/1673255
https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl.html
相关标签: SQL优化 mysql