MYSQL索引
什么是索引? 举个例子:新华字典,有目录,有正文内容。索引就相当于目录,正文内容就相当于数据。 索引有什么用? 索引用于快速查找在某列中有一特定值的行。 一条查询语句,如果没有索引,将对全表进行扫描。 如果所有的数据页面都不在内存中,则需要从硬
什么是索引?
举个例子:新华字典,有目录,有正文内容。索引就相当于目录,正文内容就相当于数据。
索引有什么用?
索引用于快速查找在某列中有一特定值的行。
一条查询语句,如果没有索引,将对全表进行扫描。
如果所有的数据页面都不在内存中,则需要从硬盘上读取这些页面,从而产生大量的I/O,每次I/O都会消耗一定时间。
最终,总的查询时间,会大的惊人。
使用索引
若此时查询列有个索引,MYSQL 就能快速定位到具体位置,找出相关列,将指定数据页面读入内存,I/O 就会大大降低。
以字典为例,查找字母为 Z 开头的某个单词,先通过索引定位 Z 开头的单词的起始位置,从这里开始查询,从而节省了大量的时间。
一次查询能使用多个索引吗?
一次查询只能使用一个索引。
哪些常见情况不能用索引?
like “%xxx” not in , != 对列进行函数运算的情况(如 where avg(age) = “20”)
如何分析是否正确用到索引?
explain select ...
联合索引的问题
假设,你有一个三列联合的索引:(col1, col2, col3)。
那么你将拥有三种索引使用方式:
(col1) (col1, col2) (col1, col2, col3)
上述说的就是最左前缀 – leftmost prefix。
So,当你有多列查询需求时,你可以考虑建一个合适的联合索引。
关于like查询
like 的参数不以非通配符 % 开头的字符常量,就能使用索引。
SELECT * FROM tbl_name WHERE key_col LIKE 'something%'; //匹配以something开头的字符串 SELECT * FROM tbl_name WHERE key_col LIKE '%something%'; //不使用索引 SELECT * FROM tbl_name WHERE key_col LIKE 'something'; //精确匹配,等效于 “ = ” 运算符
假如,你在看一本成语词典,目录是按成语拼音顺序建立。
查询需求是:你想找以 “一” 字开头的成语(“一%”),和你想找包含一字的成语(“%一%”)。
你觉得哪个会更快呢?
索引越多越好?
大多数情况下,索引都能大幅度提高查询效率。
数据的增、删、改操作都需要维护索引,索引一多,意味着维护成本高了。 更多的索引需要更多的存储空间。比如:20页的书,有15页的目录?这就不合理了。 小表建索引,往往适得其反。比如:读个2页的宣传手册,你还先去找目录?
什么样的字段不适合建索引?
更新非常频繁的列 列的值唯一性太小,比如性别,Enum 类型的字段等 太长的列 FROM:http://blog.segmentfault.com/vboy1010/1190000000461418
Mysql索引设计原则:
任务描述:
假设一高频查询如下
SELECT * FROM user WHERE area=’amoy’ AND sex=0 ORDER BY last_login DESC limit 30;
如何建立索引?描述考虑的过程
user表如下:
初始化100W条数据,其中,area要通过IP查询生成,sex为 0,1 随机
CREATE TABLE?user
?(id
?int(10) NOT NULL AUTO_INCREMENT COMMENT ‘自增编号’,username
?varchar(30) NOT NULL DEFAULT ’0′ COMMENT ‘用户名’,password
?varchar(30) NOT NULL DEFAULT ’0′ COMMENT ‘密码’,area
?varchar(30) NOT NULL COMMENT ‘地址’,sex
?int(10) NOT NULL COMMENT ‘性别1,男;2,女。’,last_login
?int(10) NOT NULL COMMENT ‘最近一次登录时间戳’,
PRIMARY KEY (id
)
) ENGINE=InnoDB AUTO_INCREMENT=892013 DEFAULT CHARSET=latin1
最终我的索引
(last_login,area)
索引原则:
1.where和order by等的字段建立索引
2.使用唯一索引:对于last_login,area等字段重复的次数比较少,可以使用索引;而sex无非就两个值:性别1,男;2,不值得索引
3.多列索引:不要为每一个列单独建立索引,这样并不能将mysql索引的效率最大化。使用“索引合并策略”
4.选择合理的索引列顺序:索引列的顺序意味着索引首先按照最左列进行排序,然后是第二列,以此类推。如(last_login,area)会先按照 last_login 进行排序,然后才是area。
5.将选择性最高的索引放到前面,也就是会所按照这个条件搜索到的数据最少,选择性就越高,比如选择性:last_login> area> sex。
6.索引不是越多越好,适合的索引可以提高查询效率,但是会降低写入效率,根据项目保持两者的平衡性最好了。
总结上面,首先sex不适合建立索引,有没有索引对于效率的提升意义不大,其次索引会按照最左列进行排序,因此将last_login放到最前面
测试过程:
user表
没有任何索引的查询相关日志:
SELECT * FROM user WHERE area=’美国ATT用户’ AND sex=0 ORDER BY last_login DESC limit 30; 0.57s
SELECT * FROM user WHERE area=’泰国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.56s
SELECT * FROM user WHERE area=’*省*大宽频’ AND sex=0 ORDER BY last_login DESC limit 30; 0.55s
SELECT * FROM user WHERE area=’美国弗吉尼亚州’ AND sex=0 ORDER BY last_login DESC limit 30; 0.59s
SELECT * FROM user WHERE area=’德国奔驰汽车’ AND sex=0 ORDER BY last_login DESC limit 30; 0.55s
SELECT * FROM user WHERE area=’*省中华电信’ AND sex=0 ORDER BY last_login DESC limit 30; 0.55s
SELECT * FROM user WHERE area=’韩国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.57s
SELECT * FROM user WHERE area=’拉美地区’ AND sex=0 ORDER BY last_login DESC limit 30; 0.58s
SELECT * FROM user WHERE area=’美国纽约(Prudential)’ AND sex=0 ORDER BY last_login DESC limit 30; 0.57s
SELECT * FROM user WHERE area=’印度尼西亚’ AND sex=0 ORDER BY last_login DESC limit 30; 0.57s
共花费时间:5.66s
建立索引area:
ALTER TABLE?user
?ADD INDEX?index_area
?(area
) ;
SELECT * FROM user WHERE area=’美国ATT用户’ AND sex=0 ORDER BY last_login DESC limit 30; 0.06s
SELECT * FROM user WHERE area=’泰国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.02s
SELECT * FROM user WHERE area=’*省*大宽频’ AND sex=0 ORDER BY last_login DESC limit 30; 0.00s
SELECT * FROM user WHERE area=’美国弗吉尼亚州’ AND sex=0 ORDER BY last_login DESC limit 30; 0.10s
SELECT * FROM user WHERE area=’德国奔驰汽车’ AND sex=0 ORDER BY last_login DESC limit 30; 0.04s
SELECT * FROM user WHERE area=’*省中华电信’ AND sex=0 ORDER BY last_login DESC limit 30; 0.02s
SELECT * FROM user WHERE area=’韩国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.11s
SELECT * FROM user WHERE area=’拉美地区’ AND sex=0 ORDER BY last_login DESC limit 30; 0.20s
SELECT * FROM user WHERE area=’美国纽约(Prudential)’ AND sex=0 ORDER BY last_login DESC limit 30; 0.07s
SELECT * FROM user WHERE area=’印度尼西亚’ AND sex=0 ORDER BY last_login DESC limit 30; 0.04s
共花费时间:0.66s
可见,建立area以后对性能的影响是巨大的(5.66/0.66 约为8.5758倍)
删除索引:ALTER TABLE?user
?DROP INDEX?index_area
;
删除area索引发现时间又变成了0.57s
建立last_login索引:
SELECT * FROM user WHERE area=’美国ATT用户’ AND sex=0 ORDER BY last_login DESC limit 30; 0.03s
SELECT * FROM user WHERE area=’泰国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.09s
SELECT * FROM user WHERE area=’*省*大宽频’ AND sex=0 ORDER BY last_login DESC limit 30; 0.51s
SELECT * FROM user WHERE area=’美国弗吉尼亚州’ AND sex=0 ORDER BY last_login DESC limit 30; 0.01s
SELECT * FROM user WHERE area=’德国奔驰汽车’ AND sex=0 ORDER BY last_login DESC limit 30; 0.04s
SELECT * FROM user WHERE area=’*省中华电信’ AND sex=0 ORDER BY last_login DESC limit 30; 0.07s
SELECT * FROM user WHERE area=’韩国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.01s
SELECT * FROM user WHERE area=’拉美地区’ AND sex=0 ORDER BY last_login DESC limit 30; 0.01s
SELECT * FROM user WHERE area=’美国纽约(Prudential)’ AND sex=0 ORDER BY last_login DESC limit 30; 0.04s
SELECT * FROM user WHERE area=’印度尼西亚’ AND sex=0 ORDER BY last_login DESC limit 30; 0.06s
共花费时间:0.87s
同样能够提升性能(5.66/0.87 约为6.5057倍)
建立sex索引:
ALTER TABLE?user
?ADD INDEX?index_sex
?(sex
) ;
SELECT * FROM user WHERE area=’美国ATT用户’ AND sex=0 ORDER BY last_login DESC limit 30; 0.87s
SELECT * FROM user WHERE area=’泰国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.87s
SELECT * FROM user WHERE area=’*省*大宽频’ AND sex=0 ORDER BY last_login DESC limit 30; 0.87s
SELECT * FROM user WHERE area=’美国弗吉尼亚州’ AND sex=0 ORDER BY last_login DESC limit 30; 0.89s
SELECT * FROM user WHERE area=’德国奔驰汽车’ AND sex=0 ORDER BY last_login DESC limit 30; 0.88s
SELECT * FROM user WHERE area=’*省中华电信’ AND sex=0 ORDER BY last_login DESC limit 30; 0.87s
SELECT * FROM user WHERE area=’韩国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.86s
SELECT * FROM user WHERE area=’拉美地区’ AND sex=0 ORDER BY last_login DESC limit 30; 0.88s
SELECT * FROM user WHERE area=’美国纽约(Prudential)’ AND sex=0 ORDER BY last_login DESC limit 30; 0.87s
SELECT * FROM user WHERE area=’印度尼西亚’ AND sex=0 ORDER BY last_login DESC limit 30; 0.87s
共花费时间:8.73s
同样能够提升性能(5.66s/8.73 约为0.6483倍)效率反而降低了??求解?
建立这个sex索引还不如不建。
删除索引:
ALTER TABLE?user
?DROP INDEX?index_sex
;
发现时间又变成了0.57s左右,
建立两个单独的索引:
ALTER TABLE?user
ADD INDEX?index_area
?(area
) ,
ADD INDEX?index_last_login
?(last_login
) ;
SELECT * FROM user WHERE area=’美国ATT用户’ AND sex=0 ORDER BY last_login DESC limit 30; 0.09s
SELECT * FROM user WHERE area=’泰国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.33s
SELECT * FROM user WHERE area=’*省*大宽频’ AND sex=0 ORDER BY last_login DESC limit 30; 0.21s
SELECT * FROM user WHERE area=’美国弗吉尼亚州’ AND sex=0 ORDER BY last_login DESC limit 30; 0.01s
SELECT * FROM user WHERE area=’德国奔驰汽车’ AND sex=0 ORDER BY last_login DESC limit 30; 0.28s
SELECT * FROM user WHERE area=’*省中华电信’ AND sex=0 ORDER BY last_login DESC limit 30; 0.02s
SELECT * FROM user WHERE area=’韩国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.02s
SELECT * FROM user WHERE area=’拉美地区’ AND sex=0 ORDER BY last_login DESC limit 30; 0.02s
SELECT * FROM user WHERE area=’美国纽约(Prudential)’ AND sex=0 ORDER BY last_login DESC limit 30; 0.03s
SELECT * FROM user WHERE area=’印度尼西亚’ AND sex=0 ORDER BY last_login DESC limit 30; 0.67s
发现建立两个单独的索引还不如只建立一个索引
删除索引:
发现时间又变成了0.57s左右,
建立一个的联合索引:
ALTER TABLE?user
ADD INDEX?index_last_login_area
?(last_login
,area
) ,
SELECT * FROM user WHERE area=’美国ATT用户’ AND sex=0 ORDER BY last_login DESC limit 30; 0.00s
SELECT * FROM user WHERE area=’泰国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.00s
SELECT * FROM user WHERE area=’*省*大宽频’ AND sex=0 ORDER BY last_login DESC limit 30; 0.00s
SELECT * FROM user WHERE area=’美国弗吉尼亚州’ AND sex=0 ORDER BY last_login DESC limit 30; 0.00s
SELECT * FROM user WHERE area=’德国奔驰汽车’ AND sex=0 ORDER BY last_login DESC limit 30; 0.00s
SELECT * FROM user WHERE area=’*省中华电信’ AND sex=0 ORDER BY last_login DESC limit 30; 0.00s
SELECT * FROM user WHERE area=’韩国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.00s
SELECT * FROM user WHERE area=’拉美地区’ AND sex=0 ORDER BY last_login DESC limit 30; 0.00s
SELECT * FROM user WHERE area=’美国纽约(Prudential)’ AND sex=0 ORDER BY last_login DESC limit 30; 0.00s
SELECT * FROM user WHERE area=’印度尼西亚’ AND sex=0 ORDER BY last_login DESC limit 30; 0.00s
额,第二条数据这是怎么了,我测试了5次都在这附近晃悠哈!
这尼玛,找对索引啦!就该这么建立,查询不出来需要的时间啦!估计就是我们需要的索引啦!!!!
删除索引:
发现时间又变成了0.57s左右,
建立一个的联合索引:
ALTER TABLE?user
ADD INDEX?index_sex_last_login_area
?(sex
,last_login
,area
)
SELECT * FROM user WHERE area=’美国ATT用户’ AND sex=0 ORDER BY last_login DESC limit 30; 0.18s
SELECT * FROM user WHERE area=’泰国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.17s
SELECT * FROM user WHERE area=’*省*大宽频’ AND sex=0 ORDER BY last_login DESC limit 30; 0.81s
SELECT * FROM user WHERE area=’美国弗吉尼亚州’ AND sex=0 ORDER BY last_login DESC limit 30; 0.01s
SELECT * FROM user WHERE area=’德国奔驰汽车’ AND sex=0 ORDER BY last_login DESC limit 30; 0.02s
SELECT * FROM user WHERE area=’*省中华电信’ AND sex=0 ORDER BY last_login DESC limit 30; 0.04s
SELECT * FROM user WHERE area=’韩国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.01s
SELECT * FROM user WHERE area=’拉美地区’ AND sex=0 ORDER BY last_login DESC limit 30; 0.01s
SELECT * FROM user WHERE area=’美国纽约(Prudential)’ AND sex=0 ORDER BY last_login DESC limit 30; 0.03s
SELECT * FROM user WHERE area=’印度尼西亚’ AND sex=0 ORDER BY last_login DESC limit 30; 0.04s
sex怎么总是你在拖后腿啊!把你调整到索引的最后一个吧!
删除索引:
发现时间又变成了0.57s左右,
建立一个的联合索引:
ALTER TABLE?user
ADD INDEX?index_last_login_area_sex
?(area
,last_login
,sex
)
SELECT * FROM user WHERE area=’美国ATT用户’ AND sex=0 ORDER BY last_login DESC limit 30; 0.03s
SELECT * FROM user WHERE area=’泰国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.07s
SELECT * FROM user WHERE area=’*省*大宽频’ AND sex=0 ORDER BY last_login DESC limit 30; 0.50s
SELECT * FROM user WHERE area=’美国弗吉尼亚州’ AND sex=0 ORDER BY last_login DESC limit 30; 0.02s
SELECT * FROM user WHERE area=’德国奔驰汽车’ AND sex=0 ORDER BY last_login DESC limit 30; 0.05s
SELECT * FROM user WHERE area=’*省中华电信’ AND sex=0 ORDER BY last_login DESC limit 30; 0.06s
SELECT * FROM user WHERE area=’韩国’ AND sex=0 ORDER BY last_login DESC limit 30; 0.02s
SELECT * FROM user WHERE area=’拉美地区’ AND sex=0 ORDER BY last_login DESC limit 30; 0.02s
SELECT * FROM user WHERE area=’美国纽约(Prudential)’ AND sex=0 ORDER BY last_login DESC limit 30; 0.04s
SELECT * FROM user WHERE area=’印度尼西亚’ AND sex=0 ORDER BY last_login DESC limit 30; 0.06s
综上所述:1.建立索引不一定能够加快查询效率如sex这种给重复次数特别多的列增加索引如sex这种会降低查询效率,具体的原因有待查找
2.给重复次数比较少的列增加u讴吟还是能够大幅度提高效率
3.给where和orderby之后的字段添加索引才会加快查询效率
4.为每一个列单独建立索引,不能将索引的效率最大化,应该使用索引合并策略,即根据查询条件,建立联合索引
5.联合索引的顺序问题:将选择性高的索引放到前面
6.根据资料建立索引意味着索引按照最左列进行排序,然后事第二列,以此类推。如(last_login ,area)就会按照last_login进行排序,然后才是area
7.根据这次的这个查询条件来说最好的索引是:ALTER TABLE?user
ADD INDEX?index_last_login_area
(last_login
,area
)。
本文出自:http://blog.chedushi.com, 原文地址:http://blog.chedushi.com/archives/7536, 感谢原作者分享。
推荐阅读