SQL Server中如何识别、查找未使用的索引(unused indexes)
在SQL Server中,索引是优化SQL性能的一大法宝。但是由于各种原因,索引会被当做“银弹”滥用,一方面有些开发人员(甚至是部分数据库管理员)有一些陋习,不管三七二十一,总是根据所谓的"感觉"或“经验”先增加一些索引,而不管这些索引是否未被使用或是否合理。另外一方面在数据库的生命周期中,需求总是在变化,业务也在变化,有些当初创建的有效索引可能已经变成了unused index了。变成了数据库性能的累赘; 另外,部分数据库管理员其实很少清理索引(冗余索引,重复索引,未使用索引)。其实不管是出于性能考虑,还是数据库维护管理的需要,数据库中的未使用索引(unused index)都需要定期清理,因为这些未使用索引(unused index)不但不会提高查询性能,还会影响DML操作的性能、浪费存储空间等等。本文主要总结一下,如何找到识别、查找哪些未使用的索引(unused index)
如何找到未使用索引呢? 在ORACLE数据库中提供了监控索引使用情况的功能。虽然在SQL Server中没有提供此类功能,但是提供了DMV视图sys.dm_db_index_usage_stats ,关于这个视图,详细信息可以参考官方文档,下面仅仅介绍需要用到的几个字段
user_scans 用户查询执行的扫描次数。
user_seeks 用户查询执行的搜索次数。
user_lookups 用户查询执行的书签查找次数。
user_updates 通过用户查询执行的更新次数。这表示插入、 删除,更新的次数,而不是受影响的实际行数。
例如,如果你删除在一个语句中的 1000行,此计数递增 1
Number of updates by user queries. This includes Insert, Delete, and Updates representing
number of operations done not the actual rows affected. For example, if you delete 1000
rows in one statement, this count increments by 1
我们可以使用下面SQL语句查找当前数据库中的未使用索引(unused index):
SELECT 'SQL Server Instance Start with ' + CONVERT(VARCHAR(16),create_date,120) FROM sys.databases
WHERE database_id =2;
SELECT DB_NAME(diu.database_id) AS DatabaseName ,
s.name +'.' +QUOTENAME(o.name) AS TableName ,
i.index_id AS IndexID ,
i.name AS IndexName ,
CASE WHEN i.is_unique =1 THEN 'UNIQUE INDEX'
ELSE 'NOT UNIQUE INDEX' END AS IS_UNIQUE,
CASE WHEN i.is_disabled=1 THEN 'DISABLE'
ELSE 'ENABLE' END AS IndexStatus,
o.create_date AS IndexCreated,
STATS_DATE(o.object_id,i.index_id) AS StatisticsUpdateDate,
diu.user_seeks AS UserSeek ,
diu.user_scans AS UserScans ,
diu.user_lookups AS UserLookups ,
diu.user_updates AS UserUpdates ,
p.TableRows ,
'DROP INDEX ' + QUOTENAME(i.name)
+ ' ON ' + QUOTENAME(s.name) + '.'
+ QUOTENAME(OBJECT_NAME(diu.object_id)) +';' AS 'Drop Index Statement'
FROM sys.dm_db_index_usage_stats diu
INNER JOIN sys.indexes i ON i.index_id = diu.index_id
AND diu.object_id = i.object_id
INNER JOIN sys.objects o ON diu.object_id = o.object_id
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
INNER JOIN ( SELECT SUM(p.rows) TableRows ,
p.index_id ,
p.object_id
FROM sys.partitions p
GROUP BY p.index_id ,
p.object_id
) p ON p.index_id = diu.index_id
AND diu.object_id = p.object_id
WHERE OBJECTPROPERTY(diu.object_id, 'IsUserTable') = 1
AND diu.database_id = DB_ID()
AND i.is_primary_key = 0 --排除主键索引
AND i.is_unique_constraint = 0 --排除唯一索引
AND diu.user_updates <> 0 --排除没有数据变化的索引
AND diu.user_lookups = 0
AND diu.user_seeks = 0
AND diu.user_scans = 0
AND i.name IS NOT NULL --排除那些没有任何索引的堆表
ORDER BY ( diu.user_seeks + diu.user_scans + diu.user_lookups ) ASC,diu.user_updates DESC;
GO
需要注意的几点:
1:sys.dm_db_index_usage_stats返回索引的被使用的信息,但是这个DMV视图中的数据是自数据库服务启动以来累计收集的数据(只要重启SQL Server服务,该视图的计数器就初始化为空。 而且,当分离或关闭数据库时(例如,由于 AUTO_CLOSE 设置为 ON),便会删除与该数据库关联的所有记录。),所以,如果数据库只运行了几天,那么这个视图的数据有可能不是特别准确(例如,有些OLAP的批处理或作业,一个月才运行一次)。所以在判断分析前,一定要查看数据库服务已经运行多长时间了。一般合适的时间是一个月以上,最好是两个月以上。
2:sys.dm_db_index_usage_stats不返回有关内存列存储索引的信息
3:注意字段IndexCreated,如果索引是最近几天创建的,也要谨慎分析,不要急于删除。
4:注意条件里面有些字段过滤条件,其实都是包含一定业务意义的。
另外,上面脚本只能查询当前数据库的未使用索引,如果需要查询当前实例下的所有数据库,那么可以使用下面脚本
EXEC sp_MSforeachdb 'USE [?] ;
SELECT DB_NAME(diu.database_id) AS DatabaseName ,
s.name +''.'' +QUOTENAME(o.name) AS TableName ,
i.index_id AS IndexID ,
i.name AS IndexName ,
CASE WHEN i.is_unique =1 THEN ''UNIQUE INDEX''
ELSE ''NOT UNIQUE INDEX'' END AS IS_UNIQUE,
CASE WHEN i.is_disabled=1 THEN ''DISABLE''
ELSE ''ENABLE'' END AS IndexStatus,
o.create_date AS IndexCreated,
STATS_DATE(o.object_id,i.index_id) AS StatisticsUpdateDate,
diu.user_seeks AS UserSeek ,
diu.user_scans AS UserScans ,
diu.user_lookups AS UserLookups ,
diu.user_updates AS UserUpdates ,
p.TableRows ,
''DROP INDEX '' + QUOTENAME(i.name)
+ '' ON '' + QUOTENAME(s.name) + ''.''
+ QUOTENAME(OBJECT_NAME(diu.object_id)) +'';'' AS ''Drop Index Statement''
FROM sys.dm_db_index_usage_stats diu
INNER JOIN sys.indexes i ON i.index_id = diu.index_id
AND diu.object_id = i.object_id
INNER JOIN sys.objects o ON diu.object_id = o.object_id
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
INNER JOIN ( SELECT SUM(p.rows) TableRows ,
p.index_id ,
p.object_id
FROM sys.partitions p
GROUP BY p.index_id ,
p.object_id
) p ON p.index_id = diu.index_id
AND diu.object_id = p.object_id
WHERE OBJECTPROPERTY(diu.object_id, ''IsUserTable'') = 1
AND diu.database_id = DB_ID()
AND i.is_primary_key = 0 --排除主键索引
AND i.is_unique_constraint = 0 --排除唯一索引
AND diu.user_updates <> 0 --排除没有数据变化的索引
AND diu.user_lookups = 0
AND diu.user_seeks = 0
AND diu.user_scans = 0
AND i.name is not null
ORDER BY ( diu.user_seeks + diu.user_scans + diu.user_lookups ) ASC,diu.user_updates DESC;
'
另外,出于谨慎考虑,在删除索引前,必须先保留那些即将删除的索引的脚本,以防误删索引时(当然这种情况极少见),能够回滚,及时补救。所以可以使用下面脚本生成那些unused idnex的创建脚本。
SELECT 'SQL Server Instance Start with ' + CONVERT(VARCHAR(16),create_date,120) FROM sys.databases
WHERE database_id =2;
IF EXISTS(SELECT * FROM tempdb.dbo.sysobjects WHERE id=OBJECT_ID('tempdb.dbo.#index_stat'))
BEGIN
DROP TABLE #index_stat;
END
GO
SELECT DB_NAME(diu.database_id) AS DatabaseName ,
o.object_id AS object_id ,
s.name +'.' +QUOTENAME(o.name) AS TableName ,
i.index_id AS IndexID ,
i.name AS IndexName ,
CASE WHEN i.is_unique =1 THEN 'UNIQUE INDEX'
ELSE 'NOT UNIQUE INDEX' END AS IS_UNIQUE,
CASE WHEN i.is_disabled=1 THEN 'DISABLE'
ELSE 'ENABLE' END AS IndexStatus,
o.create_date AS IndexCreated,
STATS_DATE(o.object_id,i.index_id) AS StatisticsUpdateDate,
diu.user_seeks AS UserSeek ,
diu.user_scans AS UserScans ,
diu.user_lookups AS UserLookups ,
diu.user_updates AS UserUpdates ,
p.TableRows ,
'DROP INDEX ' + QUOTENAME(i.name)
+ ' ON ' + QUOTENAME(s.name) + '.'
+ QUOTENAME(OBJECT_NAME(diu.object_id)) +';' AS 'Drop Index Statement' INTO #index_stat
FROM sys.dm_db_index_usage_stats diu
INNER JOIN sys.indexes i ON i.index_id = diu.index_id
AND diu.object_id = i.object_id
INNER JOIN sys.objects o ON diu.object_id = o.object_id
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
INNER JOIN ( SELECT SUM(p.rows) TableRows ,
p.index_id ,
p.object_id
FROM sys.partitions p
GROUP BY p.index_id ,
p.object_id
) p ON p.index_id = diu.index_id
AND diu.object_id = p.object_id
WHERE OBJECTPROPERTY(diu.object_id, 'IsUserTable') = 1
AND diu.database_id = DB_ID()
AND i.is_primary_key = 0 --排除主键索引
AND i.is_unique_constraint = 0 --排除唯一索引
AND diu.user_updates <> 0 --排除没有数据变化的索引
AND diu.user_lookups = 0
AND diu.user_seeks = 0
AND diu.user_scans = 0
AND i.name IS NOT NULL
ORDER BY ( diu.user_seeks + diu.user_scans + diu.user_lookups ) ASC,diu.user_updates DESC;
GO
SELECT * FROM #index_stat WHERE IndexName IS NOT NULL ORDER BY TableName, IndexID;
SELECT ' CREATE ' +
CASE WHEN I.is_unique = 1 THEN ' UNIQUE ' ELSE '' END +
I.type_desc COLLATE DATABASE_DEFAULT +' INDEX ' +
I.name + ' ON ' +
Schema_name(T.Schema_id)+'.'+T.name + ' ( ' +
KeyColumns + ' ) ' +
ISNULL(' INCLUDE ('+IncludedColumns+' ) ','') +
ISNULL(' WHERE '+I.Filter_definition,'') + ' WITH ( ' +
CASE WHEN I.is_padded = 1 THEN ' PAD_INDEX = ON ' ELSE ' PAD_INDEX = OFF ' END + ',' +
'FILLFACTOR = '+CONVERT(CHAR(5),CASE WHEN I.Fill_factor = 0 THEN 100 ELSE I.Fill_factor END) + ',' +
-- default value
'SORT_IN_TEMPDB = OFF ' + ',' +
CASE WHEN I.ignore_dup_key = 1 THEN ' IGNORE_DUP_KEY = ON ' ELSE ' IGNORE_DUP_KEY = OFF ' END + ',' +
CASE WHEN ST.no_recompute = 0 THEN ' STATISTICS_NORECOMPUTE = OFF ' ELSE ' STATISTICS_NORECOMPUTE = ON ' END + ',' +
-- default value
' DROP_EXISTING = ON ' + ',' +
-- default value
' ONLINE = OFF ' + ',' +
CASE WHEN I.allow_row_locks = 1 THEN ' ALLOW_ROW_LOCKS = ON ' ELSE ' ALLOW_ROW_LOCKS = OFF ' END + ',' +
CASE WHEN I.allow_page_locks = 1 THEN ' ALLOW_PAGE_LOCKS = ON ' ELSE ' ALLOW_PAGE_LOCKS = OFF ' END + ' ) ON [' +
DS.name + ' ] ' [CreateIndexScript]
FROM sys.indexes I
JOIN sys.tables T ON T.Object_id = I.Object_id
JOIN sys.sysindexes SI ON I.Object_id = SI.id AND I.index_id = SI.indid
JOIN (SELECT * FROM (
SELECT IC2.object_id , IC2.index_id ,
STUFF((SELECT ' , ' + C.name + CASE WHEN MAX(CONVERT(INT,IC1.is_descending_key)) = 1 THEN ' DESC ' ELSE ' ASC ' END
FROM sys.index_columns IC1
JOIN Sys.columns C
ON C.object_id = IC1.object_id
AND C.column_id = IC1.column_id
AND IC1.is_included_column = 0
WHERE IC1.object_id = IC2.object_id
AND IC1.index_id = IC2.index_id
GROUP BY IC1.object_id,C.name,index_id
ORDER BY MAX(IC1.key_ordinal)
FOR XML PATH('')), 1, 2, '') KeyColumns
FROM sys.index_columns IC2
GROUP BY IC2.object_id ,IC2.index_id) tmp3 )tmp4
ON I.object_id = tmp4.object_id AND I.Index_id = tmp4.index_id
JOIN sys.stats ST ON ST.object_id = I.object_id AND ST.stats_id = I.index_id
JOIN sys.data_spaces DS ON I.data_space_id=DS.data_space_id
JOIN sys.filegroups FG ON I.data_space_id=FG.data_space_id
LEFT JOIN (SELECT * FROM (
SELECT IC2.object_id , IC2.index_id ,
STUFF((SELECT ' , ' + C.name
FROM sys.index_columns IC1
JOIN Sys.columns C
ON C.object_id = IC1.object_id
AND C.column_id = IC1.column_id
AND IC1.is_included_column = 1
WHERE IC1.object_id = IC2.object_id
AND IC1.index_id = IC2.index_id
GROUP BY IC1.object_id,C.name,index_id
FOR XML PATH('')), 1, 2, '') IncludedColumns
FROM sys.index_columns IC2
GROUP BY IC2.object_id ,IC2.index_id) tmp1
WHERE IncludedColumns IS NOT NULL ) tmp2
ON tmp2.object_id = I.object_id AND tmp2.index_id = I.index_id
WHERE I.is_primary_key = 0 AND I.is_unique_constraint = 0
AND EXISTS( SELECT 1 FROM #index_stat dx WHERE dx.IndexID = i.index_id AND dx.object_id = i.object_id)
最后在删除索引过后,需要监控一段时间,通过监控工具对比、监控索引删除后的性能情况。有时候可能也没有显著的性能提高,主要监控是否出现由于误删索引,导致数据库性能出现异常的情况。
参考资料: