关于关系数据库如何快速查询表的记录数详解

程序员文章站 2022-05-03 16:23:19

前言在数据库中，很多人员习惯使用select count（*）、select count（1) 、select count(col)来查询一个表有多少记录，对于小表，...

前言

在数据库中，很多人员习惯使用select count（*） 、select count（1) 、select count(col)来查询一个表有多少记录，对于小表，这种sql的开销倒不是很大，但是对于大表，这种查询表记录数的做法就是一个非常消耗资源了，而且效率很差。下面介绍一下sql server、 oracle、mysql中如何快速获取表的记录数。

sql server 数据库

在sql server数据库中, 对象目录视图sys.partitions中有一个字段rows会记录表的记录数。我们以adventureworks2014为测试数据库。

select object_name(object_id) as object_name, 
 i.name   as index_name,
 p.rows   as table_rows
from sys.partitions p
left join sys.sysindexes i on p.object_id = i.id and p.index_id = i.indid 
where
 object_id = object_id('tablename')

那么我们还有一些疑问，我们先来看看这些问题吧！

1：没有索引的表是否也可以使用上面脚本？

2：只有非聚集索引的堆表是否可以使用上面脚本？

3：有多个索引的表，是否记录数会存在不一致的情况？

4：统计信息不准确的表，是否rows也会不准确

5: 分区表的情况又是怎么样？

6：对象目录视图sys.partitions与sp_spaceused获取的表记录函数是否准确。

如下所示，我们先构造测试案例：

if exists(select 1 from sys.objects where type='u' and name='test_tab_row')
begin
 drop table test_tab_row;
end
 
if not exists(select 1 from sys.objects where type='u' and name='test_tab_row')
begin
 create table test_tab_row
 (
  id int,
  name char(200)
 )
end
go
 
set nocount on;
begin tran
declare @index int =1;
 
while @index <= 100000
begin
 
 insert into test_tab_row
 values(@index, newid());
 
 set @index+=1;
 
 if (@index % 5000) = 0 
 begin
 if @@trancount > 0 
  begin 
  commit;
  begin tran
  end
 end
end
 
if @@trancount > 0 
begin 
 commit;
end
go

关于问题1、问题2，都可以使用上面脚本，如下测试所示：

select object_name(object_id) as object_name, 
 i.name   as index_name,
 p.rows   as table_rows
from sys.partitions p
left join sys.sysindexes i on p.object_id = i.id and p.index_id = i.indid 
where
 object_id = object_id('dbo.test_tab_row')

在表dbo.test_tab_row 上创建非聚集索引后，查询结果如下所示：

create index ix_test_tab_row on test_tab_row(id);

我们插入500条记录，此时，这个数据量不足以触发统计信息更新，如下所示， rows sampled还是1000000

declare @index int =1;
 
while @index <= 500
begin
 
 insert into test_tab_row
 values(100000 +@index, newid());
 
 set @index+=1;
 
end

如下所示，发现sys.partitions中的记录变成了100500了，可见rows这个值的计算不依赖统计信息。

当然，如果你用sp_spaceused，发现这里面的记录也是100500

sp_spaceused 'dbo.test_tab_row'

关于问题3：有多个索引的表，是否记录数会存在不一致的情况？

答案：个人测试以及统计来看，暂时发现多个索引的情况下，sys.partitions中的rows记录数都是一致的。暂时没有发现不一致的情况，当然也不排除有特殊情况。

关于问题5: 分区表的情况又是怎么样？

答案：分区表和普通表没有任何区别。

关于问题6：对象目录视图sys.partitions与sp_spaceused获取的表记录函数是否准确？

答案：对象目录视图sys.partitions与sp_spaceused获取的表记录数是准确的。

oracle 数据库

在oracle数据库中，可以通过dba_tables、all_tables、user_tables视图查看表的记录数，不过这个值（num_rows）跟统计信息有很大的关系，有时候统计信息没有更新或采样比例会导致这个值不是很准确。

select owner  , 
  table_name, 
  num_rows ,
  last_analyzed
from dba_tables 
where owner = '&owner' 
  and table_name = '&table_name'; 
 
 
select owner,
  table_name,
  num_rows ,
  last_analyzed
from all_tables
where owner ='&owner'
and table_name='&table_name';
 
 
select table_name,
  num_rows ,
  last_analyzed
from user_tables
where table_name='&table_name'

更新统计信息后，就能得到准确的行数。所以如果需要得到正确的数据，最好更新目标表的统计信息，进行100%采样分析。对于分区表，那么就需要从dba_tab_partitions里面查询相关数据了。

sql>execute dbms_stats.gather_table_stats(ownname => 'username', tabname =>'tablename', estimate_percent =>100, cascade=>true);

mysql数据库

在mysql中比较特殊，虽然information_schema.tables也可以查到表的记录数，但是非常不准确。如下所示，即使使用analyze table更新了统计信息，从information_schema.tables中获取的记录依然不准确

select table_rows 
from information_schema.tables
where table_name='table_name'

mysql> select table_rows 
 -> from information_schema.tables
 -> where table_name='jiraissue'
 -> ;
+------------+
| table_rows |
+------------+
|  36487 |
+------------+
1 row in set (0.01 sec)
 
mysql> select count(*) from jiraissue;
+----------+
| count(*) |
+----------+
| 36973 |
+----------+
1 row in set (0.05 sec)
 
mysql> analyze table jiraissue;
+----------------+---------+----------+----------+
| table   | op  | msg_type | msg_text |
+----------------+---------+----------+----------+
| jira.jiraissue | analyze | status | ok  |
+----------------+---------+----------+----------+
1 row in set (1.41 sec)
 
mysql> select table_rows 
 -> from information_schema.tables
 -> where table_name='jiraissue';
+------------+
| table_rows |
+------------+
|  34193 |
+------------+
1 row in set (0.00 sec)
 
mysql>

如上所示，mysql这种查询表记录数的方法看来还是有缺陷的。当然如果不是要求非常精确的值，这个方法也是不错的。

当然，上面介绍的sql server、oracle、mysql数据库中的方法，还是有一些局限性的。例如，只能查询整张表的记录数，对于那些查询记录数带有查询条件（where）这类sql。还是必须使用select count(*）这种方法。

总结

以上就是这篇文章的全部内容了，希望本文的内容对大家的学习或者工作具有一定的参考学习价值，谢谢大家对的支持。

相关标签：关系数据库查询记录数

上一篇：如何查询日期类型的数据？

下一篇： ThinkPHP进程计数类Process用法实例详解