欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  IT编程

为什么建议使用count(*)来统计数据行数

程序员文章站 2022-03-11 09:29:04
对比MySQL5.7版本、Oracle 11g版本、TiDB3.0版本其中MySQL是基于 InnoDB 引擎的。MySQL中各种count的行为参考自:丁奇老师的MySQL 45讲这里,首先你要弄清楚 count() 的语义。count() 是一个聚合函数,对于返回的结果集,一行行地判断,如果 count 函数的参数不是 NULL,累计值就加 1,否则不加。最后返回累计值。所以,count(*)、count(主键 id) 和 count(1) 都表示返回满足条件的结果集的总行数;而 ....

对比MySQL5.7版本、Oracle 11g版本、TiDB3.0版本

其中MySQL是基于 InnoDB 引擎的。

 

MySQL中各种count的行为

参考自:丁奇老师的MySQL 45讲

这里,首先你要弄清楚 count() 的语义。count() 是一个聚合函数,对于返回的结果集,一行行地判断,如果 count 函数的参数不是 NULL,累计值就加 1,否则不加。最后返回累计值。所以,count(*)、count(主键 id) 和 count(1) 都表示返回满足条件的结果集的总行数;而 count(字段),则表示返回满足条件的数据行里面,参数“字段”不为 NULL 的总个数。至于分析性能差别的时候,你可以记住这么几个原则:

  1. server 层要什么就给什么;
  2. InnoDB 只给必要的值;
  3. 现在的优化器只优化了 count(*) 的语义为“取行数”,其他“显而易见”的优化并没有做。

这是什么意思呢?接下来,我们就一个个地来看看。

 

对于 count(主键 id) 来说,InnoDB 引擎会遍历整张表,把每一行的 id 值都取出来,返回给 server 层。server 层拿到 id 后,判断是不可能为空的,就按行累加。

 

对于 count(1) 来说,InnoDB 引擎遍历整张表,但不取值。server 层对于返回的每一行,放一个数字“1”进去,判断是不可能为空的,按行累加。

单看这两个用法的差别的话,你能对比出来,count(1) 执行得要比 count(主键 id) 快。因为从引擎返回 id 会涉及到解析数据行,以及拷贝字段值的操作。

 

对于 count(字段) 来说:

  1. 如果这个“字段”是定义为 not null 的话,一行行地从记录里面读出这个字段,判断不能为 null,按行累加;
  2. 如果这个“字段”定义允许为 null,那么执行的时候,判断到有可能是 null,还要把值取出来再判断一下,不是 null 才累加。也就是前面的第一条原则,server 层要什么字段,InnoDB 就返回什么字段。

count(字段)和其它count()有明显的区别就是不会计数该字段中的空值。

 

对于count(*) 来说:

并不会把全部字段取出来,而是专门做了优化,不取值。count(*) 肯定不是 null,按行累加。看到这里,你一定会说,优化器就不能自己判断一下吗,主键 id 肯定非空啊,为什么不能按照 count(*) 来处理,多么简单的优化啊。当然,MySQL 专门针对这个语句进行优化,也不是不可以。但是这种需要专门优化的情况太多了,而且 MySQL 已经优化过 count(*) 了,你直接使用这种用法就可以了。

 

所以结论是:按照效率排序的话,count(字段)<count(主键id)<count(1) ≈count(*),所以我建议你,尽量使用 count(*)。

 

MySQL中测试下效率

测试下前面丁奇老师的按照效率高低的结论

创建一张1000万数据的测试表,表结构如下:

CREATE TABLE `t` (
  `a` int(11) NOT NULL,
  `name` varchar(32) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

表上没有主键的情况下

执行计划都一样的

explain select count(a) from t;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
|  1 | SIMPLE      | t     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9750627 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+

explain select count(1) from t;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
|  1 | SIMPLE      | t     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9750627 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
 explain select count(*) from t;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
|  1 | SIMPLE      | t     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9750627 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+

看看执行时间的差异:多执行几次取稳定的时间

+-------------+
| count(name) |
+-------------+
|    10000000 |
+-------------+
1 row in set (3.92 sec)

+----------+
| count(a) |
+----------+
| 10000000 |
+----------+
1 row in set (3.84 sec)

+----------+
| count(1) |
+----------+
| 10000000 |
+----------+
1 row in set (3.63 sec)

+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (3.63 sec)

结论:在表上没有主键和二级索引的时候 count(字段)<count(主键id)<count(1) ≈count(*),符合前面说的理论规律

给表t添加主键并收集下统计信息再测试下:

alter table t add primary key (a);
analyze table t;

先看下执行计划:除了count(字段),其它三种执行计划都一样

explain select count(name) from t;
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows     | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-------+
|  1 | SIMPLE      | t     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 10042216 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-------+

 explain select count(a) from t;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows     | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+----------+----------+-------------+
|  1 | SIMPLE      | t     | NULL       | index | NULL          | PRIMARY | 4       | NULL | 10042216 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+----------+----------+-------------+

explain select count(1) from t;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows     | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+----------+----------+-------------+
|  1 | SIMPLE      | t     | NULL       | index | NULL          | PRIMARY | 4       | NULL | 10042216 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+----------+----------+-------------+

explain select count(*) from t;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows     | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+----------+----------+-------------+
|  1 | SIMPLE      | t     | NULL       | index | NULL          | PRIMARY | 4       | NULL | 10042216 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+----------+----------+-------------+

再看下执行时间:多执行几次,取稳定的时间

+-------------+
| count(name) |
+-------------+
|    10000000 |
+-------------+
1 row in set (2.19 sec)

+----------+
| count(a) |
+----------+
| 10000000 |
+----------+
1 row in set (2.03 sec)

+----------+
| count(1) |
+----------+
| 10000000 |
+----------+
1 row in set (1.85 sec)

+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (1.82 sec)

结论:在表上只有一个主键的时候 count(字段)<count(主键id)<count(1) ≈count(*)

给表添加测试列以及二级索引

 

alter table t add c1 int,add c2 int,add c3 varchar(4);
alter table t add index idx_c1(c1),add index idx_name(name);
analyze table t;
show create table t;
 CREATE TABLE `t` (
  `a` int(11) NOT NULL,
  `name` varchar(32) DEFAULT NULL,
  `c1` int(11) DEFAULT NULL,
  `c2` int(11) DEFAULT NULL,
  `c3` varchar(4) DEFAULT NULL,
  PRIMARY KEY (`a`),
  KEY `idx_c1` (`c1`),
  KEY `idx_name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 |

show index from t;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t     |          0 | PRIMARY  |            1 | a           | A         |    10042216 |     NULL | NULL   |      | BTREE      |         |               |
| t     |          1 | idx_c1   |            1 | c1          | A         |           1 |     NULL | NULL   | YES  | BTREE      |         |               |
| t     |          1 | idx_name |            1 | name        | A         |     9987441 |     NULL | NULL   | YES  | BTREE      |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

看下各语句的执行计划:

 explain select count(name) from t;
+----+-------------+-------+------------+-------+---------------+----------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key      | key_len | ref  | rows     | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+----------+----------+-------------+
|  1 | SIMPLE      | t     | NULL       | index | NULL          | idx_name | 131     | NULL | 10042216 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+----------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

explain select count(a) from t;
+----+-------------+-------+------------+-------+---------------+--------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key    | key_len | ref  | rows     | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+----------+----------+-------------+
|  1 | SIMPLE      | t     | NULL       | index | NULL          | idx_c1 | 5       | NULL | 10042216 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+----------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

explain select count(1) from t;
+----+-------------+-------+------------+-------+---------------+--------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key    | key_len | ref  | rows     | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+----------+----------+-------------+
|  1 | SIMPLE      | t     | NULL       | index | NULL          | idx_c1 | 5       | NULL | 10042216 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+----------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

explain select count(*) from t;
+----+-------------+-------+------------+-------+---------------+--------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key    | key_len | ref  | rows     | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+----------+----------+-------------+
|  1 | SIMPLE      | t     | NULL       | index | NULL          | idx_c1 | 5       | NULL | 10042216 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+----------+----------+-------------+

从执行计划中可以看出,select count(name) from t 因为逻辑语义的不同,所以会走自己字段上的索引。其它三种被优化为走idx_c1索引。不走name上的索引是因为name列上索引比较大。

 

测试各自的执行时间:

+-------------+
| count(name) |
+-------------+
|    10000000 |
+-------------+
1 row in set (2.12 sec)

select count(a) from t;
+----------+
| count(a) |
+----------+
| 10000000 |
+----------+
1 row in set (1.87 sec)

select count(1) from t;
+----------+
| count(1) |
+----------+
| 10000000 |
+----------+
1 row in set (1.70 sec)

select count(*) from t;
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (1.72 sec)

补充测试:因为count(字段) name列上有索引,count(*) 被优化为走idx_c1索引,不在同一个对比维度,可以再测试下count(c1),也是同样的执行计划
+-----------+
| count(c1) |
+-----------+
|         0 |
+-----------+
1 row in set (1.92 sec)

总结:还是符合规律count(字段)<count(主键id)<count(1) ≈count(*)

 

测试下Oracle中的情况

同样是1000万级别的数据:

  CREATE TABLE "SCOTT"."T"
   ( "A" NUMBER(*,0) NOT NULL ENABLE,
        "NAME" VARCHAR2(32),
        "C1" NUMBER(*,0),
        "C2" NUMBER(*,0),
        "C3" VARCHAR2(40),
         PRIMARY KEY ("A")
  USING INDEX  ENABLE) ;
  CREATE INDEX "SCOTT"."IDX_C1" ON "SCOTT"."T" ("C1");
  CREATE INDEX "SCOTT"."IDX_NAME" ON "SCOTT"."T" ("NAME");

同样对表收集统计信息:

begin   
dbms_stats.gather_table_stats
(
OWNNAME => 'SCOTT',
TABNAME => 'T',
ESTIMATE_PERCENT =>dbms_stats.AUTO_SAMPLE_SIZE, 
METHOD_OPT=>'FOR ALL INDEXED COLUMNS',
degree=>4, 
cascade=> TRUE);
end;

查看执行计划:

set autotrace traceonly exp;

select count(name) from t;

Execution Plan
----------------------------------------------------------
Plan hash value: 4041003673
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    12 |  8589   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    12 |            |          |
|   2 |   TABLE ACCESS FULL| T    |    10M|   114M|  8589   (1)| 00:00:01 |
---------------------------------------------------------------------------

select count(a) from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 4041003673

------------------------------------------------------------------------------
| Id  | Operation             | Name         | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |              |     1 |  5310   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE       |              |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| SYS_C0014859 |    10M|  5310   (1)| 00:00:01 |
------------------------------------------------------------------------------

select count(1) from t;
Execution Plan
----------------------------------------------------------

Plan hash value: 4041003673

------------------------------------------------------------------------------
| Id  | Operation             | Name         | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |              |     1 |  5310   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE       |              |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| SYS_C0014859 |    10M|  5310   (1)| 00:00:01 |
------------------------------------------------------------------------------

select count(*) from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 4041003673

------------------------------------------------------------------------------
| Id  | Operation             | Name         | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |              |     1 |  5310   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE       |              |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| SYS_C0014859 |    10M|  5310   (1)| 00:00:01 |
------------------------------------------------------------------------------

补充:
select count(c1) from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1226206340

---------------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |     1 |    13 |     0   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE  |        |     1 |    13 |            |          |
|   2 |   INDEX FULL SCAN| IDX_C1 |    10M|   123M|     0   (0)| 00:00:01 |
---------------------------------------------------------------------------

执行计划说明:1、oracle会对所有count()情况做出优化

2、Oracle中的索引是不会存储空值的,mysql中会把null值做个标志位。所以count(*)的时候会优化为扫描主键上索引

3、select count(name) from t; 优化器优化为全表扫描并不奇怪,因为name列上的索引比较大,并且还存储rowid的值。全表扫描可以利用多块读、直接路径读等特性提高效率。但不能走主键快速扫,因为我统计的是name列上的非空值个数。

 

执行时间上的对比:多执行几次,把数据块缓存起来

select count(name) from t;
Elapsed: 00:00:00.49
select count(a) from t;
Elapsed: 00:00:00.34
select count(1) from t;
Elapsed: 00:00:00.29
select count(*) from t;
Elapsed: 00:00:00.28

--补充
select count(c1) from t;
Elapsed: 00:00:00.00

总结:Oracle中虽然对count()行为有不同的优化措施,但是也符合规律count(主键id)<count(1) ≈count(*)

count(字段)的效率这个要取决于该段上面索引的大小,并且和其它三种count逻辑语义上不等价,不能放一起对比。

测试TiDB中的情况

同样是1000万级别的数据:

show create table t;
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                               |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t     | CREATE TABLE `t` (
  `a` int(11) NOT NULL,
  `name` varchar(32) DEFAULT NULL,
  `c1` int(11) DEFAULT NULL,
  `c2` int(11) DEFAULT NULL,
  `c3` varchar(40) DEFAULT NULL,
  PRIMARY KEY (`a`),
  KEY `idx_name` (`name`),
  KEY `idx_c1` (`c1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin 

analyze table t;
show index from t;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t     |          0 | PRIMARY  |            1 | a           | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| t     |          1 | idx_name |            1 | name        | A         |           0 |     NULL | NULL   | YES  | BTREE      |         |               |
| t     |          1 | idx_c1   |            1 | c1          | A         |           0 |     NULL | NULL   | YES  | BTREE      |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

 

执行计划:

explain analyze select count(name) from t;
+------------------------+-------------+------+----------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+
| id                     | count       | task | operator info                                            | execution info                                                                      | memory    |
+------------------------+-------------+------+----------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+
| StreamAgg_16           | 1.00        | root | funcs:count(col_0)                                       | time:575.797216ms, loops:2, rows:1                                                  | N/A       |
| └─IndexReader_17       | 1.00        | root | index:StreamAgg_8                                        | time:575.789969ms, loops:2, rows:12                                                 | 257 Bytes |
|   └─StreamAgg_8        | 1.00        | cop  | funcs:count(mtest.t.name)                                | proc max:575ms, min:47ms, p80:548ms, p95:575ms, rows:12, iters:9771, tasks:12       | N/A       |
|     └─IndexScan_15     | 10000000.00 | cop  | table:t, index:name, range:[NULL,+inf], keep order:false | proc max:498ms, min:42ms, p80:473ms, p95:498ms, rows:10000000, iters:9771, tasks:12 | N/A       |
+------------------------+-------------+------+----------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+
4 rows in set (0.58 sec)

18:50:40[5.7.25-TiDB-v3.0.16]root->127.0.0.1[mtest]> explain analyze select count(a) from t;
+------------------------+-------------+------+----------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+
| id                     | count       | task | operator info                                            | execution info                                                                      | memory    |
+------------------------+-------------+------+----------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+
| StreamAgg_20           | 1.00        | root | funcs:count(col_0)                                       | time:605.49247ms, loops:2, rows:1                                                   | N/A       |
| └─IndexReader_21       | 1.00        | root | index:StreamAgg_8                                        | time:605.486666ms, loops:2, rows:12                                                 | 257 Bytes |
|   └─StreamAgg_8        | 1.00        | cop  | funcs:count(mtest.t.a)                                   | proc max:604ms, min:41ms, p80:495ms, p95:604ms, rows:12, iters:9771, tasks:12       | N/A       |
|     └─IndexScan_18     | 10000000.00 | cop  | table:t, index:name, range:[NULL,+inf], keep order:false | proc max:600ms, min:41ms, p80:492ms, p95:600ms, rows:10000000, iters:9771, tasks:12 | N/A       |
+------------------------+-------------+------+----------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+
4 rows in set (0.60 sec)

18:50:41[5.7.25-TiDB-v3.0.16]root->127.0.0.1[mtest]> explain analyze select count(1) from t;
+------------------------+-------------+------+----------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+
| id                     | count       | task | operator info                                            | execution info                                                                      | memory    |
+------------------------+-------------+------+----------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+
| StreamAgg_20           | 1.00        | root | funcs:count(col_0)                                       | time:676.323439ms, loops:2, rows:1                                                  | N/A       |
| └─IndexReader_21       | 1.00        | root | index:StreamAgg_8                                        | time:676.318048ms, loops:2, rows:12                                                 | 259 Bytes |
|   └─StreamAgg_8        | 1.00        | cop  | funcs:count(1)                                           | proc max:674ms, min:41ms, p80:575ms, p95:674ms, rows:12, iters:9771, tasks:12       | N/A       |
|     └─IndexScan_18     | 10000000.00 | cop  | table:t, index:name, range:[NULL,+inf], keep order:false | proc max:674ms, min:41ms, p80:575ms, p95:674ms, rows:10000000, iters:9771, tasks:12 | N/A       |
+------------------------+-------------+------+----------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+
4 rows in set (0.68 sec)

18:50:42[5.7.25-TiDB-v3.0.16]root->127.0.0.1[mtest]> explain analyze select count(*) from t;
+------------------------+-------------+------+----------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+
| id                     | count       | task | operator info                                            | execution info                                                                      | memory    |
+------------------------+-------------+------+----------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+
| StreamAgg_20           | 1.00        | root | funcs:count(col_0)                                       | time:587.30061ms, loops:2, rows:1                                                   | N/A       |
| └─IndexReader_21       | 1.00        | root | index:StreamAgg_8                                        | time:587.293109ms, loops:2, rows:12                                                 | 257 Bytes |
|   └─StreamAgg_8        | 1.00        | cop  | funcs:count(1)                                           | proc max:587ms, min:55ms, p80:527ms, p95:587ms, rows:12, iters:9771, tasks:12       | N/A       |
|     └─IndexScan_18     | 10000000.00 | cop  | table:t, index:name, range:[NULL,+inf], keep order:false | proc max:586ms, min:55ms, p80:527ms, p95:586ms, rows:10000000, iters:9771, tasks:12 | N/A       |
+------------------------+-------------+------+----------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+
4 rows in set (0.59 sec)

其中count(*)和cunt(1)的执行计划一样,count(字段)稍微不一样

看看执行时间

18:53:25[5.7.25-TiDB-v3.0.16]root->127.0.0.1[mtest]> select count(name) from t;
+-------------+
| count(name) |
+-------------+
|    10000000 |
+-------------+
1 row in set (0.57 sec)

18:53:26[5.7.25-TiDB-v3.0.16]root->127.0.0.1[mtest]> select count(a) from t;
+----------+
| count(a) |
+----------+
| 10000000 |
+----------+
1 row in set (0.52 sec)

18:53:27[5.7.25-TiDB-v3.0.16]root->127.0.0.1[mtest]> select count(1) from t;
+----------+
| count(1) |
+----------+
| 10000000 |
+----------+
1 row in set (0.53 sec)

18:53:28[5.7.25-TiDB-v3.0.16]root->127.0.0.1[mtest]> select count(*) from t;
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (0.56 sec)

多次执行后发现tidb中执行时间并不稳定,因为分布式数据库中 tidb server和tikv server之间还有网络的消耗。

总结:tidb中几种count执行时间上没有多少差别,也没有找到官方关于count(*)的优化说明。tidb中也是建议使用count(*)来统计数据行数。

本文地址:https://blog.csdn.net/u010033674/article/details/110633029