欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  数据库

mysql中删除重复记录sql语句

程序员文章站 2022-05-24 10:27:05
...

在sql中我们经常会碰到有重复的一些数据,下面我来介绍在mysql中删除重复记录的多种方法,有需要的朋友可参考参考。

删除重复记录方法一:

1. 新建一个临时表

代码如下 复制代码

create table tmp as select * from youtable group by name(name为不希望有重复的列)

2. 删除原来的表

代码如下 复制代码

drop table youtable

3. 重命名表

代码如下 复制代码

alter table tmp rename youtable

但是这个方法有个问题,由临时表转变过来的最终表,其表结构会和原来的不一致,需要手工更改。这个问题,待解决。
删除重复记录方法二:

1. 新建一个临时表

代码如下 复制代码

CREATE TABLE tmp AS SELECT * FROM youtable GROUP BY name(name为不希望有重复的列)

2. 清空原来的表

代码如下 复制代码

TRUNCATE TABLE youtable

3. 把临时表插入到youtable

代码如下 复制代码

INSERT INTO tablename SELECT * FROM temp

4. 删除临时表

代码如下 复制代码

DROP TABLE temp

删除重复记录方法三:

代码如下 复制代码

delete table where ID not in(select min(ID) from table group by name(name:重复的字段))

删除重复记录方法四:


具体实现如下:

代码如下 复制代码

Table Create Table
------------ --------------------------------------------------------
users_groups CREATE TABLE `users_groups` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`uid` int(11) NOT NULL,
`gid` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8

users_groups.txt内容:

1,11,502
2,107,502
3,100,503
4,110,501
5,112,501
6,104,502
7,100,502
8,100,501
9,102,501
10,104,502
11,100,502
12,100,501
13,102,501
14,110,501

mysql> load data infile 'c:\users_groups.txt' into table users_groups fields
terminated by ',' lines terminated by 'n';
Query OK, 14 rows affected (0.05 sec)
Records: 14 Deleted: 0 Skipped: 0 Warnings: 0

mysql> select * from users_groups;

query result(14 records)

id uid gid
1 11 502
2 107 502
3 100 503
4 110 501
5 112 501
6 104 502
7 100 502
8 100 501
9 102 501
10 104 502
11 100 502
12 100 501
13 102 501
14 110 501
14 rows in set (0.00 sec)

根据一位兄弟的建议修改。

代码如下 复制代码

mysql> create temporary table tmp_wrap select * from users_groups group by uid having count(1) >= 1;
Query OK, 7 rows affected (0.11 sec)
Records: 7 Duplicates: 0 Warnings: 0

mysql> truncate table users_groups;
Query OK, 14 rows affected (0.03 sec)

mysql> insert into users_groups select * from tmp_wrap;
Query OK, 7 rows affected (0.03 sec)
Records: 7 Duplicates: 0 Warnings: 0

mysql> select * from users_groups;

query result(7 records)
id uid gid
1 11 502
2 107 502
3 100 503
4 110 501
5 112 501
6 104 502
9 102 501


mysql> drop table tmp_wrap;
Query OK, 0 rows affected (0.05 sec)

2、还有一个很精简的办法。

查找重复的,并且除掉最小的那个。

代码如下 复制代码

delete users_groups as a from users_groups as a,
(
select *,min(id) from users_groups group by uid having count(1) > 1
) as b
where a.uid = b.uid and a.id > b.id;
(7 row(s)affected)
(0 ms taken)

query result(7 records)
id uid gid
1 11 502
2 107 502
3 100 503
4 110 501
5 112 501
6 104 502
9 102 501

3、现在来看一下这两个办法的效率。
运行一下以下SQL 语句

代码如下 复制代码

create index f_uid on users_groups(uid);
explain select * from users_groups group by uid having count(1) > 1 union all
select * from users_groups group by uid having count(1) = 1;
explain select * from users_groups as a,
(
select *,min(id) from users_groups group by uid having count(1) > 1
) as b
where a.uid = b.uid and a.id > b.id;
query result(3 records)
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY users_groups index (NULL) f_uid 4 (NULL) 14
2 UNION users_groups index (NULL) f_uid 4 (NULL) 14
(NULL) UNION RESULT ALL (NULL) (NULL) (NULL) (NULL) (NULL)


query result(3 records)
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY ALL (NULL) (NULL) (NULL) (NULL) 4
1 PRIMARY a ref PRIMARY,f_uid f_uid 4 b.uid 1 Using where
2 DERIVED users_groups index (NULL) f_uid 4 (NULL) 14



很明显的第二个比第一个扫描的函数要少。

当没有创建表或创建索引权限的时候

创建一个新表,然后将原表中不重复的数据插入新表:

代码如下 复制代码

mysql> create table demo_new as select * from demo group by site;
Query OK, 3 rows affected (0.19 sec)
Records: 3 Duplicates: 0 Warnings: 0

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| demo |
| demo_new |
+----------------+
2 rows in set (0.00 sec)

mysql> select * from demo order by id;
+----+------------------------+
| id | site |
+----+------------------------+
| 1 | http://www.hzhuti.com |
| 2 | http://111cn.net |
| 3 | http://www.111cn.net |
| 4 | http://www.hzhuti.com |
| 5 | http://www.111cn.net |
+----+------------------------+
5 rows in set (0.00 sec)

mysql> select * from demo_new order by id;
+----+------------------------+
| id | site |
+----+------------------------+
| 1 | http://www.hzhuti.com |
| 2 | http://111cn.net |
| 3 | http://www.111cn.net |
+----+------------------------+
3 rows in set (0.00 sec)

然后将原表备份,将新表重命名为当前表:

代码如下 复制代码

mysql> rename table demo to demo_old, demo_new to demo;
Query OK, 0 rows affected (0.04 sec)

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| demo |
| demo_old |
+----------------+
2 rows in set (0.00 sec)

mysql> select * from demo order by id;
+----+------------------------+
| id | site |
+----+------------------------+
| 1 | http://www.hzhuti.com |
| 2 | http://111cn.net |
| 3 | http://www.111cn.net |
+----+------------------------+
3 rows in set (0.00 sec)

相关标签: 删除重复记录