Deleting fully duplicated records and records duplicated on key fields from a database
程序员文章站
2022-11-21 12:07:45
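1. The first kind of duplication, rows that are identical in every column, is easy to fix, and the approach is similar across database systems:
MySQL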
create table tmp select distinct * from tablename;
drop table tablename;
create table tablename select * from tmp;
drop table tmp;
SQL Server
select distinct * into #tmp from tablename;
drop table tablename;
select * into tablename from #tmp;
drop table #tmp;
Oracle
create table tmp as select distinct * from tablename;
drop table tablename;
create table tablename as select * from tmp;
drop table tmp;
This kind of duplication is the result of careless table design. Adding a uniquely indexed column to the table solves the problem.
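As a rough sketch of that fix: once the duplicates are gone, a unique index makes the database reject new ones. The article does not say which columns should be unique, so the pair name, address (the pair used in the second case of this article) is an assumption here, and the index name uq_tablename_name_address is hypothetical. The statement fails if duplicates on those columns still exist, so it should run only after the cleanup. It also assumes both columns are of indexable types.

```sql
-- Assumption: (name, address) is the combination that must be unique.
-- The index name is hypothetical; this form is accepted by MySQL,
-- SQL Server and Oracle.
create unique index uq_tablename_name_address on tablename (name, address);
```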
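2. The second kind of duplication, where only some key fields repeat, usually calls for keeping the first record of each group of duplicates. The steps are shown below. Suppose the duplicated fields are name and address, and the goal is a result set in which each name, address pair appears only once.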
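Before deleting anything, it can help to see which name, address pairs are actually duplicated. The query below is a minimal sketch that is not part of the original article. It uses only the table and column names assumed above, plus a hypothetical alias duplicate_count, and should run unchanged on MySQL, SQL Server and Oracle.

```sql
-- List each (name, address) pair that occurs more than once,
-- together with how many rows share it.
select name, address, count(*) as duplicate_count
from tablename
group by name, address
having count(*) > 1;
```

Running the same query again after the cleanup should return no rows.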
MySQL
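-- MySQL requires an auto_increment column to be indexed, hence "unique"
alter table tablename add autoid int not null auto_increment unique;
-- keep the smallest autoid of each (name, address) group
create table tmp select min(autoid) as autoid from tablename group by name,address;
create table tmp2 select tablename.* from tablename,tmp where tablename.autoid = tmp.autoid;
drop table tablename;
rename table tmp2 to tablename;
drop table tmp;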
SQL Server
select identity(int,1,1) as autoid, * into #tmp from tablename;
select min(autoid) as autoid into #tmp2 from #tmp group by name,address;
drop table tablename;
select * into tablename from #tmp where autoid in(select autoid from #tmp2);
drop table #tmp;
drop table #tmp2;
Oracle
delete from tablename t1 where t1.rowid > (select min(t2.rowid) from tablename t2 where t2.name = t1.name and t2.address = t1.address);
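Notes:
1. In the MySQL and SQL Server versions, the final select produces a result set with no duplicate name, address pairs. It carries an extra autoid column, which you can leave out in practice by listing the wanted columns explicitly in the select clause.
2. MySQL and SQL Server do not provide a rowid mechanism, so an autoid column is needed to make each row uniquely identifiable. Oracle's rowid makes this much more convenient, and deleting by rowid is generally the most efficient way to remove duplicate records in Oracle.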
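As an alternative to the autoid approach above, and not the method this article uses, SQL Server 2005 and later can number the rows within each name, address group with the row_number() window function and delete the extras in place. The sketch below assumes the same tablename, name and address as above. The CTE name ranked and the column rn are hypothetical.

```sql
-- SQL Server only: number the rows inside each (name, address) group,
-- then delete every row except the one numbered 1.
-- "order by (select null)" imposes no real ordering, so which row of a
-- group survives is arbitrary; replace it with a real column to control that.
with ranked as (
    select row_number() over (partition by name, address order by (select null)) as rn
    from tablename
)
delete from ranked
where rn > 1;
```

Because the delete happens in place, the table keeps its indexes and constraints, which the drop-and-recreate variants above do not.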