MySQL: delete duplicate records but keep the latest

Suppose your table test contains the following data:

    select id, email
      from test;

    ID  EMAIL
    --  -----
    1   aaa
    2   bbb
    3   ccc
    4   bbb
    5   ddd
    6   eee
    7   aaa
    8   aaa
    9   eee

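So we need to find all duplicated emails and delete every copy except the one with the latest (highest) ID. In this case aaa, bbb and eee are repeated, so we want to delete IDs 1, 7, 2 and 6.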

To do that, first we need to find all the duplicated emails:

    select email
      from test
     group by email
    having count(*) > 1;

    EMAIL
    -----
    aaa
    bbb
    eee
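If no MySQL server is at hand, this step can be sanity-checked with Python's built-in sqlite3 module. This is only a sketch: the helper name find_duplicate_emails and the ROWS list are made up for illustration, and SQLite's dialect is not identical to MySQL's, although this particular query is plain SQL.

```python
import sqlite3

# Sample rows copied from the table above: (id, email).
ROWS = [(1, "aaa"), (2, "bbb"), (3, "ccc"), (4, "bbb"), (5, "ddd"),
        (6, "eee"), (7, "aaa"), (8, "aaa"), (9, "eee")]


def find_duplicate_emails(rows):
    """Return the emails that occur more than once, sorted alphabetically."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("create table test (id integer primary key, email text)")
    conn.executemany("insert into test (id, email) values (?, ?)", rows)
    cur = conn.execute(
        "select email from test group by email having count(*) > 1 order by email"
    )
    result = [email for (email,) in cur]
    conn.close()
    return result


print(find_duplicate_emails(ROWS))  # prints ['aaa', 'bbb', 'eee']
```

The order by is only there to make the output deterministic; the original query does not need it.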

Then, from this dataset, we need to find the latest ID for each of these duplicated emails:

    select max(id) as lastId, email
      from test
     where email in (
            select email
              from test
             group by email
            having count(*) > 1
           )
     group by email;

    LASTID  EMAIL
    ------  -----
    8       aaa
    4       bbb
    9       eee
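The same query can be tried against an in-memory SQLite database from Python. Again a sketch under assumptions: latest_id_per_duplicate and ROWS are illustrative names, and SQLite stands in for MySQL only because this query uses nothing MySQL-specific.

```python
import sqlite3

# Sample rows copied from the table above: (id, email).
ROWS = [(1, "aaa"), (2, "bbb"), (3, "ccc"), (4, "bbb"), (5, "ddd"),
        (6, "eee"), (7, "aaa"), (8, "aaa"), (9, "eee")]


def latest_id_per_duplicate(rows):
    """Return (lastId, email) for every email that appears more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table test (id integer primary key, email text)")
    conn.executemany("insert into test (id, email) values (?, ?)", rows)
    cur = conn.execute("""
        select max(id) as lastId, email
          from test
         where email in (
                select email
                  from test
                 group by email
                having count(*) > 1
               )
         group by email
         order by email
    """)
    result = cur.fetchall()
    conn.close()
    return result


print(latest_id_per_duplicate(ROWS))  # prints [(8, 'aaa'), (4, 'bbb'), (9, 'eee')]
```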

Finally, we can delete every row for those emails whose ID is smaller than LASTID. So the solution is:

    delete test
      from test
     inner join (
            select max(id) as lastId, email
              from test
             where email in (
                    select email
                      from test
                     group by email
                    having count(*) > 1
                   )
             group by email
           ) duplic on duplic.email = test.email
     where test.id < duplic.lastId;

I don't have MySQL installed on this machine right now, but it should work.

Update

The delete above works, but I found a more optimized version:

    delete test
      from test
     inner join (
            select max(id) as lastId, email
              from test
             group by email
            having count(*) > 1
           ) duplic on duplic.email = test.email
     where test.id < duplic.lastId;

You can see that it deletes the oldest duplicates, i.e. 1, 7, 2 and 6:

    select * from test;
    +----+-------+
    | id | email |
    +----+-------+
    |  3 | ccc   |
    |  4 | bbb   |
    |  5 | ddd   |
    |  8 | aaa   |
    |  9 | eee   |
    +----+-------+
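To reproduce that result without MySQL, the join logic can be run in SQLite from Python. One caveat, stated as an adaptation rather than the author's statement: SQLite has no multi-table delete ... from ... inner join syntax, so in this sketch the same join is moved into an id in (...) subquery. The function name delete_older_duplicates and the ROWS list are hypothetical.

```python
import sqlite3

# Sample rows copied from the table above: (id, email).
ROWS = [(1, "aaa"), (2, "bbb"), (3, "ccc"), (4, "bbb"), (5, "ddd"),
        (6, "eee"), (7, "aaa"), (8, "aaa"), (9, "eee")]


def delete_older_duplicates(rows):
    """Delete rows whose id is below the highest id for their email.

    Returns the remaining (id, email) rows ordered by id.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("create table test (id integer primary key, email text)")
    conn.executemany("insert into test (id, email) values (?, ?)", rows)
    # Adaptation for SQLite: the join from the MySQL statement selects the
    # ids to remove, and the outer delete removes exactly those ids.
    conn.execute("""
        delete from test
         where id in (
                select test.id
                  from test
                 inner join (
                        select max(id) as lastId, email
                          from test
                         group by email
                        having count(*) > 1
                       ) duplic on duplic.email = test.email
                 where test.id < duplic.lastId
               )
    """)
    remaining = conn.execute("select id, email from test order by id").fetchall()
    conn.close()
    return remaining


print(delete_older_duplicates(ROWS))
# prints [(3, 'ccc'), (4, 'bbb'), (5, 'ddd'), (8, 'aaa'), (9, 'eee')]
```

Rows 1, 7, 2 and 6 are gone and the unique emails ccc and ddd are untouched, which matches the table above.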

Another version is the delete provided by Rene Limon:

    delete from test
     where id not in (
            select max(id)
              from test
             group by email
           )
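This variant needs no having count(*) > 1, because an email that occurs once has its own id as max(id) and is therefore kept. One thing to watch for: as far as I know, MySQL usually rejects a delete whose subquery reads directly from the table being deleted from (error 1093, "You can't specify target table ... for update in FROM clause"). A commonly used workaround is to wrap the subquery in a derived table. The sketch below uses that wrapped form and runs it in SQLite from Python; the names keep_latest_per_email, ROWS and the alias latest are made up for illustration, and it has not been run against MySQL here.

```python
import sqlite3

# Sample rows copied from the table above: (id, email).
ROWS = [(1, "aaa"), (2, "bbb"), (3, "ccc"), (4, "bbb"), (5, "ddd"),
        (6, "eee"), (7, "aaa"), (8, "aaa"), (9, "eee")]


def keep_latest_per_email(rows):
    """Keep only the row with the highest id for each email."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table test (id integer primary key, email text)")
    conn.executemany("insert into test (id, email) values (?, ?)", rows)
    # The inner select is wrapped in a derived table (alias "latest");
    # this wrapping is the assumed workaround for MySQL error 1093.
    conn.execute("""
        delete from test
         where id not in (
                select lastId
                  from (
                        select max(id) as lastId
                          from test
                         group by email
                       ) as latest
               )
    """)
    remaining = conn.execute("select id, email from test order by id").fetchall()
    conn.close()
    return remaining


print(keep_latest_per_email(ROWS))
# prints [(3, 'ccc'), (4, 'bbb'), (5, 'ddd'), (8, 'aaa'), (9, 'eee')]
```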
