Finding duplicate values in a single column

To find rows with a duplicated email, group by that column and keep only the groups that contain more than one row:

mysql> select email, count(email) from people group by email having count(email) > 1;
|---------------|------------|
|email          |count(email)|
|---------------|------------|
|lisi@test.com  |           2|
|wangwu@test.com|           3|
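
The results in this section can be reproduced with sample data along the following lines. This is only a sketch: the column types are assumptions, and the row with id 1 is a hypothetical non-duplicate row added for illustration. Rows 2 through 6 are taken from the query results shown in this section.

```sql
-- Hypothetical schema; the column types are assumptions.
create table people (
    id    int primary key,
    name  varchar(50),
    email varchar(100)
);

insert into people (id, name, email) values
    (1, '张三', 'zhangsan@test.com'),  -- assumed row with a unique email
    (2, '李四', 'lisi@test.com'),
    (3, '王五', 'wangwu@test.com'),
    (4, '李斯', 'lisi@test.com'),      -- same email as row 2, different name
    (5, '王五', 'wangwu@test.com'),
    (6, '王五', 'wangwu@test.com');
```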

The result shows two email addresses that occur more than once. To see the complete duplicate rows, use either a subquery or a join:

mysql> select *
       from people
       where email in (
           select email
           from people
           group by email
           having count(email) > 1
       )
       order by email;
|--|------|---------------|
|id|name  |email          |
|--|------|---------------|
| 2|李四  |lisi@test.com  |
| 4|李斯  |lisi@test.com  |
| 3|王五  |wangwu@test.com|
| 5|王五  |wangwu@test.com|
| 6|王五  |wangwu@test.com|
 
mysql> select p.*
       from people p
       join (
           select email
           from people
           group by email
           having count(email) > 1
       ) d on p.email = d.email
       order by p.email;
|--|------|---------------|
|id|name  |email          |
|--|------|---------------|
| 2|李四  |lisi@test.com  |
| 4|李斯  |lisi@test.com  |
| 3|王五  |wangwu@test.com|
| 5|王五  |wangwu@test.com|
| 6|王五  |wangwu@test.com|

Another way to find duplicate rows is to use a self-join together with the distinct operator, for example:

mysql> select distinct p.*
       from people p
       join people d on p.email = d.email
       where p.id <> d.id
       order by p.email;
|--|------|---------------|
|id|name  |email          |
|--|------|---------------|
| 4|李斯  |lisi@test.com  |
| 2|李四  |lisi@test.com  |
| 6|王五  |wangwu@test.com|
| 5|王五  |wangwu@test.com|
| 3|王五  |wangwu@test.com|

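Note that distinct cannot be omitted here. Without it, any row that has more than one join partner is returned once per partner, so rows 3, 5, and 6 would each appear twice.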
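
One way to see why distinct matters is to count how many join partners each row has in the self-join. The query below is a small sketch against the same people table; the alias matches is a name chosen for illustration.

```sql
-- For each row, count the other rows that share its email.
-- Without distinct, the self-join returns a row once per partner.
select p.id, count(*) as matches
from people p
join people d on p.email = d.email
where p.id <> d.id
group by p.id
order by p.id;
```

With the rows shown in this section, ids 2 and 4 each have one partner, while ids 3, 5, and 6 each have two, which is why only the latter are returned more than once.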

Finding duplicate values across multiple columns

To find rows in which both name and email are duplicated, the approach is similar:

mysql> select *
       from people
       where (name, email) in (
           select name, email
           from people
           group by name, email
           having count(1) > 1
       )
       order by email;
|--|------|---------------|
|id|name  |email          |
|--|------|---------------|
| 3|王五  |wangwu@test.com|
| 5|王五  |wangwu@test.com|
| 6|王五  |wangwu@test.com|
 
mysql> select distinct p.*
       from people p
       join people d on p.name = d.name and p.email = d.email
       where p.id <> d.id
       order by p.email;
|--|------|---------------|
|id|name  |email          |
|--|------|---------------|
| 6|王五  |wangwu@test.com|
| 5|王五  |wangwu@test.com|
| 3|王五  |wangwu@test.com|

Rows count as duplicates only when both name and email match, so rows 2 and 4 are not duplicates: they share an email but have different names.
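
On servers that support window functions (MySQL 8.0 and later), the same rows can likely be found without a subquery on the grouping columns or a self-join. The following is a sketch rather than the approach used above; the alias cnt and the derived-table name t are chosen for illustration.

```sql
-- Requires window function support (MySQL 8.0 or later).
-- Attach to every row the size of its (name, email) group,
-- then keep only the rows whose group has more than one member.
select id, name, email
from (
    select p.*, count(*) over (partition by name, email) as cnt
    from people p
) t
where cnt > 1
order by email, id;
```

Partitioning by email alone would give the single-column variant from the previous section.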

Finding duplicate values across two tables in SQL

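Method 1: Using UNION ALL and COUNT

Stacking the rows of both tables with UNION ALL and counting them with COUNT gives a quick way to find repeated values and how often they occur. Suppose we have two tables, tableA and tableB, and want to find the values that are repeated across them. The following query can be used:

SELECT value, COUNT(value) AS count
FROM (
    SELECT value FROM tableA
    UNION ALL
    SELECT value FROM tableB
) AS temp
GROUP BY value
HAVING count > 1;

This query returns every value that occurs more than once in the combined rows of tableA and tableB, together with the number of occurrences. Because the rows are counted together, a value that is repeated within only one of the tables is also returned.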
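
The UNION ALL query counts the rows of both tables together, so a value repeated within tableA alone also passes the HAVING filter. If only values present in both tables are wanted, one possible variant tags each row with its source table and requires two distinct sources. This is a sketch; the column src and the alias total_rows are hypothetical names introduced for illustration.

```sql
-- Tag each row with the table it came from,
-- then keep only the values that were seen in both tables.
SELECT value, COUNT(*) AS total_rows
FROM (
    SELECT value, 'A' AS src FROM tableA
    UNION ALL
    SELECT value, 'B' AS src FROM tableB
) AS temp
GROUP BY value
HAVING COUNT(DISTINCT src) = 2;
```

Here total_rows is still the number of occurrences across both tables, while COUNT(DISTINCT src) decides whether a value qualifies.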

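Method 2: Using INNER JOIN

An INNER JOIN connects two tables and keeps only the rows that match, which makes it a natural way to find values that appear in both. For example:

SELECT tableA.value, COUNT(*) AS count
FROM tableA
INNER JOIN tableB ON tableA.value = tableB.value
GROUP BY tableA.value;

This query returns each value that appears in both tableA and tableB. The value column is qualified with its table name because both tables contain a column called value, and an unqualified reference would be ambiguous. Here count is the number of matching row pairs, so a value that occurs m times in tableA and n times in tableB is reported with a count of m × n. Adding HAVING count > 1 would drop the values that occur exactly once in each table.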

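Method 3: Using EXISTS

The EXISTS keyword tests whether a subquery returns any rows. For example:

SELECT value, COUNT(*) AS count
FROM tableA
WHERE EXISTS (
    SELECT 1
    FROM tableB
    WHERE tableB.value = tableA.value
)
GROUP BY value;

This query returns each value in tableA that also exists in tableB. Here count is the number of times the value occurs in tableA; occurrences in tableB are not counted. Adding HAVING count > 1 would restrict the result to values that occur more than once in tableA.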
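
To try the three methods side by side, a small pair of tables such as the following can be used. This is a hypothetical setup: the INT column type and the sample values are assumptions made for illustration.

```sql
-- Hypothetical sample data; the column type and values are assumptions.
CREATE TABLE tableA (value INT);
CREATE TABLE tableB (value INT);

INSERT INTO tableA (value) VALUES (1), (2), (2), (3);
INSERT INTO tableB (value) VALUES (2), (3), (4);

-- Values present in both tables: 2 and 3.
-- Value 2 occurs twice in tableA and once in tableB;
-- value 3 occurs exactly once in each table.
```

Value 3 is the useful test case, because it is shared by both tables but never repeated within either one, so it shows whether a given query counts occurrences or only checks for presence.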