欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

hive数据库无主键进行去重SQL

程序员文章站 2022-06-02 13:09:12
...

去重sql ,修改表名:ap_jyz_jyzqgdwa即可

SELECT
	*
FROM
	(
		SELECT
			*, row_number () over (PARTITION BY id ORDER BY id DESC) AS rk
		FROM
			ap_jyz_jyzqgdwa
	) tmp
WHERE
	tmp.rk < 2

查看已去重之后id分组数量

SELECT
	a.id,
	count(1)
FROM
	(
		SELECT
			*
		FROM
			(
				SELECT
					*, row_number () over (PARTITION BY id ORDER BY id DESC) AS rk
				FROM
					ap_jyz_jyzqgdwa
			) tmp
		WHERE
			tmp.rk < 2
	) a
GROUP BY
	a.id

查看已去重之后的数据条数

SELECT
	COUNT(1)
FROM
	(
		SELECT
			*
		FROM
			(
				SELECT
					*, row_number () over (PARTITION BY id ORDER BY id DESC) AS rk
				FROM
					ap_jyz_jyzqgdwa
			) tmp
		WHERE
			tmp.rk < 2
	) a

 

相关标签: hive