
Using row_number() in Spark SQL

程序员文章站 2022-03-24 11:08:53

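I originally wrote this with the window-function API, but decided that plain SQL would be more convenient. I then found that in Spark SQL an alias introduced with `as` in the select list cannot be referenced in the `where` clause of the same query. The query has to be wrapped in an outer `select`, and the filter on the alias goes in that outer query.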

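// rn ranks ascending by count(cmsId), as in the original query; add "desc" to rank the largest counts first.
val topDF = spark.sql(
  """select * from (
    |  select day, city, cmsId, count(cmsId) as ts,
    |         row_number() over (partition by city order by count(cmsId)) as rn
    |  from data_log where day = '20170511' and cmsType = 'video'
    |  group by city, day, cmsId order by city, rn
    |) T where T.rn <= 3""".stripMargin)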
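
For comparison, here is a rough sketch of how the same query might look with the window-function API mentioned at the start. It is not the original code, which the post does not show. The `TopPerCity` object, the `topPerCity` helper, and the sample rows are all made up for illustration; only the column names and filter values come from the SQL above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, count, max, row_number}

object TopPerCity {
  // Hypothetical helper: counts rows per (city, day, cmsId), ranks the cmsId
  // values within each city by that count (ascending, like the SQL above),
  // and keeps the first n ranks per city.
  def topPerCity(dataLog: DataFrame, n: Int): DataFrame = {
    val counted = dataLog
      .filter(col("day") === "20170511" && col("cmsType") === "video")
      .groupBy("city", "day", "cmsId")
      .agg(count(col("cmsId")).as("ts"))

    // Same window as "partition by city order by count(cmsId)"
    val byCity = Window.partitionBy("city").orderBy(col("ts"))

    counted
      .withColumn("rn", row_number().over(byCity))
      // rn is an ordinary column at this point, so no extra nesting is needed
      .filter(col("rn") <= n)
      .orderBy("city", "rn")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("row-number-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for the data_log table
    val dataLog = Seq(
      ("20170511", "beijing", "a", "video"),
      ("20170511", "beijing", "b", "video"),
      ("20170511", "beijing", "b", "video"),
      ("20170511", "beijing", "c", "video"),
      ("20170511", "beijing", "c", "video"),
      ("20170511", "beijing", "c", "video"),
      ("20170511", "shanghai", "a", "video"),
      ("20170511", "shanghai", "d", "news")
    ).toDF("day", "city", "cmsId", "cmsType")

    topPerCity(dataLog, 3).show()
    spark.stop()
  }
}
```

With the API, `withColumn` adds `rn` before `filter` runs, so the filter can reference it without a subquery. In the SQL version, the outer `select` plays the same role.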