HiveSQL——row_number()over()使⽤语法格式:row_number() over(partition by 分组列 order by 排序列 desc)
row_number() over()分组排序功能:
在使⽤ row_number() over()函数时候,over()⾥头的分组以及排序的执⾏晚于 where 、group by、  order by 的执⾏。例⼀:
表数据:
create table TEST_ROW_NUMBER_OVER(
id varchar(10) not null,
name varchar(10) null,
age varchar(10) null,
salary int null
rows函数的使用方法及实例);
select * from TEST_ROW_NUMBER_OVER t;
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(1,'a',10,8000);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(1,'a2',11,6500);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(2,'b',12,13000);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(2,'b2',13,4500);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(3,'c',14,3000);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(3,'c2',15,20000);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(4,'d',16,30000);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(5,'d2',17,1800);
⼀次排序:对查询结果进⾏排序(⽆分组)
select id,name,age,salary,row_number()over(order by salary desc) rn
from TEST_ROW_NUMBER_OVER t
结果:
进⼀步排序:根据id分组排序
select id,name,age,salary,row_number()over(partition by id order by salary desc) rank
from TEST_ROW_NUMBER_OVER t
结果:
再⼀次排序:出每⼀组中序号为⼀的数据
select * from(select id,name,age,salary,row_number()over(partition by id order by salary desc) rank
from TEST_ROW_NUMBER_OVER t)
where rank <2
结果:
排序出年龄在13岁到16岁数据,按salary排序
select id,name,age,salary,row_number()over(order by salary desc) rank
from TEST_ROW_NUMBER_OVER t where age between '13' and '16'
结果:结果中 rank 的序号,其实就表明了 over(order by salary desc) 是在where age between and 后执⾏的
例⼆:
1.使⽤row_number()函数进⾏编号,如
select email,customerID, ROW_NUMBER() over(order by psd) as rows from QT_Customer
原理:先按psd进⾏排序,排序完后,给每条数据进⾏编号。
2.在订单中按价格的升序进⾏排序,并给每条记录进⾏排序代码如下:
select DID,customerID,totalPrice,ROW_NUMBER() over(order by totalPrice) as rows from OP_Order
3.统计出每⼀个各户的所有订单并按每⼀个客户下的订单的⾦额升序排序,同时给每⼀个客户的订单进⾏编号。这样就知道每个客户下⼏单了:
select ROW_NUMBER() over(partition by customerID order by totalPrice)
as rows,customerID,totalPrice, DID from OP_Order
4.统计每⼀个客户最近下的订单是第⼏次下的订单:
with tabs as
(
select ROW_NUMBER() over(partition by customerID order by totalPrice)
as rows,customerID,totalPrice, DID from OP_Order
)
select MAX(rows) as '下单次数',customerID from tabs
group by customerID
5.统计每⼀个客户所有的订单中购买的⾦额最⼩,⽽且并统计改订单中,客户是第⼏次购买的:
思路:利⽤临时表来执⾏这⼀操作。
1.先按客户进⾏分组,然后按客户的下单的时间进⾏排序,并进⾏编号。
2.然后利⽤⼦查询查出每⼀个客户购买时的最⼩价格。
3.根据查出每⼀个客户的最⼩价格来查相应的记录。
with tabs as
(
select ROW_NUMBER() over(partition by customerID order by insDT)
as rows,customerID,totalPrice, DID from OP_Order
)
select * from tabs
where totalPrice in
(
select MIN(totalPrice)from tabs group by customerID
)
6.筛选出客户第⼀次下的订单。
思路。利⽤rows=1来查询客户第⼀次下的订单记录。
with tabs as
(
select ROW_NUMBER() over(partition by customerID order by insDT) as rows,* from OP_Order
)
select * from tabs where rows = 1
select * from OP_Order
7.注意:在使⽤over等开窗函数时,over⾥头的分组及排序的执⾏晚于“where,group by,order by”的执⾏。select
ROW_NUMBER() over(partition by customerID order by insDT) as rows,
customerID,totalPrice, DID
from OP_Order where insDT>'2011-07-22'