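Contents
I. Function syntax
II. Window frame ranges: ROWS and RANGE
1. Frame bound usage
2. Differences between ROWS and RANGE
(1) ROWS limits the frame by row count
(2) RANGE limits the frame by value range (ORDER BY a number)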
function_name(arguments) OVER (PARTITION BY clause ORDER BY clause ROWS/RANGE clause)
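A window function call has three parts:
Function name: an aggregate function such as sum, max, min, count, or avg, or a row-comparison function such as lead or lag;
OVER: the keyword that marks the preceding function as an analytic (window) function rather than an ordinary aggregate function;
Analytic clause: the content inside the parentheses after the OVER keyword.
The analytic clause in turn has three parts:
PARTITION BY: the grouping clause. It defines the set of rows the function is computed over; different groups do not affect each other;
ORDER BY: the sorting clause. It defines how rows are ordered inside each group;
ROWS/RANGE: the window (frame) clause. It defines a sub-group, also called the window, inside each PARTITION BY group. When it is present, the function is computed over that window instead of the whole partition. There are two kinds of window, ROWS and RANGE:
ROWS limits the window by a number of rows
RANGE limits the window by a range of values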
Table structure and test data:
DROP TABLE IF EXISTS `test`;
CREATE TABLE `test` (
  `video_id` int(0) NOT NULL COMMENT 'video ID',
  `dt` date NULL DEFAULT NULL,
  `if_follow` tinyint(0) NULL DEFAULT NULL COMMENT 'whether followed'
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_0900_ai_ci ROW_FORMAT = Dynamic;
-- Records of test
INSERT INTO `test` VALUES (2001, '2021-09-24', 1);
INSERT INTO `test` VALUES (2001, '2021-10-03', 1);
INSERT INTO `test` VALUES (2001, '2021-10-02', 1);
INSERT INTO `test` VALUES (2001, '2021-10-01', 1);
INSERT INTO `test` VALUES (2002, '2021-09-25', 1);
INSERT INTO `test` VALUES (2002, '2021-09-25', 1);
INSERT INTO `test` VALUES (2002, '2021-09-26', 1);
INSERT INTO `test` VALUES (2002, '2021-09-27', 1);
INSERT INTO `test` VALUES (2002, '2021-09-28', 1);
INSERT INTO `test` VALUES (2002, '2021-09-29', 1);
INSERT INTO `test` VALUES (2002, '2021-09-30', 1);
INSERT INTO `test` VALUES (2002, '2021-10-01', 1);
INSERT INTO `test` VALUES (2002, '2021-10-02', 1);
INSERT INTO `test` VALUES (2002, '2021-10-03', 1);
Statement:
select video_id,dt, sum(if_follow) over(partition by video_id order by dt rows BETWEEN CURRENT ROW and 1 following ) from test ;
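To make the effect of this ROWS frame concrete, here is a minimal sketch that runs the same statement through Python's built-in sqlite3 module. Using SQLite instead of MySQL is an assumption of this sketch, made only so the example is self-contained, and the table is cut down to a few of the rows above. With every if_follow equal to 1, each row sums itself and the next row in its partition, so the last row of each partition gets 1.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (video_id INTEGER, dt TEXT, if_follow INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (2001, "2021-09-24", 1),
        (2001, "2021-10-03", 1),
        (2001, "2021-10-02", 1),
        (2001, "2021-10-01", 1),
        (2002, "2021-09-25", 1),
        (2002, "2021-09-25", 1),
        (2002, "2021-09-26", 1),
    ],
)

# Frame = the current row plus the one physical row after it, per video_id.
query = """
SELECT video_id, dt,
       SUM(if_follow) OVER (
           PARTITION BY video_id
           ORDER BY dt
           ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING
       ) AS s
FROM test
ORDER BY video_id, dt
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The last row of each video_id has no following row, so its frame contains only itself.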
Table structure and test data:
DROP TABLE IF EXISTS `test`;
CREATE TABLE `test` (
  `video_id` int(0) NOT NULL COMMENT 'video ID',
  `dt` date NULL DEFAULT NULL,
  `if_follow` tinyint(0) NULL DEFAULT NULL COMMENT 'whether followed'
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_0900_ai_ci ROW_FORMAT = Dynamic;
-- Records of test
INSERT INTO `test` VALUES (2001, '2021-09-24', 1);
INSERT INTO `test` VALUES (2001, '2021-10-03', 9);
INSERT INTO `test` VALUES (2001, '2021-10-02', 2);
INSERT INTO `test` VALUES (2001, '2021-10-01', 6);
INSERT INTO `test` VALUES (2002, '2021-09-25', 1);
INSERT INTO `test` VALUES (2002, '2021-09-25', 1);
INSERT INTO `test` VALUES (2002, '2021-09-26', 6);
INSERT INTO `test` VALUES (2002, '2021-09-27', 1);
INSERT INTO `test` VALUES (2002, '2021-09-28', 1);
INSERT INTO `test` VALUES (2002, '2021-09-29', 8);
INSERT INTO `test` VALUES (2002, '2021-09-30', 7);
INSERT INTO `test` VALUES (2002, '2021-10-01', 1);
INSERT INTO `test` VALUES (2002, '2021-10-02', 9);
INSERT INTO `test` VALUES (2002, '2021-10-03', 1);
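The following statement fails. When RANGE is combined with N PRECEDING/FOLLOWING, there must be exactly one ORDER BY expression, of numeric or temporal type, and for a temporal expression such as dt the offset has to be written as an INTERVAL rather than a plain number: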
select video_id,dt, sum(if_follow) over(partition by video_id order by dt range BETWEEN 3 preceding and CURRENT ROW ) from test ;
The error output is:
select video_id,dt, sum(if_follow) over(partition by video_id order by dt range BETWEEN 3 preceding and CURRENT ROW ) from test
> 3587 - Window '' with RANGE N PRECEDING/FOLLOWING frame requires exactly one ORDER BY expression, of numeric or temporal type
select video_id,dt, sum(if_follow) over(partition by video_id order by if_follow range BETWEEN CURRENT ROW and 3 following) from test ;
select video_id,dt, sum(if_follow) over(partition by video_id order by if_follow range BETWEEN 3 PRECEDING and CURRENT ROW ) from test ;
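The difference between the two frame types is easiest to see side by side. The sketch below is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; numeric RANGE offsets need SQLite 3.28 or later) and only the four video_id 2001 rows from the test data. RANGE compares the values of if_follow, so "3 FOLLOWING" means "values up to current value + 3", while ROWS simply counts three physical rows ahead.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (video_id INTEGER, dt TEXT, if_follow INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (2001, "2021-09-24", 1),
        (2001, "2021-10-03", 9),
        (2001, "2021-10-02", 2),
        (2001, "2021-10-01", 6),
    ],
)

query = """
SELECT if_follow,
       -- RANGE: rows whose if_follow lies in [current, current + 3]
       SUM(if_follow) OVER (
           PARTITION BY video_id ORDER BY if_follow
           RANGE BETWEEN CURRENT ROW AND 3 FOLLOWING
       ) AS by_range,
       -- ROWS: the current row and the next three rows, whatever their values
       SUM(if_follow) OVER (
           PARTITION BY video_id ORDER BY if_follow
           ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING
       ) AS by_rows
FROM test
ORDER BY if_follow
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (1, 3, 18)   RANGE sees values 1..4  -> 1 + 2;     ROWS sees 1 + 2 + 6 + 9
# (2, 2, 17)   RANGE sees values 2..5  -> 2;         ROWS sees 2 + 6 + 9
# (6, 15, 15)  RANGE sees values 6..9  -> 6 + 9;     ROWS sees 6 + 9
# (9, 9, 9)
```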
When the ORDER BY expression is of a temporal type (date, datetime), the offset must be written as an INTERVAL:
select video_id,dt, sum(if_follow) over(partition by video_id order by dt range BETWEEN CURRENT ROW and interval 2 day following) from test ;
select video_id,dt, sum(if_follow) over(partition by video_id order by dt range BETWEEN interval 2 day PRECEDING and CURRENT ROW ) from test ;
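A date-based RANGE frame can be approximated outside MySQL as follows. This is a sketch under stated assumptions: it runs on Python's built-in sqlite3 module, and because SQLite has no INTERVAL frame bounds it orders by julianday(dt), a SQLite-specific function that turns the date into a day number, and uses a plain offset of 2. In MySQL the statements above with interval 2 day are the form to use. Only the video_id 2001 rows are loaded.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (video_id INTEGER, dt TEXT, if_follow INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (2001, "2021-09-24", 1),
        (2001, "2021-10-03", 9),
        (2001, "2021-10-02", 2),
        (2001, "2021-10-01", 6),
    ],
)

# julianday(dt) is numeric, so "2 PRECEDING" means "up to 2 days earlier".
# This emulates MySQL's RANGE BETWEEN INTERVAL 2 DAY PRECEDING AND CURRENT ROW.
query = """
SELECT dt,
       SUM(if_follow) OVER (
           PARTITION BY video_id
           ORDER BY julianday(dt)
           RANGE BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS s
FROM test
ORDER BY dt
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('2021-09-24', 1)   no other row within the 2 previous days
# ('2021-10-01', 6)
# ('2021-10-02', 8)   6 + 2
# ('2021-10-03', 17)  6 + 2 + 9
```

The gap between 2021-09-24 and 2021-10-01 shows why this is a value range and not a row count: the row for 2021-10-01 does not pick up the physically preceding row.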
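The following functions can be used in MySQL:
rank(): tied rows share a rank and use up the positions that follow. For example, if three students tie for first place with a score of 100 and the next student scores 99, rank() yields 1, 1, 1, 4;
dense_rank(): tied rows share a rank without using up the following positions; for the same example it yields 1, 1, 1, 2;
row_number(): ties are ignored and every row gets its own number; for the same example it yields 1, 2, 3, 4;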
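The three ranking functions can be compared on the score example just described. The sketch below is illustrative only: the scores table and its name and score columns are hypothetical, and Python's built-in sqlite3 module stands in for MySQL. The name column is added to the row_number() ordering so that the numbering of tied rows is deterministic.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
# Hypothetical table: three students tied at 100, one at 99.
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 100), ("b", 100), ("c", 100), ("d", 99)],
)

query = """
SELECT name, score,
       RANK()       OVER (ORDER BY score DESC)       AS rnk,
       DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk,
       ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num
FROM scores
ORDER BY score DESC, name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('a', 100, 1, 1, 1)
# ('b', 100, 1, 1, 2)
# ('c', 100, 1, 1, 3)
# ('d', 99, 4, 2, 4)
```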
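count() over(partition by ... order by ...): the number of rows in the group;
max() over(partition by ... order by ...): the maximum value in the group;
min() over(partition by ... order by ...): the minimum value in the group;
avg() over(partition by ... order by ...): the average value in the group;
Note that when ORDER BY is present and no ROWS/RANGE clause is given, the default frame runs from the first row of the partition to the current row (including rows tied with it), so these are running values. Omit ORDER BY to aggregate over the whole group.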
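Whether an aggregate window function returns a group total or a running value depends on whether ORDER BY is present. The following sketch shows both forms next to each other; it assumes Python's built-in sqlite3 module as a stand-in for MySQL and loads only the video_id 2001 rows with the if_follow values 1, 6, 2, 9.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (video_id INTEGER, dt TEXT, if_follow INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (2001, "2021-09-24", 1),
        (2001, "2021-10-03", 9),
        (2001, "2021-10-02", 2),
        (2001, "2021-10-01", 6),
    ],
)

query = """
SELECT dt,
       -- no ORDER BY: the frame is the whole partition
       COUNT(*)       OVER (PARTITION BY video_id)             AS total_cnt,
       -- ORDER BY without a frame: partition start up to the current row
       COUNT(*)       OVER (PARTITION BY video_id ORDER BY dt) AS running_cnt,
       MAX(if_follow) OVER (PARTITION BY video_id ORDER BY dt) AS running_max
FROM test
ORDER BY dt
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('2021-09-24', 4, 1, 1)
# ('2021-10-01', 4, 2, 6)
# ('2021-10-02', 4, 3, 6)
# ('2021-10-03', 4, 4, 9)
```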
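lag() over(partition by ... order by ...): returns the value from the row n rows before the current row.
lead() over(partition by ... order by ...): returns the value from the row n rows after the current row.
lag(arg1, arg2, arg3) and lead(arg1, arg2, arg3) take three arguments:
the first argument is the column name (or expression) to read,
the second argument is the offset, which cannot be negative (1 if omitted),
the third argument is the default value returned when the offset falls outside the window of rows (NULL if omitted).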
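The three arguments can be tried out on a handful of rows. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL and the video_id 2001 rows from the test data; the default value -1 is chosen here only for illustration. An offset of 0 returns the current row's own value, and a missing third argument yields NULL (None in Python) when the offset runs past the last row.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (video_id INTEGER, dt TEXT, if_follow INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (2001, "2021-09-24", 1),
        (2001, "2021-10-03", 9),
        (2001, "2021-10-02", 2),
        (2001, "2021-10-01", 6),
    ],
)

query = """
SELECT dt,
       LAG(if_follow, 0)     OVER (ORDER BY dt) AS same_row,   -- offset 0: current row
       LAG(if_follow, 1, -1) OVER (ORDER BY dt) AS prev_row,   -- -1 when no previous row
       LEAD(if_follow, 2)    OVER (ORDER BY dt) AS two_ahead   -- NULL when out of range
FROM test
ORDER BY dt
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('2021-09-24', 1, -1, 2)
# ('2021-10-01', 6, 1, 9)
# ('2021-10-02', 2, 6, None)
# ('2021-10-03', 9, 2, None)
```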
Table structure and test data: the same `test` table and rows as in the RANGE example above (if_follow values 1, 9, 2, 6 for video_id 2001).
Syntax error: the offset cannot be negative
select video_id,dt, lag(dt,-1,'偏移超出了') over(order by dt ) from test ;
1064 - You have an error in your sql syntax; check the manual that corresponds to your MySQL Server version for the right syntax to use near '-1,'偏移超出了') over(order by dt ) from test' at line 1
select video_id,dt, lag(dt,0,'偏移超出了') over(order by dt ) from test ;
select video_id,dt, lag(dt,2,'偏移超出了') over(order by dt ) from test ;
select video_id,dt, lag(video_id,2,'偏移超出了') over(order by dt ) from test ;
select video_id,dt, lead(video_id,2,'偏移超出了') over(order by dt ) from test ;
select video_id,dt, lead(video_id,2) over(order by dt ) from test ;
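first_value() over() and last_value() over(): return the first and the last value in the group, respectively. When ORDER BY is used without an explicit frame, the default frame ends at the current row, so last_value() needs a frame that extends to UNBOUNDED FOLLOWING to reach the last row of the group.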
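The behavior of last_value() depends on the frame, which is easy to overlook. The sketch below illustrates this, assuming Python's built-in sqlite3 module as a stand-in for MySQL and the video_id 2001 rows from the test data. With ORDER BY and no frame clause, the frame stops at the current row, so last_value() just returns the current row's value; an explicit frame up to UNBOUNDED FOLLOWING returns the last value of the whole partition.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (video_id INTEGER, dt TEXT, if_follow INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (2001, "2021-09-24", 1),
        (2001, "2021-10-03", 9),
        (2001, "2021-10-02", 2),
        (2001, "2021-10-01", 6),
    ],
)

query = """
SELECT dt,
       FIRST_VALUE(if_follow) OVER (PARTITION BY video_id ORDER BY dt) AS first_v,
       -- default frame: ends at the current row
       LAST_VALUE(if_follow)  OVER (PARTITION BY video_id ORDER BY dt) AS last_default,
       -- explicit frame: covers the whole partition
       LAST_VALUE(if_follow)  OVER (
           PARTITION BY video_id ORDER BY dt
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS last_full
FROM test
ORDER BY dt
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('2021-09-24', 1, 1, 9)
# ('2021-10-01', 1, 6, 9)
# ('2021-10-02', 1, 2, 9)
# ('2021-10-03', 1, 9, 9)
```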
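ratio_to_report() over(partition by ...): the expression inside ratio_to_report() is the numerator, and the sum of that expression over the window defined in over() is the denominator. This is an Oracle analytic function; MySQL does not provide it, and the same ratio can be written there as expr / sum(expr) over(partition by ...).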
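A share-of-total calculation can be written with sum() over() alone. The sketch below is one possible form, not a built-in function: it assumes Python's built-in sqlite3 module as a stand-in for MySQL and the video_id 2001 rows from the test data, and it multiplies by 1.0 so that SQLite performs floating-point rather than integer division.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (video_id INTEGER, dt TEXT, if_follow INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (2001, "2021-09-24", 1),
        (2001, "2021-10-03", 9),
        (2001, "2021-10-02", 2),
        (2001, "2021-10-01", 6),
    ],
)

# Numerator: the row's own value. Denominator: the partition total (18 here).
query = """
SELECT dt, if_follow,
       if_follow * 1.0 / SUM(if_follow) OVER (PARTITION BY video_id) AS ratio
FROM test
ORDER BY dt
"""
rows = conn.execute(query).fetchall()
for dt, value, ratio in rows:
    print(dt, value, round(ratio, 4))
```

The ratios within one partition add up to 1, which is a quick sanity check for this kind of query.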
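percent_rank() over(partition by ... order by ...): the relative rank of the current row, computed as (rank - 1) / (number of rows in the group - 1), so the values run from 0 to 1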
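As a small illustration of percent_rank(), the sketch below ranks the four video_id 2001 rows by if_follow. It assumes Python's built-in sqlite3 module as a stand-in for MySQL. With four distinct values the ranks are 1 to 4, so the results are (rank - 1) / 3.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (video_id INTEGER, dt TEXT, if_follow INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (2001, "2021-09-24", 1),
        (2001, "2021-10-03", 9),
        (2001, "2021-10-02", 2),
        (2001, "2021-10-01", 6),
    ],
)

query = """
SELECT if_follow,
       PERCENT_RANK() OVER (PARTITION BY video_id ORDER BY if_follow) AS pr
FROM test
ORDER BY if_follow
"""
rows = conn.execute(query).fetchall()
for value, pr in rows:
    print(value, round(pr, 4))
# 1 0.0
# 2 0.3333
# 6 0.6667
# 9 1.0
```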
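Article title: Window function OVER(PARTITION BY) in detail: syntax, functions, and the ROWS and RANGE frame ranges
Article link: https://lsjlt.com/news/390976.html (please cite the source link when reposting)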