PostgreSQL 11 新特性之窗口函数(window function)增强
作者: 不剪发的Tony老师
毕业于北京航空航天大学,十多年数据库管理与开发经验,目前在一家全球性的金融公司从事数据库架构设计。CSDN学院签约讲师以及GitChat专栏作者。csdn上的博客收藏于以下地址:https://tonydong.blog.csdn.net
文章目录
PostgreSQL 11 针对窗口函数(window function)进行了增强,添加了 SQL:2011 标准中的所有窗口范围(window frame)子句选项。具体来说,新增了以下功能:
允许在RANGE窗口模式中使用off_set PRECEDING/FOLLOWING限定窗口范围
支持GROUPS窗口模式,使用基于组的方式限定窗口范围
支持窗口范围子句的排除选项,用于排除窗口内的某些数据行
首先,对于RANGE窗口模式,增加了off_set PRECEDING/FOLLOWING子句,现在可以支持所有的选项:
RANGE frame_start
RANGE BETWEEN frame_start AND frame_end
其中,frame_start 和 frame_end 可以是以下选项之一:
UNBOUNDED PRECEDING
off_set PRECEDING
CURRENT ROW
off_set FOLLOWING
UNBOUNDED FOLLOWING
创建一个测试表:
CREATE TABLE test
(
id SERIAL PRIMARY KEY,
val INT4,
logtime TIMESTAMP
);
INSERT INTO test(val, logtime)
VALUES
(1, '2019-01-15 08:08:17'),
(1, '2019-01-15 08:14:30'),
(3, '2019-01-15 08:36:00'),
(6, '2019-01-15 09:20:56'),
(6, '2019-01-15 10:15:41');
以下示例比较了ROWS模式和RANGE模式的区别:
SELECT id, val, logtime,
sum(val) OVER (ORDER BY val ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as sum_rows,
sum(val) OVER (ORDER BY val RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING) as sum_range,
sum(val) OVER (ORDER BY logtime RANGE BETWEEN INTERVAL '10 minute' PRECEDING AND '10 minute' FOLLOWING) as sum_range_time
FROM test;
id | val | logtime | sum_rows | sum_range | sum_range_time
----+-----+---------------------+----------+-----------+----------------
1 | 1 | 2019-01-15 08:08:17 | 2 | 2 | 2
2 | 1 | 2019-01-15 08:14:30 | 5 | 2 | 2
3 | 3 | 2019-01-15 08:36:00 | 10 | 3 | 3
4 | 6 | 2019-01-15 09:20:56 | 15 | 12 | 6
5 | 6 | 2019-01-15 10:15:41 | 12 | 12 | 6
(5 rows)
其中,
sum_rows 列计算按照字段 val 排序后,每一行以及前后各一行的合计值;
sum_range 列计算按照字段 val 排序后,每一行以及前后和它的值相差小于等于 1 的那些行的合计值;
sum_range_time 列计算按照字段 logtime 排序后,每一行以及前后和它的值相差小于等于 10 分钟的那些行的合计值。
其次,PostgreSQL 11 增加了GROUPS窗口模式。
GROUPS frame_start
GROUPS BETWEEN frame_start AND frame_end
其中,frame_start 和 frame_end 可以是以下选项之一:
UNBOUNDED PRECEDING
off_set PRECEDING
CURRENT ROW
off_set FOLLOWING
UNBOUNDED FOLLOWING
对于GROUPS窗口模式,计算窗口范围时使用组(由ORDER BY决定)为单位,而不是以行ROWS模式)或者值RANGE模式)为单位。
基于上面的示例,增加GROUPS模式:
SELECT id, val, logtime,
sum(val) OVER (ORDER BY val ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as sum_rows,
sum(val) OVER (ORDER BY val RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING) as sum_range,
sum(val) OVER (ORDER BY val GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as sum_groups
FROM test;
id | val | logtime | sum_rows | sum_range | sum_groups
----+-----+---------------------+----------+-----------+------------
1 | 1 | 2019-01-15 08:08:17 | 2 | 2 | 5
2 | 1 | 2019-01-15 08:14:30 | 5 | 2 | 5
3 | 3 | 2019-01-15 08:36:00 | 10 | 3 | 17
4 | 6 | 2019-01-15 09:20:56 | 15 | 12 | 15
5 | 6 | 2019-01-15 10:15:41 | 12 | 12 | 15
(5 rows)
其中,sum_groups 计算按照字段 val 排序后,每一行所在的组以及前后各一个组中所有行的合计值。
ORDER BY val决定了组,示例中存在 3 个组(1、3、6)。对于第一行数据,它所在的组有两个 val 值(1 和 1),1 PRECEDING组不存在,1 FOLLOWING组有一个值(3),合计为 5(1+1+3)。
最后,PostgreSQL 11 支持窗口范围子句的排除选项 frame_exclusion,用于排除窗口内的某些数据:
{ RANGE | ROWS | GROUPS } frame_start [ frame_exclusion ]
{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end [ frame_exclusion ]
其中,frame_start 和 frame_end 可以是以下选项之一:
UNBOUNDED PRECEDING
off_set PRECEDING
CURRENT ROW
off_set FOLLOWING
UNBOUNDED FOLLOWING
frame_exclusion 可以是以下选项之一:
EXCLUDE CURRENT ROW
EXCLUDE GROUP
EXCLUDE TIES
EXCLUDE NO OTHERS
它们的作用分别为:
EXCLUDE CURRENT ROW排除当前行
EXCLUDE GROUP排除当前行所在的组,也就是和当前行排序相同的行
EXCLUDE TIES排除当前行所在组中的其他行,但是不排除当前行
EXCLUDE NO OTHERS不排除任何行,默认值
以下示例演示了这些选项的效果:
SELECT id, val, logtime,
sum(val) OVER (ORDER BY val ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING EXCLUDE NO OTHERS) as sum_rows,
sum(val) OVER (ORDER BY val ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING EXCLUDE CURRENT ROW) as exclude_current_row,
sum(val) OVER (ORDER BY val ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING EXCLUDE GROUP) as exclude_group,
sum(val) OVER (ORDER BY val ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING EXCLUDE TIES) as exclude_ties
FROM test;
id | val | logtime | sum_rows | exclude_current_row | exclude_group | exclude_ties
----+-----+---------------------+----------+---------------------+---------------+--------------
1 | 1 | 2019-01-15 08:08:17 | 2 | 1 | | 1
2 | 1 | 2019-01-15 08:14:30 | 5 | 4 | 3 | 4
3 | 3 | 2019-01-15 08:36:00 | 10 | 7 | 7 | 10
4 | 6 | 2019-01-15 09:20:56 | 15 | 9 | 3 | 9
5 | 6 | 2019-01-15 10:15:41 | 12 | 6 | | 6
(5 rows)
至此,PostgreSQL 11 支持 SQL:2011 标准中定义的所有窗口范围子句选项。
官方文档:
https://www.postgresql.org/docs/11/tutorial-window.html
https://www.postgresql.org/docs/11/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS
https://www.postgresql.org/docs/11/queries-table-expressions.html#QUERIES-WINDOW
https://www.postgresql.org/docs/11/functions-window.html
https://www.postgresql.org/docs/11/sql-select.html#SQL-WINDOW