马克社区,马克吐温社区,马克java社区PostgreSQL 11 新特性之分区裁剪增强

PostgreSQL 11 新特性之分区裁剪增强

作者: 不剪发的Tony老师
毕业于北京航空航天大学，十多年数据库管理与开发经验，目前在一家全球性的金融公司从事数据库架构设计。CSDN学院签约讲师以及GitChat专栏作者。csdn上的博客收藏于以下地址：https://tonydong.blog.csdn.net

文章目录

在之前的版本中，只在查询的计划阶段执行分区排除操作（通过 constraint_exclusion 变量控制），意味着许多连接查询和预编译查询无法使用分区排除。另外，这种方法占用的时间会随着分区的数量线性增长。

PostgreSQL 11 通过两个方面的改进提供了更加强大且快速的分区裁剪功能：

    查询计划阶段更快的分区排除，可以提高分区表（尤其是包含许多分区的分区表）的访问性能。
    支持执行阶段的分区排除。

分区裁剪使用选项 enable_partition_pruning 进行控制。该参数默认值为 on。

show enable_partition_pruning;
enable_partition_pruning
--------------------------
on
(1 row)

CREATE TABLE rtable(c1 INT, c2 VARCHAR(10)) PARTITION BY RANGE(c1);
CREATE TABLE rtable100 PARTITION OF rtable FOR VALUES FROM (1) TO (100);
CREATE TABLE rtable200 PARTITION OF rtable FOR VALUES FROM (101) TO (200);
CREATE TABLE rtable300 PARTITION OF rtable FOR VALUES FROM (201) TO (300);
CREATE TABLE rtable400 PARTITION OF rtable FOR VALUES FROM (301) TO (400);

explain analyze select * from rtable where c1=256;
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Append (cost=0.00..24.53 rows=6 width=42) (actual time=0.009..0.009 rows=0 loops=1)
   -> Seq Scan on rtable300 (cost=0.00..24.50 rows=6 width=42) (actual time=0.007..0.007 rows=0 loops=1)
         Filter: (c1 = 256)
Planning Time: 0.397 ms
Execution Time: 0.042 ms
(5 rows)

如果将该参数设置为 off，将会禁用分区裁剪功能：

set enable_partition_pruning=off;
explain analyze select * from rtable where c1=256;
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Append (cost=0.00..98.12 rows=24 width=42) (actual time=0.015..0.015 rows=0 loops=1)
   -> Seq Scan on rtable100 (cost=0.00..24.50 rows=6 width=42) (actual time=0.007..0.007 rows=0 loops=1)
         Filter: (c1 = 256)
   -> Seq Scan on rtable200 (cost=0.00..24.50 rows=6 width=42) (actual time=0.002..0.002 rows=0 loops=1)
         Filter: (c1 = 256)
   -> Seq Scan on rtable300 (cost=0.00..24.50 rows=6 width=42) (actual time=0.002..0.002 rows=0 loops=1)
         Filter: (c1 = 256)
   -> Seq Scan on rtable400 (cost=0.00..24.50 rows=6 width=42) (actual time=0.002..0.002 rows=0 loops=1)
         Filter: (c1 = 256)
Planning Time: 0.246 ms
Execution Time: 0.167 ms
(11 rows)

    注意 Currently, pruning of partitions during the planning of an UPDATE
    目前，对于 UPDATE 和 DELETE 语句，计划阶段的分区裁剪基于之前的约束排除方法实现（但是，该功能使用 enable_partition_pruning 选项控制，而不是 constraint_exclusion 选项）。
    另外，执行阶段的分区裁剪目前支持 Append 节点类型，而不支持 MergeAppend 或者 ModifyTable（UPDATE 或者 DELETE）。
    这些行为很可能在将来的 PostgreSQL 版本中进行修改。

在 PostgreSQL 11 中，查询计划阶段的分区排除使用二分查找法搜索匹配的分区（LIST 分区表和 RANGE 分区表）；对于哈希分区表，使用哈希函数查找匹配的分区。但是，对于 UPDATE/DELETE 语句，仍然使用约束排除的方法。

PostgreSQL 11 另一个重大的改进就是支持查询执行时的动态分区裁剪。先看一个 PostgreSQL 10 中的示例。

-- PostgreSQL 10
PREPARE ps1 (INT) AS SELECT * FROM rtable WHERE c1 = $1;

explain analyze execute ps1(256);
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Append (cost=0.00..98.00 rows=24 width=42) (actual time=0.051..0.051 rows=0 loops=1)
   -> Seq Scan on rtable100 (cost=0.00..24.50 rows=6 width=42) (actual time=0.011..0.011 rows=0 loops=1)
         Filter: (c1 = 256)
   -> Seq Scan on rtable200 (cost=0.00..24.50 rows=6 width=42) (actual time=0.006..0.006 rows=0 loops=1)
         Filter: (c1 = 256)
   -> Seq Scan on rtable300 (cost=0.00..24.50 rows=6 width=42) (actual time=0.006..0.006 rows=0 loops=1)
         Filter: (c1 = 256)
   -> Seq Scan on rtable400 (cost=0.00..24.50 rows=6 width=42) (actual time=0.005..0.005 rows=0 loops=1)
         Filter: (c1 = 256)
Planning time: 0.373 ms
Execution time: 0.215 ms
(11 rows)

查询计划显示需要扫描所有的分区，因为查询计划器无法确定带参数的查询语句的执行计划。

以下是相同的操作在 PostgreSQL 11 中的结果：

-- PostgreSQL 11
PREPARE ps1 (INT) AS SELECT * FROM rtable WHERE c1 = $1;

explain analyze execute ps1(256);
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Append (cost=0.00..24.53 rows=6 width=42) (actual time=0.017..0.019 rows=1 loops=1)
   -> Seq Scan on rtable300 (cost=0.00..24.50 rows=6 width=42) (actual time=0.016..0.017 rows=1 loops=1)
         Filter: (c1 = 256)
Planning Time: 0.382 ms
Execution Time: 0.049 ms
(5 rows)

结果显示，PostgreSQL 11 可以针对带参数的查询语句执行分区排除。

动态分区排除可以利用查询的计划阶段不能确定的值执行分区裁剪；例如 PREPARE 语句中的参数，通过子查询获取的值，或者嵌套循环连接的内层参数值。运行时分区裁剪发生在以下两个时间点：

    查询计划初始化阶段。使用执行初始化阶段能够确定的参数值执行分区裁剪。这个阶段排除的分区不会显示在 EXPLAIN 或 EXPLAIN ANALYZE 的结果中。可以通过 EXPLAIN 结果的 “Subplans Removed” 属性查看这个阶段排除的分区数量。
    查询计划的实际执行阶段。在查询的实际执行阶段，仍然可能使用运行时才能确定的值完成分区裁剪的操作。例如来自子查询的值和运行时参数（例如参数化的嵌套循环连接）的值。由于这些参数的值在运行时可能会发生变化，每次参数变化时都会执行一次分区裁剪操作。判断是否在该阶段产生分区排除可以查询 EXPLAIN ANALYZE 输出中的 nloops 属性。

以下是一个使用子查询的示例：

explain analyze select * from rtable where c1=(select 256);
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
Append (cost=0.01..74.61 rows=19 width=42) (actual time=0.027..0.027 rows=0 loops=1)
   InitPlan 1 (returns $0)
     -> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
   -> Seq Scan on rtable100 (cost=0.00..24.50 rows=6 width=42) (never executed)
         Filter: (c1 = $0)
   -> Seq Scan on rtable200 (cost=0.00..24.50 rows=6 width=42) (never executed)
         Filter: (c1 = $0)
   -> Seq Scan on rtable300 (cost=0.00..1.00 rows=1 width=42) (actual time=0.012..0.012 rows=0 loops=1)
         Filter: (c1 = $0)
   -> Seq Scan on rtable400 (cost=0.00..24.50 rows=6 width=42) (never executed)
         Filter: (c1 = $0)
Planning Time: 0.301 ms
Execution Time: 0.103 ms
(13 rows)

查询计划中的 (never executed) 就是运行时动态执行的分区裁剪。

官方文档：Table Partitioning