欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

pathman分区表工具的使用

程序员文章站 2022-06-02 13:39:25
...

一、概述

在PG<=10的版本中,都是通过表继承的方式进行分区的,必须使用CHECK CONSTRAINT将每个分区创建为子表 。

PostgreSQL 10提供了本机分区,它与经典方法没有什么不同,通过隐式约束条件来实现,并且其大多数限制仍然相关。

源生的两种分区表的实现在执行select/delete/update时执行计划根据约束和查询条件排除不需要查询的分区表。调用COPY或插入数据时使用触发器或规则,将数据插入对应的分区表。PLAN时需要对所有分区创建RangeTblEntry与RelOptInfo结果,同时需要加锁。无论是查询还是插入,对性能的影响都较大。

PG 12 版本以上,原生分区表再次做了一定的优化,其资源消耗和性能方面有了较大的提升。

pathman 是一块比较常用的分区表的管理插件,它将分区配置存储在pathman_config表中,表的信息会缓存在内存中,同时使用HOOK来实现RELATION的替换,所以效率非常高。目前该工具支持range和hash两种分区方式,range使用binary search查找对应的分区,hash使用hash search查找对应的分区。

pg_pathman 用到的hook如下:

  • pg_pathman uses ProcessUtility_hook hook to handle COPY queries for partitioned tables.
  • RuntimeAppend (overrides Append plan node)
  • RuntimeMergeAppend (overrides MergeAppend plan node)
  • PartitionFilter (drop-in replacement for INSERT triggers)

pg_pathman特性:

  • 目前支持HASH分区、RANGE分区。
  • 支持自动分区管理(通过函数接口创建分区,自动将主表数据迁移到分区表),或手工分区管理(通过函数实现,将已有的表绑定到分区表,或者从分区表剥离)。
  • 支持的分区字段类型包括int、float、date以及其他常用类型,包括自定义的domain。
  • 有效的分区表查询计划(JOINs、subselects 等)。
  • 使用RuntimeAppend & RuntimeMergeAppend 自定义计划节点实现了动态分区选择。
  • PartitionFilter:一种有效的插入触发器替换方法。
  • 支持自动新增分区(目前仅支持RANGE分区表)。
  • 支持copy from/to直接读取或写入分区表,提高效率。
  • 支持分区字段的更新,需要添加触发器,如果不需要更新分区字段,则不建议添加这个触发器,会产生一定的性能影响。
  • 允许用户自定义回调函数,在创建分区时会自动触发。
  • 非堵塞式创建分区表,以及后台自动将主表数据非堵塞式迁移到分区表。
  • 支持FDW,通过配置参数pg_pathman.insert_into_fdw=(disabled | postgres | any_fdw)支持postgres_fdw或任意FDW。

二、安装部署

1、下载源码安装包并编译安装

# git clone https://github.com/postgrespro/pg_pathman
# cd /usr/local/pg_pathman
# make USE_PGXS=1
# make USE_PGXS=1 install

2、数据库配置文件的修改与生效

## 配置文件修改
$ vi postgres.conf
shared_preload_libraries = 'pg_pathman'

## 重启数据库服务,使配置文件生效
$ pg_ctl start

该参数设置需要注意的一点是,若数据库存在一些使用相同 hook 函数的工具,可能会导致他们之间对 hook 函数的调用出现冲突。需要在配置文件中规范他们顺序。比较常见的一个场景就是若数据库中安装有pg_stat_statements,其配置顺序为 shared_preload_libraries = ‘pg_stat_statements, pg_pathman’。

3、数据库加载扩展包

postgres=# CREATE EXTENSION pg_pathman;
CREATE EXTENSION
postgres=# \dx
                     List of installed extensions
    Name    | Version |   Schema   |           Description
------------+---------+------------+----------------------------------
 pg_pathman | 1.5     | public     | Partitioning tool for PostgreSQL
 plpgsql    | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

4、扩展包升级

## 下载安装最新版本的源码包
## 重启数据库服务
## 执行以下命令
ALTER EXTENSION pg_pathman UPDATE;
SET pg_pathman.enable = t;

5、查看已安装插件

postgres=# \dx
                     List of installed extensions
    Name    | Version |   Schema   |           Description
------------+---------+------------+----------------------------------
 pg_pathman | 1.5     | public     | Partitioning tool for PostgreSQL
 plpgsql    | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

三、基本命令

3.1 相关视图与表

1、分区表元数据信息 pathman_config

db1=# \d+ pathman_config
                                   Table "public.pathman_config"
     Column     |   Type   | Collation | Nullable | Default | Storage  | Stats target | Description
----------------+----------+-----------+----------+---------+----------+--------------+-------------
 partrel        | regclass |           | not null |         | plain    |              |主表oid
 expr           | text     |           | not null |         | extended |              |分区字段
 parttype       | integer  |           | not null |         | plain    |              |分区类型,hash或range
 range_interval | text     |           |          |         | extended |              |range分区的interval
Indexes:
    "pathman_config_pkey" PRIMARY KEY, btree (partrel)
Check constraints:
    "pathman_config_interval_check" CHECK (validate_interval_value(partrel, expr, parttype, range_interval))
    "pathman_config_parttype_check" CHECK (parttype = ANY (ARRAY[1, 2]))
Policies:
    POLICY "allow_select" FOR SELECT
      USING (true)
    POLICY "deny_modification"
      USING (check_security_policy(partrel))
Triggers:
    pathman_config_trigger AFTER INSERT OR DELETE OR UPDATE ON pathman_config FOR EACH ROW EXECUTE FUNCTION pathman_config_params_trigger_func()
Access method: heap


db1=# select * from  pathman_config;
    partrel     |   expr   | parttype | range_interval
----------------+----------+----------+----------------
 part_test      | crt_time |        2 | 1 mon               //主表为part_test、分区字段为crt_time,分区类型2表示range分区,range的interval为1个月
 part_hash_test | crt_time |        1 |                     //主表为part_hash_test,分区字段为crt_time,分区类型1表是hash分区
(2 rows)

2、分区表元数据信息 pathman_config_params

db1=# \d+ pathman_config_params
                                Table "public.pathman_config_params"
     Column      |   Type   | Collation | Nullable | Default | Storage  | Stats target | Description
-----------------+----------+-----------+----------+---------+----------+--------------+-------------
 partrel         | regclass |           | not null |         | plain    |              |主表oid
 enable_parent   | boolean  |           | not null | false   | plain    |              |是否在优化器中过滤主表
 auto            | boolean  |           | not null | true    | plain    |              |insert时是否自动扩展不存在的分区
 init_callback   | text     |           |          |         | extended |              |create partition时的回调函数oid
 spawn_using_bgw | boolean  |           | not null | false   | plain    |              |
Indexes:
    "pathman_config_params_pkey" PRIMARY KEY, btree (partrel)
Check constraints:
    "pathman_config_params_init_callback_check" CHECK (validate_part_callback(
CASE
    WHEN init_callback IS NULL THEN 0::regprocedure
    ELSE init_callback::regprocedure
END))
Policies:
    POLICY "allow_select" FOR SELECT
      USING (true)
    POLICY "deny_modification"
      USING (check_security_policy(partrel))
Triggers:
    pathman_config_params_trigger AFTER INSERT OR DELETE OR UPDATE ON pathman_config_params FOR EACH ROW EXECUTE FUNCTION pathman_config_params_trigger_func()
Access method: heap


db1=# select * from pathman_config_params;
    partrel     | enable_parent | auto | init_callback | spawn_using_bgw
----------------+---------------+------+---------------+-----------------
 part_test      | f             | t    |               | f                  //主表为part_test,数据无法写入到主表,自动扩展分区
 part_hash_test | t             | t    |               | f                  //主表为part_hash_test,允许数据写入到主表,自动扩展分区
(2 rows)

3、分区表后台数据迁移任务信息 pathman_concurrent_part_tasks

db1=# \d+ pathman_concurrent_part_tasks
                  View "public.pathman_concurrent_part_tasks"
  Column   |   Type   | Collation | Nullable | Default | Storage  | Description
-----------+----------+-----------+----------+---------+----------+-------------
 userid    | regrole  |           |          |         | plain    |
 pid       | integer  |           |          |         | plain    |
 dbid      | oid      |           |          |         | plain    |
 relid     | regclass |           |          |         | plain    |
 processed | bigint   |           |          |         | plain    |
 status    | text     |           |          |         | extended |
View definition:
 SELECT show_concurrent_part_tasks.userid,
    show_concurrent_part_tasks.pid,
    show_concurrent_part_tasks.dbid,
    show_concurrent_part_tasks.relid,
    show_concurrent_part_tasks.processed,
    show_concurrent_part_tasks.status
   FROM show_concurrent_part_tasks() show_concurrent_part_tasks(userid, pid, dbid, relid, processed, status);

db1=#
db1=#
db1=# select * from pathman_concurrent_part_tasks;
 userid | pid | dbid | relid | processed | status
--------+-----+------+-------+-----------+--------
(0 rows)

4、分区表的分区字段信息 pathman_partition_list

db1=# \d+ pathman_partition_list
                      View "public.pathman_partition_list"
  Column   |   Type   | Collation | Nullable | Default | Storage  | Description
-----------+----------+-----------+----------+---------+----------+-------------
 parent    | regclass |           |          |         | plain    |主表oid
 partition | regclass |           |          |         | plain    |子分区表
 parttype  | integer  |           |          |         | plain    |分区表类型,1表示range,2表示hash
 expr      | text     |           |          |         | extended |分区字段
 range_min | text     |           |          |         | extended |range的左边界
 range_max | text     |           |          |         | extended |range的右边界
View definition:
 SELECT show_partition_list.parent,
    show_partition_list.partition,
    show_partition_list.parttype,
    show_partition_list.expr,
    show_partition_list.range_min,
    show_partition_list.range_max
   FROM show_partition_list() show_partition_list(parent, partition, parttype, expr, range_min, range_max);

db1=# select * from pathman_partition_list;
     parent     |    partition     | parttype |   expr   |      range_min      |      range_max
----------------+------------------+----------+----------+---------------------+---------------------
 part_test      | part_test_1      |        2 | crt_time | 2020-09-01 00:00:00 | 2020-10-01 00:00:00
 part_test      | part_test_2      |        2 | crt_time | 2020-10-01 00:00:00 | 2020-11-01 00:00:00
 part_test      | part_test_3      |        2 | crt_time | 2020-11-01 00:00:00 | 2020-12-01 00:00:00
 part_test      | part_test_4      |        2 | crt_time | 2020-12-01 00:00:00 | 2021-01-01 00:00:00
 part_test      | part_test_5      |        2 | crt_time | 2021-01-01 00:00:00 | 2021-02-01 00:00:00
 part_test      | part_test_6      |        2 | crt_time | 2021-02-01 00:00:00 | 2021-03-01 00:00:00
 part_test      | part_test_7      |        2 | crt_time | 2021-03-01 00:00:00 | 2021-04-01 00:00:00
 part_test      | part_test_8      |        2 | crt_time | 2021-04-01 00:00:00 | 2021-05-01 00:00:00
 part_test      | part_test_9      |        2 | crt_time | 2021-05-01 00:00:00 | 2021-06-01 00:00:00
 part_test      | part_test_10     |        2 | crt_time | 2021-06-01 00:00:00 | 2021-07-01 00:00:00
 part_test      | part_test_11     |        2 | crt_time | 2021-07-01 00:00:00 | 2021-08-01 00:00:00
 part_test      | part_test_12     |        2 | crt_time | 2021-08-01 00:00:00 | 2021-09-01 00:00:00
 part_test      | part_test_13     |        2 | crt_time | 2021-09-01 00:00:00 | 2021-10-01 00:00:00
 part_test      | part_test_14     |        2 | crt_time | 2021-10-01 00:00:00 | 2021-11-01 00:00:00
 part_test      | part_test_15     |        2 | crt_time | 2021-11-01 00:00:00 | 2021-12-01 00:00:00
 part_test      | part_test_16     |        2 | crt_time | 2021-12-01 00:00:00 | 2022-01-01 00:00:00
 part_test      | part_test_17     |        2 | crt_time | 2022-01-01 00:00:00 | 2022-02-01 00:00:00
 part_test      | part_test_18     |        2 | crt_time | 2022-02-01 00:00:00 | 2022-03-01 00:00:00
 part_test      | part_test_19     |        2 | crt_time | 2022-03-01 00:00:00 | 2022-04-01 00:00:00
 part_test      | part_test_20     |        2 | crt_time | 2022-04-01 00:00:00 | 2022-05-01 00:00:00
 part_test      | part_test_21     |        2 | crt_time | 2022-05-01 00:00:00 | 2022-06-01 00:00:00
 part_test      | part_test_22     |        2 | crt_time | 2022-06-01 00:00:00 | 2022-07-01 00:00:00
 part_test      | part_test_23     |        2 | crt_time | 2022-07-01 00:00:00 | 2022-08-01 00:00:00
 part_test      | part_test_24     |        2 | crt_time | 2022-08-01 00:00:00 | 2022-09-01 00:00:00
 part_test      | part_test_25     |        2 | crt_time | 2022-09-01 00:00:00 | 2022-10-01 00:00:00
 part_hash_test | part_hash_test_0 |        1 | crt_time |                     |
 part_hash_test | part_hash_test_1 |        1 | crt_time |                     |
 part_hash_test | part_hash_test_2 |        1 | crt_time |                     |
 part_hash_test | part_hash_test_3 |        1 | crt_time |                     |
(29 rows)

3.2 分区管理函数

1、range分区

1)创建range分区 create_range_partitions

create_range_partitions(relation       REGCLASS,  -- 主表OID
                        attribute      TEXT,      -- 分区列名
                        start_value    ANYELEMENT,  -- 开始值
                        p_interval     ANYELEMENT,  -- 间隔;任意类型,适合任意类型的分区表
                        p_count        INTEGER DEFAULT NULL,   --  分多少个区
                        partition_data BOOLEAN DEFAULT TRUE)   --  是否立即将数据从主表迁移到分区, 不建议这么使用

2)创建range分区 create_partitions_from_range

create_partitions_from_range(relation       REGCLASS,  -- 主表OID
                             attribute      TEXT,      -- 分区列名
                             start_value    ANYELEMENT,  -- 开始值
                             end_value      ANYELEMENT,  -- 结束值
                             p_interval     ANYELEMENT,  -- 间隔;任意类型,适合任意类型的分区表
                             partition_data BOOLEAN DEFAULT TRUE)   --  是否立即将数据从主表迁移到分区, 不建议这么使用

RANGE分区表使用建议:

  • 分区列必须有not null约束。
  • 分区个数必须能覆盖已有的所有记录。
  • 创建分区表时建议partition_data设置为false,使用非堵塞式迁移接口,在数据迁移完成后,建议禁用主表。

2、hash分区

1)创建hash分区 create_hash_partitions

create_hash_partitions(relation         REGCLASS,  -- 主表OID
                       attribute        TEXT,      -- 分区列名
                       partitions_count INTEGER,   -- 打算创建多少个分区
                       partition_data   BOOLEAN DEFAULT TRUE)   --  是否立即将数据从主表迁移到分区, 不建议这么使用

在HASH分区表使用建议:

  • 分区列必须有not null约束。
  • 创建分区表时建议partition_data设置为false,使用非堵塞式迁移接口迁移,在数据迁移完成后,建议禁用主表。
  • pg_pathman不会受制于表达式的写法,所以select * from part_test where crt_time = ‘2016-10-25 00:00:00’::timestamp;这样的写法也能用于HASH分区的。
  • HASH分区列不局限于int类型的列,会使用HASH函数自动转换。

3、分区迁移

如果创建分区表时,未将主表数据迁移到分区,那么可以使用非堵塞式的迁移接口,将数据迁移到对应分区。

1)迁移主表数据到子分区表 partition_table_concurrently

partition_table_concurrently(relation   REGCLASS,              -- 主表OID
                             batch_size INTEGER DEFAULT 1000,  -- 一个事务批量迁移多少记录
                             sleep_time FLOAT8 DEFAULT 1.0)    -- 获得行锁失败时,休眠多久再次获取,重试60次退出任务。

2)停止迁移任务

stop_concurrent_part_task(relation REGCLASS)    -- 主表OID

4、分区的分裂与合并

1)range分区分裂

如果某个分区太大,想分裂为两个分区,可以指定该分区分裂为2个子分区表(目前仅支持RANGE分区表),数据会自动迁移到另一个分区。

split_range_partition(partition      REGCLASS,            -- 分区oid
                      split_value    ANYELEMENT,          -- 分裂值
                      partition_name TEXT DEFAULT NULL)   -- 分裂后新增的分区表名

2)多个分区的合并

将两个子分区表进行合并,自动将子分区表2的数据迁移到子分区表1,然后删除子分区表2。(目前仅支持RANGE分区表)

指定两个需要合并分区,必须为相邻分区  
merge_range_partitions(partition1 REGCLASS,               -- 子分区表1
                        partition2 REGCLASS)              -- 相邻的子分区表2

5、新增分区

每个分区表默认是可以自动创建分区的,

1)向后添加范围分区

append_range_partition(parent         REGCLASS,            -- 主表OID
                       partition_name TEXT DEFAULT NULL,   -- 新增的分区表名, 默认不需要输入
                       tablespace     TEXT DEFAULT NULL)   -- 新增的分区表放到哪个表空间, 默认不需要输入

2)向前添加范围分区

prepend_range_partition(parent         REGCLASS,
                        partition_name TEXT DEFAULT NULL,
                        tablespace     TEXT DEFAULT NULL)

4)添加分区

add_range_partition(relation       REGCLASS,    -- 主表OID
                    start_value    ANYELEMENT,  -- 起始值
                    end_value      ANYELEMENT,  -- 结束值
                    partition_name TEXT DEFAULT NULL,  -- 分区名
                    tablespace     TEXT DEFAULT NULL)  -- 分区创建在哪个表空间下

6、删除分区

## 删除指定子分区
drop_range_partition(partition TEXT,   -- 子分区名称
                    delete_data BOOLEAN DEFAULT TRUE)  -- 是否删除分区数据,如果false,表示分区数据迁移到主表。  

## 批量删除全部子分区
drop_partitions(parent      REGCLASS,       -- 主表oid
                delete_data BOOLEAN DEFAULT FALSE) -- 是否删除分区数据,如果false,表示分区数据迁移到主表。

7、绑定/解绑分区

1)绑定分区(已有的表加入分区表)

将已有的表,绑定到已有的某个分区主表。已有的表与主表要保持一致的结构,包括dropped columns (查看pg_attribute的一致性)

attach_range_partition(relation    REGCLASS,    -- 主表OID
                       partition   REGCLASS,    -- 分区表OID
                       start_value ANYELEMENT,  -- 起始值
                       end_value   ANYELEMENT)  -- 结束值

2)解绑分区(将分区变成普通表)

将分区从主表的继承关系中删除,不删数据,删除继承关系,删除约束。接口如下:

detach_range_partition(partition REGCLASS)  -- 指定分区名,转换为普通表

8、禁用主表

当主表的数据全部迁移到分区后,可以禁用主表。接口函数如下:

set_enable_parent(relation REGCLASS,      -- 主表名称
                    value BOOLEAN)        -- true/false

9、自动扩展分区

范围分区表,允许自动扩展分区。如果新插入的数据不在已有的分区范围内,会自动创建分区。

set_auto(relation REGCLASS,         -- 主表名称
            value BOOLEAN)          -- true/false

四、示例

1、range 分区基本使用

postgres=# \c db1
You are now connected to database "db1" as user "postgres".

## 创建分区主表
db1=# create table part_test(id int, info text, crt_time timestamp not null);
CREATE TABLE
db1=# insert into part_test select id,md5(random()::text),clock_timestamp() + (id||' hour')::interval from generate_series(1,10000) t(id);
INSERT 0 10000
db1=# select * from part_test limit 10;
 id |               info               |          crt_time
----+----------------------------------+----------------------------
  1 | d31dbb7672abc7aac781d30f1e6fd707 | 2020-09-24 18:51:37.784476
  2 | 97d807b7e5b7438904338b12e66519d0 | 2020-09-24 19:51:37.784566
  3 | c332436057ed3e43d56c2702ae66477c | 2020-09-24 20:51:37.78457
  4 | 9973d6f7c3ba75c14ebfcae16ef5081d | 2020-09-24 21:51:37.784572
  5 | dd377f7d8ff7a78f64f1155b071837eb | 2020-09-24 22:51:37.784575
  6 | 5bf2c97d421602ba7821da5059b13225 | 2020-09-24 23:51:37.784577
  7 | 82fca34052d64defaec7ac2d59961934 | 2020-09-25 00:51:37.784578
  8 | edf9cb3eb950a4f2efbab3bd52fbef79 | 2020-09-25 01:51:37.78458
  9 | 1e263054325dfcb6d82980c595fea977 | 2020-09-25 02:51:37.784581
 10 | 77b7c4f7660e23d241af6a226cea147f | 2020-09-25 03:51:37.784582
(10 rows)


## 创建range分区子分区表
db1=# select create_range_partitions('part_test'::regclass,'crt_time','2020-09-01 00:00:00'::timestamp,interval '1 month',24,false) ;
 create_range_partitions
-------------------------
                      24
(1 row)


db1=# \d+ part_test
                                            Table "public.part_test"
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 id       | integer                     |           |          |         | plain    |              |
 info     | text                        |           |          |         | extended |              |
 crt_time | timestamp without time zone |           | not null |         | plain    |              |
Child tables: part_test_1,
              part_test_10,
              part_test_11,
              part_test_12,
              part_test_13,
              part_test_14,
              part_test_15,
              part_test_16,
              part_test_17,
              part_test_18,
              part_test_19,
              part_test_2,
              part_test_20,
              part_test_21,
              part_test_22,
              part_test_23,
              part_test_24,
              part_test_3,
              part_test_4,
              part_test_5,
              part_test_6,
              part_test_7,
              part_test_8,
              part_test_9
Access method: heap

## 使用非阻塞方式迁移主表数据到子分区表
db1=# select partition_table_concurrently('part_test'::regclass,10000,1.0);
NOTICE:  worker started, you can stop it with the following command: select public.stop_concurrent_part_task('part_test');
 partition_table_concurrently
------------------------------

(1 row)

db1=# select count(*) from only part_test;
 count
-------
     0
(1 row)

## 禁用主表
db1=# select set_enable_parent('part_test'::regclass, false);
 set_enable_parent
-------------------

(1 row)


## 分区表查询执行计划
db1=# explain select * from part_test where crt_time = '2020-10-25 00:00:00'::timestamp;
                                QUERY PLAN
---------------------------------------------------------------------------
 Seq Scan on part_test_2  (cost=0.00..13.30 rows=1 width=45)
   Filter: (crt_time = '2020-10-25 00:00:00'::timestamp without time zone)
(2 rows)

2、hash 分区表使用示例


## 创建区分主表
db1=# create table part_hash_test(id int, info text, crt_time timestamp not null);
CREATE TABLE
db1=# insert into part_hash_test select id,md5(random()::text),clock_timestamp() + (id||' hour')::interval from generate_series(1,10000) t(id);
INSERT 0 10000
db1=# select * from part_hash_test limit 10;
 id |               info               |          crt_time
----+----------------------------------+----------------------------
  1 | 5a178885f00e94d93fbe25b3ebf78cdb | 2020-09-25 11:16:09.032447
  2 | dc1c362db8122b42262e76375d4fb266 | 2020-09-25 12:16:09.032583
  3 | bb9fc8da485d9085d4c91f4e3ec86478 | 2020-09-25 13:16:09.032588
  4 | 1f503196e4337ebb9a98500ada99cf66 | 2020-09-25 14:16:09.03259
  5 | 5fa2fc57bf309cd84bc8a2a88b8a8c7e | 2020-09-25 15:16:09.032591
  6 | 9a8c7b178826466f4c32a70c84b242bc | 2020-09-25 16:16:09.032593
  7 | f6b5a272d3214a1fda9b45fdc38a457d | 2020-09-25 17:16:09.032594
  8 | 5747a82d95a026530c0dbcf42f7dcb2b | 2020-09-25 18:16:09.032595
  9 | 3d4dae8e6b5d093ed5c000d0a8348b4f | 2020-09-25 19:16:09.032597
 10 | dff90345635e9757d91f6ce8908012ea | 2020-09-25 20:16:09.032598
(10 rows)

## 创建hash分区子表
db1=# select create_hash_partitions('part_hash_test'::regclass,'crt_time',4,false) ;
 create_hash_partitions
------------------------
                      4
(1 row)

## 查看分区表定义
db1=# \d+ part_hash_test
                                          Table "public.part_hash_test"
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 id       | integer                     |           |          |         | plain    |              |
 info     | text                        |           |          |         | extended |              |
 crt_time | timestamp without time zone |           | not null |         | plain    |              |
Child tables: part_hash_test_0,
              part_hash_test_1,
              part_hash_test_2,
              part_hash_test_3
Access method: heap

db1=# select count(*) from only part_hash_test;
 count
-------
 10000
(1 row)

## 使用非阻塞方式迁移主表数据到分区子表
db1=# select partition_table_concurrently('part_hash_test'::regclass,10000,1.0);
NOTICE:  worker started, you can stop it with the following command: select public.stop_concurrent_part_task('part_hash_test');
 partition_table_concurrently
------------------------------

(1 row)

db1=# select count(*) from only part_hash_test;
 count
-------
     0
(1 row)

db1=# \d+ part_hash_test_0
                                         Table "public.part_hash_test_0"
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 id       | integer                     |           |          |         | plain    |              |
 info     | text                        |           |          |         | extended |              |
 crt_time | timestamp without time zone |           | not null |         | plain    |              |
Check constraints:
    "pathman_part_hash_test_0_check" CHECK (get_hash_part_idx(timestamp_hash(crt_time), 4) = 0)
Inherits: part_hash_test
Access method: heap

3、分区列的分裂

db1=# \d+ part_test_1
                                           Table "public.part_test_1"
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 id       | integer                     |           |          |         | plain    |              |
 info     | text                        |           |          |         | extended |              |
 crt_time | timestamp without time zone |           | not null |         | plain    |              |
Check constraints:
    "pathman_part_test_1_check" CHECK (crt_time >= '2020-09-01 00:00:00'::timestamp without time zone AND crt_time < '2020-10-01 00:00:00'::timestamp without time zone)
Inherits: part_test
Access method: heap

db1=# select count(*) from part_test_2;
 count
-------
   744
(1 row)

db1=# \d+ part_test_2
                                           Table "public.part_test_2"
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 id       | integer                     |           |          |         | plain    |              |
 info     | text                        |           |          |         | extended |              |
 crt_time | timestamp without time zone |           | not null |         | plain    |              |
Check constraints:
    "pathman_part_test_2_check" CHECK (crt_time >= '2020-10-01 00:00:00'::timestamp without time zone AND crt_time < '2020-11-01 00:00:00'::timestamp without time zone)
Inherits: part_test
Access method: heap


## 子分区表的分裂
db1=# select split_range_partition('part_test_2'::regclass,'2020-10-15 00:00:00'::timestamp,'part_test_2_1');
 split_range_partition
-----------------------
 part_test_2_1
(1 row)


## 分裂后的子分区表
db1=# select count(*) from part_test_2;
 count
-------
   336
(1 row)

db1=# select count(*) from part_test_2_1;
 count
-------
   408
(1 row)

4、子分区表的合并

db1=# \d+ part_test_2
                                           Table "public.part_test_2"
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 id       | integer                     |           |          |         | plain    |              |
 info     | text                        |           |          |         | extended |              |
 crt_time | timestamp without time zone |           | not null |         | plain    |              |
Check constraints:
    "pathman_part_test_2_check" CHECK (crt_time >= '2020-10-01 00:00:00'::timestamp without time zone AND crt_time < '2020-10-15 00:00:00'::timestamp without time zone)
Inherits: part_test
Access method: heap

db1=# \d+ part_test_2_1
                                          Table "public.part_test_2_1"
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 id       | integer                     |           |          |         | plain    |              |
 info     | text                        |           |          |         | extended |              |
 crt_time | timestamp without time zone |           | not null |         | plain    |              |
Check constraints:
    "pathman_part_test_2_1_check" CHECK (crt_time >= '2020-10-15 00:00:00'::timestamp without time zone AND crt_time < '2020-11-01 00:00:00'::timestamp without time zone)
Inherits: part_test
Access method: heap


## 子分区表的合并
db1=# select merge_range_partitions('part_test_2'::regclass, 'part_test_2_1'::regclass) ;
 merge_range_partitions
------------------------
 part_test_2
(1 row)

db1=# \d+ part_test_2_1
Did not find any relation named "part_test_2_1".
db1=# \d+ part_test_2
                                           Table "public.part_test_2"
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 id       | integer                     |           |          |         | plain    |              |
 info     | text                        |           |          |         | extended |              |
 crt_time | timestamp without time zone |           | not null |         | plain    |              |
Check constraints:
    "pathman_part_test_2_check" CHECK (crt_time >= '2020-10-01 00:00:00'::timestamp without time zone AND crt_time < '2020-11-01 00:00:00'::timestamp without time zone)
Inherits: part_test
Access method: heap

## 子分区表合并后,数据自动迁移到分区表1中
db1=# select count(*) from part_test_2;
 count
-------
   744
(1 row)

文档参考
pathman工具基本命令与示例: https://www.alibabacloud.com/help/zh/doc-detail/140900.htm?spm=a2c63.p38356.a1.2.2865177fPTXTBA
德哥博客对pathman工具的介绍: https://github.com/digoal/blog/blob/master/201610/20161024_01.md
pathman github官方文档: https://github.com/postgrespro/pg_pathman?spm=a2c6h.12873639.0.0.2b50419eBaxd7C
PG 12 分区表性能提升: https://github.com/digoal/blog/blob/master/201903/20190331_01.md

相关标签: PG 数据库