Streamlined Partitioning in PostgreSQL 17: Merge & Split Commands

Mydbops
Oct 8, 2024
3 mins
Mins to Read
All

PostgreSQL 17 focuses on improving database performance and management. One of the standout improvements is the introduction of commands that streamline partition management, specifically the ability to merge or split partitions effortlessly. Learn more about PostgreSQL 17 features.

Partitioning helps improve database performance by breaking large tables into smaller, more manageable sections based on specific criteria, such as date or list values. Effective partition management is essential for optimizing query performance, managing data growth, and simplifying maintenance tasks. Read more on partitioning strategies in PostgreSQL. PostgreSQL 17's new partitioning features make this process more efficient, especially in dynamic environments where data structures frequently change. In case you're looking for a deeper dive into these features, check out our PostgreSQL 17 New Features Webinar Presentation or watch the full Webinar Video for a more detailed discussion.

In PostgreSQL 17, partition management has been significantly enhanced with the introduction of two powerful commands: MERGE PARTITIONS and SPLIT PARTITION. These commands simplify the process of managing partitions, making it easier to restructure data without extensive manual intervention or data migration.

Challenges of Partitioning Before PostgreSQL 17

Before the release of PostgreSQL 17, managing partitions presented several challenges, especially when dealing with dynamic changes in data structure.

Previously, partition management required significant manual effort. If users needed to merge or split partitions, they had to manually create new partition structures and migrate data between them. This process was time-consuming, error-prone, and required careful planning to avoid disruptions in the database.

Managing large tables with numerous partitions was a complex task. Any restructuring, such as combining multiple partitions or splitting one into smaller sections, involved intricate steps, increasing the complexity of managing data and maintaining performance. This often became a bottleneck, particularly for large datasets with frequent structural changes.

PostgreSQL 17 has streamlined these operations, reducing the complexity and manual intervention required for partition management.

Let’s explore the powerful new commands introduced in PostgreSQL 17 through practical, real-world examples to better understand their usage and impact.

Merging Partitions in PostgreSQL 17

In PostgreSQL 17, merging partitions is now a simpler task, allowing users to combine partitions into one, as needed. Let's take an example to demonstrate how this works in a real-world scenario.

Here, we have a table called company_employees partitioned by the department_id field using the LIST partitioning strategy.

pg17=# CREATE TABLE company_employees (
pg17(#   emp_id SERIAL,
pg17(#   emp_name VARCHAR(50),
pg17(#   department_id INT,
pg17(#   PRIMARY KEY (emp_id, department_id)
pg17(# ) PARTITION BY LIST (department_id);
CREATE TABLE
pg17=# 
pg17=# 
pg17=# \d+ company_employees
                                                              Partitioned table "public.company_employees"
    Column     |         Type          | Collation | Nullable |                     Default                       | Storage  | Compression | Stats target | Description 
---------------+-----------------------+-----------+----------+-------------------------------------------------+----------+-------------+--------------+-------------
 emp_id        | integer               |           | not null | nextval('company_employees_emp_id_seq'::regclass) | plain    |             |              | 
 emp_name      | character varying(50) |           |          |                                                   | extended |             |              | 
 department_id | integer               |           | not null |                                                   | plain    |             |              | 
Partition key: LIST (department_id)
Indexes:
    "company_employees_pkey" PRIMARY KEY, btree (emp_id, department_id)
Number of partitions: 0

Initially, we created two partitions: company_employees_hr for department_id 1 (HR) and company_employees_sales for department_id 2 (Sales). Now, we want to merge these two partitions into one since the HR and Sales departments have been restructured into a single department.

pg17=# CREATE TABLE company_employees_hr PARTITION OF company_employees FOR VALUES IN (1);
CREATE TABLE
pg17=# CREATE TABLE company_employees_sales PARTITION OF company_employees FOR VALUES IN (2);
CREATE TABLE
pg17=# 
pg17=# \d+ company_employees
                                                              Partitioned table "public.company_employees"
    Column     |         Type          | Collation | Nullable |                     Default                       | Storage  | Compression | Stats target | Description 
---------------+-----------------------+-----------+----------+-------------------------------------------------+----------+-------------+--------------+-------------
 emp_id        | integer               |           | not null | nextval('company_employees_emp_id_seq'::regclass) | plain    |             |              | 
 emp_name      | character varying(50) |           |          |                                                   | extended |             |              | 
 department_id | integer               |           | not null |                                                   | plain    |             |              | 
Partition key: LIST (department_id)
Indexes:
    "company_employees_pkey" PRIMARY KEY, btree (emp_id, department_id)
Partitions: company_employees_hr FOR VALUES IN (1),
            company_employees_sales FOR VALUES IN (2)

We use the MERGE PARTITIONS command to combine them.

pg17=# ALTER TABLE company_employees
pg17-# MERGE PARTITIONS (company_employees_hr, company_employees_sales)
pg17-# INTO company_employees_hr_sales;
ALTER TABLE
pg17=# \d+ company_employees
                                                              Partitioned table "public.company_employees"
    Column     |         Type          | Collation | Nullable |                     Default                       | Storage  | Compression | Stats target | Description 
---------------+-----------------------+-----------+----------+-------------------------------------------------+----------+-------------+--------------+-------------
 emp_id        | integer               |           | not null | nextval('company_employees_emp_id_seq'::regclass) | plain    |             |              | 
 emp_name      | character varying(50) |           |          |                                                   | extended |             |              | 
 department_id | integer               |           | not null |                                                   | plain    |             |              | 
Partition key: LIST (department_id)
Indexes:
    "company_employees_pkey" PRIMARY KEY, btree (emp_id, department_id)
Partitions: company_employees_hr_sales FOR VALUES IN (1, 2)
pg17=# 

After executing the MERGE command, the partitions company_employees_hr and company_employees_sales are combined into a new partition named company_employees_hr_sales. This new partition now contains all the data from the previously separate partitions.

With PostgreSQL 17, the ability to easily merge partitions simplifies administrative tasks, reduces manual migration efforts, and enhances overall data management flexibility.

Splitting Partitions in PostgreSQL 17

Now, let’s explore splitting a partition. Consider a table company_employees, which is partitioned by join_date using a RANGE partitioning strategy.

pg17=# CREATE TABLE company_employees (
  emp_id SERIAL,
  emp_name VARCHAR(50),
  join_date DATE,
  PRIMARY KEY (emp_id, join_date)  
) PARTITION BY RANGE (join_date);
CREATE TABLE
pg17=# 
pg17=# \d+ company_employees
                                                            Partitioned table "public.company_employees"
  Column   |         Type          | Collation | Nullable |                     Default                       | Storage  | Compression | Stats target | Description 
-----------+-----------------------+-----------+----------+-------------------------------------------------+----------+-------------+--------------+-------------
 emp_id    | integer               |           | not null | nextval('company_employees_emp_id_seq'::regclass) | plain    |             |              | 
 emp_name  | character varying(50) |           |          |                                                   | extended |             |              | 
 join_date | date                  |           | not null |                                                   | plain    |             |              | 
Partition key: RANGE (join_date)
Indexes:
    "company_employees_pkey" PRIMARY KEY, btree (emp_id, join_date)
Number of partitions: 0

The company_employees_2024 partition holds all employees who joined in the year 2024. 

pg17=# CREATE TABLE company_employees_2024 PARTITION OF company_employees
pg17-# FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE
pg17=# 
pg17=# \d+ company_employees
                                                            Partitioned table "public.company_employees"
  Column   |         Type          | Collation | Nullable |                     Default                       | Storage  | Compression | Stats target | Description 
-----------+-----------------------+-----------+----------+-------------------------------------------------+----------+-------------+--------------+-------------
 emp_id    | integer               |           | not null | nextval('company_employees_emp_id_seq'::regclass) | plain    |             |              | 
 emp_name  | character varying(50) |           |          |                                                   | extended |             |              | 
 join_date | date                  |           | not null |                                                   | plain    |             |              | 
Partition key: RANGE (join_date)
Indexes:
    "company_employees_pkey" PRIMARY KEY, btree (emp_id, join_date)
Partitions: company_employees_2024 FOR VALUES FROM ('2024-01-01') TO ('2025-01-01')

However, if we want to split this year-long partition into four quarterly partitions for more efficient data management, we can use the SPLIT PARTITION command introduced in PostgreSQL 17. This allows us to break the partition into smaller ranges based on quarters.

pg17=# ALTER TABLE company_employees
pg17-# SPLIT PARTITION company_employees_2024 INTO (
pg17(#   PARTITION company_employees_q1_2024 FOR VALUES FROM ('2024-01-01') TO ('2024-04-01'),
pg17(#   PARTITION company_employees_q2_2024 FOR VALUES FROM ('2024-04-01') TO ('2024-07-01'),
pg17(#   PARTITION company_employees_q3_2024 FOR VALUES FROM ('2024-07-01') TO ('2024-10-01'),
pg17(#   PARTITION company_employees_q4_2024 FOR VALUES FROM ('2024-10-01') TO ('2025-01-01')
pg17(# );
ALTER TABLE
pg17=# 
pg17=# \d+ company_employees
                                                            Partitioned table "public.company_employees"
  Column   |         Type          | Collation | Nullable |                     Default                       | Storage  | Compression | Stats target | Description 
-----------+-----------------------+-----------+----------+-------------------------------------------------+----------+-------------+--------------+-------------
 emp_id    | integer               |           | not null | nextval('company_employees_emp_id_seq'::regclass) | plain    |             |              | 
 emp_name  | character varying(50) |           |          |                                                   | extended |             |              | 
 join_date | date                  |           | not null |                                                   | plain    |             |              | 
Partition key: RANGE (join_date)
Indexes:
    "company_employees_pkey" PRIMARY KEY, btree (emp_id, join_date)
Partitions: company_employees_q1_2024 FOR VALUES FROM ('2024-01-01') TO ('2024-04-01'),
            company_employees_q2_2024 FOR VALUES FROM ('2024-04-01') TO ('2024-07-01'),
           company_employees_q3_2024 FOR VALUES FROM ('2024-07-01') TO ('2024-10-01'),
            company_employees_q4_2024 FOR VALUES FROM ('2024-10-01') TO ('2025-01-01')

After executing the SPLIT PARTITION command, the partition company_employees_2024 is divided into four partitions, each covering a quarter of the year 2024. This granular partitioning makes it easier to manage and query specific time ranges, allowing for more efficient data handling.

However, there is a downside: both splitting and merging partitions require an ACCESS EXCLUSIVE LOCK on the parent table. This lock blocks all other operations on the table during the process, which can temporarily impact database availability and performance.

Ready to optimize your PostgreSQL performance and ensure seamless partition management? Learn how our PostgreSQL Managed and Consulting Services can help you leverage the latest features in PostgreSQL 17. Contact us today for a consultation.

No items found.

About the Author

Mydbops

Subscribe Now!

Subscribe here to get exclusive updates on upcoming webinars, meetups, and to receive instant updates on new database technologies.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.