PostgreSQL 16 brings Load Balancing Support in libpq-psql

Mydbops
May 7, 2023
10
Mins to Read
All

Get ready for PostgreSQL 16’s release and learn about the new Connection Load Balancing Support feature in libpq. Discover how it can help scale out read queries and improve PostgreSQL’s performance.

PostgreSQL 16

PostgreSQL is one of the most popular open-source relational database management systems (RDBMS) available today. The upcoming major release version of PostgreSQL, PostgreSQL 16, is expected to be released in the third quarter of 2023. One of the most anticipated features in this release is Connection Load Balancing Support in libpq, which will make it easier to scale out read queries in PostgreSQL.

PostgreSQL 15 is a previous major release of PostgreSQL that came with a host of new features and improvements, including improvements to partitioning, faster query execution, and enhanced security features. If you want to know more about these features, you can visit our Mydbops 13th Open Source Database Meetup.

Libpq and Load Balancing

Libpq is a programming library in the C language that allows applications to communicate with PostgreSQL databases. It is widely used to develop software that interacts with PostgreSQL. Popular tools like pgbench and psql use libpq to communicate with PostgreSQL.

To handle a large number of read queries, the most common method is to distribute the connections across multiple read replicas using a technique called load balancing. There are two main ways to achieve load balancing:

  • Load balancing can be done at the client level using JDBC, which is a Java-based technology for communicating with databases.
  • Load balancing can also be done through an intermediary load balancer such as pgpool or HAproxy. These tools sit between the client application and the database and distribute the requests across multiple read replicas.

PostgreSQL 16’s New Feature for Load Balancing Multiple Servers with libpq

The following commit enabled a similar feature(point 1) of JDBC to libpq as well.

Commit Message

To balance the load across multiple servers when using libpq, a new feature has been added that lets you specify a connection parameter called “load_balance_hosts”. By setting this parameter to “random”, libpq will randomly connect to different hosts and their associated IP addresses. This helps distribute the workload when there are multiple clients or frequent connection setups.

The randomization process involves two levels of shuffling.

  • First, the given hosts are shuffled randomly before they are resolved one by one.
  • Then, once a host’s IP addresses are resolved, they are reshuffled before the connections are attempted.

Jelte Fennema is the author of this feature, and it was reviewed by Aleksander Alekseev, Michael Banck, and Andrey Borodin. Further discussion can be found at

https://www.postgresql.org/message-id/flat/PR3PR83MB04768E2FF04818EEB2179949F7A69@PR3PR83MB0476.EURPRD83.prod.outlook.com

Starting from version 16, tools that utilize libpq, such as psql, now have support for load balancing at the connection level.

PostgreSQL Versions Prior to 16

Since the release of PostgreSQL 10, all client applications that rely on libpq now have the capability to accept multiple hosts.

 
/usr/local/pgsql/bin/psql 'host=192.168.33.11,192.168.33.12,192.168.33.13 port=5432,5433,5434 dbname=postgres user=mydbops' -c 'show port'
	

In this scenario, each host is accompanied by its corresponding port (e.g., 192.168.33.11:5432, 192.168.33.12:5432, 192.168.33.13:5432) and is attempted in the sequence specified until a successful connection is established.

New Enhanced Feature in PostgreSQL 16

In PostgreSQL 16, it’s possible to perform random load balancing of connections among the hosts specified in the connection string.

New Option and Environment Variables:

A new option has been added to support load balancing in the connection string: Load_balance_hosts.

This environment variable supports two values:

  • Random
  • Disable

In addition to the connection string, the environment variable PGLOADBALANCEHOST can also be used to determine the load balancing option.

load_balance_hosts=Disable:

An attempt is made to establish a successful connection using the host list provided in the specified order. Note that this approach does not involve any kind of load balancing.

Example:

 
/usr/local/pgsql/bin/psql 'host=192.168.33.11,192.168.33.12,192.168.33.13 port=5432,5433,5434 load_balance_hosts=disable  dbname=postgres user=mydops' -c 'show port'

/usr/local/pgsql/bin/psql 'host=192.168.33.11,192.168.33.12,192.168.33.13 port=5432,5433,5434  dbname=postgres user=mydbops' -c 'show port'
	

Both commands work the same way.

load_balance_hosts=Random:

This option attempts to connect to a randomly selected destination available in the connection string.

Demo:

Let’s take a look at a demonstration of this feature. I have three PostgreSQL instances running with the following details:

Case 1: Load_balance_hosts=disable

I attempted to establish connections with both of the active hosts provided in the connection string.

 
postgres@mydbops:/root$ /usr/local/pgsql/bin/psql 'host=192.168.33.11,192.168.33.12,192.168.33.13 port=5432,5433,5434  dbname=postgres user=mydops' -c "show port"
 port
------
 5432
(1 row)

postgres@mydbops:/root$ /usr/local/pgsql/bin/psql 'host=192.168.33.11,192.168.33.12,192.168.33.13 port=5432,5433,5434  dbname=postgres user=mydops' -c "show port"
 port
------
 5432
(1 row)
	

As expected, it only made a successful connection to the first available destination (192.168.33.11:5432).

Now that the instance running on 192.168.33.11:5432 has been shut down, let’s see what happens next.

 
postgres@mydbops:/root$ /usr/local/pgsql/bin/psql 'host=192.168.33.11,192.168.33.12,192.168.33.13 port=5432,5433,5434  dbname=postgres user=mydops' -c "show port"
 port
------
 5433
(1 row)
	

Now the connection is made to the next available destination, which is running on host 192.168.33.12 and port 5433, since the first destination (192.168.33.11:5432) specified in the connection string is not available.

Case 2: Load_balance_hosts=Random

Now the connections are randomly assigned to the available destinations, resulting in connection distribution across the destinations

 
postgres@mydbops:/root$ /usr/local/pgsql/bin/psql 'host=192.168.33.11,192.168.33.12,192.168.33.13 port=5432,5433,5434  dbname=postgres user=mydops load_balance_hosts=random' -c "show port"
 port
------
 5432
(1 row)

postgres@mydbops:/root$ /usr/local/pgsql/bin/psql 'host=192.168.33.11,192.168.33.12,192.168.33.13 port=5432,5433,5434  dbname=postgres user=mydops load_balance_hosts=random' -c "show port"
 port
------
 5432
(1 row)

postgres@mydbops:/root$ /usr/local/pgsql/bin/psql 'host=192.168.33.11,192.168.33.12,192.168.33.13 port=5432,5433,5434  dbname=postgres user=mydops load_balance_hosts=random' -c "show port"
 port
------
 5433
(1 row)
	

Once again, let’s shut down PostgreSQL on 192.168.33.11:5432. This will result in connections being made only to the available destinations.

 
postgres@mydbops:/root$ /usr/local/pgsql/bin/psql 'host=192.168.33.11,192.168.33.12,192.168.33.13 port=5432,5433,5434  dbname=postgres user=mydops load_balance_hosts=random' -c "show port"
 port
------
 5433
(1 row)

postgres@mydbops:/root$ /usr/local/pgsql/bin/psql 'host=192.168.33.11,192.168.33.12,192.168.33.13 port=5432,5433,5434  dbname=postgres user=mydops load_balance_hosts=random' -c "show port"
 port
------
 5433
(1 row)

postgres@mydbops:/root$ /usr/local/pgsql/bin/psql 'host=192.168.33.11,192.168.33.12,192.168.33.13 port=5432,5433,5434  dbname=postgres user=mydops load_balance_hosts=random' -c "show port"
 port
------
 5434
(1 row)
	

If you want a particular destination to be preferred (i.e., given more weight), you can specify the destination multiple times in the connection string.

To give more weightage to the host mydb_node1, follow this step.

 
/usr/local/pgsql/bin/psql 'host=mydb_node1,mydb_node1,mydb_node2 port=5432,5432,5433  dbname=postgres user=mydbops load_balance_hosts=random' -c "show port"
	

In addition, it is recommended to set the connect_timeout as well. Without setting the connect_timeout, if one of the connection destinations does not respond, the system will wait for a response instead of trying the next available destination.The addition of load balancing support in the libpq client library is a crucial step in improving the performance and availability of PostgreSQL databases.

With this new feature, developers and system administrators will be able to distribute database connections across multiple hosts, reducing the load on individual hosts (especially on replicas) and improving the overall system performance and availability.

Stay tuned for more exciting features in PostgreSQL.

We welcome you to explore our website for more informative blogs related to database consulting, support, and other technology-related topics, if you found this blog post useful. You may discover something that captures your interest and provides valuable insights and advice for your business or career. To gain further insights, you can also refer to our guide on PgPool – II Installation & Configuration.

If you require expert assistance with managing your database, do not hesitate to contact us today. Our team of skilled professionals can provide customized solutions to meet your specific needs, ensuring the security, optimization, and accessibility of your data at all times. Reach out to us now to learn more about how we can assist you in streamlining your operations and maximizing the value of your MySQL Performance and Operations.

No items found.

About the Author

Mydbops

Subscribe Now!

Subscribe here to get exclusive updates on upcoming webinars, meetups, and to receive instant updates on new database technologies.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.