Resource management is one of the key thing in all RDBMS. DB Connections is one of them. In PostgreSQL, Allocating more connections without proper analysis of resources causes an overhead in Memory usage and the creation of an OS process for each connection will lead to a scheduling overhead in the time-sharing operating system. These things will cause a great impact on application performance and an increase in latency.
To overcome these situations, we can adapt connection pooling. Connection Pool is a cache of database connections so that connections can be reused when future requests to the database are required. The creation of a thread in Postgres requires many processes like opening a connection. user authentication and so on. Connection pooling will reduce these overheads. Also, if we reach the maximum connection limit, pooling will help us to avoid the end-user getting errors like max connections reached, instead it will keep on waiting until the connection is freed up in the database.
In this blog post, we will look at the most famous connection pooling tool for Postgres PGBouncer. At first, we will cover the configuration of PGBouncer followed by the performance improvements with and without PGBouncer.
PGBouncer Configuration:
- First I had installed the PostgreSQL server(version 12.1) in the testing instance with the specification of Centos 7 OS, 2 CPU core, 2 GB RAM and 100GB SSD disk. Installing Postgresql is not the scope of this blog. Please refer to this official Documentation.
2. Now PGBouncer can be installed by passing the below command in Centos 7
3. After the successful installation, we can start working on the configuration.
4. Edit the configuration file called pgbouncer.ini in the path /etc/pgbouncer/pgbouncer.ini. By default, it contains a lot. Just make sure you have modified the below details in it.
- * = host=localhost –>For which database you want to enable pooling. Here I have added the * for this to access all.
- Listen port and Listen Address –> Which IP address and port PGBouncer process listen to.
- Auth_type and admin_users –> Authentication type to be used – Here I have mentioned md5 which requires a password for client connections and also requires users.txt file which comes under auth_file(It contains username and password to connect to the DB) and admin user to connect to the DB
- Auth file, log and PID files –> Auth file is required for md5 authentication type which contains the username and its respective encrypted passwords. Keep the auth, log, PID files in the secure location
- Pool mode and default Pool size –> Based on the pool mode, the connection will be returned to the pool. It has three values
- Session – Connection back to the pool when client closes the session
- Transaction – Connection back to the pool when the transaction completes
- Statement – Connection back to the pool when the statement completes.
Here I leave it as the value default – session and also increased the pool size to 80.Okay, Now we need to make an entry in userlist.txt file with username and password to access the DB as pgbouncer is a third party tool which does not have access to these user credentials in the DB.
- Login the Postgres server as you usually do
- Execute this query in the psql prompt
- Add the entry in the username and password order of userlist.txt file.
Now, restart the PGBouncer to take effect and check the status too
That’s it. Now we can access the postgres server through PGbouncer just by changing the port to 6432 and also we can connect to the PGBouncer database and execute some useful commands for the pool (show stats, show pools and so on)
In applications, we just need to change the port of the connection pooling.
Performance Benchmarking Without and with PGBouncer:
Test 1:
To benchmark the postgresql server, I am using a utility called sysbench. We can install this utility by executing the following command
Created a test database in postgres server and populating the data in it
Data loading is done, now we will generate some loads without PGBouncer
Doing the same test with PGBouncer
From the above stats, we can observe that the tps value is greatly increased (2x) with pgbouncer when compared without pgbouncer and also latency is also greatly reduced by half.
The below are the test results for different concurrent connections value
Test 2:
I reduced the max_connections to 50, performed the same test.
Without PGbouncer, I got the max connection reached error
Executed the same command with PGbouncer. The load test is started without any errors
So far, we have seen the basic configuration and performance improvements with the usage of the PGBouncer. By tweaking even more config variables based on our needs in PGBouncer, we can get more performance improvements.