This post was originally published in March 2023 and was updated in March 2025.

If you’re using PostgreSQL in the cloud, there’s a good chance you’re spending more than you need to get the results your business requires. Effectively managing PostgreSQL costs in the cloud is crucial, and this post explores practical ways to achieve significant cost reductions.

Let’s take a look at how to get the benefits you need while spending less, based on the recommendations presented by Dani Guzmán Burgos, our Percona Monitoring and Management (PMM) Tech Lead, in this webinar (now available on demand), hosted in November last year.

Identify over-provisioning to reduce PostgreSQL cloud costs

The first step in cost reduction is simple: use what you need and not more. Don’t pay for capacity you don’t use.

Usage reduction is a continuous process. Identifying which resources you can trim to reduce your monthly bill can be difficult, but looking at the right metrics will help you understand your application’s actual requirements and keep your PostgreSQL costs in the cloud under control.

In the Home Dashboard of PMM, low CPU utilization on any of the monitored database services could mean that the server is inactive or over-provisioned. Marked in red in Figure 1 is a server with less than 30% CPU usage. PMM can also show you historical data that helps you identify how long a service has been in a given state. The configuration of the CPU metrics can be changed in the dashboard. These color-coded states on panels are available in PMM 2.32.0 and later.

Figure 1: PMM Home Dashboard, showing low CPU usage on a monitored service (a sign of potential over-provisioning)

From the Amazon Web Services (AWS) documentation, an instance is considered over-provisioned when at least one specification (like CPU, memory, or network) can be downsized while still meeting performance requirements, and no specification is under-provisioned. Over-provisioned instances directly lead to unnecessary infrastructure costs.
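If you want to confirm this signal outside of PMM, the same CPU utilization data is available from Amazon CloudWatch. Below is a minimal sketch, assuming Python with boto3; the instance ID and the 30% threshold are illustrative placeholders, not values from the webinar.

```python
from datetime import datetime, timedelta, timezone

import boto3

INSTANCE_ID = "i-0123456789abcdef0"  # hypothetical instance ID

cloudwatch = boto3.client("cloudwatch")

# Average CPU utilization, one data point per day, over the last two weeks.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": INSTANCE_ID}],
    StartTime=datetime.now(timezone.utc) - timedelta(days=14),
    EndTime=datetime.now(timezone.utc),
    Period=86400,
    Statistics=["Average"],
)

daily_averages = [point["Average"] for point in stats["Datapoints"]]
if daily_averages and max(daily_averages) < 30:
    print(f"{INSTANCE_ID}: average CPU stayed below 30%; candidate for downsizing.")
```

For RDS instances, the same metric lives in the AWS/RDS namespace with a DBInstanceIdentifier dimension.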

Making efficient use of resources and controlling your cloud budget isn’t a one-time fix; it’s a continuous cycle of choosing properly sized resources and eliminating over-provisioning.

Usage reduction at scale requires a cultural shift, where engineers consider cost alongside memory or bandwidth as another key deployment KPI.

Think of a gaming company: as a game gets popular, resource needs increase. But if popularity wanes, servers can become over-provisioned. Re-sizing allocated resources becomes essential to better fit the application’s current needs and control PostgreSQL costs in the cloud.

Right-sizing instances to cut PostgreSQL cloud expenses

There are three primary approaches to usage reduction:

  1. Reducing waste (eliminating unused resources)

  2. Re-architecting (potentially complex, longer-term changes)

  3. Re-sizing to exactly what you need (often the most immediate impact)

Regardless of your deployment method for PostgreSQL, monitor these key metrics to determine when re-sizing is needed (a scripted re-sizing sketch follows the list):

  • CPU utilization

  • Memory usage

  • Network throughput

  • Storage usage (IOPS and throughput)
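Once these metrics show sustained headroom, re-sizing an EC2 instance is a stop, modify, start cycle that can be scripted. This is a rough sketch assuming boto3, with a hypothetical instance ID and target type; for an RDS-managed instance you would call modify_db_instance with a new DBInstanceClass instead.

```python
import boto3

ec2 = boto3.client("ec2")

INSTANCE_ID = "i-0123456789abcdef0"  # hypothetical instance ID
TARGET_TYPE = "m6g.large"            # pick the target from your metrics review

# An EC2 instance must be stopped before its type can be changed.
ec2.stop_instances(InstanceIds=[INSTANCE_ID])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[INSTANCE_ID])

# Apply the smaller (or ARM-based) instance type, then start it again.
ec2.modify_instance_attribute(
    InstanceId=INSTANCE_ID,
    InstanceType={"Value": TARGET_TYPE},
)
ec2.start_instances(InstanceIds=[INSTANCE_ID])
```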

Remember, infrastructure optimization aims for more than just cost savings. You must ensure operations aren’t negatively impacted when making decisions based on metrics. The primary goal remains ensuring services have the necessary operating capacity.

Optimizing specific cloud resources for PostgreSQL savings

Let’s dive into specific components, using AWS as a common example platform. Your configuration choices heavily influence both performance and your monthly bill when managing PostgreSQL costs in the cloud.

Choosing cost-effective cloud CPUs (like AWS Graviton)

Considering AWS as your cloud platform of choice, an Amazon Elastic Compute Cloud (EC2) instance with Graviton2 processors is often a better choice than comparable non-ARM options: it’s cheaper, and every vCPU is a physical core rather than a hyper-thread, which generally translates into faster, more predictable performance. Combining Graviton2 instances with Reserved Instances saves even more in the long run.

Benefits of Graviton2 Processors

  • Best price performance for a broad range of workloads
  • Extensive software support
  • Enhanced security for cloud applications
  • Available with managed AWS services
  • Best performance per watt of energy used in Amazon EC2

Selecting optimal cloud storage (AWS EBS examples)

Choosing the right storage is key to performance and cost-efficiency. For PostgreSQL on EC2, Amazon Elastic Block Store (EBS) is the standard block storage solution.

From AWS documentation, Amazon EBS is an easy-to-use, scalable, high-performance block-storage service designed for Amazon EC2.

Figure 2: Amazon Elastic Block Store (EBS), a high-performance block storage service suitable for PostgreSQL databases in the cloud

AWS documentation confirms EBS is designed for demanding use cases like relational databases (PostgreSQL, MySQL, SQL Server, Oracle) and NoSQL databases.

You can choose HDD-based volumes (like st1, sc1 – generally not ideal for primary database storage) or SSD-based volumes recommended for database workloads:

  • io1
  • io2
  • io2 Block Express
  • gp2
  • gp3

Which SSD type is best? It depends on your specific workload requirements (disk space, IOPS, and throughput) and your budget. gp3 volumes are often a great starting point: they offer better baseline performance and independent scaling of IOPS and throughput compared to gp2, frequently resulting in lower PostgreSQL costs in the cloud for similar performance. io1 and io2 are for workloads that need sustained high IOPS.
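As a concrete example, an existing gp2 volume can be migrated to gp3 online. The following is a minimal sketch assuming boto3; the volume ID is a placeholder, and the IOPS and throughput values shown are simply the gp3 baselines, to be sized against your own workload.

```python
import boto3

ec2 = boto3.client("ec2")

# Hypothetical volume ID; gp3 lets you set IOPS and throughput independently of size.
ec2.modify_volume(
    VolumeId="vol-0123456789abcdef0",
    VolumeType="gp3",
    Iops=3000,       # gp3 baseline; raise only if your workload needs more
    Throughput=125,  # MiB/s, gp3 baseline
)
```

The modification is applied while the volume remains attached and in use.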

Avi Drabkin’s blog post is recommended reading on configuring EBS volumes. For detailed specs, see the official Amazon EBS Volume Types page.

Multi-AZ vs. read replicas: Balancing PostgreSQL cost and high availability

Multi-AZ deployment

In an Amazon RDS Multi-AZ deployment, Amazon RDS automatically creates a primary database (DB) instance and synchronously replicates the data to an instance in a different AZ. Amazon RDS automatically fails over to a standby instance without manual intervention when it detects a failure.

Figure 3: Amazon RDS Multi-AZ deployment for PostgreSQL high availability

Read replica

Amazon RDS creates a second DB instance using a snapshot of the source DB instance. It then uses the engine’s native asynchronous replication to update the read replica whenever there is a change to the source DB instance. The read replica operates as a DB instance that allows only read-only connections; applications can connect to a read replica just as they would to any DB instance. Amazon RDS replicates all databases in the source DB instance.

Figure 4: Amazon RDS read replicas for scaling PostgreSQL read traffic

Which option is better for costs?

Multi-AZ deployments offer robust HA but are significantly more expensive, as you pay for a standby instance that doesn’t serve read traffic.

A more cost-effective option can be to deploy read replicas and put them behind a reverse proxy, like Pgpool-II or PgBouncer. The read replicas still cost more than a single-instance setup, but unlike a Multi-AZ standby, they serve everyday production read traffic.
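For reference, creating an RDS PostgreSQL read replica can be scripted as well. This is a minimal sketch assuming boto3, with hypothetical identifiers and an illustrative instance class.

```python
import boto3

rds = boto3.client("rds")

# Hypothetical identifiers; most settings are inherited from the source instance.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="pg-prod-replica-1",
    SourceDBInstanceIdentifier="pg-prod",
    DBInstanceClass="db.r6g.large",
    PubliclyAccessible=False,
)
```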

Pgpool-II not only pools connections, which helps reduce CPU and memory usage, but can also do load balancing. With load balancing, it automatically redistributes traffic, sending read requests to your read replicas and write requests to your primary database instance.

Regarding read replicas in AWS, you cannot promote an RDS PostgreSQL read replica to take over as the primary of the same cluster. When you promote one, it detaches from the source instance and becomes its own standalone primary, so you end up with two separate clusters.

One solution is using the pglogical extension to create replicas outside the RDS replication path. When you combine pglogical replication with a reverse proxy, you still get the benefits of a managed database (backups, minor version upgrades, recovery support, and the Multi-AZ configuration) while gaining full control over planned failovers.
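To make this concrete, a pglogical setup boils down to creating a provider node on the source, adding tables to a replication set, and creating a subscription on the target. The sketch below assumes Python with psycopg2 and uses hypothetical connection strings; it also assumes the schema already exists on the subscriber and that the pglogical extension has been enabled (on RDS, via the parameter group).

```python
import psycopg2

# Hypothetical connection strings.
PROVIDER_DSN = "host=pg-prod.example.com dbname=appdb user=replicator password=secret"
SUBSCRIBER_DSN = "host=pg-target.example.com dbname=appdb user=replicator password=secret"

# On the source (provider) instance.
with psycopg2.connect(PROVIDER_DSN) as conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS pglogical;")
    cur.execute(
        "SELECT pglogical.create_node(node_name := 'provider', dsn := %s);",
        (PROVIDER_DSN,),
    )
    # Replicate every table in the public schema.
    cur.execute(
        "SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);"
    )

# On the target (subscriber) instance, which must already have the schema in place.
with psycopg2.connect(SUBSCRIBER_DSN) as conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS pglogical;")
    cur.execute(
        "SELECT pglogical.create_node(node_name := 'subscriber', dsn := %s);",
        (SUBSCRIBER_DSN,),
    )
    cur.execute(
        "SELECT pglogical.create_subscription(subscription_name := 'sub1', provider_dsn := %s);",
        (PROVIDER_DSN,),
    )
```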

Converting a replica to the primary instance is also a better approach for upgrades. For example, upgrading a database with a large amount of data in place can take hours, during which your instance won’t be available. With this configuration, you can upgrade a replica first and later convert it to the primary instance without interrupting operations.

Check this blog post for more information on how to use pglogical for upgrading your database instances.

Managing PostgreSQL vacuum processes for cost efficiency

As explained here, PostgreSQL tables and indexes experience bloating. UPDATE and DELETE operations mark old row versions as dead, but their space isn’t immediately reclaimed for reuse. This unused space is bloat.

How do you remove bloat? That’s what the vacuum process is for, with the help of autovacuum and vacuumdb.

Autovacuum is a daemon that automates the execution of VACUUM and ANALYZE (to gather statistics) commands. Autovacuum checks for bloated tables in the database and reclaims the space for reuse.

vacuumdb is a utility for cleaning a PostgreSQL database. vacuumdb will also generate internal statistics used by the PostgreSQL query optimizer. vacuumdb is a wrapper around the SQL command VACUUM.
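Autovacuum can also be tuned per table rather than only globally; for a very large table, lowering the scale factor makes autovacuum trigger before dead tuples pile up. A minimal sketch assuming psycopg2, with a hypothetical table name and illustrative thresholds:

```python
import psycopg2

with psycopg2.connect("dbname=appdb user=postgres") as conn, conn.cursor() as cur:
    # Hypothetical table name. Trigger autovacuum after ~1% of rows are dead
    # (the default is 20%), so a huge table is vacuumed in smaller, cheaper passes.
    cur.execute("""
        ALTER TABLE big_events SET (
            autovacuum_vacuum_scale_factor = 0.01,
            autovacuum_vacuum_threshold = 1000
        );
    """)
```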

Running autovacuum on a database whose tables store tens of terabytes can create significant overhead, not only because of the sheer volume of data but also because of the dead tuples generated by every transaction. One solution could be switching to an io1 EBS volume that can sustain this workload, but that would also increase your monthly bill.

A potentially more cost-effective alternative in such scenarios is running VACUUM on demand during low-traffic periods. Disabling Autovacuum doesn’t mean never vacuuming; it means taking manual control. The performance overhead of carrying some dead tuples during peak hours might be cheaper than paying for constant high I/O capacity just for aggressive autovacuuming. This strategy guarantees resources are available for primary operations when needed most.
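A rough sketch of that manual-control approach, assuming psycopg2 and a hypothetical large table; the VACUUM call would typically be fired by cron or a scheduler during the low-traffic window:

```python
import psycopg2

conn = psycopg2.connect("dbname=appdb user=postgres")  # hypothetical connection string
conn.autocommit = True  # VACUUM cannot run inside a transaction block

with conn.cursor() as cur:
    # Take manual control: stop autovacuum on this one very large table only.
    cur.execute("ALTER TABLE big_events SET (autovacuum_enabled = false);")

    # Run during a low-traffic window to reclaim dead-tuple space and refresh stats.
    cur.execute("VACUUM (ANALYZE) big_events;")

conn.close()
```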

Is this the best solution for every scenario? No, it will depend on the requirements of your database.

Monitoring your database for dead tuples (bloat) is also recommended. For this matter, you can use the Experimental PostgreSQL Vacuum Monitoring. This experimental dashboard is not part of PMM, but you can try it and provide feedback.
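If you prefer to check the same signal directly in the database, the statistics views already track dead tuples per table. A small sketch assuming psycopg2 and a hypothetical connection string:

```python
import psycopg2

QUERY = """
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
"""

with psycopg2.connect("dbname=appdb user=postgres") as conn, conn.cursor() as cur:
    cur.execute(QUERY)
    for relname, live, dead, last_autovacuum in cur.fetchall():
        print(f"{relname}: {dead} dead vs. {live} live tuples, last autovacuum: {last_autovacuum}")
```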

Serverless PostgreSQL: Understanding the Total Cost of Ownership (TCO)

Serverless database options (like Amazon Aurora Serverless, optionally fronted by RDS Proxy) promise a pay-for-what-you-use model, potentially minimizing costs from idle resources.

However, the move to serverless isn’t just about infrastructure savings. You must evaluate it through the lens of Total Cost of Ownership (TCO). TCO includes engineering time for potential application changes or migration efforts and the impact of time-to-market.

Serverless shifts responsibilities (server management, scaling, patching) to the cloud provider (AWS, GCP, Azure), freeing up developer and DevOps time to focus on application features. But the cost of people (engineering time for migration and adaptation) might outweigh infrastructure savings, especially when moving complex monolithic applications.

Carefully weigh the potential cost reduction against the effort and TCO involved in redesigning services for a serverless architecture before deciding if it aids in reducing PostgreSQL costs in the cloud for your specific situation.

Conclusion: Proactive management reduces PostgreSQL cloud costs

Knowing your project’s database requirements is fundamental to optimizing your PostgreSQL costs in the cloud. The configuration choices you make directly impact application performance and your monthly bill.

Resource needs often change. Continuously monitoring metrics like CPU usage, memory usage, IOPS, and storage capacity helps determine when re-sizing your infrastructure is necessary to align spending with actual usage. If database activity decreases, you might be paying for unneeded capacity. Adjusting your setup ensures you only pay for what you truly use.

Key takeaways for reducing PostgreSQL cloud costs:

  • Monitor usage: Use tools like PMM to spot over-provisioning (CPU, Memory).

  • Right-size: Regularly adjust instance types and storage based on utilization metrics.

  • Choose wisely: Select cost-effective CPU (e.g., Graviton) and Storage (e.g., gp3 vs io1/io2) options.

  • Balance HA & cost: Consider Read Replicas + Proxy as a potentially cheaper alternative to Multi-AZ for some needs.

  • Optimize vacuum: Tune autovacuum or consider strategic on-demand vacuuming for large/busy tables.

  • Evaluate serverless TCO: Look beyond infrastructure costs to include engineering effort when considering serverless.

By following the recommendations discussed (originally presented in this webinar), you can design and maintain a cost-optimized infrastructure for your PostgreSQL databases in the cloud without sacrificing performance.
