Clustering in PostgreSQL
Because one database server is never enough (and
neither is two)
PGConf Nepal
May 11-12, 2023
postgres=# select * from umair;
-[ RECORD 1 ]--------------------------------------------
name | Umair Shahid
description | 20 year veteran of the PostgreSQL community
company | Stormatics
designation | Founder
location | Islamabad, Pakistan
family | Mom, Wife & 2 kids
kid1 | Son, 16 year old
kid2 | Daughter, 13 year old
PostgreSQL Solutions for the Enterprise
Clustering & Analytics, backed by 24/7 Support and Professional Services
10 years of popularity
[Chart: ranking trends for PostgreSQL, Oracle, MySQL, and SQL Server]
As measured by: https://db-engines.com/en/ranking_trend
Loved by Developers
2022 Developer Survey: Most Loved, Most Used (Professional Developers), Most Wanted
On to the topic now!
What is High Availability?
● Remain operational even in the face of hardware or software failure
● Minimize downtime
● Essential for mission-critical applications that require 24/7 availability
● Measured in ‘Nines of Availability’
Nines of Availability
Availability             Downtime per year
90% (one nine)           36.53 days
99% (two nines)          3.65 days
99.9% (three nines)      8.77 hours
99.99% (four nines)      52.60 minutes
99.999% (five nines)     5.26 minutes
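The downtime figures follow directly from the availability percentage. The sketch below shows the arithmetic, assuming an average year of 365.25 days; the function name `downtime_hours_per_year` is illustrative and not part of the talk.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumption: average year length, leap years included


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in hours, that an availability target allows."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * HOURS_PER_YEAR


for target in (90, 99, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(target)
    print(f"{target}%: {hours / 24:.2f} days = {hours:.2f} hours = {hours * 60:.2f} minutes")
```

The same function can be applied to any other target, for example a service level agreement expressed as an uptime percentage.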
But my database resides in the cloud, and the cloud is always available
Right?
Wrong!
Amazon RDS Service Level Agreement
Multi-AZ configurations for MySQL, MariaDB, Oracle, and PostgreSQL are covered by the
Amazon RDS Service Level Agreement ("SLA"). The RDS SLA affirms that AWS will use
commercially reasonable efforts to make Multi-AZ instances of Amazon RDS available
with a Monthly Uptime Percentage of at least 99.95% during any monthly billing cycle. In
the event Amazon RDS does not meet the Monthly Uptime Percentage commitment,
affected customers will be eligible to receive a service credit.*
99.95% = three and a half nines = 4.38 hours of downtime per year!!!
* https://aws.amazon.com/rds/ha/
So - what do I do if I want better reliability for my mission-critical data?
Clustering!
What is clustering?
● Multiple database servers work together to provide redundancy
● Gives the appearance of a single database server
● Application communicates with the primary PostgreSQL instance
● Data is replicated to standby instances
● Auto failover in case the primary node goes down
[Diagram: Application, Primary, Standby 1, Standby 2; arrows labeled Write, Read, Replicate]
What is auto failover?
[Diagram: three numbered stages (1, 2, 3) showing the Application, Primary, Standby 1, Standby 2, and a New Standby]
- Primary node goes down
- Standby 1 gets promoted to Primary
- Standby 2 becomes subscriber to Standby 1
- New Standby is added to the cluster
- Application talks to the new Primary
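The failover sequence above can be modeled in a few lines. This is a toy sketch, not how any particular tool implements failover: the `Cluster` class, its `fail_over` method, and the node names are all hypothetical, and real failover also involves health detection, fencing, and client redirection.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    primary: str
    standbys: list[str] = field(default_factory=list)

    def fail_over(self, new_standby: str) -> None:
        """Promote the first standby and add a replacement standby."""
        if not self.standbys:
            raise RuntimeError("no standby available to promote")
        # The failed primary is dropped; the first standby takes over.
        self.primary = self.standbys.pop(0)
        # Remaining standbys are assumed to follow the new primary, and a
        # fresh standby restores the original level of redundancy.
        self.standbys.append(new_standby)


cluster = Cluster(primary="primary", standbys=["standby1", "standby2"])
cluster.fail_over(new_standby="new_standby")
print(cluster)
```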
Clusters with load balancing
● Write to the primary PostgreSQL instance and read from standbys
● Data redundancy through replication to two standbys
● Auto failover in case the primary node goes down
[Diagram: Application, Primary, Standby 1, Standby 2; arrows labeled Write, Read, Replicate]
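A minimal sketch of the read/write split described for this topology, assuming a non-empty list of standbys and a deliberately naive rule that only statements starting with SELECT count as reads. The `Router` class and the node names are hypothetical; real middleware parses statements far more carefully.

```python
from itertools import cycle


class Router:
    """Send writes to the primary and spread reads across the standbys."""

    def __init__(self, primary: str, standbys: list[str]):
        self.primary = primary
        self._standbys = cycle(standbys)  # round-robin over the standbys

    def route(self, sql: str) -> str:
        # Naive classification: anything that is not a SELECT goes to the primary.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._standbys) if is_read else self.primary


router = Router("primary", ["standby1", "standby2"])
queries = ("SELECT 1", "INSERT INTO t VALUES (1)", "SELECT 2", "SELECT 3")
targets = [router.route(q) for q in queries]
print(targets)
```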
Clusters with backups and disaster recovery
● Incremental backups
● Redundancy introduced from the primary as well as the standbys
● RTO and RPO balance achieved per organizational requirements
● Point-in-time recovery
[Diagram: Application, Primary, Standby 1, Standby 2; arrows labeled Write, Read, Replicate; three Incremental Backup boxes and a Backup box]
Multi-node clusters with Active-Active configuration*
● Shared-Everything architecture
● Load balancing for read as well as write operations
● Database redundancy to achieve high availability
● Asynchronous replication between nodes for better efficiency
*with conflict resolution at the application layer
[Diagram: Application, Active 1, Active 2, Active 3; arrows labeled Write, Read, Replicate]
Multi-node clusters with data sharding and horizontal scaling
● Shared-Nothing architecture
● Automatic data sharding based on defined criteria
● Read and write operations are auto directed to the relevant node
● Each node can have its own standbys for high availability
[Diagram: Application, Coordinator, Node 1, Node 2, Node 3; arrows labeled Write, Read]
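One way a coordinator could direct an operation to the relevant node is by hashing the sharding key. The sketch below assumes hash-based sharding, which is only one possible "defined criteria"; the node names and the `shard_for` function are hypothetical.

```python
import hashlib

NODES = ["node1", "node2", "node3"]  # hypothetical shard nodes


def shard_for(key: str, nodes: list[str] = NODES) -> str:
    """Pick the node that owns a key, using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


for key in ("customer-1", "customer-2", "customer-3"):
    print(key, "->", shard_for(key))
```

A stable hash (rather than Python's built-in `hash`, which is randomized per process for strings) keeps the same key on the same node across restarts. Note that simple modulo placement moves most keys when the node count changes.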
Globally distributed clusters
● Spin up clusters on the public cloud, private cloud, on-prem, bare metal, VMs, or a hybrid of all the above
● Geo fencing for regulatory compliance
● High availability across data centers and geographies
Replication - synchronous vs asynchronous
Synchronous
● Data is transferred immediately
● Transaction waits for confirmation from replica before it commits
● Ensures data consistency across all nodes
● Performance overhead caused by latency
● Used where data accuracy is critical, even at the expense of performance
Asynchronous
● Data may not be transferred immediately
● Transaction commits without waiting for confirmation from replica
● Data may be inconsistent across nodes
● Faster and more scalable
● Used where performance matters more than data accuracy
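The difference between the two modes can be illustrated with a toy in-memory model. This is a sketch only: the `Primary` and `Replica` classes are hypothetical and ignore networking, WAL, and failure handling. In PostgreSQL itself, the behavior is governed by settings such as `synchronous_standby_names` and `synchronous_commit`.

```python
class Replica:
    def __init__(self) -> None:
        self.log: list[str] = []


class Primary:
    def __init__(self, replicas: list[Replica], synchronous: bool) -> None:
        self.log: list[str] = []
        self.replicas = replicas
        self.synchronous = synchronous
        self.pending: list[str] = []  # changes not yet shipped (async mode only)

    def commit(self, change: str) -> None:
        self.log.append(change)
        if self.synchronous:
            # Synchronous: every replica applies the change before commit returns.
            for replica in self.replicas:
                replica.log.append(change)
        else:
            # Asynchronous: commit returns at once; replicas catch up later.
            self.pending.append(change)

    def ship_pending(self) -> None:
        """Deliver queued changes to the replicas (async catch-up)."""
        for change in self.pending:
            for replica in self.replicas:
                replica.log.append(change)
        self.pending.clear()


sync_replica = Replica()
sync_primary = Primary([sync_replica], synchronous=True)
sync_primary.commit("tx1")

async_replica = Replica()
async_primary = Primary([async_replica], synchronous=False)
async_primary.commit("tx1")
lagging = list(async_replica.log)  # replica has not seen tx1 yet
async_primary.ship_pending()
```

If the asynchronous primary failed before `ship_pending` ran, the committed change would exist only on the failed node, which is the data-loss window the slide refers to.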
Challenges in Clustering
● Split brain
● Network Latency
● False alarms
● Data inconsistency
#AI
Split Brain
● Defined: Nodes in a highly available cluster lose connectivity with each other but continue to function independently
● Challenge: More than one node believes that it is the primary, leading to inconsistencies and possible data loss
Split Brain - Prevention
● Quorum-based voting
○ A majority of nodes must agree on the primary
○ Cluster stops if quorum is not achieved
● Redundancy and failover
○ Prevents a single point of failure
● Proper configuration
● Split brain resolver
○ Software-based detection & resolution
○ Can shut down nodes in case of a split-brain scenario
● Network segmentation
○ Physical network separation
○ Avoid congestion
Split Brain - Resolution
● Witness node
○ Third node that acts as a tie-breaker
○ The winner acts as primary, the loser is shut down
● Consensus algorithm
○ Paxos and Raft protocols are popular
● Manual resolution
○ DBA observes which nodes are competing to be the primary
○ Takes action based on best practices learnt from experience
Network Latency
● Defined: Time delay when data is transmitted from one point to another
● Challenge: Delayed replication can result in data loss. Delayed signals can trigger a false positive.
Network Latency - Preventing False Positives
● Deploy nodes in proximity
○ Reduces time delay and network hops
● Implement quorum-based system
○ If quorum isn’t achieved, failover won’t occur
● High speed, low latency networking
○ High quality hardware and associated software
● Optimize database configuration
○ Parameter tuning based on workloads
○ max_connections, tcp_keepalives_idle, …
● Alerting system
○ Detect and alert admins of possible issues to preempt false positives
False Alarms
● Defined: A problem is reported, but in reality, there is no issue
● Challenge: Can trigger a failover when one isn’t required, leading to unnecessary disruptions and impacting performance
False Alarms - Prevention
● Proper configuration
○ Best practices, past experience, and some trial and error are required to ensure that the thresholds are configured appropriately
● Regular maintenance
○ Use the latest versions of software and firmware to avoid known bugs and exploits
● Regular testing
○ Testing of various use cases can help identify bottlenecks and possible misconfigurations
● Multiple monitoring tools
○ Multiple tools can help cross-reference alerts and confirm if failover is required
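Two of these ideas, a configured failure threshold and cross-referencing several monitors, can be sketched as follows. The class and function names are hypothetical, and the threshold of three consecutive failures is an arbitrary assumption that would need tuning per environment.

```python
class FailureDetector:
    """Report a failure only after several consecutive failed health checks."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, check_passed: bool) -> bool:
        """Record one health check; return True when failover looks warranted."""
        if check_passed:
            self.consecutive_failures = 0  # a single success resets the count
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold


def monitors_agree(reports_down: list[bool]) -> bool:
    """Cross-reference monitors: act only if a majority report the node down."""
    return sum(reports_down) > len(reports_down) / 2


detector = FailureDetector(threshold=3)
checks = (False, False, True, False, False, False)
decisions = [detector.record(passed) for passed in checks]
print(decisions)
```

A transient blip (two failures followed by a success) does not trigger failover here; only the final run of three consecutive failures does.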
False Alarms - Resolution
In case a false alarm is triggered …
● Check logs
○ Check the logs of cluster nodes, network devices, and other infrastructure components to identify any anomalies
● Notify stakeholders
○ Notify all stakeholders involved in the cluster’s operations to prevent any unnecessary action
● Monitor cluster health
○ Monitor the cluster’s health closely to ensure that it is functioning correctly and no further false alarms are triggered
Data Inconsistency
● Defined: Situations where data in different nodes of a cluster becomes out of sync, leading to inconsistent results and potential data corruption
● Challenge: Inaccurate query results that vary based on which node is queried. Such issues are very hard to debug.
Data Inconsistency - Prevention
● Synchronous replication
○ Ensures data is synchronized across all nodes before it is committed
○ Induces a performance overhead
● Load balancer
○ Ensure that all queries from a specific application are sent to the same node
○ Relies on eventual consistency of the cluster
● Monitoring tools
○ Help with early identification of possible issues so you can take preventive measures
● Maintenance windows
○ Minimize data disruption with planned downtime
● Regular testing
○ Test production use cases regularly to ensure the cluster is behaving as expected
Data Inconsistency - Resolution
● Resynchronization
○ Manual sync between nodes to correct data inconsistencies
● Rollback
○ Rewind a diverged node with pg_rewind, or run point-in-time recovery from backups with a tool like pgBackRest
● Monitoring
○ Diligent monitoring to ensure that the issue doesn’t recur
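Before resynchronizing, it helps to know which rows actually differ between nodes. The sketch below compares per-row checksums of two in-memory copies of a table; it assumes rows are keyed by an integer primary key, and the function names and sample data are hypothetical. Against real nodes, the row data would come from queries run on each server.

```python
import hashlib


def row_checksums(rows: dict[int, tuple]) -> dict[int, str]:
    """Map each primary key to a checksum of its row contents."""
    return {
        pk: hashlib.sha256(repr(row).encode("utf-8")).hexdigest()
        for pk, row in rows.items()
    }


def divergent_keys(node_a: dict[int, tuple], node_b: dict[int, tuple]) -> set[int]:
    """Return keys whose rows differ between nodes or exist on only one node."""
    sums_a, sums_b = row_checksums(node_a), row_checksums(node_b)
    return {
        pk
        for pk in sums_a.keys() | sums_b.keys()
        if sums_a.get(pk) != sums_b.get(pk)
    }


primary_rows = {1: ("alice", 10), 2: ("bob", 20), 3: ("carol", 30)}
standby_rows = {1: ("alice", 10), 2: ("bob", 25)}
out_of_sync = divergent_keys(primary_rows, standby_rows)
print(sorted(out_of_sync))
```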
This all sounds really hard
Open source clustering tools for PostgreSQL
● repmgr
○ https://repmgr.org/
○ GPL v3
○ Provides automatic failover
○ Manage and monitor replication
● pgpool-II
○ https://pgpool.net/
○ Similar to BSD & MIT
○ Middleware between PostgreSQL and client applications
○ Connection pooling, load balancing, caching, and automatic failover
● Patroni
○ https://patroni.readthedocs.io/en/latest/
○ MIT
○ Template for PostgreSQL high availability clusters
○ Automatic failover, configuration management, & cluster management
Questions?
pg_umair
