1. Introduction

This article is based on the latest webinar by Cloud Control Solutions on How to Deploy PostgreSQL HA on Kubernetes, which covers the importance of High Availability (HA) for software systems, the basics of PostgreSQL DBMS, and its deployment to containers and Kubernetes, AppZ PostgreSQL HA Architecture and its salient features. 

There are multiple tools and technologies available to deploy PostgreSQL in HA mode. But, AppZ PostgreSQL HA Stack is a unique solution developed by Cloud Control Solutions, which ensures 99.999 (Five Nines) availability.

2. What is High Availability (HA)?

Availability refers to the percentage of time a system is operational for use without any failures in a given time. A measure of availability is critical as every enterprise wants its system and services to be available 24/7 for its customers worldwide. Downtime can lead to huge losses for companies engaged in many critical operations. Availability is usually expressed in Nines (9). For example, five-nines availability means that the system is available 99.999% of the time, which is less than 6 minutes of downtime in a year. 

Achieving High Availability requires the underlying system to be designed to eliminate any single point of failure. Such HA designs ensure continuous operation or uptime for extended time periods. HA Architectures are built using redundant hardware, software, and network combinations, so that even if one component fails, the backup component will resume operations without downtime. In addition, HA Architectures ensures that the system is configured to withstand different loads and can recover failures with no or minimal downtimes.

High availability is essential for any enterprise that wants to protect its business against the risks caused by system outages and resultant revenue loss.

 Some of the best practices to be followed while designing and implementing HA systems are,

  • Data Replication, backups, and recovery
  • Clustering
  • Load Balancing
  • Failover designs
  • Geographic redundancy
  • Containers and Kubernetes 

3. PostgreSQL – The open source DBMS

PostgreSQL is one of the most commonly used free and open source object-relational database management systems (DBMS) used in many mission-critical applications by small, medium, and large enterprises. It is one of the most advanced, versatile, and well-documented database systems. PostgreSQL is ranked 4th in the overall DBMS ranking by DB-Engines in 2022. PostgreSQL was also ranked as the second most used and loved DBMS by developers in a survey conducted by stack overflow. 

PostgreSQL runs on different OS platforms, and it supports ANSI SQL, stored procedures, triggers, ACID, transactions, extension functions, replication, etc. It is highly reliable, offering robust performance. PostgreSQL has been supported by an active development community over the last 30 years.

4. Containers and Kubernetes

Containers offer compelling advantages over virtualization technologies. Containers are lightweight, resource-efficient, faster, portable, highly scalable, and easier to manage. Containers increase developer productivity and time to market. 

While containers are the way to bundle and run applications, Kubernetes is for managing these containers that run applications and ensuring that there is no down time. For example, when a container fails, Kubernetes detects it and starts another one. Kubernetes is one of the most commonly used orchestration platforms for managing containerized workloads and services. 

According to CNCF Annual Report 2021, container adoption and Kubernetes have truly gone mainstream – usage has risen across organizations globally, particularly in large businesses. According to CNCF’s respondents, 96% of organizations are either using or evaluating Kubernetes – a record high since our surveys began in 2016.

5. PostgreSQL HA Architectures

There are multiple PostgreSQL HA architectures available with varying levels of sophistication. 

5.1 Log Shipping Replication 

In this HA Architecture, one primary and one or more standby PostgreSQL servers work together in continuous archiving/recovery modes. Log files (WAL – Write Ahead Log files) from the primary are shipped to the secondary and restored there continuously. Log shipping is asynchronous, and there is a possibility of data loss when the primary fails. During the primary node failure, if any log files are left unshipped from primary to secondary, that data may get lost. Log Shipping architecture is simple to implement and less costly, but does not ensure data availability. 

5.2 Streaming Replication

PostgreSQL has built-in streaming replication support from version 9 onwards. In streaming replication, the secondary node/replica connects to the primary and continuously receives a stream of WAL records. In this case, replicas will be more up-to-date compared with Log Shipping Replication. Replica nodes act as a hot standby and can also be used to serve the users for their read requests. Streaming replication by default is asynchronous, and there are chances for data loss in case of a primary node failure. This is because the primary does not wait for a confirmation from the replica for it to commit a transaction. 

5.3 Synchronous Streaming Replication

PostgreSQL can be configured for synchronous streaming replication by making some replicas as a synchronous standby. In synchronous streaming architecture, the primary commits a transaction only after the replica acknowledges transaction persistence at its side.   Synchronous streaming replication ensures no data loss at the cost of higher user transaction delay/latency. 

Need More Features to be an HA System!

All the above architectures are still missing some of the basic components and features required to become a real HA environment, like 

  • A mechanism to detect primary failures
  • Failover mechanism (promoting one of the replicas as primary in case of a failure)
  • Switching of application to connect to the new primary node and switching back to the primary once it is recovered and 
  • Automation of the above steps.

Alternate PostgreSQL HA Architectures

There are some other commonly used architectures that implement the above HA features to some extent.

5.4 HA Using Patrioni 

Patroni is an open source template with tools to configure, deploy and operate a Postgres HA architecture. It has a Distributed Configuration System (DCS) and Patroni agents operating on the PostgreSQL nodes in the cluster. Patroni agent syncs up with the DCS system to notify any failures in the primary and then promotes any of the secondaries as primary.

5.5 HA Using pg_auto_failover 

Pg_auto_failover is another open-source PostgreSQL extension that configures the HA architecture by extending PostgreSQL capabilities. Each node runs a PostgreSQL extension called keeper, which manages the PostgreSQL service and a Monitor watching the health status of each of these nodes. The monitor is capable of doing the configurations, promotions, and demotions, in case of failures. It supports failure detection and auto-failover, but no query re-routing capabilities. 

6. AppZ PostgreSQL HA Stack

Based on our discussion so far, you may have found out that setting up of a highly available PostgreSQL DBMS system with Five Nines (99.999) availability is not an easy task. Based on the experience in working with PostgreSQL application deployments, management, and further research, Cloud Control Solutions released its latest AppZ PostgreSQL HA Stack in Feb 2022, addressing most of the pain points in setting up PostgreSQL DBMS in HA mode. 

Fig: AppZ PostgreSQL HA Architecture

AppZ PostgreSQL HA can support one primary and one or more secondary nodes as a Kubernetes cluster. Primary and secondaries are configured to have synchronous streaming replication. The application can do read/write to the primary whereas read-only from the secondaries, which are load-balanced. Pg_auto_failover (PostgreSQL extension) is used to set up a keeper on each node and the monitor on a separate node. Monitor and keeper are in sync to detect any failures in the primary, and once a failure is detected, the monitor will promote any of the secondaries to Primary and will allow read/write. PgConnector is deployed as HA Proxy, which can forward the traffic to primary/secondary nodes based on the request/traffic. 

6.1 Salient Features of AppZ PostgreSQL HA Stack

  • Synchronized Replication

          Primary and secondaries in Synchronous streaming replication to achieve zero data loss.

  • Auto-Failover

          PgMonitor to watch the primary and secondary to detect failures and establish switch over.

  • Separate RO & RW End Points

          PgConnector with separate end-points for Read-Only and Read/Write Access.

  • Multi-Hybrid Cloud Support

          Capable of running on AWS, Azure, GCP, or On-Prem.

  • Kubernetes First

          AppZ PostgreSQL HA is deployed to Kubernetes Clusters in containers.

  • Achieve Five 9s availability.

Interested to Know More?

Here is the link to a live demo of this game-changing stack on Kubernetes https://lnkd.in/eD9tEXat. This video covers a live webinar demonstrating the power of our PostgreSQL HA stack, which runs on any cloud or on-prem (Nutanix, VMWare, AWS, Azure, and Google).

For more information, get connected to us @: 

info@ecloudcontrol.com | www.ecloudcontrol.com 

About the Author

Dr. Anil Kumar

VP Engineering, Cloud Control
Founder | Architect | Consultant | Mentor | Advisor | Faculty

Solution Architect and IT Consultant with more than 25 years of IT Experience. Served in various roles with both national and international institutions. Expertise in working with both legacy and advanced technology stacks and business domains. 

One Comment

  1. […] services, and multi-cloud operations for more agility and resilience. In addition, containers and Kubernetes provide self-service infrastructures that allow enterprises to access resources whenever they need […]

Comments are closed.

About Cloud Control

Cloud Control simplifies cloud management with AppZ, DataZ, and ManageZ, optimizing operations, enhancing security, and accelerating time-to-market. We help businesses achieve cloud goals efficiently and reliably.

2024
GITEX
14-18 October

Dubai, UAE