High availability overview

This strategy guide provides technical guidance and best practices for designing and deploying highly available (HA) workloads to a Google Distributed Cloud (GDC) air-gapped universe configured with multiple zones, or multi-zone. The guide outlines key architectural patterns, service configurations, and operational considerations necessary to minimize downtime and ensure business continuity for applications running on GDC.

The high availability strategies in this guide are intended for technical professionals involved in designing, deploying, and managing applications on GDC, including:

  • Cloud Architects: Designing resilient infrastructure and application architectures on GDC.

  • DevOps Engineers and Site Reliability Engineers (SREs): Implementing deployment strategies, automation, monitoring, and incident response for HA workloads.

  • Application Developers: Building applications that are fault-tolerant and integrate seamlessly with HA infrastructure patterns.

Importance of high availability

In modern distributed systems, ensuring high availability is critical. Downtime, whether planned or unplanned, can lead to significant business disruption, revenue loss, damage to reputation, and poor user experience. For workloads running at the edge or in private data centers using GDC, availability often correlates directly with core operational success, especially for latency-sensitive or mission-critical applications. Designing for HA from the outset is essential to build resilient and reliable services.

Hyperscale capabilities, delivered locally

GDC extends Google Cloud infrastructure and services to the edge and your data centers. GDC provides a fully managed hardware and software solution, letting you run Google Kubernetes Engine (GKE) on GDC clusters and other Google Cloud services closer to where your data is generated and consumed.

This guide focuses specifically on GDC universes configured in a multi-zone topology. With multi-zone, a single GDC universe comprises multiple, physically isolated zones within the same location, such as a data center campus or metropolitan area. These zones have independent power, cooling, and networking, providing protection against localized physical infrastructure failures. The low-latency, high-bandwidth network connectivity between zones within a GDC universe enables synchronous replication and rapid failover, forming the foundation for building highly available applications.

Scalability and load balancing

Beyond basic component redundancy, managing traffic effectively and enabling seamless scaling are crucial for maintaining high availability, especially with varying load conditions. GDC provides several mechanisms for load balancing and sophisticated traffic management.

External load balancer for north-south traffic

To expose your applications to users or systems outside a GKE on GDC cluster (north-south traffic), you use GDC's managed external load balancing capabilities. The external load balancer (ELB) service provides these capabilities and integrates seamlessly with Kubernetes.

The key characteristics of the ELB service that provide HA and scalability are the following:

  • Managed service: ELB is managed by GDC, designed for high availability and resilience.

  • External access: Provisions stable external IP addresses from GDC-managed pools, providing a consistent entry point for external clients.

  • Load balancer integration with Kubernetes: Automatically provisions and configures the load balancer when you create a Kubernetes Service of type: LoadBalancer without specific internal annotations.

  • Zone awareness: Distributes incoming traffic across healthy application pods running in all available zones within the GDC universe. The ELB relies on pod readiness probes to determine backend health.

  • Scalability: Handles distribution of external traffic as your application scales horizontally across nodes and zones.

Using an external load balancer is the standard and recommended way to achieve HA for incoming external traffic, ensuring that client requests are automatically routed away from failing zones or instances.
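
As an illustration, the following manifest sketches a Service of type: LoadBalancer with no internal annotation, which the ELB service can back with an external IP address. The name, namespace, selector, and ports are placeholders for your own application, and the annotations or IP pool options available depend on your GDC universe's networking configuration.

    apiVersion: v1
    kind: Service
    metadata:
      name: web-frontend-elb        # hypothetical Service name
      namespace: production         # hypothetical namespace
    spec:
      type: LoadBalancer            # no internal annotation, so an external load balancer is provisioned
      selector:
        app: web-frontend           # targets the application pods behind this Service
      ports:
      - name: https
        port: 443                   # port exposed on the external IP address
        targetPort: 8443            # container port on the backend pods

After the Service is created, the assigned external IP address appears in the Service status, for example in the output of kubectl get service, and traffic is distributed across ready pods in all zones of the universe.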

For more information, see Configure external load balancers.

Internal load balancer for east-west traffic

For communication between services running within the same GKE on GDC cluster (east-west traffic), GDC provides an internal load balancer (ILB). This is crucial for decoupling internal services and ensuring internal communication paths are also highly available and scalable.

The key characteristics of the ILB service that provide HA and scalability are the following:

  • Internal access: Provisions a stable internal IP address accessible only from within the GDC network, such as cluster nodes or other internal services.

  • Load balancer integration with Kubernetes: Typically provisioned by creating a Kubernetes Service of type: LoadBalancer with a specific annotation to indicate it must be internal. For example, networking.gke.io/load-balancer-type: "Internal".

  • Zone awareness: Distributes traffic across healthy backend pods, which are identified with readiness probes, located in all available zones. This distribution prevents internal communication failures if one zone experiences issues.

  • Service discovery and decoupling: Provides a stable internal IP address and DNS name through kube-dns and CoreDNS integration. Services can discover and communicate with each other, removing the need for clients to know individual pod IP addresses.

  • Scalability: Facilitates scaling of internal backend services by distributing traffic across all available healthy replicas.

Using ILB for internal service-to-service communication ensures that internal traffic flow is also resilient to zone failures and can scale effectively, complementing the HA provided by the external ELB and underlying compute distribution. This is often used for tiered applications where frontends must communicate with backend APIs or databases within the Kubernetes cluster.
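
As a minimal sketch of that pattern, the following manifest creates an internal Service for a backend API using the annotation referenced earlier. The name, namespace, selector, and ports are placeholders.

    apiVersion: v1
    kind: Service
    metadata:
      name: backend-api-ilb                                 # hypothetical Service name
      namespace: production                                 # hypothetical namespace
      annotations:
        networking.gke.io/load-balancer-type: "Internal"    # marks the load balancer as internal
    spec:
      type: LoadBalancer
      selector:
        app: backend-api              # targets the backend API pods
      ports:
      - name: api
        port: 8080                    # port exposed on the internal IP address
        targetPort: 8080              # container port on the backend pods

Frontend pods can then reach the backend through the Service's DNS name, for example backend-api-ilb.production.svc.cluster.local, rather than addressing individual pod IP addresses.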

For more information, see Configure internal load balancers.

Deploy HA apps across zones with asynchronous storage

GDC lets you run infrastructure and applications closer to your data sources or end users. Achieving HA in your GDC universe is crucial for critical workloads. You can deploy HA applications across multiple zones within your GDC universe, leveraging asynchronous storage replication for data persistence and disaster recovery.

Zones represent distinct failure domains within a single universe. By distributing application components and replicating data across zones, you can significantly improve resilience against localized hardware failures or maintenance events.
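
As a minimal sketch of zone-aware scheduling (assuming that worker nodes in your universe carry the standard topology.kubernetes.io/zone label), the following Deployment spreads replicas across zones and defines the readiness probe that the load balancers rely on for backend health. The names, image, replica count, and probe path are illustrative.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web-frontend              # hypothetical workload name
      namespace: production           # hypothetical namespace
    spec:
      replicas: 6
      selector:
        matchLabels:
          app: web-frontend
      template:
        metadata:
          labels:
            app: web-frontend
        spec:
          # Spread replicas evenly across zones so that losing one zone
          # cannot take down the whole tier.
          topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: topology.kubernetes.io/zone
            whenUnsatisfiable: DoNotSchedule
            labelSelector:
              matchLabels:
                app: web-frontend
          containers:
          - name: web-frontend
            image: registry.example.com/web-frontend:1.0.0   # placeholder image
            ports:
            - containerPort: 8443
            # Load balancers route traffic only to pods that pass this probe.
            readinessProbe:
              httpGet:
                path: /healthz        # illustrative health endpoint
                port: 8443
              initialDelaySeconds: 5
              periodSeconds: 10

Stateful components follow the same zone-spreading pattern, with their data protected by the asynchronous storage replication described in the scenarios that follow.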

For more information, see the following two scenarios to help you understand how to deploy a critical stateful service: