SAP S/4HANA Journey on Azure - Architecture, Scaling & Migration

| Abbas Ali Mir | Momin Qureshi |



Episode #271

Introduction

In episode 271 of our SAP on Azure video podcast we talk about very large HANA environments on Azure.

Already in the early days of running SAP on Azure, we had customers with large HANA environments. In the beginning, that meant 2 or 4 terabyte systems. We have always supported the largest environments, and today we want to talk about a recent S/4HANA scale-out environment with 24 TB VMs. This project shows that the growth of even complex HANA architectures, from 3 TB to 6 TB, 12 TB, 24 TB, or even 32 TB, is something that we support on Azure. To share more about this amazing journey, I am happy to welcome Abbas and Momin on our show today.

https://bestattungen-lindebaum.gemeinsam-trauern.net/Begleiten/ralf-udo-klahr

Find all the links mentioned here: https://www.saponazurepodcast.de/episode271

Reach out to us for any feedback / questions:

#Microsoft #SAP #Azure #SAPonAzure #HighAvailability #ScaleOut

Summary created by AI

  • Tribute to Ralf Klahr and Recipe Tradition:
  • Goran, Holger, Momin, and Abbas Ali began the episode by honoring their late colleague Ralf Klahr, sharing memories and continuing his tradition of exchanging recipes, with Momin presenting a quick tuna sandwich recipe as a tribute.
    • Honoring Ralf Klahr: Goran announced the passing of Ralf Klahr, a former guest and valued colleague, and the team reflected on his positive impact, unique presentation style, and love for sharing recipes, expressing their sadness and appreciation for his legacy.
    • Recipe Sharing Tradition: Momin continued Ralf’s tradition by sharing a practical tuna sandwich recipe designed for quick family meals, describing its ingredients and preparation, and noting its effectiveness in calming ‘angry customers’ at home.
  • Customer Case Study: SAP S/4HANA Scale-Up to Scale-Out on Azure:
  • Abbas Ali, Momin, Goran, and Holger discussed a Fortune 50 consumer packaged goods customer’s journey with SAP S/4HANA on Azure, detailing their database growth, scale-up limitations, and strategic migration to a scale-out architecture to support expanding workloads.
    • Database Growth and Scale-Up Limits: Abbas Ali described the customer’s S/4HANA deployment history, starting with a 3TB VM in 2020 and scaling up to 32TB VMs by 2024, noting a rapid growth rate of 1TB per month and the impending need to exceed the 32TB scale-up limit.
    • Strategic Decision to Scale-Out: Facing scale-up constraints, the customer decided to transition to a scale-out architecture, deploying two 24TB Mv3 VMs and conducting extensive testing over six months, with plans to go live in February 2026.
    • Migration Strategy Overview: Momin outlined the migration approach, involving replication from the source system to the scale-out environment, a planned downtime for cutover, redistribution of tables across nodes, and validation before switching the load balancer and enabling replication for high availability and disaster recovery.
  • SAP S/4HANA Scale-Out Architecture: Compute, Network, and Storage Design:
  • Abbas Ali, Momin, and Goran provided a detailed technical overview of the scale-out architecture, covering VM sizing, cluster pinning, network topology, storage partitioning, and naming conventions to optimize performance and manageability for large SAP workloads on Azure.
    • Compute Design and Cluster Pinning: Abbas Ali explained the use of Mv3 VMs for both scale-up and scale-out, emphasizing the need to pin coordinator and worker nodes to the same compute cluster within a data center for optimal latency, achieved through backend engineering and availability set configuration.
    • Network Topology and Load Balancing: The team described the use of Azure Load Balancer for zone-redundant traffic management, the separation of client, internode, HSR, and storage virtual NICs, and the implementation of active read replicas to distribute workload and reduce pressure on the primary site.
    • Storage Architecture and Partitioning: Momin detailed the storage setup, including managed disks with write accelerator for HANA logs and multiple ANF volumes for HANA data to distribute I/O and avoid throughput bottlenecks, as well as strategies for subnet IP management using snapshots and cloning.
    • Naming Conventions for ANF Volumes: Momin and Abbas Ali stressed the importance of standardized naming conventions for ANF volumes, enabling administrators to quickly identify volume purpose, node association, and HA status, which is critical for complex scale-out environments.
  • High Availability and Disaster Recovery Architecture for SAP S/4HANA Scale-Out:
  • Abbas Ali and Momin described the high availability and disaster recovery setup for the scale-out architecture, including Pacemaker cluster design, failover mechanisms, HANA fast restart challenges, and replication strategies to ensure business continuity and minimal downtime.
    • Pacemaker Cluster and Majority Maker Node: Abbas Ali explained the 2+2 database node architecture, the need for a fifth ‘majority maker’ node to enable quorum and automatic failover, and the use of Pacemaker clusters spread across availability zones for robust HA.
    • HANA Fast Restart and Failover Optimization: The team discussed the customer’s use of HANA fast restart, the resulting challenges with defunct processes during failover, and the solution of switching Pacemaker’s action on fail from ‘kill’ to ‘fence’ to expedite memory clearing and reduce recovery time.
    • Disaster Recovery Strategy and RPO/RTO: Momin described the DR setup using HANA native replication for databases, ASR for app servers, and third-party solutions for ANF volume replication, achieving RPOs of 0-30 minutes and RTOs of 30 minutes to 4 hours, with future plans to adopt ANF ZRS for zero RPO.
  • Live Demonstration of SAP S/4HANA Scale-Out Failover:
  • Abbas Ali led a live demo of the scale-out environment, showing the cluster configuration, network interface mapping, and a simulated failover process, with Goran and Momin observing the transition of primary and worker nodes and validation of system health post-failover.
    • Cluster Configuration and Node Roles: Abbas Ali presented the demo environment, highlighting the five-node cluster setup, the roles of coordinator, worker, and majority maker nodes, and the mapping of client, internode, HSR, and storage NICs.
    • Failover Execution and Validation: During the demo, Abbas Ali triggered a failover by simulating an index server failure, observed the automatic promotion of secondary nodes, and validated the successful transition and system health using cluster monitoring commands.
    • Configuration File Insights: Abbas Ali reviewed key configuration files, such as global.ini, to show correct network interface assignments and resource agent settings, demonstrating the technical details that support reliable failover and recovery.
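
To illustrate why the 2+2 database node layout described above needs a fifth majority maker node, here is a minimal sketch of the quorum arithmetic. It assumes one vote per cluster node and a strict-majority rule; the function name `has_quorum` is hypothetical and is not taken from Pacemaker or from the episode.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """Return True if a partition holds a strict majority of all cluster votes.

    Assumption: every node contributes exactly one vote.
    """
    return 2 * reachable_votes > total_votes


# 2+2 database nodes only: if the two sites lose contact,
# each side sees 2 of 4 votes, so neither side has a majority.
split_without_majority_maker = has_quorum(reachable_votes=2, total_votes=4)

# 2+2 database nodes plus one majority maker: the site that can still
# reach the majority maker sees 3 of 5 votes and keeps quorum,
# while the isolated site sees only 2 of 5 and does not.
site_with_majority_maker = has_quorum(reachable_votes=3, total_votes=5)
isolated_site = has_quorum(reachable_votes=2, total_votes=5)

print(split_without_majority_maker, site_with_majority_maker, isolated_site)
```

With an even number of votes, a clean split between the two sites leaves the cluster unable to decide which side should continue. The odd fifth vote breaks the tie, which is what allows automatic failover to proceed.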