Cloud Portability: How Platform Engineering Pushes Past Toil
In our previous discussions on platform engineering, we covered the transition to the field, the motivations behind it and its likely trajectory. We discussed how adopting platform engineering makes tech organizations more adaptable to changes in business direction.
One such extreme change is moving between cloud providers, which is no longer uncommon in business today but has a significant effect on developer experience. Most companies start their journey with a single cloud provider, embracing the cloud native functionalities that these services offer. They build expertise, write automations and leverage as much of the cloud as they can.
But what happens when a business becomes too intertwined with one provider’s services? We’ve seen it firsthand: vendor lock-in. This is a significant concern for businesses that need the flexibility to switch or interoperate between providers due to various factors, from customer preferences and regional market conditions to data sovereignty laws and pricing.
This article delves into case studies, outlines challenges and offers an approach to cloud portability that is practical and minimizes toil.
The Hurdles of Cloud Portability
As businesses scale, moving across cloud providers (cloud portability, or interoperability) becomes tempting but is fraught with hurdles. Let’s shed light on what goes on inside the tech discussion rooms once such a decision is made.
- Obscured documentation: First, it must be determined what is cloud portable and what is not. This becomes a large exercise because, in our experience, most companies keep their system architecture in documents that are already obsolete. Traditional automation in code and infrastructure pipelines also falls flat: Environments are rarely recreated, so the pipelines cannot be trusted as a source of truth.
- Skill gap: Next, the platform engineers and even developers who have spent years building expertise on the primary cloud now have to acquire expertise on the new cloud, understanding both the parity and the finer differences. The time and effort spent becoming acquainted with new tools and conventions can detract from the team’s core operational focus, resulting in potential setbacks. Furthermore, there is a high chance that this skill gap will lead to suboptimal cloud environments when the migration happens.
- Automation rewrites: Simply put, “the nuts and bolts have to match the machinery.” Because cloud features only partly overlap, automation originally tailored for one cloud environment needs to be overhauled to be compatible with the new one.
- Development interruptions: Migrations usually run for long periods while development teams keep enhancing existing workloads, which means automation teams have to constantly catch up. To keep the migration smooth, ongoing development might be halted, causing potential project delays.
- Cross-cloud environment drifts: Over the course of an application’s life, environments drift. During the transition period, when environments are spread across different clouds, the chance of drift is even higher, and it shows up as inconsistencies between the origin and destination environments.
- Retraining overhead: Developers need to train on a new set of tools and best practices. This can temporarily dent the team’s productivity and elongate the adaptation phase.
Overcoming Hurdles: Dynamic Cloud Interoperability
Many of the above challenges have led us to a key design principle at Facets, a platform for platform engineers.
Documentation of architecture should not be an afterthought; it should be the source of truth that drives automation. It can be built in layers, starting with how developers view their architecture, devoid of cloud details, so that architectural intent is separated from cloud implementation.
This is where our concept of “Dynamic Cloud Interoperability” (DCI) comes into play. DCI is our answer to the traditional narrative around cloud agnosticism. It involves developing an abstraction layer that allows businesses to employ the same infrastructure setup across different cloud providers like Amazon Web Services (AWS), Azure and Google Cloud Platform (GCP) without altering their applications. This means a service like Amazon RDS can transition smoothly to Cloud SQL in GCP or a flexible server in Azure, with no changes to the application.
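To make the idea of an abstraction layer concrete, here is a minimal sketch in Python. It is not Facets’ implementation: the `DatabaseIntent` class, the `MANAGED_DB_SERVICE` table and the `resolve` function are hypothetical names chosen for illustration. The only point is that the application declares what it needs once, and a separate layer decides which provider service satisfies it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseIntent:
    """Hypothetical cloud-agnostic description of what the application needs."""
    name: str
    engine: str      # e.g. "postgres"
    version: str
    storage_gb: int


# Illustrative lookup from (cloud, engine) to a managed service.
# The product names are real; the table itself is an assumption of this sketch.
MANAGED_DB_SERVICE = {
    ("aws", "postgres"): "Amazon RDS for PostgreSQL",
    ("gcp", "postgres"): "Cloud SQL for PostgreSQL",
    ("azure", "postgres"): "Azure Database for PostgreSQL flexible server",
}


def resolve(intent: DatabaseIntent, cloud: str) -> dict:
    """Translate a cloud-agnostic intent into a provider-specific plan."""
    key = (cloud, intent.engine)
    if key not in MANAGED_DB_SERVICE:
        raise ValueError(f"no mapping for engine {intent.engine!r} on {cloud!r}")
    return {
        "cloud": cloud,
        "service": MANAGED_DB_SERVICE[key],
        "name": intent.name,
        "version": intent.version,
        "storage_gb": intent.storage_gb,
    }


# The intent is written once; only the target cloud changes.
orders_db = DatabaseIntent(name="orders", engine="postgres", version="15", storage_gb=100)
for cloud in ("aws", "gcp", "azure"):
    plan = resolve(orders_db, cloud)
    print(f"{cloud}: {plan['service']} (version {plan['version']})")
```

In this sketch the application team only ever edits `orders_db`. Moving clouds means changing the `cloud` argument, which is the separation of intent from implementation described above.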
Here’s how DCI helps you address the aforementioned challenges:
- Obscured documentation: Ensure the architecture is documented in a cloud-agnostic manner as a prerequisite to cloud delivery, not as an afterthought. This not only clarifies the structure but also streamlines migration.
- Skill gap: Overlay the destination cloud best practices on the automations, which reduces the need to build expertise from scratch and provides a Day 1 optimized environment.
- Automation rewrites: Employ generative automations that adapt to the target cloud, eliminating manual rewrites during cloud transitions.
- Development interruptions: Implement continuous delivery that functions uniformly across different cloud environments. Your development doesn’t have to halt for migrations.
- Cross-cloud environment drifts: DCI ensures a drift-free continuous delivery system to maintain consistency and avoid incremental errors over time.
- Retraining overhead: With DCI, we adopt a single-pane-of-glass approach. This unified interface makes transitions smoother for developers, obviating the need for extensive retraining.
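The drift point can be illustrated with a small Python sketch. It assumes that both the declared architecture and each environment’s observed state are available as flat key-value settings. The `detect_drift` function and the sample environment names and values are hypothetical, not part of any Facets API.

```python
def detect_drift(declared: dict, actual: dict) -> dict:
    """Return {setting: (declared_value, actual_value)} for every mismatch.

    A setting that exists on only one side is reported with None on the other.
    """
    drift = {}
    for key in sorted(declared.keys() | actual.keys()):
        want = declared.get(key)
        have = actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


# One declared source of truth, checked against environments on two clouds.
declared = {"replicas": 3, "instance_size": "medium", "tls": True}
observed = {
    "aws-prod": {"replicas": 3, "instance_size": "medium", "tls": True},
    "gcp-prod": {"replicas": 2, "instance_size": "medium", "tls": True, "debug": True},
}

for env, actual in observed.items():
    drift = detect_drift(declared, actual)
    status = "in sync" if not drift else f"drifted: {drift}"
    print(f"{env}: {status}")
```

Because every environment is compared against the same declaration rather than against another environment, an origin and a destination cloud cannot quietly diverge from each other without one of them also diverging from the source of truth.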
DCI in Action: The GGX Story
One of the most vivid illustrations of this approach in action is our work with GGX, an NFT marketplace that provides a platform for trading digital player cards. GGX initially used AWS’s cloud native functionalities but needed to migrate to GCP. Challenges included:
- Limited GCP knowledge: GGX’s team, adept with AWS, had little experience with GCP, risking a halt in development to learn the new platform.
- Migration hurdles: GGX’s automations, customized for AWS, required modifications for GCP compatibility, a process rife with potential errors.
- Infrastructure drift: It was essential that the actual infrastructure configuration remained aligned with its intended design during migration.
Facets intervened, offering solutions:
- DCI-aided migration: Dynamic Cloud Interoperability bridged AWS and GCP, eliminating the need to overhaul GGX’s automations.
- Developer landing zone: The Facets developer landing zone keeps developers’ exposure to the change to a minimum, so they can be trained over time without affecting migration timelines.
- Infrastructure integrity: GGX maintained a consistent infrastructure state throughout the migration, thanks to the inherent cross-cloud orchestration guarantees.
GGX transitioned in 15 days instead of the projected 2 to 3 months, all while continuing their regular operations.
Crafting an Optimal Cloud Strategy
From our experience, an optimal cloud strategy involves using the best tools a cloud offers while staying flexible enough to use other cloud options when needed. To achieve this, businesses can use standardized cloud services, add protective layers, manage policies in one place and use automated deployment tools.
The move to the cloud offers businesses many potent tools. The key is striking a balance: Use the best of what a cloud provider presents while maintaining the agility to shift if required. A strategic, forward-thinking approach to cloud services lets companies build a strategy that’s both strong and adaptable.
This is part of a series on platform engineering. Read the entire series:
- Part 1: Evolving DevOps: Platform Engineering Takes Center Stage
- Part 2: The DevOps Future Is User-Centric Platform Engineering
- Part 3: Shaping DevOps with the Best of ‘By Audit’ and ‘By Design’
- Part 4: Cloud Portability: How Platform Engineering Pushes Past Toil
- Part 5: How Platform Engineering Can Help Keep Cloud Costs in Check
- Part 6: Making the Leap: Ops Roles Evolve into Platform Engineers
- Part 7: Platform Engineering, Yes/No? A Guide to Making the Call
- Part 8: Measuring Key KPIs and Platform Engineering Success
- Part 9: Bringing Harmony to Chaos: A Dive into Standardization
- Part 10: Platform Engineering — Navigating Today, Forecasting Tomorrow