Why We Need an Inter-Cloud Data Standard
The cloud has completely changed the world of software. Everyone from startups to enterprises has access to vast amounts of compute hardware whenever they want it. Need to test the latest version of that innovative feature you were working on? Go ahead. Spin up a virtual machine and set it free. Does your company need to deploy a disaster recovery cluster? Sure, if you can foot the bill.
No longer are we limited to procuring expensive physical servers and maintaining on-premises data centers. Instead, we are free to experiment and innovate. Each cloud service provider has an array of tools and systems to help accelerate your modernization efforts. The customer is spoiled for choice.
Except that they aren’t. Sure, a new customer can weigh the fees and compare the offerings. But that might be their last opportunity, because once you check in, you can’t check out.
The Hotel California Problem
Data is the lifeblood of every app and product, as all software, at its core, is simply manipulating data to create an experience for the user. And thankfully, cloud providers will happily take your precious data and keep it safe and secure for you. The cost to upload your data is minimal. However, when you want to take your ever-growing database and move it somewhere else, you will be in for a surprise: the worst toll road in history. Heavy bills and slow speeds.
Is there a technical reason? Let’s break it down. Just like you and me, the big cloud providers pay for their internet infrastructure and networking, and those costs must be factored into their business model. Since they can’t simply be absorbed, providers subsidize imports (ingress) and tax exports (egress).
Additionally, the bandwidth of their networks is finite. It makes sense to deprioritize large data exports so that production application workloads are not affected.
Combine these factors and you can see why Amazon Web Services (AWS) offers a service where it sends a shipping container full of servers and data drives so you can migrate data to its services. It is often cheaper and faster to mail a hard drive than it is to download its contents from the web.
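The back-of-the-envelope arithmetic makes the point. Here is a minimal sketch; the dataset size, sustained link speed and courier transit time are illustrative assumptions, not AWS figures:

```python
# Rough comparison: downloading a large dataset vs. shipping physical drives.
# All figures below are illustrative assumptions for the sake of the estimate.

def transfer_days(terabytes: float, gbps: float) -> float:
    """Days needed to move `terabytes` of data over a sustained `gbps` link."""
    bits = terabytes * 1e12 * 8        # decimal terabytes -> bits
    seconds = bits / (gbps * 1e9)      # assumes fully saturated link, no overhead
    return seconds / 86400

# Moving 1 PB (1,000 TB) over a sustained 10 Gbps connection:
network_days = transfer_days(1000, 10)   # roughly nine days of saturated bandwidth

# Shipping the same petabyte on physical drives takes a few days in transit,
# regardless of dataset size -- and courier fees don't scale per gigabyte.
shipping_days = 3                        # assumed courier transit time

print(f"network: {network_days:.1f} days, shipping: ~{shipping_days} days")
```

Beyond roughly this scale, the truck wins, and the gap only widens as the dataset grows.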
It helps that all these factors align with the interests of the company. When a large data export is detected, it probably is a strong indicator that the customer wants to lower their fees or take their business elsewhere. It is not to the cloud provider’s benefit to make it easy for customers to move their data out.
Except that it is in the cloud provider’s interest. It’s in everyone’s interest.
It Really Matters
The situation is not unlike the recent revolution in the smart home industry. Since its inception, it has been a niche enthusiast hobby. But in 2023, it is poised to explode.
Amazon, Google and Apple have ruthlessly pursued this market for years, releasing products designed to coordinate your home. They have tried to sell the vision of a world where your doors are unlocked by Alexa, where Siri watches your cameras for intruders and where Google sets your air conditioning to the perfect temperature. But you were only allowed one assistant. Alexa, Siri or Google.
By design, there was no cross-compatibility; you had to go all in. This meant that companies that wanted to develop smart home products also had to choose an ecosystem, as developing a product that worked with and was certified for all three platforms was prohibitively expensive. Product boxes had a ridiculous number of logos on them indicating which systems they worked with and what protocol they operated on.
It was a minefield for consumers. The complexity of finding products that worked with your system was unbearably high and required serious planning and research. It was all too likely you would walk out of the shop with a product that wouldn’t integrate with your smart home assistant.
This is changing. In late 2022, products certified on the new industry standard, named Matter, started hitting shelves, and they work with all three ecosystems. No questions asked, and only one logo to look for. This reduces consumer complexity. It makes developing products easier, and it means that the smart home market can grow beyond a niche hobby for technology enthusiasts and into the mass market. By 2022, only 14% of households had experimented with smart technology. However, in the next four years, adoption of smart home technology is set to double, with another $100 billion of revenue being generated by the industry.
Bringing It Back to Cloud
We must look at it from the platform vendor’s perspective. Before Matter, users had to choose, and if they chose your ecosystem, it was a big win! Yet the story isn’t that simple, as the customers were left unfulfilled, limited to a small selection of products that they could use. Worse, the friction that this caused limited the size of the market and ensured that even if the vendor improved its offering, it was unlikely to cause a competitor’s customers to switch.
In this case, lock-in was so incredibly detrimental to the platform owners that all the players in the space acknowledged the existential threats to the budding market, driving traditionally bitter rivals to rethink, reorganize and build a new, open ecosystem.
The cloud service providers (CSPs) are in a similar position. The value proposition of the cloud was originally abundantly clear, and adoption exploded. Today, sentiment is shifting, and the cloud backlash has begun. After 10 years of growing cloud adoption, organizations are seeing their massive cloud bills continue to grow, with an expected $100 billion increase in spending in 2023 alone, while cloud lock-in limits their agility.
With so much friction in moving cloud data around, it might be easier for customers to never move data there and just manage the infrastructure themselves.
The value still exists for sporadic workloads, or for development and innovation, as purchasing and procurement are a pain for these sorts of use cases. Yet even these bleeding-edge use cases can be debilitated by lock-in. Consider that there may be a technology offered by AWS and another by Google Cloud that together could solve a key technical challenge a development team faces. This would be a win-win-win: both CSPs would gain valuable revenue, and the customer would be able to build their technology. Unfortunately, today this is impossible, as the data transfer costs make it unreasonably expensive.
There are second-order effects as well. Each CSP currently must invest in hundreds of varying services, because for each technology category, the provider must offer a solution to its locked-in customers. This spreads development thin, perhaps limiting the excellence of each individual service, since so many must be developed and supported. As thousands of employees are let go by Amazon (27,000), Google (12,000) and Microsoft (10,000), can these companies really keep up the pace? Wouldn’t quality and innovation go up if they could focus their efforts on their differentiators and best-in-class solutions? Customers could shop at multiple providers and always get the best tools for their money.
High availability is another key victim of the current system. Copies of the data must be stored and replicated in a set of discrete locations to avoid data loss. Yet data transfer costs mean that replicating data between availability zones within a single cloud region already drives up the bill. Forget replicating any serious amount of data between cloud providers; that becomes infeasible due to cost and latency. This places real limits on how well customers can protect their data from disasters or failures, artificially capping risk mitigation.
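To see how quickly this adds up, here is a rough estimate of a continuous replication bill. The per-GB rates and the workload size are illustrative assumptions, not any provider’s published pricing:

```python
# Estimate the monthly egress bill for continuously replicating changed data.
# Rates and workload below are illustrative assumptions, not real price sheets.

CROSS_AZ_RATE = 0.01     # $/GB between availability zones (assumed)
CROSS_CLOUD_RATE = 0.09  # $/GB out to a different cloud provider (assumed)

def monthly_replication_cost(gb_per_day: float, rate_per_gb: float) -> float:
    """Monthly cost of replicating `gb_per_day` of changed data at a per-GB rate."""
    return gb_per_day * 30 * rate_per_gb

daily_change = 500  # GB of changed data replicated per day (assumed workload)

print(f"cross-AZ:    ${monthly_replication_cost(daily_change, CROSS_AZ_RATE):,.0f}/month")
print(f"cross-cloud: ${monthly_replication_cost(daily_change, CROSS_CLOUD_RATE):,.0f}/month")
```

Even under these modest assumptions, crossing the cloud boundary multiplies the bill several times over, which is why cross-cloud replication is so often ruled out at the design stage.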
An Industry Standard
So many of today’s cloud woes come down to the data-transfer cost paradigm. The industry needs a rethink. Just like the smart home companies came together to build a single protocol called Matter, perhaps the CSPs could build a simple, transparent and unified system for data transfer fees.
The CSPs could invest in building an inter-cloud superhighway: an industry-owned and -operated infrastructure designed solely for moving data between CSPs with the required throughput. Costs would go down as the public internet would no longer be a factor.
A schema could be developed to ensure interoperability between technologies and reduce friction for users looking to migrate their data and applications. An encryption standard could be enforced to ensure security and compliance, and use of the aforementioned cross-cloud network would reduce the risk of interception by malicious actors. For critical multicloud applications, customers could pay a premium to access faster inter-cloud rates.
Cloud providers would be able to further differentiate their best product offerings knowing that if they build it, the customers will come, no longer locked into their legacy cloud provider.
Customers could avoid lengthy due diligence when moving to the cloud, as they could simply search for the best service for their requirements, no longer buying the store when they just need one product. Customers would benefit from transparent and possibly reduced costs with the ability to move their business when and if they want to. Overall agility would increase, allowing strategic migration on and off the cloud as requirements change.
And of course, a new level of data resilience could be unlocked as data could be realistically replicated back and forth between different cloud providers, ensuring the integrity of the world’s most important data.
This is a future where everyone wins. The industry players could ensure the survival and growth of their offerings in the face of cloud skepticism. Customers get access to the multitudes of benefits listed above. Yes, it would require historic humility and cooperation from some of the largest companies in the world, but together they could usher in a new generation of technology innovation.
We need an inter-cloud data mobility standard.
In the Meantime
Today there is no standard, and quite the opposite is true. The risks of cloud lock-in are high, and customers must mitigate them by leveraging the cloud carefully and intelligently. Data transfer fees cannot be avoided, but there are other ways to lower your exposure.
That’s why Couchbase enables its NoSQL cloud database to be used in a multitude of ways. You can manage it yourself, or use the Autonomous Operator to deploy it on any Kubernetes infrastructure (on premises or in the cloud). We also built our database-as-a-service, Capella, to natively run on all three major cloud platforms.
Couchbase natively leverages multiple availability zones and its internal replication technology to ensure high availability. With Cross Datacenter Replication (XDCR), you can asynchronously replicate your data across regions, and even across cloud platforms, to ensure your data is safe even in the worst-case scenarios.
Try Couchbase Capella today with a free trial and no commitment.