With “cloud” already an overloaded term, “cloud native” is the latest addition to the mix. The main concerns for your organization are assessing the options available to further your IT strategy and understanding how cloud-native alternatives factor in with the other cloud-related decisions you must make. Understanding the role the Cloud Native Computing Foundation (CNCF) plays is also key.
The Three Definitions of Cloud
There are three top-level definitions of cloud, and your decisions along each of these definitions affect your organization, from staffing and talent, to operating processes, to the technology you use. In fact, each of these definitions has its own hype cycle, and each is in a very different phase of maturity.
The three definitions are:
- Cloud as a Sourcing Model
- Cloud as an Operating Model
- Cloud as an Architecture (as in Cloud-native)
Why do these definitions matter? Because while a decision along one of these definitions may influence the others, each is in fact a distinct concept that can be managed separately. Anyone who tells you otherwise probably has a business model that’s driven by combining two of these definitions or a partial definition. To develop a well-informed cloud strategy, you need to see the entire cloud picture accurately.
Recognizing the right combination for your business is critical to making sure you don’t end up on an unintended (and difficult-to-reverse) path. Looking at the history along each of these definitions provides some guidance as to what we can expect in the future.
Cloud as a Sourcing Model
At the turn of the century, offshoring IT services to countries such as India was all the rage, as businesses sought to reduce costs for what they then identified as non-core processes. Examples included customer support, contact centers, and later, software development. India-based companies like Wipro and Tata Consultancy Services (TCS) were core to these efforts. Between 2000 and 2010, both Wipro and TCS quintupled their revenues.
However, aggressive outsourcing and offshoring eventually led to challenges of their own, driven by three factors. First, the success of offshoring led to greater local demand for technical English-speaking staff, which drove up labor costs and sometimes turnover. Second, language gaps, coordination costs, and quality of output sometimes fell short of expectations. Third, companies began to rethink their decisions on the strategic nature of outsourced business processes. These three factors changed the equation for offshoring and led to a re-examination of sourcing strategies.
Nearshoring became popular in the wake of offshoring. This involves hiring staff in nearby countries who speak the same language and operate in the same time zone, with the aim of improving service quality. The approach often meant looking at metropolitan statistical areas (MSAs) as a target for outsourcing. If the ultimate target was back in the same country, we called this “onshoring.” For many businesses in the U.S., this meant the expansion of data centers or service centers in non-coastal states for the right balance of talent and costs. These facilities are often insourced operations in relatively rural areas.
To address the connectivity challenges of rural data centers, adding metropolitan colocated centers was the next logical shift. In colocated centers, server systems and personnel are insourced, but the facilities are outsourced. Today, as the debate continues on the benefits and limitations of IT outsourcing, the most likely answer is that insourcing and outsourcing will co-exist for the foreseeable future.
Cloud as an Operating Model
Cloud as an operating model essentially means “on-demand,” or more broadly “as-a-Service,” when we include the ongoing lifecycle maintenance of these services. In enterprise IT, the opposite is probably “project-based interaction.”
Corporate IT organizations have long faced the challenge of deciding whether to provide their business partners what they want vs. what they need. For example, a business owner may expect their ambitious project to explode and demand a large amount of high-end compute, storage, and networking capacity from IT, only to end up not using much of the capacity. In some cases, the request will demand specific server models and configurations. Because IT services can take months to provision in some companies, users overestimate their needs, exacerbating the problem. Dedicated resources get allocated but never used, driving up costs.
In high-performing IT organizations, interactions with the business are more balanced: business owners describe their needs around performance and availability, and the rest of the decision is left to IT. IT defines the services catalog, which can range from low-level infrastructure services (like a server) to higher-level services like databases. IT also builds and operates the bill of materials and tracks its costs so it can ultimately show back or charge back to the business. While this is an improvement, IT-business interaction in this model is still largely project-based.
Cloud as an operating model was enabled by virtualization technologies. This is because, prior to virtualization, any request for IT services required something physical to occur, and someone from IT to step away from their desk — whether it was a server showing up at the loading dock, plugging in a server, or installing software (and we didn’t have robots yet to do this for us).
We have Amazon to thank for making the cloud operating model mainstream with EC2 (virtual machines as-a-Service) and S3 (storage as-a-Service). Users could provision servers and storage in minutes without calling anyone in IT. Because of this, people initially thought “cloud” could only mean outsourced IT, but VMware championed the idea that “cloud” could be defined as an operating model that is agnostic to whether it’s insourced or outsourced, and delivered software to enable this. In 2009 Gartner’s Tom Bittman proclaimed “private cloud is real, get over it.” Private cloud refers to the insourced form of this operating model; public cloud is the outsourced form.
By 2010, virtual workloads exceeded physical workloads worldwide, most of them insourced. Eventually, other technologies like Microsoft Hyper-V and OpenStack caught on. In 2018, VMware had the largest market capitalization in its history, which shows the continued momentum for this operating model that champions hybrid strategies combining private and public clouds.
Today, while the enabling tech stack may vary, business technology leaders generally agree that providing IT “as-a-Service” (with guard rails) is the most effective way to help their businesses be successful while managing the complexity and cost of IT.
Cloud as an Architecture
What we call cloud-native today is essentially the set of architectural patterns developed at webscale companies like Airbnb, Twitter, Google and Facebook in their hyperscale data centers in the late 2000s. These patterns are in stark contrast to earlier monolithic architectures, where everything basically fit in a single VM or server. Faced with serving billions of users in real time, these internet giants sought to improve performance and lower cost.
To do this, two things had to happen on the infrastructure level. First, expensive monolithic servers had to be replaced with lower-cost commodity servers (but in much greater numbers). Second, sophisticated software was needed to stitch together massively distributed computing infrastructures. Early iterations of this approach used VMs, eventually giving way to containers as the unit of workload deployment. The container movement started with Linux cgroups, which underpin the popular Docker container format that’s used today.
On the application level, cloud-native architectures share three key characteristics. First, they are distributed systems, and therefore assume that an underlying compute cluster is available because a single machine would be insufficient.
Second, functionality is delivered as a set of containerized microservices that don’t store any data internally (stateless), so they can be deployed, scaled, and killed at will. The dynamic nature of these microservices necessitates the use of containers instead of virtual machines. These microservices are the application code that businesses create in-house. An example of this might be a shopping cart service that exists as part of a larger e-commerce suite. These microservices are packaged in containers, which can be spun up in milliseconds. To run these containers in an organized way, we use container orchestrators, the most popular being Kubernetes.
Third is the data service, which is the backbone of the cloud-native application. Data services are analogous to the relational database in traditional monolithic applications. Popular technologies in this area include tools for ingesting the data (e.g., Apache Kafka as part of an IoT application), storing the data (e.g., Cassandra, HDFS), and analyzing the data (e.g., Apache Spark). No useful business software is ever entirely stateless, so these data services play a key role.
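The split between stateless microservices and a backing data service can be illustrated with a minimal Python sketch. Everything here is hypothetical and for illustration only: `InMemoryDataService` stands in for a real external data service such as a database, and `CartService` is an invented shopping cart service, not code from any product named in this article.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryDataService:
    """Hypothetical stand-in for an external data service (e.g. a database)."""
    _carts: dict = field(default_factory=dict)

    def load(self, user_id: str) -> list:
        # Return a copy so callers cannot mutate stored state directly.
        return list(self._carts.get(user_id, []))

    def save(self, user_id: str, items: list) -> None:
        self._carts[user_id] = list(items)


class CartService:
    """A stateless shopping cart service: it keeps no cart data itself."""

    def __init__(self, data_service: InMemoryDataService) -> None:
        self._data = data_service

    def add_item(self, user_id: str, item: str) -> list:
        # Every request reads from and writes back to the data service.
        items = self._data.load(user_id)
        items.append(item)
        self._data.save(user_id, items)
        return items


store = InMemoryDataService()
replica_a = CartService(store)
replica_b = CartService(store)

replica_a.add_item("alice", "book")
del replica_a  # simulate killing one replica
cart = replica_b.add_item("alice", "lamp")
print(cart)  # ['book', 'lamp']
```

Because the cart contents live only in the data service, any replica can serve any request, which is what lets an orchestrator start and stop instances freely.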
Another key factor of cloud-native application architectures is open source software. Kubernetes was built from the ground up as a new project led by Google. Cassandra was used internally at Facebook before being open sourced. The same was true of Kafka at LinkedIn. Spark was first conceived on Apache Mesos, an open-sourced distributed systems kernel technology to stitch together data center resources and automate workload operations. Today, all of these technologies run elastically together on Mesos enabled by DC/OS, another open source project by Mesosphere.
With open source, cloud-native application architectures that were initially developed to solve large-scale challenges are now getting adoption at mainstream enterprises. Reasons include lower operational costs with increased workload density, faster time to ship out new functionality, and performance and scale to power data-driven applications.
Cloud-native architectures power today’s cutting edge use cases like predictive analytics, IoT, and personalization. The power of cloud-native architecture also comes with complexity, however. An ecosystem of yet more tools has evolved to help businesses manage the complexity. The CNCF was founded to help businesses navigate this fast-changing landscape.
With “cloud” disambiguated, let’s look at some real-world examples using these definitions:
- Sourcing model: insource vs. outsource
- Operating model: “as-a-Service” vs. project-based
- Architecture: cloud-native vs. monolithic
Let’s start with the basics in the world of monolithic architectures. These examples are also known as Infrastructure-as-a-Service or IaaS:
VMware vSphere and Cloud Foundation are software products that provide monolithic architectures as-a-Service, on any infrastructure. To tackle cloud-native application architectures, VMware has other projects like Photon and Lightwave that are in early stages of development.
Rackspace provides monolithic architectures as-a-Service, outsourced to their infrastructure. To do this, they run software purchased from VMware and other providers.
Amazon Elastic Compute Cloud (EC2) provides monolithic architecture as-a-Service, outsourced to their infrastructure. The software they use is proprietary, based on the Xen hypervisor. Amazon also offers VMware as a hypervisor.
Getting to cloud-native architecture as-a-Service adds a bit more complexity since it also requires automation of container orchestrators like Kubernetes to run stateless microservices, as well as data services like Kafka, Spark, and Cassandra. Enterprises piecing together cloud-native technologies on their own on top of IaaS are essentially insourcing project-based operations to build cloud-native architectures.
To address the complexity of cloud-native architectures, vendors have developed various means to help companies get to the finish line. All of the vendors below offer Kubernetes.
Amazon Web Services (AWS) provides cloud-native application and backing data services, as-a-Service, outsourced to their infrastructure. In AWS’ case, the cloud-native data services are typically proprietary, running on their infrastructure, EC2.
PaaS, like Pivotal Cloud Foundry or Red Hat OpenShift, is software that enables cloud-native applications, as-a-Service, on any infrastructure. Data services typically fall beyond the scope of PaaS deployments. PaaS offerings are highly prescriptive about the application components used, to ensure standardization.
Mesosphere DC/OS is software that enables cloud-native application and backing data services, as-a-Service, on any infrastructure. In Mesosphere’s case, cloud-native application services (like Kubernetes) and data services (like Spark, Cassandra, Kafka) are typically open source. Commercial, vendor-supported offerings (like Lightbend Fast Data Platform, Datastax Enterprise, Confluent Platform) are also available.
As software, Pivotal Cloud Foundry, OpenShift, and Mesosphere DC/OS can be deployed on-premises or on top of an outsourced IaaS like Amazon EC2.
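One way to keep the three definitions separate is to record each offering as a point along three independent axes. The following Python sketch is only an illustration: the type names, the `ANY` value for software that runs on either insourced or outsourced infrastructure, and the `matching` helper are assumptions introduced here, while the classifications themselves follow the examples above.

```python
from dataclasses import dataclass
from enum import Enum


class Sourcing(Enum):
    INSOURCED = "insourced"
    OUTSOURCED = "outsourced"
    ANY = "any infrastructure"  # software deployable on either


class Operating(Enum):
    AS_A_SERVICE = "as-a-Service"
    PROJECT_BASED = "project-based"


class Architecture(Enum):
    MONOLITHIC = "monolithic"
    CLOUD_NATIVE = "cloud-native"


@dataclass(frozen=True)
class Offering:
    name: str
    sourcing: Sourcing
    operating: Operating
    architecture: Architecture


OFFERINGS = [
    Offering("VMware vSphere", Sourcing.ANY, Operating.AS_A_SERVICE, Architecture.MONOLITHIC),
    Offering("Rackspace", Sourcing.OUTSOURCED, Operating.AS_A_SERVICE, Architecture.MONOLITHIC),
    Offering("Amazon EC2", Sourcing.OUTSOURCED, Operating.AS_A_SERVICE, Architecture.MONOLITHIC),
    Offering("AWS", Sourcing.OUTSOURCED, Operating.AS_A_SERVICE, Architecture.CLOUD_NATIVE),
    Offering("PaaS", Sourcing.ANY, Operating.AS_A_SERVICE, Architecture.CLOUD_NATIVE),
    Offering("Mesosphere DC/OS", Sourcing.ANY, Operating.AS_A_SERVICE, Architecture.CLOUD_NATIVE),
    Offering("DIY on IaaS", Sourcing.INSOURCED, Operating.PROJECT_BASED, Architecture.CLOUD_NATIVE),
]


def matching(offerings, **criteria):
    """Return names of offerings whose fields equal every given criterion."""
    return [
        o.name
        for o in offerings
        if all(getattr(o, key) == value for key, value in criteria.items())
    ]


portable_cloud_native = matching(
    OFFERINGS,
    sourcing=Sourcing.ANY,
    architecture=Architecture.CLOUD_NATIVE,
)
print(portable_cloud_native)  # ['PaaS', 'Mesosphere DC/OS']
```

Filtering on one axis at a time makes it visible that a choice of architecture does not by itself fix the sourcing or operating model.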
Recurring Trends on an Evolving Landscape
Over the years, as each new definition of “cloud” formed, we’ve seen hype cycles play out. Aggressive outsourcing and offshoring eventually led to repatriation and insourcing towards a steady balance.
We’re in a different phase of the cycle with today’s popular definition of cloud: compute as-a-Service on outsourced infrastructures. This definition of “cloud” is displacing traditional outsourcing and seeing explosive growth, with AWS seeing annual run rates of $20 billion. Although cloud computing is only set to grow, there are signs of moderation. A quarter of enterprises with more than 1,000 employees spend more than $6 million a year on the cloud. As cloud bills explode, they’re also getting increased scrutiny from IT and business leaders. Nearly half of cloud spending is unnecessary or wasted, according to a report published by Advocate, a company focused on cloud spending optimization.
The reason, according to the report, is threefold. First is the perception that cloud is cheap and easy, so users tend not to worry about controlling the cost or limiting the expansion of cloud instances. Second is the “disaggregation of IT.” Because individual users can easily use cloud services, expansion goes unchecked with no central IT visibility. Third is cloud computing billing complexity, which often leads to unexpected bills. These factors sometimes create a case for change. Dropbox, for example, at one point aggressively used AWS, only to move back to its own infrastructure, saving $75 million over two years.
The dynamics around the latest “cloud” definition, cloud-native architectures, have yet to play out. Generally speaking, however, cloud-native architectures do add a level of complexity. So while there are benefits around scale, performance and maintainability, we have yet to see just how pervasive this architectural approach will be compared to traditional monolithic architectures. Some balance will likely be the answer for the foreseeable future.
Getting It Right for Your Business
There’s no answer that’s right for all businesses since each optimizes for different things. But there are five factors to consider.
Cost: Rackspace, generally known for outsourced, as-a-Service traditional architectures (or IaaS) has a profit margin below 10 percent. AWS, providing outsourced, as-a-Service cloud-native architecture, has a profit margin around 25 percent. There’s value in automating away complexity, and it’s illustrated in the profit margin. For some businesses, the cost is well worth it. For others that are running data-driven applications at scale, the costs of an all-in cloud approach can be prohibitive.
Business Risk: If you’re a bank, a health care provider, or a major retailer like Walmart, using a cloud provider that might become a competitor is a real risk. One approach is to leverage outsourced IaaS (now a mature space where the software used can be agnostic to the outsourcing vendor) and then run additional software on top for insourced as-a-Service automation for cloud-native architectures. This way, if your outsourcing provider decides to enter your space to compete with you, you have the option to go to a different provider, and not be forced to fund your own demise.
Service Risk: While cloud provider outages are rare, they do occur. Snap, for example, made early cloud-native architecture decisions that led to a business risk it disclosed in its S-1 filing, mentioning that any significant interference or disruption of their use of their cloud provider “would negatively impact our operations and our business would be seriously harmed.” During a four-hour outage of a major cloud provider in early 2017, companies were estimated to have lost over $150 million. Vendors that fared well during this outage took advantage of multicloud operations. Apple, for example, is known to use both AWS and Google Cloud.
Architectural Control and Technical Differentiation: There’s been an explosion of technologies in and around the cloud-native ecosystem landscape. Additionally, outsourcers like AWS offer interesting services unique to their cloud (e.g., AWS Lambda), and so does Google (e.g., Google Cloud ML). However, using only one outsourcer for as-a-Service cloud-native technologies means your technical capabilities are constrained to their portfolio. If the option to use specific cloud-native technologies is important to your business, look to run cloud-native technologies as-a-Service through software, as opposed to outsourcing.
Data Locality: The concept of data gravity is well known in the industry. If your data is sitting at a specific cloud or data center, services you build will often need to be close to the data. But if you’re in the business of geospatial analytics, operating cruise ships or industrial IoT, application services can’t run entirely in a data center far away. An emerging pattern today is rolling out global services with data locality, driven by data sovereignty compliance, performance, data transfer or latency considerations.
Most enterprises are evolving towards building and running data-driven applications across hybrid cloud infrastructures. Getting it right for your business means making decisions around sourcing, operations and architecture separately, and finding the right balance that yields the best results for both IT and the business as a whole.
The New Stack is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Docker, Real.