
Infra-as-Data vs. Infra-as-Code: What’s the Difference?

24 May 2022 6:00am, by

Among the biggest challenges Kubernetes brings is the deployment of microservices and applications on different cloud and on-premises environments.

Configuring and managing different environments — as backend developers and operations teams know firsthand — is still mostly a manual process, not only for major application deployments, but for any code, application, microservice, security or any other update.

Simplifying and automating this process is therefore a high priority, in order to take the load off operations teams and backend developers.

Eventually, teams should be able to seamlessly and easily deploy an application, microservice or update on multicloud environments, Kubernetes clusters and on-premises environments through a single interface. This would be the Holy Grail of multicloud and cloud native computing for Ops engineers.

Recently, VMware engineers introduced the open source Infrastructure-as-Data Idem Project, headed by Tom Hatch, SaltStack’s founder and chief technology officer, to help achieve that end.

The resulting open source tool was created to reduce the enormous complexity of orchestrating a large number of codebases for each cloud deployment and API. It does so by reducing them to a data format that, the project creators say, a human mind can readily understand and manage.

Idem, its creators say, represents Infrastructure-as-Data (IaD) for cloud configurations, since it reduces cloud configurations to data. It was designed so that cloud configuration becomes simpler to configure and manage for application deployments.

But in many ways, the end results Idem’s creators are hoping to achieve are very similar to what Red Hat’s Ansible purports to offer for IaD. Instead of modeling infrastructure in code, as with Infrastructure-as-Code (IaC), or in a GUI, IaD is based on a “text-based, middle-ground and data-driven policy,” Michael DeHaan, Ansible’s creator, wrote in 2013 for O’Reilly’s Radar.

“I call this ‘Infrastructure-as-Data’ — describing what your systems look like in simple machine-readable data formats,” DeHaan wrote. “Have programs execute those data formats and ensure your infrastructure matches. The result is that configurations can be flexible, and also easy to prototype, easy to audit and easy to maintain.”
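To make DeHaan's description concrete, here is a minimal Python sketch of the idea: the desired infrastructure is plain data, and a small program compares it with what actually exists and works out what must change. The resource names, fields and the `plan` function are hypothetical and chosen only for illustration; they are not taken from Ansible, Idem or any other tool.

```python
# Desired state, written purely as data (hypothetical resources and fields).
desired = {
    "web-server": {"type": "vm", "size": "small", "port": 80},
    "db-server": {"type": "vm", "size": "large", "port": 5432},
}

# Stand-in for what a real tool would discover by querying a cloud API.
actual = {
    "web-server": {"type": "vm", "size": "small", "port": 8080},
    "old-cache": {"type": "vm", "size": "small", "port": 6379},
}


def plan(desired, actual):
    """Return the (action, name) steps needed to make `actual` match `desired`."""
    actions = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(("create", name))
        elif desired[name] != actual[name]:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


if __name__ == "__main__":
    for action, name in plan(desired, actual):
        print(action, name)
```

Because the description is only data, it can be diffed, reviewed and audited like any other structured file, which is the property DeHaan highlights.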

Then there’s Infrastructure as Code for multicloud and multi-environment provisioning. With HashiCorp’s open source Terraform — which now numbers millions of use cases and is the leading IaC platform — and related HashiCorp offerings against which Idem competes directly, many users already rely on IaC to provision and manage applications across multicloud and on-premises data center environments.

On a high level, Infrastructure-as-Data tools like VMware’s Idem and Ansible, and Infrastructure-as-Code, dominated by Terraform, were created to help DevOps teams achieve their goals of simplifying and automating application deployments across multicloud and different environments, while helping to reduce manual configurations and processes.

Here we’ll explore how IaD and IaC compare, and how they can also complement each other in certain cases.

The Challenges Infrastructure as Data Solves

DevOps teams continue to struggle to reduce the complexity of provisioning multiple cloud environments.

“The codebase-specific character of CI/CD pipelines conflicts with the immutable principle of cloud native application development. Each component of the DevOps toolchain constitutes a potential integration point that requires setup, initial configuration and management,” Torsten Volk, an analyst for Enterprise Management Associates, told The New Stack.

“This is why we need cloud-specific teams of infrastructure engineers to figure out how to create optimal application-specific environments on AWS, Azure, GCP, etc.”

As far as adoption goes, Terraform “owns the market so far,” Hatch said. “Certainly, competing against Terraform is a big hill to climb,” he added, but there is a big difference between using Infrastructure as Code versus Infrastructure as Data.

When cloud architectures need to be expressed using code, “you’re just writing more and more and more and more Terraform,” he said. “Idem is different from how you generally think of Infrastructure as Code — everything boils down to these predictable datasets.”

“Instead of sitting down and saying, ‘I’m going to write out a cloud in Terraform,’ you can point Idem towards your cloud, and it will automatically generate all of the data and all of the code and the runtimes to enforce it in its current state.”
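The general idea Hatch describes, generating a data description from an environment that already exists, can be sketched in a few lines of Python. This is an illustrative assumption only: the `discovered` records, the `describe` function and the output layout are hypothetical and do not reflect Idem's actual commands or file format.

```python
import json

# Hypothetical records, standing in for what a discovery call against a
# cloud API might return. No real cloud is contacted in this sketch.
discovered = [
    {"id": "vm-1", "kind": "vm", "size": "small", "region": "us-east"},
    {"id": "bucket-7", "kind": "bucket", "region": "eu-west"},
]


def describe(resources):
    """Turn discovered resources into a declarative state keyed by resource id."""
    state = {}
    for res in resources:
        # Everything except the identifier and kind becomes a desired attribute.
        attrs = {k: v for k, v in res.items() if k not in ("id", "kind")}
        state[res["id"]] = {"kind": res["kind"], "present": attrs}
    return state


state = describe(discovered)

if __name__ == "__main__":
    # The generated data could be saved and later enforced by a reconciler.
    print(json.dumps(state, indent=2, sort_keys=True))
```

The output is a dataset describing the environment as it currently stands, rather than hand-written code that an engineer must maintain.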

At the same time, Idem, as well as Ansible to a certain extent, was designed to make cloud provisioning more automated and simpler to manage.

“While this sounds like magic, Idem exactly addresses the problem many Terraform users, open source and commercial, have expressed,” Volk said.

“Writing traditional Infrastructure as Code of any kind involuntarily leads to layers upon layers of infrastructure code that needs constant adjustments to keep up with changes to cloud APIs, changing application requirements and changes in the overall business environment.”

Improving Infrastructure-as-Code Tools

Meanwhile, HashiCorp continues to refine and improve Infrastructure-as-Code through each subsequent release of Terraform and the associated tools it offers. Terraform’s key feature is extensibility for what HashiCorp claims is any IT infrastructure.

IaC is also a component of “infrastructure automation” within Terraform, as HashiCorp defines it, in order to:

  1. Adopt Infrastructure as Code.
  2. Build workflow for composition, collaboration and reuse of IaC.
  3. Standardize workflow to security, compliance and management requirements.
  4. Provide innovation through self-service infrastructure options for the end-user application developers and delivery teams.

“Terraform is working at the infrastructure layer, and Infrastructure-as-Code is the best way to provide automation to provision any infrastructure from any cloud platform, private data center, etc.,” Meghan Liese, senior director, product marketing at HashiCorp, told The New Stack.

“Terraform is really about operators being able to define infrastructure that needs to be provisioned so that it is available in a self-service model to developers. And so, at the other layer, a tool like Waypoint says, ‘Hey, developers, you codify your application requirements and then run that through Waypoint, and Waypoint will self-service the platform,’” Liese said.

There can be overlaps between how IaC and IaD are used. Liese did not comment specifically on Idem, but said there are instances when the use of Ansible for Infrastructure-as-Data can complement Terraform when provisioning infrastructure across multiple environments with Infrastructure-as-Code.

“Ansible and Terraform creators see organizations with a lot of the same problems, but we also work well together. We work with organizations that use Terraform to lay down the infrastructure and Ansible a lot of times to configure the machines,” Liese said. “So, that is one of those situations where, as the markets continue to mature, the tools may provide some capabilities that overlap.”

Featured image by Victor on Unsplash.