History tends to repeat itself, but hopefully 2020 will remain an anomaly that future generations read about as a singularly tragic and momentous year in modern history. The COVID-19 pandemic had definite repercussions in the DevOps community, not least the explosion in remote work.
Regardless, a lot of great work continued in the computing space in 2020, both in support of the massive shift to work-from-home IT infrastructure and in the continued progress in cloud native, GitOps, open source and security. As this all happened, we were fortunate enough to record some of it in hundreds of hours of fascinating podcast talks with leading DevOps practitioners, developers, operations managers, security specialists and other good folks.
Here are what we deemed the ten most important episodes that defined 2020 for us, drawn from our podcast shows: The New Stack Makers, The New Stack Analysts and The New Stack Context (now discontinued).
1: Next-Generation Sustainable Data Center Design
This episode is certainly one of the 10 most popular this year, but it also ranks first as the most important, both for the IT and computing industry and for its discussion of how to help stave off global warming for the rest of humanity.
The UN’s guidelines are unequivocal, sobering and positive all at once. If the increase in global average temperature over the next 10 years is to be held to “only” two degrees Celsius, we must “act now,” Kass said.
However, the IT sector, and data centers in general, “will continue to consume a lot more energy at a much faster rate” if changes are not made now, Kass said.
“We need to really act responsibly, to put a cap on the gigatons of carbon that we are emitting in the environment if we don’t watch it, and that would be a shame, because we are heating up the planet resources faster than we actually are able to protect it,” Kass said.
The solution, as Kass describes it, is to use “nature-based technological solutions to protect our earth, resources and the environment.”
“I’d like to do a call for action for the industry people like myself or in the industry to use nature-based ecological solutions to protect our resources and the environment, as we deplete energy, as we deplete land, as we deplete water in the use of the data center,” Kass said.
Achieving this goal requires a “modular approach,” which involves not consuming more resources than needed to build and operate data centers, she said. Over-provisioning data center operations and capacity is “just irresponsible in terms of consuming the resources, because we ended up wasting them and depleting the opportunity for other people to be able to do so,” Kass said.
2: Service Mesh: The Gateway to Cloud Migration
If you migrate to a cloud native environment, you rely on a service mesh to orchestrate it. Yet what a service mesh really does, and how it works as a technology pattern with Kubernetes, still raises unresolved questions. These include how a service mesh, and especially the Istio service mesh, helps teams get more out of containers and Kubernetes across the whole application life cycle.
“The centralized control and consistency that service mesh gives you is incredibly useful for helping bring sanity to the kind of craziness that is this split infrastructure world, this kind of multicloud, on-premises world,” said Butcher.
Ultimately, organizations are latching on to service meshes as an answer for “not just a deployment problem,” but as a way to “integrate all the pieces together” during a cloud native journey, explained Jenkins.
3: Observability, Distributed Tracing and Kubernetes Management
Service meshes are indeed a must-have, but observability has also emerged as a critical element for DevOps teams, especially for organizations managing often disparate multicloud and legacy infrastructure.
Relying on metrics, tracing and logging data points that observability platforms are designed to provide is just the starting point as organizations seek the best toolsets they can find to gain visibility and insights about their systems.
Performance, of course, helps to explain why Grafana is so popular, while the elegant dashboards offered with open source Grafana, Grafana Enterprise, Loki and Cortex do not hurt either. Lest we forget, Grafana is now available as part of the Amazon Managed Service for Grafana.
Dutt described the history of Grafana Labs and how the Grafana observability platform has evolved from its early days as an additional observability layer for Kubernetes and containerized environments, even as complexity has increased exponentially.
“It’s extremely complicated — it’s difficult, really, to get a complete picture in terms of what’s going on with your infrastructure and your application,” Dutt said. “And observability is really about getting deep insight into the behavior of your systems.”
4: How to Fix the Gaps in Kubernetes Infrastructure Management
Guests: Gareth Greenaway, vice president of engineering at SaltStack (now part of VMware) when this podcast was recorded, and Moe Abdula, solution architecture leader, worldwide specialists, Amazon Web Services (AWS). (Episode)
Automation, and especially automation of security management, is all too often neglected. This realization also often happens when it is too late and a critical systems vulnerability has been exploited. Missing are security management tools for Kubernetes deployments and infrastructure management. Ultimately, these tools should have the capacity to compensate for security and IT skills gaps and talent shortages by automating vulnerability detection and fixes, for example.
This is why many organizations are actively seeking tools for configuration and infrastructure management for complex Kubernetes and container environments, especially after experiencing firsthand the issues described above, Abdula said. Once they realize this, a DevOps team member might typically say, “let me figure out how I automate so that I can create consistency,” Abdula said.
This demand reflects how SaltStack has evolved as a solution for container and Kubernetes configuration management. “The tools that are in place to manage those clusters just don’t scale to the point that a tool like SaltStack does, in terms of managing those clusters and container-based infrastructures,” Greenaway said.
5: DevSecOps: Yesterday, Today and the Future
DevSecOps has existed since the early days of DevOps. A few years ago, organizations began to realize how much “cheaper” it was “to find and fix vulnerabilities very early in the lifecycle,” Blake said. “The impact of fixing vulnerabilities late in the lifecycle was kind of the first step where people started thinking about shifting left a bit,” Blake said.
One impetus for shifting security left in the development process is that security processes have unfairly been seen as speed bumps to rapid development and deployments. “It’s almost been unfortunate at times about security sometimes getting a bad rap as the inhibitor versus a catalyst or as a value creator simply because it’s been brought in at the end,” Mulchandani said.
Customers are also “moving to modern sort of models of deployment,” Gupta said.
“What we’re finding is that some of the capabilities that we are being asked to really help drive that security into the development lifecycle is by building into the continuous integration frameworks,” Gupta said. “So, we have done a lot of work around helping companies build faster, deploy faster and figure out problems faster — and really take remediation action faster.”
6: NS1 Builds on DNS to Speed Traffic Management
Network infrastructure is not just a nebulous patchwork of connections your organization can blindly rely on to push code to production, manage applications or offer on-demand services to your customers. It is sometimes easy to forget that a single DNS (the internet’s Domain Name System) failure could mean that your entire network, and your operations, are interrupted as well. In other words, the last mile depends on the operations team managing the underlying infrastructure in such a way as to optimally deliver the final user experience.
In this way, NS1 supports and hones DNS and other network connections in such a way as to “unlock a huge amount of leverage to solve big problems in application performance, reliability, security, operational efficiency, complexity and modernization of the application delivery stack,” Beevers said.
“The opportunity of DNS and why we started NS1 is that it is the first touchpoint you have on your phone or your laptop, when you open an app or when you go to a website,” Beevers said. “And there’s a lot of leverage in a world where the infrastructure that services an application is very dynamic, is very distributed,” meaning the things DNS points to are many and varied.
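To make that “first touchpoint” concrete, here is a minimal sketch using only Python’s standard library resolver (the hostname is just an example): the lookup completes before a browser or app can send a single request, which is exactly what gives DNS its leverage as a traffic management layer.

```python
import socket

def resolve(hostname: str) -> list[str]:
    """Return the unique IP addresses the resolver reports for a hostname."""
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})

# Every app open or website visit starts with a lookup like this one;
# whatever the resolver answers determines where the traffic goes next.
print(resolve("localhost"))
```

A managed DNS platform sits behind that answer and can vary it by geography, load or endpoint health, which is where the “huge amount of leverage” Beevers describes comes from.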
7: Distributed Systems and the Butterfly Effect
It is sometimes easy to forget how highly distributed and interconnected networks and infrastructure really are today. In an apt analogy that also applies to today’s global warming catastrophe, science fiction author Ray Bradbury’s classic short story “A Sound of Thunder” portrays a group of hunters who travel back in time and accidentally step on a butterfly. This seemingly innocuous act sets in motion a chain reaction that eventually leads to cataclysmic events in the future. Flash forward to today’s network infrastructure and computing environment, and we see how a single vulnerability, bad code uploaded from a remote entry point or a power failure in an offsite data center can not only bring operations to a halt but, in extreme cases, cause businesses to fail.
The theme of interconnectivity is “a great segue into how our deeper thinking has gone into building resilient systems from both a service architecture perspective but also within the context of front-end applications,” Rauch said.
For network synchronization, Rauch described the challenges of managing stateful data in highly distributed environments, how that data is synchronized over networks and how policy structures for applications help. He also described his ambitions to help lay the groundwork for coherent data consistency models.
8: Why You Must Keep Error Monitoring Close to Your Code
9: Superior Monitoring with a Time-Series Database
Time-series data management is one of those powerful applications that previously could require massive investments to implement. Today, however, the underlying technology has not only seen leaps in performance but has concurrently become significantly more accessible and affordable. Whether it’s video streaming, real-time financial security data management, energy utility management or any application that requires time stamps for often very complex datasets at massive scale, time-series data will play an integral role.
What Churilo also finds “fascinating” is how many talented development teams are more than capable of designing and managing their own time-series databases.
Many organizations also often consider the opportunities the cloud can offer as they “look at the work that they have to do for their own product and realize that that’s probably more important and how it’s going to be more of a differentiator for them to focus on their products,” Churilo said. “And, so they often look for a data store for this time-series data. And having it in the cloud makes sense for them because they don’t have to build and manage it.”
10: Git Is 15 Years Old: What Now?
Believe it or not, the Git distributed code management tool has been around for over 15 years. And while the consensus is that its version control capabilities are essential for the DevOps teams, open source contributors and committers building software, Git’s usage model continues to evolve.
Git, for example, can now certainly be relied on for locking and trust in source code control. “We couldn’t have done that with the old source code control systems because the primitives weren’t there, things like… having the full state stored in a particular instance, a snapshot, if you will,” Davis said. “Those types of things allow us to have safety while not having this very heavy-handed control.”
Branching is also key — and one of the original functions that helps explain Git’s massive adoption. “Branching and Git is really, really important to where we were 15 years ago, and cheap, easy branches revolutionized the way people collaborated on software,” Warner said. “And we take that for granted today — but 15 years ago, there was nothing like it.”
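The primitives Davis and Warner point to can be caricatured in a few lines. This is a toy model, not Git’s actual object format: each commit stores a full, content-addressed snapshot of the tree, and a branch is nothing but a named pointer to one snapshot id, which is why creating a branch is effectively free.

```python
import hashlib
import json

class TinyRepo:
    """Toy model of Git's primitives: content-addressed snapshots
    plus branches that are mere name -> snapshot-id pointers."""
    def __init__(self):
        self.objects = {}   # snapshot id -> full tree state
        self.branches = {}  # branch name -> snapshot id

    def commit(self, branch, tree):
        """Store the full state and advance the branch pointer."""
        payload = json.dumps(tree, sort_keys=True).encode()
        oid = hashlib.sha1(payload).hexdigest()
        self.objects[oid] = tree
        self.branches[branch] = oid
        return oid

    def branch(self, new, existing):
        """Creating a branch copies one pointer, nothing else."""
        self.branches[new] = self.branches[existing]

repo = TinyRepo()
repo.commit("main", {"app.py": "print('v1')"})
repo.branch("feature", "main")  # instant: no files are copied
print(repo.branches["feature"] == repo.branches["main"])  # True
```

Because the full state lives in the snapshot and a branch is just a cheap reference to it, many branches can coexist safely without the heavy-handed locking of older source control systems.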
Meanwhile, Git must also see improvements, especially as new application types become more popular. “Version controlling data is one of those things that is still very, very hard. And I think now with the advent of ML and AI, it’s getting more and more important,” Sijbrandij said. “What is the data we use to train it? What is the input and output data and what is the output data and version of the model that ran on it? I see a lot of tooling, but I think there’s also something fundamental on the file level that needs to change.”