Monitoring and Observability — What’s the Difference and Why Does It Matter?

16 Apr 2018 4:38pm, by Peter Waterhouse
Peter Waterhouse is a senior strategist at CA Technologies. He is a business technologist with more than 20 years’ experience with development, strategy, marketing, and executive management. Through his regular work with CA, Waterhouse covers key trends such as DevOps, mobility, cloud, and the Internet of Things.

It’s amusing to see how our industry flocks around the latest tech buzzword. Along comes some funky new term and suddenly everyone starts using it to describe their wares. It’s cool, hip and trending, so get on board and start the buzzword washing.

The new buzzword kid on the block is observability, and you can bet many companies will be using it willy-nilly to spruce up their products.

So what is observability? Should we accept what many are stating — that it’s basically monitoring, only on steroids? Bigger, better, faster; the new Chuck Norris of DevOps tools. So better buy some observability, right?

But before getting too excited, let’s carefully dissect the meaning of the term. As with anything, the truth is out there. We’ve just got to find it in all the noise.

Monitoring Basics

First, let’s review what we all know and love (well, most of us anyway): monitoring.

Monitoring is a verb: something we perform against our applications and systems to determine their state, from basic fitness tests and whether they’re up or down to more proactive performance health checks. We monitor applications to detect problems and anomalies. As troubleshooters, we use it to find the root cause of problems and gain insights into capacity requirements and performance trends over time.

That’s the basic stuff, but monitoring has also evolved to support many more stakeholders. During application development, folks use monitoring to correlate coding practices to performance outcomes, while architects can validate which cloud patterns and models deliver the most bang for the buck.

And to achieve all this goodness, monitoring tools use many nifty techniques like instrumentation and tracing to gather, digest, correlate and analyze rafts of metrics across the modern application stacks under our watch. Plus there are synthetic transactions and application experience analytics to gain critical insights into the digital meanderings of our customers.

Demystifying Observability

So what about this new term observability?

As it turns out, observability isn’t new at all. The term actually comes from the world of engineering and control theory.

Basically, and as the definition states, it’s a measure of how well the internal states of a system can be inferred from knowledge of its external outputs. So in contrast to monitoring, which is something we actually do, observability (a noun) is more a property of a system. Therefore, if our good old IT systems and applications don’t adequately externalize their state, then even the best monitoring can fall short. I’m no engineer, but that makes good sense.
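For readers curious about the control-theory version of that definition, here is a minimal sketch; it is an illustration added for clarity, not something taken from any monitoring product. It assumes a linear system whose state evolves as x[k+1] = A·x[k] and whose output is y[k] = C·x[k]. The standard rank test says such a system is observable when the matrix built by stacking C, CA, ..., CA^(n-1) has rank n. The function names and the toy position/velocity system are hypothetical.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(matrix):
    """Rank by Gaussian elimination, using exact fractions to avoid rounding."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    rank_count = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Find a row at or below the current pivot position with a non-zero entry.
        pivot = next(
            (r for r in range(rank_count, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        for r in range(rank_count + 1, len(rows)):
            factor = rows[r][col] / rows[rank_count][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank_count])]
        rank_count += 1
    return rank_count


def is_observable(a, c):
    """Stack C, CA, ..., CA^(n-1) and check whether the rank equals n."""
    n = len(a)
    stacked, current = [], c
    for _ in range(n):
        stacked.extend(current)
        current = mat_mul(current, a)
    return rank(stacked) == n


# Toy system (hypothetical): state = (position, velocity),
# and position advances by velocity at every step.
A = [[1, 1],
     [0, 1]]

position_sensor = [[1, 0]]  # output exposes position only
velocity_sensor = [[0, 1]]  # output exposes velocity only

print(is_observable(A, position_sensor))
print(is_observable(A, velocity_sensor))
```

With the position sensor, velocity can be inferred from how the position changes, so the whole state is recoverable. With the velocity sensor alone, the starting position never shows up in the output, and no amount of watching will reveal it. That is the IT lesson in miniature: what a system chooses to emit limits what any monitoring tool can learn.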

Observability is important today when we consider both the characteristics of modern applications and the pace at which they’re being delivered. As organizations move towards containerized workloads and dynamic microservice architectures, old practices of bolting on monitoring after the fact no longer scale. It’s critical, therefore, that modern instrumentation be employed to better understand the properties of an application and its performance as complex distributed systems take shape across the delivery pipeline and into production.

Observability Tools

There are many practices that contribute towards observability, some of which can be found in products and tools. Many of these do a great job at externalizing key application state through logs, metrics and events. With tracing, for example, we can more reliably determine the state of application performance and the service being delivered by measuring all the work being done across many dependencies; it builds a more observable system. In another example, we could increase observability by activating metric capture and analysis during a containerized application deployment with Kubernetes. By dovetailing into the deployment itself, we can better understand a system from the work it’s doing, in this case as it scales and changes dynamically.
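To make the tracing idea a little more tangible, here is a minimal, single-process sketch using only the Python standard library. The `Tracer` class, the `span` method and the `handle_checkout` function are all hypothetical names invented for this illustration; real tracing tools also carry context across services and threads, which this sketch deliberately ignores.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Records how long each unit of work took and which span it ran inside."""

    def __init__(self):
        self.finished = []  # completed spans, in the order they ended
        self._open = []     # ids of spans currently in progress
        self._next_id = 1

    @contextmanager
    def span(self, name):
        span_id = self._next_id
        self._next_id += 1
        # The innermost open span, if any, is this span's parent.
        parent_id = self._open[-1] if self._open else None
        self._open.append(span_id)
        start = time.perf_counter()
        try:
            yield span_id
        finally:
            seconds = time.perf_counter() - start
            self._open.pop()
            self.finished.append(
                {"id": span_id, "parent": parent_id, "name": name, "seconds": seconds}
            )


def handle_checkout(tracer):
    """Hypothetical request handler that touches two dependencies."""
    with tracer.span("checkout"):
        with tracer.span("inventory-lookup"):
            pass  # a call to an inventory service would go here
        with tracer.span("payment-call"):
            pass  # a call to a payment provider would go here


tracer = Tracer()
handle_checkout(tracer)
for record in tracer.finished:
    print(record["name"], "parent:", record["parent"])
```

Each record says what work was done, how long it took and which larger piece of work it belonged to. That parent link is what lets a tool rebuild the path of a request across its dependencies instead of showing a pile of unrelated timings.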

Improving observability means keeping watch over all application components, from mobile and web front-ends to infrastructure. In the past, this would have involved gathering and analyzing information from many data sources: app logs, time-series data and so on. Now, however, conditions are more complex, and to get the real picture of customer experience we need clearer insights delivered in the context of how mobile and web apps are being used and consumed.

The Human Factor

Great monitoring tools and all the latest innovations go some way to improving observability, but we shouldn’t forget organizational issues. No matter how great our monitoring smarts, they’ll count for little if folks don’t use them when designing, developing or testing their applications. Moreover, if a system can’t be applied in the context of the work being performed, it’ll just end up on the “too hard to use” shelf.

It’s important, therefore, that modern monitoring methods are baked into the deployment pipeline with the minimum of fuss. In the aforementioned Kubernetes cluster deployment, monitoring is established with the actual deployment itself, meaning no lengthy configuration and no interruptions. In another example, we could increase observability by combining server-side application performance visibility with client-side response time analysis during load testing, a neat way of pinpointing the root cause of problems as we test at scale. Again, what makes it valuable isn’t just the innovation, it’s the simple and straightforward application. In other words, it’s frictionless.
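As a rough sketch of the load-testing example, the snippet below lines up client-side response times with server-side processing times for the same requests and reports where the slow ones spent their time. It assumes both sides record a shared request id and report milliseconds. The function name, the 500 ms threshold and the sample figures are hypothetical, made up purely for illustration.

```python
def attribute_latency(client_ms, server_ms, slow_threshold_ms=500):
    """For each slow request, report whether the server or everything else dominated.

    client_ms: request id -> total response time seen by the load generator
    server_ms: request id -> processing time measured inside the application
    """
    report = []
    for request_id, total in client_ms.items():
        server = server_ms.get(request_id)
        # Skip requests we cannot correlate, and requests that were fast enough.
        if server is None or total < slow_threshold_ms:
            continue
        outside = total - server  # time spent in the network, queues or client
        culprit = "server" if server >= outside else "network/client"
        report.append((request_id, culprit, server, outside))
    return report


# Made-up measurements from an imaginary load test.
client_side = {"r1": 120, "r2": 900, "r3": 700}
server_side = {"r1": 80, "r2": 820, "r3": 150}

for row in attribute_latency(client_side, server_side):
    print(row)
```

In this invented data, request r2 is slow because of the server itself, while r3 is slow for reasons outside the application. Neither view alone would show that, which is the point of joining the two.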

It’ll still take a fair amount of cultural nudging and cajoling to get folks to do the right things if systems are to become more observable, but beware of dictatorial approaches and browbeating. To this end, DevOps-centric teams will always be on the lookout for opportunities to demonstrate how to make applications more observable and the value it delivers. That could be as easy as perusing an application topology over a coffee to determine latency blind spots and where instrumentation could help. Another opportunity could be after a major incident or application release, but again the focus should be on collective improvement and never finger-pointing.

In all cases, the goal should be to train people on how to get better at making their systems observable. That’ll involve delivering fast insights to get some quick wins but could quickly develop into a highly effective service providing guidance on monitoring designs and improvement strategies. To this point, many organizations have built out teams of “observability engineers.” Some go even further and incorporate observability learnings and practices into their new-hire training programs.

Our industry is great at taking something really obvious and over-complicating it. So take heed before hitting the “observability” buy button. In all the noise, I’d recommend listening to the real-world learnings from modern monitoring practitioners. For example, read Cindy Sridharan, who has written some well-considered articles, or listen to the speakers at this year’s Monitorama conference. All these folks work at the sharp end, understanding that observability isn’t a product per se; it’s an essential property of the massively complex applications your teams are responsible for building, and one to which modern instrumentation and application monitoring are essential contributors.

CA Technologies is a sponsor of The New Stack.

Feature image via Pixabay.
