Ignoring SLAs Doesn’t Pay
Chances are you already have service-level agreements (SLAs) with your service providers. Maybe you insisted on them during the sales process as you onboarded the vendor. But that raises the question: How do you know whether your vendor is meeting its targets? And how do you collect on a violation if and when it occurs?
How to Measure an SLA
What are we really measuring when we define an SLA?
We might say that we want 99.99% availability, measured monthly. In a month of 43,800 minutes, the service must be up for 43,795.62 of them, which leaves about 4.38 minutes of allowable downtime. Verifying that implies that the measurement system, and the network between it and the system being measured, are at least as available as the system itself.
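The arithmetic above can be sketched in a few lines. This is only an illustration: the function name `downtime_budget_minutes` is made up for this example, and the 43,800-minute month is the average-month figure used in the text.

```python
def downtime_budget_minutes(availability_pct: float,
                            period_minutes: float = 43_800) -> float:
    """Minutes of downtime allowed in one period at the given availability target.

    period_minutes defaults to the 43,800-minute month used in the text.
    """
    return period_minutes * (1 - availability_pct / 100)


# Compare how quickly the budget shrinks as the target gains another "nine".
for target in (99.0, 99.9, 99.99):
    budget = downtime_budget_minutes(target)
    print(f"{target}% availability allows {budget:.2f} minutes of downtime per month")
```

At 99.99%, the budget works out to roughly 4.38 minutes a month, so a single missed or delayed measurement can decide whether the SLA appears to be met.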
Usually SLA violations are only tracked for egregious outages, because tracking SLAs is hard. And in our buyer-beware world, you need to keep an eye on these services yourself.
SLAs Are Everywhere
In the olden days (2010s), we had custom software running on internal servers. Now you just sign up for a software-as-a-service (SaaS) subscription, hosted on someone else’s cloud. With the proliferation of these SaaS products and APIs comes a proliferation of SLAs.
If you ask a procurement team how many of their SaaS vendors have violated the contractual SLA in the past 12 months, you will likely be met with uncomfortable silence. Getting the SLAs out of the contracts and into an operational system is a big gap in the way SLAs are managed today.
This trend will only continue. Software is increasingly delivered as cloud services, whether in CRM, HR, supply chain/ERP, financial tracking, payments, payroll, travel, or other line-of-business applications. Even software written within an enterprise, custom to its business processes and rules, is often deployed on cloud services. Different groups or teams may also have agreements with each other, often called service-level objectives (SLOs) to make them less threatening.
Basic Vendor SLA Management
Once you sign an SLA, set up synthetic monitoring that watches the service. Encode the SLA programmatically, so the monitoring system can automatically notify you if an SLA is in violation. Ideally, this evidence will come to you quickly, so you can pass it along to the vendor and immediately request a remedy.
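As a minimal sketch of what "encoding the SLA programmatically" might look like, assume a synthetic monitor that records one pass/fail result per check. The `Sla` and `Report` types and the `evaluate` function are hypothetical names for this illustration, not part of any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sla:
    vendor: str              # hypothetical vendor label
    availability_pct: float  # contractual target, e.g. 99.99


@dataclass(frozen=True)
class Report:
    observed_pct: float
    violated: bool


def evaluate(sla: Sla, check_results: list[bool]) -> Report:
    """Compare the share of successful synthetic checks against the SLA target."""
    if not check_results:
        raise ValueError("no check results to evaluate")
    observed = 100 * sum(check_results) / len(check_results)
    return Report(observed_pct=observed, violated=observed < sla.availability_pct)


# Assumed scenario: one check per minute over a 43,800-minute month, 10 failures.
sla = Sla(vendor="example-vendor", availability_pct=99.99)
results = [True] * 43_790 + [False] * 10
report = evaluate(sla, results)
if report.violated:
    print(f"{sla.vendor}: observed {report.observed_pct:.3f}% is below {sla.availability_pct}%")
```

Real contracts often define availability more narrowly (scheduled maintenance exclusions, minimum outage durations, and so on), so the comparison in `evaluate` would need to follow the contract's own definition.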
Over time, you can better understand which services are improving and which are missing the mark, so you can better negotiate and provide detailed feedback about their service reliability. It’s also essential to understand if SLA violations from vendors have a tangible impact on the business. Perhaps these services aren’t as mission-critical as you thought, and you may be able to reduce the service-level goals from the vendor and recoup some costs by renegotiating the SLA down.
Considerations for Your SLA to Your Customers
If you are selling to enterprise customers of any scale, you will have to offer an SLA. Sometimes the customer will dictate the terms; in other cases, you may decide. You can architect your systems with resiliency and redundancy so that these SLAs are never, or rarely, missed. Most likely, your SLA is set at the bare minimum your customers will accept. You can set internal goals that are a bit higher, giving you a risk buffer and a quantifiable amount of unreliability that can be measured and managed independently of the SLA contract itself.
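The risk buffer can be made concrete with a small sketch. The targets used here (a 99.9% contractual SLA and a 99.95% internal goal) are assumed values for illustration, and the function names are hypothetical.

```python
def budget_minutes(target_pct: float, period_minutes: float = 43_800) -> float:
    """Downtime allowed in one period at the given availability target."""
    return period_minutes * (100 - target_pct) / 100


def status(downtime_minutes: float, sla_pct: float, internal_pct: float,
           period_minutes: float = 43_800) -> str:
    """Classify downtime so far against the internal goal and the contractual SLA."""
    if downtime_minutes > budget_minutes(sla_pct, period_minutes):
        return "sla breached"
    if downtime_minutes > budget_minutes(internal_pct, period_minutes):
        return "internal goal missed"
    return "ok"


# Assumed targets: 99.9% contractual SLA, 99.95% internal goal.
SLA_PCT, INTERNAL_PCT = 99.9, 99.95
buffer = budget_minutes(SLA_PCT) - budget_minutes(INTERNAL_PCT)
print(f"risk buffer: {buffer:.1f} minutes per month")
for downtime in (10, 30, 50):
    print(downtime, "minutes down:", status(downtime, SLA_PCT, INTERNAL_PCT))
```

Downtime that exceeds the internal budget but not the contractual one is the early warning: the team has missed its own goal while the customer-facing SLA is still intact.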
It can be easy for engineering teams to shrug off minor reliability and performance issues as “within SLA,” even though the customer experience is quite frustrating. To avoid this, define several fine-grained goals internally driven by customer use cases and expectations, with differing goals based on real-world user behavior.
In today’s competitive software world, excellent user experience and reliability are critical features that every customer has come to expect. For you to stand out, your service needs to stand tough.
Conclusion: Ignoring SLAs Doesn’t Pay
If you use services that make headlines for their outages and aren’t getting paid for the SLA violations, you have to wonder if you are a sucker. Progress in technology comes from constant incentives to improve, and penalties for SLA violations create those incentives in a business context.
The biggest reason companies don’t collect SLA credits is that they have no systematic way to track and notice SLA violations in the first place, including tiny ones that may not have much business impact. Perhaps an outage is felt at the front line, but those folks don’t know how to alert the procurement team that owns the contract. That disconnect is prevalent in large organizations, which have many vendors and whose SLAs are written in legalese rather than code.
SLAs are a reality for businesses, and they aren’t going away any time soon. This is a good thing for consumers and companies who expect a certain level of service. Many services mean many SLAs, all of which need to be operationally measured and analyzed. Holding vendors accountable can lead to better innovation and reliability. Plus, overly tight SLAs can be renegotiated to save costs. SLA violations can carry big payouts, and advanced companies watch for them, both to collect and to better manage supplier risk.