Dynatrace sponsored this post.
In my previous article, I highlighted three widespread Kubernetes monitoring challenges facing platform operators and proposed solutions to overcome them with Prometheus.
But before you implement your own solution based on these suggestions, ask yourself: How many new solutions will be created if every reader builds their own? Exactly as many as there are readers. That might be fine for solving the problem at hand, but it won’t scale to new use cases, to new environments that other teams have to stand up, or to the “separation of concerns” that exists between developers, DevOps and SREs. Besides, much more could be achieved by not only configuring Prometheus and its ecosystem tools around SRE concepts, but also leveraging them in other parts of the software development life cycle, for example in CI/CD systems.
You can envision even more extensions. Why not integrate testing tools like JMeter and provide dedicated dashboards out-of-the-box for each test run? And why not query Prometheus metrics automatically and report them back to the user each time a test is triggered?
The root problem: Building such a system takes time and is a complex effort.
The ultimate solution: Use a ready-made, open-source framework built on industry standards.
We at Dynatrace started Keptn, a Cloud Native Computing Foundation sandbox project that provides a pluggable, event-based control plane for continuous delivery and automated operations. It sets out to answer one question: How do you build a future-proof, extensible platform on Kubernetes that provides out-of-the-box support for configuring and managing monitoring tools, and for integrating them into a bigger workflow?
Keptn stores all of its configurations in an internal Git repository that can be connected with GitHub, GitLab, Bitbucket, and other repositories, and applies changes from the Git repository to your environment. Keptn has an API and CLI to trigger these changes. There is no need to be an expert in Git, since you can use Keptn to add or change configurations in the Git repository.
Code Generation and Abstraction
Keptn builds upon SRE best practices by basing its configuration on service-level indicators (SLIs) and service-level objectives (SLOs). These concepts are also the input for all code that is generated by Keptn or by tools connected to Keptn. Thus, Keptn helps you focus on ensuring service-level quality instead of dealing with APIs.
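To make the SLI/SLO idea concrete, here is a minimal sketch of how measured indicators could be checked against objectives. It is not Keptn's file format or evaluation logic: the `Objective` class, the operator mini-syntax, the indicator names and all values are assumptions made up for illustration.

```python
import operator
from dataclasses import dataclass

# Comparison operators an objective may use (hypothetical mini-syntax).
_OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


@dataclass(frozen=True)
class Objective:
    sli: str          # name of the indicator being judged
    op: str           # one of the keys in _OPS
    threshold: float  # value the indicator is compared against


def evaluate(slis, objectives):
    """Return pass/fail per objective; an SLI with no value counts as a failure."""
    results = {}
    for obj in objectives:
        value = slis.get(obj.sli)
        results[obj.sli] = value is not None and _OPS[obj.op](value, obj.threshold)
    return results


# Illustrative numbers only, not measurements from a real system.
slis = {"response_time_p95_ms": 420.0, "error_rate_pct": 0.4}
objectives = [
    Objective("response_time_p95_ms", "<=", 500.0),
    Objective("error_rate_pct", "<", 1.0),
    Objective("throughput_rps", ">=", 100.0),  # no SLI reported, so this fails
]
results = evaluate(slis, objectives)
print(results)
```

Keeping indicators (what is measured) separate from objectives (what is acceptable) is what lets the same definitions feed several tools at once.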
Keptn is an event-based control plane that other tools, such as Jenkins, Chef, Puppet, Jira, JMeter or Ansible, can connect to. These tools can be triggered by Keptn or can trigger Keptn themselves. Communication with Keptn is done using CloudEvents, an incubating CNCF project.
As an example, consider a simple cloud event, an SLI and SLO file, and how Keptn can automate connected tools.
Step 1: A cloud event will be sent via the Keptn CLI — or from one of the aforementioned tools, like Jira or Puppet — to the Keptn control plane.
Step 2: From there, it will be distributed to all connected tools that have subscribed to this kind of event. Keptn provides over 20 different types of cloud events for continuous delivery, configuration management and triggering auto-remediation actions, covering a broad spectrum of use cases. Remember, the spec is open source and can be extended by the community.
Step 3: Registered tools receive the cloud event and can either execute an action directly with the information they receive within the event, or reach out to additional configurations managed by Keptn — for example, SLI and SLO files.
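The three steps above can be sketched with a toy, in-memory publish/subscribe loop. This is only an illustration of the pattern, not Keptn's implementation or API: the `ControlPlane` class, the event type `com.example.deployment.finished`, the `CONFIGS` lookup and the file name `slo.yaml` are all hypothetical. The only part taken from a standard is the envelope, which carries the attributes CloudEvents 1.0 requires (`specversion`, `id`, `source`, `type`).

```python
import uuid
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical event type, invented for this sketch.
DEPLOYMENT_FINISHED = "com.example.deployment.finished"

# Stand-in for configuration the control plane manages (for example SLO files).
CONFIGS = {"carts": {"slo_file": "slo.yaml"}}


def make_event(event_type, source, data):
    """Build a dict with the required CloudEvents 1.0 attributes plus a payload."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }


class ControlPlane:
    """Toy in-memory stand-in for an event-based control plane."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def send(self, event):
        # Step 2: distribute the event to every tool subscribed to its type.
        handlers = self._subscribers.get(event["type"], [])
        # Step 3: each tool acts on the event and reports what it did.
        return [handler(event) for handler in handlers]


def run_tests(event):
    # Acts directly on the information carried inside the event.
    return f"run tests for {event['data']['service']}"


def evaluate_slos(event):
    # Reaches out to additional configuration kept by the control plane.
    service = event["data"]["service"]
    return f"evaluate {service} against {CONFIGS[service]['slo_file']}"


plane = ControlPlane()
plane.subscribe(DEPLOYMENT_FINISHED, run_tests)
plane.subscribe(DEPLOYMENT_FINISHED, evaluate_slos)

# Step 1: a tool (or a CLI) sends the event to the control plane.
event = make_event(DEPLOYMENT_FINISHED, "/example/ci-pipeline", {"service": "carts"})
actions = plane.send(event)
print(actions)
```

Because the sender only knows the event type and not the receivers, a new tool can be added by subscribing it, without touching the tools that are already connected.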
The following graphic illustrates the process of how Keptn automates connected tools.
This event-based approach ensures:
- Configuration is always version controlled;
- Code generators can be most effectively leveraged by plugging them into a larger toolset; and
- Generated code is synchronized across tools, since all tools receive the same input information.
Here are some examples of what else you can do with tool integration triggered by Keptn:
- Monitoring data can be automatically pulled and used for quality gate evaluation in CI/CD systems.
- Quality gate evaluation can be sent to other connected tools, such as Slack, to inform developers about the performance impact of their latest commits.
- Auto-remediation can be kicked off to remediate issues in your production environment by connecting remediation or automation tools to Keptn.
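As a rough sketch of the first two items, the snippet below builds a Prometheus instant-query URL, reads the value out of a response and turns it into a message a chat integration could post. The `/api/v1/query` endpoint and the response shape follow the Prometheus HTTP API; everything else is an assumption for illustration, including the host `prometheus.example`, the PromQL expression, the canned response (used so the example runs offline) and the 0.5 limit. It does not show how Keptn itself performs this.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for a Prometheus instant query (GET /api/v1/query)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def first_sample_value(response_body):
    """Return the first sample of an instant-query response, or None if empty."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    result = payload["data"]["result"]
    if not result:
        return None
    # Each vector sample is [unix_timestamp, "value-as-string"].
    return float(result[0]["value"][1])


def gate_message(service, value, limit):
    """Decide pass/fail and format a short notification text."""
    passed = value is not None and value <= limit
    verdict = "passed" if passed else "failed"
    return passed, f"{service}: quality gate {verdict} (value={value}, limit={limit})"


# Hypothetical host and metric name; a real setup would fetch this URL over HTTP.
url = instant_query_url(
    "http://prometheus.example:9090/",
    'rate(http_requests_errors_total{service="carts"}[5m])',
)

# Canned response in the documented format, standing in for the HTTP call.
CANNED = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {"service": "carts"}, "value": [1600000000, "0.42"]}],
    },
})

value = first_sample_value(CANNED)
passed, message = gate_message("carts", value, limit=0.5)
print(url)
print(message)
```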
Whether you want to automate scaling and setup of Prometheus and its ecosystem tools, or want to go all-in on an event-driven control-plane approach, Keptn helps you automate connected tools.
Want to learn more about Keptn and give it a try? Watch our on-demand Keptn Performance Clinic and join the Keptn Community; build your own services and then tell us about them so that the whole Keptn ecosystem can benefit!
Be sure to check out the Keptn community meetings as well, where the team provides regular LIVE updates from the Keptn headquarters.
The Cloud Native Computing Foundation is a sponsor of The New Stack.
Feature image via Pixabay.