Honeycomb: Debugging, Collaboration Made Easier
Honeycomb, the San Francisco-based observability software startup, has introduced new features designed to help DevOps and site reliability engineering (SRE) teams more easily analyze event-based production data and address problems faster.
It calls these enhancements the ability to observe production in hi-res. Offering what it calls “next-gen” application performance management, Honeycomb was designed to debug live production software, consume event data from any source with any data model and encourage collaborative problem-solving.
The new features include:
- A redesigned home page focused on making data accessible to even the newest or most junior member of the team, but with the ability to drill far down to the raw event to explore issues.
- BubbleUp, which allows users to select suspect areas of heat maps to investigate anomalous behavior.
- Distributed tracing, accessible with a click directly from line graphs, histograms or heat maps, to navigate across services, examine crucial details and discover latency, errors or duplicates.
- Added collaboration features so users can share and search query history, replay debugging steps, curate dashboards for new team members and more.
“Other companies’ idea of AIOps is around the idea of ‘We watch your systems for you and let you know if there’s a problem.’ Instead, Honeycomb aims to empower the developer or the operator to ask questions like ‘What’s going on that’s weird,’ not ‘Tell me when you think something’s going on.’ Instead, it’s empowering you to answer the questions you’re already asking,” said Liz Fong-Jones, Honeycomb principal developer advocate.
In addition to helping bring new team members on board as quickly as possible, Honeycomb aims to provide a single place to view data and to share analytics or results from a query you ran yesterday or a few weeks ago.
Customers can go from a histogram to a heat map and switch back and forth with a trace view. The company added tracing software last June.
Fong-Jones described it this way:
“You can click to show one trace that has this latency. I can look at that trace and see what’s going on with a set of API calls going on downstream. Or you can use BubbleUp. If you ask, ‘What’s going on with this data?’ we can answer questions.
“It might tell you that one user is disproportionately experiencing problems. Or you might see that it’s one endpoint. Or one set of values that the MySQL duration is visibly distinct from the control group. It can ingest rich events in Honeycomb and surface insight so you don’t have to know what you’re looking for right off the bat. I can just say, ‘Show me how everyone’s doing. Oh, hey, that looked weird. Let’s go look at just that one customer.’
“If I know that one customer is having a bad experience, how bad is that experience? We can see that as soon as that customer comes to our app, they start having really bad latencies and they give up and go away.
“Getting insights like that in one unified view, being able to look at the whole picture, look at one individual trace and everything in between, that’s kind of what we’ve done with the BubbleUp feature.”
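The comparison Fong-Jones describes, a selected group of events set against a control group, can be illustrated with a small sketch. This is not Honeycomb's implementation, which the article does not detail. It is a minimal Python example assuming events are plain dictionaries, with a hypothetical function name (`rank_differences`), hypothetical field names (`customer`, `endpoint`, `duration_ms`) and made-up sample data.

```python
from collections import Counter


def rank_differences(events, in_selection, attributes):
    """Rank (attribute, value) pairs by how much more common they are
    among the selected events than among the remaining baseline events.

    Illustrative only: a simple share-of-events comparison, not
    Honeycomb's actual BubbleUp algorithm.
    """
    selected = [e for e in events if in_selection(e)]
    baseline = [e for e in events if not in_selection(e)]
    if not selected or not baseline:
        return []

    results = []
    for attr in attributes:
        sel_counts = Counter(e.get(attr) for e in selected)
        base_counts = Counter(e.get(attr) for e in baseline)
        for value in set(sel_counts) | set(base_counts):
            sel_share = sel_counts[value] / len(selected)
            base_share = base_counts[value] / len(baseline)
            results.append((attr, value, sel_share, base_share))

    # Largest gap between selection share and baseline share comes first.
    results.sort(key=lambda r: r[2] - r[3], reverse=True)
    return results


# Hypothetical sample data: one customer accounts for every slow request.
events = (
    [{"customer": "acme", "endpoint": "/export", "duration_ms": 900} for _ in range(8)]
    + [{"customer": "acme", "endpoint": "/home", "duration_ms": 40} for _ in range(2)]
    + [{"customer": "globex", "endpoint": "/home", "duration_ms": 35} for _ in range(45)]
    + [{"customer": "initech", "endpoint": "/export", "duration_ms": 60} for _ in range(45)]
)

# "Select" the slow region, as a user might by drawing a box on a heat map.
ranked = rank_differences(
    events,
    in_selection=lambda e: e["duration_ms"] > 500,
    attributes=["customer", "endpoint"],
)
top = ranked[0]
print(top[0], top[1])
```

In this sample, the pair ranked first is the customer whose requests make up the whole slow selection but only a sliver of the baseline, the kind of "one user is disproportionately experiencing problems" finding described above.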
With the collaboration features, a user can see the queries a teammate has run. Rather than working on separate static copies of the data, teammates can retrieve the same query, debug in parallel and then share their results.
In an InfoQ post, Fong-Jones advocates production ownership, which she says requires changes to people, culture and process rather than just tools.
Going forward, the company is moving beyond its core strength of incident response. That includes using Honeycomb for ongoing optimization and adding more capabilities for historical analysis to support service-level indicators and service-level objectives.
“We’ve kind of moved from just a tool for the power users, for the big and gnarly bugs, to ‘This is a debugging tool that anyone on your team can pick up,’” she said.
In a previous TNS article, Wojtek Chichon interviewed Honeycomb co-founder and CEO Charity Majors about the importance of putting developers on call, the concept of observability-driven development and software ownership within teams.