Lumigo: End-to-End Serverless Monitoring and Troubleshooting
IT teams spend way too much time troubleshooting problems in their environments. In fact, developers, DevOps teams, and web product managers reported in a recent SolarWinds survey that because of the time spent on day-to-day troubleshooting, they don’t necessarily have time to prioritize business growth and innovation.
Serverless environments, which free developers from having to worry about infrastructure, also pose particular troubleshooting challenges because when things go wrong, there are no servers to monitor.
Cloud applications tend to be made up of dozens of different parts.
“It’s even more than microservices. It’s nanoservices,” said Berkner. “We have an application with 30 or 40 different parts like Lego blocks. Understanding the big picture, understanding how all these different parts come together, understanding how a single request goes through all the different pieces, understanding the inputs and outputs of each component is one of the challenges with troubleshooting, particularly with serverless.”
Lumigo provides a visual map of the environment and the ability to track and drill down into every aspect of each request, including the cost of each request on AWS.
“All the companies doing serverless monitoring and troubleshooting are either focused on the function of the service, focusing on the lambdas, giving you a drill-down into the lambdas and giving you specific information about CPU and memory, giving you a very technical view of the problem, which is very needed,” Berkner said.
“Our view is more an end-to-end, production, business logic, application-centric view. We’re looking beyond functions — we’re looking at the entire technology stack of containers, serverless and third-party services.”
The platform analyzes huge amounts of data, learns the normal behavior of the serverless application and produces live visual maps of your architecture with all the different components and how they talk to each other. It will draw the path within the system and help you understand which error to focus on.
Finding the error does not by itself reveal the root cause. Lumigo lets you trace back through every service the request touched to find it, which can be many services upstream from where the problem actually manifests.
“It allows you to check all the inputs and outputs, all the data that came through the service to understand what went through every service. You can see that, for instance, all the inputs were OK, but something went wrong from this function onward. Finding out something abnormal happened here, you can start drilling down into this specific piece of code and find out what caused the problem,” he said.
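To make the idea concrete, here is a minimal sketch of that upstream walk. It assumes each hop of a request has already been recorded with a flag for whether its inputs and outputs looked normal. The `Hop` record, the `first_abnormal_hop` helper, and the service names are hypothetical illustrations, not Lumigo's API or data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hop:
    """One service a request passed through (hypothetical record)."""
    service: str
    input_ok: bool   # did the data entering this service look normal?
    output_ok: bool  # did the data leaving this service look normal?


def first_abnormal_hop(trace: List[Hop]) -> Optional[Hop]:
    """Return the earliest hop whose inputs were fine but whose output was not.

    The trace is ordered from the start of the request to where the
    error finally surfaced. Later hops with bad inputs are only
    victims of the earlier failure, so they are skipped.
    """
    for hop in trace:
        if hop.input_ok and not hop.output_ok:
            return hop
    return None


# Hypothetical request path: the error surfaces in "store-result",
# but the first hop to turn good input into bad output is "resize-image".
trace = [
    Hop("api-gateway", input_ok=True, output_ok=True),
    Hop("resize-image", input_ok=True, output_ok=False),
    Hop("store-result", input_ok=False, output_ok=False),
]
suspect = first_abnormal_hop(trace)
print(suspect.service if suspect else "no abnormal hop found")
```

In this toy trace the function points at `resize-image`, the place to start drilling into code, rather than at the downstream service where the failure became visible.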
On the transactions screen, Lumigo connects all the events and all the side effects in the system. It will show the story of every request in the system.
“…We can show you every part of every request, like where did it start? Who or what was the user initiating this request? What was the response to this request? Were there issues along the way? What was the duration of that transaction across all the different services compared with the average for that transaction? It also can tell you cost. What was the cost of this specific request compared with average?” he said.
“For the first time in the software industry, you can put a cost on a specific request. I can tell you the associated services and what each one of them cost. … Because I know the time and the resources [I know the cost].
“We think this is really a game-changer because it will allow you to forecast how much it’s going to cost next month when I’m going to have 1 million requests.”
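The arithmetic behind a per-request cost can be sketched roughly as follows. This is an illustration under stated assumptions, not Lumigo's method: it counts only Lambda compute and invocation charges, the two price constants are assumed values for illustration (check current AWS pricing for your region and architecture), and the `Invocation` record and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Assumed prices, for illustration only; verify against current AWS pricing.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_INVOCATION = 0.20 / 1_000_000


@dataclass
class Invocation:
    """One function invocation that took part in a request (hypothetical)."""
    function_name: str
    duration_ms: float
    memory_mb: int


def invocation_cost(inv: Invocation) -> float:
    """Cost of one invocation: time multiplied by resources, plus a flat fee."""
    gb_seconds = (inv.memory_mb / 1024) * (inv.duration_ms / 1000)
    return gb_seconds * PRICE_PER_GB_SECOND + PRICE_PER_INVOCATION


def request_cost(invocations: List[Invocation]) -> float:
    """Total cost of a request: the sum over every invocation it triggered."""
    return sum(invocation_cost(inv) for inv in invocations)


def forecast_cost(avg_request_cost: float, expected_requests: int) -> float:
    """Naive forecast: average cost per request times expected volume."""
    return avg_request_cost * expected_requests


# Hypothetical request that fans out to two functions.
request = [
    Invocation("resize-image", duration_ms=1000, memory_mb=1024),
    Invocation("store-result", duration_ms=200, memory_mb=512),
]
cost = request_cost(request)
print(f"cost of this request: ${cost:.10f}")
print(f"forecast for 1 million such requests: ${forecast_cost(cost, 1_000_000):.2f}")
```

A real bill would also include the other services a request touches, such as queues, databases and API gateways, so a figure like this is a lower bound on what the request actually cost.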
Lumigo will tell you where the problem is and what in the transaction is abnormal, showing all the distributed logs in a single location.
Having only recently come out of stealth, Lumigo announced it has raised $8 million in seed funding, backed by Pitango Venture Capital, Grove Ventures and Meron Capital.
So far the company is only on AWS because that’s where there’s the most demand, Berkner said. But it is also experimenting with Google and Azure and will be able to ramp up quickly on those services as demand grows.
Going forward, the company will focus on three areas, he said: additional clouds, machine learning to better isolate abnormal behavior in services, and cost.
“We think costs in the future will be very different from calculations today,” he said. “When you get a bill from AWS, it’s very hard to understand what’s going on. Through understanding the cost per transaction, we can help you understand what’s going on.”
You can sign up here for Lumigo’s Feb. 19 webinar “7 things you need to know before going serverless.”
Stackery is a sponsor of The New Stack.