jClarity Brings Accurate Java Monitoring into the Container World
In the new world of containers, Java and Docker don’t necessarily get along.
Java takes the figures Docker reports for the resources it has (memory, CPU, network) and misinterprets them, according to jClarity co-founder and CEO Martijn Verburg.
“The Docker people say it’s Java’s fault, and the Java people say it’s Docker’s fault. At the end of the day, Java is running under completely wrong assumptions about how much memory it’s being given, how much CPU it’s being given. Those two numbers in particular, which Java relies on very heavily, means strange problems start to happen,” he said.
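To see which figures a given JVM is actually working from, one can ask it directly. The following is a minimal sketch using standard `java.lang.Runtime` calls; the class name `JvmResourceView` and its method names are illustrative and do not come from jClarity. Whether the values match the limits set on the container depends on the JVM version and how it is configured.

```java
public class JvmResourceView {

    /** Number of processors the JVM believes it can use. */
    static int visibleCpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Maximum amount of memory, in bytes, the JVM will attempt to use for the heap. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        // Compare these values with the CPU and memory limits placed on the container.
        System.out.println("CPUs visible to the JVM: " + visibleCpus());
        System.out.println("Max heap (MiB): " + maxHeapBytes() / (1024 * 1024));
    }
}
```

Running this inside and outside a container and comparing the output with the container's configured limits shows whether the JVM's assumptions line up with what it has really been given.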
The company’s initial product, Illuminate, is a software-as-a-service performance diagnostic engine that uses machine learning to help users pinpoint and correct performance problems in Sun/Oracle/OpenJDK JVMs. It supports all JVM languages.
Its second tool, Censum, detects problems such as memory leaks and application pauses. It analyzes log files from the JVM's complex garbage collection subsystem to help users determine why those problems occur. jClarity offers Censum as a SaaS option or for local deployment.
Containers have had a huge impact on the company, Verburg said.
“Java always assumed it was running effectively on a bare-metal box, on the actual physical device, or at the very least a hypervisor of some sort from VMware or something of that sort,” he said.
jClarity's software uses an agent on each host to gather information from the operating system and from Java itself, which it uses to perform root-cause analysis of problems. For containers, it attaches from the outside, using a mechanism Java provides for that, he said.
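As a rough illustration of the kind of operating-system and JVM data an agent could gather, the sketch below reads a few values through the standard `java.lang.management` beans. It is not jClarity's agent: it runs inside the JVM it measures rather than attaching from outside, and the class name `HostSnapshot` and the metric keys are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

public class HostSnapshot {

    /** Collects a handful of OS-level and JVM-level readings into one map. */
    static Map<String, Number> collect() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();

        Map<String, Number> snapshot = new LinkedHashMap<>();
        snapshot.put("os.availableProcessors", os.getAvailableProcessors());
        // Negative when the platform cannot report a load average.
        snapshot.put("os.systemLoadAverage", os.getSystemLoadAverage());
        snapshot.put("jvm.heapUsedBytes", heap.getUsed());
        // -1 when the maximum heap size is undefined.
        snapshot.put("jvm.heapMaxBytes", heap.getMax());
        return snapshot;
    }

    public static void main(String[] args) {
        collect().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Readings like these, taken from both the operating system and the JVM, are the sort of raw material a root-cause engine would need to correlate.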
“Container purists say you should have only one process per container, which means there’s no room for us. We had to change our deployment strategy for those customers,” he said. “As far as the user is concerned, our software is simply part of their application. It’s still a single process … If you’re following the purist container approach, we can deal with that.
“We also find a lot of people are taking, dare I say, a more pragmatic approach, using things like Kubernetes to allow for one other process — and usually only one other process — to attach to their application and usually for monitoring. And that’s what we are: We’re a monitoring tool.”
Users can specify a service-level agreement on any aspect of their application: for example, that a log-in page must respond within one second, or that a database transaction should return in half a second. The agent contains a copy of the machine learning engine, which makes a decision about the root cause of a problem.
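The two example agreements above can be expressed as simple response-time limits. The sketch below is a hypothetical illustration of that idea, not jClarity's configuration format or API; the class `SlaCheck`, its methods and the operation names are all assumptions made for the example.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

public class SlaCheck {

    // Maximum acceptable response time per named operation.
    private final Map<String, Duration> limits = new LinkedHashMap<>();

    /** Registers a response-time limit for an operation. */
    void define(String operation, Duration limit) {
        limits.put(operation, limit);
    }

    /** Returns true when a limit exists for the operation and the observed time exceeds it. */
    boolean isBreached(String operation, Duration observed) {
        Duration limit = limits.get(operation);
        return limit != null && observed.compareTo(limit) > 0;
    }

    public static void main(String[] args) {
        SlaCheck sla = new SlaCheck();
        sla.define("login-page", Duration.ofSeconds(1));       // must respond within one second
        sla.define("db-transaction", Duration.ofMillis(500));  // should return in half a second

        System.out.println(sla.isBreached("login-page", Duration.ofMillis(1200)));
        System.out.println(sla.isBreached("db-transaction", Duration.ofMillis(300)));
    }
}
```

A check like this only says that a limit was exceeded; working out why is the part the article attributes to the machine learning engine.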
“The Java application might appear to the user to be slow, but it could be a database on that host doing a lot of writing to the hard disk, slowing the entire machine down and affecting the application. It’s these kinds of hidden relationships our algorithm is very good at picking up,” he said. When a problem is detected, the software sends the user a plain-English report with recommended next steps.
It also sends an email urging the user to go to a dashboard. Integrations with other tools, such as Slack, are in the works, Verburg said.
More Automation Coming
Verburg adopts an alter ego, The Diabolical Developer, on the Java conference circuit. He and co-founder Ben Evans wrote the book “The Well-Grounded Java Developer,” then around 2012 set out to apply their knowledge of performance tuning. They created their London-based company with Kirk Pepperdine, a high-performance computing specialist they knew from the conference world.
“We quickly learned it was beyond a decision matrix or simple statistics, we actually needed to use some machine learning to take the human knowledge we had and turn that into an algorithm that had any reasonable accuracy,” Verburg said.
“New Relic and AppDynamics produce all the metrics and produce amazing visuals and infographics [from them]. They’re really brilliant. We really admire what they’ve done there.
“But at the end of the day, a human still has to look at those graphs and try to correlate three or four different graphs to try to figure out the underlying problem. We go one step further: We go through all that data and tell them, ‘Your problem is right here,’” he said, adding that the software then recommends exactly what needs to be done to fix it, such as which line of code to change.
So far, users have to apply those fixes manually, but the company’s roadmap calls for automating them.
“We know that’s an educational piece, because users don’t necessarily trust an automated system,” he said.
“We’ll probably start out with, ‘Here’s our suggestion. Here’s an apply button.’ And it will apply it for you. We think over time as the industry trusts AI more, some of these fixes can be applied automatically, as long as the user is notified, of course.”
The company is in the process of raising a seed round of funding. Its customers include the travel site Kayak and UK property portal Rightmove, both websites for which slowness can frustrate users and cost them customers.