NodeSource, maker of the Node.js application platform N|Solid, has just released N|Solid 2.3, adding metrics visualizations to the dashboard and introducing event loop delay notifications, a feature the company says no other Node platform currently supplies.
Users also get a new webhooks-enabled notifications system that sends instant issue alerts via their favorite communication channel.
Better Visualization, and Real-time Slack Channel Alerts
Previously, N|Solid provided email-based alerts when an application crossed a configured threshold or a vulnerability was detected. The update adds customizable webhooks-based notifications. A webhook is an HTTP callback: when a trigger event occurs, a simple event notification is sent via HTTP POST to a configured URL.
N|Solid 2.3 can now be configured to send alerts, such as notifications for CPU or heap thresholds or a security vulnerability warning, in real time to popular communication systems like Slack, or any other preferred channel.
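To make the webhook idea concrete, here is a minimal sketch of how an alert might be turned into an HTTP POST for a chat channel. The `ThresholdAlert` shape, the field names, and the URL are assumptions made for illustration; the article does not describe N|Solid's actual payload format. The only external detail relied on is that Slack incoming webhooks accept a JSON body with a `text` field.

```typescript
// Hypothetical alert shape: the real N|Solid webhook payload is not
// described in this article, so these field names are assumptions.
interface ThresholdAlert {
  app: string;
  metric: "cpu" | "heap" | "eventLoopDelay" | "vulnerability";
  value: number;
  threshold: number;
  unit: string;
}

// Slack incoming webhooks accept a JSON body with a `text` field.
interface SlackMessage {
  text: string;
}

// Turn an alert into a one-line, human-readable chat message.
function toSlackMessage(alert: ThresholdAlert): SlackMessage {
  return {
    text: `[${alert.app}] ${alert.metric} is ${alert.value}${alert.unit} (threshold ${alert.threshold}${alert.unit})`,
  };
}

// Describe the HTTP POST a relay would send to the webhook URL.
// Nothing is sent here; the caller decides how to perform the request.
function buildWebhookRequest(url: string, alert: ThresholdAlert) {
  return {
    url,
    method: "POST" as const,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(toSlackMessage(alert)),
  };
}

// Placeholder URL, not a real endpoint.
const example = buildWebhookRequest("https://hooks.example.invalid/alerts", {
  app: "checkout-api",
  metric: "heap",
  value: 512,
  threshold: 400,
  unit: "MB",
});
console.log(example.body);
```

Keeping the request description separate from the code that sends it means the same alert could be formatted for another channel by swapping only `toSlackMessage`.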
Integrating webhooks-based alerts into the mix is part of NodeSource’s goal of expanding — and then harnessing — the platform’s metrics visualization features. “With deeper process-level metrics on clear display, teams can really improve their Mean Time to Resolve (MTTR) metrics and speed up issue resolution times,” said the company’s founder and CEO Joe McCann.
He explained that N|Solid’s dashboard has been enhanced to show a greater range of the process-level metrics collected within an application. Real-time views of event loop and garbage collection behavior grant direct insight into application performance, and the new webhooks alerting feature allows teams to respond even more quickly when problems arise.
Unlocking the Event Loop
One of the notifications newly available in v2.3 is the event loop delay alert, a boon for any engineer working on a Node.js application. The event loop problem is real, people: in a Node runtime, any long-running synchronous activity that blocks the event loop can keep other incoming requests from being handled until it finishes, potentially causing the entire application to gridlock.
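A small sketch shows why a blocked event loop is so disruptive. The helper `blockFor` is a hypothetical stand-in for any long synchronous task (a large JSON parse, a tight loop, synchronous file access); it is not part of N|Solid or Node.js. A timer scheduled with a zero delay cannot fire until the synchronous work returns, and an incoming request would be stuck behind it in the same way.

```typescript
// Busy-wait synchronously for roughly `ms` milliseconds. While this loop
// runs, no timers, I/O callbacks, or incoming requests can be processed.
function blockFor(ms: number): number {
  const start = Date.now();
  while (Date.now() - start < ms) {
    // intentionally empty: simulates long-running synchronous work
  }
  return Date.now() - start;
}

// Schedule a timer that should fire almost immediately...
const scheduledAt = Date.now();
setTimeout(() => {
  // ...but it can only run once the synchronous work below has finished.
  console.log(`timer fired after ${Date.now() - scheduledAt} ms`);
}, 0);

// ...then hold the event loop for about 200 ms.
const blockedMs = blockFor(200);
console.log(`event loop was blocked for about ${blockedMs} ms`);
```

The gap between when the timer was due and when it actually ran is the kind of delay an event loop alert is meant to surface.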
“N|Solid is the only commercially available product that offers real-time event loop delay alerting, which can immediately identify and expose issues that can otherwise be subtle and extremely hard to detect,” said McCann. These alerts, he continued, notify users whenever the Node.js Event Loop is “blocked”, and provide a detailed stack trace that enables the user to drill down, pinpoint the exact cause of delays, and resolve the root of the problem.
Specifically, the new metrics — visualized in the Process Detail dashboard — include Event Loop Idle Percent and Event Loop Lag. According to McCann, these indicate the health of the Event Loop. “A low value for Event Loop Idle Percent and simultaneous high value for Event Loop Lag is a symptom of an overloaded server, which may require performance tuning and/or scaling up server instances,” he explained.
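McCann's rule of thumb can be written down as a small check. This is a sketch only: the threshold values below are assumptions chosen for illustration, not N|Solid defaults, and the function name is hypothetical.

```typescript
type LoopHealth = "healthy" | "overloaded" | "inconclusive";

interface LoopThresholds {
  minIdlePercent: number; // below this, the loop is considered busy
  maxLagMs: number; // above this, callbacks are considered delayed
}

// Assumed values for illustration only; tune them per application.
const defaultThresholds: LoopThresholds = { minIdlePercent: 20, maxLagMs: 100 };

// Low idle percent together with high lag suggests an overloaded server.
// When only one of the two signals is present, the result is left open.
function classifyEventLoop(
  idlePercent: number,
  lagMs: number,
  t: LoopThresholds = defaultThresholds,
): LoopHealth {
  const lowIdle = idlePercent < t.minIdlePercent;
  const highLag = lagMs > t.maxLagMs;
  if (lowIdle && highLag) return "overloaded";
  if (!lowIdle && !highLag) return "healthy";
  return "inconclusive";
}

console.log(classifyEventLoop(5, 250));
console.log(classifyEventLoop(80, 3));
```

Requiring both signals at once mirrors the quote above: either metric alone can spike briefly without the server being overloaded.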
Collecting the Garbage
Another new metric featured in N|Solid’s latest update is Garbage Collection. “The Garbage Collection Count indicates the number of times the Node.js garbage collector has run over time. When the application starts using more objects, memory usage will increase and the garbage collector runs more frequently,” said McCann. “Hence, this metric is useful for monitoring for, and identifying, potential memory leaks.”
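As a rough illustration of how a garbage collection count could be used this way, the sketch below takes cumulative GC counts sampled at a fixed interval and flags a sustained rise in GC frequency. The function names, the sampling scheme, and the window size are assumptions, not part of N|Solid. A rising frequency is a hint worth investigating rather than proof of a leak, since a genuine increase in traffic produces the same pattern.

```typescript
// Convert cumulative GC counts (one sample per fixed interval) into the
// number of GC runs that happened within each interval.
function gcRunsPerInterval(cumulativeCounts: number[]): number[] {
  const deltas: number[] = [];
  for (let i = 1; i < cumulativeCounts.length; i++) {
    deltas.push(cumulativeCounts[i] - cumulativeCounts[i - 1]);
  }
  return deltas;
}

// Return true when GC frequency has strictly increased across each of the
// last `window` interval-to-interval steps. Returns false when there are
// too few samples to judge.
function gcFrequencyRising(cumulativeCounts: number[], window = 3): boolean {
  const deltas = gcRunsPerInterval(cumulativeCounts);
  if (deltas.length < window + 1) return false;
  const recent = deltas.slice(-(window + 1));
  return recent.every((d, i) => i === 0 || d > recent[i - 1]);
}

// Runs per interval here are 2, 2, 3, 4, 5: the last three steps all rise.
console.log(gcFrequencyRising([0, 2, 4, 7, 11, 16]));
```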
All these new metrics features converge as a solution to the problem Node.js devs face all too often: when your app works just fine in the testing environment, but then performs inexplicably poorly in production. “Standard monitoring tools may alert developers when a problem happens, but not why it happened,” said McCann. For example: you see that web response times are slow, but only on certain routes. Why the slowdown, and why those particular routes?
Developers can use metrics while testing their Node.js apps prior to deploying them into production, Quality Assurance (QA) teams can use these metrics to benchmark the performance of the Node.js apps during load and stress testing, and DevOps teams can then use the metrics visualization and webhooks notification features to proactively monitor Node.js apps in production.
An example N|Solid v2.3 workflow using these new features: slow web response times are detected. In a Node application, this could be due to a blocked event loop, which in turn may be caused by a long-running synchronous operation. A possible sequence of events may look like this:
- Because the event loop delay for this process has exceeded the user-defined threshold, a notification is posted in the #engineering-alerts channel in Slack, instantly alerting the team that an issue exists.
- Members of the engineering team can drill down to see process-specific metrics, looking at a recent history of heap used, garbage collection, and event loop lag to pinpoint the timeline (when the process first started misbehaving) and get a sense for the likely cause.
- By looking at a detailed stack trace provided by N|Solid, engineers can drill down to identify the exact location (function within a file) of the problem. This prevents the unnecessary effort and delay that would have otherwise been spent in trying to reproduce the problem in test environments.
- Given this detailed information about the source of the event loop blockage, the engineering team can now make an informed decision about how to remediate, and quickly take steps to resolve the issue.
“Developers often attempt to reproduce a production problem in their test environments, which is painstaking and often unsuccessful,” he said. “N|Solid 2.3’s new features are all about enabling teams to proactively monitor performance issues in their Node.js applications, so as to quickly identify and resolve problems when they occur in production.”
“No other solution in the market provides this Event Loop Delay Alerting capability.”