Troubleshooting Node.js Issues in Production with llnode
If you run a Node.js application in production, you may wonder “how can I debug my process while the application is taking production traffic?” Some bugs are easy to track down and fix without any diagnostic tools. Others require a deeper understanding of the current state of the system. When things go south and you need to inspect your frames, objects and source code, that’s where llnode comes in.
Although llnode can be used in real environments, the project is still a work in progress. Because it relies on heuristics and assumptions about the VM, it might not be completely reliable in some cases.
llnode is currently supported and tested on Linux and OS X environments. FreeBSD should work as well, although Node.js doesn’t have CI tests for it. Windows is not supported at the moment.
Why Another Tool
With so many amazing debugging tools already available (such as ndb, Chrome DevTools and the Node.js built-in debugger), we might think “but why create another tool?” Although these tools are incredibly powerful in development environments, they are not suited for production: using them can add significant overhead to your process and sometimes even block or crash it. When your application is complex or you have a large-scale deployment, some issues won’t appear in development environments. Reproducing those issues in development or staging can also be expensive. With llnode, you can take a snapshot of your application at any time, or at the point of failure, without a significant performance penalty, and then inspect variables and functions without affecting your production environment. This is known as post-mortem debugging.
Post-mortem debugging is a technique that allows developers to gather insights and find bugs in production processes after the issues have happened, even if the application crashes or goes into an infinite loop. This is a common technique for static languages such as C++, but only a few dynamic runtimes have support for it. Fortunately, Node.js is one of them!
How It Works
llnode can be used in three different ways. You can:
- Start a new Node.js process with llnode attached
- Attach llnode to an existing process
- Load a core dump into llnode
Core dumps are snapshots of a process’s memory at a given moment in time, and they can be generated on demand or when the process crashes. Core dumps are also the most important ingredient when performing post-mortem debugging.
On-demand core dumps are useful when your application hasn’t crashed but is not behaving correctly. One common example is when the application goes into an infinite loop. To generate a core dump on demand, you need tools such as gcore (Linux) or lldb’s “process save-core” command (OS X).
You can think of crash core dumps as the software equivalent of a black box on an airplane. They contain every single piece of information about the process at the moment it crashed, including variables, the call stack, pending asynchronous resources, and so on. Crash core dumps can help you make sense of out-of-memory failures or unhandled exceptions, as well as other subtle bugs.
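As a rough sketch of the Linux case, the Node.js snippet below shells out to gcore for a given process ID. The helper names (`gcoreArgs`, `dumpCore`) and the output prefix are hypothetical and chosen only for illustration. It assumes gcore is installed and that the caller has permission to attach to the target process.

```javascript
'use strict';
const { execFile } = require('child_process');

// Hypothetical helper: builds the argument list for `gcore -o <prefix> <pid>`.
function gcoreArgs(pid, prefix = 'core') {
  if (!Number.isInteger(pid) || pid <= 0) {
    throw new TypeError(`invalid pid: ${pid}`);
  }
  return ['-o', prefix, String(pid)];
}

// Hypothetical helper: runs gcore and resolves with the path of the core file.
// With `-o <prefix>`, gcore writes its output to "<prefix>.<pid>".
function dumpCore(pid, prefix = 'core') {
  return new Promise((resolve, reject) => {
    execFile('gcore', gcoreArgs(pid, prefix), (err) => {
      if (err) return reject(err);
      resolve(`${prefix}.${pid}`);
    });
  });
}
```

Calling `dumpCore(12345, '/tmp/myapp')` would, under these assumptions, leave a core file at `/tmp/myapp.12345` that can later be loaded into llnode. Note that the target process is paused while the dump is written.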
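One way to get a crash core dump from Node.js is the `--abort-on-uncaught-exception` flag, which makes the process abort instead of exiting so that the operating system can write a core file (provided core dumps are enabled, for example with `ulimit -c unlimited` on Linux). The sketch below launches a script with that flag. The names `nodeCrashArgs` and `runWithCoreOnCrash` are hypothetical.

```javascript
'use strict';
const { spawn } = require('child_process');

// Hypothetical helper: puts the abort flag in front of the script and its arguments.
function nodeCrashArgs(script, scriptArgs = []) {
  return ['--abort-on-uncaught-exception', script, ...scriptArgs];
}

// Hypothetical helper: starts the script with the current Node.js binary.
// If the child dies from a signal (as it does on abort), we log a hint.
function runWithCoreOnCrash(script, scriptArgs = []) {
  const child = spawn(process.execPath, nodeCrashArgs(script, scriptArgs), {
    stdio: 'inherit',
  });
  child.on('exit', (code, signal) => {
    if (signal) {
      console.error(`child terminated by ${signal}; check for a core file`);
    }
  });
  return child;
}
```

Where the core file ends up depends on the operating system's configuration, so this sketch only reports that the child was terminated by a signal rather than guessing a path.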
- Useful to find the reason for infinite loops
- List all allocated objects of a given type
  - Useful for tracking down unexpected behavior
- Find all references to a given object
  - Useful for finding retainers when the process is leaking memory or running out of memory
All commands are available in the project’s README, as well as in the help command.
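To make the object-listing and reference-finding commands more concrete, here is a deliberately leaky sketch. The `Session` class, the `sessions` map and the `handleRequest` function are all hypothetical. Given a core dump of a process running code like this, llnode's `v8 findjsinstances Session` would be expected to list the leaked instances, and `v8 findrefs` on one of their addresses should lead back, possibly through internal V8 structures, to whatever is retaining them.

```javascript
'use strict';

// A named class makes instances easy to search for by type in a core dump.
class Session {
  constructor(id) {
    this.id = id;
    this.createdAt = Date.now();
    this.payload = Buffer.alloc(1024); // some per-session state
  }
}

// Module-level map that keeps every session alive.
const sessions = new Map();

function handleRequest(id) {
  const session = new Session(id);
  sessions.set(id, session); // bug: entries are never deleted
  return session;
}

// Simulate traffic: every request leaves one Session behind.
for (let i = 0; i < 1000; i++) {
  handleRequest(i);
}
console.log(`retained sessions: ${sessions.size}`);
```

Giving leaked objects a distinctive constructor name is a design choice worth keeping in real code, since plain object literals are much harder to tell apart when listing instances by type.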
Common Use Cases
As mentioned before, llnode can either attach to a process or load a core dump file. Attaching to a process is useful when you’re in a development environment and want to set breakpoints, execute functions step-by-step, etc.
If you want to debug a production application, you’ll take a core dump of the application and load it in your development environment. You won’t be able to set breakpoints or execute functions step-by-step in this case, but you’ll be able to inspect the application’s state at the point the core dump was taken.
Either way, llnode is not a silver bullet and shouldn’t be treated as such. There are some use cases where you’ll want to use other tools, such as Chrome DevTools or the Node.js built-in debugger. Choose llnode for the use cases it was built for, and you’ll get the best results out of it. The most common use cases for llnode are:
- Tracking down memory leaks
- Understanding the context of an uncaught exception
- Debugging infinite loops
- Debugging Node.js core or native modules
- Debugging V8 internals, with heap insights, easy access to objects and easy translation of JIT function names
Present and Future
llnode is a hidden superpower that makes Node.js more attractive for the enterprise and makes life easier for many developers. Few runtimes have this kind of superpower, so llnode has immense value to the Node.js community, especially for developers working on large-scale deployments, Node.js core, V8, native modules and platforms tightly integrated with external services.
But the project still has a long way to go. To make llnode useful for a broader range of users, we are focusing our efforts on making it more accessible, stable and useful for the Node.js community as a whole. These are the areas we are focusing on right now:
- User experience
- Better installation and distribution process
- Maintainability and keeping up with V8 changes
- Windows support
In the future, you can expect llnode to be even more stable and reliable, and easier to install and use.
The project is looking for collaborators! If you want to get involved, feel free to reach out through our GitHub repository or by emailing me.
- Taming the dragon: using llnode to debug your Node.js application
- llnode for Node.js Memory Leak Analysis
- Exploring Node.js core dumps using the llnode plugin for lldb
- Node.js Postmortem Debugging for Fun and Production
The Linux Foundation is a sponsor of The New Stack.