Troubleshooting Node.js Issues in Production with llnode

5 Oct 2018 9:30am, by

Matheus Marchini
Matheus is a Software Engineer at Sthima, where he works on solving hard system-related problems for large-scale companies. He’s passionate about Open Source and making systems more reliable. Most of his Open Source work is related to diagnostic tools, such as developing new features for llnode, bringing back Linux perf support to Node.js and introducing dynamic USDT probes for Linux.

This post is one in a series for the Linux Foundation’s upcoming Node + JS Interactive conference, taking place October 10-12 in Vancouver. The program will cover a broad spectrum of the JavaScript ecosystem including Node.js, frameworks, best practices and stories from successful end-users.

If you run a Node.js application in production, you may wonder “how can I debug my process while the application is taking production traffic?” Some bugs are easy to track down and fix without any diagnostic tools. Others require a deeper understanding of the current state of the system. When things go south and you need to inspect your frames, objects and source code, that’s where llnode comes in.

Although llnode can be used in real environments, the project is still a work in progress. Because it relies on heuristics and assumptions about the VM, its output might not be completely reliable in some cases.

llnode is currently supported and tested on Linux and OS X environments. FreeBSD should work as well, although Node.js doesn’t have CI tests for it. Windows is not supported at the moment.

Why Another Tool

With so many excellent debugging tools already available, such as ndb, Chrome DevTools and the Node.js built-in debugger, you might ask: why create another one? These tools are incredibly powerful in development environments, but they are not suited to production: they can add significant overhead to your process and sometimes even block or crash it. When your application is complex or you have a large-scale deployment, some issues won’t appear in development environments at all, and reproducing them in development or staging can be expensive. With llnode, you can take a snapshot of your application at any time, or at the point of failure, without a significant performance penalty, and then inspect variables and functions without affecting your production environment. This is known as post-mortem debugging.

Post-Mortem Debugging

Post-mortem debugging is a technique that allows developers to gather insights and find bugs in production processes after the issues have happened, even if the application crashed or went into an infinite loop. This is a common technique for static languages such as C++, but only a few dynamic runtimes support it. Fortunately, Node.js is one of them!

Post-mortem debugging in Node.js was first introduced by David Pacheco back in 2012, and since then the community has worked to keep it functional. In the early days of Node.js the only tool we had for this kind of analysis was mdb_v8, but unfortunately only a few operating systems, such as SmartOS, were able to run it. A few years ago llnode was created with the intention of providing a cross-platform, user-friendly and always-supported post-mortem debugging tool for Node.js. Inspired by mdb_v8 and using the flexibility of LLDB, llnode can run on several platforms and perform post-mortem debugging by inspecting JavaScript objects and the call stack from a core dump or a live process, alongside all the native features provided by LLDB.

How It Works

llnode can be used in three different ways. You can:

  • Start a new Node.js process with llnode attached
  • Attach llnode to an existing process
  • Load a core dump into llnode

Core dumps are snapshots of a process’s memory at a given moment in time, and they can be generated on demand or when the process crashes. Core dumps are also the most important ingredient when performing post-mortem debugging.

On-demand core dumps are useful when your application hasn’t crashed but is not behaving correctly. One common example is when the application goes into an infinite loop. To generate a core dump on demand, you need a tool such as gcore (Linux) or lldb’s “process save-core” command (OS X).

Typical flow when taking a core dump on demand
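
On Linux, the on-demand flow can be sketched as a small shell helper. This is a minimal sketch, not an official llnode script: `dump_core` is a hypothetical function name, and it assumes gcore (shipped with gdb) is installed and that you have permission to attach to the target process.

```shell
# Sketch only: dump_core is a hypothetical helper, not part of llnode.
# gcore attaches to the target, writes the dump, then detaches so the
# process keeps running.
dump_core() {
  pid="$1"             # PID of the running Node.js process
  prefix="${2:-core}"  # gcore names the output file <prefix>.<pid>
  gcore -o "$prefix" "$pid"
}
```

For example, `dump_core "$(pgrep -n node)"` would snapshot the most recently started process named node and write core.<pid> to the current directory. The process is paused while the dump is written, so expect a short stall on processes with a large heap.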

You can think of crash core dumps as the software equivalent of a black box on airplanes. They contain every single piece of information about the process at the moment it crashed, including variables, the call stack, pending asynchronous resources, and so on. Crash core dumps can help you make sense of out-of-memory failures or unhandled exceptions, as well as other subtle bugs.

Typical flow when taking a core dump on crash
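
One way to make sure a crash actually leaves a core dump behind on Linux is sketched below. It is an illustration under assumptions rather than a recommended setup: `run_with_core` is a hypothetical helper name, and where the core file ends up depends on the kernel’s core_pattern setting, which varies between distributions.

```shell
# Sketch only: run_with_core is a hypothetical helper, not part of llnode.
run_with_core() {
  # Lift the core-size limit for this shell; when the limit is 0,
  # no core file is written at all.
  ulimit -c unlimited || echo "could not raise the core size limit" >&2
  # Ask Node.js to abort, and therefore dump core, on an uncaught
  # exception instead of only printing the stack trace and exiting.
  node --abort-on-uncaught-exception "$@"
}
```

Calling `run_with_core app.js` (app.js being a placeholder for your entry point) starts the application as usual; nothing changes until an exception goes unhandled.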

Current Features

  • Complete call stack, with JavaScript and C++ frames
    • Useful to find the reason for infinite loops
  • List JavaScript objects by type with the number of allocated objects as well as memory used by them
  • List all allocated objects of a given type
  • Inspect JavaScript objects
    • Useful for tracking down unexpected behavior
  • Find all references to a given object
    • Useful to find retainers when the process is leaking memory or running out of memory

All commands are available in the project’s README, as well as in the help command (`v8 help`).
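
Once a core dump is loaded, a first pass over it could start with the call stack and the object histogram. The sketch below writes those two commands to an LLDB command file; `write_triage_script` and the file name triage.lldb are hypothetical, while `v8 bt` and `v8 findjsobjects` are commands listed in the project’s README.

```shell
# Sketch only: write_triage_script and triage.lldb are hypothetical names.
# The file holds one llnode command per line:
#   v8 bt             call stack with JavaScript and C++ frames
#   v8 findjsobjects  object counts and sizes grouped by type
write_triage_script() {
  out="${1:-triage.lldb}"
  cat > "$out" <<'EOF'
v8 bt
v8 findjsobjects
EOF
}
```

After loading the dump with `llnode node -c core`, running `command source triage.lldb` at the prompt executes both commands. The remaining features need values taken from that output: `v8 findjsinstances <Type>` lists the objects of one type, `v8 inspect <address>` prints a single object, and `v8 findrefs <address>` shows what is holding on to it.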

Common Use Cases

As mentioned before, llnode can either attach to a process or load a core dump file. Attaching to a process is useful when you’re in a development environment and want to set breakpoints, execute functions step-by-step, etc.

If you want to debug a production application, you’ll take a core dump of the application and load it in your development environment. You won’t be able to set breakpoints or execute functions step-by-step in this case, but you’ll be able to inspect the application’s state at the point the core dump was taken.

Either way, llnode is not a silver bullet and shouldn’t be treated as such. There are some use cases where you’ll want to use other tools, such as Chrome DevTools or the Node.js built-in debugger. Choose llnode for the use cases it was built for, and you’ll get the best results out of it. The most common use cases for llnode are:

  • Tracking down memory leaks
  • Understanding the context of an uncaught exception
  • Debugging infinite loops
  • Debugging Node.js core or native modules
  • Debugging V8 internals with heap insights and easy access to objects as well as easy translation of JIT function names

Present and Future

llnode is the hidden superpower that makes Node.js more attractive for the enterprise and makes many developers’ lives easier. There aren’t many runtimes with this kind of superpower, so llnode has immense value to the Node.js community, especially for developers working on large-scale deployments, Node.js core, V8, native modules and platforms tightly integrated with external services.

But the project still has a long way to go. To make llnode useful for a broader range of users, we are focusing our efforts on making it more accessible and stable for the Node.js community as a whole. These are the areas we are focusing on right now:

  • Documentation
  • User experience
  • Better installation and distribution process
  • JavaScript API
  • Maintainability and keeping up with V8 changes
  • Windows support

In the future, you can expect llnode to be more stable, more reliable, and easier to install and use.

Get Involved

The project is looking for collaborators! If you want to get involved, feel free to reach out through our GitHub repository or by email.

The Linux Foundation is a sponsor of The New Stack.
