
How to Deal with Race Conditions

"Race conditions" refers to bugs that occur due to the timing or order-of-execution of multiple operations. Here's how to deal with them.
Nov 12th, 2021 8:45am by Mike Del Tito

“Race conditions” refers to bugs that occur due to the timing or order of execution of multiple operations. This is a fairly broad class of bugs that can present themselves in very different ways, depending on the problem space. In many cases, they can be difficult to identify and/or reproduce, even if the solution might be simple.

TOCTOU

“Time-of-check to time-of-use” (TOCTOU) describes a type of race condition that occurs when the state of a resource changes between checking its state and using the result. TOCTOU is usually discussed in the context of filesystem operations, but variations are possible in many areas of the systems we build.

The common example of a TOCTOU race condition is checking whether a file is accessible and then reading it in a separate call.


If the file is deleted or otherwise modified after the initial check, at best you will end up with an unhandled exception. At worst, you could be opening the door to a security vulnerability.

Instead: Skip the access check, wrap the readFile call in a try/catch and handle any errors there.
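
As a minimal sketch of both patterns, assuming Node.js and its built-in fs/promises module: the function names below are hypothetical, and treating a missing file as null is one possible policy rather than the only reasonable one.

```javascript
const fs = require('node:fs/promises');

// Racy: the file can be deleted or replaced between access() and readFile().
async function readFileRacy(filePath) {
  await fs.access(filePath);            // time of check
  return fs.readFile(filePath, 'utf8'); // time of use
}

// Safer: attempt the read directly and handle the failure where it happens.
async function readFileOrNull(filePath) {
  try {
    return await fs.readFile(filePath, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') {
      return null; // assumption: a missing file is an expected, recoverable case
    }
    throw err; // anything else (permissions, I/O errors) is still surfaced
  }
}
```

The second version has no window between a check and a use, because the read itself is the check.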

Atomicity

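Normally we talk about “updating atomically” in the context of database systems. Consider a contrived example: application code fetches a user from the database, checks that the user’s role is “admin,” and then runs a separate query to update that user.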


See the problem? Fetching the user, checking the role and running the update are three separate steps. The sequence is not atomic, so there is no guarantee that the user’s role is still set to “admin” in the database by the time the update runs. In this fake example, the consequence might be a security vulnerability in the application. In a real-world scenario, the bug might be even more difficult to notice, depending on the complexity of the system.

Instead: Craft an atomic update query that performs the update in a single statement.
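
One way to sketch such a query is to fold the role check into the WHERE clause of the update itself. Everything named here is an assumption for illustration: `db` is a hypothetical client whose `query(text, params)` resolves to an object with a `rowCount`, and the `users` table and its `can_delete_projects` column are invented.

```javascript
// `db` is a hypothetical client: query(text, params) resolves to { rowCount }.
// The table and column names are invented for this sketch.
async function grantDeleteIfAdmin(db, userId) {
  // The check (role = 'admin') and the write happen in one statement,
  // so the role cannot change between them.
  const result = await db.query(
    "UPDATE users SET can_delete_projects = true WHERE id = $1 AND role = 'admin'",
    [userId]
  );
  // Zero rows updated means the user is missing or is no longer an admin.
  return result.rowCount === 1;
}
```

The caller learns from the return value whether the precondition held at the moment of the write, instead of trusting a value read earlier.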

Shared State

Mike Del Tito
Mike is an experienced senior developer at LogDNA, with a demonstrated history of designing, delivering and leading teams in the development of web-based software applications.

Although Node.js is single-threaded, working with shared resources and data structures asynchronously requires the same level of care as needed in multi-threaded systems.

Here’s another example:

https://repl.it/@MikeDel2/set-data-race?lite=true

At a quick glance, you might expect this to always output the first value of 1, but the program can print a different value every time (try it!).
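
If the embedded example is unavailable, the following sketch reproduces the same kind of bug. It is not the author’s original code: the names `cache`, `sleep` and `setOnce` are made up here, and the random delay stands in for any asynchronous work.

```javascript
const cache = new Map();
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Intended behavior: only the first caller for a key gets to set its value.
async function setOnce(key, value) {
  if (!cache.has(key)) {             // check
    await sleep(Math.random() * 10); // simulated async work; other callers run here
    cache.set(key, value);           // use: the check may no longer be true
  }
}

async function main() {
  // All five calls pass the check before any of them reaches cache.set().
  await Promise.all([1, 2, 3, 4, 5].map((n) => setOnce('winner', n)));
  return cache.get('winner');
}

main().then((value) => console.log(value));
```

Every caller passes the check before the first await resolves, so whichever call finishes its delay last overwrites the others.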

Even if it’s easy to spot the issue here in an isolated example, consider a more complex concurrent program with a similar goal of doing some work only when a precondition is met. You might have multiple asynchronous workers that need to:

  • Open a write stream for a specific file for all workers to write to.
  • Create a record in a database if it doesn’t already exist.
  • Memoize the result of a very expensive operation.

In all of these scenarios, you cannot assume the state of the resource you are working with will remain the same between the check and the use. Variations of this specific “create if not exist” race condition have popped up several times recently in third-party code and even our own.
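
For the memoization case, one possible approach in Node.js is to store the in-flight promise rather than the finished result, so that the check and the set happen with no await between them. This is a sketch under that assumption; `inFlight` and `getOrCompute` are hypothetical names.

```javascript
const inFlight = new Map();

// Returns the same promise to every caller that asks for `key`.
function getOrCompute(key, compute) {
  if (!inFlight.has(key)) {
    // No await between has() and set(), so no other task can interleave here.
    const promise = Promise.resolve()
      .then(compute)
      .catch((err) => {
        inFlight.delete(key); // allow a later retry after a failure
        throw err;
      });
    inFlight.set(key, promise);
  }
  return inFlight.get(key);
}
```

Concurrent callers share one computation because the second caller finds the first caller’s promise already in the map. The same idea does not carry over to resources shared across processes, such as a database record, where the atomicity has to come from the database itself.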

Countermeasures

Avoiding race conditions not only requires some thought about what your code is doing, but also about how other parts of the system will use your code. There are no silver bullets here, but in addition to being thoughtful about concurrent design, here are some tips:

  • Perform database updates atomically. Do not rely on previously queried information about the record you are updating to craft your update query.
  • In general, avoid sharing “global” state whenever possible, but especially with concurrency. Think about the implications of simultaneous access to data structures and how that affects the logic of the program or the correctness of the data itself.
  • If sharing state among concurrent routines is required, consider introducing a mutex (a mutual exclusion lock) or another locking mechanism to control access to the shared resource. This comes at the expense of complexity, but is sometimes unavoidable.
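
To illustrate the last tip, here is a minimal promise-based mutex for a single Node.js process. It is a sketch rather than a production-grade lock: the `Mutex` class, `runExclusive` and `setOnceLocked` are names invented for this example, and a vetted library may be preferable in real code.

```javascript
// Runs tasks one at a time by chaining each onto the previous one.
class Mutex {
  constructor() {
    this.tail = Promise.resolve();
  }

  runExclusive(task) {
    const run = this.tail.then(() => task());
    this.tail = run.catch(() => {}); // a failed task must not block the queue
    return run;
  }
}

const mutex = new Mutex();
const cache = new Map();
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// The check and the set now run as one critical section.
function setOnceLocked(key, value) {
  return mutex.runExclusive(async () => {
    if (!cache.has(key)) {
      await sleep(Math.random() * 10); // simulated async work
      cache.set(key, value);
    }
    return cache.get(key);
  });
}

Promise.all([1, 2, 3, 4, 5].map((n) => setOnceLocked('winner', n)))
  .then((values) => console.log(values[0])); // prints 1
```

The cost is that callers are serialized: each one waits for every task queued ahead of it, even when those tasks touch a different key.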