But there is a downside to monorepos. The convenience of storing all code in a single repo is offset by the extra build time required to churn through all of that code whenever a change is added.
This is the problem that developer Jared Palmer encountered when building his own project (TSDX, a command-line tool for developing TypeScript packages). He was building this project in a monorepo, where all the code, including dependencies, was located in a single repository, and he wanted to structure TSDX so that it could be managed in a monorepo as well. When he vented his frustrations online, he found many others felt the same.
So he created Turborepo, an open source monorepo build tool that, according to Palmer, could boost build speeds by roughly 65% to 85%. In a few outlying instances, it has reduced a 30-minute build to 100 milliseconds, Palmer asserted.
“Turborepo is really good at what it does: Ridiculously fast builds,” enthused one engineer on Twitter.
In addition, Palmer geared the software to be highly intuitive for single developers and small teams. Turborepo has indeed garnered praise from reviewers in this regard when compared to Nx, a similar project created by former Google engineers.
Vercel was impressed enough with the software to acquire the technology, filling out its portfolio of web development technologies, which also includes the Svelte next-generation front-end framework and the Next.js library for augmenting the React framework with server-side rendering capabilities.
The Big Code Problem
The problem of managing large amounts of code in a uniform manner has been around for a while and has been exacerbated by the explosion of web development, which relies on a diversity of open source packages and a certain swiftness of delivery.
The answer that the IT giants have come to is to store everything in one giant repository (the “monorepo”). In addition to better managing the code itself, a monorepo sets the stage for uniform coding style and testing across the organization.
Google, Facebook and Uber have all gone down this path, as have the keepers of React itself.
The general build tools haven’t kept up with this evolving environment, however. While web giants Google and Facebook have both developed internal toolsets to tackle the latency issue (open sourced as Bazel and Buck, respectively), these tools require extensive configuration and were designed for large, engineering-heavy organizations.
Palmer was more interested in building a tool that would be more easily used by smaller teams. Enter Turborepo.
Caching and Parallelization
The faster build times come from a couple of techniques.
One is smart caching. For this, the software borrows a technique from Google’s Bazel, built around content-addressable storage.
Turbo looks at “the state of your codebase,” Palmer explained. It also logs the commands that are being run to build the software, making a fingerprint that serves to index the finished work. When the dev types the same sequence of commands, Turbo then can quickly deliver the cached version rather than repeat the work.
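To make the fingerprinting idea concrete, here is a minimal TypeScript sketch. It is not Turborepo's actual implementation: the `SourceFiles` shape, the `fingerprint` and `runCached` helpers and the in-memory `Map` are all hypothetical names chosen for illustration, and SHA-256 is an assumed choice of hash. It only shows the general pattern of hashing the inputs together with the command and using the digest as a cache key.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape: file path -> file contents for one package.
type SourceFiles = Record<string, string>;

// Hash every input file together with the command being run.
// Sorting by path keeps the digest independent of insertion order.
function fingerprint(files: SourceFiles, command: string): string {
  const hash = createHash("sha256");
  const entries = Object.entries(files).sort(([a], [b]) =>
    a < b ? -1 : a > b ? 1 : 0
  );
  for (const [path, contents] of entries) {
    hash.update(path);
    hash.update("\0");
    hash.update(contents);
    hash.update("\0");
  }
  hash.update(command);
  return hash.digest("hex");
}

// Toy in-memory cache keyed by fingerprint (an assumption for this sketch;
// a real tool would persist artifacts on disk or in a remote store).
const cache = new Map<string, string>();

// Run `work` only if this exact combination of inputs and command
// has not been seen before; otherwise return the stored output.
function runCached(
  files: SourceFiles,
  command: string,
  work: () => string
): string {
  const key = fingerprint(files, command);
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const output = work();
  cache.set(key, output);
  return output;
}
```

Calling `runCached` twice with the same files and the same command runs `work` only once. Changing either a file's contents or the command produces a different key and therefore a fresh run.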
“Turbo constructs a dependency graph, both of the external dependencies from package registries, and also the internal dependencies within your codebase,” Palmer explained. The developer provides the dependency information in a turbo.json configuration file at the root of the project.
In collaborative environments, every developer’s cache is shared, so one dev can reuse the work of peers.
Compare this to the venerable make command, which looks only at the modification times of the files or folders specified, rather than the fingerprint of the actual artifact. Different computers will produce different timestamps for the exact same file, which will cause timestamp-based build systems to miss otherwise identical files.
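The difference between timestamps and content fingerprints can be seen in a small TypeScript sketch. The file names, the temporary directory and the `contentHash` helper are made up for illustration, and backdating one file with `utimesSync` merely simulates a checkout made on another machine at a different time.

```typescript
import { createHash } from "node:crypto";
import {
  mkdtempSync,
  readFileSync,
  statSync,
  utimesSync,
  writeFileSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hash a file's bytes; identical bytes always yield the same digest.
function contentHash(path: string): string {
  return createHash("sha256").update(readFileSync(path)).digest("hex");
}

// Two byte-for-byte identical files in a scratch directory.
const dir = mkdtempSync(join(tmpdir(), "hash-vs-mtime-"));
const fileA = join(dir, "a.js");
const fileB = join(dir, "b.js");
writeFileSync(fileA, "console.log('hello');\n");
writeFileSync(fileB, "console.log('hello');\n");

// Simulate a second machine that checked the file out an hour earlier.
const hourAgo = new Date(Date.now() - 3_600_000);
utimesSync(fileB, hourAgo, hourAgo);

// A timestamp comparison sees two different files...
const sameMtime = statSync(fileA).mtimeMs === statSync(fileB).mtimeMs;
// ...while a content comparison sees the same artifact.
const sameHash = contentHash(fileA) === contentHash(fileB);

console.log({ sameMtime, sameHash });
```

Here the modification times differ even though the hashes match, which is why a cache keyed on content can be shared across machines while one keyed on timestamps cannot.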
In addition to using cached work, Turborepo also looks for places to split the build into parallel operations.
The developer’s pipeline, or task graph, provides “a very concise way for developers to express the relationships between the scripts they need to run to build their codebase,” Palmer explained.
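As a rough illustration of how a task graph exposes parallelism, the TypeScript sketch below groups tasks into "waves" whose members have no unmet dependencies. The `Pipeline` type, the example task names and the `parallelWaves` function are hypothetical and are not Turborepo's scheduler. A real scheduler would likely start each task as soon as its own dependencies finish rather than waiting for a whole wave.

```typescript
// Hypothetical pipeline: task name -> tasks that must finish first.
type Pipeline = Record<string, string[]>;

const pipeline: Pipeline = {
  build: [],
  lint: [],
  test: ["build"],
  deploy: ["build", "test", "lint"],
};

// Group tasks into waves. Every task in a wave has all of its
// dependencies in earlier waves, so tasks within one wave are
// independent of each other and could run in parallel.
function parallelWaves(graph: Pipeline): string[][] {
  const done = new Set<string>();
  const remaining = new Set<string>(Object.keys(graph));
  const waves: string[][] = [];

  while (remaining.size > 0) {
    // Decide the whole wave first, then mark its tasks as done.
    const ready = Array.from(remaining).filter((task) =>
      (graph[task] ?? []).every((dep) => done.has(dep))
    );
    if (ready.length === 0) {
      throw new Error("cycle or unknown dependency in pipeline");
    }
    for (const task of ready) {
      remaining.delete(task);
      done.add(task);
    }
    waves.push(ready.sort());
  }
  return waves;
}

console.log(parallelWaves(pipeline));
```

For this example graph, build and lint land in the first wave, test in the second and deploy in the third.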
Here is a fragment of a sample JSON pipeline configuration (from the docs), in which one task declares that it depends on three others:
"dependsOn": ["build", "test", "lint"]
The Turbo command-line interface is open source and operates from the repo. The end-user can host the remote cache index, or use Vercel’s managed service, which comes with additional features such as metrics-based visualizations.
Also unique to Turborepo is that it can be incrementally adopted. Other build systems can “make constraints on your codebase and how it works and how it needs to be shaped. And while those constraints may be great at certain scales, they can be very costly and expensive and risky to migrate to,” Palmer said. In contrast, Turborepo aims to “meet developers where they’re already at, with tools they are already using. And so it’s designed to be adopted, and in some ways deleted too.”
It is still early days for Turborepo, Palmer admitted. (The latest version is 1.1.16.) The setup is still complicated and requires some polishing, according to at least one user.
“Turborepo is a really cool project. And it’s not just cool, it’s really necessary — there clearly was missing some tool like this as monorepos are more and more popular,” wrote frontend architect Štěpán Granát in a blog post, while adding that the software’s inconsistencies point to work still needing to be done for production usage. “I still better run our main release pipeline without any caching as I want to be sure that something is not getting cached when it shouldn’t be as that really could be a big problem.”