Bazel, Google’s Open Source Build System
One of the most important, yet unsung, applications in a software developer’s life is the Make utility, or its equivalent. Make first appeared in 1977 and has been with us ever since. There are now a very large number of build utilities, some based on Make, others completely different, but the principle remains the same. The build system has a set of rules that tell it how to build an application from source files, usually fetched from a version control system. It reads the rules, then runs the compilers and linkers to do the build. The really good ones will run tests as well.
Google has long used its own build system, called Blaze, and has open-sourced part of it as the anagrammatically named Bazel, recently released at alpha status. In this article I’ll give a general overview of Bazel.
So what’s different about Bazel? It aims to do two things: build quickly and build correctly. Speed is achieved using both caching and parallelism. Blaze solves a slightly different problem from Bazel, as it is designed for Google’s internal systems: Google keeps its code in a massive shared repository where all software is built from source. Here’s what one Google engineer said about Blaze on Hacker News:
Working at Google, Blaze is one of the technologies that amazes me most. Any engineer can build any Google product from source on any machine just by invoking a Blaze command. I may not want to build GMail from source (could take a while) but it’s awesome to know that I can.
I think this could be hugely useful to very large open source projects (like databases or operating systems) that may be intimidating for contributors to build and test.
Bazel works on packages: collections of related files together with the dependencies between them. Each package is a directory containing a file called BUILD. Its sub-directories belong to the package too, unless they contain a BUILD file of their own, in which case they are sub-packages.
A package defines targets, which are either files or rules. Files are either source files or generated files produced by build tools, such as .obj files or resource files. Rules describe how to generate output files from input files, which may themselves be the outputs of other rules, and the steps needed to build those outputs.
Bazel runs as a client/server application on Linux and Mac. Windows doesn’t currently seem to be a high priority for the Bazel team, presumably because Google doesn’t develop for it or use it internally.
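To make the idea of rules feeding other rules concrete, here is a minimal Python sketch. It is not Bazel code: the Rule class, the build_order function and the file names are all made up for illustration, and real BUILD files look different. It only shows how a set of rules with inputs and outputs implies an order in which things must be built.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Rule:
    """A rule target: named inputs go in, named output files come out."""
    name: str
    inputs: tuple   # source files, or names of other rules
    outputs: tuple  # generated files


def build_order(rules):
    """Return rule names so that every rule comes after the rules it uses."""
    by_name = {rule.name: rule for rule in rules}
    sorter = TopologicalSorter()
    for rule in rules:
        # Only inputs that name another rule create an ordering constraint;
        # plain source files are assumed to exist already.
        rule_deps = [i for i in rule.inputs if i in by_name]
        sorter.add(rule.name, *rule_deps)
    return list(sorter.static_order())


# Hypothetical package: a library built from sources, and an app that uses it.
RULES = [
    Rule("app", inputs=("main.c", "greeter"), outputs=("app",)),
    Rule("greeter", inputs=("greeter.c", "greeter.h"), outputs=("greeter.obj",)),
]

print(build_order(RULES))
```

Even though the app rule is listed first, the library comes out ahead of it, because the app names the library as one of its inputs.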
The client is command line-based, with commands like ‘bazel build’ to build targets or ‘bazel test’ to run tests. The Bazel client talks to a long-running server, or starts one if it isn’t present; there’s one server per workspace, which is a directory containing the source code that you’re building. Bazel is designed to avoid a familiar problem with Make, where an error during a build forces a ‘make clean’ and a full rebuild once the error is fixed. The success or failure of each build stage is tracked in a database, and if a stage’s inputs are unchanged, it is not rebuilt. Bazel guarantees that after a successful build during which no edits were made, the build will be in a consistent state.
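The bookkeeping described here can be sketched in a few lines of Python. This is only an illustration of the general technique (fingerprint the inputs, remember which stages succeeded), not how Bazel actually implements it; the BuildCache class, the digest function and the stand-in compiler are all assumptions of mine.

```python
import hashlib


def digest(contents):
    """Fingerprint a set of inputs given as a dict of name -> bytes."""
    h = hashlib.sha256()
    for name in sorted(contents):
        h.update(name.encode())
        h.update(b"\0")
        h.update(contents[name])
        h.update(b"\0")
    return h.hexdigest()


class BuildCache:
    """Remembers the input fingerprint of each stage's last successful run."""

    def __init__(self):
        self._last_success = {}  # stage name -> input digest

    def run(self, stage, inputs, action):
        """Run `action` unless `stage` already succeeded with identical inputs."""
        key = digest(inputs)
        if self._last_success.get(stage) == key:
            return "skipped"
        # Forget any earlier success first, so a failure leaves no stale record.
        self._last_success.pop(stage, None)
        action(inputs)  # an exception here leaves the stage unrecorded
        self._last_success[stage] = key
        return "built"


def fake_compiler(inputs):
    """Stand-in for invoking a real compiler."""


cache = BuildCache()
sources = {"main.c": b"int main(void) { return 0; }"}
print(cache.run("compile", sources, fake_compiler))  # first run does the work
print(cache.run("compile", sources, fake_compiler))  # unchanged inputs
sources["main.c"] = b"int main(void) { return 1; }"
print(cache.run("compile", sources, fake_compiler))  # edited input
```

Because a failed stage is never recorded as successful, fixing the error and rerunning rebuilds just that stage, with no need for anything like ‘make clean’.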
Programming Language Support
Bazel can be extended with new rules and macros written in the Skylark language. Macros are functions called from the BUILD file and are used to create rules for the build. Skylark’s syntax is a subset of Python; unsupported features include class, while, break, continue, lambda and a few others. It’s also thread-safe: data structures are immutable, and global values are constant, so they can be neither reassigned nor changed in type. These restrictions help speed up builds, because they allow parts of builds to run in parallel.
There are Bazel build rules that define automated tests, which are run by the command ‘bazel test’. The test runner works on multiple test executables, created by running test rules, and each test is time limited, depending on the test’s size or a specified timeout. The Test Encyclopedia strongly suggests that tests be reduced to minimum dependencies so the results can be reproduced reliably.
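Since Skylark’s syntax is a subset of Python, the shape of a macro can be shown in plain Python. In the sketch below, library_rule, unit_test_rule and library_with_test are hypothetical names of my own, and the DECLARED list is just a stand-in for Bazel’s own machinery so that the example runs by itself; none of this is Bazel’s real API. The macro uses only function definitions and calls, so it stays clear of the unsupported features listed above.

```python
# Stand-in for Bazel: records each rule that gets created.
# In a real build the rule functions would be provided by Bazel itself.
DECLARED = []


def library_rule(name, srcs, deps=()):
    """Hypothetical rule: declares a library target."""
    DECLARED.append(("library", name, tuple(srcs), tuple(deps)))


def unit_test_rule(name, srcs, deps=()):
    """Hypothetical rule: declares a test target."""
    DECLARED.append(("test", name, tuple(srcs), tuple(deps)))


def library_with_test(name, srcs, test_srcs):
    """Macro: a single call creates a library and a test that depends on it."""
    library_rule(name=name, srcs=srcs)
    unit_test_rule(name=name + "_test", srcs=test_srcs, deps=(name,))


# What the BUILD file would contain: one macro call instead of two rules.
library_with_test(name="greeter", srcs=["greeter.c"], test_srcs=["greeter_test.c"])

for kind, name, srcs, deps in DECLARED:
    print(kind, name, srcs, deps)
```

The point of a macro is exactly this kind of shorthand: the BUILD file stays short, while the build still sees two ordinary rules.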
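A time-limited test runner of this kind might look roughly like the following Python sketch. The run_test function is my own invention, and the per-size limits in SIZE_TIMEOUTS are assumed values, so check the Test Encyclopedia before relying on them. The "test executables" here are just tiny Python programs run in a child process.

```python
import subprocess
import sys

# Assumed per-size limits in seconds; consult the Test Encyclopedia for the real ones.
SIZE_TIMEOUTS = {"small": 60, "medium": 300, "large": 900, "enormous": 3600}


def run_test(command, size="medium", timeout=None):
    """Run one test executable and return "PASSED", "FAILED" or "TIMEOUT"."""
    # An explicit timeout wins; otherwise the limit comes from the test's size.
    limit = timeout if timeout is not None else SIZE_TIMEOUTS[size]
    try:
        result = subprocess.run(command, timeout=limit)
    except subprocess.TimeoutExpired:
        return "TIMEOUT"
    return "PASSED" if result.returncode == 0 else "FAILED"


# Stand-ins for test executables.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]
hanging = [sys.executable, "-c", "import time; time.sleep(5)"]

print(run_test(passing, size="small"))
print(run_test(failing, size="small"))
print(run_test(hanging, timeout=0.5))
```

A test passes when its executable exits with status zero, fails on any other status, and is cut off if it overruns its limit.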
A particularly strong feature is the Bazel Query language, which is used to analyze build dependencies. It involves a learning curve, though you may not need it for simpler builds. Basically, you apply filters to return targets, which can be paths, files, rules and so on. It’s a powerful system, and knowledge of graph theory would come in very handy!
Given the vast amounts of software Google builds, it’s encouraging to see them advancing the state of build technology. But note that only 10 percent of Google’s build rules have been open-sourced; the rest are specific to Google’s builds, so they wouldn’t make sense to an outsider.
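To give a feel for what such a query does, here is a small Python sketch of the underlying graph operation: collect everything a target depends on, then filter the result. It is not the Bazel Query language itself. The graph, the target names and the deps and matching functions are all made up for illustration, although the deps function is loosely modelled on the query function of the same name.

```python
import fnmatch

# Hypothetical dependency graph: target -> targets it depends on directly.
GRAPH = {
    "//app:main": ["//lib:greeter", "//lib:log"],
    "//lib:greeter": ["//lib:log"],
    "//lib:log": [],
    "//tools:lint": [],
}


def deps(graph, target):
    """Return `target` plus everything it depends on, directly or indirectly."""
    seen = set()
    stack = [target]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen


def matching(targets, pattern):
    """Keep only the targets whose name matches a shell-style pattern."""
    return sorted(t for t in targets if fnmatch.fnmatchcase(t, pattern))


# Which library targets does the app need, directly or indirectly?
print(matching(deps(GRAPH, "//app:main"), "//lib:*"))
```

This is where the graph theory comes in: a query is essentially a walk over the dependency graph followed by a filter on the targets it visits.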
Bazel is aimed, in particular, at projects with a combination of the following characteristics: they have a large shared codebase that supports multiple platforms, they are written in multiple languages and have an extensive test suite. If that’s your project, take a closer look at Bazel.
Given how many enterprises use Windows and develop software for it, I wonder whether the lack of Windows support might limit Bazel’s uptake.