Why Bazel is the Endgame for Build Systems
The Ultimate Guide to Bazel: Efficiently Managing Large Repositories
A technical deep dive into hermeticity, the action graph, and how Google scales software engineering.
Every software project starts simple. You have a few files, you run npm run build or go build, and three seconds later, you have an executable. But as an engineering organization grows, so does the complexity.
The npm run build command executes the "build" script defined in a project's package.json file. This script typically runs a series of tasks using build tools (like Webpack, Rollup, or Babel) to transform the raw source code into an optimized, production-ready version of the application.
Suddenly, your frontend depends on a shared TypeScript library, which relies on code generated from protocol buffer definitions, which are shared with a C++ backend service. Your standard build tools (Webpack, Maven, Gradle) only understand their specific language silos.
To build the whole system, engineers end up writing brittle bash scripts that glue these disparate tools together. Build times creep from seconds to minutes, and eventually, to an hour.
Enter Bazel.
Originally developed inside Google as “Blaze” to handle its enormous monorepo, Bazel is an open-source build and test tool designed for very large codebases. It doesn’t just make builds faster; it makes them reproducible and predictable. Here is how it works under the hood.
Google's monorepo, stored in a custom system called Piper, is one of the largest in the world. According to figures Google published in 2016, it held roughly 2 billion lines of code across about 9 million unique source files, took up roughly 86 terabytes, and received around 40,000 commits per workday from engineers and automated systems combined.
1. The Core Philosophy: Hermeticity
The most frustrating phrase in software engineering is: “It works on my machine.”
This happens because traditional build systems leak state. If your Python script implicitly relies on a system-wide version of OpenSSL installed on your specific laptop, the build will pass for you but fail on the CI server or your coworker’s machine.
Bazel solves this through Hermeticity.
A hermetic build means each build step is isolated from the host machine: it depends only on inputs that are explicitly declared. When Bazel runs a compiler, it places it inside a sandbox (on platforms where sandboxing is supported).
It only has access to the exact input files explicitly declared in the configuration.
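It does not inherit arbitrary environment variables from your shell; with a strict action environment (the --incompatible_strict_action_env flag), values like $PATH are replaced with fixed defaults.
It can be denied network access, so it cannot silently download a missing dependency mid-build.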
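If a build passes on your laptop, a fully hermetic Bazel setup makes it very likely to pass on every other machine with the same platform and toolchain, ideally producing the same binary bit-for-bit. That holds only as far as your rules and toolchains are themselves hermetic and deterministic; a compiler that embeds timestamps, for example, still breaks bit-for-bit reproducibility.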
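To make the idea concrete, here is a minimal Python sketch of the sandbox principle. It is a toy model, not how Bazel implements sandboxing: the helper names (run_sandboxed, demo), the file names, and the HOST_ONLY_VAR variable are all made up for illustration, and it assumes a POSIX-like system. It copies only the declared inputs into a fresh directory and swaps the environment for a fixed one; unlike a real sandbox, it does not block absolute paths or network access.

```python
import os
import shutil
import subprocess
import sys
import tempfile

# Fixed, minimal environment handed to every action (an assumption of this toy).
FIXED_ENV = {"PATH": "/usr/bin:/bin"}


def run_sandboxed(argv, declared_inputs):
    """Run argv in a fresh directory that contains only the declared inputs."""
    with tempfile.TemporaryDirectory() as sandbox:
        for path in declared_inputs:
            shutil.copy(path, os.path.join(sandbox, os.path.basename(path)))
        result = subprocess.run(
            argv,
            cwd=sandbox,      # the action sees only the sandbox directory
            env=FIXED_ENV,    # the host environment is replaced, not inherited
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout


def demo():
    """Show that an undeclared file and a host-only variable do not leak in."""
    with tempfile.TemporaryDirectory() as workspace:
        declared = os.path.join(workspace, "main.txt")
        undeclared = os.path.join(workspace, "notes.txt")
        for path in (declared, undeclared):
            with open(path, "w") as handle:
                handle.write("placeholder")
        os.environ["HOST_ONLY_VAR"] = "leaky"  # hypothetical host state
        child = (
            "import os;"
            "print(sorted(os.listdir('.')));"
            "print(os.environ.get('HOST_ONLY_VAR'))"
        )
        return run_sandboxed([sys.executable, "-c", child], [declared])
```

Calling demo() runs a child process that lists its working directory and looks up the host-only variable. The child sees main.txt but not notes.txt, and the variable is missing, because neither was declared.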
2. The Brain: The Action Graph
Like Terraform, Bazel doesn’t just read code top-to-bottom. Before it compiles a single line of code, it analyzes your BUILD files to construct a massive, in-memory Directed Acyclic Graph (DAG) known as the Action Graph.

The action graph is built on two main concepts:
Artifacts: These are files, either source files or generated intermediate/final outputs (e.g., .o files, .jar files).
Note: .o and .jar files are both compiled outputs, but they serve different environments: .o (object) files are intermediate machine-code files produced during C/C++ compilation, while .jar (Java Archive) files are compressed packages containing Java bytecode (.class files), images, and resources for running Java applications.
Actions: These are atomic commands run during the build (e.g., compilers, linkers, scripts) that take specific artifacts as inputs and produce others as outputs. An action contains command-line arguments, environment variables, and an Action Key used for caching.
After evaluating BUILD files (loading/analysis), Bazel constructs this graph to understand exactly what work is needed. The execution phase traverses this graph to run actions in parallel, maximizing performance. The action graph contains every action that could be needed to build a target (a superset of the work), but Bazel only executes the ones necessary to fulfill the request. Bazel assumes actions are hermetic: given the same inputs, an action always produces the same outputs.
In this graph, action nodes (e.g., “Compile main.cc to main.o”) are connected through the artifacts they consume and produce, so every edge represents a dependency.
This graph gives Bazel two incredible superpowers:
Correctness: It knows exactly what inputs affect what outputs. If you change a CSS file, Bazel’s graph proves that it does not need to recompile your Go backend.
Maximum Parallelism: If the graph shows that 500 compilation actions do not depend on each other, Bazel will not build them sequentially. It runs as many of them at once as the machine's resources (or the --jobs limit) allow.
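Both superpowers can be sketched in a few lines of Python. This is a toy under stated assumptions, not Bazel's scheduler: the graph contents (ACTION_DEPS) and the helper names are hypothetical, and it runs actions in simple "waves" using the standard library's graphlib, whereas a real scheduler starts each action as soon as its own inputs are ready.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical toy graph: each action maps to the set of actions it depends on.
ACTION_DEPS = {
    "compile_main": set(),
    "compile_util": set(),
    "link_server": {"compile_main", "compile_util"},
    "minify_css": set(),
    "bundle_site": {"minify_css"},
}


def affected_by(changed, deps):
    """Return the changed action plus everything that transitively depends on it."""
    dirty = {changed}
    grew = True
    while grew:
        grew = False
        for action, inputs in deps.items():
            if action not in dirty and inputs & dirty:
                dirty.add(action)
                grew = True
    return dirty


def run_in_waves(deps, run_action, max_workers=4):
    """Run actions in dependency order; each batch of ready actions runs in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())   # actions whose inputs are all built
            list(pool.map(run_action, ready))    # independent, so safe to run together
            sorter.done(*ready)
            waves.append(ready)
    return waves
```

With this graph, affected_by("minify_css", ACTION_DEPS) contains only the CSS action and the site bundle, so the server link step is provably untouched. run_in_waves(ACTION_DEPS, print) runs the three independent leaf actions in the first wave and the two dependent actions in the second.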

Because Bazel knows the exact inputs and outputs of every action, it can use the Action Cache to determine if a specific action has already been performed, skipping it if the inputs haven’t changed.
The Content-Addressable Store (CAS) and the Action Cache (AC) are two distinct but interconnected parts of the build cache.
The Action Cache (AC) maps a unique hash of a build action’s inputs and definition to the metadata of its outputs. This metadata includes the exit code and the hashes (digests) of the files that were produced. This allows Bazel to determine if a specific action has already been run and its results are available.
The Content-Addressable Store (CAS) is where the actual output files (artifacts) generated by the actions are stored. The key for each stored file is a hash (digest) of its content, which ensures data integrity (the key always matches the content) and prevents data overwrites.
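The split between the two stores is easier to see in code. The following is a minimal in-memory sketch, assuming SHA-256 digests and plain dictionaries; the BuildCache class and its method names are hypothetical and do not mirror Bazel's actual cache protocol.

```python
import hashlib
import json


def digest(data: bytes) -> str:
    """Content digest used as the CAS key."""
    return hashlib.sha256(data).hexdigest()


class BuildCache:
    def __init__(self):
        self.cas = {}           # content digest -> file bytes
        self.action_cache = {}  # action key -> exit code and output digests

    def action_key(self, argv, input_files):
        """Hash the command line together with the digest of every input file."""
        payload = {
            "argv": argv,
            "inputs": {name: digest(data) for name, data in input_files.items()},
        }
        return digest(json.dumps(payload, sort_keys=True).encode())

    def store_result(self, key, exit_code, output_files):
        """Put output bytes in the CAS and record only their digests in the AC."""
        outputs = {}
        for name, data in output_files.items():
            content_digest = digest(data)
            self.cas[content_digest] = data
            outputs[name] = content_digest
        self.action_cache[key] = {"exit_code": exit_code, "outputs": outputs}

    def lookup(self, key):
        """Return cached output files for an action key, or None on a miss."""
        entry = self.action_cache.get(key)
        if entry is None:
            return None
        return {name: self.cas[d] for name, d in entry["outputs"].items()}
```

Note the indirection: the Action Cache never holds file contents, only digests that point into the CAS. One consequence is that two different actions producing byte-identical outputs share a single CAS entry.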
3. The Speed Secret: Remote Caching and Execution
Even with perfect parallelism, compiling a million-line codebase on a 4-core laptop is going to be slow. Bazel’s true endgame is taking the build off your laptop entirely.
Because hermetic actions are predictable, their outputs are safe to cache and share. Bazel hashes the inputs of an action (the source code, the compiler version, the flags). If those inputs match a hash in a Remote Cache, Bazel completely skips the compilation and simply downloads the finished artifact.
When an engineer at a massive company runs a full monorepo build, they usually compile almost nothing. Their local Bazel client checks the central server, sees that the CI pipeline (or another engineer) already built these exact files an hour ago, and downloads the results. A 40-minute build can drop to something like 12 seconds.
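The "CI built it first, you just download it" flow can be sketched as follows. This is a simplified model, not Bazel's remote cache protocol: the shared cache is a plain dictionary, fake_compile stands in for a real compiler, and the Client class, the file names, and the "clang-17" version label are hypothetical.

```python
import hashlib


def action_key(source: bytes, compiler_version: str, flags: tuple) -> str:
    """Hash everything that can influence the output of one compile action."""
    hasher = hashlib.sha256()
    for part in (source, compiler_version.encode(), "\0".join(flags).encode()):
        hasher.update(len(part).to_bytes(8, "big"))  # length prefix keeps parts unambiguous
        hasher.update(part)
    return hasher.hexdigest()


def fake_compile(source: bytes) -> bytes:
    """Stand-in for a real compiler so the sketch stays runnable."""
    return b"OBJ:" + source[::-1]


class Client:
    def __init__(self, remote_cache: dict):
        self.remote_cache = remote_cache  # shared between all clients
        self.compiled = 0
        self.downloaded = 0

    def build(self, sources, compiler_version, flags):
        outputs = {}
        for name, source in sources.items():
            key = action_key(source, compiler_version, flags)
            if key in self.remote_cache:
                outputs[name] = self.remote_cache[key]  # cache hit: download only
                self.downloaded += 1
            else:
                outputs[name] = fake_compile(source)    # cache miss: do the work
                self.remote_cache[key] = outputs[name]  # and share the result
                self.compiled += 1
        return outputs
```

If a CI client builds two files first, a second client pointed at the same cache with the same compiler version and flags compiles nothing and downloads both results. Change a single flag and every key changes, so the second client has to compile again.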
For even more power, Bazel supports Remote Execution, where it farms out the actual compilation actions to a massive cluster of cloud servers, turning your laptop into a mere orchestrator.
Remote caching is not a true distributed build. If the cache is lost or a low-level change forces a full rebuild, you must still compile everything locally. The ultimate goal is remote execution, which distributes the actual build workload across multiple workers.
4. Polyglot by Design: Starlark
Most build tools are inherently tied to their language ecosystem. Cargo is for Rust, Maven is for Java, npm is for Node.js.
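Bazel is language-agnostic. It achieves this using a configuration language called Starlark (a restricted, deterministic dialect of Python). Bazel's core contains little language-specific logic: a few rule sets, such as those for C++ and Java, were historically built in, but most language support comes from “Rules” written in Starlark, many of them maintained by the community.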
You can load rule sets such as rules_go, rules_docker, and rules_nodejs into the same workspace (rules_docker has since been archived in favor of rules_oci, so check which rule sets are currently maintained). This allows you to have a single BUILD file that compiles a Go binary, bundles a React frontend, and packages them both into a Docker image with one unified command: bazel build //src:everything.
The Verdict: Is Bazel for you?
Bazel is not a silver bullet. The learning curve is notoriously brutal, and maintaining a Bazel workspace requires significant engineering effort. If you are building a standard Next.js app, using Bazel is like using a sledgehammer to crack a peanut. However, if your organization is migrating to a massive monorepo, drowning in cross-language dependencies, and losing hundreds of engineering hours a week to slow CI pipelines, Bazel isn’t just an option; it is one of the very few build systems designed from the start to scale that far.