A mystery of unnecessary crate recompilation

I recently fixed a Rust problem at work which had been a minor source of frustration for well over a year. Naturally, it turned out to be my fault all along. The effect of the problem was “only” slower compilation, but when you’re already compiling lots of Rust code you don’t exactly want to make it worse.

To explain the issue, let me share some of the repo layout. The lower three top-level directories are Rust workspaces, each containing its own workspace-flavoured Cargo.toml. All build artifacts go into the same repo-level target directory. (This is foreshadowing.)

.
├── target
├── utils-workspace
│   ├── Cargo.toml
│   ├── other-stuff
│   └── time
├── workspace1
│   ├── Cargo.toml
│   ├── crate1
│   └── crate2
└── workspace2
    ├── Cargo.toml
    ├── crate3
    └── crate4

This utils-workspace is a bit of a catch-all, and crates from both workspace1 and workspace2 depend on the crates inside it. The antagonist in today’s story is the one called time. Yes, that’s the same ditto-time crate I was talking about in a post a little while ago.

I spend most of my time inside workspace1 and life is good. Some of my colleagues, however, are frequently making edits in both workspace1 and workspace2 and occasionally I would hear a complaint: “why do I have to recompile ditto-time so often when I’m making an unrelated change?”

The concept of time gets into rather a lot of things, so cargo rebuilding the time crate would inevitably snowball into rebuilding a bunch of other crates, adding up to a noticeable amount of wasted time.

I could only ever reproduce it on my own machine occasionally, by accident. It happened again the other week and I determined that I was going to figure it out properly. Reviewing my bash history I discovered the trick: I had to run cargo build in workspace1 then workspace2 in turn. My usual cargo check wasn’t enough to trigger it.

Baffled, but now armed with a way to make it happen, I overrode the toolchain to use nightly cargo to get access to the --build-plan option. This dumps a huge file which explains all the dependencies that are going into a given crate. The input filenames carry more than the name and version number you would find in the lock file: they also include cargo’s Metadata hash, which combines the crate, its version, its features, the compiler version, and so on, along with the metadata of all its upstream dependencies. It’s pretty comprehensive. That’s why you might see a collection of files like the following inside your target directory:

libyasna-080c400d17ab0fbc.rlib
libyasna-38f10d7bdca324e3.rlib
libyasna-3bbd36df87de5542.rlib
libyasna-4ee5bbdf3224f90b.rlib
libyasna-53051573765968ab.rlib
libyasna-5a757cdb54d25cfc.rlib

Here we have the same yasna crate, but each file is either a different version of yasna itself or the same version built against different resolved versions of its dependencies. This situation arises easily if you have multiple cargo workspaces, since each has its own lock file and can come up with a slightly different version tree. The result is slightly different build products piled into that shared target directory.
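To make the idea concrete, here is a minimal sketch of hash-suffixed output names. It is not cargo's algorithm: the Unit struct, the rlib_filename helper, the some-dep dependency, and every version number are made up for illustration, and the standard library's DefaultHasher stands in for whatever cargo really uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the inputs that feed the metadata hash.
#[derive(Hash)]
struct Unit<'a> {
    name: &'a str,
    version: &'a str,
    features: &'a [&'a str],
    /// (name, version) of each resolved dependency.
    deps: &'a [(&'a str, &'a str)],
}

/// Builds an output filename whose suffix depends on every field of the unit.
fn rlib_filename(unit: &Unit<'_>) -> String {
    let mut hasher = DefaultHasher::new();
    unit.hash(&mut hasher);
    format!("lib{}-{:016x}.rlib", unit.name, hasher.finish())
}

fn main() {
    // Same crate and version, but the two lock files resolved a
    // dependency differently (all version numbers are invented).
    let in_workspace1 = Unit {
        name: "yasna",
        version: "0.0.0",
        features: &[],
        deps: &[("some-dep", "1.0.0")],
    };
    let in_workspace2 = Unit {
        name: "yasna",
        version: "0.0.0",
        features: &[],
        deps: &[("some-dep", "1.0.1")],
    };

    // The two names differ, so both files can sit in one target directory.
    println!("{}", rlib_filename(&in_workspace1));
    println!("{}", rlib_filename(&in_workspace2));
}
```

Because the dependency versions are part of the hashed input, a different resolution in another workspace produces a different filename rather than overwriting the existing one.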

So far there’s no problem. Because of this ingenious metadata-appending, the cached version of yasna for one workspace can coexist with the cached version for a different workspace. When I looked in my target directory for the time crate, though…

libditto_time.rlib

What’s this? Only one, and no metadata! Finally the behaviour made sense: according to the build plans, this particular crate has different resolved dependencies in workspace1 and workspace2. Since both workspaces write to exactly the same output filename in target, the crate has to be rebuilt every time you swap between them. In the process it invalidates all its downstream dependants. Dang.
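The ping-pong effect can be reproduced with a toy model of a shared target directory. This is only a sketch under loud assumptions: TargetDir, count_rebuilds, and the two fingerprint values are invented for illustration, and cargo's real freshness tracking is far more involved than a single map lookup.

```rust
use std::collections::HashMap;

/// Toy model of a shared target directory: output filename -> fingerprint
/// of the inputs that produced the file currently "on disk".
struct TargetDir {
    files: HashMap<String, u64>,
    rebuilds: u32,
}

impl TargetDir {
    fn new() -> Self {
        TargetDir { files: HashMap::new(), rebuilds: 0 }
    }

    /// Reuses the file if it exists with a matching fingerprint;
    /// otherwise overwrites it and counts a rebuild.
    fn build(&mut self, filename: &str, fingerprint: u64) {
        if self.files.get(filename) != Some(&fingerprint) {
            self.files.insert(filename.to_string(), fingerprint);
            self.rebuilds += 1;
        }
    }
}

/// Builds in workspace1 then workspace2, `rounds` times over, and
/// returns how many times the crate had to be rebuilt.
fn count_rebuilds(with_metadata: bool, rounds: u32) -> u32 {
    // Hypothetical fingerprints for the crate as resolved in each workspace.
    let fingerprints = [0x1111_u64, 0x2222_u64];
    let mut target = TargetDir::new();
    for _ in 0..rounds {
        for &fingerprint in fingerprints.iter() {
            let filename = if with_metadata {
                format!("libditto_time-{:016x}.rlib", fingerprint)
            } else {
                "libditto_time.rlib".to_string()
            };
            target.build(&filename, fingerprint);
        }
    }
    target.rebuilds
}

fn main() {
    println!("with metadata:    {} rebuilds", count_rebuilds(true, 3));
    println!("without metadata: {} rebuilds", count_rebuilds(false, 3));
}
```

With a hash in the filename each workspace builds the crate once and then reuses its own copy. With a single shared filename every switch finds the other workspace's output sitting there and has to overwrite it.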

The root cause was a copy-paste error. I had this in time’s Cargo.toml:

[lib]
crate-type = ["cdylib", "rlib"]

When I first made the crate I was copying some common metadata from an existing crate, one which happened to be built as a C-style dylib. The time crate never needed that setting, but nothing broke so I didn’t pay attention to it.

Looking at cargo’s should_use_metadata function, the reason for this behaviour is documented:

    // No metadata in these cases:
    //
    // - dylibs:
    //   - if any dylib names are encoded in executables, so they can't be renamed.

It makes sense. Unfortunately for me the rlib and cdylib are named the same, so they can’t be cached according to their metadata and they’re forever getting overwritten. Removing the whole crate-type specification from Cargo.toml brought back the metadata and the problem went away.
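As a rough sketch of the rule described by that comment, the naming decision might look like the following. This is a deliberate simplification and not cargo's code: output_uses_metadata and rlib_filename are hypothetical helpers, the hash string is a placeholder, and the real should_use_metadata function takes other conditions into account as well.

```rust
/// Hypothetical, heavily simplified version of the rule: any dylib-style
/// crate type switches off the metadata suffix for the whole target.
fn output_uses_metadata(crate_types: &[&str]) -> bool {
    !crate_types.iter().any(|t| *t == "dylib" || *t == "cdylib")
}

/// Names the rlib output, with or without the metadata suffix.
fn rlib_filename(crate_name: &str, crate_types: &[&str], metadata: &str) -> String {
    if output_uses_metadata(crate_types) {
        format!("lib{}-{}.rlib", crate_name, metadata)
    } else {
        format!("lib{}.rlib", crate_name)
    }
}

fn main() {
    // Placeholder hash, not a real cargo value.
    let metadata = "0123456789abcdef";

    // The copy-pasted setting: the rlib loses its suffix too.
    println!("{}", rlib_filename("ditto_time", &["cdylib", "rlib"], metadata));

    // The default library type once crate-type is removed.
    println!("{}", rlib_filename("ditto_time", &["lib"], metadata));
}
```

In this model the accidental cdylib entry is enough to strip the suffix from the rlib as well, which matches the single libditto_time.rlib seen in the target directory.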