A curious monotonic crash

A while ago I fielded an unusual bug report about an app crashing on iOS. What was strange was that it only seemed to happen on a freshly rebooted device. My experience with iPhones is that freshly rebooted is when they work best!

The crash was caused by a panic in some Rust code. It complained of an overflow while subtracting a Duration from an Instant. Even if you don’t know Rust this is a straightforward thing to follow: we start with a point in time, we subtract some quantity of seconds from it, and the result is an earlier point in time.

let one_minute_ago: Instant = Instant::now() - Duration::from_secs(60);

It turned out a line almost identical to this was the cause of the crash. Looking at it you might be as surprised as I was. You wouldn’t think that taking the current time and subtracting 60 seconds from it could be controversial, let alone crash the app.

The problem stems from Rust’s Instant using monotonic time, a made-up time that only ever moves forward and doesn’t necessarily correspond to what you see on your clock. Normally this is a very useful property: one_minute_ago is always one minute ago, even if there was a leap second or daylight saving time just kicked in.

Perhaps you have already connected the dots: on the iPhone this made-up monotonic time starts at 0 when the system boots and counts upward from there. If you ask for “one minute ago” during the first minute of uptime, the result would land before zero, which the Instant can’t represent, so the subtraction overflows and Rust panics. The moral of the story: don’t ever create Instants representing times earlier than when your app started.
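The standard library does offer a way to attempt the subtraction without panicking: Instant::checked_sub returns None when the result can’t be represented. Here is a minimal sketch of how that might look. The names earlier and earlier_or_now are hypothetical helpers for illustration, not anything from std, and not necessarily how the original app was fixed.

```rust
use std::time::{Duration, Instant};

/// Returns the instant `ago` before `now`, or `None` if the platform's
/// monotonic clock cannot represent a point that far back.
/// (`earlier` is a hypothetical helper name, not part of std.)
fn earlier(now: Instant, ago: Duration) -> Option<Instant> {
    // `now - ago` panics when the result is unrepresentable;
    // `checked_sub` reports the same condition as `None` instead.
    now.checked_sub(ago)
}

/// Same as `earlier`, but falls back to `now` itself when the
/// subtraction cannot be represented. Whether that fallback is
/// acceptable is an assumption about the calling code.
fn earlier_or_now(now: Instant, ago: Duration) -> Instant {
    earlier(now, ago).unwrap_or(now)
}
```

With these, the one_minute_ago line would become earlier_or_now(Instant::now(), Duration::from_secs(60)), which returns the current instant during the first minute of uptime instead of crashing. Whether that is the right fallback depends on what the timestamp is used for.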

I don’t intend to blame Rust for this annoying situation. Monotonic time information is completely nonstandard and provided differently from system to system. The documentation lists seven underlying methods that the standard library will use to get the current monotonic time depending on the platform. In the case of Darwin it’s mach_absolute_time, a function which returns “tick units (starting at an arbitrary point)”. Rust could potentially do something like apply a fixed offset to enable a greater working range, but when the platform’s definition is so sloppy and could change at any time, this sort of fix could inadvertently create new problems.

There’s one more quirk I noticed. The uptime on my Mac is currently 7.72 days. If I write a Rust program to subtract from the current time, subtracting 4 days works but subtracting 5 days or more crashes. What gives?
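This experiment can be sketched without the crashes by swapping the - operator for checked_sub and counting how many whole days can be subtracted. In the sketch below, whole_days_reachable is a hypothetical helper, and max_days is only a cap so the loop terminates on platforms where the subtraction keeps succeeding.

```rust
use std::time::{Duration, Instant};

const SECS_PER_DAY: u64 = 24 * 60 * 60;

/// Counts how many whole days can be subtracted from `now` before
/// `checked_sub` returns `None`, giving up after `max_days`.
/// (Hypothetical helper for illustration; not part of std.)
fn whole_days_reachable(now: Instant, max_days: u64) -> u64 {
    let mut days = 0;
    while days < max_days {
        // Try one more day than we have already managed.
        let span = Duration::from_secs((days + 1).saturating_mul(SECS_PER_DAY));
        if now.checked_sub(span).is_none() {
            break; // the clock cannot represent a point this far back
        }
        days += 1;
    }
    days
}
```

Calling whole_days_reachable(Instant::now(), 30) on the Mac described above would presumably report 4. The number will differ from machine to machine, and on some platforms it may simply hit the cap.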

It’s possible to debug-print the Instant to see its internal value. I can also write a C program that uses clock_gettime instead, an alternative monotonic clock API. (It happens to be one that Rust uses on other UNIX platforms.) And they give different results on the same machine!

Rust
Instant { t: 348024497167315 }

C clock_gettime
Seconds 667217
Nanos 729228000

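The Rust value is about 4.03 days’ worth of nanoseconds, while the clock_gettime result is 7.72 days, perfectly matching my uptime. That also explains the crash threshold: with only 4.03 days on the clock, subtracting 4 days fits but 5 does not. The difference? mach_absolute_time doesn’t increment while your computer is asleep.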
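The arithmetic behind those figures is easy to check. This small sketch converts both readings to days, treating the Instant’s t field as nanoseconds as described above. The helper names as_days and readings are made up for illustration.

```rust
use std::time::Duration;

const SECS_PER_DAY_F64: f64 = 86_400.0;

/// Converts a duration to fractional days.
fn as_days(d: Duration) -> f64 {
    d.as_secs_f64() / SECS_PER_DAY_F64
}

/// The two readings quoted in the post: (Rust Instant, C clock_gettime).
/// Assumes the Instant's `t` field is a nanosecond count.
fn readings() -> (Duration, Duration) {
    let rust_instant = Duration::from_nanos(348_024_497_167_315);
    let c_clock_gettime = Duration::new(667_217, 729_228_000);
    (rust_instant, c_clock_gettime)
}
```

as_days gives roughly 4.03 for the Rust reading and 7.72 for clock_gettime. The gap between them, about 3.69 days, would be the time the Mac spent asleep since boot.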

Safest if I just don’t go backwards in time, methinks.