A BufMut capacity mystery

The other week I was messing around in Rust with an AsyncRead, which represented a client that connected over the network to make a request. To be spec-compliant, my server had to read up to 1024 bytes of UTF-8 followed by CRLF. I was in a pedantic mood, so I didn’t want to use LinesCodec with a maximum line length: that codec also accepts lines that end with only LF, which is technically invalid in this application.

I was poking around in AsyncReadExt to find a way to do this that didn’t involve calling read in a loop and managing a buffer and cursor manually. I stumbled across read_buf:

async fn read_buf<B: BufMut>(&mut self, buf: &mut B) -> io::Result<usize>;

This came with the following fascinating example:

use tokio::fs::File;
use tokio::io::{self, AsyncReadExt};

use bytes::BytesMut;

#[tokio::main]
async fn main() -> io::Result<()> {
    let mut f = File::open("foo.txt").await?;
    let mut buffer = BytesMut::with_capacity(10);

    assert!(buffer.is_empty());

    // read up to 10 bytes, note that the return value is not needed
    // to access the data that was read as `buffer`'s internal
    // cursor is updated.
    f.read_buf(&mut buffer).await?;

    println!("The bytes: {:?}", &buffer[..]);
    Ok(())
}

Neat—a way to fill a buffer up to a maximum size with the bookkeeping handled for you via trait BufMut. It allocates a capacity of 10 bytes, which the following call to read_buf fills. Fair enough.

But this code looks a bit odd: a capacity is usually only a hint about something’s size to prevent reallocations. For example, if you call Vec::with_capacity(10), there is nothing preventing you from adding 11 or 200 elements to it. Yet for some reason the capacity of this BytesMut is affecting how much data gets read.
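To make the Vec comparison concrete, here is a minimal sketch using only the standard library. The helper name push_past_capacity is mine, purely for illustration.

```rust
/// Pushes `n` items into a Vec created with capacity `cap` and
/// returns the resulting (len, capacity).
/// `push_past_capacity` is a hypothetical helper for this post.
fn push_past_capacity(cap: usize, n: usize) -> (usize, usize) {
    let mut v: Vec<usize> = Vec::with_capacity(cap);
    for i in 0..n {
        // Once the initial capacity is used up, push reallocates
        // transparently; the capacity was only ever a hint.
        v.push(i);
    }
    (v.len(), v.capacity())
}

fn main() {
    let (len, capacity) = push_past_capacity(10, 200);
    println!("len = {len}, capacity = {capacity}");
}
```

All 200 pushes succeed. The only thing the initial capacity changes is how many reallocations happen along the way.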

Maybe it’s just a terminology confusion? I clicked on the docs for BytesMut and, well, no.

BytesMut’s BufMut implementation will implicitly grow its buffer as necessary.

So why in tarnation would it stop at 10 bytes if foo.txt has more to offer? I built an equivalent example to test, and sure enough it read 10 bytes exactly. Wild.

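I went digging, and I believe this behaviour depends heavily on implementation quirks. Inside the future that read_buf returns (itself named ReadBuf, not to be confused with the tokio::io::ReadBuf used in its body), this is what the poll does. me.reader is the source of data and me.buf is the BufMut.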

let n = {
    let dst = me.buf.chunk_mut();
    let dst = unsafe { &mut *(dst as *mut _ as *mut [MaybeUninit<u8>]) };
    let mut buf = ReadBuf::uninit(dst);
    let ptr = buf.filled().as_ptr();
    ready!(Pin::new(me.reader).poll_read(cx, &mut buf)?);

    // Ensure the pointer does not change from under us
    assert_eq!(ptr, buf.filled().as_ptr());
    buf.filled().len()
};

First it calls chunk_mut to get a mutable slice in the unfilled part of the buffer. The documentation notes:

Note that this can be shorter than the whole remainder of the buffer

What BytesMut does in practice is:

1. If it’s currently full, reserve 64 extra bytes.
2. Return a slice covering the entire remaining uninitialised section.

This slice then becomes a destination for poll_read, and it returns the number of bytes that were filled.

So this works for two reasons.

1. BytesMut just happens to return its full remaining capacity in chunk_mut. It could return something smaller, or it could think to itself “hey, I only have 1 byte left, I’m going to increase my capacity before I hand this slice over.” The latter in particular would break this usage.
2. read_buf only ever does a single successful poll_read before it returns Poll::Ready itself. In principle there is no reason it couldn’t loop, calling chunk_mut over and over and filling it with as much data as possible from the source. The documentation says that it “usually” won’t do this, which is not a particularly strong statement.

The emergent behaviour is that if you call read_buf in a loop with the same BytesMut, it will always return once neatly at the nominated capacity before it begins to grow. If you proceed to read again, it will allocate another 64 bytes and keep going.
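Here is a toy model of that interaction, in case it helps to see the numbers. To be clear, this is not the real tokio or bytes code: ToyBuf, read_once, and read_sizes are names I made up, it uses synchronous std::io::Read instead of AsyncRead, and it only imitates the two behaviours described above (grow by 64 when full, otherwise hand over all the spare room, then do exactly one read).

```rust
use std::io::{self, Read};

/// Toy stand-in for a growable buffer. NOT the real `BytesMut`.
struct ToyBuf {
    data: Vec<u8>,
    cap: usize, // the capacity we pretend to have allocated
}

impl ToyBuf {
    fn with_capacity(cap: usize) -> Self {
        ToyBuf { data: Vec::with_capacity(cap), cap }
    }

    /// Imitates the `chunk_mut` behaviour described above: if full,
    /// grow by 64 bytes, then report the whole remaining space.
    fn spare(&mut self) -> usize {
        if self.data.len() == self.cap {
            self.cap += 64;
        }
        self.cap - self.data.len()
    }
}

/// Imitates the `read_buf` behaviour described above: a single read
/// into the spare space, never a loop.
fn read_once<R: Read>(reader: &mut R, buf: &mut ToyBuf) -> io::Result<usize> {
    let spare = buf.spare();
    let mut scratch = vec![0u8; spare];
    let n = reader.read(&mut scratch)?;
    buf.data.extend_from_slice(&scratch[..n]);
    Ok(n)
}

/// Calls `read_once` until EOF and records how many bytes each call read.
fn read_sizes(input: &[u8], capacity: usize) -> io::Result<Vec<usize>> {
    let mut reader = input; // `&[u8]` implements `Read`
    let mut buf = ToyBuf::with_capacity(capacity);
    let mut sizes = Vec::new();
    loop {
        let n = read_once(&mut reader, &mut buf)?;
        if n == 0 {
            break;
        }
        sizes.push(n);
    }
    Ok(sizes)
}

fn main() -> io::Result<()> {
    let input = [b'x'; 100];
    println!("{:?}", read_sizes(&input, 10)?); // prints [10, 64, 26]
    Ok(())
}
```

With 100 bytes of input and a nominal capacity of 10, the model reads 10 bytes, then 64, then the remaining 26. Change either rule in ToyBuf::spare or read_once and the tidy first read of 10 disappears, which is exactly why I’m nervous about relying on it.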

I’m honestly not sure what to make of all this. Despite being there in the example, it’s totally not clear to me if you’re meant to be able to depend on this. I think for my original problem the right solution would be to use plain old read.
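For what it’s worth, this is roughly the shape I have in mind. It is only a sketch, written against synchronous std::io::Read so that it stands alone; I’d expect the async version to look the same with AsyncReadExt::read and an .await. The names read_request_line and MAX_LINE and the error messages are my own inventions. I’m also assuming that the 1024-byte limit excludes the CRLF, and that anything the client sends after the CRLF can be ignored.

```rust
use std::io::{self, Read};

/// Assumed limit from the spec; taken here to exclude the trailing CRLF.
const MAX_LINE: usize = 1024;

fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}

/// Reads one CRLF-terminated UTF-8 line of at most MAX_LINE bytes.
/// A bare LF, an over-long line, or EOF before CRLF is an error.
fn read_request_line<R: Read>(reader: &mut R) -> io::Result<String> {
    // Fixed buffer: room for the longest valid line plus its CRLF.
    let mut buf = [0u8; MAX_LINE + 2];
    let mut len = 0;
    loop {
        // Rescan everything read so far for a line feed. This is wasteful
        // but simple, and the buffer is tiny.
        if let Some(lf) = buf[..len].iter().position(|&b| b == b'\n') {
            if lf == 0 || buf[lf - 1] != b'\r' {
                return Err(invalid("line ended with bare LF"));
            }
            let line = std::str::from_utf8(&buf[..lf - 1])
                .map_err(|_| invalid("line is not valid UTF-8"))?;
            return Ok(line.to_owned());
        }
        if len == buf.len() {
            return Err(invalid("line exceeds maximum length"));
        }
        // Plain old read: the slice length is a hard upper bound.
        let n = reader.read(&mut buf[len..])?;
        if n == 0 {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "connection closed before CRLF",
            ));
        }
        len += n;
    }
}

fn main() -> io::Result<()> {
    let mut client: &[u8] = b"GET something\r\n";
    println!("{:?}", read_request_line(&mut client)?);
    Ok(())
}
```

The point of passing a slice to read is that its length is a real limit rather than a hint, so the maximum line length no longer depends on how a buffer type chooses to grow.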