I was caught out by a surprising problem this week. When I want to quickly count the bytes or words in some text I’ll often use a nearby terminal. Depending on my mood I’ll do one of several things.
$ echo -n "hello" | wc
0 1 5
This one is the longest to type, but -n suppresses the trailing newline, so 5 is the correct character count.
$ wc <<< "hello"
1 1 6
Similar idea, but the here-string adds a trailing newline, so I have to remember to subtract 1 from the count.
$ cat | wc
hello
there
^d (I press Ctrl-D to indicate end-of-file)
2 2 12
This is probably my favourite because it’s easy to type and I can copy-paste a chunk of stuff without having to worry about how many lines there are. Again, all newlines are counted which is why it says 12.
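The newline accounting in these three examples can be checked with wc -c, which prints only the byte count. This is a minimal sketch in plain POSIX shell; the variable names are illustrative, and printf '%s' stands in for echo -n because it behaves the same way across shells.

```shell
# printf '%s' writes its argument with no trailing newline, like echo -n.
without_newline=$(printf '%s' "hello" | wc -c | tr -d ' ')

# echo appends one newline, just as the here-string (<<<) form does.
with_newline=$(echo "hello" | wc -c | tr -d ' ')

# Two typed lines, each ending in a newline, as in the cat example.
two_lines=$(printf 'hello\nthere\n' | wc -c | tr -d ' ')

echo "$without_newline $with_newline $two_lines"   # prints: 5 6 12
```

The tr -d ' ' is only there because some wc implementations pad the number with spaces.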
These methods are interchangeable. Except for when they’re not.
The situation: I was playing with a fuzzer that was sending a long series of inputs to another program. I knew it sent a payload of approximately 4 kB but I needed to know its length precisely.
I copy-pasted it to the clipboard and tried to measure it:
$ cat | wc
/.../AAAAAAAAAAAAAAAAAAAA...
^d
1 1 4096
Okay, it’s 4095 bytes. And I went on with my work, assuming it was 4095 bytes, and nothing worked. Eventually I tried checking my work with another command:
$ echo -n "/.../AAAAAAAAAAAAAAAAAAAA..." | wc
0 1 5005
Oops… it’s actually bigger than I thought. Why would wc lie to me? In truth, wc was doing exactly the right thing. It’s a quirk of Linux that under normal conditions a tool like cat can only consume up to 4096 bytes in a single line when reading from a terminal. The limit comes from the terminal’s canonical (line-at-a-time) input mode and it includes the newline, which is why I saw 4095 bytes of payload plus one newline. That’s exactly the situation when I’m pasting in a long line. It works fine for files or pipes, but not for pasting.
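The claim that files and pipes are unaffected is easy to check without a terminal in the loop. This is a rough sketch: the payload is a stand-in line of 5005 "A" characters (the size matches the one measured above, the content does not), and the variable names are made up for illustration.

```shell
# Build a single 5005-byte line with no trailing newline.
# printf '%5005s' '' emits 5005 spaces; tr turns them into 'A'.
payload=$(printf '%5005s' '' | tr ' ' 'A')

# Through a pipe, wc sees every byte.
piped=$(printf '%s' "$payload" | wc -c | tr -d ' ')

# Redirected from a file, likewise.
tmpfile=$(mktemp)
printf '%s' "$payload" > "$tmpfile"
from_file=$(wc -c < "$tmpfile" | tr -d ' ')
rm -f "$tmpfile"

echo "$piped $from_file"   # prints: 5005 5005
```

Neither path goes through the terminal's line buffer, so nothing is cut off at 4096.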
Unfortunately for me it doesn’t tell me that this truncation has occurred. It just does it silently, which makes this a little dangerous if I’m relying on capturing all the data. What I really should be doing is this:
$ xclip -o | wc
0 1 5005
This takes the content from my X clipboard and pipes it straight into wc. It works perfectly, and it doesn’t even add a newline. Now I just have to fix my muscle memory.
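One way to help the muscle memory along is a small wrapper that refuses to count when its input is a terminal, since that is the only case where the silent truncation can happen. This is just a sketch: safe_wc is a hypothetical name, not an existing tool, and it relies only on the standard [ -t 0 ] test for "stdin is a terminal".

```shell
# safe_wc: hypothetical helper that counts bytes on stdin, but only when
# stdin is NOT a terminal (long pasted lines would be silently truncated).
safe_wc() {
    if [ -t 0 ]; then
        echo "safe_wc: stdin is a terminal; long lines may be truncated" >&2
        echo "hint: pipe the data in instead, e.g. xclip -o | safe_wc" >&2
        return 1
    fi
    # tr strips the padding some wc implementations add around the number.
    wc -c | tr -d ' '
}

printf 'hello' | safe_wc   # prints: 5
```

Typing safe_wc on its own at a prompt prints the warning and returns a non-zero status instead of waiting for a paste.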