Playing an MP3 with compression on Linux/OSX

Some software I’m working on for a Raspberry Pi demands that I play MP3s at particular times. The audio is fed into a radio where it is then transmitted as narrowband FM, a very lo-fi channel. The recordings are spoken word and they will be heard most clearly if the output volume stays near the maximum allowed all the way through playback. In other words, if some parts of the file are quieter than others I’d prefer to bring them all up to the optimal level. That’s the purpose of an audio compressor.

I was pleased to find out that I can do this on the fly pretty easily using mpg123 and sox (on OSX, both are available through Homebrew: brew install mpg123 sox). Here’s the full command I’m using:

mpg123 -r 44100 -s -m wianews-2016-04-03.mp3 | \
play --buffer 2048 -t raw -e signed-integer -r 44100 -b 16 -c 1 - \
    compand 0.1,0.3 -60,-60,-30,-15,-20,-12,-4,-8,-2,-7 -6

This is what all the bits mean:

-r 44100
Output sample rate in Hz. It needs to match the -r value given to play in the second part of the command.
-s
Send audio samples to stdout instead of audio device
-m
Mono mix (the FM transmitter has only one channel)
--buffer 2048
Use a buffer of 2048 bytes instead of sox’s default of 8192. In my testing the Pi was too slow for the default buffer size and needed a smaller one to avoid underruns.
-t raw -e signed-integer -b 16
Tell play that it will be getting headerless signed PCM with 16-bit samples
-c 1
Single channel, because it’s mono
-
Read audio data from standard input
compand
Apply a compressor effect. The parameters that follow can be more complex than shown here; the compand section of the sox manual has the full syntax.
0.1,0.3
Attack and decay, in seconds. These control how quickly the compressor responds to increasing and decreasing volume, respectively.
-60,-60,-30,-15,-20,-12,-4,-8,-2,-7
A transfer function I found online that works nicely. It’s a series of pairs: for a given input level in dB, what should the corresponding output level be? So -60 dB is left unchanged, signals in the -30 to -20 dB region get a nice big boost (15 dB at -30 and 8 dB at -20), and really strong signals around -4 to -2 dB are brought back a little.
-6
Reduce the overall gain by 6 dB to reduce clipping. In my testing it still clips occasionally but I’m intentionally pushing the limits so that’s okay.
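
To get a feel for what those pairs do, here is a rough Python sketch that interpolates between the points and then applies the -6 dB gain. It is only an approximation of the steady-state curve, not sox’s actual implementation: it ignores the attack and decay times and the soft knee, and it assumes (from my reading of the sox manual) that a final point of 0,0 is implied and that levels below the first point are passed through without companding. The function name compand_level is made up for this illustration.

```python
# Transfer-function points from the compand command, as (input dB, output dB).
# The trailing (0, 0) is an assumption: the sox manual says it is implied.
POINTS = [(-60, -60), (-30, -15), (-20, -12), (-4, -8), (-2, -7), (0, 0)]
GAIN_DB = -6.0  # the final "-6" argument


def compand_level(in_db, points=POINTS, gain_db=GAIN_DB):
    """Approximate steady-state output level (dB) for an input level (dB)."""
    first_in, first_out = points[0]
    if in_db <= first_in:
        # Below the first point: not companded, just shifted by its offset.
        out_db = in_db + (first_out - first_in)
    else:
        # Clamp to the top of the curve, then interpolate linearly in dB.
        in_db = min(in_db, points[-1][0])
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if in_db <= x1:
                out_db = y0 + (y1 - y0) * (in_db - x0) / (x1 - x0)
                break
    return out_db + gain_db


for level in (-60, -30, -20, -4, -2):
    print(level, "dB in ->", compand_level(level), "dB out")
```

Under these assumptions a -30 dB input still comes out 9 dB louder after the gain reduction, while a -2 dB input is pulled down by 11 dB.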

This can all be replicated nicely in Python using the subprocess module:

import subprocess

filename = "wianews-2016-04-03.mp3"
mpg_cmd = "mpg123 -m -s -r 44100"
play_cmd = ("play --buffer 2048 -t raw -e signed-integer -r 44100 -b 16 -c 1 - "
            "compand 0.1,0.3 -60,-60,-30,-15,-20,-12,-4,-8,-2,-7 -6")

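# Decode the MP3 to raw PCM on stdout
mpg123_proc = subprocess.Popen(mpg_cmd.split() + [filename],
    stdout=subprocess.PIPE)
# Pipe the decoded audio into sox's play command
sox_proc = subprocess.Popen(play_cmd.split(),
    stdin=mpg123_proc.stdout, stdout=subprocess.PIPE)
# Close our copy of the pipe so mpg123 sees a broken pipe if play exits early
mpg123_proc.stdout.close()
# Block until playback has finished, then reap the mpg123 process
sox_proc.communicate()
mpg123_proc.wait()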
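
Since the sample rate has to match on both sides of the pipe, one possible refinement is to build both argument lists from a single value. The sketch below is just one way to arrange it; build_commands and play_compressed are hypothetical helper names, and the flags are exactly the ones from the command above. Passing the arguments as lists also avoids splitting a command string on spaces.

```python
import subprocess


def build_commands(filename, rate=44100, buffer_bytes=2048):
    """Return (mpg123_args, play_args) that share one sample rate.

    Hypothetical helper: the flags are copied from the shell command above.
    """
    mpg_args = ["mpg123", "-m", "-s", "-r", str(rate), filename]
    play_args = [
        "play", "--buffer", str(buffer_bytes),
        "-t", "raw", "-e", "signed-integer",
        "-r", str(rate), "-b", "16", "-c", "1", "-",
        "compand", "0.1,0.3",
        "-60,-60,-30,-15,-20,-12,-4,-8,-2,-7", "-6",
    ]
    return mpg_args, play_args


def play_compressed(filename, **kwargs):
    """Play an MP3 through the compressor and return both exit codes."""
    mpg_args, play_args = build_commands(filename, **kwargs)
    mpg123_proc = subprocess.Popen(mpg_args, stdout=subprocess.PIPE)
    sox_proc = subprocess.Popen(play_args, stdin=mpg123_proc.stdout)
    # Close our copy of the pipe so mpg123 notices if play exits early.
    mpg123_proc.stdout.close()
    sox_code = sox_proc.wait()
    mpg_code = mpg123_proc.wait()
    return mpg_code, sox_code
```

Calling play_compressed("wianews-2016-04-03.mp3") should then behave like the snippet above, assuming mpg123 and play are on the PATH, and a non-zero exit code from either process hints that something went wrong.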