When you first get into audio production, you’ll encounter plenty of unfamiliar terms that either slip past you or leave you puzzled: bit depth, attenuation, sidechaining, and so on (the lexicon is seemingly endless). One of the most common is “dithering.” Even if you’ve seen the word before, its meaning isn’t obvious. And yet dithering is one of the most useful tools in the mixing and mastering arsenal, so it’s worth getting acquainted with it.
Breaking Down Dithering
There’s a lot of depth to dithering, so we’ll begin with a simple definition. Dithering is a process wherein a small amount of low-level random noise (similar to white noise) is added to an audio (or image/video) signal to help preserve its perceived quality when its resolution is lowered. At first glance, this might seem unnecessary and even counterintuitive. After all, if your goal is to put the highest-quality music out there, why would you want to add noise or reduce its resolution in the first place? And how does adding noise to a signal help it sound good? The answer lies in the very nature of digital audio.
From Analog to Digital
Whereas analog audio is a continuous signal, digital audio is a stepped, number-based representation of the “real deal.” So, whenever you record something into your digital audio workstation (DAW), your audio interface “samples” the incoming audio and creates a digital copy via an analog-to-digital converter (ADC). As long as you’re recording with modern digital tools at a suitable “sample rate” (commonly 44.1 kHz or 48 kHz) and “bit depth” (usually 24-bit or 32-bit float), this digital version should sound virtually identical to the analog input. The sample rate is the number of samples per second taken from the analog signal, and bit depth is the resolution of each sample, i.e., how many bits are used to store its amplitude. Now, if we all had unlimited storage, bandwidth, and processing power at our disposal, we could theoretically record, export, and listen to digital audio at ever higher resolutions. Of course, this isn’t the case, and we must make some concessions to get a recording from point A to point B. It’s a matter of what’s manageable: while you’ll typically record at 24-bit or 32-bit float in most DAWs, many delivery formats use only 16 bits (CD audio, for instance, is 16-bit/44.1 kHz, and many distribution pipelines are built around the same resolution). If you want your music to be as accessible as possible, you must reduce your high-resolution recordings to fit these formats. Thus you must reduce the bit depth of your audio when exporting it. At the same time, you don’t want to eliminate your audio’s nuances and dynamics. This is where the power of dithering comes into play.
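To put rough numbers on sample rate and bit depth, here is a minimal Python sketch. The function names are purely illustrative (they don't come from any DAW or library), and the dynamic-range figure is the usual theoretical approximation for integer PCM rather than a measurement of real hardware.

```python
import math

def quantization_levels(bit_depth: int) -> int:
    """Number of distinct amplitude values an integer sample can take."""
    return 2 ** bit_depth

def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(quantization_levels(bit_depth))

def pcm_bytes_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Data rate of raw, uncompressed PCM audio (stereo by default)."""
    return sample_rate_hz * (bit_depth // 8) * channels

for bits in (16, 24):
    print(bits, quantization_levels(bits),
          round(dynamic_range_db(bits), 1),
          pcm_bytes_per_second(44_100, bits))
# 16 65536 96.3 176400
# 24 16777216 144.5 264600
```

Going from 24-bit to 16-bit leaves each sample with 256 times fewer possible values, which is why the reduction needs to be handled with some care.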
Clarifying a Distorted Concept
Without dithering, what would happen when converting digital audio from a higher bit depth to a lower one? As you might imagine, those stepped values we mentioned earlier get rounded (or truncated) onto an even coarser grid, turning what used to resemble a somewhat continuous wave into something more like a staircase. The difference between the original values and the new, coarser ones is known as “quantization error.” If you’re familiar with waveforms and how their shapes relate to their sound, you know that this step-like form sounds distorted. More specifically, audio that is reduced in bit depth without dither produces error that is correlated with the original signal, and this shows up as harmonics that can be clearly audible, especially in quiet passages. The last thing you want is for your pristine audio to feature unintentional distortion throughout, of course. When you add low-level noise to the audio before reducing its bit depth, that correlated distortion is replaced by a steady, far less objectionable noise floor. Ultimately, dithering allows your mix to retain its low-level detail and original overall sound when it is reduced to a lower bit depth.
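The effect described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this example, the noise uses a triangular distribution (TPDF), which is one common choice and not something this article prescribes, and the target is 8-bit so the effect is easy to see. Clipping is not handled, so the sketch is only meant for quiet test signals.

```python
import math
import random

def quantize(sample: float, step: float) -> float:
    """Round a sample to the nearest multiple of the quantization step."""
    return round(sample / step) * step

def tpdf_dither(step: float, rng: random.Random) -> float:
    """Triangular noise spanning +/- one step (difference of two uniform values)."""
    return (rng.random() - rng.random()) * step

def reduce_bit_depth(samples, bits, dither=True, seed=0):
    """Requantize float samples (nominally -1.0 to 1.0) to `bits` of resolution."""
    step = 2.0 / (2 ** bits)
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    out = []
    for s in samples:
        noise = tpdf_dither(step, rng) if dither else 0.0
        out.append(quantize(s + noise, step))
    return out

# A very quiet tone whose peaks are smaller than half of one 8-bit step.
step = 2.0 / 2 ** 8
quiet_tone = [0.4 * step * math.sin(2 * math.pi * n / 100) for n in range(1000)]

plain = reduce_bit_depth(quiet_tone, bits=8, dither=False)
dithered = reduce_bit_depth(quiet_tone, bits=8, dither=True)

print(all(v == 0.0 for v in plain))     # True: the tone rounds away to silence
print(any(v != 0.0 for v in dithered))  # True: the tone still nudges the output
```

Without dither, anything below half a step simply disappears. With dither, the quiet tone still influences which step each sample lands on, so it survives underneath a layer of hiss.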
Randomness is the Key
If the concept of dithering still isn’t crystal clear, remember that it’s all about randomization. The noise you’re adding to your audio via dithering is random and uncorrelated with the music (think of the “hissing” sound of white noise). As a result, the loud, correlated distortions that bit-depth reduction would otherwise create are tamed and not allowed to poke through. This concept applies to visual data, too. Imagine looking at a crisp image on an HD TV. Then, imagine viewing the same image on an older display that can show far fewer colors. Plenty of data would be lost in this process, creating a banded, less colorful picture. This transfer from high to low color resolution can be smoothed out via dithering. Instead of directly cramming the image into a smaller set of colors, a little noise is mixed into the image’s information first. Hence, the resulting reduced version retains its relative shape, color, and structure.
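One way to see what “uncorrelated” means here is to measure how closely the rounding error tracks the signal itself. The sketch below is an illustration only: the helper names are hypothetical, it works in units of one quantization step, and it again assumes triangular (TPDF) noise as the dither.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def requantize_error(signal, step, dither, rng):
    """Error (output minus input) after rounding each sample to a multiple of step."""
    errors = []
    for s in signal:
        noise = (rng.random() - rng.random()) * step if dither else 0.0
        errors.append(round((s + noise) / step) * step - s)
    return errors

step = 1.0  # work in units of one quantization step
signal = [0.4 * math.sin(2 * math.pi * n / 97) for n in range(20000)]
rng = random.Random(42)  # seeded so the sketch is repeatable

corr_plain = pearson(signal, requantize_error(signal, step, False, rng))
corr_dith = pearson(signal, requantize_error(signal, step, True, rng))

print(round(corr_plain, 3))  # -1.0: the error is a mirror image of the signal
print(abs(corr_dith) < 0.05)  # True: the error no longer follows the signal
```

In the undithered case the error is completely determined by the music, which is why it is heard as distortion. Once dither is added, the error behaves like background noise instead.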
Do or Don’t Dither?
So, when should you dither, and when should you hold off? As a basic rule, dither whenever you reduce the bit depth of your audio (often in the mastering or exporting stage). You won’t need to dither if you’re exporting audio at a high resolution (e.g., 32-bit float). You’ll also want to hold off on dithering when converting audio to lossy formats such as AAC or MP3, as these encoders apply their own compression. And finally, you might skip dithering if you’re preparing a track for a mastering service such as eMastering, as this will be handled for you.