Replay Gain - A Proposed Standard

Replay Gain .wav File Format

Existing format

The RIFF WAV format is defined in a separate standard document. The rules in that rather complex document become a little clearer once implemented. If you want to implement this, please remember that .wav files use little-endian byte order.

The RIFF format is based around data chunks. Each chunk has the same basic layout: NAME, SIZE, DATA. Wave files always contain a WAVEfmt (format) chunk and a data chunk. The WAVEfmt chunk holds information such as the number of channels, the sampling rate, and the bits per sample. The data chunk holds the actual audio data. The entire file is contained within a RIFF chunk.

There are several other optional chunks, which may be placed between the WAVEfmt chunk and the data chunk. These include Cue and Playlist chunks.

The standard requires that software can successfully read .wav files containing unknown chunks. If an unknown chunk NAME is encountered, the accompanying SIZE field should be read and the DATA field skipped. This means that new chunks can be defined without breaking compatibility with legacy software.
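The skip rule can be sketched in a few lines of Python. This is a minimal illustration rather than a reference reader: the function name `read_known_chunks` and the `known` parameter are made up for the example, and it relies on the RIFF convention that odd-sized chunk data is followed by one pad byte.

```python
import io
import struct


def read_known_chunks(stream, known=(b"fmt ", b"data")):
    """Return {NAME: DATA} for recognised chunks, skipping all others."""
    riff, _riff_size, form = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")

    chunks = {}
    while True:
        header = stream.read(8)            # NAME (4 bytes) + SIZE (4 bytes)
        if len(header) < 8:
            break                          # end of file
        name, size = struct.unpack("<4sI", header)   # little-endian SIZE
        if name in known:
            chunks[name] = stream.read(size)
        else:
            stream.seek(size, io.SEEK_CUR)  # unknown NAME: skip the DATA field
        if size % 2:
            stream.seek(1, io.SEEK_CUR)     # skip the pad byte after odd sizes
    return chunks
```

A reader written this way never needs to know what an unfamiliar chunk means; it only needs the SIZE field to step over it.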

Replay Gain Adjustment chunk

In the language of the standard document, the format is extended thus:

               <WAVE-form> ->
                      RIFF( 'WAVE'
                           <fmt-ck>               // Format
                           [<rgad-ck>]            // Replay Gain Adjustment
                           [<fact-ck>]            // Fact chunk
                           [<cue-ck>]             // Cue points
                           [<playlist-ck>]        // Playlist
                           [<assoc-data-list>]    // Associated data list
                           <wave-data>   )        // Wave data

                <rgad-ck> ->   rgad( <rPeakAmplitude:REAL>
                                     <wRadioRgAdjust:WORD>
                                     <wAudiophileRgAdjust:WORD> )

In English, we're defining a new chunk called "rgad" (Replay Gain Adjustment). Its data is always 8 bytes long. The first value stored is the Peak Amplitude (4 bytes); the second is the Radio Replay Gain Adjustment (2 bytes); the third and final value is the Audiophile Replay Gain Adjustment (2 bytes).
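As a rough sketch, the chunk can be packed and unpacked with Python's `struct` module. The field order follows the grammar above and the example header in this document (peak amplitude first, then the two 16-bit adjustments); the helper names `pack_rgad` and `unpack_rgad` are invented for this illustration.

```python
import struct


def pack_rgad(peak_amplitude, radio_word, audiophile_word):
    """Return the 16 bytes of an "rgad" chunk: NAME, SIZE, then 8 bytes of DATA."""
    # "<" selects little-endian with no padding: float32 + uint16 + uint16 = 8 bytes
    data = struct.pack("<fHH", peak_amplitude, radio_word, audiophile_word)
    return b"rgad" + struct.pack("<I", len(data)) + data


def unpack_rgad(chunk):
    """Inverse of pack_rgad; returns (peak_amplitude, radio_word, audiophile_word)."""
    name, size, peak, radio, audiophile = struct.unpack("<4sIfHH", chunk)
    if name != b"rgad" or size != 8:
        raise ValueError("not an rgad chunk")
    return peak, radio, audiophile
```

Calling `pack_rgad(1.0, 10822, 18999)` reproduces the 16 "rgad" bytes of the example header (start bytes 36 to 51).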

Example .wav header

Here is a real example, which should make things clearer. You can see how the "RIFF" chunk wraps around all the other chunks, and also how each chunk starts with its name, followed by its size. The new "rgad" chunk occupies bytes 36 to 51.

Start byte  Chunk  Field                Value    Value (hex)  Bytes   Format
0           RIFF   name                 "RIFF"   52 49 46 46  4       ASCII
4                  size                 176444   3C B1 02 00  4       uInt32
8           WAVE   name                 "WAVE"   57 41 56 45  4       ASCII
12          fmt    name                 "fmt "   66 6D 74 20  4       ASCII
16                 size                 16       10 00 00 00  4       uInt32
20                 wFormatTag           1        01 00        2       uInt16
22                 nChannels            2        02 00        2       uInt16
24                 nSamplesPerSec       44100    44 AC 00 00  4       uInt32
28                 nAvgBytesPerSec      176400   10 B1 02 00  4       uInt32
32                 nBlockAlign          4        04 00        2       uInt16
34                 nBitsPerSample       16       10 00        2       uInt16
36          rgad   name                 "rgad"   72 67 61 64  4       ASCII
40                 size                 8        08 00 00 00  4       uInt32
44                 fPeakAmplitude       1        00 00 80 3F  4       float32
48                 nRadioRgAdjust       10822    46 2A        2       uInt16
50                 nAudiophileRgAdjust  18999    37 4A        2       uInt16
52          data   name                 "data"   64 61 74 61  4       ASCII
56                 size                 176400   10 B1 02 00  4       uInt32
60                 waveform data        .....    .....        176400  Int16

The "rgad" chunk is new. I stress this in case you've arrived at this page directly from a web search: what we're discussing here is an extension to the .wav format, not the existing standard.

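In this example, the "rgad" chunk holds a Peak Amplitude of 1 (hex 00 00 80 3F), a Radio Replay Gain Adjustment word of 10822 (hex 46 2A), and an Audiophile Replay Gain Adjustment word of 18999 (hex 37 4A).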

See the data format explanation if you don't understand how the binary codes were calculated.
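For readers who want to check the two 16-bit words by hand, here is a hedged Python sketch. The bit layout is an assumption taken from the companion data format page, not from this section: 3 bits of name code, 3 bits of originator code, 1 sign bit, and a 9-bit magnitude in tenths of a dB. The function name `decode_rg_adjust` is invented for the example.

```python
def decode_rg_adjust(word):
    """Split a 16-bit Replay Gain Adjustment word into its assumed fields.

    Assumed layout, most significant bit first:
    3 bits name code, 3 bits originator code, 1 sign bit, 9 bits of gain * 10.
    """
    name_code = (word >> 13) & 0x7
    originator_code = (word >> 10) & 0x7
    negative = (word >> 9) & 0x1
    gain_db = (word & 0x1FF) / 10.0      # stored in tenths of a dB
    return name_code, originator_code, -gain_db if negative else gain_db
```

Under that assumed layout, the two words in the example header (10822 and 18999) decode to adjustments of -7.0 dB and -5.5 dB.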

The rest of the header information shows that this file is a standard 44.1 kHz, 16-bit, 2-channel wave file (the kind you might rip off a CD). It runs for 1 second: 176400 bytes of audio data follow the 60-byte header, giving 176460 bytes in all.
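The relationships between the format fields can be checked with a little arithmetic. The sketch below assumes plain PCM (wFormatTag 1), where the block alignment and byte rate follow directly from the channel count, sample rate, and sample size; the function names are illustrative only.

```python
def derived_fmt_fields(n_channels, samples_per_sec, bits_per_sample):
    """Return (nBlockAlign, nAvgBytesPerSec) for uncompressed PCM."""
    block_align = n_channels * bits_per_sample // 8    # bytes per sample frame
    avg_bytes_per_sec = samples_per_sec * block_align  # bytes per second of audio
    return block_align, avg_bytes_per_sec


def duration_seconds(data_size, avg_bytes_per_sec):
    """Playing time implied by the data chunk SIZE field."""
    return data_size / avg_bytes_per_sec
```

With 2 channels, 44100 samples per second, and 16 bits per sample, this gives the block alignment of 4 and byte rate of 176400 shown in the header, and a data chunk of 176400 bytes therefore lasts exactly one second.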

Does it work?

I hacked the MATLAB wavwrite function to create the above file. It plays without problems in Winamp, and loads correctly into Cool Edit Pro. Neither program understands the "rgad" chunk, and both simply ignore it.
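A similar check can be repeated without MATLAB. The following sketch builds a one-second silent file in memory with an "rgad" chunk between the format and data chunks, then opens it with Python's standard `wave` module, which stands in here for a legacy reader (it is not one of the programs tested above). The helper name `build_wav_with_rgad` is made up for this example, and the PCM data is assumed to have an even length so no pad byte is needed.

```python
import io
import struct
import wave


def build_wav_with_rgad(pcm, peak=1.0, radio_word=10822, audiophile_word=18999):
    """Assemble RIFF/WAVE bytes: fmt chunk, rgad chunk, then data chunk."""
    # 16-bit stereo PCM at 44.1 kHz, matching the example header
    fmt = struct.pack("<HHIIHH", 1, 2, 44100, 176400, 4, 16)
    rgad = struct.pack("<fHH", peak, radio_word, audiophile_word)
    body = b"WAVE"
    for name, data in ((b"fmt ", fmt), (b"rgad", rgad), (b"data", pcm)):
        body += name + struct.pack("<I", len(data)) + data
    return b"RIFF" + struct.pack("<I", len(body)) + body


wav_bytes = build_wav_with_rgad(bytes(176400))   # one second of silence

# The wave module does not know "rgad"; it should skip it and find the audio.
with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
    params = reader.getparams()
```

If the reader honours the skip rule, `params` reports 2 channels, a 44100 Hz frame rate, and 44100 frames, exactly as it would for a file without the extra chunk.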

Suggestions and Further Work

If you can see any problem with this proposal, please let me know. No software implements this (as yet).