The RIFF WAV format is defined here. The rules in this rather complex document become a little clearer when implemented! If you want to implement this, please remember that .wav files use little endian byte order.
The RIFF format is based around data chunks. Each chunk has the following basic format: NAME, SIZE, DATA. Wave files always contain a WAVEfmt chunk and a data chunk. The WAVEfmt chunk contains information such as the number of channels, the sampling rate, and the bits per sample. The data chunk contains the actual audio data. The entire file is contained within a RIFF chunk.
There are several other optional chunks, which may be placed between the WAVEfmt chunk and the data chunk. These include Cue and Playlist chunks.
The standard requires that software can successfully read .wav files containing unknown chunks. Should an unknown chunk NAME be encountered, the accompanying SIZE field should be read and the DATA field skipped. This means that new chunks can be defined without breaking compatibility with legacy software.
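The skipping rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete reader: the names `iter_chunks`, `known_chunks` and `KNOWN` are made up for this example, the whole file is assumed to be in memory as a byte string, and the sketch relies on the RIFF convention that a chunk with an odd SIZE is followed by one pad byte.

```python
import struct

KNOWN = {b"fmt ", b"data"}  # chunk names this hypothetical reader understands


def iter_chunks(riff_bytes):
    """Yield (name, data) for every chunk inside a RIFF 'WAVE' byte string."""
    riff, riff_size, form = struct.unpack_from("<4sI4s", riff_bytes, 0)
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF WAVE file")
    pos = 12  # first chunk starts after "RIFF", size, "WAVE"
    end = min(len(riff_bytes), 8 + riff_size)
    while pos + 8 <= end:
        # Every chunk starts with a 4-byte NAME and a little-endian 4-byte SIZE.
        name, size = struct.unpack_from("<4sI", riff_bytes, pos)
        yield name, riff_bytes[pos + 8 : pos + 8 + size]
        # Step over the DATA field, plus one pad byte if SIZE is odd.
        pos += 8 + size + (size & 1)


def known_chunks(riff_bytes):
    """Keep the chunks we understand; unknown ones are read past and dropped."""
    return {name: data for name, data in iter_chunks(riff_bytes) if name in KNOWN}
```

Because the loop only ever uses NAME and SIZE to advance, a reader written this way never needs to know what an unfamiliar chunk contains.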
In the language of the standard document, the format is extended thus:
    <WAVE-form> ->
        RIFF( 'WAVE'
              <fmt-ck>               // Format
              [<rgad-ck>]            // Replay Gain Adjustment
              [<fact-ck>]            // Fact chunk
              [<cue-ck>]             // Cue points
              [<playlist-ck>]        // Playlist
              [<assoc-data-list>]    // Associated data list
              <wave-data> )          // Wave data

    <rgad-ck> -> rgad( <rPeakAmplitude:REAL>
                       <wAudiophileRgAdjust:WORD>
                       <wRadioRgAdjust:WORD> )
In English, we're defining a new chunk called "rgad" (Replay Gain Adjustment). Its size will always be 8 bytes. The first value stored will be the Peak Amplitude (4 bytes); the second value stored will be the Radio Replay Gain Adjustment (2 bytes); the third and final value stored will be the Audiophile Replay Gain Adjustment (2 bytes).
Here is a real example, which should make things clearer. You can see how the "RIFF" chunk wraps around all the other chunks, and also how each chunk starts with its name, followed by its size. The new "rgad" chunk occupies bytes 36 to 51.
| Start byte | Chunk | Title | Contents | Contents (hex) | Bytes | Format |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | RIFF | name | "RIFF" | 52 49 46 46 | 4 | ASCII |
| 4 | | size | 176444 | 3C B1 02 00 | 4 | uInt32 |
| 8 | WAVE | name | "WAVE" | 57 41 56 45 | 4 | ASCII |
| 12 | fmt | name | "fmt " | 66 6D 74 20 | 4 | ASCII |
| 16 | | size | 16 | 10 00 00 00 | 4 | uInt32 |
| 20 | | wFormatTag | 1 | 01 00 | 2 | uInt16 |
| 22 | | nChannels | 2 | 02 00 | 2 | uInt16 |
| 24 | | nSamplesPerSec | 44100 | 44 AC 00 00 | 4 | uInt32 |
| 28 | | nAvgBytesPerSec | 176400 | 10 B1 02 00 | 4 | uInt32 |
| 32 | | nBlockAlign | 4 | 04 00 | 2 | uInt16 |
| 34 | | nBitsPerSample | 16 | 10 00 | 2 | uInt16 |
| 36 | rgad | name | "rgad" | 72 67 61 64 | 4 | ASCII |
| 40 | | size | 8 | 08 00 00 00 | 4 | uInt32 |
| 44 | | fPeakAmplitude | 1 | 00 00 80 3F | 4 | float32 |
| 48 | | nRadioRgAdjust | 10822 | 46 2A | 2 | uInt16 |
| 50 | | nAudiophileRgAdjust | 18999 | 37 4A | 2 | uInt16 |
| 52 | data | name | "data" | 64 61 74 61 | 4 | ASCII |
| 56 | | size | 176400 | 10 B1 02 00 | 4 | uInt32 |
| 60 | | waveform data | ..... | ..... | 176400 | Int16 |
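To check the byte values in the table, the 60 header bytes can be typed in and decoded with Python's `struct` module. This is only a sketch: the `HEADER` constant is copied by hand from the hex column above (waveform data omitted), and the field order inside "rgad" is assumed to be the one shown in the table rows at offsets 44, 48 and 50.

```python
import struct

# Header bytes copied from the hex column of the table (offsets 0 to 59).
HEADER = bytes.fromhex(
    "52494646 3CB10200 57415645"              # "RIFF", size, "WAVE"
    "666D7420 10000000"                       # "fmt ", size 16
    "0100 0200 44AC0000 10B10200 0400 1000"   # format fields
    "72676164 08000000"                       # "rgad", size 8
    "0000803F 462A 374A"                      # peak, radio, audiophile
    "64617461 10B10200"                       # "data", size 176400
)

# "<" selects little-endian byte order with no padding between fields.
name, size = struct.unpack_from("<4sI", HEADER, 36)
# Assumed order: float32 peak, then the two uInt16 adjustment words.
peak, radio, audiophile = struct.unpack_from("<fHH", HEADER, 44)

print(name, size, peak, radio, audiophile)
# b'rgad' 8 1.0 10822 18999
```

The two 16-bit words are left as raw integers here; turning them into gain values needs the separate data format explanation.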
The "rgad" chunk is new. I stress this in case you arrived at this page directly from a web search: what we're discussing here is an extension to the .wav format, not the existing standard.
In this example, looking at the "rgad" chunk:
See the data format explanation if you don't understand how the binary codes were calculated.
The rest of the header information shows that this file is a standard 44.1 kHz, 16-bit, 2-channel wave file (the kind you might rip off a CD). It runs for 1 second, and is 176458 bytes long.
I hacked the MATLAB wavwrite function to create the above file. It plays without problems in Winamp, and loads correctly into Cool Edit Pro. Neither program understands the "rgad" chunk, and both simply ignore it.
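The one-second duration follows from the header fields by simple arithmetic. A small sketch, with variable names chosen here for illustration and values taken from the example header:

```python
# Values from the "fmt " and "data" chunks of the example file.
sample_rate = 44100       # nSamplesPerSec
channels = 2              # nChannels
bits_per_sample = 16      # nBitsPerSample
data_bytes = 176400       # size field of the "data" chunk

# Bytes per sample frame (one sample for every channel).
block_align = channels * bits_per_sample // 8
# Bytes of audio consumed per second of playback.
avg_bytes_per_sec = sample_rate * block_align
# Playing time of the waveform data in seconds.
duration_s = data_bytes / avg_bytes_per_sec

print(block_align, avg_bytes_per_sec, duration_s)
# 4 176400 1.0
```

The computed `block_align` and `avg_bytes_per_sec` match the nBlockAlign and nAvgBytesPerSec values stored in the header.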
If you can see any problem with this proposal, please let me know. No software implements this (as yet).