An extensive Internet search was performed to find source code for a software MPEG-1 codec and a streaming MPEG-1 decoder. A Layer I and II UNIX codec called MPEGiis was found at the Fraunhofer Institute in Germany [16]. A streaming MPEG decoder for Windows could not be found. A real-time decoder called MaPlay, however, was found at Berkeley along with source code [22].
The Fraunhofer Institute contributed its resources to help write the MPEG standard. It began working on perceptual audio coding in the EUREKA project in 1987 and "devised a very powerful algorithm that is standardized as ISO-MPEG Audio Layer3" [16].
The results from the codec search imply that MPEGiis is the most popular MPEG source code released. It has been ported to the Windows, DOS, and Macintosh platforms. At the beginning of this project only the UNIX version of the source code could be found. It was compiled on a Sun Sparcstation and attempts were made to decrease the bandwidth by reducing the number of subbands used. After more searching, a DOS port called AmPlay was found. Versions of this program were compiled with the free GNU DOS C compiler. Subsequently, a Windows version was also found [23]. It came in the form of a dynamic link library (.dll) for a program called Cool Edit. This is the version chosen for this project. All source code was compiled with Microsoft's Visual C++ compiler.
Cool Edit is a shareware audio editing program from Syntrillium [23]. In accordance with the GNU general public license Syntrillium has made its port of the Fraunhofer MPEGiis encoder freely available. In addition to MPEG-1 "filters," Cool Edit also contains a "filter" which enables it to encode a Real Audio file. The Real Audio 3.0 encoder was used as one of the benchmarks to compare results.
One of the original goals of this project was to test the new low bandwidth algorithms over the Internet by playing them with a streaming MPEG decoder. An extensive search at the beginning of this project, however, revealed no such decoders. This left the option of building a streaming decoder. A Windows port of a real-time decoder called MaPlay was found, which also came with source code. Significant progress was made after spending several months reviewing the Netscape API and MaPlay's source code. The original focus of the project was being compromised, however, and so this portion was put on hold and eventually dismissed. Jeff Tsay, the author of the Windows port, has recently added Layer III decoding to MaPlay [22].
Several modifications were made to the Cool Edit 95 MPEG-1 encoder and decoder "filters". One of these includes an option to create a text version of the encoder's output. A major rewrite of the very low bitrate (VLBR) extension was done when Cool Edit 96 was released so that it would be compatible with the latest version of the program.
Three controls were implemented in the encoder to limit the bandwidth of the encoded files. The modified Cool Edit dialog box is shown in Figure 15. It now includes advanced lower subband controls for very low bitrate encoding. These controls are only operable when the user selects 32 kbps, Layer II, Model 2 encoding and sets the "Starting Point" to something other than "Don’t Use VLBR." This is a non-intrusive extension to the original Cool Edit MPEG-1 filters in that the user can always override the VLBR controls to use the original MPEG-1 filters.
There are only eight subband controls because that is the maximum number the MPEG standard defines for 32 kbps encoding.

The first VLBR control limits the maximum number of bits that can be allocated to each subband by the encoder. The default values are 15,15,7,7,7,7,7,7 from subband 1 to subband 8, respectively. A value of zero will eliminate that subband from the bit pool, allowing other subbands to have extra bits. This control allows the user to influence how the bits will be allocated by the encoder and set some rules before the bits have been allocated.
The second VLBR control can decrease the percentage of bits in the encoder's bit pool. MPEG is a fixed rate encoder and has a finite number of bits per frame. The bits are allocated from a bit pool. Each frame is encoded by adding bits to noisy subbands and then checking the MNR of each. If the noise level is still audible then that subband is tagged as needing more bits. The subbands with the most noise are given the highest priority and are first in line to receive more bits. Once the maximum allowable number of bits has been allocated to a subband (now controllable by the first VLBR control), that subband is tagged as full and the remaining bits are given to the other subbands.
This control takes a percentage input and scales the bit pool accordingly. To cut the number of available bits in half, for example, the user would enter 50. The default value is 100%.
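The allocation loop and the first two controls can be sketched roughly as follows. This is a simplified model, not the MPEGiis code: it assumes each allocation step adds one bit per sample to a subband, costs 36 pool bits (a Layer II frame carries 36 samples per subband), and improves that subband's MNR by about 6 dB. The names `allocate_bits`, `cost_per_step`, and `gain_db` are hypothetical.

```python
def allocate_bits(mnr_db, max_bits, pool_bits, pool_percent=100,
                  cost_per_step=36, gain_db=6.0):
    """Greedy allocation: keep giving one more bit to the noisiest subband.

    mnr_db       -- starting mask-to-noise ratio per subband (lower = noisier)
    max_bits     -- per-subband cap (first VLBR control); 0 removes the subband
    pool_bits    -- bits available in the frame before scaling
    pool_percent -- second VLBR control; 50 halves the pool, 100 leaves it alone
    """
    pool = pool_bits * pool_percent // 100   # scale the bit pool (assumed floor)
    mnr = list(mnr_db)
    alloc = [0] * len(mnr)
    full = [cap == 0 for cap in max_bits]    # a cap of zero starts out "full"

    while pool >= cost_per_step and not all(full):
        # Pick the noisiest subband that can still accept bits.
        sb = min((i for i in range(len(mnr)) if not full[i]),
                 key=lambda i: mnr[i])
        alloc[sb] += 1
        mnr[sb] += gain_db                   # assumed ~6 dB improvement per bit
        pool -= cost_per_step
        if alloc[sb] >= max_bits[sb]:
            full[sb] = True                  # cap reached: tag subband as full

    return alloc, pool


if __name__ == "__main__":
    caps = [15, 15, 7, 7, 7, 7, 7, 7]        # default caps from the text
    start_mnr = [-12.0, -9.0, -6.0, -3.0, 0.0, 2.0, 4.0, 6.0]
    print(allocate_bits(start_mnr, caps, pool_bits=1152, pool_percent=50))
```

Setting a cap to zero keeps that subband out of the loop entirely, so the bits it would have consumed remain available to the other subbands.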
The third VLBR control can decrease the percentage of bits allocated to each subband. As opposed to the first control, this control scales the number of allocated bits after the encoder has allocated them. There are four sets of this control so that up to four frames can have different scalars for each subband. This allows the user to play with temporal masking properties.
Further control is given by the "Run for (number) frames" control which lets the user repeat each set's settings for a desired number of frames. In Figure 15, for example, the first set will run for 7 frames then the second set will run for 7 frames, after which the first set will run again.
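The way the sets alternate can be sketched as a lookup from frame number to set. This is only an illustration inferred from the Figure 15 example; the function name `active_set` and the assumption that the enabled sets repeat in a fixed cycle are not taken from the encoder source.

```python
def active_set(frame_index, run_lengths):
    """Return the 0-based index of the scaling set used for a given frame.

    run_lengths -- "Run for (number) frames" value of each enabled set,
                   in order, e.g. [7, 7] for two sets of 7 frames each.
    """
    if not run_lengths or any(run <= 0 for run in run_lengths):
        raise ValueError("each enabled set must run for at least one frame")

    pos = frame_index % sum(run_lengths)     # position inside one full cycle
    for set_index, run in enumerate(run_lengths):
        if pos < run:
            return set_index
        pos -= run


if __name__ == "__main__":
    # Two sets of 7 frames each, as in the Figure 15 example.
    print([active_set(n, [7, 7]) for n in range(16)])
```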
This control also takes a percentage input and scales the number of bits allocated by the encoder accordingly. For aesthetic reasons the maximum number of characters allowable in the edit box is two. To compensate for this a percentage of 99 is interpreted as 100%. The default value is 99%.
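A minimal sketch of this post-allocation scaling is given below. The treatment of 99 as 100% follows the text; rounding down to a whole number of bits is an assumption, and the name `post_scale` is hypothetical.

```python
def post_scale(alloc, percents):
    """Scale the bits the encoder has already allocated to each subband.

    alloc    -- bits allocated per subband by the encoder
    percents -- two-digit percentage per subband from the edit boxes (0..99)
    """
    scaled = []
    for bits, pct in zip(alloc, percents):
        effective = 100 if pct == 99 else pct   # 99 is interpreted as 100%
        scaled.append(bits * effective // 100)  # assumed: round down
    return scaled


if __name__ == "__main__":
    # Leave subband 1 untouched, halve subband 2, silence subband 3.
    print(post_scale([15, 8, 4], [99, 50, 0]))
```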
A drop down list containing preset starting points for different music genres was added to facilitate encoding VLBR files. The list contains four set enable controls and five genre starting points:
| Don’t Use VLBR | turns off the VLBR controls |
| 1. Enable 1 Set | enables the first set of post scaling controls |
| 2. Enable 2 Sets | enables the first two sets of post scaling controls |
| 3. Enable 3 Sets | enables the first three sets of post scaling controls |
| 4. Enable 4 Sets | enables all four sets of post scaling controls |
| 5. Voice | an algorithm for voice demonstrating temporal control |
| 6. Rock / Pop 1 | a higher bandwidth algorithm for rock / pop music |
| 7. Rock / Pop 2 | a lower bandwidth algorithm for rock / pop music |
| 8. Classical / Jazz 1 | an algorithm demonstrating the max bits control |
| 9. Classical / Jazz 2 | a higher bandwidth algorithm for classical / jazz music |
These starting points were determined by testing the encoding algorithms on different source material. This is discussed in Chapter 6.
The "output stats to coolmpeg.txt" checkbox allows the user to get a detailed report on the encoding process. The output is very similar to the sample frame shown in Figure 12. It breaks down the header for each frame, shows the bit allocation and SCFSI for each subband, and the average bandwidth for each frame. The major difference between the actual output file and the output shown in Figure 12 is that the scale sample values are not output. This was done to keep the output file from becoming too large.
Care was taken to make the user interface easy to use. The settings for the advanced subband controls are saved in a file called coolmpeg.ini so that they may be restored each time the dialog box is opened. This is crucial when different settings are tweaked in an attempt to deliver the best sounding file possible.
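Saving and restoring the controls in an INI file might look something like the sketch below. The section name `VLBR` and the keys `MaxBits` and `PoolPercent` are hypothetical; the actual layout of coolmpeg.ini is not described here, and the real filter was written for Windows rather than in Python.

```python
import configparser
import io

DEFAULT_MAX_BITS = [15, 15, 7, 7, 7, 7, 7, 7]   # defaults given in the text


def dump_settings(max_bits, pool_percent):
    """Serialize the VLBR controls to INI text (hypothetical key names)."""
    cfg = configparser.ConfigParser()
    cfg["VLBR"] = {
        "MaxBits": ",".join(str(b) for b in max_bits),
        "PoolPercent": str(pool_percent),
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


def load_settings(text):
    """Parse INI text, falling back to the defaults when nothing was saved."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    if not cfg.has_section("VLBR"):
        return list(DEFAULT_MAX_BITS), 100
    section = cfg["VLBR"]
    default_caps = ",".join(str(b) for b in DEFAULT_MAX_BITS)
    caps = [int(v) for v in section.get("MaxBits", fallback=default_caps).split(",")]
    return caps, section.getint("PoolPercent", fallback=100)


if __name__ == "__main__":
    text = dump_settings([15, 15, 7, 0, 0, 0, 0, 0], 50)
    print(load_settings(text))
```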
One of the goals of this project was to maintain compatibility with the MPEG-1 standard so that the new files would be compatible with current decoders. Not all decoders, however, implement the standard the same way. MPEGiis calculates the number of MPEG-1 frames that need to be read from a file based on the file's bandwidth and size. To maintain compatibility with its decoder the MPEGiis encoder ensures each MPEG-1 frame is of fixed length. It does this by byte aligning the header and adding extra bits to the end of some frames. This created a problem when generating smaller files because the decoder would calculate too few frames and not read in the whole file.
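The undercount can be illustrated with a short calculation. The sketch assumes the decoder divides the file size by the nominal Layer II frame length of 144 × bitrate / sample rate bytes and ignores padding; whether MPEGiis computes the count in exactly this way is an assumption, and the function names are hypothetical.

```python
def layer2_frame_bytes(bitrate_bps, sample_rate_hz):
    """Nominal Layer II frame length in bytes, ignoring the padding slot."""
    return 144 * bitrate_bps // sample_rate_hz


def estimated_frames(file_bytes, bitrate_bps, sample_rate_hz):
    """Frame count a decoder would infer from the file size and bitrate."""
    return file_bytes // layer2_frame_bytes(bitrate_bps, sample_rate_hz)


if __name__ == "__main__":
    # 100 fixed-length frames at 32 kbps and 32 kHz occupy 14400 bytes.
    print(estimated_frames(14400, 32000, 32000))
    # If the same 100 frames shrink to half that size, the estimate drops
    # to 50 and the second half of the file would never be read.
    print(estimated_frames(7200, 32000, 32000))
```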
A patch was created for the decoder that byte aligns the header and removes the extra padding bits (since they are no longer needed). The decoder reads in both standard MPEG-1 files as well as the smaller VLBR files. A side effect is that the decoding progress bar in Cool Edit is not accurate for VLBR files.
Some improvements in audio quality can be gained by doing pre-processing on the audio file before applying it to the encoder. Progressive Networks recommends the following guidelines when encoding Real Audio files [24]:
Progressive Networks has made a Cool Edit script available that will adjust the DC bias, normalize, and limit the dynamic range of a file. Running this script before encoding the VLBR files produced no noticeable improvements in noise reduction, clarity, or overall quality.
Boosting frequencies between 2 and 3 kHz by 3 dB or more does help reduce the muffled quality that results from dropping the high frequencies. It is also helpful to roll off frequencies below 2 kHz by 3 dB or more. Doing either of these too radically, however, can make the encoded file sound too thin. This technique proved to be the most useful in generating better sounding files.
For high amplitude music source material, quantization error produces noise that is similar to white noise. Removing noise from an audio file can be done by training a program with a sample of noise and then asking it to remove instances of that noise. MPEG-1 complicates noise reduction processes, however, because it breaks the original audio signal into subbands, each of which has its own noise level. The noise reduction option in Cool Edit was used to test the effectiveness of noise reduction algorithms on decoded MPEG-1 files. Cool Edit must be trained first by feeding it a sample of the noise. If the recording chain is noisy, for example, it should be fed a clip of (what should have been) "silence" containing the noise signature of the recording chain. While this method of training Cool Edit is not applicable to MPEG-1, the noise reduction algorithms proved to be effective anyway.
Cool Edit has a number of controls in its noise reduction dialog box (Figure 16), some of which can be used to reduce the S/E noise produced by the encoding process.

Since there is no noise signature to feed the noise reduction program the entire file (or a modest section) can be selected and analyzed by selecting the "Get Noise Profile from Selection" button. This generates a profile similar to the one shown in Figure 16.
The results of this project indicate that the noise reduction level should be set between 7 and 13 to reduce MPEG artifacts. If this control is set too high the noise reduction artifacts become very apparent and the signal becomes weak.
Both the smoothing amount and transition width were the most effective in decreasing the amount of audible noise:
Distortion effects may manifest themselves as a "hollow" or "underwater/burbley" sounding signal, dull sounding impacts, "rolly" high end, or a "computerish" mechanical sound. These effects, if heard at all, will fall off if the noise reduction level is reduced. The amount and type distortion depends on the type of noise that is being filtered. Adjust Smoothing Amount and Transition Width up or down to minimize these artifacts [23].
A setting of at least 5 is necessary for at least one of the controls. Setting each to 15, however, resulted in no audible improvement.