3. Hybrid Model Introduction
The hybrid model combines the two general techniques explored in Chapter 2. A measured acoustic impulse response supplies the perceptually important early echoes and the frequency response envelope; this portion of the output is computed with a convolution operation. A generic network of comb, inverse comb, and/or all-pass filters simulates the reverberation tail. A block diagram describing this process is shown in Figure 3.1.

Figure 3.1: Hybrid model
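The structure in Figure 3.1 can be sketched in a few lines of Python. This is a minimal illustration, not the implementation developed in this chapter. It rests on several assumptions: a single feedback comb filter stands in for the comb/all-pass network, the tail runs in parallel with the convolution stage, and the tail input is delayed by the length of the truncated response. The names `feedback_comb`, `hybrid_reverb`, `comb_delay`, `comb_gain`, and `tail_level` are hypothetical.

```python
import numpy as np

def feedback_comb(x, delay, gain):
    """Feedback comb filter: y[n] = x[n] + gain * y[n - delay]."""
    y = np.array(x, dtype=float)
    for n in range(delay, len(y)):
        y[n] += gain * y[n - delay]
    return y

def hybrid_reverb(x, h_early, comb_delay, comb_gain, tail_level):
    """Early echoes by convolution plus a recursive stand-in for the tail.

    Assumption: the tail is fed from the dry input, delayed by len(h_early)
    samples so that it starts where the truncated response ends.
    """
    x = np.asarray(x, dtype=float)
    early = np.convolve(x, h_early)  # length len(x) + len(h_early) - 1
    delayed = np.concatenate([np.zeros(len(h_early)), x])[:len(early)]
    tail = feedback_comb(delayed, comb_delay, comb_gain)
    return early + tail_level * tail
```

Because the recursive stage keeps ringing after the input stops, a caller would normally append zeros to `x` so that the tail is not cut off at the end of the output.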
3.1 Implementation
Both Moorer [20] and Moore [21] have suggested using an FIR filter to model the early echoes and a recursive topology for the late reverberation section. These designs rely on a careful selection of FIR coefficients that best simulate generic early echoes. They do not use a measured impulse response and therefore cannot simulate a specific room. In addition, an FIR filter of practical length will not capture the nuances of the early echoes, and high-order FIR filters can quickly saturate the resources of a practical processor.
3.1.1 Impulse Response Truncation
The notion that the impulse response can be truncated is based on "temporal fusion." This phenomenon occurs when the human auditory system cannot discriminate between discrete incoming sound events. A sound event occurring less than 30 ms to 50 ms after another event will not be heard distinctly. Griesinger describes sound occurring in this interval as being assigned by the brain to the previous sound event [16]. For this reason the portion of an impulse response containing discrete echoes more than 50 ms apart must be kept; this is the early echo response that listeners can hear discretely. The portion where the echoes have diffused so that they occur less than 30 ms apart is the reverberation tail. Once the impulse response has decayed to this point, the convolution stage can be abandoned in favor of a recursive model.
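The spacing rule above can be illustrated with a small helper. The sketch assumes the echo arrival times have already been extracted from the impulse response, which is a separate problem not addressed here. The function name `last_discrete_echo_ms` and the default 30 ms threshold are hypothetical choices taken from the lower end of the range quoted above.

```python
import numpy as np

def last_discrete_echo_ms(arrival_ms, fusion_gap_ms=30.0):
    """Return the arrival time of the last echo that is separated from its
    predecessor by at least fusion_gap_ms.

    All echoes after the returned time arrive within the fusion interval of
    one another, so under the rule in the text they belong to the tail.
    """
    t = np.sort(np.asarray(arrival_ms, dtype=float))
    gaps = np.diff(t)
    wide = np.nonzero(gaps >= fusion_gap_ms)[0]
    if wide.size == 0:
        return t[0]  # every gap is already inside the fusion interval
    return t[wide[-1] + 1]
```

A truncation point would then be chosen shortly after the returned time, so that the last discretely audible echo stays in the convolution stage.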
Two different justifications have been explored for determining where a given impulse response can be truncated. The point in time at which truncation occurs has a significant impact on both the accuracy of the reverberation and the computational requirement.
3.1.1.1 Time Based Truncation
Jot describes the first reflections (approximately the first 80 ms) as the most crucial element of an impulse response, providing the room’s spatial impression. The late reverberation (reverberation tail) is independent of listener position and can be modeled in a more generic fashion [18]. The research of both Farina [8] and Borish [1] has found the most distinct portion of a typical room impulse response to be between 50 ms and 150 ms. Temporal fusion prevents listeners from resolving the closely spaced reflections occurring after this time. Farina suggests that an inverse relationship exists between the crucial portion of a room impulse response and the room’s size; the impulse responses of larger rooms could be truncated earlier than those of smaller rooms [6]. For rooms with long reverberation times, the lengthy "statistical" portion of the impulse response obscures the details of the early reflections.
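A time-based truncation amounts to converting the chosen time into a sample index and splitting the response there. The sketch below shows that step and the resulting cost of a direct-form FIR stage. The function names and the 48 kHz sample rate in the usage note are assumptions made for illustration only.

```python
import numpy as np

def split_impulse_response(h, fs, t_cut_ms):
    """Split h into the early part (kept for convolution) and the late part
    (to be replaced by a recursive model) at t_cut_ms milliseconds."""
    n_cut = int(round(t_cut_ms * fs / 1000.0))
    n_cut = max(0, min(n_cut, len(h)))  # clamp to the available samples
    return h[:n_cut], h[n_cut:]

def direct_fir_macs_per_second(n_taps, fs):
    """Multiply-accumulate operations per second for a direct-form FIR."""
    return n_taps * fs
```

For example, cutting a one-second response sampled at 48 kHz at Jot's 80 ms figure keeps 3840 samples, and `direct_fir_macs_per_second(3840, 48000)` shows how quickly the cost grows if the cut is moved later.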
3.1.1.2 Windowing
Truncating the impulse response, as with any signal, introduces time and frequency distortions that must be corrected to achieve transparency. The recursive late reverberation component of the algorithm provides the time correction, and an equalization component voices the output to compensate for the frequency coloration caused by truncation.
The truncation process can be executed in many different ways. When we simply keep a certain number of samples and set the rest to zero, we are actually applying a rectangular window to the original signal. If the point of truncation falls on a non-zero sample value, the sharp change to zero introduces a high frequency component; this is known as "frequency leakage." It is illustrated by the magnitude spectrum of a rectangular window in Figure 3.2.

Figure 3.2: Magnitude spectrum of rectangular window
The "frequency leakage" occurs in the sidelobe portion of the response. Its effects can be minimized by using other windows that do not have sharp transitions to zero and that have lower sidelobes.[23] We want to keep as much of the impulse response intact as possible while not introducing extra high frequency information during the window (truncation) process. Figure 3.3 shows three common window functions (rectangular, Hamming and Blackman) in both time and frequency.

Figure 3.3: Three 64-sample window functions
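The sidelobe comparison in Figures 3.2 and 3.3 can be reproduced numerically, and the same windows can be used to taper the end of a truncated response. The following is a sketch under stated assumptions: `peak_sidelobe_db` and `truncate_with_fade` are hypothetical helpers, and applying only the falling half of a window to the last `n_fade` samples is one possible way to avoid the abrupt step to zero, not necessarily the procedure used for the figures.

```python
import numpy as np

def peak_sidelobe_db(w, nfft=8192):
    """Highest sidelobe of window w, in dB relative to the main lobe peak."""
    mag = np.abs(np.fft.rfft(w, nfft))
    mag /= mag[0]
    k = 1
    # Walk down the main lobe until the magnitude stops decreasing.
    while k < len(mag) - 1 and mag[k + 1] < mag[k]:
        k += 1
    return 20.0 * np.log10(mag[k:].max())

def truncate_with_fade(h, n_keep, n_fade, window=np.blackman):
    """Keep the first n_keep samples of h and taper the last n_fade of them
    with the falling half of a window (requires 1 <= n_fade <= n_keep)."""
    out = np.array(h[:n_keep], dtype=float)
    fade = window(2 * n_fade)[n_fade:]  # falling half of the window
    out[n_keep - n_fade:] *= fade
    return out

# Compare the three 64-sample windows discussed in the text.
sidelobes = {
    "rectangular": peak_sidelobe_db(np.ones(64)),
    "hamming": peak_sidelobe_db(np.hamming(64)),
    "blackman": peak_sidelobe_db(np.blackman(64)),
}
```

The entries of `sidelobes` fall in the order rectangular, Hamming, Blackman, from highest to lowest sidelobe. The price is a wider main lobe and, for `truncate_with_fade`, attenuation of the last `n_fade` samples that are kept.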
3.1.1.3 Equalization
The high frequencies of an acoustic impulse response spectrum decay much more quickly than the low frequency components due to absorption by wall material, room contents, and air. This effect is modeled in the reverberation tail by the first-order feedback filters. However, truncating the impulse response discards a significant portion of the signal (containing low frequencies).
With the majority of the signal discarded, a new tonal balance is created (Figure 3.4).

Figure 3.4: Full and truncated frequency response of a medium hall impulse response
Since the first portion of the response is kept, thereby retaining the high frequency signature of the room, equalization must be performed to ensure adequate frequency balance. Because this is a linear system, the equalization can occur within one of the processing stages (convolution or the recursive reverberation tail) or a separate pre- or post-equalization (EQ) component.
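One way to quantify the tonal shift between the full and truncated responses is to compare their energy in octave bands. The sketch below is an illustration only: the function name `band_energy_difference_db`, the octave-band edges, and the idea of using these differences as a starting point for EQ gains are assumptions rather than the equalization design used in this work.

```python
import numpy as np

def band_energy_difference_db(h_full, h_trunc, fs, centers_hz):
    """Energy of the full response relative to the truncated one, in dB,
    for octave bands centered on centers_hz.

    Positive values mark bands that lost energy through truncation.
    Each band must contain at least one FFT bin.
    """
    n = len(h_full)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    p_full = np.abs(np.fft.rfft(h_full, n)) ** 2
    p_trunc = np.abs(np.fft.rfft(h_trunc, n)) ** 2  # zero-padded to n
    diffs = []
    for fc in centers_hz:
        band = (freqs >= fc / np.sqrt(2)) & (freqs < fc * np.sqrt(2))
        diffs.append(10.0 * np.log10(p_full[band].sum() / p_trunc[band].sum()))
    return np.array(diffs)
```

A constant offset across all bands reflects only the overall level change; it is the variation from band to band that indicates where the equalizer would need to act.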