APPENDIX A

Qualitative Judgements for Multirate Convolution without Cross-Convolved Channels

Listening tests were performed to evaluate the audibility of the transition-band aliasing present in the proposed algorithm. The results indicated that inexperienced listeners detected a difference between the ideal and aliased results but had difficulty describing or quantifying it verbally.

Procedure

Ten ten-second audio samples were each convolved with one of five impulse responses, once using overlap-save convolution and once using the proposed variation. One of four filter banks, differing in transition bandwidth and stopband attenuation, was used in each multirate convolution. Table 4 summarizes the test battery. Audio samples were recorded from compact disc to a personal computer using Sound Forge 4.0d, manipulated in Matlab, and digitally transferred to a digital audio tape (DAT) recorder. The left channel contained the overlap-save convolution result, and the right channel contained the multirate convolution result. Each channel of the DAT was routed through a Sony MXP-3056 mixing console so that each track could be monitored separately in a centrally panned position. The test samples were monitored via two Event 20/20 near-field monitors positioned approximately five feet from the listener, forty-five degrees to the left and right of straight ahead.
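The reference signal in each test was produced by overlap-save convolution. As an illustration only, the following is a minimal NumPy sketch of that technique; it is not the Matlab code used for the tests, and the function name `overlap_save` and the default `fft_size` are assumptions made for this example.

```python
import numpy as np

def overlap_save(x, h, fft_size=1024):
    """Linear convolution of x with h using overlap-save block FFT processing."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    m = len(h)
    if fft_size < m:
        raise ValueError("fft_size must be at least len(h)")
    step = fft_size - (m - 1)        # valid output samples produced per block
    out_len = len(x) + m - 1         # length of the full linear convolution
    n_blocks = -(-out_len // step)   # ceiling division
    # Prepend m-1 zeros as initial history and zero-pad the tail so that
    # every block holds exactly fft_size samples.
    padded = np.concatenate(
        [np.zeros(m - 1), x, np.zeros(n_blocks * step - len(x))]
    )
    H = np.fft.rfft(h, fft_size)     # filter spectrum, computed once
    y = np.empty(n_blocks * step)
    for b in range(n_blocks):
        block = padded[b * step : b * step + fft_size]
        circ = np.fft.irfft(np.fft.rfft(block) * H, fft_size)
        # The first m-1 samples of each circular convolution are corrupted
        # by wrap-around and are discarded; the rest are exact.
        y[b * step : (b + 1) * step] = circ[m - 1 :]
    return y[:out_len]
```

Up to floating-point rounding, the result should match direct convolution (for example `np.convolve(x, h)`), which is what makes it usable as the exact reference against which the multirate result is judged.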

 

Table 4. Listening test samples.

Number  Style               Impulse response              Multirate filter and order
1       solo violin         large steel stairwell         FIR - 8
2       light vocal rock    medium recital hall - rear    FIR - 16
3       hard rock           large recital hall - rear     FIR - 8
4       chamber orchestra   medium recital hall - middle  IIR - 32
5       instrumental jazz   large recital hall - rear     IIR - 32
6       electronic/world    large recital hall - rear     FIR - 16
7       classical symphony  large recital hall - front    FIR - 8
8       heavy metal         large recital hall - front    IIR - 32
9       religious choral    large recital hall - front    IIR - 64
10      classical guitar    large steel stairwell         FIR - 16

 

The test subjects were six undergraduate university students between the ages of 18 and 21. Subjects were instructed to listen to each excerpt and to rate, on a scale of 1 to 5, the similarity of the second channel to the first. Subjects were also asked to describe any audible differences. Subjects directly controlled which channel they heard using the channel mute buttons on the mixing console, and were allowed to listen to each excerpt up to five times.

Results

Table 5 summarizes the results of the listening tests. Any differences heard between the exact and multirate convolutions should be concentrated in the frequency range where aliasing occurs. For audio sampled at 44.1 kHz, this is the range from approximately 10 to 12 kHz. Although most audio material carries much more information in the lower frequency ranges, high-frequency content critically affects the timbre of many instruments, especially percussion instruments and vocals.
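The location of this range can be checked with a short calculation. The sketch below assumes a two-band split with decimation by 2, which this appendix does not state explicitly; the helper `folded_frequency` and the 11.5 kHz test tone are hypothetical and serve only to show how a component just above the band edge folds back below it.

```python
def folded_frequency(f_hz, fs_hz):
    """Frequency at which a tone at f_hz appears after sampling at fs_hz."""
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

fs = 44100.0
fs_decimated = fs / 2          # assumed decimation by 2 in each band
band_edge = fs_decimated / 2   # 11025 Hz, inside the 10 to 12 kHz range

# A hypothetical 11.5 kHz component that leaks through the filter's
# transition band folds back to the other side of the band edge.
aliased = folded_frequency(11500.0, fs_decimated)
print(band_edge, aliased)
```

Under these assumptions the band edge falls at a quarter of the original sampling rate, and leaked components reappear mirrored about that edge, which is consistent with the 10 to 12 kHz range quoted above.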

 

Table 5. Listening test results with respect to mean similarity rating. In the comments, Channel A is the exact convolution and Channel B is the multirate approximation.

Number  Mean similarity rating (1-5)  Variance  Comments
1       4.17                          0.57      B had more ring; B slightly muffled compared to A; B had less depth than A
2       4.17                          0.57      B's low frequency range is expanded; B mid highs different; kick drum more resonant
3       3.50                          1.10      B had more depth; high hat is brighter in A; cymbal "sizzle" was distorted in B; weird high-freq stuff (cymbals) in B; cymbals more distorted, fuzzy in B
4       4.33                          0.27      B has expanded low freq response; less brittle; more cello in B; slight difference in bass notes
5       4.67                          0.27      A sounded muted; slight distortion on trumpet high riff; a bit on drums
6       4.33                          0.27      drums on B sound weird (phase like), high freq was rounder, less brittle than A; cymbal crash is more definite in B; loss of definition on snare and high hat
7       4.50                          0.30      less depth in A; subtle change in B's presence; B seems louder
8       4.50                          0.30      B has change in presence; loss of definition on drumset
9       4.33                          0.27      female voice more prevalent on A; hear a bit more male voice in B
10      4.17                          0.17      B seemed cleaner; B has more bass, rounder; bell doesn't echo as long in B, loss of sharpness in B
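The mean and variance columns can be computed from the six individual ratings for each item. The individual ratings are not reported in this appendix, so the list in the sketch below is hypothetical and serves only to illustrate the calculation; it uses the unbiased sample variance (division by n - 1), which is an assumption about how the table was produced.

```python
from statistics import mean, variance

# Hypothetical ratings from six listeners for one item.
# These are NOT the study's raw data, which are not reported.
ratings = [5, 5, 4, 4, 4, 3]

item_mean = mean(ratings)      # arithmetic mean of the six ratings
item_var = variance(ratings)   # unbiased sample variance (divides by n - 1)

print(f"mean = {item_mean:.2f}, variance = {item_var:.2f}")
```

This particular set rounds to a mean of 4.17 and a variance of 0.57, the values listed for items 1 and 2, which suggests (without confirming) that the table reports the sample variance rather than the population variance.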

 

Discussion

The listener comments indicate that these differences are generally audible. For the most part, the comments use terms associated with high-frequency content, such as presence, sharpness, and definition. The high mean similarity scores, however, suggest that subjects either (a) had difficulty hearing the differences or (b) did not consider the differences very significant with respect to overall similarity. As one would expect, the items with the highest mean similarity scores had the fewest comments. For items 5, 7, and 8, only two of the listeners detected a difference. Items 5 and 8 used a more selective 32nd-order IIR filter. Item 7 used a very simple 8th-order FIR filter, but applied it to a musical selection with no percussion or vocals. The poorest performance occurred on item 3, which used a simple 8th-order FIR filter in conjunction with drum- and cymbal-laden material.

Although too few subjects were tested to support any general claims, it seems reasonable to assume that multirate convolution using highly selective filters on non-percussive, instrumental material will differ only slightly in sound from the exact single-channel convolution. Whether the difference is detectable will depend significantly on listening conditions and listener experience.
