Chapter 2: Localization


One can rarely read a publication on the topic of localization without seeing some reference to the early twentieth century work of Lord Rayleigh (1907) and his duplex theory of sound.  Rayleigh felt that humans rely on two types of cues for localization: interaural time differences (ITD) and interaural level differences (ILD).  However, his theory did not allow scientists to understand how localization occurred when ITD and ILD were zero or equivalent, which occurs in several situations.  Thus, a complement to Rayleigh’s explanation is that monaural cues from the pinnae must provide additional help with localization when the duplex theory does not apply.

Since Rayleigh’s time, many experiments have been performed to further our understanding of this topic.  The experimental setup typically falls into one of two broad categories, free field (localization) or with headphones (lateralization).  While free-field testing is obviously more natural, headphone testing tends to be more popular because it allows the isolation of each localization cue and also removes any effects from the room.  Yet headphone testing also has its drawbacks, such as the creation of internalized auditory images (i.e. headphone images are perceived inside the head).

For either type of testing, it is important to introduce some basic terms used to describe the position of the sound and auditory events (see Blauert, 1999).  This position is typically described by a horizontal angle (azimuth, gamma) and vertical angle (elevation, delta) from a point directly in front of the listener.  Note from Figure 3 that azimuth and elevation both start at zero directly in front, and increase in a counterclockwise fashion.

The starting point is the intersection of two imaginary planes, the horizontal and median planes.  The horizontal plane is the extension of the interaural axis, containing the center of the ear canal entrance and the lower portion of the eye sockets.  It essentially splits the head into an upper and lower portion.  The median plane bisects the head into left and right portions, whereas the frontal plane creates the front and rear halves.  All three planes are perpendicular to one another and intersect at the center of a symmetrical head (see Figure 3).
 

Figure 3: Views of the auditory planes, azimuth angle and elevation angle
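
The angle conventions above can be sketched as a conversion from azimuth and elevation to a direction vector.  This is only an illustration: the axis layout (x forward, y toward the listener's left, z up) and the function name direction_vector are assumptions chosen so that azimuth increases counterclockwise when seen from above; they are not taken from Blauert (1999).

```python
import math

def direction_vector(azimuth_deg, elevation_deg):
    """Unit vector pointing from the head center toward the source.

    Assumed axes (not from the text): x points forward, y points to the
    listener's left, z points up, so azimuth grows counterclockwise
    when the head is viewed from above.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = math.cos(el) * math.cos(az)
    y = math.cos(el) * math.sin(az)
    z = math.sin(el)
    return (x, y, z)

front = direction_vector(0, 0)    # lies on both the horizontal and median planes
left = direction_vector(90, 0)    # lies along the interaural axis
above = direction_vector(0, 90)   # lies on both the median and frontal planes
```

With these assumed axes, the horizontal plane is z = 0, the median plane is y = 0, and the frontal plane is x = 0.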


Interaural Difference Cues

As was mentioned, two major localization cues involve the interaural level and time differences between the left and right ear canal signals.  Yet, the term “interaural time difference” is somewhat nebulous because it can represent arrival, phase, and envelope temporal differences. Therefore the terms will be defined as below: 

· Interaural Arrival Time Difference (IATD): The difference in arrival time between the left and right ear signals.  Because the speed of sound is constant, this difference results from the different path lengths to the two ears (see Figure 4).  The sound typically reaches one ear first and must then travel around the head to reach the opposite ear.

Figure 4: ITDs are caused primarily by path length differences



· Interaural Phase Difference (IPD): The difference in phase between the left and right ear signals caused by different arrival times.  For a periodic sound with period T, IPD can correspond to two different physical values: either IATD or T - IATD.
 

Figure 5: Interaural Phase Difference (IPD) has two physical values due to periodicity

· Interaural Envelope Time Difference (IETD): The temporal difference of the modulation pattern between the two ears.  IETD is independent of the carrier frequency.  Similar to IPD, it also exhibits two physical values (see Figure 6).
 
 

Figure 6: Interaural Envelope Time Difference (IETD) also has two physical values due to periodicity



· Interaural Time Difference (ITD): A generic term used to describe any of the above time differences.  It typically refers to the one that dominates at the signal frequencies under discussion.  According to Blauert (1999), continuous sounds under 1.6 kHz are dominated by IPD, while IETD has a definite influence above 1.6 kHz.  Although IATD directly affects IPD, its only direct influence is on impulsive sounds.
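
The relationship between path length, IATD, and the two physical values of IPD can be sketched numerically.  The speed of sound (343 m/s), the 10 cm path difference, and the 500 Hz tone are assumed values chosen only for illustration, and the function names are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly room temperature

def iatd_from_path_difference(path_difference_m):
    """Arrival-time difference (s) caused by an extra path length to the far ear."""
    return path_difference_m / SPEED_OF_SOUND

def ipd_time_values(iatd_s, frequency_hz):
    """Return the two candidate time offsets for a periodic tone: IATD and T - IATD."""
    period = 1.0 / frequency_hz
    offset = iatd_s % period       # a periodic signal only reveals the offset modulo T
    return offset, period - offset

iatd = iatd_from_path_difference(0.10)            # hypothetical 10 cm path difference
first, second = ipd_time_values(iatd, 500.0)      # 500 Hz tone, T = 2 ms
print(f"IATD = {iatd * 1e6:.1f} microseconds")
print(f"IPD candidates = {first * 1e6:.1f} and {second * 1e6:.1f} microseconds")
```

The two candidates always sum to one period, which is why a steady tone alone cannot reveal which ear is leading.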

With regard to interaural level differences (ILD), frequencies whose wavelengths are long compared to the 17.5 cm diameter of the head pass relatively undisturbed.  As the frequency of the sound increases (and its wavelength decreases), the sound begins to either reflect off or diffract around the head (see Figure 7).  ILD also depends on source position, because of the asymmetry of the head and body and the way acoustic waves interact with the barriers they encounter.

Figure 7: Interaural level differences caused by reflection of sound off the head are (a) minimal for low frequencies and (b) significant for high frequencies
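
To get a feel for the scale comparison behind ILD, the wavelength of a tone can be set against the 17.5 cm head diameter.  The sketch below assumes a speed of sound of 343 m/s; it compares sizes only and does not model head shadowing itself.  The function names are illustrative.

```python
SPEED_OF_SOUND = 343.0   # m/s, assumed
HEAD_DIAMETER = 0.175    # m, the 17.5 cm figure used in the text

def wavelength(frequency_hz):
    """Wavelength in metres for a tone of the given frequency."""
    return SPEED_OF_SOUND / frequency_hz

def wavelength_to_head_ratio(frequency_hz):
    """Values well above 1 mean the wave is long compared to the head."""
    return wavelength(frequency_hz) / HEAD_DIAMETER

for f in (200, 1000, 2000, 8000):
    print(f, "Hz:", round(wavelength(f), 3), "m, ratio", round(wavelength_to_head_ratio(f), 2))
```

Under these assumptions the wavelength matches the head diameter near 1.96 kHz, so tones well below that frequency are long compared to the head and tones well above it are short.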
 
 

Monaural Cues

Having discussed the interaural cues, it is also important to realize the role played by monaural cues.  These cues are important because for every sound source position, there is a group of other points that share the same path length difference to the two ears (Durrant & Lovrinic, 1984).  These points are more commonly known as the “cone of confusion,” and are represented by a hyperbola in the horizontal plane and a cone in three-dimensional space (see Figure 8).  Two sound sources located on this cone would provide identical interaural cues; thus the monaural cues allow listeners to differentiate between them.
 


Figure 8: The cone of confusion is a set of points which provides identical interaural cues
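
The geometry behind the cone of confusion can be checked with a deliberately simplified model in which the ears are two points in free space and the head is ignored.  The axis layout (x forward, y to the left along the interaural axis, z up), the 8.75 cm ear offset, and the function names are assumptions made for this sketch.

```python
import math

EAR_OFFSET = 0.0875  # m, half of a 17.5 cm head diameter; ears modeled as two points

def path_difference(source):
    """Right-ear path minus left-ear path for a source at (x, y, z) in metres.

    Assumed axes: x forward, y to the left along the interaural axis, z up.
    The head itself is ignored, which is a deliberate simplification.
    """
    left_ear = (0.0, EAR_OFFSET, 0.0)
    right_ear = (0.0, -EAR_OFFSET, 0.0)
    return math.dist(source, right_ear) - math.dist(source, left_ear)

def rotate_about_interaural_axis(source, angle_deg):
    """Rotate a point around the y axis, keeping its distance to each ear fixed."""
    x, y, z = source
    a = math.radians(angle_deg)
    return (x * math.cos(a) - z * math.sin(a), y, x * math.sin(a) + z * math.cos(a))

front_left = (1.0, 0.5, 0.0)   # hypothetical source, ahead and to the left
candidates = [rotate_about_interaural_axis(front_left, a) for a in (0, 60, 120, 180)]
differences = [path_difference(p) for p in candidates]
print(differences)
```

All four positions, including the mirrored rear position at 180 degrees, give the same path difference in this model, which is why interaural cues alone cannot separate them.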



Scientists have known for some time that it is possible to localize sounds with only one ear.  Angell and Fite (1901) compared the localization abilities of an individual with normal binaural hearing to one who was entirely deaf in one ear.  They found that the monaural individual’s localization ability on the side of the non-deaf ear was “not greatly inferior” (p. 236) to that of the normal hearing individual.  However, localization on the side of the deaf ear was “extremely uncertain” (p. 243).

For both subjects, front/back confusions occurred often, and in general, complex sounds (whistles and bells) were more accurately localized than pure tones (tuning forks).  Their final conclusion stated that in monaural hearing, the external ear was responsible for contributing “qualitative peculiarities,” (Angell & Fite, 1901, p. 246) to the sound, which allowed proper localization to occur.

A more detailed analysis of the origin of monaural cues was not published until Batteau (1967) and Blauert (1969).  Blauert described the operation of the pinna as a directionally dependent filter.  He stated that it enhanced or reduced various spectral portions of the input signal depending on the angle of vertical and horizontal incidence.  Blauert’s experiments suggested that these spectral influences dominated localization in fixed-head experiments, and that the actual location of the sound source had little to do with its perceived location.

For instance, because sounds originating from overhead exhibit a peak in the 7 kHz range, a sound that is played in the horizontal plane with an artificial peak at 7 kHz is perceived to originate from overhead.  Blauert defined several sections of the auditory frequency range that behave this way.  He called them “preference bands,” and showed that the relative intensity of these bands is what dictates fixed-head localization.

Batteau (1967) also discussed the influences of the pinna, but in terms of time-based reflections.  He showed that sound will reflect off the individual folds and cavities of the pinna, causing replication of the original signal with very small time delays.  Batteau measured an almost linear relationship between azimuth and monaural pinna delay ranging from 10 µsec (on axis with the ears) to 90 µsec (directly in front) (p. 163).  He also showed that changing the elevation of the sound source influenced the amount and concentration of pinna delay.

Wright, Hebrank, and Wilson (1974) reinforced the plausibility of Batteau’s theory by showing that humans are sensitive to time delays as short as 20 µsec (p. 960).  However, Middlebrooks (1997) points out that time delays essentially cause spectral amplitude modifications due to phase interactions of the original and delayed signals.  Thus, researchers since Batteau’s time have focused on “spectral modifications, rather than on time delays per se” (Middlebrooks, p. 78).
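
Middlebrooks’s point that a time delay shows up as a spectral modification can be illustrated with a single reflection added to the direct sound.  This is a minimal sketch, not a pinna model: it assumes one reflection of equal strength (the hypothetical parameter reflection_gain = 1.0), and the function names are illustrative.

```python
import cmath
import math

def comb_magnitude(frequency_hz, delay_s, reflection_gain=1.0):
    """Magnitude of a direct signal summed with one delayed copy of itself."""
    phase = -2.0 * math.pi * frequency_hz * delay_s
    return abs(1.0 + reflection_gain * cmath.exp(1j * phase))

def first_notch_hz(delay_s):
    """Lowest frequency at which the delayed copy arrives in antiphase."""
    return 1.0 / (2.0 * delay_s)

delay = 90e-6                      # 90 microseconds, the longest delay quoted above
notch = first_notch_hz(delay)      # about 5.6 kHz for this delay
for f in (1000.0, notch, 2.0 * notch):
    print(round(f), "Hz ->", round(comb_magnitude(f, delay), 3))
```

Under these assumptions, a shorter delay pushes the first notch higher in frequency, so a change in delay with source direction appears to the listener as a change in spectral shape.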
 
 


Localization Blur

Our ability to detect changes in a sound source’s position is experimentally measured as the minimum audible angle (MAA), also called “localization blur.” There are various methods of experimentation, but in essence, the localization cues are varied from a fixed point and the MAA is calculated to be the minimal amount of change that a statistically significant number of listeners can detect.  The MAA can be measured for both horizontal and vertical directions.

Blauert (1999) has summarized many of the localization blur experiments, including influential work from Stevens and Newman (1936) and Mills (1958).  From this summary, Blauert suggests that our most acute sense of localization is directly in front (0° azimuth).  In that position he states “the absolute lower limit for the localization blur is, as shown, about 1°” (p. 38).  Schmidt, Vangemert, De Vries, and Duyff (1953) also state that detectable changes in azimuth for pure tones close to the median plane were “considerably less than one degree” (p. 16).

Precision in locating a source is affected by its spatial location and frequency content.  With regard to source location, MAA in the horizontal direction (azimuth) is generally considered to be smaller than that in the vertical direction (elevation) (Strybel and Fujimoto, 2000).  On the horizontal plane, MAA is smallest directly in front of the listener, where it intersects the median plane.  As the source moves around the head, MAA slowly increases to a maximum on axis with the ears, and then decreases again as the sound continues towards the rear of the listener.  In the vertical plane, MAA is again smallest directly in front, near the horizontal plane.  It similarly increases to its maximum directly above the listener’s head before decreasing to a secondary minimum directly behind the listener.

MAA also varies with the signal’s spectral content.  Testing, such as that of Mills (1958), has shown that for pure tones, the middle frequency range generally has a larger MAA than either low or high frequencies.  In addition, narrowband signals and sinusoids are intrinsically more difficult to localize than wideband signals because of the limited number of localization cues the brain has to consider.

The discussion above describes the general nature of localization blur, but in reality it is more complex.  To get an idea of this complexity, consider Figure 9 from Blauert (1999).  It shows the result of test subjects aligning the azimuth of a sinusoidal (solid) and octave-band (dotted) sound source to that of three wideband sources fixed at azimuths of 0° and ±40° from midline.

Notice how the perceived azimuth varies with frequency and also between the two narrowband test signals.  While the results seem to change somewhat unpredictably with frequency, the 0° incidence typically shows smaller variations than either the 40° or 320° (i.e. -40°) positions.  Also, the results fall into a finite area around the wideband source, which is an indication of localization blur.  The white noise position should be considered the absolute position of the source, whereas the difference in azimuth for the sinusoid or octave band represents the localization blur.  For example, a 5 kHz sinusoid (solid) at ~44° azimuth is perceived to share the same location as the white noise source at 40°.  This suggests that the MAA for a 5 kHz sinusoid is approximately 4°.  In comparison, a 5 kHz octave band (dotted) seems to have an MAA around 14° (located at ~26° azimuth) when compared to the same white noise source.

Figure 9: Horizontal plane localization of sinusoidal (solid) and narrow band noise (dotted) as compared to a reference sound of wide-band noise at 0, 40, and 320 degree azimuth locations.  Shown versus frequencies to 5 kHz.  Reprinted from Blauert (1999) with permission from the MIT Press.
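
The azimuth arithmetic used in reading Figure 9 can be written out as a short sketch.  The helper names are hypothetical, and the 44 and 26 degree inputs are the approximate values read from the figure in the discussion above, not new measurements.

```python
def wrap_azimuth(angle_deg):
    """Map any azimuth in degrees to the range (-180, 180]."""
    wrapped = (angle_deg + 180.0) % 360.0 - 180.0
    return 180.0 if wrapped == -180.0 else wrapped

def azimuth_offset(test_deg, reference_deg):
    """Signed difference between a matched test azimuth and the reference azimuth."""
    return wrap_azimuth(test_deg - reference_deg)

sinusoid_offset = azimuth_offset(44.0, 40.0)   # 5 kHz sinusoid against noise at 40 degrees
octave_offset = azimuth_offset(26.0, 40.0)     # 5 kHz octave band against the same noise
signed_320 = wrap_azimuth(320.0)               # the 320 degree source as a signed angle
print(abs(sinusoid_offset), abs(octave_offset), signed_320)
```

Taking the magnitude of each offset gives the approximate blur values quoted above, and wrapping shows why the 320 degree source is treated as -40 degrees.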

 
 

 Created  February 2003 by Rob Hartman
Copyright (C) 2003