Attributes of Linear Phase Loudspeakers
Introduction
Source: http://www.kfs.oeaw.ac.at/content/blogcategory/0/378/
I must admit, I did not know about the above research and its results. I had
been searching the internet for about a year before I came across this simple piece of
information, and there is a lot more like it. I now realise that there is a volume of research
results clearly indicating that, rather than asking "is phase distortion audible?", we
should be asking "how does phase distortion manifest itself?".
Naively, and without any prior experience of how to go about it, I
conducted my own listening tests, comparing the sound of traditional minimum-
phase loudspeakers with the sound of linear-phase loudspeakers. I am talking here about
acoustically linear-phase loudspeakers. During my short, initial listening tests on
linear-phase loudspeakers, I was surprised by how indifferent the linear-phase mode
sounded to my ear.
Was I doing the right thing, then? This result definitely required further
investigation with much more diverse listening material.
Listening Habits
I had been doing the same type of analysis over and over again for years, and had grown
accustomed to this ritual. It was easy to compare against measured results, so it felt
comfortable that I could correlate my measurements with what I could easily hear (or
could not hear).
Recently, things have changed for me. I came across a simple paper,
http://www.audiophilerecordingstrust.org.uk/articles/speaker_science.pdf, which
inspired me to take a more comprehensive look at my listening tests. Having read the
paper, I re-examined information from other internet resources, and as a result I came
to the conclusion that my listening tests were only a starting point for what I should
have been listening to when examining linear-phase loudspeakers.
Yes, it appears that I had been covering only half of what I should have been
paying attention to, and the paper mentioned above made that startlingly clear to me.
Source: http://redspade-audio.blogspot.com.au/2012/03/bathurst-2011-audio-event-of-year.html
DEQX demo
“…A highlight this year was a demo of the capabilities of DEQX. This came about
from discussions of my active crossover listening comparisons, in which a small
group could not hear any improvement with DEQX. Terry argued that we had
dumbed down the DEQX and prevented it from showing what it can do. This is
certainly true, we wanted to test sound quality only and in that regard found no reason
to spend the extra compared to cheaper options. However, Terry set up a demo in
which two profiles were created on DEQX. One was limited to the processing power
of MiniDSP and DCX. The other allowed DEQX to strut its stuff. In particular, it was
allowed to correct for phase and group delay. We then blind tested this with instant
switching, not knowing what was being heard. I was the first to sit in the chair and do
the demo and quite soon I didn't need to be told which was which, because the
difference was obvious.
Both had a basic level of time alignment with digital delays. Both were matched in
level and in response closely. These differences were related to the group delay
correction. Without it, the sound was flat and almost lifeless in comparison.
I then watched as others sat through the demo, each person noticing the same
differences, differing only in the amount of time taken before declaring what they
heard…..”
Personally, I can testify to the tighter bass audible in linear-phase mode. I
operate large 18”, vented-enclosure subwoofers tuned to 20 Hz. Playing impulsive
sounds in minimum-phase mode, the subs overshoot and then add prolonged
ringing after steep, impulse-like signals. This unwanted flabbiness is unfortunately
audible in minimum-phase mode on low-frequency impulsive material.
http://www.bodziosoftware.com.au/LP_MP_Subwoofer_Tests.pdf
In linear-phase mode, however, the punch is still deep, but tight, without the
“aftersounds”.
3. Realism
Source: http://www.audiophilerecordingstrust.org.uk/articles/speaker_science.pdf
(This is a must-read article in its entirety)
A very small amount of research and thought will reveal that this is a misguided
view. Frequency response is important, but not so important that the attainment
of an ideal response should be to the detriment of realism. One tires of hearing
that "phase doesn't matter" in audio or "the ear is phase deaf". These are outmoded
views which were reached long ago in flawed experiments and which are at variance
with the results of recent psychoacoustic research.
The ear works in two distinct ways, which it moves between in order to obtain the
best outcome from the fundamental limits due to the Heisenberg inequality. The
Heisenberg inequality states that as frequency resolution goes up, time resolution goes
down and vice versa. Real sounds are not continuous, but contain starting transients.
During such transients, the ear works in the time domain. Before the listener is
conscious of a sound, the time domain analysis has compared the time of arrival
of the transient at the two ears and established the direction. Following the
production of a transient pressure step by a real sound source, the sound pressure must
equalise back to ambient.
The rate at which this happens is a function of the physical size of the source. The ear,
again acting in the time domain, can measure the relaxation time and assess the size of
the source. Thus before any sound is perceived, the mental model has been told of the
location and size of a sound source.
In fact this was the first use of hearing, as a means of perceiving a threat in order to
survive. Frequency analysis in hearing, consistent with the evolution of speech and
music, came much later. After the analysis of the initial transient, the ear switches over
to working in the frequency domain in order to analyse timbre. In this mode, the
mode that will be used on steady-state signals, phase is not very important. However,
the recognition of the initial transient and the relaxation time are critical for
realism. Anything in a sound reproduction system which corrupts the initial transient
is detrimental.
Whilst audio electronics can accurately handle transients, the traditional loudspeaker
destroys both the transient and the relaxation time measurement. Lack of attention to
the time domain in crossover networks leads to loudspeakers which reproduce a single
input step as a series of steps, one for each drive unit at different times..."
4. Depth
5. Resolution
6. Separation of ambience
Source: http://www.bostonaudiosociety.org/bas_speaker.htm
http://www.bostonaudiosociety.org/pdf/bass/
“….If the stereo loudspeakers differ in their time-shift behaviour by more than about
thirty millionths of a second (or a finer tolerance, perhaps, for critical listeners), the
stereo image will be perceptibly smeared. The two speakers must "speak" together
at all frequencies if the subtlest details in the stereo field are to be preserved.
This, quite simply, may be the principal advantage to be gained from "linear-phase"
or "time-corrected" loudspeakers. The manufacturers who are striving to reduce the
time dispersion of loudspeakers to zero may also be ensuring that there will be no
significant differences in signal propagation timing between the two speakers in a
stereo pair. The delicate timing information in a stereo recording is thus accurately
retained and is transmitted to the listener unaltered…”
1. Depth.
This may surprise some listeners when they first hear it, since many speakers (and
records) elicit only a general left-to-right spread. But "stereo", as originally
conceived, implied a three-dimensional sound in which voices or instruments could be
localized at different apparent distances from the listener as well as at various lateral
positions. Listeners to time-aligned speakers consistently report hearing a stereo
image with unusual depth.
2. Resolution.
The stereo image is reproduced precisely, each voice or instrument having its proper
place and width. In complex sound sources such as symphony orchestra, individual
instruments can be resolved with unexpected clarity. In the old cliché, "I hear details I
never knew were in the recording." Some listeners have incorrectly attributed the
improved resolution of detail to more accurate transient response, but the better
definition of details is simply the result of the reduction of blending in the stereo
image.
3. Separation of ambience.
With loudspeakers whose stereo image is slightly blended because of time-smear, any
hall ambience or reverberation in the recording tends to become slightly mixed with
the instrumental sounds, causing coloration of those sounds. Consequently, with such
speakers closely-microphoned recordings tend to sound better because of their
distinctly defined sound. But with time-corrected loudspeakers, the ambience is
resolved as a separate sound, and larger amounts of hall ambience in recordings can
be enjoyed…….”
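The "thirty millionths of a second" figure quoted above can be given a physical feel with a quick calculation. The sketch below is hedged: the head radius and Woodworth's ITD approximation are textbook psychoacoustic assumptions, not values from the quoted article. It converts a 30 µs inter-speaker timing error into an equivalent path-length difference and an approximate apparent image shift:

```python
import math

# Hedged sketch: what a 30 us inter-speaker timing error means
# physically. Head radius and Woodworth's ITD formula are textbook
# assumptions, not values from the quoted article.
C = 343.0             # speed of sound, m/s
HEAD_RADIUS = 0.0875  # typical head radius, m

def woodworth_itd(theta_rad, r=HEAD_RADIUS, c=C):
    """Woodworth's approximation of the interaural time difference for
    a source at azimuth theta (radians from straight ahead)."""
    return (r / c) * (theta_rad + math.sin(theta_rad))

# Path-length difference equivalent to a 30 us timing error:
path_mm = 30e-6 * C * 1000.0  # ~10.3 mm

# Azimuth whose ITD equals 30 us, found by simple bisection:
lo, hi = 0.0, math.pi / 2
for _ in range(60):
    mid = (lo + hi) / 2
    if woodworth_itd(mid) < 30e-6:
        lo = mid
    else:
        hi = mid

print(f"30 us ~ {path_mm:.1f} mm of path, ~{math.degrees(hi):.1f} deg of image shift")
```

A few degrees of image shift from a single centimetre of effective path error illustrates why the two speakers must "speak together" so precisely.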
7. Inter-channel accuracy of sound reproduction.
Source: http://www.cirrus.com/en/pubs/whitePaper/DS668WP1.pdf
Source: http://en.wikipedia.org/wiki/Precedence_effect
The precedence effect appears, if the subsequent wave fronts arrive between 2 ms and
about 50 ms later than the first wave front.
The precedence effect is important for the hearing in enclosed rooms. With the
help of this effect it remains possible to determine the direction of a sound source
(e.g. the direction of a speaker) even in the presence of wall reflections….”
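As a small sketch, the delay windows from the quoted summary can be encoded directly. The 2 ms and 50 ms boundaries come straight from the quote; real perception also depends on the reflection's level and the signal type, so treat this as a rough map, not a model:

```python
# Sketch: the delay windows from the quoted summary, encoded directly.
# Thresholds (2 ms, 50 ms) are from the quote; real behaviour also
# depends on level and signal type.
C = 343.0  # m/s, to relate delay to extra path length

def classify_reflection(delay_ms):
    """Rough perceptual category of a single reflection vs. the direct sound."""
    if delay_ms < 2.0:
        return "summing localization (fuses with, and can shift, the image)"
    if delay_ms <= 50.0:
        return "precedence effect (direction taken from the first wavefront)"
    return "perceived as a discrete echo"

for d in (1.0, 10.0, 80.0):
    print(f"{d:5.1f} ms (+{C * d / 1000.0:5.2f} m path): {classify_reflection(d)}")
```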
Source: http://sound.media.mit.edu/Papers/kdm-phdthesis.pdf
Page 44
http://www.davidgriesinger.com/Acoustics_Today/Pitch,%20Timbre,%20Source%20Separation_talk_web_sound_3.pptx
11. Confirmation of two-stage processing by the ear, as discussed in (3).
Source: http://www.hauptmikrofon.de/theile/ON_THE_LOCALISATION_english.pdf
It appears that the “law of the first wavefront” can be interpreted as the “law of the
first localisation stimulus”…..”
“…..6. Summary
According to the association model presented in the preceding chapters, the
functioning of the auditory system with respect to spatial hearing is due to two
different processing mechanisms. Each of these two processing mechanisms manifests
itself in the form of an associatively guided pattern selection.
Source: http://arxiv.org/pdf/1208.4611v2.pdf
"..The time-frequency uncertainty principle states that the product of the temporal and
frequency extents of a signal cannot be smaller than 1/(4PI). We study human ability
to simultaneously judge the frequency and the timing of a sound. Our subjects often
exceeded the uncertainty limit, sometimes by more than tenfold, mostly through
remarkable timing acuity. Our results establish a lower bound for the nonlinearity
and complexity of the algorithms employed by our brains in parsing transient
sounds, rule out simple "linear filter" models of early auditory processing, and
highlight timing acuity as a central feature in auditory object processing…."
And further:
"…In many applications such as speech recognition or audio compression (e.g. MP3
[18]), the first computational stage consists of generating from the source sound
sonogram snippets, which become the input to latter stages. Our data suggest this is
not a faithful description of early steps in auditory transduction and processing,
which appear to preserve much more accurate information about the timing and
phase of sound components [12, 19, 20] than about their intensity…."
And finally:
"…Early last century a number of auditory phenomena, such as residue pitch and
missing fundamentals, started to indicate that the traditional view of the hearing
process as a form of spectral analysis had to be revised. In 1951, Licklider [25] set the
foundation for the temporal theories of pitch perception, in which the detailed pattern
of action potentials in the auditory nerve is used [26, 28], as opposed to spectral or
place theories, in which the overall amplitude of the activity pattern is evaluated
without detailed access to phase information. The groundbreaking work of Ronken
[22] and Moore [23] found violations of uncertainty-like products and argued for
them to be evidence in favour of temporal models. However this line of work was
hampered fourfold, by lack of the formal foundation in time-frequency distributions
we have today, by concentrating on frequency discrimination alone, by technical
difficulties in the generation of the stimuli, and not the least by lack of understanding
of cochlear dynamics, since the active cochlear processes had not yet been discovered.
Perhaps because of these reasons this groundbreaking work did not percolate
into the community at large, and as a result most sound analysis and processing
tools today continue to use models based on spectral theories. We believe it is
time to revisit this issue….."
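The 1/(4π) bound quoted above is easy to check numerically. The sketch below uses a plain Gaussian pulse, which is the minimum-uncertainty signal; the sample rate and pulse width are arbitrary choices for the illustration. It computes the RMS extents in time and frequency and lands on the bound, which is exactly why the subjects' "tenfold better" results are so striking:

```python
import numpy as np

# Numerical check of the 1/(4*pi) time-frequency bound. A Gaussian
# pulse is the minimum-uncertainty signal; sample rate and width are
# arbitrary choices for this sketch.
fs = 48000.0
t = np.arange(-0.5, 0.5, 1.0 / fs)   # 1 s of time axis
sigma = 0.01                          # pulse width, s
x = np.exp(-t**2 / (2.0 * sigma**2))

def rms_width(axis, power):
    """RMS width of a distribution given sample positions and weights."""
    p = power / power.sum()
    mean = (axis * p).sum()
    return np.sqrt(((axis - mean) ** 2 * p).sum())

dt = rms_width(t, x**2)               # temporal extent

X = np.fft.fft(x)                     # two-sided spectrum
f = np.fft.fftfreq(len(x), 1.0 / fs)
df = rms_width(f, np.abs(X) ** 2)     # spectral extent

print(dt * df, 1.0 / (4.0 * np.pi))   # both ~0.0796
```

Any linear filter bank is stuck at or above this product; human listeners beating it is the paper's evidence for nonlinear temporal processing.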
Some very interesting information on transients and localization comes from the
development work of Joseph Manger. The whole paper is recommended for reading.
Source: http://www.manger-audio.co.uk/PDFs/acoustical_reality.pdf
How distorted transients can be – Manger illustrates this in the following pictures:
14. AES Technical Document on phase accuracy and transient fidelity.
In 2002, the AES document AESTD1001.1.01-10 drove a stake in the ground, pegging
10 µs as the maximum allowable timing difference between stereo loudspeakers across
the entire audio band.
Source: http://www.aes.org/technical/documents/AESTD1001.pdf
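For perspective, a fixed 10 µs interchannel timing error corresponds to a phase error that grows linearly with frequency. A two-line sketch:

```python
# Sketch: phase error implied by a fixed 10 us interchannel timing
# mismatch grows linearly with frequency (360 deg * f * dt).
for f in (100, 1000, 10000):
    print(f"{f:>5} Hz: {360.0 * f * 10e-6:6.2f} deg")
```

At 10 kHz the tolerance is already 36 degrees of interchannel phase, which is why the specification is stated as a time, not a phase, limit.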
The authors performed subjective and objective analysis of several woofers using
impulsive tones, and concluded:
The following paper clearly indicates that transients are critical in the localization
process.
Source: http://www.pa.msu.edu/acoustics/rooms1.pdf
http://www.pa.msu.edu/acoustics/rooms2.pdf
“….Our results indicate the following: (1) A sound must include transients if the
precedence effect is to operate as an aid to its localization in rooms. (2) Even if
transients are present the precedence effect does not eliminate all influences of room
reflections. (3) Due to the interference of reflections large interaural intensity
differences may occur in a room and these have a considerable influence on
localization; this is true even at low frequencies for which IID cues do not exist in a
free field. (4) Listeners appear to have certain expectations about the reliability and
plausibility of various directional cues and perceptually weight the cues accordingly;
we suggest that this may explain, in part, the large variation in time-intensity trading
ratios reported in the literature and also the differing reports regarding the importance
of onsets for localization. (5) In this study we find that onset cues are of some
importance to localization even in free field.
http://www.pa.msu.edu/acoustics/rooms3.pdf
Conclusions
(2) The precedence effect is maximally effective when the signal onset is
instantaneous. Its effectiveness begins to diminish as the onset duration is
increased…..”
Source: http://www.pa.msu.edu/acoustics/rakhar2.pdf
(2) Onsets are a great leveler among individuals. Whereas the ability to localize
steady-state sounds varies greatly among listeners, the ability to localize
sounds with an onset transient shows best-to-worst differences of less than 1.5 degrees
among our seven listeners….”
“….3. CONCLUSIONS
In this study localization of sound in a room with reflecting walls was tested. Eleven
stimuli were used, differing in spectral and temporal information. For such a room the
following can be concluded:
- Signal bandwidth is an important cue for localization. The broader the frequency
spectrum of a sound, the better localization performance.
Although many of the errors made were distance errors, subjects are able to localize
distance quite well. Furthermore subjects usually select the correct side, even for the
hard to localize steady state sinusoids…..”
There is very good correlation between transient distortion and subjective listening
tests. Whenever there are peaks in the transient distortion, one can be sure that the
listening tests will reveal unpleasant distortion, even though the sound-pressure
curve is quite smooth….
Extensive measurements show that for a high-quality audio system the sound-pressure
curve must be smooth and properly shaped, and that the transient distortion should be
down at least 18dB throughout the range. One can then be fairly certain that the system
will pass very careful listening tests….”
20. Confirmation of two-stage processing by the ear, as discussed in (3) and (11).
Yet another interesting paper. It puts early reflections in a somewhat different
perspective.
“…Any reverberant space yields comb-filtering effects, and virtually all listening to
music via loudspeakers is done in such spaces. Therefore, logically speaking, all
listening is done under compromised conditions, where a primary attribute of accurate
sound reproduction (flat amplitude response) is negated. Yet we must acknowledge
that music playback systems seem to work well: listeners enjoy listening, they readily
and accurately identify sounds (and will testify to their realism), and some listeners
are able to detect truly microscopic differences between alternate components in the
playback system.
This anomaly raises the question: how can individuals listen effectively to
loudspeakers in reverberant spaces and why don't the ubiquitous comb-filtering
interference effects always pose problems for the listener?
I suggest that the answer lies in the nature of our auditory localization
capability, which makes use of interference effects such as comb-filtering as a
function of performing the sound source localization task.
The above statement confirms earlier findings of Gunter Theile, Watkinson and
Manger about the ear processing the incoming audio stimulus in two stages: The
received ear signals can be attributed to the two sound source characteristics of
“location” and “signal”, which are independent of each other but always occur in a
pair-wise fashion.
20. General Conclusions From Papers Presented Above
Early reflections occurring less than 2.5 ms after the original sound sensation are
known to shift the image towards their direction and to blur the image.
Early reflections occurring more than 2.5 ms but less than 5 ms after the original
sound sensation are known to blur the image, although they keep the direction of the
image constant….”.
So if your goal is to deliver the sharpest image, or the most accurate localization, you
would be well advised to take care of transient origination (the loudspeakers) and also
provide some acoustic treatment for the walls/room.
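To see which surfaces those early reflections actually come from, the delay thresholds above can be translated into extra travel distance:

```python
# Sketch: translate the reflection-delay thresholds above into extra
# travel distance relative to the direct sound.
C = 343.0  # speed of sound, m/s

extra_path = {delay_ms: C * delay_ms / 1000.0 for delay_ms in (2.5, 5.0)}
for delay_ms, metres in extra_path.items():
    print(f"{delay_ms} ms -> {metres:.2f} m of extra path")
# 2.5 ms ~ 0.86 m; 5 ms ~ 1.72 m: the bounces off the nearest surfaces.
```

Reflections inside roughly a metre of extra path (desk, side wall, floor near the speaker) are the ones that shift and blur the image, which is where treatment pays off first.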
There are basically three areas where linear-phase loudspeakers differ from minimum-
phase loudspeakers.
2. Identical phase response for all loudspeakers in the system. The phase response
of a correctly equalized multi-channel linear-phase system is 0 deg in every
loudspeaker. It therefore immediately satisfies AESTD1001.1.01-10 for phase
accuracy and transient fidelity to perfection. Measurements of a linear-phase
loudspeaker are presented on my website, and comments on
AESTD1001.1.01-10 are presented in -
http://www.bodziosoftware.com.au/AES_Document_Comments.pdf
3. Tighter bass. Even Dr. Floyd Toole quoted other researchers (Craven and
Gerzon) on this subject on page 420. This is the most obvious audible
difference. I have conducted extensive tests on this subject -
http://www.bodziosoftware.com.au/LP_MP_Subwoofer_Tests.pdf
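The "0 deg phase in every loudspeaker" claim rests on a basic DSP property: any FIR filter with symmetric taps has exactly linear phase, i.e. a constant group delay that a plain bulk delay removes. The numpy sketch below uses toy coefficients, not an actual crossover design, to demonstrate the property:

```python
import numpy as np

# Sketch: a symmetric-tap FIR filter has exactly linear phase, i.e.
# constant group delay. The taps are a toy lowpass, not a real
# crossover design.
h = np.array([0.05, 0.25, 0.4, 0.25, 0.05])
assert np.allclose(h, h[::-1])         # symmetry => linear phase

N = 1024
H = np.fft.rfft(h, N)
f = np.fft.rfftfreq(N)                 # normalized frequency, cycles/sample
phase = np.unwrap(np.angle(H))

# Group delay -d(phase)/d(omega) should be (len(h) - 1) / 2 = 2 samples:
gd = -np.diff(phase) / (2.0 * np.pi * np.diff(f))
print(gd[:4])                          # ~[2. 2. 2. 2.]
```

Because every driver's DSP filter can be built this way, all channels share one constant delay and hence identical (compensatable) phase.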
21. Square Wave Loudspeaker Testing
Another interesting paper, from the 94th AES Convention. I would recommend reading
the entire paper.
”…Summary
This is the reason why loudspeakers of proven square wave response capability
are an important prerequisite for the natural reproduction of sound. Moreover,
it is only possible to detect acoustic faults with this type of technically
faultless reference loudspeaker. It should be clearly noted that all other
current components in the electroacoustical reproduction chain already transmit
square wave signals correctly.
Correct square wave reproduction with loudspeakers has the same importance as
was the case for correct square wave reproduction with amplifiers in the 1960's.
Both constitute fundamental advances and establish important conditions for high
quality reproduction of music. Nothing can propagate the concept of high
fidelity more than these types of advances….”
Time Domain Instrument Testing
The system under test discussed here consists of a filter and a loudspeaker in an
enclosure. The two components that introduce time delay are the filter and the
combination of the driver and the enclosure itself. To illustrate this, a 12” guitar
loudspeaker in a vented box was measured, and its minimum-phase responses were
obtained with the help of the MLS measurement technique – see below. It is immediately
observable that the loudspeaker has a rather irregular frequency response. Since the
loudspeaker is essentially a minimum-phase device, the corresponding phase response is
also highly irregular, and definitely not flat.
Let’s establish the frequency range of interest, which is the range over which the
SPL will be equalized to a flat response. In my example it will be 90 Hz – 5500 Hz.
A 300 Hz square wave reproduced by this loudspeaker is highly distorted. The strong
ringing is due to a sharp, 10 dB SPL peak located at 3.5 kHz. You can see that there are
about 11 periods of the ringing waveform in one period of the 300 Hz square wave.
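This behaviour is easy to reproduce in simulation: drive a resonant second-order peak, standing in for the measured +10 dB bump at 3.5 kHz, with a 300 Hz square wave. Everything below is an illustrative assumption (the RBJ peaking-EQ biquad as the resonance model, the chosen Q), not the measured loudspeaker:

```python
import numpy as np

# Hedged simulation: a second-order resonant peak standing in for the
# measured +10 dB bump at 3.5 kHz, driven by a 300 Hz square wave.
# The filter model (RBJ peaking-EQ biquad) and Q are illustrative
# assumptions, not the measured loudspeaker.
fs = 96000
f0, q, gain_db = 3500.0, 8.0, 10.0
w0 = 2.0 * np.pi * f0 / fs
alpha = np.sin(w0) / (2.0 * q)
A = 10.0 ** (gain_db / 40.0)

b = [1.0 + alpha * A, -2.0 * np.cos(w0), 1.0 - alpha * A]
a = [1.0 + alpha / A, -2.0 * np.cos(w0), 1.0 - alpha / A]

t = np.arange(int(fs * 0.02)) / fs                  # 20 ms, six periods
square = np.sign(np.sin(2.0 * np.pi * 300.0 * t))

y = np.zeros_like(square)
x1 = x2 = y1 = y2 = 0.0
for n, x0 in enumerate(square):                     # direct-form I biquad
    y0 = (b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2) / a[0]
    y[n] = y0
    x1, x2, y1, y2 = x0, x1, y0, y1

# Ringing frequency / fundamental = ringing cycles per square period:
print(f0 / 300.0)   # ~11.7, matching the ~11 periods in the measurement
```

Each square-wave edge shock-excites the resonance, so the filtered waveform `y` carries roughly f0/300 ≈ 11.7 ringing cycles per period, superimposed on the flat tops.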
Instrument test results obtained from linear-phase loudspeakers reveal their
true superiority in the time domain. The following test results were obtained by John
Kreskovsky of Music and Design ( http://www.musicanddesign.com ).
The first figure shows the 300 Hz response. This is close to the low-frequency cut-off
of the system, where the phase rotation and group delay due to the 200 Hz high-pass
cut-off would normally result in loss of flat-top behaviour, and the 2 kHz crossover
would cause distortion of the initial rise. This is shown in the insert at the upper right
of the plot for the linearized system, and confirmed by the lower plot, which is for the
standard LR4 system. The white trace is the input; orange is the acoustic output from the
speaker system.
300 Hz square response of linearized system, left, and standard LR4 crossover, right.
500 Hz square response of linearized system, left, and standard LR4 crossover, right.
1kHz square response of linearized system, left, and standard LR4 crossover, right.
2kHz square response of linearized system, left, and standard LR4 crossover, right.
Next, I used 2 ms-wide pulses separated by 350 ms of space as the source signal.
On the 2 ms pulse, the minimum-phase version delivered more of a “thump” than
a pop or a click. This is perhaps not surprising, as the post-ringing of the pulse
extended to 130 ms and far exceeded the 30 ms “memory effect” of the auditory
system. Here, the driver, filter and vented enclosure added their own combined
signature. It is also observable that the minimum-phase version of the subwoofer
converted the clearly asymmetrical pulse into a much more symmetrical bi-polar
pulse with post-ringing. This is clearly visible in the screen shots below.
When a 2 ms bi-polar pulse was used for excitation, the minimum-phase
version did the opposite, converting the symmetrical bi-polar pulse into a
pulse with a clear asymmetrical tendency. The ringing past the pulse is due to a more
distant microphone placement, so the mike now picks up some of the room reflections.
When a 10 ms bi-polar pulse was used for excitation, the minimum-phase
version showed an even more asymmetrical tendency.
10ms Bi-polar pulse in Linear-Phase Mode and Minimum-Phase Mode
The linear phase result is on the left and the nonlinear phase result on the right.
It should be noted that there is some distortion in the waveforms that must be
attributed to room reflections. Square wave testing is a steady-state test, and without a
true anechoic chamber the effects of room reflections cannot be eliminated.
Nevertheless, for the 300 Hz case shown in the first figure, the linear-phase
system shows the sharp rise and fairly flat top expected. The nonlinear-phase case
shows early tweeter response followed by the woofer response, and the sloped top is
an artefact of the nonlinear phase. The response also significantly overshoots the
correct level. This latter effect is seldom discussed when comparisons of linear-
and nonlinear-phase systems are made. Even though the amplitude of each frequency
component is correctly reproduced in the nonlinear-phase system, the lack of linear
phase means that the different frequency components do not sum correctly, since they
are delayed by different amounts. The overshoot is a result of time distortion.
300 Hz square wave response, Linear phase, left; Nonlinear phase, right
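The "components do not sum correctly" point can be shown with a few lines of synthesis: build a band-limited 300 Hz square wave from its odd harmonics twice, once with aligned (linear) phase and once with a dispersed, frequency-dependent phase standing in for a non-linear-phase crossover. The 0.15·k² dispersion is purely illustrative:

```python
import numpy as np

# Sketch: same harmonic amplitudes, different phases. The dispersed
# phase law (0.15 * k^2) is an illustrative stand-in for a
# non-linear-phase crossover, not a model of any real system.
F0 = 300.0
t = np.linspace(0.0, 2.0 / F0, 4000, endpoint=False)   # two periods

def partial_square(phase_fn, n_harm=15):
    y = np.zeros_like(t)
    for k in range(1, n_harm + 1, 2):                  # odd harmonics
        y += (4.0 / (np.pi * k)) * np.sin(
            2.0 * np.pi * F0 * k * t + phase_fn(k))
    return y

flat = partial_square(lambda k: 0.0)                   # linear phase
dispersed = partial_square(lambda k: 0.15 * k**2)      # dispersed phase

rms = lambda y: np.sqrt(np.mean(y**2))
print(rms(flat), rms(dispersed))      # identical: same magnitude response
print(flat.max(), dispersed.max())    # different: time-domain distortion
```

The RMS levels (and the whole magnitude spectrum) are identical, yet the waveforms and their peak values differ, which is exactly the overshoot-with-flat-response behaviour described above.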
The next figure shows the same comparison for a 1kHz square wave. Again,
some distortion is observed due to room reflections. However, the linear phase case
again shows the expected sharp rise and relatively flat top. The nonlinear phase
system more clearly shows the time lag between the woofer and tweeter response.
1kHz square wave response, Linear phase, left; Nonlinear phase, right.
The next figure shows the result for a 3 kHz square wave. The differences
between linear and nonlinear phase, while clearly evident, are less significant because
the fundamental is above the crossover point and there is little contribution from the
woofer due to the 4th-order low-pass response. With the system designed, another
interesting feature of the linear-phase system can be examined: the effect of crossover
slope.
3kHz square wave response, Linear phase, left; Nonlinear phase, right.
The next figure shows the 1kHz response of the linear phase and nonlinear
phase system when the slope of the woofer to tweeter crossover is increased to 8th
order, 48dB/octave. With the Ultimate Equalizer this is easily accomplished by
selecting the new 48dB/octave slopes and clicking Show complete system to calculate
and load the new filters.
1kHz response of linear and nonlinear phase system with 8th order crossover.
This result should be compared to that of the figure where the crossover was 4th
order. Changing the order has no effect on the linear-phase system at the design point.
The nonlinear-phase system response is significantly different solely due to the
change in crossover order.
Finally, the last figure shows the effect of reducing the crossover to 2nd order.
The response of the nonlinear-phase system looks somewhat better now. However, for
flat response the tweeter must be connected with inverted polarity in the nonlinear-
phase system, and the initial tweeter pulse is therefore in the wrong direction. It should
be noted that many audio enthusiasts feel that 2nd-order crossovers sound better than
those of higher order. This may be a result of the improved waveform observed here,
and could be an indication of the potential of linear-phase crossovers and speakers of any
order, since they will all preserve the waveform relative to the design point.
1kHz response of linear and nonlinear phase system with 2nd order crossover.
Conclusions
At the time of this writing, linear-phase loudspeakers are still the new “kid on
the block”. Past attempts at creating them resulted in offerings that were simply too
expensive for widespread use. The most accurate implementation of a linear-phase
loudspeaker requires a full set of individual driver measurements, coupled with a DSP
approach, in addition to an active amplification system. This makes the linear-
phase system a highly customized device – a world of difference from the
current approach of the loudspeaker industry.
However, this particular feature makes the linear-phase system an ideal DIY
device. In our world, everything is custom-built, typically with the aim of outperforming
comparable commercial designs. Linear-phase loudspeakers offer everything that
minimum-phase loudspeakers can offer, and then reward you with often vastly
superior performance in the time domain, as explained in the pages above.
So, here I am, struggling to come out of the “frequency-domain box” and into
the new world of time/frequency/space-domain characteristics of contemporary
loudspeakers. Even at these early stages of adopting the new technology, I find it
already very rewarding, because it is evident that a new, accurate and
realistic acoustic transduction technology is being achieved in a much more
accessible, commercial way.
Source: http://en.wikipedia.org/wiki/Sound_localization
For determining the lateral input direction (left, front, right) the auditory system
analyzes the following ear signal information:
For frequencies below 800 Hz, mainly interaural time differences are evaluated (phase
delays), for frequencies above 1600 Hz mainly interaural level differences are
evaluated. Between 800 Hz and 1600 Hz there is a transition zone, where both
mechanisms play a role.
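The quoted boundaries can be summarized in a tiny helper. The thresholds are taken straight from the quote; real hearing blends the cues gradually across the transition rather than switching:

```python
# Sketch of the duplex summary quoted above (boundaries approximate;
# real hearing blends the cues gradually rather than switching).
def dominant_cue(freq_hz):
    """Which binaural cue dominates localization at this frequency."""
    if freq_hz < 800.0:
        return "ITD (interaural time/phase differences)"
    if freq_hz > 1600.0:
        return "ILD (interaural level differences)"
    return "transition zone (both cues contribute)"

for f in (250, 1200, 4000):
    print(f, "->", dominant_cue(f))
```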