


In the past, the impact of the linear amplitude and phase response on sound quality has been investigated using linear filters (equalizers). Linear distortion, however, describes the loudspeaker adequately only at small amplitudes. At high amplitudes, real loudspeakers produce other kinds of distortion, which should also be investigated systematically. This can be done with a nonlinear model of the loudspeaker that synthesizes the loudspeaker output in the large signal domain.

Once the parameters of the model have been measured, we can run a real-time simulation for any input, such as music or artificial test signals, that provides the distortion components separately from the ideal linear output.

The rest is simple: we pass the distortion and the linear signal through a mixing console, so we can emphasize or attenuate the distortion, or even listen to the distortion alone. This technique is called Auralization.

[Figure: aura-det.gif]


There are three major mechanisms in the electro-mechanical system that produce distortion:

  • Bl(x): The variation of the Force Factor with displacement
  • CMS(x): The variation of suspension Compliance with displacement
  • LE(x): The variation of voice coil Inductance with displacement

There are other nonlinear mechanisms as well. For most drivers, however, the Bl, CMS and LE nonlinearities are by far the dominant sources of distortion, so it makes sense to analyze them in detail.

The Model

Classic modeling assumes the loudspeaker is linear: you get only the frequencies that you also find at the input, and if you double the input, you get twice the output.

The linear model assumes that all parameters are independent of displacement and time, and is thus valid only for small excursions. At larger displacements, the voice coil moves out of the gap and the force factor decays, the suspension (usually) gets stiffer, and the inductance changes. Heating limits the output power, and the suspension gets softer at the rest position. Unfortunately, these mechanisms interact strongly, so we end up with a nonlinear feedback system that adds new frequency components and a DC component, compresses the output, and has a "history": its behavior depends not only on the instantaneous signal, but also on the signal that came before.
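
The difference between the two regimes can be illustrated with a toy example. The sketch below is not the loudspeaker model: it uses a plain gain as the linear system and a `tanh` soft limiter as a stand-in nonlinearity (both are assumptions chosen for illustration, as are the names `tone`, `linear_system`, `nonlinear_system` and `magnitude_at`). It checks the two properties named above: doubling the input should double the output, and no frequencies should appear that were not in the input.

```python
import math

N = 256  # samples in the analysis window
K = 8    # the test tone completes K full cycles in the window


def tone(amplitude):
    """Sine test tone with the given peak amplitude."""
    return [amplitude * math.sin(2 * math.pi * K * n / N) for n in range(N)]


def linear_system(x):
    """Toy linear system: a fixed gain (illustrative assumption)."""
    return [0.5 * s for s in x]


def nonlinear_system(x):
    """Toy nonlinear system: a tanh soft limiter (illustrative assumption)."""
    return [math.tanh(s) for s in x]


def magnitude_at(signal, harmonic):
    """Amplitude of the component at `harmonic` times the tone frequency (one DFT bin)."""
    k = harmonic * K
    re = sum(s * math.cos(2 * math.pi * k * n / N) for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * k * n / N) for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / N


if __name__ == "__main__":
    for name, system in (("linear", linear_system), ("nonlinear", nonlinear_system)):
        fundamental_1 = magnitude_at(system(tone(1.0)), 1)
        fundamental_2 = magnitude_at(system(tone(2.0)), 1)
        third = magnitude_at(system(tone(1.0)), 3)
        print(name, "output ratio for doubled input:", fundamental_2 / fundamental_1)
        print(name, "3rd harmonic at amplitude 1:", third)
```

The linear system keeps the ratio at 2 and produces no third harmonic. The limiter compresses the fundamental and adds a third harmonic. Because `tanh` is symmetric and memoryless, it produces neither the DC component nor the "history" described above.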

This extended loudspeaker model is derived from the common small signal model. It allows certain parameters to vary with time and displacement, and it includes other nonlinearities. A simplified representation is shown in the picture "LSI - simplified model".
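
For orientation, a lumped-parameter model of this kind is often written as two coupled differential equations, one for the electrical side and one for the mechanical side. The form below is a commonly used textbook sketch and not necessarily the exact formulation used here. The symbols $u$ (terminal voltage), $i$ (voice coil current), $R_E$ (DC resistance), $M_{MS}$ (moving mass) and $R_{MS}$ (mechanical resistance) are assumptions introduced for this illustration. Only $Bl(x)$, $C_{MS}(x)$ and $L_E(x)$ come from the text above.

```latex
\begin{align}
u(t) &= R_E\, i + \frac{d\left[L_E(x)\, i\right]}{dt} + Bl(x)\, \frac{dx}{dt} \\
Bl(x)\, i + \frac{i^2}{2}\, \frac{dL_E(x)}{dx} &= M_{MS}\, \frac{d^2 x}{dt^2} + R_{MS}\, \frac{dx}{dt} + \frac{x}{C_{MS}(x)}
\end{align}
```

With constant $Bl$, $C_{MS}$ and $L_E$, the term containing $dL_E/dx$ vanishes and the equations reduce to the linear small signal model.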

The complete model we use includes additional effects, such as Para-Inductance, thermal power compression, jump out effect of the voice coil, nonlinear compression of the amplitude, and the complex interaction between the different nonlinearities. 

Extensive research in recent years has shown that this model describes the large signal behavior of a loudspeaker, or of a similar electro-dynamical transducer, correctly. Predicted results are also easy to compare with classic distortion measurements.


The Auralization Technique

Auralization not only predicts the sound pressure output of the loudspeaker; it also provides state information such as displacement, velocity and voice coil temperature.

This allows a direct correlation between the subjective listening impression and the objective parameters and states of a loudspeaker. The impact of individual nonlinearities can be assessed separately, which makes it possible to trace artifacts in the listening impression back to their physical cause.

Paper: Prediction of speaker performance at high amplitudes

The Listening Test

We prepared the listening test for two reasons: to provide a fairly non-technical introduction to the Large Signal behavior of loudspeakers, and to collect statistical data for a systematic investigation of the audibility of loudspeaker distortion. For this, we need your help.

The test is an "enforced blind A/B comparison", meaning:

  • The listener compares two samples, A and B, and has to tell which one is "worse" (is distorted)
  • It is blind - the listener doesn't know which sample is the original signal and which one is the distorted one
  • It is enforced - there is no "I don't know" option: even if unsure, the listener is asked to give an opinion. This helps to tap into subconscious, intuitive decisions and avoids a "rather than saying something wrong, I say nothing" position. It is amazing how sensitive the ear can be to distortion, even when you cannot say exactly what makes the difference. However, some patience is required from the listener to satisfy the statistical requirements

Each test uses a particular test signal (music or artificial), simulated for a particular driver or loudspeaker. Each test consists of multiple steps; in each step, the listener is asked to compare two samples.

Samples are provided with the distortion enhanced or attenuated relative to the linear signal (usually in 3dB steps). For example, "-9dB" means the distortion component was attenuated by 9dB, while the linear component is at 0dB. The performance of the "Real Speaker" is obtained by neither attenuating nor enhancing the distortion (both the linear and the distortion component are at 0dB). Furthermore, a "linear" sample is provided, in which the distortion is attenuated by 100dB.
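
The way such samples are built can be sketched in a few lines. This is a minimal illustration, assuming the two simulation outputs are available as equal-length lists of samples and that the dB values refer to amplitude (a factor of 10^(dB/20)). The function names and the constants `REAL_SPEAKER_DB` and `LINEAR_DB` are hypothetical.

```python
def db_to_gain(level_db):
    """Convert a level in dB to an amplitude factor (20*log10 convention assumed)."""
    return 10 ** (level_db / 20)


def mix(linear, distortion, distortion_db):
    """Add the distortion component, scaled by distortion_db, to the linear component (kept at 0 dB)."""
    gain = db_to_gain(distortion_db)
    return [lin + gain * dist for lin, dist in zip(linear, distortion)]


REAL_SPEAKER_DB = 0   # distortion neither attenuated nor enhanced
LINEAR_DB = -100      # distortion attenuated by 100 dB

if __name__ == "__main__":
    linear = [0.0, 0.5, 1.0, 0.5]           # made-up sample values
    distortion = [0.0, 0.01, -0.05, 0.01]   # made-up sample values
    for level_db in (12, REAL_SPEAKER_DB, -9, LINEAR_DB):
        print(level_db, mix(linear, distortion, level_db))
```

At -100dB the distortion is scaled by a factor of 0.00001, so the "linear" sample is practically identical to the linear component alone.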

The listener always compares the linear to a distorted sample. The listener is asked to identify the distorted sample. The position (A or B) of the linear sample is randomized. 
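
A single comparison step can be sketched as follows. This is only an illustration of the randomized A/B assignment described above; `make_trial` and `is_correct` are hypothetical names, and the samples can be any objects (file names, sample buffers).

```python
import random


def make_trial(linear_sample, distorted_sample, rng=random):
    """Assign the two samples to positions A and B at random.

    Returns (sample_a, sample_b, distorted_position), where
    distorted_position is "A" or "B".
    """
    linear_is_a = rng.random() < 0.5
    if linear_is_a:
        return linear_sample, distorted_sample, "B"
    return distorted_sample, linear_sample, "A"


def is_correct(answer, distorted_position):
    """The response is correct if the listener picked the distorted sample."""
    return answer == distorted_position


if __name__ == "__main__":
    sample_a, sample_b, distorted_position = make_trial("linear.wav", "minus9dB.wav")
    print("A:", sample_a, "B:", sample_b)
    print("answer 'A' correct?", is_correct("A", distorted_position))
```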

To see how the test is evaluated, let's have a look at an example test:

[Figure: example test run (test-sample.gif)]


The test starts at high distortion levels (distortion enhanced by +12dB), which are easy to hear. If the listener identifies the distorted signal correctly, the test goes down by two steps. This is repeated until Step 5, where the listener judges incorrectly. The level of Step 5 (-12dB) is marked as the first turning point, and the test goes up by three steps to make identification easier. From now on, the test goes up by three steps for a wrong response and down by one step for a correct one. The levels where the listener changes from a correct response to an incorrect one are marked as "turning points" as well. For consecutive errors (like Step 5 + Step 6), only the lowest level counts.
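
The step rule can be written down as a short procedure. The sketch below is an interpretation of the description above, not the code of the actual test: `run_staircase` and its parameters are hypothetical, the 3dB step size and the stop after three turning points are taken from the text, and the response sequence in the usage part is invented so that the first error falls on Step 5 and is followed by a second error on Step 6.

```python
def run_staircase(respond, start_db=12, step_db=3, n_turning_points=3, max_trials=100):
    """Run the adaptive up-down procedure.

    respond(level_db) must return True for a correct identification.
    Returns (turning_points, history), history being (level_db, correct) pairs.
    """
    level = start_db
    turning_points = []
    history = []
    previous_correct = True  # a wrong first answer also counts as a turning point
    for _ in range(max_trials):
        correct = respond(level)
        history.append((level, correct))
        if correct:
            # Before the first turning point go down two steps, afterwards one step.
            level -= (2 if not turning_points else 1) * step_db
        else:
            # Only the first error of a run counts. Every error raises the level,
            # so the first error of a run is also its lowest level.
            if previous_correct:
                turning_points.append(level)
                if len(turning_points) == n_turning_points:
                    break
            level += 3 * step_db
        previous_correct = correct
    return turning_points, history


if __name__ == "__main__":
    # Invented responses: True = correct, False = incorrect.
    scripted = iter([True, True, True, True, False, False, True, True, False,
                     True, True, True, True, False])
    turning_points, history = run_staircase(lambda level_db: next(scripted))
    for step, (level_db, correct) in enumerate(history, start=1):
        print(step, level_db, "correct" if correct else "wrong")
    print("turning points:", turning_points)
```

In the weighted up-down method cited as the reference for this test, an up-to-down step ratio of 3:1 makes the procedure settle around the level at which about 75% of the responses are correct.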

The test continues until three turning points are found. In the sample, these are at -12dB, 0dB (Real Speaker), and -3dB. The distortion audibility threshold - where the listener can no longer tell the distorted signal apart - is determined as the median of the three turning points: the turning points are sorted (-12dB, -3dB, 0dB) and the center one is taken. For this test, the threshold is -3dB.

Note: the median is not the average of the values - the average for this test would be -5dB. The median is more robust against single lucky (or unlucky) guesses.
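
The numbers from the example can be checked with Python's standard `statistics` module. The second list is a made-up variation in which the first turning point is moved to -21dB, to show how a single outlier shifts the average but not the median.

```python
from statistics import mean, median

turning_points_db = [-12, 0, -3]           # turning points from the example test
threshold_db = median(turning_points_db)   # center value after sorting: -3
average_db = mean(turning_points_db)       # (-12 + 0 - 3) / 3 = -5

# Hypothetical variation: one lucky guess moves the first turning point to -21 dB.
with_outlier_db = [-21, 0, -3]
outlier_threshold_db = median(with_outlier_db)  # still -3
outlier_average_db = mean(with_outlier_db)      # (-21 + 0 - 3) / 3 = -8

if __name__ == "__main__":
    print("median:", threshold_db, "average:", average_db)
    print("with outlier, median:", outlier_threshold_db, "average:", outlier_average_db)
```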

The test provides a chart showing the distribution of thresholds for a single signal/speaker combination. Individual test results are retained for a more specific analysis.

Reference for the Test Method: Kaernbach, C., "Simple adaptive testing with the weighted up-down method", Perception & Psychophysics 1991, 49(3), 227-229