Fundamental Acoustics Foundations

Sep 23, 2013 4:31 PM, By Bob McCarthy

The facts and fictions of modern audio engineering.


The first-ever AES Conference on Audio Education was held in July at Middle Tennessee State University (MTSU) in Murfreesboro, Tenn. This led me to reflect on the current state of audio education and what subjects are important to emerging audio engineers. They are certainly not the same topics that were relevant and/or available to me when I created a custom degree in audio engineering at Indiana University in the late 1970s. Current students will not prioritize vacuum tube circuit design, Thiele and Small parameters, magnetic tape editing, transformers, and optical sound tracks as highly as we did in those days. The laws of physics that we studied back then have not been updated, but virtually everything else has.

An education regimen sets a foundation and as a result there is always an emphasis on the timeless aspects of the field: fundamentals, signal paths, laws of physics, and acoustics. Teachers feel the need to cover this stuff instead of getting right to how to mix a festival for 50,000 people. Sadly, we know that we have to get through all of that boring stuff to improve our chances of getting behind the big desk for the big show.

The nice thing about learning the fundamentals is that they stay learned. They also help speed up the process of learning all of the interesting and fast-changing stuff because a solid foundation is better to build on than one full of holes or misconceptions. It will be easier to keep up with industry progress when you have a firm understanding of the things that won’t progress. Think of it like a game plan with 50 percent of your energy spent on the unchangeable, 50 percent on the current industry practices, and 10 percent on past and future directions of our trade. It takes at least 110-percent effort to get that cherry job.

Education is a serious business, but that does not mean we can’t have some fun with fundamentals in the short space we have here. In recognition of AES’ three-day event, here is a dartboard of interesting tidbits that might inspire you to check your audio fundamentals foundation.

Figure 1: Comparison of the log and linear frequency axes (from question 1).

1. Did you know that the log frequency axis (octaves) is all in your head? The physical world doesn’t sing that song. Physics uses only the linear frequency axis. Harmonics are linear multiples (not successive doublings like log would be). Peaks and dips caused by reflections or multiple-speaker interaction are all linear. We need to be linear/log bilingual since the acoustical interactions of the world (and most of their solutions) are linear but we experience them with our log hearing. Comb-filter interaction is a perfect example of this. In the linear world, the peaks and dips are equally spaced. To our ears, they get narrower and narrower as frequency rises.
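You can see the linear spacing in a few lines of Python. This is a minimal sketch (the 1ms reflection delay is a hypothetical value chosen for illustration): the comb filter’s peaks land at even intervals in Hz, while the gap between successive peaks keeps shrinking when measured in octaves.

```python
import math

# A single reflection delayed by t seconds interferes with the direct
# sound. Peaks fall at multiples of 1/t Hz, dips halfway between --
# equal LINEAR spacing. (Hypothetical 1 ms delay for illustration.)
delay = 0.001  # seconds

peaks = [n / delay for n in range(1, 6)]      # 1000, 2000, ... Hz
dips = [(n + 0.5) / delay for n in range(5)]  # 500, 1500, ... Hz

# On a log (octave) axis the same evenly spaced peaks crowd together
# as frequency rises: the octave span between neighbors keeps shrinking.
octave_gaps = [math.log2(peaks[i + 1] / peaks[i]) for i in range(4)]

print(peaks)        # evenly spaced in Hz
print(octave_gaps)  # shrinking in octaves: 1, ~0.58, ~0.42, ~0.32
```

The first pair of peaks spans a full octave; each following pair spans less, which is exactly why the comb sounds (and looks, on a log analyzer) progressively narrower going up.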

2. Did you know that there is nothing in acoustics that can make a pattern of equally spaced 1/3-octave-wide peaks and dips? Do you think we should tell the makers of graphic equalizers about this?

3. Everything to do with phase affects the frequency response linearly. Phase is about time. There is no logarithmic time, only linear. Tick-tick-tick-tick-tick, not tick--------tick----tick--tick-tick. We should be grateful for this since log time would mean that childhood goes really slowly and then time passes more quickly as we get older. It seems enough like that already, eh?
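The linear relationship is easy to verify: a fixed time offset produces a phase shift that grows in direct proportion to frequency (phase = 360 × f × t degrees). A quick sketch, using a hypothetical 0.5ms delay:

```python
# A fixed delay shifts phase in straight (linear) proportion to
# frequency. The 0.5 ms delay here is a hypothetical illustration.
def phase_deg(f_hz, delay_s):
    """Phase lag, in degrees, that a pure time delay imposes at f_hz."""
    return 360.0 * f_hz * delay_s

t = 0.0005  # seconds
for f in (250, 500, 1000, 2000):
    # The shift doubles every time the frequency doubles: a straight
    # line on a linear frequency axis.
    print(f, round(phase_deg(f, t), 1))
```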

Figure 2: The spectral variance progression for a single speaker. Notice that the on-axis far and off-axis near responses are matched, connected by the minimum variance line (from questions 4 and 5).

4. The inverse square law is just like a traffic law. It is never obeyed. Hypothetically, the SPL drops 6dB per doubling of distance but two things screw it up: air and Earth. The air causes extra loss in the HF, especially in a desert. The Earth decreases the loss rate in the low end because the reflection is added to the direct sound path. You can get rid of the Earth bounce by getting the PA really high. That part is easy. But if the audience is near the ground they will still get a strong reflection. So the audience needs to get high also, but that relates to a different law than inverse square.
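The hypothetical free-field part is simple arithmetic. A sketch (the distances in feet are arbitrary examples):

```python
import math

def spl_drop_db(d_near, d_far):
    """Inverse-square (free-field) level drop going from d_near to d_far."""
    return 20.0 * math.log10(d_far / d_near)

# Each doubling of distance costs about 6 dB -- before air absorption
# and the ground bounce bend the curve as described above.
print(round(spl_drop_db(50, 100), 2))  # 6.02 dB (one doubling)
print(round(spl_drop_db(50, 400), 2))  # 18.06 dB (three doublings)
```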

5. Did you know that if the on-axis response of a speaker is flat at 50ft. it won’t be flat at 100ft.? It will have a tilted response with the low end above the midrange and the high end below. The longer on-axis response gets more air loss and more room reflections, both of which tilt the response. The off-axis near response has less air and less room but gets tilted by the directionality of the speaker. So two places that are likely to have similar frequency responses are on-axis far and off-axis near. You might want to factor this into your speaker aiming.

6. Did you know that you can reduce the need for equalization by having great speaker positions? On the other hand, you can’t reduce the need for great speaker locations by equalization. As they say in real estate: location, location, location. Did you know that putting an I-beam in front of a high-frequency loudspeaker reflects and blocks the sound? If not, you are qualified to be an architect or scenic designer.

7. Have you heard the one about how you can’t measure subwoofers in the near field because the low frequencies have not developed yet? If you would like to test that theory, I invite you to put your ear against the grille of my subwoofer and tell me if the full-power kick drum seems developed enough.




Figure 3: An intellectual discussion of room acoustics by Trap and Zoid (from questions 10 and 11).

8. Did you know it is possible for two speakers to be out of polarity and in phase? How? Reverse polarity of one (180 degrees) and delay the other one half a wavelength. This is often done in two-way crossovers. Polarity has no time or frequency component, only normal (0 degrees) or inverted (180 degrees) for the entire frequency range. Phase changes are related to time. A fixed offset of time for all frequencies creates a different phase shift per frequency. If the two devices are natively 180 degrees apart at crossover, then a polarity reversal will bring them together.
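The arithmetic behind the trick can be sketched in a few lines (the 1kHz crossover frequency is a hypothetical choice for illustration): the polarity flip adds 180 degrees at all frequencies, the half-wavelength delay adds another 180 degrees at the crossover frequency, and 360 degrees wraps back to zero.

```python
# Out of polarity but in phase (at one frequency): a polarity flip
# contributes 180 degrees everywhere; a half-wavelength delay
# contributes 180 degrees at the crossover frequency only.
# Hypothetical 1 kHz crossover for illustration.
f_xover = 1000.0          # Hz
period = 1.0 / f_xover    # seconds
delay = period / 2        # half a wavelength of 1 kHz

polarity_shift = 180.0                 # frequency-independent
delay_shift = 360.0 * f_xover * delay  # 180 degrees at 1 kHz
total = (polarity_shift + delay_shift) % 360.0

print(total)  # effectively 0: back in phase at the crossover
```

Away from the crossover frequency the delay contributes a different phase shift, so the pair is only "in phase" at (and near) the meeting point, which is all a crossover asks of it.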

9. Have you heard people talk about the need to turn the system up loud enough to “excite the room?” It is indeed challenging to get a room excited. They are notoriously dull. But if these walls could talk, things would get interesting. Many people believe that there is a trigger threshold where there is enough direct sound to get the reflections going. Imagine a wall with a policy like an amusement park ride: “You must be at least this loud to be allowed to reflect off this wall.” This is another case of “it’s in your head.” The reflections are always there. If we raise the direct sound level, we also raise the reflection level. Where does such an urban legend come from? It relates to how we can more clearly perceive the reflections when they are louder than the noise floor. Reverberation time in rooms is quantified as an RT60 (RT=reverberation time) value: the time it takes to fall 60dB in level after the direct sound finishes. But you will not hear all 60dB of decay if the direct sound is not at least 60dB more than the noise floor. As an example, we will use an RT60 of one second. If the noise floor was 30dB SPL and our direct sound 90dB SPL, then we would hear one second of reverberation before it was lost in the noise (90-30=60dB of decay). Our perceived RT would be the same as the measured RT. If the direct sound were only 60dB SPL, then the room would sound drier than before since we will lose half of the reverb in the noise. If we turn it up to 120dB SPL, then the reverberation would still be 30dB more than the noise floor after one second of decay. The room still measures with an RT60 of one second but our perception is much longer.
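The arithmetic in that example can be sketched directly, assuming the simple linear (in dB) decay that the RT60 definition implies:

```python
# How much of the reverb tail is audible before it sinks into the
# noise floor, assuming linear decay in dB (the RT60 model).
def audible_decay_s(rt60_s, direct_spl, noise_spl):
    """Seconds of decay heard above the noise floor."""
    return rt60_s * (direct_spl - noise_spl) / 60.0

rt60 = 1.0  # the one-second room from the example above

print(audible_decay_s(rt60, 90, 30))   # 1.0 s: perceived RT matches measured
print(audible_decay_s(rt60, 60, 30))   # 0.5 s: the room sounds drier
print(audible_decay_s(rt60, 120, 30))  # 1.5 s: the reverb seems longer
```

The measured RT60 never changes; only the audible window into the decay does, which is the whole "excite the room" illusion in three lines.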

10. There is another legend about rooms out there. In this one, our powerful sound systems can “overdrive” the room or drive the room into saturation. This viewpoint seems to think of rooms as having acoustical limits in the manner that our power amplifiers have electrical limits. So overdrive and saturation would be the result of hitting the mechanical limits of the walls. The more likely outcome of reaching the mechanical limits of the walls would be the roof collapsing on your head, but fortunately our sound systems do not have that much power. The perception of overdrive and saturation is real, though. It is a combination of distortion and compression in the sound system and your ears, as well as the perceived extension of reverberation time that results from the high acoustic levels.

11. Do you think that we might achieve matched amplitude and phase through the acoustic crossover when we use unmatched high-pass and low-pass filters in an electronic crossover? This is much more likely than if the filters are matched. Are the speakers you are crossing together a matched pair? It’s pretty unlikely that your two-way speaker is composed of a 12in. front-loaded low driver and a 12in. front-loaded high driver. When we are crossing between two very different speaker components (which is the main reason for having a crossover), it is extremely unlikely that the native roll-off response of the two drivers is matched. They also might not be mechanically aligned on the same plane. Acoustic asymmetry needs to be met with electronic asymmetry to get the two devices to play nicely at the meeting point.

12. Did you know that there is no such thing as a “phase problem” between two speakers? If two speakers driven by the same source arrive at our ears at different times, there will be peaks and dips in the frequency response. This is often called a “phase problem,” so you wouldn’t think I could solve this with an amplitude solution, would you? Simple. Turn off one of the speakers. Phase problem is gone.

13. The real problems are “phase + amplitude” problems, and there are plenty of them. The severity of phase + amplitude damage is predictable based on two factors: how close the two phase responses are and how close the two amplitude responses are. If you are close in level, you better be close in phase (time). If you are far apart in phase (time), you better get far apart in level. Two speakers are like two children: when they play nicely together, they can stay close together. But if they want to fight, send each of them to their rooms.
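How the damage tracks those two factors can be sketched by summing a unit signal with an offset copy treated as a phasor (the particular level and phase offsets below are arbitrary illustrations):

```python
import cmath
import math

def combined_db(level_offset_db, phase_offset_deg):
    """Level of (unit signal + offset copy), relative to the unit signal."""
    copy = 10.0 ** (-level_offset_db / 20.0) * cmath.exp(
        1j * math.radians(phase_offset_deg))
    return 20.0 * math.log10(abs(1.0 + copy))

# Matched in level and phase: maximum addition.
print(round(combined_db(0, 0), 1))     # 6.0 dB
# Matched in level, opposite in phase: a very deep cancellation.
print(combined_db(0, 180) < -100)      # True (depth limited only by precision)
# 12 dB apart in level: even at 180 degrees the dip is mild.
print(round(combined_db(12, 180), 1))  # -2.5 dB
```

Close in level and close in phase buys you up to 6dB of addition; close in level and far apart in phase digs a bottomless dip; once the levels are far apart, phase hardly matters, which is the "send them to their rooms" strategy in numbers.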

14. Remember how Spinal Tap likes to turn its amps up to 11 just to get that little bit more over the top? We can do even better with digital audio, because it goes to “111111111111111111111111.” What level is that? That is called “full-scale digital” or 0dBFS. But if they want that little bit more, it is just too bad. There are no 2’s or 3’s, and we can’t just add another digit on because the next device in the chain won’t know to read it. If you try to go beyond full-scale digital, you have chewed off more than you can bit. What level exactly is “full scale?” This follows audio industry standard practice, which is to say we practice setting a lot of different standards. Full-scale digital (0dBFS) can be anything since it is just a mathematical construct inside a number-crunching machine until we finally reach the outside world where audio exists in an actual medium such as electricity: the A/D or D/A converter. It is at this stage that the dBFS value is given a voltage value such as 0dBFS = 10V (+20 dBV) or another voltage of the manufacturer’s or end-user’s choice. It is highly recommended that you read the spec sheet and find out the full-scale conversion number for both input and output of your digital audio devices.
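The conversion itself is one line of arithmetic. A sketch assuming a hypothetical calibration of 0dBFS = 10V (your converter's spec sheet is the real authority):

```python
# dBFS is relative to full scale; the voltage it maps to depends on
# the converter's calibration. Hypothetical calibration: 0 dBFS = 10 V.
V_FULL_SCALE = 10.0  # volts at 0 dBFS (check your converter's spec sheet)

def dbfs_to_volts(dbfs):
    """Convert a dBFS level to volts for this converter calibration."""
    return V_FULL_SCALE * 10.0 ** (dbfs / 20.0)

print(dbfs_to_volts(0))    # 10.0 V (full scale)
print(dbfs_to_volts(-20))  # about 1.0 V
```

Run the same numbers with a converter calibrated to 0dBFS = +4dBu and you get very different voltages for the same digital level, which is exactly why the spec-sheet check matters at both the input and the output.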

15. Did you know that the equal-loudness (Fletcher-Munson) curves have no application to the equalization settings of a live sound system? Those are the curves that explain how our ears change their response to frequency when the source level changes. Remember the “loudness” button on an old hi-fi receiver? If we listen too quietly, it sounds too midrange-ish, so the loudness button bumps up the LF and HF regions. It takes a good 20dB change in level for the response differences to really matter. The reason it doesn’t matter to live sound is because it’s live. First, this means that our mixer has full control of the tone of the program material, and secondly, we never have quiet parts in live shows anymore because the moving lights are too noisy. The mix engineer sets the level and adjusts the tone to sound right for the given song. When a ballad is performed, the mix is modified to sound right at the new level. It is a closed-loop system that does not need a retuning of the PA every time the band creates some dynamics. Now what if the levels are different between the front and rear of the house? Simple answer: If you have 20dB differences between the front and rear, you have bigger problems than Fletcher-Munson can solve. So why is it valid for playback? Because the studio mix assumed a certain level for playback, which may or may not be the level you are listening to at home.
