Bob McCarthy on Sound Delivery

Sound systems and the rooms they live in

I am a sound system designer and tuner, and I work in performance spaces. Every room I have encountered has a unique and extremely complex acoustic response. Each has its own macro dimensions, surface materials, and architectural details—all of which contribute to the unique “sound” of the hall. The combined effect of these factors can be quantified with a stupefying array of mathematical statistics such as reverberation time, early decay time, clarity, bass ratio, strength, and on and on. These numbers alone cannot ensure a room will be successful for particular applications such as opera, classical music, or speech, but they are the best objective indicators to date.

Figure 1: The summation of two signals that are 1 millisecond apart in time, such as the direct sound and an early reflection, or two speakers. The comb filter peaks and dips are linearly spaced and therefore are perceived as varying percentage bandwidth to our log frequency hearing system. The spacing is octave wide at 1kHz and 1/10 octave at 10kHz.

Used by permission from Sound Systems: Design and Optimization by Bob McCarthy, Focal Press

One might assume that as a sound system designer, I would have a great interest in these numbers and that they would be an important part of the design process. For example, you might expect that the sound system in the very lively Carnegie Hall would require a dramatically different approach than a Vegas showroom full of curtains and fiberglass. The approach differs, but to a much lesser extent than you might think.

The Role of the Room

Ask yourself, “Why should it be different?” Where do you aim the speakers in a lively hall? At the people. And in a dead hall? At the people again. In what kind of hall do we intentionally aim sound at anything other than the seats? None that I have ever been involved with. Do we approach this differently for pop music than speech in a house of worship? Do loud shows need to aim away from the walls while quiet ones don’t?

This might seem like a silly line of questioning, but I am bringing this up to make a simple but important point. Sound system engineering is not about the room. It is about the sound system. It is about laying a blanket of direct sound with consistent level and tone onto all of the seats. To us, the room is first and foremost a place to hold the seats. Our approach to the room geometry begins with the arrangement of seats from the perspective of where we are allowed to place speakers. This defines the target, and hopefully, the walls and ceilings are kind enough to let us get there without their help.

The room comes into play only after we have completed our primary mission: direct sound delivery to the seats. Direct sound delivered to unseated areas is viewed from our perspective as potentially treasonous for its ability to interfere with the unfettered direct sound delivery to other seated areas. I don’t know any sound engineers who look at a particular wall and say, “Nice! That reflection will really improve things.” It is fairer to say that we inspect a hall for the surfaces that are going to start an insurrection against our control over the space—and yes, we are direct sound control freaks. Strong reflections put the room in control and risk degrading areas that would otherwise have undisturbed coverage. As an example, we can adapt an old dog lover’s saying, “The only good balcony front is a dead balcony front.”

Am I advocating for a sterilized world of anechoic chamber music? No, not at all. I love reverberation as much as the next guy, but a little bit goes a long time. In the ideal world, we would have variable control of the reverberation to create a decay character that was appropriate for a given program material or artist, or even a particular song. We have reverberation devices that we can add to the sound mix, but we understand that this is not the same as reverberation in the room. A dead room filled with lots of electronic reverb in the speaker system leaves the performers singing in the rain while the audience stays dry. The most desirable result is a uniform reverberant field surrounding the audience creating a sense of sonic envelopment. That goal is widely shared between acousticians and sound engineers alike.

Figure 2: A guide to comb filter identification. When the timing is known, the frequency series of peaks and dips can be found and vice versa.

Used by permission from Sound Systems: Design and Optimization by Bob McCarthy, Focal Press

The biggest divergence between our approaches is in the category of early reflections. In the world of unamplified acoustics, these reflections are the “sound reinforcement system.” Their job is to provide strong copies of the direct sound that arrive early enough to be integrated by our ears as signal strengtheners and extenders. These are quite distinct from the dense reverberation decay tail. The power game is played by a small number of arrivals in a few milliseconds rather than the hundreds of arrivals over a few seconds that comprise the decay tail. These strong early reflections are vital to unamplified acoustics to “get the party started.” As we move back in the hall, the proportions of direct sound and early reflections change, which helps to keep the loudness relatively even over distance. From an amplified sound system perspective, these reflections are largely unneeded and potentially troublesome. We have our own tool for creating even levels from front to back: directionally controlled speakers and arrays. From an acoustic power point of view, we don’t need the walls, although our subwoofers will happily accept the help of the floor and nearby walls as long as they are close enough to add constructively. But for the high frequencies, there is absolutely no good outcome from the addition of strong early room reflections to our speaker output. The reason is simple: they will never be early enough to add to our direct sound in a controlled, constructive manner. And remember, we are control freaks.

It may help to visualize a typical horn or waveguide attached to a high frequency driver for the strong early reflection device that it is. Even as small and close to the source as these reflective devices are, they present an incredible engineering challenge because the wavelengths involved vary from small to extremely small. The room? Way too late to ever provide constructive addition at the high end without also creating cancellations and gross non-uniformity.

Differences in Perspective

Let’s step back for a moment and consider a critical difference in perspective between architectural acoustics and sound system engineering: direct sound. The evaluation of room acoustics for an unamplified acoustic space begins with the assumption that direct sound from the stage will have an uninterrupted path to all seats. All of the seats can see the point of sonic origin: the performer onstage. It would be fair to say that the acoustician’s prime focus begins only after he has gotten the direct sound path over with. Now the fun begins, orchestrating the reflected paths into the desired timing and level arrangement of early and late reflections.

On the other side is the sound system engineer. We do not have the luxury of assuming that just because people see the stage, they can see our speakers. Our speakers are placed on the sides of the stage or above the stage, locations that architects may have considered unimportant to keep in the sightline of the seats. But let’s say we do have a line of sound to the seats. What then? The sound system engineer creates a plan to reach each seat with direct sound at close to the same level. This consideration preempts all others regarding the room. Once we see how we can reach every seat, we look at the walls, not for help, but rather to see how much damage will occur from strong early reflections. This may cause us to revise the design by partitioning coverage into segments such as sidefills, underbalcony, and overbalcony delays. In essence, the more damage we foresee from the room reflections, the more we can tighten the coverage and/or subdivide the coverage into smaller parts that can more precisely avoid the room.

But if the walls are not prone to sending us strong early reflections, our design work is essentially done. We don’t need to follow hundreds more reflection paths around the room because we accept the reality that there will be a reverberation tail and that there is nothing more we can do about it. If we have covered the seats with an even quilt of direct sound and minimized spill onto strongly reflective surfaces, we can move on to taking advantage of the richness that a reverberant decay field has to offer.

For us, the game has already been won or lost at the point where most acoustical analysis metrics have scarcely begun to gather data. In the high frequencies, we have about a 5-millisecond window in which to complete our work. That’s not much room, is it? The amount of time we have to work with increases proportionally as we go down in frequency, because the decisive factor is how the reflections reshape the frequency response. By the time we reach our subwoofer range, the time has stretched to several hundred milliseconds, so there is plenty of room for the room. In short, we are much more aware of the frequency response effects of the room on the low end than on the highs.
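The scaling described above can be sketched in a few lines. This is only an illustration: the article gives "about 5 milliseconds" for "the high frequencies" without naming a frequency, so the 10kHz reference below is an assumption, and the function name is hypothetical.

```python
def tonal_window_ms(freq_hz, ref_window_ms=5.0, ref_freq_hz=10000.0):
    """Time window (ms) that spans the same number of cycles at freq_hz
    as ref_window_ms does at ref_freq_hz.

    ref_freq_hz = 10 kHz is an assumed reference, not a figure from the article.
    """
    return ref_window_ms * ref_freq_hz / freq_hz


for freq in (10000.0, 1000.0, 100.0):
    print(f"{freq:>7.0f} Hz: about {tonal_window_ms(freq):g} ms")
```

Under that assumption the window at 100Hz comes out at 500 milliseconds, which is in line with the "several hundred milliseconds" quoted for the subwoofer range.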

Summation Effects

It is worth taking a moment to break down the time-sensitive mechanism that affects our sound system so dramatically. This is the summation effect of multiple arrivals of the same signal, which may or may not be in time (and therefore may or may not be in phase). These effects occur whenever we hear a reflection, or whenever two speakers play the same signal, or even between your own voice and a copy of it coming through the sound system. The early reflections and reverb tail are members of this family but vary by relative level and timing. The interaction of multiple speakers will create the same family of frequency response effects if the relative timing and levels are the same as those of the reflections.

This effect goes by many names with “comb filtering” being the most common and easy to digest. Here is what you need to know right now to get the picture:

1. The peaks and dips created by comb filtering are strongest when the mixed signals are close in level.

2. The width (in octaves) between the peaks and dips is the reciprocal of the number of wavelengths (cycles) the two sources are apart in time. So if we are one wavelength apart, then we have octave spacing. If we are 1,000 wavelengths late, the width is 1/1,000 octave.

Wide peaks and dips (such as octave, half octave, and one third octave) are heard as a tonal shift, which we might be inclined to modify with equalization. When we hear peaks and dips that are 1/100 of an octave wide, we don’t perceive this as a tonal modification but rather a separated event such as an echo or part of the reverb tail.
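A minimal sketch of both points, assuming two copies of the same signal with the same polarity and a fixed time offset. The function names are hypothetical, and the math is the textbook two-path sum rather than anything taken from the book's figures.

```python
import math


def comb_frequencies(delay_ms, max_hz=20000.0):
    """Return (peaks, dips) in Hz for two same-polarity copies offset by delay_ms.

    Peaks fall at whole multiples of 1/delay, dips halfway between them,
    so the spacing is linear in frequency. The peak at 0 Hz is omitted.
    """
    spacing_hz = 1000.0 / delay_ms
    peaks, dips = [], []
    n = 1
    while n * spacing_hz <= max_hz:
        peaks.append(n * spacing_hz)
        n += 1
    k = 0
    while (k + 0.5) * spacing_hz <= max_hz:
        dips.append((k + 0.5) * spacing_hz)
        k += 1
    return peaks, dips


def ripple_db(level_difference_db):
    """Peak and dip level in dB, relative to the louder copy on its own."""
    weaker = 10 ** (-abs(level_difference_db) / 20.0)  # amplitude of weaker copy
    peak = 20 * math.log10(1 + weaker)
    dip = 20 * math.log10(1 - weaker) if weaker < 1 else float("-inf")
    return peak, dip


peaks, dips = comb_frequencies(1.0)   # the 1 ms case of Figure 1
print(peaks[:3], dips[:3])
for diff in (0.0, 6.0, 12.0):
    print(diff, ripple_db(diff))
```

With equal levels the peak reaches about 6dB and the dip is a complete cancellation. As the level difference grows, both shrink toward zero, which is the reason level matters as much as timing.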

Let’s do a quick and easy example to put this into perspective. A strong wall reflection arrives 10 milliseconds late. In the acoustician’s terms, this is an early reflection. To the sound system, it is early, medium, and late, depending on frequency. For 1kHz, the effect is a peak with dips on either side that span 1/10 octave (medium). For 10kHz, the peak is surrounded by dips that are spaced 1/100 of an octave apart (late), and finally for 100Hz the spacing is an octave wide (early). The 100Hz peak would clearly be perceived as tonal distortion, while the 10kHz peak would be far too narrow for even someone who believes audiophile marketing materials to imagine they can hear it as tonal.
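The arithmetic behind this example can be checked with a short sketch. It uses the rule of thumb as stated in the article (width in octaves equals one over the number of cycles late), which is an approximation rather than an exact log-frequency measurement. The function names are hypothetical.

```python
def cycles_late(delay_ms, freq_hz):
    """Number of wavelengths (cycles) that fit into the delay at freq_hz."""
    return freq_hz * delay_ms / 1000.0


def octave_fraction(delay_ms, freq_hz):
    """Approximate peak-to-peak spacing in octaves, per the reciprocal rule."""
    return 1.0 / cycles_late(delay_ms, freq_hz)


# The 10 ms wall reflection at the three frequencies discussed above.
for freq in (100.0, 1000.0, 10000.0):
    n = cycles_late(10.0, freq)
    print(f"{freq:>7.0f} Hz: {n:g} cycles late, about 1/{n:g} octave")
```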

Once we realize that a 100-millisecond reflection creates 1/100-octave-wide peaks and dips at 1kHz, it becomes clear that the late room reflections are far out of the tonal range for all but the lowest frequencies. Even something as short as a 1-millisecond reflection will wreak 1/10-octave-wide havoc on our high driver at 10kHz, so just forget about the room helping us up there.

We can see that it is best for the speakers to go it alone in all but the low end. So why do we care so much about tone and frequency response? Frequency response is the second most common metric of sound performance (first place goes to dB SPL bragging rights). The reason we care so much is that our speakers carry the combined mix of the music in the show, and anything that leaves a strong tonal stamp on our response does so to all of the instruments and singers passing through. A strong early reflection added to the speaker path colors the sound for the entire orchestra at your location, and worse, it paints a different set of discolorations onto the other seats. You might say, “Why the big peak at 400Hz?” while your friend in the next seat wonders why 500Hz stands out from the crowd.

How is this pathological version of the early reflection different from the beneficial reflections that acousticians love? Consider that the band is spread out over the stage and their individual direct sounds all hit the walls at different times. Each instrument gets some color, but none of them get the same. For the sound system, a single wall can create a 6dB peak in the response for a huge percentage of the people in the coverage area. To duplicate this undesirable feat with natural room acoustics for everything coming from the stage would be the ultimate challenge for an acoustician.

The next level, speaker arrays, may help show why we keep our focus on the speakers and view the walls as someone in the room we would prefer to avoid talking to. Clusters of speakers are the ultimate ERSDs (Early Reflection Simulation Devices), an idiotic acronym that I just made up at this moment. The multiple speakers of an array couple together in close proximity and arrive almost at the same time as the leading speaker. Almost is the key word here, because whatever amount of time difference we have is a 100X greater comb filter challenge at 10kHz than it is at 100Hz. As stated before, our margin of error is terribly small. A four-speaker array with a spread of different arrival times can have peaks of up to 12dB. Few acousticians would ever face this kind of frequency response deviation by virtue of room acoustic design, but this is an everyday event for the sound system engineer. Because of the great potential for frequency response uprisings among our own people (the speakers), we are in no hurry to expand the population we are trying to control by adding strong reflections from the walls.
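To see where the 12dB figure and the 100X sensitivity come from, here is a rough sketch that sums identical signals as phasors. It assumes ideal, equal-level sources with no directivity or air loss, which real arrays do not have. The function name and the 0.05-millisecond arrival spread are made up for illustration.

```python
import cmath
import math


def array_response_db(freq_hz, arrival_offsets_ms, gains=None):
    """Level in dB, relative to one speaker alone, of identical signals
    summed with the given arrival-time offsets (idealized point sources)."""
    if gains is None:
        gains = [1.0] * len(arrival_offsets_ms)
    total = sum(
        g * cmath.exp(-2j * math.pi * freq_hz * t_ms / 1000.0)
        for g, t_ms in zip(gains, arrival_offsets_ms)
    )
    magnitude = abs(total)
    return 20 * math.log10(magnitude) if magnitude > 0 else float("-inf")


# Hypothetical spread: each speaker arrives 0.05 ms after the previous one.
offsets = [0.0, 0.05, 0.10, 0.15]
for freq in (100.0, 1000.0, 10000.0):
    print(f"{freq:>7.0f} Hz: {array_response_db(freq, offsets):.1f} dB")
```

Four perfectly aligned speakers sum to 20·log10(4), about 12dB, and two sum to about 6dB. With the assumed spread, the array still adds almost fully at 100Hz, while at 10kHz the same offsets are half a cycle apart and the sum collapses.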


So let’s get back to the original premise of how much differently we would design a sound system for a live or dead room and consider a series of questions. If you designed a sound system with perfect coverage in a room you were told had a 1.5-second reverb time and then found out it was actually 1.0 or 2.0, what would you change?

If there was a big flat reflective balcony front in a reverberant room, would you try to avoid it? How about the same balcony in a dry room? When you are told that a room has “perfect acoustics,” do you think “Easy Street” or is it more like “Danger Zone”?

So where does this leave us? Pretty much right where we started, but hopefully a little clearer on why we do things the way we do. The biggest difference between our approach to live and dead rooms is the amount of fear we have about spilling any sound out of the seats. The live room requires a more surgical approach, but the shape of the direct sound target is exactly the same whether the building is just a wire frame or the walls are filled in.

Sound engineers are generally aware that they are not acousticians and yet a sound engineer has to live with the fact that everyone has two jobs: theirs and sound. We respect the training and knowledge required for an acoustics degree. But knowledge of acoustics does not automatically mean knowledge of sound system engineering. Many sound engineers lack formal training and yet we have a huge and valuable volume of experience about how speakers perform in rooms. We can spot a newly built room that was designed with an 18th century approach to speaker systems. And likewise, we can spot (and are infinitely grateful for) the rooms that are designed with the needs of modern speaker systems in mind. Speakers are here to stay—mostly because the only shows that make any money are the ones with speakers, so let’s get realistic about designing rooms that are going to work with and not against them.

This article originally appeared in the January 2013 issue
