Loud and Clear

Apr 1, 2002 12:00 PM,
By Mary Gruszka

A WELL-DESIGNED sound system for speech should provide proper coverage over the listening area, minimal spill to other areas, adequate sound pressure level (SPL), smooth and wide frequency response, minimal distortion, and of course, intelligibility. Although SPL, frequency response and distortion are fairly easy to measure, measuring intelligibility in the field, predicting it in the design phase and even defining it have proven somewhat problematic.

To gain insight into what intelligibility means and how to measure it, two Syn-Aud-Con Intelligibility Workshops were held more than ten years ago. Those provided sound contractors and systems designers with the latest research and measurement tools at that time. Since then, much more research about intelligibility has been conducted, and newer measurement tools have become available. Not only that, but awareness of the importance of intelligibility in life-safety systems has grown, and the issue has taken on legal ramifications.

So the time was right for another intelligibility workshop. The latest was held October 11-13, 2001, in Chicago and was sponsored by Gold Line, manufacturer of the TEF20 and other audio and acoustical measurement tools. Other sponsors included Audio Systems Group, Inc., and Rent-Com (both from Chicago), as well as Shure and TOA Electronics.

In the workshop, emphasis was placed on difficult acoustic spaces where intelligibility tends to fall apart. Two Chicago Roman Catholic churches, St. Mary of the Angels and St. Helen’s, served as suitable venues. Although the churches differ in size and architectural design, each has long reverberation times and definite intelligibility challenges. More than 50 audio professionals and designers of equipment for public safety systems from North America and Europe attended the workshop, which was chaired by Doug Jones, audio department chairman at Chicago’s Columbia College. Curriculum development assistance came from Jim Brown, principal of Audio Systems Group Inc., and Peter Mapp, principal of Peter Mapp Associates, U.K.

Although most of the workshop was devoted to guided self-discovery experiments, some presentations focused on the state of the art of intelligibility. Presenters included Peter Mapp; Don Keele, senior engineer at Harman-Motive; and Dr. Herman Steeneken, senior research scientist at TNO, Human Factors Group, the Netherlands.

▪ Workshop Format. Jones designed the workshop to create an atmosphere of discovery, allowing the participants to uncover the many factors that affect intelligibility, to hear what these effects sound like by using word lists, and to understand how to make intelligibility measurements and how the different measurements compare with one another. Another goal of the workshop was to learn how to use the Gold Line TEF20 analyzer, running the new Windows software, to measure percentage articulation loss of consonants (%Alcons), speech transmission index (STI) and rapid speech transmission index (RaSTI), and perform noise, frequency response and energy time curve tests.

In each church, evaluations were made with the existing installed sound system and with temporary systems installed on portable lifts solely for the workshop. Each system had different directivity and coverage characteristics. In addition, other loudspeakers placed in each space were used to simulate acoustical problems such as late sound arrivals, slap from the rear wall, and noise and rumble from HVAC and mechanical equipment to see how those might impact intelligibility. The effects of proper and improper equalization were also examined. Measurements were made at various listener locations throughout each church. EASE 4.0 models were also made for each church and the intelligibility predictions were compared to actual measurements and listening experiences.

The participants were divided into groups of four or five and were rotated to the various experiment stations and churches. The third day of the workshop was used for a plenary session, where representatives of the teams shared their discoveries, data and insights into the phenomena they observed.

▪ The Venues. Built more than 100 years ago, St. Mary of the Angels is a large Gothic church laid out in a cross shape with lath-and-plaster construction. The church holds about 1200 people and encompasses a volume of approximately 900,000 cubic feet. Distance from the rear wall to the altar is about 180 feet. Side-to-side distance is about 126 feet across the transept area and roughly 80 feet across the rest of the church. The height from the floor to the top of the dome is about 110 feet, and the height from the floor to the main ceiling is about 65 feet. Reverb times were measured to be anywhere between 6.5 and 7.5 seconds, depending on atmospheric conditions (especially humidity).

St. Helen’s Church was built in the mid-1960s in a more modern architectural style with circular geometry. Its volume is approximately 311,000 cubic feet. This church is wider than it is deep, about 145 feet wide and 92 feet from the wall behind the altar to the rear wall of the congregation. The scalloped ceiling has an average height of about 35 feet. Both the back and front walls are large curves, and the sound that hits the back wall is easily heard in the front — delayed, of course.

Although St. Helen’s is smaller than St. Mary’s, it has about the same reverb time. However, the churches sound different. St. Mary’s has a more diffuse reverberant field that is held in check at the low end by the amount of glass in the space and at the high end by air absorption and humidity. St. Mary’s also has much more complex geometry and diffusive architectural details, as evidenced by the 997 faces needed to represent half of the church’s space in EASE, compared with 149 faces for St. Helen’s. (Because each model represents only half of the space, the total number of faces for each church is double these figures.)

Nearly everyone who experienced St. Helen’s said that it was an unpleasant place to be in from an acoustics standpoint. Although a person speaking in the front could be heard and understood by a listener in the rear, a certain fatigue factor would set in. This could be attributed to the concave curved surfaces, minimal diffusion and absorption, and the resulting focused sound energy.

▪ Venue Sound Systems. The installed sound system in St. Mary’s is fully described in “Keeping the Faith,” on page 90. The system consists of two large Renkus-Heinz horns each with a 2-by-2 bass array to cover the front third of the church. The rest of the seating area is covered by line arrays custom designed by Brown and fabricated by R&R Cases of Des Plaines, Illinois. The 12-driver boxes are spaced about 30 feet apart and delayed.

Brown pointed out that proper level balance and delay settings for delayed loudspeakers are especially critical in highly reverberant spaces. As he learned from Dave Klepper, the shortest possible amount of delay should be used to establish the precedence effect. Levels need to be balanced to provide just enough gain to listeners in the delayed zone without overdoing it, to avoid injecting more energy to the reverberant field and upsetting the Ed/Er ratio.
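As a rough illustration of the arithmetic behind a delay setting, the Python sketch below computes the acoustic travel time over the roughly 30-foot spacing between the delayed line arrays. The speed of sound (1130 feet per second) and the optional extra offset are assumptions chosen for illustration; they are not settings used at St. Mary’s, and the function names are hypothetical.

```python
# Assumed speed of sound in air at about room temperature (illustrative value).
SPEED_OF_SOUND_FT_PER_S = 1130.0


def travel_delay_ms(distance_ft, speed_ft_per_s=SPEED_OF_SOUND_FT_PER_S):
    """Time in milliseconds for sound to travel distance_ft."""
    return 1000.0 * distance_ft / speed_ft_per_s


def delay_setting_ms(distance_ft, offset_ms=0.0):
    """Travel time plus an optional small offset (hypothetical parameter).

    A small offset keeps the delayed loudspeaker arriving just after the
    primary sound; keeping it short follows the advice in the text to use
    the shortest delay that still establishes the precedence effect.
    """
    return travel_delay_ms(distance_ft) + offset_ms


if __name__ == "__main__":
    spacing_ft = 30.0  # approximate box spacing quoted in the article
    print(f"Travel time over {spacing_ft:.0f} ft: "
          f"{travel_delay_ms(spacing_ft):.1f} ms")
```

At the assumed speed of sound, 30 feet corresponds to about 26.5 ms of travel time, so each additional box down the nave would need roughly that much more delay than the one before it.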

Rear loudspeakers facing the front provide reinforcement for the choir. The choir is located on the antiphonal balcony (at the back of the church) and has its own monitor and reinforcement systems. The rear choir reinforcement loudspeakers were used to create echoes simulating slap from the rear wall.

Two Shure P4800 digital signal processors were used during the workshop to provide different delay and equalization settings for the experiments. Shure donated the P4800s to St. Mary’s to replace existing analog processing. Two Intellivox systems, Models 2b and 6c, supplied by Duran Audio, were temporarily set up, providing the opportunity to measure and compare three systems with different Qs in the same space. The smaller Intellivox unit, the 2b, has good directivity control through the midrange but is more dispersive at the low end. In comparison, the 6c maintains pattern control down through the low frequencies.

The installed sound system at St. Helen’s consisted of some old, poorly angled columns and no equalization. Coverage and intelligibility were poor. St. Helen’s two temporary systems included a pair of high-Q Community M-4 coaxials and two low-Q two-way boxes with relatively small horns. In addition, sound contractor Mike Hedden of dB Acoustics and Sound Inc., Gainesville, Georgia, brought one of his Intellivox 2c units to test. A few full-range boxes were arranged at the rear wall to simulate echo, but they turned out not to be needed in the space, because the curved rear wall produced enough real echoes of its own.

TOA Electronics provided a DACsys2000 model DP-0206 DSP processor with additional output modules for a 2-by-10 configuration. That was used for the tests at St. Helen’s and was donated to the church for part of a planned improvement to the sound system that Brown is designing.


▪ Word Lists. Phonetically balanced word lists were used mainly for ear calibration, to get a sense of what a particular intelligibility metric sounded like, and to compare the effects of different acoustic conditions on intelligibility. They were not meant to be the gold standard for comparison with all other intelligibility measurements. The word lists for the workshop were taken from The Audio System Designer Technical Reference, edited by Mapp and published by Klark-Teknik. Four sets of 25 words each were recorded on CD for the class.

Listeners used an answer sheet with multiple choices to indicate the words that they perceived. (Word tests can be designed with either multiple choice or open-ended responses; each method has its pros and cons.) Jones and Ted Uzzle, former editor of S&VC and now an instructor at Columbia College, worked through the night scoring and processing the word lists in time for the plenary session.

▪ Intelligibility Measurements. The TEF20 software permits measurements and calculations of such intelligibility indicators as %Alcons, STI and RaSTI. The new Windows software makes measurement setup easier than previous versions and can present more data and graphs to the user at one time in a clear manner. %Alcons is based on the reverb time (RT60) of a room and the ratio between early and late energy (Ed/Er). The lower the reverb time, or the higher the Ed/Er ratio, the better the intelligibility. With the TEF20, this information is derived from an energy time curve (ETC) swept over a particular band of frequencies, such as a 2kHz octave band.

Once the ETC is taken, the software makes suggestions for cursor placement to obtain the RT60 and the Ed/Er. With the default settings, the first 20 ms are integrated as the direct sound and the rest as reverberant energy. The default settings for RT60 cursors calculate the decay rate starting from the highest level to a point 10 dB lower. Most of the time the highest level is the direct sound, but that’s not always the case, especially in difficult spaces. The software allows the user to change cursor settings.
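The default cursor logic described above can be sketched in a few lines of Python. This is only an illustration run on a synthetic, ideal exponential decay; it is not the TEF20 algorithm, and the function names, the 1 ms sample spacing and the 7-second test decay are all assumptions.

```python
import math


def synthetic_etc(rt60_s, duration_s, dt_s=0.001):
    """Hypothetical test data: energy that falls 60 dB in rt60_s seconds."""
    n = int(round(duration_s / dt_s))
    return [10.0 ** (-6.0 * i * dt_s / rt60_s) for i in range(n)]


def early_to_late_db(etc, dt_s=0.001, split_ms=20.0):
    """Ed/Er in dB: energy in the first split_ms after the peak vs. the rest.

    The peak is assumed to be the direct sound, which the text notes is
    not always true in difficult spaces.
    """
    peak = max(range(len(etc)), key=etc.__getitem__)
    split = peak + int(round(split_ms / 1000.0 / dt_s))
    early = sum(etc[peak:split])
    late = sum(etc[split:])
    return 10.0 * math.log10(early / late)


def rt60_from_10db_drop(etc, dt_s=0.001):
    """Extrapolate RT60 from the time taken to fall 10 dB below the peak."""
    peak = max(range(len(etc)), key=etc.__getitem__)
    for i in range(peak + 1, len(etc)):
        if etc[i] <= 0.0 or 10.0 * math.log10(etc[i] / etc[peak]) <= -10.0:
            # A 10 dB drop is one sixth of a 60 dB decay.
            return 6.0 * (i - peak) * dt_s
    raise ValueError("ETC never falls 10 dB below its peak")


if __name__ == "__main__":
    etc = synthetic_etc(rt60_s=7.0, duration_s=14.0)
    print(f"RT60 estimate: {rt60_from_10db_drop(etc):.2f} s")
    print(f"Ed/Er: {early_to_late_db(etc):.1f} dB")
```

For the ideal 7-second decay, the sketch recovers an RT60 of roughly 7 seconds and an Ed/Er of roughly -14 dB. A real ETC with a strong late reflection would move the peak, and with it both results, which is why adjustable cursors matter.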

STI calculations are based on ETC sweeps over the octave bands centered at about 125, 250, 500, 1000, 2000, 4000, and 8000 Hz. The sweep time for the 125Hz band is four seconds, and for the others, two seconds. Once the data is gathered, the TEF20 software calculates STI without additional user intervention. The user can make a noise measurement along with the ETCs. That way the effect of noise on the STI reading can be calculated. Noise files can also be considered later in post-processing, allowing the effects of different types and levels of noise to be evaluated. RaSTI is similar to STI except that measurements are made only in the 500 and 2000 Hz octave bands.
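To give a feel for how reverberation and noise pull an STI value down, here is a simplified Python sketch. It does not reproduce the TEF20 calculation, which works from measured ETCs. Instead it assumes ideal exponential reverberation and steady noise, uses the textbook modulation-reduction formula, and weights all octave bands equally, whereas the published STI method applies band weights. Restricting the bands to 500 and 2000 Hz gives only a loose RaSTI-like figure, since real RaSTI uses its own set of modulation frequencies. All function names are hypothetical.

```python
import math

OCTAVE_BANDS_HZ = (125, 250, 500, 1000, 2000, 4000, 8000)
# Fourteen one-third-octave modulation frequencies from 0.63 to 12.5 Hz.
MODULATION_FREQS_HZ = (0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                       3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5)


def modulation_reduction(mod_freq_hz, rt60_s, snr_db=None):
    """Modulation transfer value for ideal reverberation plus steady noise."""
    m = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * mod_freq_hz * rt60_s / 13.8) ** 2)
    if snr_db is not None:
        m *= 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return m


def transmission_index(m):
    """Map a modulation value to 0..1 via an apparent S/N clipped to +/-15 dB."""
    if m >= 1.0:
        snr = 15.0
    elif m <= 0.0:
        snr = -15.0
    else:
        snr = 10.0 * math.log10(m / (1.0 - m))
    snr = max(-15.0, min(15.0, snr))
    return (snr + 15.0) / 30.0


def sti_estimate(rt60_by_band, snr_db_by_band=None, bands=OCTAVE_BANDS_HZ):
    """Equal-weight average of per-band indices (a simplification)."""
    total = 0.0
    for band in bands:
        snr = None if snr_db_by_band is None else snr_db_by_band[band]
        tis = [transmission_index(modulation_reduction(f, rt60_by_band[band], snr))
               for f in MODULATION_FREQS_HZ]
        total += sum(tis) / len(tis)
    return total / len(bands)


if __name__ == "__main__":
    for rt in (1.0, 7.0):
        rt_map = {band: rt for band in OCTAVE_BANDS_HZ}
        print(f"RT60 {rt:.0f} s: STI estimate {sti_estimate(rt_map):.2f}, "
              f"two-band estimate {sti_estimate(rt_map, bands=(500, 2000)):.2f}")
```

Using the same reverb time in every band is itself an assumption; in a real room the values differ by band, which is one reason a seven-band measure can disagree with a two-band one.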

Two test setups are commonly used for intelligibility measurements of a sound system. One is to send the TEF test signal directly into the sound system, bypassing any sound system microphones. The other is to send the TEF test signal into a small loudspeaker that simulates a person talking. This loudspeaker is placed in front of one of the sound system microphones (such as a pulpit, podium or lectern) and amplified through the sound system. In either case, during the measurements the measurement microphone is located at a particular listening position in the space.

In the second method, the reference or “talker” loudspeaker must first be calibrated, and the new Windows TEF20 software guides the user through the steps. During the calibration process, the TEF20 emits a series of short sweeps to the reference loudspeaker. For this calibration, a measurement mic is used instead of the sound system mic but is placed in the same location and fed directly into the TEF20 analyzer. The user monitors the received SPL and adjusts the output of the test signal until the correct level (as indicated by the TEF20) is reached. Once the calibration process is finished, the sound system mic is set back into place, and the measuring mic is moved to the listening position to be tested.

▪ Which method should be used? Arguably, the sound system direct approach would allow comparisons of listening positions within a space but not necessarily comparisons between venues. The “talker” loudspeaker approach does allow comparisons from one venue to another; however, as was discovered at the workshop, there were few differences between the two venues.


The workshop, by all accounts, succeeded in fostering learning and allowed synergy among all the participants. Rigorous science was not the goal, but many valuable insights were gained and shared. One insight was what intelligibility measurements should be used for. Because they do not guarantee that a listener will be able to understand reproduced speech, as Mapp points out in “Measuring Intelligibility” on page 56, intelligibility measurements can only indicate if a sound system can faithfully reproduce what is presented to it.

But which intelligibility indicator? STI seemed to correlate better than %Alcons or RaSTI with what was heard in the two churches. That was no surprise to Steeneken, who developed STI and pointed out that the modulation transfer function used to calculate STI was meant to take into account temporal distortions such as noise, echoes and reverberation as well as bandpass limiting, nonlinear distortion and overloads.

RaSTI was never intended to be used with sound reinforcement systems, though it has been misused that way. RaSTI was developed as a tool to evaluate how well person-to-person communications could take place in an acoustical environment. It assumes wideband transmission and performs measurements in only two octave bands, 500 and 2000 Hz. When incorrectly used for evaluating sound systems, RaSTI can give strange results, as was shown at the workshop.

That leaves %Alcons, which Mapp covers in more detail in “Measuring Intelligibility” (page 56). A user can manipulate cursors on the TEF20 that affect the %Alcons results. That is useful in understanding the acoustics of the space and realizing how RT60 and Ed/Er can affect intelligibility, but a measurement that could be open to interpretation would not hold up in court. Remember that intelligibility can have legal importance. When system designers or installers are required by law to meet specified intelligibility levels, a measure such as STI, with a standardized test setup and a fixed set of measurement parameters, is required.

%Alcons may have limitations in a difficult space, as the measurement may not take into account all the detrimental effects that space might present. %Alcons, as implemented on the TEF20, uses RT60 in its calculations. However, as the ETCs showed, reverberation is a complex thing, and a single RT60 number may not totally define it. That could account for some of the variability in %Alcons results.

One of the major mistakes in dealing with difficult acoustic spaces is to take a look at only the 2kHz band for such things as loudspeaker directivity and intelligibility. Problems such as reverb, echo and lower-frequency noise could adversely affect intelligibility but would remain undetected unless specifically measured and looked for. STI measurements, on the other hand, take into account the full frequency range, producing ETCs for each octave band. With these ETCs and the RT60 and Ed/Er cursors, %Alcons can be interpolated from STI for each frequency band without much additional effort.

At St. Mary’s, participants heard an unexpected effect on intelligibility when background, uncorrelated, band-limited pink noise was added to the space in an experiment initiated by Mapp. Intelligibility actually went up, and the improvement was not subtle as evidenced by the word tests and STI. One theory is that noise can effectively mask the reverberation, thus making speech easier to understand. That is something Mapp intends to investigate further.

One surprising finding was that equalization had an effect on intelligibility. Poor intelligibility caused by equalization was clearly heard in the word tests and confirmed by STI. The tests also confirmed that directivity does matter.

The comparisons between EASE predictions and actual measurements were not made as much as had been hoped, mainly due to lack of time. However, after the workshop Brown used one of the new features of EASE 4.0, called CAESAR, on the model he did of St. Helen’s to compute ETCs on a grid of points in the seating areas. Acoustical calculations were made from those ETCs, including a new algorithm that predicts audible effects of long delayed reflections.

▪ Future workshops? The intelligibility workshop helped define and confirm key acoustical, electronic and loudspeaker characteristics that affect intelligibility. It also indicated areas for further investigation. Maybe noise can be used in highly reverberant spaces to improve intelligibility, and perhaps the software for the TEF20 must be changed to allow for longer ETC sweeps when doing intelligibility calculations in these types of spaces. Other tips and techniques for improving intelligibility might be revealed; maybe another intelligibility workshop will be needed in a few years to share all these exciting discoveries.

Mary C. Gruszka is a systems design engineer, project manager, consultant, writer and artist based in the New York City area. She can be e-mailed at [email protected]. Thanks to Doug Jones, Jim Brown, Peter Mapp, Dr. Herman Steeneken and Greg Miller from Gold Line for their assistance in writing this article.

Closing Credits

The workshop would not have been such a success without the help of a large number of people and companies.

In addition to the credits given in the article, Jim Brown coordinated, designed and documented the sound systems; made an EASE model of St. Helen’s Church; handled logistics; served as technical coordinator at St. Mary’s; and donated a considerable amount of consulting time. His company, Audio Systems Group Inc., was one of the sponsors.

Ron Steinberg of Chicago’s Rent-Com, another sponsor, donated labor as well as gear. He helped scout suitable churches, set up the temporary systems, installed the DSPs and set up the TEF measurement stations and the classroom systems.

Bruce Olson of Olson Sound Designs of Brooklyn Park, Minnesota, helped with EASE modeling, provided technical support and served as coordinator in St. Helen’s. Olson, Mapp, Don Eger, Blair McNair, and Ron Sauro did the EASE model of St. Mary’s.

John Murray of TOA Electronics provided technical support in St. Helen’s. Harman International made it possible for Don Keele to attend.
