Conference Audio Systems Design
Mar 1, 2001 12:00 PM,
High-quality audio is vital in conferencing and distance learning systems. Poor audio can make a conference fatiguing and even unintelligible. While many things contribute to a good conferencing system, nothing works better than starting with good acoustics.
A room with lots of hard surfaces such as walls, floors, ceilings, windows and tables can be too reverberant. These hard surfaces are often parallel to each other, causing sound to bounce around the room before decaying. A microphone in this type of sound environment picks up everything and sends it out to the loudspeakers or to the other meeting places your system is connected with. The original sound gets the same treatment as a reflection of the sound because the microphone cannot differentiate between the two. All sounds reaching the connected conference room come from a single location (the loudspeaker on their end). The result is that the conference room on the receiving end hears everything, original audio and reflections alike, at one level (the loudspeaker’s volume).
You can see how this would ruin the meeting for anyone at a remote location. Since remote conferencing is becoming more prevalent, you need to know how to make audio sources less noisy.
There are a number of ways to correct reverberant rooms. Your first recourse is acoustic treatment: acoustical ceiling tiles, carpeted floors, sound-absorbing panels on the walls and draperies over the windows. If you don’t have the budget for these things, try tacking a few blankets to the walls and tossing some throw rugs on the floor. You’ll be surprised how much quieter the room gets.
CORRECT MICROPHONE USAGE
Ideally, we want to get a 25 dB signal-to-noise ratio at the mic in a conference room system. A problem to overcome on any system’s transmit path is the fact that noise is transmitted with speech, and adding gain won’t help that situation. Keeping mics away from noise is the key. It’s crucial to use an appropriate number of mics: enough that everyone can be heard adequately, but not so many that they add background noise. A good rule of thumb is to use one mic for every two or three talkers.
Low-profile boundary mics that sit on a tabletop are attractive and work well for most corporate conference rooms. Push-to-talk mics can also be used, but be sure to specify models that do not mute the mic element. Push-to-talk mics should mute mic signals in the mixer, so they must be designed with separate conductors for the switch. Four such mics are beyerdynamic’s MPC67RC, Crown’s PCC170SW0, and Shure’s MX392 and MX412D. Check with the manufacturer to see if they make any models with separate contact closures.
Wireless mics can also be used; however, movement challenges the operation of acoustic echo cancellers. Lavalier mics are better than handheld when it comes to wireless, and it is critical to keep them a fair distance from loudspeakers to avoid feedback problems.
Podium (or gooseneck) mics are also a good option in paper-laden conference rooms to avoid the sounds of rustling paper. The key here is to get the mics off the desktop, but still within the critical distance of 2 to 3 feet from the talker. Microphones that are flush-mounted on the ceiling can also work but can create problems if they are too far away from the subject. They may also pick up vibrations from the HVAC system, loudspeakers or even people walking on the floor above.
If the boardroom has more than four microphones in use, it is wise to use an automatic mic mixer. When a large number of mics are on at once, especially in a reverberant room, the room’s gain into the audio system can become too high. This excessive gain results in feedback and can increase background noise in the audio system. The automatic mic mixer will ensure that a minimum number of mics are on at once.
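As a rough illustration of why fewer open mics help, the sketch below gates a set of mics against a fixed threshold and estimates the gain lost to the number of open mics (NOM) as 10 log10(NOM), a common rule of thumb that is not stated in this article. The mic levels, the 60 dB threshold and the function names are hypothetical, and a real automatic mixer uses far more sophisticated decision logic.

```python
import math

def nom_penalty_db(open_mics):
    """Approximate loss of gain before feedback with several mics open.

    Assumes the common 10*log10(NOM) rule of thumb (about 3 dB per doubling).
    """
    return 10.0 * math.log10(max(open_mics, 1))

def gate_mics(levels_db, threshold_db, last_open=0):
    """Return the indices of mics to open.

    Mics at or above the threshold open; if none qualify, the last-open mic
    stays on so the room never goes completely silent.
    """
    open_now = [i for i, level in enumerate(levels_db) if level >= threshold_db]
    return open_now if open_now else [last_open]

# Hypothetical levels (dB SPL) at eight table mics; two people are talking.
levels = [48.0, 71.0, 45.0, 50.0, 69.0, 44.0, 46.0, 47.0]
open_mics = gate_mics(levels, threshold_db=60.0)

print(open_mics)                                  # [1, 4]
print(round(nom_penalty_db(len(open_mics)), 1))   # 3.0 dB with two mics open
print(round(nom_penalty_db(len(levels)), 1))      # 9.0 dB with all eight open
```

In this made-up case, gating recovers about 6 dB of margin compared with leaving all eight mics open.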
CORRECT LOUDSPEAKER USAGE
Ideally, the loudspeaker should be 25 dB louder than the noise level of the room it is in. A common mistake in teleconferencing systems is the tendency to use only one or two loudspeakers to carry the audio from the other sites. When too few loudspeakers are used it is necessary to increase the volume, which places the room closer to potential feedback. Also, the listeners hearing the loudspeakers directly are in pain from the volume, while those who are farther away from them can barely hear what is going on.
Using an adequate number of loudspeakers will allow you to keep volume at a reasonable level throughout and will minimize the amount of loudspeaker audio that is picked up by mics. The best way to distribute loudspeakers is to place them in the ceiling (or mount them along the walls) and control them with a separate power amplifier. Also, try to isolate loudspeakers from mics as much as possible. Remember that a mic will pick up all sounds in its vicinity, and audio coming out of a loudspeaker will be treated as just another voice in the room.
In a multipoint conference system, the complexity is compounded as more noise sources are mixed together. It may be beneficial (or necessary) to reduce noise using electronic methods.
Noise suppression is an electronic process of removing noise from a corrupted signal. Generally, noise suppression consists of a method for distinguishing between the signal and the noise and a method for removing the noise. There are many different kinds of noise suppression, but they all have one basic goal: to make the output sound as much like the original, noise-free signal as possible.
The best way to reduce noise in a signal is to prevent it from entering the signal to begin with. This is accomplished by using good system design practices such as improving acoustics, using high-quality equipment and following good wiring practices. Despite all these efforts, there will always be some noise in the signal.
Note that noise suppression isn’t necessarily a cure for noisy signals. A signal with 10 dB of noise suppression doesn’t sound as good as a signal that has 10 dB less noise in the first place. But it will still sound much better than a signal with no noise suppression at all!
NOISE BEYOND YOUR CONTROL
Conference systems are often connected to a wide variety of remote locations, many of which may be noisy. Noise is frequently introduced by a poorly designed room on the other end of the conference. Even when all rooms are well-designed, noise may be introduced from sources beyond everyone’s control. Common sources of noise include blowers and HVAC systems; road noise from cellular phones; in-band telephone hiss or hum that cannot be removed by the phone add; or noise introduced through the conference transmission system itself. Other sources are computers and projectors used in the conference room.
Since you don’t have control of the other rooms or the communications network, your only choice there is to remove the noise electronically. Ideally, we want to reach a signal-to-noise ratio of 25 dB. In your own room, another way to increase SNR is to place mics closer to talkers, since the talker’s SPL increases closer to the mic. Applying acoustical treatments to the room is also an option; however, it might be expensive in a retrofit, so it is usually best left for inclusion in the design phase.
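To put a number on the benefit of moving a mic closer, the sketch below applies the inverse-square law, under which direct sound gains about 6 dB for each halving of distance. It assumes free-field conditions and that the ambient noise at the mic does not change when the mic moves; the 19 dB starting SNR and the function names are hypothetical.

```python
import math

def level_gain_db(old_distance, new_distance):
    """Change in direct-sound level when the talker-to-mic distance changes.

    Inverse-square law, free-field assumption: 20*log10(old/new).
    """
    return 20.0 * math.log10(old_distance / new_distance)

def snr_after_move(snr_db, old_distance, new_distance):
    """New SNR at the mic, assuming the noise level at the mic is unchanged."""
    return snr_db + level_gain_db(old_distance, new_distance)

# Hypothetical case: 19 dB SNR with the mic 3 feet away, then moved to 1.5 feet.
print(round(level_gain_db(3.0, 1.5), 1))         # 6.0
print(round(snr_after_move(19.0, 3.0, 1.5), 1))  # 25.0
```

In a reverberant room the real gain will usually be smaller than this estimate, since reflected sound does not fall off with distance in the same way.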
The effects of noise are gradual, and listeners may get fatigued after listening to high levels for a long period of time. Even if the intelligibility isn’t improved by noise suppression, the quality enhancement can make the conference much more pleasant and allow it to continue comfortably for a longer period of time.
MIXING CHANNELS TOGETHER
Even with moderate to low noise levels, the noise starts to add up when you mix signals together. This will limit the number of channels you can mix together, depending on the noise levels of each channel and your quality requirements. Noise suppression can effectively increase the number of channels you can mix together or simply improve the audio quality with the number of channels you have. If noise suppression is applied correctly, you can improve the signal quality and intelligibility just as much as if you had prevented the noise from entering the system acoustically. This process is best illustrated by the following example.
A conference room has eight non-gated microphones, which are on the table about 3 feet from the participants. The ambient noise level in the room is 40 dB SPL, and the level of speech picked up by the microphone is 73 dB SPL. The unprocessed signal-to-noise ratio for each channel is 33 dB.
When you mix the eight channels together without gating, the total noise level adds up to 49 dB (add 3 dB for every doubling of the number of channels). This means the total SNR is only 24 dB. If you apply noise suppression to the mix, the SNR is improved to 34 dB, but the signal doesn’t sound as good as it would if the SNR were 34 dB without noise suppression. Noise suppression on the mix is certainly an improvement, but you can do even better.
If you apply noise suppression to each individual channel before mixing, the algorithm has to remove less noise. Since you are applying noise suppression to a single channel, you are starting with an SNR of 33 dB. After noise suppression, the SNR is improved to 43 dB. Now when you mix the channels together, the SNR is brought back to 34 dB. The key here is that the SNR has never been worse than 33 dB, so the mix sounds as good as a single channel with a 33 dB SNR. That is, it is not a 24 dB SNR signal that has been improved with noise suppression to measure 34 dB.
This works because removing noise from the channels individually prevents noise from one channel from getting mixed in with all the others. Since you are preventing noise from getting mixed in, the intelligibility doesn’t suffer. For example, the noise in channels 2 through 8 is reduced before it gets mixed with channel 1. So the intelligibility of channel 1 is not harmed by the noise from the other channels. This is the same principle that makes an automatic mic mixer sound better than a single noise gate after a non-gated mixer.
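The arithmetic in the eight-microphone example can be checked with a few lines of code. This sketch assumes the noise on each channel is uncorrelated and equal in level, that only one person is talking, and that the suppressor removes a flat 10 dB of noise (the figure implied by the example); the function name is illustrative.

```python
import math

def mixed_noise_db(noise_db, channels):
    """Noise level after summing equal, uncorrelated noise from several channels."""
    return noise_db + 10.0 * math.log10(channels)

speech_db = 73.0       # speech level at the mic, from the example
noise_db = 40.0        # ambient noise level, from the example
channels = 8
suppression_db = 10.0  # assumed noise reduction per pass

per_channel_snr = speech_db - noise_db
mix_snr = speech_db - mixed_noise_db(noise_db, channels)

# Option 1: mix first, then suppress noise on the mix.
suppress_after_mix = mix_snr + suppression_db

# Option 2: suppress noise on each channel, then mix.
suppress_before_mix = speech_db - mixed_noise_db(noise_db - suppression_db, channels)

print(round(per_channel_snr))      # 33
print(round(mix_snr))              # 24
print(round(suppress_after_mix))   # 34
print(round(suppress_before_mix))  # 34
```

Both orderings arrive at the same measured SNR. The difference described above is in perceived quality, which a single SNR figure does not capture: in the second case the suppressor never has to work on a signal worse than 33 dB.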
The benefits of using noise suppression in multipoint conferencing systems are similar to the benefits you get when mixing channels together. Since the sites are mixed together, you can improve intelligibility and mix quality by reducing noise at each site. The benefits of noise suppression in this situation may be even more noticeable if one of the sites is very noisy.
Consider a simple system with two rooms (A and B), linked with a high-quality audio conferencing channel. If a telephone caller on a noisy line is brought into the conference, the line noise corrupts the mix. Thus, the people in rooms A and B sound just as noisy as the telephone caller. By removing the noise before it gets mixed in, the people in rooms A and B sound much more intelligible, and the telephone signal has better perceived quality.
NOISE SUPPRESSION METHODS
Before we compare some of the noise suppression methods for audio and communications, please see the criteria in the sidebar, What Makes a Good Noise Suppression Algorithm?
If these criteria are met, the speech signal will not be distorted by the noise suppression process. Ideally, the algorithm will make no difference in audio quality except for the reduction of noise. You shouldn’t know it was there until you turn it off and hear an increase in noise level. And though adequate noise reduction will occur, intelligibility won’t suffer.
Noise gates attenuate the telephone signal when there is no speech present and turn the gain back up when someone starts talking. Thus, they only remove noise during idle periods. At low noise levels, this is not noticeable since the noise is masked by speech. As the noise gets louder, its presence during speech becomes more noticeable and results in a loss of perceived quality. At moderate noise levels, there may be a noticeable whooshing sound as the gain is ramped up and down to allow the speech to pass through. This may result in a half-duplex feel because there is obviously some noise, but then the signal goes completely dead when there is no speech. If the noise levels get too high, the noise gate may have trouble deciding what is speech, and some speech may actually get cut off. Most noise gates have threshold adjustments which must be manually set for different noise levels for the best performance.
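A minimal noise gate can be sketched as follows, assuming a fixed, manually chosen threshold and a simple per-frame gain ramp. The frame size, floor gain, ramp step and sample values are all hypothetical, and the function is an illustration rather than any particular product’s design.

```python
import math

def noise_gate(samples, threshold, frame=4, floor_gain=0.1, step=0.25):
    """Gate a list of samples one frame at a time.

    A frame whose RMS level reaches the threshold is treated as speech and
    the gain ramps toward 1.0; otherwise it ramps toward floor_gain. The
    gain moves at most `step` per frame, which produces the audible ramp.
    """
    gated = []
    gain = floor_gain
    for start in range(0, len(samples), frame):
        block = samples[start:start + frame]
        rms = math.sqrt(sum(s * s for s in block) / len(block))
        target = 1.0 if rms >= threshold else floor_gain
        if gain < target:
            gain = min(target, gain + step)
        else:
            gain = max(target, gain - step)
        gated.extend(s * gain for s in block)
    return gated

# Two frames of low-level noise followed by four frames of louder "speech".
signal = [0.01] * 8 + [0.5] * 16
gated = noise_gate(signal, threshold=0.1)

print(gated[0])   # idle noise, attenuated by the floor gain
print(gated[8])   # first speech frame, gain still ramping up
print(gated[-1])  # 0.5, gain fully open
```

Once the gate is open the gain is 1.0, so any noise present during speech passes through untouched, which is the limitation described above. The ramp on the first speech frames also shows how word beginnings can be softened or clipped.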
Various methods of speech enhancement or emphasis work by trying to increase the perceived level of speech. They don’t actually remove noise, but try to emphasize the speech parts of the signal. This is usually accomplished by enhancing the speech formants. If the level of the signal can be turned down (due to the increased speech level), noise is effectively removed. In general, these methods cannot improve the signal-to-noise ratio by very much (perhaps about 3 to 6 dB). Furthermore, intelligibility is not preserved very well because certain consonant sounds are not emphasized.
Spectral subtraction is a noise reduction method that makes an adaptive estimate of the noise spectrum and subtracts that from the signal. The noise estimate is subtracted all the time, both during speech and idle periods. However, there is an artifact associated with spectral subtraction called musical noise, which is a sort of trickling sound added while the noise is removed. It also adds a hollow, resonant quality to speech. This tends to be annoying and distracting.
In general, there is a tradeoff with spectral subtraction methods. You can get moderate noise cancellation with an unacceptable amount of musical noise or a good amount of noise cancellation with an unacceptable amount of speech attenuation. Spectral subtraction does a fair job of removing noise, but it adds many artifacts that may be more annoying than the noise itself.
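The basic idea of spectral subtraction can be shown on a single short frame. This sketch is heavily simplified: it uses a fixed noise estimate taken from one noise-only frame rather than an adaptive one, a slow textbook DFT instead of an FFT, no windowing or overlap, and an idealized steady hum as the noise. The function names, the `floor` parameter and the test signal are all illustrative assumptions.

```python
import cmath
import math

def dft(x):
    """Plain O(n^2) discrete Fourier transform; fine for a short demo frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def spectral_subtract(frame, noise_mag, floor=0.05):
    """Subtract a noise magnitude estimate from each bin, keeping the noisy phase.

    `floor` keeps a small fraction of each bin so magnitudes never go negative.
    """
    cleaned = []
    for value, noise in zip(dft(frame), noise_mag):
        mag = abs(value)
        new_mag = max(mag - noise, floor * mag)
        scale = new_mag / mag if mag > 0 else 0.0
        cleaned.append(value * scale)
    return idft(cleaned)

n = 32
clean = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]       # the "speech"
hum = [0.2 * math.sin(2 * math.pi * 10 * t / n) for t in range(n)]  # steady noise
noisy = [c + h for c, h in zip(clean, hum)]

noise_mag = [abs(v) for v in dft(hum)]  # estimate from a noise-only frame
output = spectral_subtract(noisy, noise_mag)

error_before = max(abs(a - b) for a, b in zip(noisy, clean))
error_after = max(abs(a - b) for a, b in zip(output, clean))
print(round(error_before, 3), round(error_after, 3))
```

The result is this clean only because the noise is a perfectly steady tone. Real noise fluctuates from frame to frame, so the estimate is sometimes too high and sometimes too low, which is commonly cited as the source of the musical noise and speech attenuation mentioned above.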
Adaptive Digital Filtering
Adaptive digital filtering algorithms appear to be the answer. Figures 2a to 2d show remarkable gain reduction in acoustic echoes. Adaptive digital filters employ DSPs designed to differentiate between noise and program signal after converting the signal into the digital domain. Their effect is striking and is the current state of the art.
In audio conferencing, the idea of acoustic gain is a little different than in sound reinforcement. In this application, we determine how much of the loudspeaker signal is being picked up by the microphone or how loud the loudspeaker signal is at the mic compared to local speech. In other words, acoustic gain is the difference between amplified and unamplified speech at the farthest listener. If we know the distances involved, it is a simple matter to calculate the needed acoustic gain, which is the minimum gain necessary for comfort and intelligibility, and the potential acoustic gain, which is the maximum level without feedback. The idea is to design a system so that PAG is greater than NAG (refer to Figure 3).
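The article does not give the formulas, so the sketch below assumes the textbook sound-reinforcement expressions for PAG and NAG, with a 6 dB feedback stability margin and a 10 log10(NOM) term for the number of open mics. The distance names (D0, D1, D2, Ds, EAD) follow the usual convention, and the room dimensions are hypothetical.

```python
import math

def potential_acoustic_gain(d0, d1, d2, ds):
    """PAG in dB (assumed textbook form).

    d0: talker to farthest listener   d1: mic to nearest loudspeaker
    d2: loudspeaker to farthest listener   ds: talker to mic
    """
    return 20.0 * math.log10((d0 * d1) / (d2 * ds))

def needed_acoustic_gain(d0, ead, open_mics=1, margin_db=6.0):
    """NAG in dB, including the open-mic penalty and a stability margin.

    ead: equivalent acoustic distance, i.e. how close the farthest listener
    should seem to be to the talker.
    """
    return (20.0 * math.log10(d0 / ead)
            + 10.0 * math.log10(open_mics)
            + margin_db)

# Hypothetical room, all distances in feet.
pag = potential_acoustic_gain(d0=20.0, d1=8.0, d2=6.0, ds=1.5)
nag = needed_acoustic_gain(d0=20.0, ead=4.0, open_mics=2)

print(round(pag, 1))        # 25.0
print(round(nag, 1))        # 23.0
print(round(pag - nag, 1))  # 2.0 dB of headroom
```

With these made-up numbers the design just clears the requirement. Moving the mic closer to the talker (smaller Ds) or farther from the loudspeaker (larger D1) raises PAG directly.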
ACOUSTIC ECHO CANCELLATION
What happens when you have too much acoustic gain? If the acoustic gain of your system exceeds the acoustic echo canceller’s capabilities, the acoustic echo cancellation will no longer adapt well to changing acoustic conditions in the room. This can result in increased noise suppression causing half-duplex communications, lack of convergence (residual echo heard all the time) and excessive feedback. As the acoustic gain increases, these problems will get worse and may make communications impossible.
Acoustic echo is most noticeable (and annoying) when delay is present in the transmission path. This happens primarily in long-distance circuits or in systems using speech compression (such as video conferencing or digital cellular phones). Even though the echo might not be as annoying when there is no delay (as with short links between conference rooms in the same building or distance learning over fiber-optic cable), it is still intrusive and can cause fatigue and listener stress.
Acoustic echo cancellers can be used in both narrow-band (3.5 kHz) and wide-band (7 kHz) conferencing systems. Narrow-band applications include teleconferencing and low bit-rate video conferencing. Wide-band applications include high-quality teleconferencing and video conferencing, as well as distance learning. Wide-band conferencing system users should be particularly interested in using an AEC solution, as it will help them to reap the most benefit from the additional audio capabilities of their systems.
People at the remote end of the transmission path are the primary beneficiaries of an AEC. Installed at the local end, an AEC prevents the echo of the remote person’s voice from being returned (echoed) to them through the audio system. People on the same end as the AEC should not notice it if it is doing its job properly. Since the person on the far end hears better audio quality, the AEC enables the conversation to flow more smoothly.
Be sure not to confuse acoustic echo with line echo. Line echoes are reflections within the telephone line, rather than echoes from a room or auditorium. They usually consist of only one or two noticeable reflections from telephone hybrids or impedance mismatches in the line. They are usually delayed by less than 32 milliseconds and do not change frequently, if at all. Acoustic echoes have a very complex path with dozens or hundreds of reflections that last 100 to 200 milliseconds and can vary continuously during a conversation.
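The difference in echo length translates directly into how long an adaptive filter the canceller needs. The sketch below first converts an echo tail into filter taps, then runs a small normalized LMS (NLMS) filter against a simulated echo. NLMS is used here only as a familiar example; the article does not say which algorithm commercial echo cancellers use. The 8 kHz and 16 kHz sample rates, the four-tap "room" and the step size are assumptions, and the simulation has no local talker or noise.

```python
import random

def taps_for_tail(tail_ms, sample_rate_hz):
    """Number of filter taps needed to span an echo tail of the given length."""
    return int(round(tail_ms * sample_rate_hz / 1000.0))

def nlms_echo_cancel(far_end, mic, taps, mu=0.5, eps=1e-6):
    """Return the mic signal with the estimated far-end echo subtracted."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual

print(taps_for_tail(32, 8000))    # 256 taps for a line echo at 8 kHz
print(taps_for_tail(200, 16000))  # 3200 taps for a room echo at 16 kHz

random.seed(1)
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
room = [0.6, 0.3, -0.2, 0.1]  # hypothetical loudspeaker-to-mic echo path
mic = [sum(room[k] * far_end[i - k] for k in range(len(room)) if i - k >= 0)
       for i in range(len(far_end))]

residual = nlms_echo_cancel(far_end, mic, taps=8)
echo_power = sum(s * s for s in mic[-200:])
residual_power = sum(s * s for s in residual[-200:])
print(residual_power < 0.001 * echo_power)  # True once the filter has converged
```

A real room is much harder than this toy case: the echo path is thousands of taps long, it changes whenever people or mics move, and the filter must stop adapting while the local talker is speaking.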
In order for participants at both ends to hold a full-duplex, hands-free conversation, both ends must be equipped with an AEC as in Figure 4.
TELEPHONE SYSTEM FEEDBACK
If your sound system starts feeding back whenever a phone line is introduced, the problem is not in the sound system but in the interface to the phone line. Most often, the wrong device has been used for bringing the phone call into the audio system. Telephone couplers, which normally cost $300 or less, simply will not work for your application because they cannot adequately isolate the two sides of the telephone call. This inadequate isolation results in bleed-through of audio from the send side of the coupler to the receive side. When this audio is amplified through your sound system, electronic feedback results.
Luckily, phone coupler problems are easy to resolve. Throw out the couplers and replace them with digital telephone hybrids. These devices are the same products used at radio and TV stations to bring callers into talk shows. They are considerably more expensive than the couplers, but they won’t introduce feedback into your audio system.
There is only one trick to using telephone hybrids: if the audio sent down the telephone line contains any of the caller’s audio, feedback will result. The trick is to use what is called a mix-minus feed to the caller, a mix of all of the audio from your system minus the caller’s audio. If your mixing system does not have mix-minus capability (most don’t), you can either use a separate mixer for the phone line or buy a digital telephone hybrid with automatic mix-minus capability.
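A mix-minus feed is simple to express in code: each destination receives the sum of every source except its own. The sketch below uses made-up source names and integer sample values purely for illustration.

```python
def mix_minus(sources):
    """Build one feed per source containing every other source, but not itself.

    sources: dict mapping a source name to a list of samples (equal lengths).
    """
    names = list(sources)
    length = len(next(iter(sources.values())))
    return {
        name: [sum(sources[other][i] for other in names if other != name)
               for i in range(length)]
        for name in names
    }

# Hypothetical three-source system with tiny sample blocks.
sources = {
    "room_mics": [1, 2, 3],
    "program": [10, 10, 10],
    "caller": [100, 200, 300],
}
feeds = mix_minus(sources)

print(feeds["caller"])     # [11, 12, 13]: room mics plus program, no caller audio
print(feeds["room_mics"])  # [110, 210, 310]
```

Only the "caller" feed goes down the phone line. Because it never contains the caller’s own audio, the loop that causes feedback through the hybrid is broken.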
Here are the top nine things you can do to design good conference systems:
- Place one mic within arm’s reach for every two to three talkers.
- Apply noise cancellation to every mic channel.
- Limit user control to loudspeaker volume, with limited range.
- Design conservatively when the remote sites’ characteristics are unknown.
- Design for as big a difference between NAG and PAG as possible (15 to 20 dB is great).
- Ensure nominal levels are sent to and received from the codec.
- Use acoustical treatment to reduce reverberation.
- Provide full-bandwidth program audio to wide-bandwidth codecs.
- Design a conference room with the same care you would for a sound reinforcement system.
It is hard to achieve good-sounding audio in conference room networks. There are many obstacles, including room acoustics, electronics and user technique. However, audio problems are not insurmountable. Once your audio system has been correctly tuned, you will find a dramatic increase in your client’s productivity and enjoyment.
Michael Pocino is an engineer at ASPI Digital in Atlanta, Georgia. Michael can be reached via e-mail at [email protected].
What to look for in an automatic mic mixer
When you’re looking for an automatic mic mixer, make sure you pick the right one. The mic mixer you choose:
- SHOULD be able to handle the number of mics you will be using (most are expandable or can be cascaded).
- SHOULD provide an automatic gating threshold. This means that a mic will only turn on when the sound level exceeds a pre-determined level. The smarter mixers can automatically set the gating threshold above the background noise.
- SHOULD permit a chair override on one of the channels. This permits the leader or moderator to take control of the mixer by simply speaking into his or her mic.
- SHOULD allow one mic to be always on. If all mics turn off, that room’s audio will go away (including background noise), making it sound as though the connection were cut off. Leaving one mic on will provide a more natural sound (and is essential for the proper operation of acoustic echo cancellers).
- SHOULD NOT require a specific type of mic. Ignoring this can prove costly.
WHAT MAKES A GOOD NOISE SUPPRESSION ALGORITHM?
- The desired signal is not removed.
- The desired signal is not distorted.
- Noise is removed during idle periods.
- Noise is removed while the signal is active.
- There are no audible transitions between on and off states.