Feb 1, 1999 12:00 PM, Dennis Bohn
Among the many definitions of dharma is the essential function or nature of a specific thing. The audio industry has been radically and irrevocably changed by the digital revolution. Arguments will ensue forever about whether the true nature of the real world is analog or digital, whether the fundamental essence (or dharma) of life is continuous (analog) or exists in tiny little chunks (digital). Here we shall but resolve to understand the dharma of A/D converters.
Data conversion
Once a waveform has been converted to digital format, nothing can occur to change its sonic properties. While it remains in the digital domain, it is a series of digital words, representing numbers. Aside from the gross example of having the digital processing actually fail and cause a word to be lost or corrupted, nothing can change the sound of the word. It is just a bunch of ones and zeroes. There are no fractions. The point is that, sonically, it begins and ends with the conversion process. Nothing is more important to digital audio than data conversion. Everything in between is arithmetic and waiting. That is why data conversion is so critical. We could go so far as to say that data conversion is the art of digital audio while everything else is the science; it is data conversion that ultimately determines whether or not the original sound is preserved.
Because analog signals continuously vary between an infinite number of states and computers can only handle two, the signals must be converted into binary digital words before the computer can function. Each digital word represents the value of the signal at one precise point in time. Today's common word length is 16 bits or 32 bits. Once converted into digital words, the information may be stored, transmitted or operated upon within the computer. In order to properly explore the critical interface between the analog and digital worlds, it is necessary to review a few fundamentals and a little history.
Binary numbers
Whenever we speak of digital, by inference, we speak of computers (computer is used here to represent any digital-based piece of audio equipment), and computers are really quite simple. They can understand only the most basic form of communication or information (yes or no, on or off), all of which can be symbolically represented by two things, anything from two letters to two numbers or two charges. To keep it simple, we choose two numbers: one and zero. Officially, this is known as binary representation, from Latin bini, meaning two by two. In mathematics, this is a base-2 number system as opposed to our decimal number system, which is called base-10 because we use the 10 numbers, zero through nine.
In binary, zero is a good symbol for no, off, closed and gone, and one is easy to understand as meaning yes, on, open and here. In electronics, it is easy to determine whether a circuit is open or closed, conducting or not conducting, has voltage or lacks voltage. Thus, the binary number system found use in the first computer, and nothing has changed. Computers just got faster, smaller and cheaper, and memory size increased in a decreasing space.
One problem with using binary numbers is that they become big and unwieldy in a hurry. For instance, it takes six digits to express my age in binary but only two in decimal. In binary, however, we had better not call them digits because that implies a human finger or toe, of which there are 10. To get around that problem, John Tukey of Bell Laboratories dubbed the basic unit of information (as defined by Shannon) a binary unit or binary digit, which became abbreviated to bit. A bit is the simplest possible message representing one of two states.
So, I am 6 bits old. Well, not quite. Nevertheless, it takes 6 bits to express my age as 110111. I am 55 years old in base-10 symbols, which stands for five ones plus five tens. Each digit in our everyday numbers represents an additional power of 10 beginning with zero. That is, the first digit represents the number of ones ($10^0$); the second digit represents the number of tens ($10^1$). We can represent any size number by using this shorthand notation.
Binary number representation is just the same except substituting the powers of two for the powers of 10. Therefore, moving from right to left, each succeeding bit represents $2^0 = 1$, $2^1 = 2$, $2^2 = 4$, $2^3 = 8$, $2^4 = 16$, $2^5 = 32$ and so on. My age is thus represented as 110111, which is 32+16+0+4+2+1.
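The place-value arithmetic can be checked with a few lines of Python. This is only an illustrative sketch; the function names to_binary and from_binary are made up for this example and do not come from any library.

```python
def to_binary(value: int) -> str:
    """Return the binary digits of a non-negative integer, most significant bit first."""
    if value == 0:
        return "0"
    bits = []
    while value > 0:
        bits.append(str(value % 2))  # the remainder is the next bit, starting at 2^0
        value //= 2
    return "".join(reversed(bits))


def from_binary(bits: str) -> int:
    """Add up the powers of two selected by each 1 bit, counting from the rightmost bit."""
    total = 0
    for power, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** power
    return total


print(to_binary(55))          # 110111
print(from_binary("110111"))  # 55
```

The rightmost bit carries the weight 1, and each bit to its left carries twice the weight of its neighbor.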
Harry and Claude
The French mathematician Fourier unknowingly laid the groundwork for A/D conversion in the early 19th century. All data conversion techniques rely on looking at or sampling the input signal at regular intervals and creating a digital word that represents the value of the analog signal at that precise moment. The fact that we know this works lies with Harry Nyquist. While working at Bell Laboratories in the late 1920s, Nyquist wrote the paper "Certain Topics in Telegraph Transmission Theory," in which he described the criteria for sampled data systems. He taught that for periodic functions, if you sampled at a rate that was at least twice as fast as the highest frequency of interest, then no information (data) would be lost upon reconstruction. Because Fourier had already shown that all alternating signals are made up of a sum of harmonically related sine and cosine waves, audio signals are periodic functions and can be sampled without loss of information. Half the sampling frequency became known as the Nyquist frequency, the highest frequency that may be accurately sampled. The theoretical Nyquist frequency for the audio CD system is 22.05 kHz, equaling half of the standardized sampling frequency of 44.1 kHz.
As powerful as Nyquist's discoveries were, they were not without their dark side, the biggest being aliasing frequencies. Following the Nyquist criteria guarantees that no information will be lost. It does not, however, guarantee that no information will be gained. Sampling an analog signal at precise time intervals is an act of multiplying the input signal by the sampling pulses. This introduces the possibility of generating false signals indistinguishable from the original. In other words, given a set of sampled values, we cannot relate them specifically to one unique signal. As Figure 1 shows, the same set of samples could have resulted from any of the three waveforms shown and from all possible sum and difference frequencies between the sampling frequency and the one being sampled. All such false waveforms that fit the sample data are called aliases. In audio, these frequencies show up mostly as intermodulation distortion products, and they come from the random-like white noise or any ultrasonic signal present in every electronic system. Solving the problem of aliasing frequencies is what improved audio conversion systems to today's level of sophistication.
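A short numerical experiment makes the ambiguity concrete. The sketch below assumes a 1 kHz cosine sampled at the CD rate of 44.1 kHz (both values are chosen only for illustration) and compares its samples with those of the difference frequency (43.1 kHz) and the sum frequency (45.1 kHz). The helper name sample_cosine is hypothetical.

```python
import math


def sample_cosine(freq_hz: float, sample_rate_hz: float, count: int) -> list:
    """Sample a unit-amplitude cosine of the given frequency at regular intervals."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]


fs = 44100.0   # sampling frequency (assumed for this example)
tone = 1000.0  # the signal we meant to sample (assumed for this example)

original = sample_cosine(tone, fs, 16)
alias_difference = sample_cosine(fs - tone, fs, 16)  # 43.1 kHz
alias_sum = sample_cosine(fs + tone, fs, 16)         # 45.1 kHz

# Largest disagreement between the three sets of samples.
worst = max(
    max(abs(a - b), abs(a - c))
    for a, b, c in zip(original, alias_difference, alias_sum)
)
print(f"largest difference between sample sets: {worst:.2e}")
```

The three sets of samples agree to within floating-point rounding, so nothing in the sampled data can tell the 1 kHz tone from its two ultrasonic aliases.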
Claude Shannon is recognized as the father of information theory. While a young engineer at Bell Labs in 1948, he defined an entirely new field of science. Earlier, while a 22-year-old student at MIT, he had shown in his master's thesis how the algebra invented by the British mathematician George Boole in the mid-1800s could be applied to electronic circuits. Since that time, Boolean algebra has been the rock of digital logic and computer design. Shannon studied Nyquist's work closely and came up with a deceptively simple addition. He observed and proved that if you restrict the input signal's bandwidth to less than half the sampling frequency, then no aliasing errors are possible. Bandlimiting your input to no more than half the sampling frequency guarantees no aliasing, but perfect bandlimiting is not possible.
To satisfy the Shannon limit, you must have the proverbial brick-wall, infinite-slope filter, which cannot happen in our universe. You cannot guarantee that there is absolutely no signal or noise greater than the Nyquist frequency. Fortunately, there is a way around this problem. If you cannot restrict the input bandwidth to prevent aliasing, then solve the problem by increasing the sampling frequency until the aliasing products that do occur do so at ultrasonic frequencies and are effectively dealt with by a simple single-pole filter. This is where the term oversampling comes in. For full-spectrum audio, the minimum sampling frequency must be 40 kHz, giving a usable theoretical bandwidth of 20 kHz, the limit of normal human hearing. Sampling at anything significantly higher than 40 kHz is oversampling. In just a few years, we have seen the audio industry go from the CD system standard of 44.1 kHz and the pro audio quasi-standard of 48 kHz to 8x and 16x oversampling frequencies of around 350 kHz and 700 kHz respectively. With sampling frequencies this high, aliasing is no longer an issue.
Quantization
Quantizing is the process of determining which of the possible values (determined by the number of bits or voltage reference parts) is the closest value to the current sample; you assign a quantity to that sample. Quantizing involves deciding between two values and thus always introduces error. How big the error is, or how accurate the answer is, depends on the number of bits. The more bits, the better the answer. The converter has a reference voltage which is divided up into $2^n$ parts, where n is the number of bits. Each part represents one bit. Because you cannot resolve anything smaller than one bit, there is always error in the conversion process. This is the accuracy issue.
The number of bits determines the converter accuracy. For 8 bits, there are $2^8$ (256) possible levels as shown in Figure 2. Because the signal swings positive and negative, there are 128 levels for each direction. Assuming a +/-5 V reference, this makes each division or bit equal to 39 mV (5/128 = 0.039). Hence, an 8 bit system cannot resolve anything smaller than 39 mV. This means a worst-case accuracy error of 0.78%. Table 1 compares the accuracy improvement gained by 16 bit, 20 bit and 24 bit systems along with the reduction in error. This is not the only way to use the reference voltage. Many schemes exist for coding, but this one nicely illustrates the principles involved. Each step size, resulting from dividing the reference into the number of equal parts dictated by the number of bits, is equal and is called a quantizing step or interval. Originally, this step was termed the least significant bit (LSB) because it equals the value of the smallest coded bit, but it is an illogical choice for mathematical treatments.
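The step-size arithmetic extends directly to longer words. The sketch below simply repeats the 5/128 calculation for other word lengths, assuming the same +/-5 V reference and the same coding scheme (half the levels for each polarity). The function name step_size_volts is invented for this illustration.

```python
def step_size_volts(peak_volts: float, bits: int) -> float:
    """Size of one quantizing step when half of the 2^bits levels cover each polarity."""
    levels_per_polarity = 2 ** (bits - 1)
    return peak_volts / levels_per_polarity


for bits in (8, 16, 20, 24):
    step = step_size_volts(5.0, bits)
    percent = 100.0 / 2 ** (bits - 1)  # one step as a percentage of the peak value
    print(f"{bits:2d} bits: step = {step:.3e} V, worst-case error = {percent:.6f}%")
```

For 8 bits this prints a step of about 39 mV and an error of about 0.78%, matching the figures in the text.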
The error due to the quantizing process, quantizing error, can be thought of as an unwanted signal that the quantizing process adds to the perfect original. An example best illustrates this principle. Let the sampled input value be some arbitrarily chosen value, say, 2 V, and let this be a 3 bit system with a 5 V reference. The 3 bits divide the reference into 8 equal parts ($2^3 = 8$) of 0.625 V each. For the 2 V input example, the converter must choose between either 1.875 V or 2.5 V, and because 2 V is closer to 1.875 V than to 2.5 V, 1.875 V is the best fit. This results in a quantizing error of -0.125 V; the quantized answer is too small by 0.125 V.
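The 3 bit example can be worked through in code. This is a minimal sketch of nearest-level quantizing for a unipolar reference, written only to reproduce the numbers above; the function name quantize and its return layout are assumptions of this illustration, not a description of any particular converter.

```python
def quantize(value_v: float, reference_v: float, bits: int):
    """Pick the nearest quantizing level and report it with the error and step size."""
    step = reference_v / (2 ** bits)              # 5 V / 8 = 0.625 V for 3 bits
    level = round(value_v / step)                 # index of the nearest level
    level = max(0, min(level, 2 ** bits - 1))     # stay within the available codes
    quantized = level * step
    error = quantized - value_v                   # negative means the answer is too small
    return quantized, error, step


quantized, error, step = quantize(2.0, 5.0, 3)
print(quantized, error, step)  # 1.875 -0.125 0.625
```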
These alternating unwanted signals added by quantizing form a quantized error waveform that is a kind of additive broadband noise that is generally uncorrelated with the signal and is called quantizing noise. Because the quantizing error is essentially random (uncorrelated with the input), it can be thought of like white noise. This is not quite the same thing as thermal noise, but it is similar. The energy of this added noise is equally spread over the band from DC to half the sampling rate. This is a most important point, and I will return to it when I discuss delta-sigma converters and their use of extreme oversampling.
Successive approximation
Successive approximation is one of the earliest and most successful A/D conversion techniques. Therefore, it is no surprise it became the initial A/D workhorse of the digital audio revolution. Successive approximation paved the way for the delta-sigma techniques to follow.
The heart of an A/D circuit is a comparator, an electronic block whose output is determined by comparing the values of its two inputs. If the positive input is larger than the negative input, then the output swings positive, and if the negative input exceeds the positive input, the output swings negative. Therefore, if a reference voltage is connected to one input and an unknown input signal is applied to the other input, you now have a device that can compare and tell you which is larger. Thus, a comparator gives you a high output (a one) when the input signal exceeds the reference or a low output (a zero) when it does not. A comparator is the key ingredient in the successive approximation technique as shown in Figure 3.
In successive approximation, the circuit evaluates each sample and creates a digital word representing the closest binary value. The process takes the same number of steps as bits available; a 16 bit system requires 16 steps for each sample. The analog sample is successively compared to determine the digital code beginning with the determination of the biggest (most significant) bit of the code.
The description given in Daniel Sheingold's Analog-Digital Conversion Handbook offers the best analogy as to how successive approximation works. The process is analogous to a gold miner's assay scale or a chemical balance, which uses a set of graduated weights, each one half the value of the preceding one: 1 g, 0.5 g, 0.25 g and so on. You compare the unknown sample against these known values by first placing the heaviest weight on the scale. If it tips the scales, you remove it; if it does not, you leave it and go to the next smaller value. If that value tips the scale, you remove it; if it does not, you leave it and go to the next lower value, and so on until you reach the smallest weight available. The sum of all the weights on the scale represents the closest value you can resolve. In digital terms, we can analyze this example by saying that a zero was assigned to each weight removed and a one to each weight remaining, in essence creating a digital word equivalent to the unknown sample with the number of bits equaling the number of weights. The quantizing error will be no more than half the quantizing step. Again, the successive approximation technique must repeat this cycle for each sample. This remains a time-consuming process and is still limited to relatively slow sampling rates, but it did get us into the 16 bit, 44.1 kHz digital audio world.
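The balance-scale procedure translates almost word for word into code. The sketch below is an idealized software model, not a description of any real converter chip: the function name successive_approximation is hypothetical, the reference is assumed unipolar, and a weight is kept whenever the running total does not exceed the sample. Kept that way, the result never overshoots the sample, so this simple model can be low by up to one full step; practical designs typically add a half-step offset to center the error.

```python
def successive_approximation(sample_v: float, reference_v: float, bits: int):
    """Model a successive approximation conversion, one comparison per bit."""
    code = 0
    on_the_scale = 0.0            # sum of the weights kept so far
    weight = reference_v / 2      # heaviest weight, the most significant bit
    for _ in range(bits):
        code <<= 1
        if on_the_scale + weight <= sample_v:
            on_the_scale += weight  # the scale did not tip: keep the weight, record a one
            code |= 1
        # otherwise the weight is removed and the bit stays zero
        weight /= 2               # move on to the next smaller weight
    return code, on_the_scale


code, value = successive_approximation(2.0, 5.0, 3)
print(format(code, "03b"), value)  # 011 1.875
```

A 16 bit conversion runs the same loop 16 times per sample, which is why the technique is comparatively slow.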
PCM and PWM
The successive approximation method of data conversion is an example of pulse code modulation (PCM). Three elements are required: sampling, quantizing and encoding into a fixed-length digital word. The reverse process reconstructs the analog signal from the PCM code. The output of a PCM system is a series of digital words where the word size is determined by the available bits. For example, the output can be a series of 8 bit, 16 bit or 20 bit words with each word representing the value of one sample.
Pulse width modulation (PWM) is simpler and quite different from PCM. (See Figure 4.) In a typical PWM system, the analog input signal is applied to a comparator whose reference voltage is a triangle-shaped waveform whose repetition rate is the sampling frequency. This simple block forms what is called an analog modulator. A simple way to understand the modulation process is to view the output with the input held steady at 0 V. The output forms a 50% duty cycle (50% high, 50% low) square wave. As long as there is no input, the output is a steady square wave. As soon as the input is non-zero, the output becomes a PWM waveform. That is, when the non-zero input is compared against the triangular reference voltage, it varies the length of time that the output is either high or low.
For example, say that a steady DC value is applied to the input. For all samples when the value of the triangle is less than the input value, the output stays high, and for all samples when it is greater than the input value, the output changes state and remains low. Therefore, if the triangle starts lower than the input value, the output goes high; at the next sample period, the triangle has increased in value but is still less than the input, so the output remains high. This continues until the triangle rises above the input value, at which point the output drops low and stays there while the triangle reaches its apex and starts down again. Eventually, the triangle voltage drops below the input value, and the output goes high again. The resulting PWM output, when averaged over time, gives the exact input voltage. If the output spends exactly 50% of the time with an output of 5 V and 50% of the time at 0 V, then the average output would be exactly 2.5 V.
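The averaging claim is easy to test numerically. The following sketch makes several assumptions for illustration only: the triangle reference swings between 0 V and 5 V, the comparator output is 5 V when the input exceeds the triangle and 0 V otherwise, and one triangle period is evaluated at 10,000 evenly spaced points. The names triangle and pwm_average are hypothetical.

```python
def triangle(phase: float, low_v: float, high_v: float) -> float:
    """Triangle wave for phase in [0, 1): rises during the first half, falls in the second."""
    ramp = 2 * phase if phase < 0.5 else 2 * (1 - phase)
    return low_v + (high_v - low_v) * ramp


def pwm_average(input_v: float, low_v: float = 0.0, high_v: float = 5.0, steps: int = 10000):
    """Return the duty cycle and time-averaged output over one triangle period."""
    high_count = 0
    for i in range(steps):
        reference = triangle((i + 0.5) / steps, low_v, high_v)
        if input_v > reference:   # comparator: high when the input exceeds the reference
            high_count += 1
    duty = high_count / steps
    average_v = low_v + (high_v - low_v) * duty
    return duty, average_v


for volts in (1.0, 2.5, 4.0):
    duty, average_v = pwm_average(volts)
    print(f"input {volts} V -> duty cycle {duty:.3f}, average output {average_v:.3f} V")
```

Under these assumptions, the average of the two-level output follows the DC input: a 2.5 V input yields a 50% duty cycle and a 2.5 V average.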
This is also an FM, or frequency-modulated, system: the varying pulse width translates into a varying frequency, and it is the core principle of most Class-D switching power amps. The analog input is converted into a variable pulse-width stream used to turn the output switching transistors on. The analog output voltage is simply the average of the on-times of the positive and negative outputs. Another way to look at this is that this simple device codes a single bit of information: a comparator is a 1 bit A/D converter. PWM is an example of a 1 bit A/D encoding system, and a 1 bit A/D encoder forms the heart of delta-sigma modulation.
Delta-sigma modulation
After nearly 30 years, delta-sigma modulation (sometimes sigma-delta) has only recently emerged as the most successful audio A/D converter technology. It waited patiently for the semiconductor industry to develop the technologies necessary to integrate analog and digital circuitry on the same chip. Today's high-speed mixed-signal IC processing allows the total integration of the circuit elements necessary to create delta-sigma data converters of awesome magnitude.
How the name came about is interesting. Another way to look at the action of the comparator is that the 1 bit information tells the output voltage which direction to go based upon what the input signal is doing. It looks at the input and compares it against its last sample to see if this new sample is bigger or smaller than the last one; that is the information transferred: bigger or smaller, increasing or decreasing. If it is bigger, then it tells the output to keep increasing, and if it is smaller, it tells the output to stop increasing and start decreasing. It reacts to the change. Mathematicians use Δ (delta) to stand for deviation or small incremental change, which is how this process came to be known as delta modulation. The sigma came about by the significant improvements made from summing or integrating the signal with the digital output before performing the delta modulation. Mathematicians use Σ (sigma) to stand for summing. Essentially a delta-sigma converter digitizes the audio signal with a very low resolution (1 bit) A/D converter at a high sampling rate. It is the oversampling rate and subsequent digital processing that separates this from plain delta modulation.
Considering quantizing noise, it is possible to calculate the theoretical sine wave signal-to-noise (S/N) ratio (actually the signal-to-error ratio, but for our purposes it is close enough to combine) of an A/D converter system knowing only n, the number of bits. Some math will show that the value of the added quantizing noise relative to a maximum (full-scale) input equals 6.02n + 1.76 dB for a sine wave. A perfect 16 bit system will have an S/N ratio of 98.1 dB, while a 1 bit delta-modulator A/D converter, on the other hand, will have only 7.78 dB.
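The formula is simple enough to tabulate. The snippet below just evaluates 6.02n + 1.76 for a few word lengths; the function name ideal_snr_db is chosen for this illustration.

```python
def ideal_snr_db(bits: int) -> float:
    """Theoretical full-scale sine wave S/N ratio of a perfect n-bit converter, in dB."""
    return 6.02 * bits + 1.76


for bits in (1, 16, 20, 24):
    print(f"{bits:2d} bits: {ideal_snr_db(bits):.2f} dB")
```

The 1 bit and 16 bit rows reproduce the 7.78 dB and 98.1 dB figures quoted above, and each added bit is worth about 6 dB.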
To get an intuitive feel for this, consider that because there is only 1 bit, the amount of quantization error possible is as much as half a bit. Because the converter must choose between the only two possibilities of maximum or minimum values, then the error can be as much as half of that. Further, because this quantization error shows up as added noise, then this reduces the S/N to something on the order of around 2:1 or 6 dB.
One attribute shines above all others for delta-sigma converters and makes them a superior audio converter: simplicity. The simplicity of 1 bit technology makes the conversion process fast, and a fast conversion allows use of extreme oversampling. Extreme oversampling pushes the quantizing noise and aliasing artifacts way out to mega-wiggle land, where they are easily dealt with by digital filters (typically 64x oversampling is used, resulting in a sampling frequency on the order of 3 MHz).
To understand how oversampling reduces audible quantization noise, we need to think in terms of noise power. From physics, you may remember that power is conserved: changed but never destroyed. Quantization noise power is similar. With oversampling, the quantization noise power is spread over a band that is as many times larger as is the rate of oversampling. For 64x oversampling, the noise power is spread over a band that is 64x larger, reducing its power density in the audio band to 1/64th. (See Figure 5.)
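Expressed in decibels, spreading a fixed noise power over a band N times wider lowers the in-band noise power by 10 log10(N). The sketch below evaluates that expression, assuming flat (white) quantization noise and no noise shaping; the function name inband_noise_reduction_db is made up for this example.

```python
import math


def inband_noise_reduction_db(oversampling_ratio: float) -> float:
    """In-band noise power reduction, in dB, from spreading flat noise over a wider band."""
    return 10 * math.log10(oversampling_ratio)


for ratio in (8, 16, 64):
    print(f"{ratio:2d}x oversampling: {inband_noise_reduction_db(ratio):.2f} dB less in-band noise")
```

At 64x, the reduction is about 18 dB, roughly what three extra bits would buy at about 6 dB per bit. Oversampling alone therefore does not close the gap between a 1 bit and a 16 bit converter.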
Noise shaping further reduces in-band noise. Oversampling pushes out the noise, but it does so uniformly; that is, the spectrum is still flat. Noise shaping changes that. Using clever complex algorithms and circuit tricks, noise shaping contours the noise so that it is reduced in the audible regions and increased in the inaudible regions. Conservation still holds; the total noise is the same, but the amount of noise present in the audio band is decreased while simultaneously increasing the noise out-of-band. Then, the digital filter eliminates it.
As shown in Figure 6, a delta-sigma converter consists of three parts: an analog modulator, a digital filter and a decimation circuit. The analog modulator is the 1 bit converter discussed previously with the change of integrating the analog signal before performing the delta modulation. The integral of the analog signal is encoded rather than the change in the analog signal as is the case with traditional delta modulation. Oversampling and noise shaping push and contour all the bad stuff (aliasing and quantizing noise) so that the digital filter suppresses it. The decimation circuit (decimator) is the digital circuitry that generates the correct output word length of 16 bits, 20 bits or 24 bits and restores the desired output sample frequency. It is a digital sample-rate reduction filter, and the process is sometimes termed downsampling because it returns the sample rate from its 64x rate to the normal CD rate of 44.1 kHz (or 48 kHz or even 96 kHz). The net result is greater resolution and dynamic range with increased S/N ratio and far less distortion compared to successive approximation techniques, all at lower costs.
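A toy software model shows the three parts working together. The sketch below is a textbook-style first-order modulator followed by the crudest possible filter and decimator (a plain average over each block of 64 bits). It is an assumption-laden illustration, not the circuit of Figure 6: real converters use higher-order modulators and far better digital filters, and the names delta_sigma_modulate and decimate are hypothetical. Inputs are assumed to lie between -1 and +1.

```python
def delta_sigma_modulate(samples):
    """First-order 1 bit modulator: integrate the input minus the fed-back output."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback           # sigma: sum up the difference (delta)
        bit = 1 if integrator >= 0 else 0    # comparator acting as a 1 bit A/D converter
        feedback = 1.0 if bit else -1.0      # 1 bit D/A converter in the feedback path
        bits.append(bit)
    return bits


def decimate(bits, ratio):
    """Average each block of `ratio` bits into one multi-bit sample in the range -1 to +1."""
    out = []
    for start in range(0, len(bits) - ratio + 1, ratio):
        block = bits[start:start + ratio]
        out.append(2 * sum(block) / ratio - 1)
    return out


oversampling = 64
dc_input = [0.25] * (oversampling * 16)   # a steady input, oversampled 64x
stream = delta_sigma_modulate(dc_input)
print(decimate(stream, oversampling))     # 16 output samples, each close to 0.25
```

The 1 bit stream spends more time high than low in proportion to the input, and averaging it at the lower output rate recovers a multi-bit value close to the original.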
One more note. Due to the oversampling and noise-shaping characteristics of delta-sigma A/D converters, certain measurements must use the appropriate bandwidth or inaccurate answers result. Specifications such as signal-to-noise, dynamic range and distortion are subject to misleading results if the wrong bandwidth is used. Because noise shaping purposely reduces audible noise by shifting the noise to inaudible higher frequencies, taking measurements over a bandwidth wider than 20 kHz results in answers that do not correlate with the listening experience. Therefore, it is important to set the correct measurement bandwidth to obtain meaningful data.
Dither
Now that oversampling helped get rid of the bad noise, let us add dither. Dither (from a 12th century English term meaning to tremble) means to be in a state of indecisive agitation, or to be nervously undecided in acting or doing. Dither is one of life's many tradeoffs. Here the tradeoff is between noise and resolution. We can introduce dither (a form of noise) and increase the ability to resolve small values, values, in fact, smaller than our smallest bit. Perhaps you can begin to grasp the concept by making an analogy between dither and anti-lock brakes. With regular brakes, if you just stomp on them, you probably create an unsafe skid situation for the car. Instead, if you rapidly tap the brakes, you control the stopping without skidding. We shall call this dithering the brakes. What you have done is introduce noise (tapping) to an otherwise rigidly binary (on or off) function. Therefore, by tapping on our analog signal, we can improve our ability to resolve it. By introducing noise, the converter rapidly switches between two quantization levels rather than picking one or the other when neither is correct. Sonically, this comes out as noise rather than a discrete level with error. Subjectively, what would have been perceived as distortion is now heard as noise.
The problem dither helps to solve is that of quantization error caused by the data converter being forced to choose one of two exact levels for each bit it resolves. It cannot choose between levels; it must pick one or the other. With 16 bit systems, the digitized waveform for high-frequency, low-signal levels looks very much like a steep staircase with few steps. An examination of the spectral analysis of this waveform reveals many nasty sounding distortion products. We can improve this result either by adding more bits or by adding dither. Prior to 1997, adding more bits for better resolution was straightforward but expensive, thereby making dither an inexpensive compromise.
The dither noise is added to the low-level signal before conversion. The mixed noise causes the small signal to jump around, which causes the converter to switch rapidly between levels rather than being forced to choose between two fixed values. Now, the digitized waveform still looks like a steep staircase, but each step, instead of being smooth, has many narrow strips, like vertical Venetian blinds. The spectral analysis of this waveform shows almost no distortion products at all, albeit with an increase in the noise content. The dither has caused the distortion products to be pushed out beyond audibility, and replaced with an increase in wideband noise. (See Figure 7.)
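The ability to resolve values smaller than one step can be demonstrated with a steady input of 0.3 of a quantizing step. The sketch below assumes uniformly distributed dither one step wide (plus or minus half a step), a simple round-to-nearest quantizer, and a fixed random seed so the run is repeatable; these are illustrative choices, and the name average_quantized is hypothetical.

```python
import math
import random


def average_quantized(value_steps: float, count: int, dither: bool, seed: int = 1) -> float:
    """Quantize a steady value `count` times and return the average of the results."""
    rng = random.Random(seed)
    total = 0
    for _ in range(count):
        noise = rng.uniform(-0.5, 0.5) if dither else 0.0  # dither: +/- half a step
        total += math.floor(value_steps + noise + 0.5)     # round to the nearest level
    return total / count


print(average_quantized(0.3, 100000, dither=False))  # 0.0: the small signal is lost
print(average_quantized(0.3, 100000, dither=True))   # close to 0.3: recovered, plus noise
```

Without dither, every sample rounds to the same level and the 0.3 step signal vanishes. With dither, the converter flips between the two neighboring levels in the right proportion, so the signal survives in the average at the cost of added noise.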
Life after 16
Current digital recording standards allow for only 16 bits, yet it is safe to say that for all practical purposes, 16 bit technology is history. Everyone who can afford the upgrade is using 20 bit and 24 bit data converters and (temporarily, until DVD-Audio becomes the new standard) dithering (as opposed to truncating) down to 16 bits. In going to 20 bits, you gain 24 dB more dynamic range, 24 dB less residual noise, 16:1 reduction in quantization distortion, and improved jitter (timing stability) performance. If it is 24 bits, add another 24 dB to each of the above and make it a 256:1 reduction in quantizing error with essentially zero jitter.
With today's technology, analog-to-digital-to-analog conversion is the element defining the sound of a piece of equipment, and if it is not done perfectly, then everything that follows is compromised. With 20 bit, high-resolution conversion, low signal-level detail is preserved. The improvement in fine detail shows up most noticeably by reducing the quantization errors of low-level signals. Under certain conditions, these coarse data steps can create audio passband harmonics not related to the input signal. This quantizing noise is known as granulation noise, and it is much more audible than normal analog distortion, but 20 bits virtually eliminates it. Commonly heard examples are musical fades, like reverb tails and cymbal decay. With only 16 bits to work with, they do not so much fade as collapse in noisy chunks.
Where it matters most is in measuring small things. It does not make much difference when measuring big things. If your ruler measures in whole inch increments and you are measuring something 10 feet (3 m) long, the most you can be off is by half an inch. Not a big deal, but if what you are measuring is less than an inch, you have an accuracy problem. This is exactly the problem in digitizing small audio signals. Graduating our audio digital ruler finer and finer means we can accurately resolve smaller and smaller signal levels, allowing us to capture the musical details. Getting the exact right answer does result in better reproduction of music.
Candy, James C. and Gabor C. Temes, eds. Oversampling Delta-Sigma Data Converters: Theory, Design, and Simulation (IEEE Press ISBN 0-87942-285-8, NY, 1992).
"Delta Sigma A/D Conversion Technique Overview," Application Note AN 10 (Crystal Semiconductor Corporation, TX, 1989).
Pohlmann, Ken C. Advanced Digital Audio (Sams ISBN 0-672-22768-1, IN, 1991).
Pohlmann, Ken C. Principles of Digital Audio, 3rd ed. (McGraw Hill ISBN 0-07-050469-5, NY, 1995).
Sheingold, Daniel H., ed. Analog-Digital Conversion Handbook, 3rd ed. (Prentice-Hall ISBN 0-13-032848-0, NJ, 1986).
"Sigma-Delta ADCs and DACs," pp. 20-1 to 20-18, 1993 Applications Reference Manual (Analog Devices, MA, 1993).
The American Heritage Dictionary of the English Language, 3rd ed. (Houghton Mifflin ISBN 0-395-44895-6, Boston, 1992).
Watkinson, John. The Art of Digital Audio, 2nd ed. (Focal Press ISBN 0-240-51320-7, Oxford, England, 1994).