Book review: The bible of communication acoustics
Mar 1, 1996 12:00 PM,
By Ted Uzzle
Harvey Fletcher, edited by Jont B. Allen, The ASA Edition of Speech and Hearing in Communication, published for the Acoustical Society of America by the American Institute of Physics, 1995, x + 34 + 487 pp., cloth, $38.
Some years ago the Acoustical Society of America undertook a program of revising, updating and republishing the out-of-print classic acoustics books. The most appealing thing about this program is the very reasonable pricing for these must-have editions. This program has gone from triumph to triumph, and you’ve seen many of these books reviewed here, bylines such as Beranek, Knudsen and Harris, Stevens and Davis, and von Békésy. They ain’t making them like that no more.
Now the ASA has republished the life’s work of Harvey Fletcher (1884-1981), who was, beyond a doubt, one of the greatest figures in the history of audio and acoustics.
Is it necessary to explain who Harvey Fletcher was? While a graduate student in the first decade of this century he worked with Robert Millikan to determine the charge of the electron. Millikan won the Nobel Prize for his oil-drop experiment; it was Fletcher who tried oil after water-drop experiments failed. He worked 33 years at the Bell Laboratories, 34 years at Brigham Young University and three years at Columbia University. His publications form a list 105 entries long, and more lost papers and articles are still turning up.
He invented stereophonic sound (Ampel, 1995). He worked out the mathematics of feedback stability (Uzzle, 1995). He developed equal-loudness contours. He invented the articulation index method of intelligibility prediction. He invented the hearing aid with gain, vacuum tubes in those days. Fletcher approved the Bell Laboratories research project that produced transistors. He was given the highest awards of and held office in the Acoustical Society, the Audio Engineering Society, the American Association for the Advancement of Science, the American Physical Society and more than we have room to print on this page.
In 1929 Fletcher’s book Speech and Hearing was published by D. Van Nostrand in New York. It was the result of work conducted at the AT&T Bell Laboratories since 1914. It covered the nature and characteristics of human speech and of speech perception. Long after Fletcher left Bell Laboratories, Krieger Publishing printed his Speech and Hearing in Communication. This new book was a thoroughly rewritten and updated version of the older one, relying on a quarter-century of new research by Fletcher and others. That was 1953; this book was reprinted in 1958, 1961, 1965 and 1972.
When the Acoustical Society chose Speech and Hearing in Communication to reprint in 1995, it invited Jont B. Allen of the AT&T Bell Laboratories to edit it and contribute material for the front and back of the book. Allen wound up chairing a session at the June 1995 Acoustical Society convention. This session reviewed Fletcher’s voluminous and diverse career, some of it ranging far beyond Speech and Hearing in Communication. The present book, slightly renamed The ASA Edition of Speech and Hearing in Communication, stands as the major statement of the extraordinary life’s work of an extraordinary man.
When Fancher Murray was a student of Harvey Fletcher’s at Brigham Young University in 1959, this book was used as the text, and Murray preserved a carefully assembled set of typographical corrections, most directly from Fletcher himself. Allen has incorporated these corrections, as well as others, in this new edition. In general, the body text and figures have been left alone, but the reader is directed to six pages of corrections and comments at the back of the book. Some of these very usefully convert to S.I. units, although not quite all of the beneficial conversions have been made, especially in the figures. Occasionally one finds a slight misspelling, but one can hardly complain in a book so demanding of the author and reader.
The book opens with Allen’s 34-page introduction, recapping Fletcher’s life and career briefly and his contributions to communications in some depth. Fletcher’s own research is described under three major headings: articulation, what we in the sound business call speech intelligibility; loudness, a subject Fletcher and Wilden A. Munson invented; and the critical band, which is vital to an understanding of the relationship of loudness to other perceptions, such as pitch, and to the ear mechanisms that create them.
The book begins with descriptions of the human speech and hearing mechanisms. Fletcher’s co-evolutionary understanding of these is highlighted in the first paragraph of the book’s introduction: “The processes of speaking and hearing are very intimately related, so much so I have often said that we speak with our ears. We can listen without speaking but cannot speak without listening. People who are born without hearing learn to talk only with the greatest difficulty, and none of them has yet succeeded in producing what most of us would call normal speech.”
Fletcher spends much time on the English language, and what he has to say will be surprising to many who haven’t explored linguistics at all. The Bell Laboratories studied 39 phonemes, or basic speech sounds, atoms, as it were, of English speech, classified them and identified their waveform characteristics. We see line spectra for each phoneme in and melodic graphs, frequency vs. time, for, “Joe took father’s shoe bench out” and “She was waiting at my lawn.” These celebrated sentences were used at the Bell Laboratories to demonstrate all the fundamental sounds that actually contribute to speech amplitude.
So what? Here’s what. Peak-to-average ratios in running American speech were identified with more care than we’ve seen before or since. This information has real impact on the design of speech communications systems, and components for them.
Fletcher studied speech peak-to-average ratios as what would be called Ln percentile levels in noise-control work. L10 is the sound-pressure level exceeded 10% of the time; L90 is the sound-pressure level exceeded 90% of the time. He took the long-term average level of speech and found it was exceeded by 21 dB 0.1% of the time, one-thousandth of the time. It was exceeded by 18 dB to 21 dB 1.7% of the time. It was exceeded by 15 dB to 18 dB 5.1% of the time. Thus, American speech has peaks 15 dB or more above average 6.9% of the time. You can appreciate this wealth of new information about speech characteristics not seen anywhere else.
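For readers who want to check the arithmetic, the 6.9% figure is simply the running sum of the three bands quoted above. The short Python sketch below is only a bookkeeping illustration and not Fletcher's method; the names `bands` and `exceedance_table` are made up for this example, and the only data in it are the three percentages cited in this review.

```python
# Percent of the time the speech level falls in each band above the
# long-term average, as quoted in the review: (lower_dB, upper_dB, percent).
# upper_dB = None means "no upper limit". Names here are illustrative only.
bands = [
    (21.0, None, 0.1),
    (18.0, 21.0, 1.7),
    (15.0, 18.0, 5.1),
]

def exceedance_table(bands):
    """Return (threshold_dB, cumulative_percent) pairs, highest threshold first.

    cumulative_percent is the share of time the level is at least
    threshold_dB above the long-term average.
    """
    table = []
    running = 0.0
    for lower, _upper, pct in sorted(bands, key=lambda b: b[0], reverse=True):
        running += pct
        table.append((lower, round(running, 1)))
    return table

table = exceedance_table(bands)
for threshold, pct in table:
    print(f"average + {threshold:g} dB or more: {pct}% of the time")
```

Read from the bottom row up, the table says a system that clips at 15 dB above the long-term average would be clipping during 6.9% of the measured intervals, while one with 21 dB of headroom would clip during only 0.1% of them.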
The methodology of this experiment is a little obscure. Fletcher divided speech into 125 ms increments, so there would be eight per second. Why? Or, more to the point, why not 100 ms increments, a rounder number seemingly easier from which to extrapolate? Fletcher doesn’t explain or justify his techniques, but of course he describes them in detail.
Turn the page and Fletcher is off chasing other, equally fascinating information. What is the occurrence of spoken phonemes and spoken words over the telephone, and how does it differ from their occurrence in written English? Well, in the late 1920s telephone researchers listened to 500 toll calls to the New York area. One week they recorded only the verbs, the next week only the nouns. They didn’t count “er,” “yeah,” “uh-huh,” “oh,” “all right,” “hello,” “good-bye” or profanity.
Guess what? Our spoken word-hoard is quite different from the one we use in writing. The most frequently spoken words, “I” and “you,” first and second, appear only as tenth and fifteenth in written occurrence. In speaking, “get,” “see,” “know,” “don’t,” “do,” “want,” “go” and “tell” all appear in the top 27, but they all appear below 50 in written English.
As we would expect, classical psychoacoustics is not shortchanged, and chapters are devoted to minimum perceptible changes in pressure level (what modern cognitive psychologists would call JNDs, or just-noticeable differences), to masking and to loudness.
Two chapters are of special interest to sound professionals, those on binaural hearing effects and auditory perspective, what we today would call stereophony.
Fletcher next turns to the space-time pattern theory of hearing, a grand theory of auditory perception he developed in the 1920s. It is bedeviled by the question of how pitch discrimination in the cochlea and brain interact. Judgment on this one we shall leave to the historians of the far future. I will only shake my finger again at audio types who persistently and willfully misunderstand modern research. The cochlea is not a piano keyboard, with pitch perception located along it, high pitches at one end and low pitches at the other. Nor is it an in-line filter bank. Decades of auditory research have definitively overturned this older, oversimplified, erroneous view.
The remainder of the book is devoted to speech intelligibility. If pitch perception is troublesome, this is a war zone. Fletcher rightly returns us to first principles and shows variables discovered by empirical research from 1919 to 1945, starting at the Bell Laboratories and then undertaken at Harvard to improve wartime aeronautical radio communications.
If you work with the capture, recording, reproduction or reinforcement of human speech, you need to read and refer to The ASA Edition of Speech and Hearing in Communication. We’re lucky the Acoustical Society has reprinted it in such a carefully corrected and expanded edition, and we’re lucky it’s available at so modest a price. You must be warned of two things: This book contains calculus, but truckloads can be mined from it even ignoring the math. Also, it is written in the jargon of an earlier age, and that can make it tricky to relate this work to contemporary research. But this book is absolutely essential.
References
Ampel, Fred, and Ted Uzzle, “Multichannel Auditory Perspectives: A Historical View of Harvey Fletcher’s Forgotten Contributions and Their Ramifications,” read at the Acoustical Society convention, June 1995.
Uzzle, Ted, “The Feedback Research of Harvey Fletcher,” S&VC, Vol. 13 No. 7, pp. 64-66, July 20, 1995.