Voiceprint analyses
Voiceprint analysis involves creating visual representations of an individual's voice through biometric measurements, helping to identify speakers in recorded conversations. Each person's voice is unique, influenced by anatomical features such as vocal cords, mouth shape, and speech patterns. This analysis is done using a device called a sound spectrograph, which produces a voiceprint considered to be as distinctive as a fingerprint. Law enforcement uses this technology to assist in investigations, although its admissibility in U.S. courts varies by state due to ongoing debates about accuracy and reliability.
Historically, voiceprint analysis began in the 1940s but gained traction in the 1960s when researchers explored its potential for identifying callers behind anonymous threats. While some studies, notably by engineer Lawrence Kersta, suggested high accuracy rates, subsequent investigations have reported mixed results. Concerns persist regarding the technology's capability to account for changes in an individual’s voice over time or the potential for voice disguise. For valid analysis, conditions such as recording quality and adequate sample duration must be met, and trained technicians are essential to ensure accurate comparisons. Despite its challenges, voiceprint analysis remains a significant tool within the forensic landscape.
DEFINITION: Visual representations of individuals’ voices based on biometric measurements.
SIGNIFICANCE: Law-enforcement investigators use voiceprint analysis to determine the likelihood that particular individuals are the persons speaking in recorded conversations. In the United States, the admissibility of voiceprints as evidence varies from state to state.
The principle underlying the use of voiceprints, also called sound spectrograms, for identification is that each individual has a unique voice; that is, each person’s voice has particular characteristics that allow the voice to be distinguished from every other voice. Each person’s voice is affected by the size and shape of the person’s vocal cords, mouth, throat, teeth, and nasal cavity. In addition, voice uniqueness derives from the movement of the tongue, lips, and jaw muscles during speech. When a speaker’s voice is analyzed by an instrument known as a sound spectrograph, which maps the voice onto a graph to produce a visual representation, the resulting voiceprint is, according to proponents of the technology, as unique as a fingerprint. This technology is one of several that have been used to authenticate the claim that Osama bin Laden is the speaker on audiotapes released by al-Qaeda.
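The mapping of a voice onto a graph can be illustrated with a short-time Fourier transform, the usual digital counterpart of a sound spectrograph. The following Python sketch is illustrative only and is not a forensic tool: the function name `spectrogram`, the frame and hop sizes, and the synthetic 440 Hz test tone are all assumptions introduced here, and a real analysis would use recorded speech.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_len=256, hop=128):
    """Return (times, freqs, magnitudes) for a short-time Fourier transform.

    frame_len and hop are illustrative defaults, not forensic standards.
    """
    window = np.hanning(frame_len)  # taper each frame to reduce spectral leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One row per time frame, one column per frequency bin.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return times, freqs, magnitudes

# Synthetic stand-in for a voice recording: one second of a 440 Hz tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

times, freqs, mags = spectrogram(tone, sample_rate)

# The strongest frequency bin, averaged over time, should sit near 440 Hz.
peak_hz = freqs[mags.mean(axis=0).argmax()]
print(f"frames: {mags.shape[0]}, frequency bins: {mags.shape[1]}, peak near {peak_hz} Hz")
```

Plotting `mags` with time on one axis and frequency on the other gives the kind of picture an examiner compares visually. Human speech produces far more complex patterns than a single tone.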
History
In 1941, Bell Telephone Laboratories in New Jersey developed the sound spectrograph, a device that analyzes sound frequencies and wavelengths and creates visual records of sounds in the form of graphs. Although intelligence agencies were interested in the technology as a way to identify enemy agents from recorded telephone conversations, progress toward the identification of individual speakers was slow until the early 1960s. At that time, police in New York City became interested in using voice analysis to assist in identifying a caller who was repeatedly phoning in bomb threats to airlines. They asked Lawrence Kersta, a Bell Labs engineer, to determine whether a comparison of sound spectrograms, which Kersta later called voiceprints, could be used to identify a suspect positively as the caller. Kersta experimented with visual pattern matching of voiceprints and concluded that when an unknown voiceprint was compared with that of a known speaker, the likelihood of a match could be determined with more than 99 percent accuracy.
Kersta’s results were not universally accepted, and other researchers found a lower degree of accuracy, but in the early 1970s, law-enforcement agencies began trying to enter voiceprints into evidence in criminal cases. Some courts accepted voiceprint evidence; others threw it out on the grounds that the technology had not been adequately proven. Although the American Board of Recorded Evidence, an advisory board of the American College of Forensic Examiners, published standards in 1997 for the comparison of voice samples and certifies speaker identification examiners, voiceprint evidence is not uniformly admissible in American courts.
Controversies Surrounding Voiceprint Evidence
From the beginning, voiceprint evidence has been a topic of controversy. Although Kersta claimed almost perfect accuracy in identifying speakers, his experiments were performed under ideal conditions with high school girls as subjects. Other experimenters, working under different, less ideal conditions, reported lower accuracy rates. Currently, the results of voiceprint comparisons are classified, based on the numbers of similarities in the samples, as positive identification, probable identification, positive elimination, probable elimination, or unable to determine. In a study of two thousand forensic voiceprints, the Federal Bureau of Investigation (FBI) found 0.31 percent false identifications and 0.53 percent false eliminations.
Among the questions that have plagued voiceprint evidence are whether voiceprints of the same person change over time and whether a voice can be disguised to fool the spectrograph. Studies have shown quite conclusively that although a person’s voice may sound different to listeners as the person ages, the frequency and wavelength of the sound remain essentially unchanged. Disguising or distorting the voice, however, can make voiceprint comparison invalid. A trained examiner will recognize that one voice sample has been artificially altered, and this may force an “unable to determine” finding. Courts have ruled that it is not a violation of suspects’ rights to compel them to provide acceptable voice samples.
Standards and Training
Certain conditions must be met for the results of voiceprint comparisons to be considered valid. Several minutes of speech from both the known speaker and the unknown speaker must be available for analysis. Ideally, the samples should contain many of the same words and phrases. The style of speech in the samples must be similar—for example, one cannot be shouted and the other whispered. Relaxed, normal conversation produces the most accurate results. The quality of both recordings must be good (for instance, clear and free of excessive background noise). In addition, the analyst must be a trained voiceprint technician. The analyst should make both a visual comparison of the voiceprints and an auditory comparison of the samples to listen for vocal tics, phrasing, and accent similarities and differences.
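The conditions above amount to a checklist that can be applied before any comparison is attempted. The Python sketch below is a hypothetical illustration, not a published standard: the `VoiceSample` fields, the `comparison_problems` function, and the 120-second minimum (a stand-in for "several minutes") are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical stand-in for "several minutes of speech"; not an official figure.
MIN_SECONDS = 120.0

@dataclass
class VoiceSample:
    duration_seconds: float
    speech_style: str       # e.g. "conversational", "shouted", "whispered"
    clear_recording: bool   # True if clear and free of excessive background noise

def comparison_problems(known, unknown, min_seconds=MIN_SECONDS):
    """List the reasons a known/unknown pair fails the basic conditions."""
    problems = []
    for label, sample in (("known", known), ("unknown", unknown)):
        if sample.duration_seconds < min_seconds:
            problems.append(f"{label} sample is shorter than {min_seconds:.0f} seconds")
        if not sample.clear_recording:
            problems.append(f"{label} recording quality is poor")
    if known.speech_style != unknown.speech_style:
        problems.append("speech styles differ")
    return problems

known = VoiceSample(300.0, "conversational", True)
unknown = VoiceSample(45.0, "whispered", False)

for problem in comparison_problems(known, unknown):
    print(problem)
```

An empty list means only that the samples pass these basic checks. The comparison itself still requires a trained technician making both visual and auditory comparisons.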
Minimum training for a voiceprint technician involves completing a two- to four-week course, performing a minimum of one hundred voice comparisons under the direct supervision of an expert, and passing an examination given by experts in the field. Voiceprint technicians who serve as expert witnesses often have additional training, including academic research in forensic linguistics or forensic phonetics. The International Association of Forensic Linguists and the International Association of Forensic Phonetics and Acoustics publish the International Journal of Speech, Language and the Law, which presents research findings and reports on legal cases involving speaker identification through voice samples.
Bibliography
Dornman, Andy. “Biometrics Becomes a Commodity.” IT Architect, February 1, 2006, 46.
“Forensic Science, No Consensus.” Issues in Science and Technology 20 (Winter, 2004): 5–9.
Hollien, Harry. Forensic Voice Identification. San Diego, Calif.: Academic Press, 2002.
James, Stuart H., and Jon J. Nordby, eds. Forensic Science: An Introduction to Scientific and Investigative Techniques. 4th ed., CRC Press, 2014.
Kalat, David. "Nervous System: From a Cry in the Dark to the Forensic Voiceprint." BRG, 4 May 2022, www.thinkbrg.com/insights/publications/kalat-ns-forensic-voiceprint/. Accessed 19 Aug. 2024.
Moore, Sarah. "Voice Analysis in Forensics." AZO Life Sciences, 17 Dec. 2021, www.azolifesciences.com/article/Voice-Analysis-in-Forensics.aspx. Accessed 19 Aug. 2024.
Tanner, Dennis C. Medical-Legal and Forensic Aspects of Communication Disorders, Voice Prints, and Speaker Profiling. Tucson, Ariz.: Lawyers & Judges Publishing, 2007.