Beyond words, volumes may be learned from the human voice: Is a cry for help genuine or fake? Does a person unknowingly suffer from a vocal cord disorder? What is the relationship between spouses, or the compatibility of a therapist and patient? BIU computer scientist Dr. Joseph Keshet, a foremost authority on voice analysis, has developed sophisticated systems and software to address such questions and tap the vast potential in this exciting frontier field.
“When I started exploring speech recognition, I realized that an opportunity was being missed — not enough academic research was being done to keep pace with this rapidly industrialized field,” recalls Dr. Joseph Keshet. Since then the BIU computer scientist has made some impressive strides which are generating interest worldwide. He uses novel voice analysis techniques to diagnose vocal cord disorders, and to extract personal and physical information about the speaker, as well as details about his environment.
“There are many parameters in speech that are subconscious and uncontrollable, such as the duration of the ‘stop consonants’ pronounced by the speaker,” he explains, referring to the T, D, B, P, G, and K sounds in American English, which are formed by completely stopping the flow of air and then releasing it. “When I wish to communicate with someone, I tend to match the length of my consonants to that of my partner, who does the same, so we are synchronized.” He also notes the tendency to match accents and styles with speaking partners. “We take advantage of this phenomenon to measure the quality of the potential connection between the speakers, for example, the relationship between a therapist and patient.”
Keshet and his team base their research on the analysis and processing of all existing international voice databases. “The parameters that we’re developing for speech analysis can work in any language and under severe noise conditions,” he stresses. For this big data challenge, he employs machines with powerful computational capacities, a vast improvement over previous research methods, when manual linguistic analysis would take some 3,000 hours! “Currently we are working on ten times the data in just a few minutes.”
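For readers who want a feel for how the consonant-matching effect described above might be put into numbers, here is a minimal sketch in Python. It is not the method used in Keshet’s lab, which the article does not detail. The function names, the idea of comparing the first and second halves of a conversation, and the duration values are all assumptions made for illustration.

```python
from statistics import mean


def mean_gap(a, b):
    """Average absolute difference between paired durations (ms)."""
    return mean(abs(x - y) for x, y in zip(a, b))


def convergence_score(speaker_a, speaker_b):
    """Positive when the two speakers' durations grow closer over time.

    Compares the average gap in the first half of the conversation
    with the average gap in the second half (an illustrative measure,
    not a published one).
    """
    if len(speaker_a) != len(speaker_b) or len(speaker_a) < 2:
        raise ValueError("need two equal-length series with at least 2 turns")
    half = len(speaker_a) // 2
    early = mean_gap(speaker_a[:half], speaker_b[:half])
    late = mean_gap(speaker_a[half:], speaker_b[half:])
    return early - late


# Made-up stop-consonant durations in milliseconds, one value per turn.
therapist = [80, 78, 74, 70, 68, 66]
patient = [50, 54, 58, 62, 64, 65]
print(convergence_score(therapist, patient))
```

A score above zero would suggest that the speakers drifted toward each other as the conversation went on; a score near or below zero would suggest they did not.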
Diagnosing Disease from the Human Voice
“The computer receives a very large set of voice samples from patients with particular illnesses. The machine statistically analyzes the speech signals within their context in order to learn about the characteristics of those illnesses,” says Keshet. “Through ‘machine learning’, we are provided with a very accurate diagnosis of the specific disease and the precise clinical condition of the damaged vocal cords.”
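The learning step Keshet describes can be illustrated with a toy classifier. The sketch below uses a nearest-centroid rule, chosen here only because it is short; the article does not say which algorithm the lab uses. The two-number feature vectors, the labels, and the function names are invented for the example and are not real patient data.

```python
from math import dist
from statistics import mean


def train_centroids(samples):
    """Average the feature vectors of each label.

    samples: list of (feature_vector, label) pairs.
    Returns a dict mapping each label to its centroid.
    """
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Invented two-number feature vectors, for illustration only.
training = [
    ((0.4, 0.3), "healthy"),
    ((0.5, 0.2), "healthy"),
    ((1.8, 1.5), "paralysis"),
    ((2.0, 1.7), "paralysis"),
]
centroids = train_centroids(training)
print(predict(centroids, (1.9, 1.4)))
```

A real system would learn from far more recordings and far richer measurements of the speech signal, but the principle is the same: labeled examples go in, and the machine learns what separates one condition from another.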
The systems and algorithms devised by Keshet’s team, in cooperation with Dr. Jacob T. Cohen of Rambam Medical Center’s Head and Neck Surgery Department, can noninvasively diagnose cases of paralysis of the vocal cords and, soon, papilloma, polyps (benign tumors), and cancer. The system can be used at any clinic and, in theory, can be activated with a mobile phone application. Says Keshet, “Until now the diagnosis was done by inserting a laryngoscope into the throat. Now it’s possible to identify the disease and its severity via a voice recording.”
Voice Analysis in the Service of Security and Interpersonal Relations
Advanced voice analysis capabilities are also enlisted to maintain security and to combat terrorism. Software developed in Keshet’s lab, which is being evaluated at various US universities, has improved upon the software of the US Department of Homeland Security in gathering physical information about a speaker. “We were able to reach a more accurate resolution of weight and height,” relates the Israeli researcher. In addition, algorithms developed in Keshet’s lab can reveal details about the speaker’s environment, e.g., whether there is a concrete floor or even a fan.
In another study, which analyzed the voices of a large number of English speakers in the United States, researchers were able to accurately identify the parents’ country of origin for children who were born and raised in the US and had never visited that country. “We can detect the father’s Portuguese accent in the child’s voice,” says Keshet. The researchers were able to repeat this experiment successfully in German, Portuguese and Spanish.
Voice and speech obviously play an important role in both professional and personal relationships. Together with Dr. Dana Atzil-Slonim and Dr. Eran Bar-Kalifa of BIU’s Department of Psychology, Keshet analyzes the voice and compatibility of Israeli psychologists and patients in consensually recorded conversations in clinics, in order to gauge the quality of the interaction and level of trust between therapist and patient. In another joint study, researchers analyze the level of support, reciprocity, and intimacy between spouses based on the analysis of their voice samples.
Clearly then, it’s not just what you say, but how you say it. And judging from the positive vibes his pioneering voice analysis research is generating, chances are good that we have not heard the last word from BIU’s noted Dr. Joseph Keshet.