USF Health hosted the 2024 Voice AI Symposium, providing a unique opportunity to hear from voice and AI experts on the innovations, standards, and direction for using voice as a biomarker for health.
The event is based on one of the four data generation projects funded in 2022 by NIH’s Bridge2AI that USF Health is co-leading with Weill Cornell Medicine.
The 2024 Voice AI Symposium in Tampa on May 1 and 2 gathered voice data and AI experts from around the world, including tech experts from Google, AI experts from institutes and universities, local and global entrepreneurs and startups, and medical and clinical professionals.
The symposium opened with a packed ballroom at the JW Marriott in downtown Tampa. Yaël Bensoussan, MD, assistant professor in the USF Health Morsani College of Medicine and co-PI on the NIH project Bridge2AI: Voice as a Biomarker for Health, set the stage and expectations for the two days of information and networking.
“You are the ones leading this effort,” Dr. Bensoussan said at the opening session for the symposium.
“It’s really a multi-directional transfer of knowledge and expertise between all of us,” she said. “We want to create a community of stakeholders. Our other main goal is to promote collaboration, to make connections, because that’s what will really change the future of voice to maximize the impact of voice AI discovery and patient outcomes.”
Highlights across the two-day conference included team challenges, in which real patients told their stories and shared their struggles while attendees worked together in small groups to offer ideas for solutions; a voice AI tech fair featuring innovative startup companies; disease case studies; vibrant discussions aimed at setting defined standards to avoid global duplication of effort; and a scientific poster competition.
And a series of smaller workshops offered deeper dives into key topics, including standardization practices, hardware and software considerations for collecting and storing voice data, the current state of options for people with voice disorders, and a primer on the physiology of our voice and signal processing.
What they said:
- “I would say it's really important to have very diverse datasets. You need to have a diverse dataset to train from and you would need to have a diverse dataset of conditions to show that you can parse through all these conditions, particularly the ones you are after. I'm not sure that clinicians necessarily need to have something that can diagnose a particular condition. If you have a heart monitor, for example, you're not diagnosing anything. You're picking up a particular measure that is helpful for the clinician. I would see the next step from the things we've learned in the past six to seven years as being able to be very, very good at one measurement, one vocal biomarker measurement. People speak differently, people have different accents, people have different languages and to be able just to measure one thing very well, one feature that is meaningful for conditions, that would be the next step.” Dale Joachim, VP of Speech and Data Science, Sonde Health
- “In recent years I’ve been focusing on what we can do to massively expand our support of different languages because there are, depending on which linguist you talk to, 7000 active languages around the planet, which means that English is just a starting point. Reaching critical mass in English is a very sound strategy and then developing a game plan for how to expand. There will be a lot of elements of voice characteristics that maybe don't require so much new data to apply that same diagnostic model or content to do something else in another language. So first, use English as sort of the development language understanding the boundaries of what you can and can't do, and have a game plan for how to expand beyond that.” Bob MacDonald, Technical Program Manager, Google
- “The on-device analysis is the future for this work. It really does overcome a lot of the privacy issues and it parallels with signal analysis by some companies as well. From our own perspective, we’re excited about the wider adoption of speech and voice in larger genomic studies.” Adam Vogel, Professor and Director, Centre for Neurosciences of Speech, University of Melbourne
- “One of the most exciting things over the past year is seeing how the health care community as a whole, and this means everything from providers and academic medical centers and researchers and some of the federal agencies and patient advocacy groups, have all come together around this concept of trustworthiness and responsible AI. I have been pleasantly surprised to see people saying we know that there will likely be something happening from a policy perspective but we need to address this now because we need to make sure we understand how this is being used in our clinics and how it impacts patients.” Geralyn Miller, Senior Director of Health AI at Microsoft
- “Learning different ways we can communicate through technology will be incredibly important especially for those that have lost their voice due to too many surgeries or reconstruction of their tracheas or their voice box is just no longer functioning. So I think for me personally to know that there are options and that next year or five years down the road that I'll still be able to communicate, not just from my face, because I think that does a very good job, but using my voice in itself.” Breanne Leuze, Speaker, CareerFoundry
- “One of the biggest hurdles in the last few decades has been our use of mobile digital health apps because many of those are focused on one condition rather than thinking about the opportunity to collect a little bit more data on that individual so we're not just looking at final condition...I'm so excited to hear that there's data being captured along with voice, along with speech, of things like MRI and CT scans, in order to corroborate the evidence that we have with the voice. We can't get personalized in terms of personalized medicine until we have the right amount of data in order to see those patterns… AI is hungry for data. If you give it very little data, it's not going to do a very good job to see the kinds of things that we want. You have to have enough data and enough diverse data in order to see those patterns, not just for some populations but across populations… Why is this such a big area? We've seen the numbers already from voice biomarkers – 2.6 billion today and going up to 9.4 billion by 2034 – for speech biomarkers there’s a little bit more of a lag. These are huge annual growth rates. The reason is because we have now in our hands, in our pockets or in our purses a device that can collect data of our speech and voice. We have in our homes things that listen to us, smart speakers. Smart speakers were one of the fastest penetrating technologies in the last few decades. While the iPhone before that was thought to be the quickest adopted technology, smart home devices were even faster, meaning that almost everyone had not just one but multiple of these devices in their homes, allowing us to talk to the system.” Rupal Patel, PhD, founder and director of the Communication Analysis and Design Laboratory
- “The burden of privacy protection is going to be costly. It's hard to get people enthused about spending a lot on privacy. The problem with our 1970s privacy protections is that we think consent empowered us, but it disempowered us as research subjects. It got us to answer questions that other people have decided for us. Someone else decided what the data uses would be and what the privacy protections are, and then we get a yes/no vote: I want in, I want out. All we want is control. We want a say in what data uses are going to be, and what privacy protections we get. It's more like unionizing the data subjects so that we say we want these privacy protections to take our data or leave it, instead of telling us to take it or leave it. So I'm in favor of a much more democratized system that gives more control to the people whose data are in our systems.” Barbara Evans, Professor of Law and Stephen C. O'Connell Chair, University of Florida Levin College of Law, and a Renwick Faculty Fellow in AI & Ethics, University of Florida
- “Individual consent divides groups of people and it also does not have the understanding, at least in the United States, that this is supposed to be a rolling agreement. Right now, it's considered a contract, and a one-and-done contract at that: you signed on the dotted line and then you don't have any more input. And it's going to get worse with AI. In the past, in the 70s, you could pull your data out or, if you die, your data would automatically be pulled out. Those things are not true anymore.” Joseph Yracheta, Board Vice President, Native BioData Consortium
- “As we think more about using biomarkers, using voice as biomarkers, how do we also think about trusting people's own experience in comparison to what the markers may tell us?… How do we actually balance whether we give more epistemic trust to certain things like biomarkers, voice biomarkers, compared to people's own testimony… So in trusting how we use data and trusting people's own testimony, we're not only listing all the quantifiable aspects or the biomarkers but also listening to how people actually think about their own narrative. When we think about voice, think about people's own voice in determining what their experience might be.” Anita Ho, bioethicist at the University of British Columbia and the University of California, San Francisco, and the Northern California Vice President of Ethics for CommonSpirit Health
- “We have to think of voice AI, voice as a biomarker, in the same context as the HIPAA laws, the oath to do no harm, the commitment to duty of care. We have to think about it and put around it the guidelines, the best practices, and how we educate and advocate. Opportunities to meet those who do direct care and learning from you, learning what you’re doing, is learning how to synthesize all of that information into best practices to help people understand and help people advance the trustworthiness of the technology.” Oita Coleman, Senior Advisor, Open Voice Network
- “At the frontline of medical care where you see these problems, the voice disorder side is just one aspect of these biomarkers…You have people where early detection would really benefit their treatment outcome, so people with voice changes resulting from cancer, which is much better if you detect early rather than later. Increasing the speed of diagnosis and early diagnosis is the key element across most of medicine to improve care. I also think being aggressive or more quantitative and rigorous about treatment outcomes helps you compare the adequacy of the different things we do. With voice disorders, sometimes it’s a bit qualitative, which is still important, but having more rigorous biomarker assessment would probably allow for more quantitative comparison between different treatments.” Alexander Gelbard, MD, board-certified Otolaryngologist and tenured professor at Vanderbilt University
- “One of the biggest innovations in voice data is trying to bring the voice outside the clinical world… One way to achieve this, to make sure it's done in a safe and secure manner, is to have an intermediate step to bring this as a new digital clinical endpoint into your research. From my standpoint it's the only way to develop secure biomarkers…To achieve this objective I think we need to make sure that the data quality is high enough. We should not neglect the interaction with the end users.” Guy Fagherazzi, Head of the Deep Digital Phenotyping Research Unit at the Luxembourg Institute of Health
- “I do see the value of using voice as a biomarker for health and I would even argue that voice should be a vital sign. There's no difference between the blood pressure, temperature and heart rate you get from your primary care physician and what voice can provide us – have them actually measure your voice… Just like they place a monitor in your mouth to take your temperature, there should be a voice recording – maybe only 10 seconds – something that is traceable that will actually put voice to the forefront and will allow us to track it across a wide variety of conditions and not just voice disorders.” Amir Lahav, Strategic Advisor, Clinical Neuroscientist, and Global Thought Leader
- “We spoke a lot about the greatness of how the voice and the bio side of biomarkers help physicians diagnose a patient, but I want to take it a step further and say that it's not just about the AI, because we're kind of just seeing that today – hey, summarize this patient document for me. It's the next steps. I’d like to focus a little bit past the AI and more on the execution of the AI, on the execution of the actual diagnosis or the summarization. That, OK fine, we detected a bad cough or your voice is a little bit higher than normal so you should do this, but now add in that we should contact and bring in a person. I know that we are so deep in the weeds of AI technology, but are we also adding that there’s another tool for doctors?” Satrajit Ghosh, Director of the Open Data in Neuroscience Initiative, McGovern Institute at MIT
- “The role of AI is a way for humans to make decisions or take actions that improves speed or quality, or both. So maybe that's something we need to think of a little bit more if we're collecting the data: to explain to people what we're really doing, what is the role that we want AI to play in the health care arena. I think people will be perhaps a little more responsive to that in terms of understanding the overall intent.” Charlie Reavis, President of Dysphonia International
Photos and video by Freddie Coleman and Ryan Rossy, USF Health Communications