AI Headphones Isolate Speaker in Crowds

By Mark Howell | 29 May 2024 | 4 min read
Today in Edworking News we want to talk about Engineering

AI Headphones Let Wearer Listen to a Single Person in a Crowd, by Looking at Them Just Once

UW News
Noise-canceling headphones have drastically improved at creating an auditory blank slate. However, allowing certain sounds from a user's environment to filter through this silence remains a challenge for researchers. For instance, the latest edition of Apple's AirPods Pro adjusts sound levels for users automatically, sensing when they are in conversation, but users have little control over whom to listen to or when this occurs.
A team from the University of Washington has developed an artificial intelligence system that lets a user wearing headphones "enroll" a speaker by looking at them for three to five seconds. The system, called "Target Speech Hearing" (TSH), then cancels all other sounds in the environment and plays only the enrolled speaker's voice in real time, even as the listener moves around in noisy places and no longer faces the speaker.

University of Washington's AI-powered headphones allow users to focus on one speaker even in noisy environments.
The research team presented its findings on May 14 in Honolulu at the ACM CHI Conference on Human Factors in Computing Systems. The code for the proof-of-concept device is available for other developers to build on, but the system is not commercially available.

How the System Works

To use the system, one needs off-the-shelf headphones fitted with microphones. By tapping a button while directing their head at someone talking, the user enrolls the speaker's voice. The sound waves from that speaker should reach the microphones on both sides of the headset simultaneously, within a 16-degree margin of error. These sound signals are sent to an on-board embedded computer where the team’s machine learning software learns the vocal patterns of the desired speaker.
The ability of the system to focus on the enrolled voice improves as the speaker continues to talk, providing the system with more training data.
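
To make the 16-degree margin concrete, the sketch below estimates how large an arrival-time difference between the left and right microphones that angle corresponds to. It is a rough illustration and not the team's code. It assumes a simple far-field model, a hypothetical microphone spacing of 0.18 m, and that the margin means up to 16 degrees off straight ahead. The function names are made up for this example.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at about 20 °C
MIC_SPACING_M = 0.18        # ASSUMPTION: left-to-right microphone distance, not from the article


def arrival_delay_s(angle_deg: float, mic_spacing_m: float = MIC_SPACING_M) -> float:
    """Far-field arrival-time difference between the two microphones.

    angle_deg is the speaker's direction measured from straight ahead;
    0 degrees means the sound reaches both microphones at the same time.
    """
    return mic_spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND_M_S


def within_margin(delay_s: float, margin_deg: float = 16.0,
                  mic_spacing_m: float = MIC_SPACING_M) -> bool:
    """True if a measured delay is consistent with a speaker inside the margin."""
    return abs(delay_s) <= arrival_delay_s(margin_deg, mic_spacing_m)


if __name__ == "__main__":
    max_delay = arrival_delay_s(16.0)
    print(f"Largest delay inside the margin: {max_delay * 1e6:.0f} microseconds")
    print("Speaker 10 degrees off center accepted:", within_margin(arrival_delay_s(10.0)))
    print("Speaker 30 degrees off center accepted:", within_margin(arrival_delay_s(30.0)))
```

Under these assumptions the margin works out to a delay of roughly 145 microseconds, which gives a sense of how closely the two microphone signals have to line up during enrollment.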

Testing and Performance

The team tested the system on 21 subjects, who on average rated the clarity of the enrolled speaker's voice nearly twice as high as that of the unfiltered audio. The TSH system builds on the team's previous "semantic hearing" research, which allowed users to select sound classes, such as birds or voices, to hear while canceling other environmental sounds.
Currently, the system can enroll only one speaker at a time. It is designed to enroll a speaker as long as there is no other loud voice from the same direction. Users who are not satisfied with the sound quality can re-enroll the speaker to improve clarity.
Future efforts are directed at expanding the system to earbuds and hearing aids.

Research Team and Funding

The senior author of the project, Shyam Gollakota, is a professor at the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Co-authors include doctoral students Bandhav Veluri, Malek Itani, and Tuochao Chen, and Takuya Yoshioka, director of research at AssemblyAI. This study was funded by the Moore Inventor Fellow award, the Thomas J. Cable Endowed Professorship, and a UW CoMotion Innovation Gap Fund.

Remember these 3 key ideas for your startup:

  1. Enhanced Focus in Crowded Environments: The TSH system provides the capacity to focus on a single speaker even in noisy environments, enhancing communication clarity in busy settings such as conferences, open offices, and bustling networking events.

  2. Market Potential for Hearable Tech: The success and innovation demonstrated by this AI system highlight a significant opportunity for startups to develop and market advanced hearable technology like targeted speech hearing for broader consumer use, including integrations with earbuds and hearing aids.

  3. AI in Everyday Gadgets: This research illustrates the potential of incorporating AI into everyday devices, paving the way for startups to innovate new applications that modify and enhance user experience based on personal preferences, opening opportunities for new markets.

For more details, see the original source.

About the Author: Mark Howell

Mark Howell is a talented content writer for Edworking's blog, consistently producing high-quality articles on a daily basis. As a Sales Representative, he brings a unique perspective to his writing, providing valuable insights and actionable advice for readers in the education industry. With a keen eye for detail and a passion for sharing knowledge, Mark is an indispensable member of the Edworking team. His expertise in task management ensures that he is always on top of his assignments and meets strict deadlines. Furthermore, Mark's skills in project management enable him to collaborate effectively with colleagues, contributing to the team's overall success and growth. As a reliable and diligent professional, Mark Howell continues to elevate Edworking's blog and brand with his well-researched and engaging content.
