Abstract
This study introduces a novel real-time, gaze-directed audio-visual speech enhancement (AVSE) framework for hearing aids, designed to improve speech intelligibility for people with hearing loss (pHL) in noisy environments. Existing gaze estimation methods often rely solely on eye angle, which reduces their accuracy. Our approach addresses this limitation by combining eye angle and nose position with head pose estimation to improve target speaker identification and facilitate noise reduction within the AVSE framework. We use a novel eye gaze estimation algorithm that leverages the listener's nose position for improved accuracy, while head pose estimation captures the overall direction of attention. This combined information is used in real time to steer a beamformer towards the target speaker, enhancing their voice and suppressing background noise. In pilot trials with pHL, the proposed algorithm estimated the target speaker direction with high accuracy (99.55% to 99.88%). This research presents a promising approach for improving communication accessibility and social interaction for pHL by potentially enhancing speech recognition in challenging listening situations. Future studies will quantify the improvement in speech intelligibility achieved by the gaze-directed AVSE framework.
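The abstract does not specify how the gaze cues are fused or which beamformer is used, so the following is only a minimal sketch of the general idea of steering a beamformer from an estimated gaze direction. It assumes a simple additive fusion of head yaw and eye-in-head angle, a far-field source, a hypothetical four-microphone linear array, and a plain delay-and-sum beamformer. All function names, the array geometry, and the constants are illustrative assumptions, not the authors' method.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s in air at room temperature (assumed)


def gaze_azimuth_deg(head_yaw_deg, eye_yaw_deg):
    """Assumed fusion rule: gaze azimuth = head yaw + eye-in-head yaw.

    The result is wrapped to the interval [-180, 180) degrees.
    """
    return (head_yaw_deg + eye_yaw_deg + 180.0) % 360.0 - 180.0


def steering_vector(azimuth_deg, mic_x_m, freq_hz):
    """Far-field steering vector for microphones placed along the x-axis.

    Azimuth 0 is straight ahead (broadside); positive angles point
    towards +x.
    """
    theta = math.radians(azimuth_deg)
    return [
        cmath.exp(2j * math.pi * freq_hz * x * math.sin(theta) / SPEED_OF_SOUND)
        for x in mic_x_m
    ]


def delay_and_sum_gain(steer_deg, source_deg, mic_x_m, freq_hz):
    """Magnitude response of a delay-and-sum beamformer steered to
    `steer_deg` for a plane wave arriving from `source_deg`."""
    weights = steering_vector(steer_deg, mic_x_m, freq_hz)
    arriving = steering_vector(source_deg, mic_x_m, freq_hz)
    total = sum(w.conjugate() * a for w, a in zip(weights, arriving))
    return abs(total) / len(mic_x_m)


# Hypothetical geometry: four microphones spaced 4 cm apart.
MIC_X_M = [-0.06, -0.02, 0.02, 0.06]

if __name__ == "__main__":
    target = gaze_azimuth_deg(head_yaw_deg=20.0, eye_yaw_deg=10.0)
    print("steering azimuth (deg):", target)
    print("gain towards target:", delay_and_sum_gain(target, target, MIC_X_M, 2000.0))
    print("gain towards -60 deg:", delay_and_sum_gain(target, -60.0, MIC_X_M, 2000.0))
```

In this sketch the beamformer has unit gain in the gaze direction and attenuates sound from other directions; a practical system would update the steering angle frame by frame and would likely use an adaptive beamformer instead of fixed delay-and-sum weights.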
Original language | English |
---|---|
Title of host publication | Proceedings of Interspeech 2024 |
Publisher | International Speech Communication Association (ISCA) |
Pages | 2034-2035 |
Number of pages | 2 |
DOIs | |
Publication status | Published - 2024 |
Event | 25th Interspeech Conference 2024: Speech and Beyond - Kos Island, Greece. Duration: 1 Sept 2024 → 5 Sept 2024. Conference number: 25th. https://interspeech2024.org/ (Interspeech 2024 conference website) |
Publication series
Name | Interspeech 2024 |
---|---|
Publisher | International Speech Communication Association (ISCA) |
ISSN (Electronic) | 2958-1796 |
Conference
Conference | 25th Interspeech Conference 2024 |
---|---|
Abbreviated title | Interspeech 2024 |
Country/Territory | Greece |
City | Kos Island |
Period | 1/09/24 → 5/09/24 |
Keywords
- communication
- multimodal hearing-aid
- speech enhancement
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modelling and Simulation