Speech is an intrinsically multisensory signal and seeing the speaker’s lips forms a cornerstone of communication in acoustically impoverished environments. Still, it remains unclear how the brain exploits visual speech for comprehension and previous work debated whether lip signals are mainly processed along the auditory pathways or whether the visual system directly implements speech-related processes. To probe this question, we systematically characterized dynamic representations of multiple acoustic and visual speech-derived features in source localized MEG recordings that were obtained while participants listened to speech or viewed silent speech. Using a mutual-information framework we provide a comprehensive assessment of how well temporal and occipital cortices reflect the physically presented signals and speech-related features that were physically absent but may still be critical for comprehension. Our results demonstrate that both cortices are capable of a functionally specific form of multisensory restoration: during lip reading both reflect unheard acoustic features, with occipital regions emphasizing spectral information and temporal regions emphasizing the speech envelope. Importantly, the degree of envelope restoration was predictive of lip reading performance. These findings suggest that when seeing the speaker’s lips the brain engages both visual and auditory pathways to support comprehension by exploiting multisensory correspondences between lip movements and spectro-temporal acoustic cues.
|Place of Publication||Cold Spring Harbour Laboratory|
|Number of pages||29|
|Publication status||Published - 22 Feb 2022|
- Speech entrainment
- speech tracking