He who says nothing may still be thinking it

How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.

Author Summary

Spoken language is a uniquely human trait. The human brain has evolved computational mechanisms that decode highly variable acoustic inputs into meaningful elements of language, such as phonemes and words. Unraveling these decoding mechanisms in humans has proven difficult, because invasive recording of cortical activity is usually not possible. In this study, we take advantage of rare neurosurgical procedures for the treatment of epilepsy, in which neural activity is measured directly from the cortical surface and therefore provides a unique opportunity for characterizing how the human brain performs speech recognition. Using these recordings, we asked what aspects of speech sounds could be reconstructed, or decoded, from higher order brain areas in the human auditory system.
We found that continuous auditory representations, for example the speech spectrogram, could be accurately reconstructed from measured neural signals. Reconstruction quality was highest for the sound features most critical to speech intelligibility and allowed decoding of individual spoken words. The results provide insight into higher order neural speech processing and suggest it may be possible to read out intended speech directly from brain activity.

Citation: Pasley BN, David SV, Mesgarani N, Flinker A, Shamma SA, et al. (2012) Reconstructing Speech from Human Auditory Cortex. PLoS Biol 10(1): e1001251. doi:10.1371/journal.pbio.1001251

Academic Editor: Robert Zatorre, McGill University, Canada

Received: June 24, 2011; Accepted: December 13, 2011; Published: January 31, 2012

Copyright: © 2012 Pasley et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This research was supported by NS21135 (RTK), PO4813 (RTK), NS40596 (NEC), and K99NS065120 (EFC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Abbreviations: A1, primary auditory cortex; STG, superior temporal gyrus; STRF, spectro-temporal receptive field

* E-mail: bpasley@berkeley.edu
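The "linear model based on the auditory spectrogram" described above is a stimulus-reconstruction (decoding) model: a linear mapping fitted from multichannel neural activity back to the spectrogram. The sketch below illustrates the general idea with a closed-form ridge regression on purely synthetic data; it is not the authors' code, and all dimensions, variable names, and the absence of time lags (which a real decoding model would include) are simplifying assumptions.

```python
import numpy as np

# Hedged sketch of linear stimulus reconstruction on synthetic data.
# A real model would also regress over time lags of the neural signal;
# that is omitted here for brevity.
rng = np.random.default_rng(0)

n_times, n_electrodes, n_freqs = 500, 64, 32

# Synthetic spectrogram (stimulus), a random linear encoding into
# "electrode" responses, plus a small amount of noise.
spectrogram = rng.standard_normal((n_times, n_freqs))
encoding = rng.standard_normal((n_freqs, n_electrodes))
neural = spectrogram @ encoding + 0.1 * rng.standard_normal((n_times, n_electrodes))

def ridge_decoder(X, Y, alpha=1.0):
    """Closed-form ridge regression for Y ≈ X @ W."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

# Fit the decoder on the first half of the data, then reconstruct
# the spectrogram of the held-out second half.
W = ridge_decoder(neural[:250], spectrogram[:250], alpha=1.0)
recon = neural[250:] @ W

# Score reconstruction accuracy as the correlation between actual and
# reconstructed spectrograms, per frequency channel.
corr = [np.corrcoef(recon[:, f], spectrogram[250:, f])[0, 1]
        for f in range(n_freqs)]
print(f"mean reconstruction correlation: {np.mean(corr):.2f}")
```

With clean synthetic data the held-out correlation is high; in real recordings, accuracy varies across spectro-temporal features, which is the paper's central finding.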

See: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001251


