Key Points
- An auditory object is a perceptual construct, corresponding to the sound that can be assigned to a particular acoustic source. An auditory object spans acoustic events that unfold over time, and a sequence of objects forms a 'stream': for example, when a person is walking, the sound of each step is a unique auditory object, but the temporal sequence of footsteps is linked together to form a stream.
- An auditory object is constructed from the spectrotemporal regularities in the acoustic environment. More specifically, an auditory stimulus enters our awareness as a sound because simultaneous and sequential grouping principles bind the acoustic features of the stimulus into stable spectrotemporal entities.
- Auditory-object processing occurs in the cortex. In particular, the ventral auditory pathway mediates the computations underlying a listener's ability to perceive a sound (that is, an auditory object), whereas object-related information that is found in the dorsal pathway is used in the pursuit of audiomotor behaviours.
- Neural correlates of the perception of an auditory object are found in the auditory cortex. Although some studies indicate that the ventral pathway contains brain regions specialized for auditory-object processing, auditory perception is most likely mediated by a broad network of brain areas in this pathway.
- A hallmark of auditory-object processing is that it can be influenced by attention, and that attention can act on the object itself rather than on the lower-level spectrotemporal details of the auditory stimulus. Both single-unit and functional imaging studies demonstrate the effects of attention on the representation of auditory objects in the auditory cortex.
Abstract
The fundamental perceptual unit in hearing is the 'auditory object'. Similar to visual objects, auditory objects are the computational result of the auditory system's capacity to detect, extract, segregate and group spectrotemporal regularities in the acoustic environment; the multitude of acoustic stimuli around us together form the auditory scene. However, unlike the visual scene, resolving the component objects within the auditory scene crucially depends on their temporal structure. Neural correlates of auditory objects are found throughout the auditory system. However, neural responses do not become correlated with a listener's perceptual reports until the level of the cortex. The roles of different neural structures and the contribution of different cognitive states to the perception of auditory objects are not yet fully understood.
Acknowledgements
We thank H. Hersh for a critical reading of the manuscript. J.K.B. is supported by a Royal Society Dorothy Hodgkin Research Fellowship and BBSRC grant BB/H016813/1. Y.E.C. is supported by grants from the US National Institute on Deafness and Other Communication Disorders and US National Institutes of Health.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Glossary
- Pitch: The attribute of a sound that enables it to be ordered from high to low on a musical scale. The perceived pitch for a periodic sound is determined by its fundamental frequency (F0), usually the lowest frequency component.
- Timbre: The quality of a sound that is determined by its spectral or temporal envelope. Timbre allows a listener to differentiate between a violin and a banjo despite the fact that the two instruments may be producing a sound that has the same pitch.
- Harmonicity: A harmonic sound contains frequency components at integer multiples of the fundamental frequency (see the definition for 'pitch'). Many vocalizations and other pitch-evoking sounds have a harmonic structure.
- Spectral envelope: This term refers to the distribution of power across frequency in a sound. For a harmonic sound, this equates to the relative power across harmonics.
- Dynamic causal modelling: A computational approach that performs Bayesian model comparisons in order to infer the organizational structure of processing within different brain regions.
- Auditory flutter: The sensation produced by a periodic stimulus in which a listener can hear the sound as being intermittent. At higher frequencies, the sound is fused into one with a continuous melodic pitch. The border between being heard as intermittent or continuous is the flicker–fusion limit.
- Forward masking: A process by which a sound is obscured by a masker (for example, a noise burst) that precedes the sound.
- Categorical perception: The experience of perceiving a stimulus as being the same (that is, invariant) despite the fact that the physical properties of the stimulus have changed smoothly along a specific axis or continuum. A characteristic of categorical perception is that for a continuously changing stimulus dimension, subjects generalize across changes, with a sharp change in the perception from one class to another at the position of the boundary of the stimulus identity.
- Scene analysis: The process by which the brain organizes and segregates acoustic stimuli into meaningful elements or objects.
- Grandmother cells: Hypothetical cells that represent a very specific complex object or concept, such as one's grandmother.
- Object-related negativity: An evoked-potential component that is elicited when two concurrently presented sounds are perceived as originating from different sources based on simultaneous grouping cues.
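The glossary definitions of pitch, harmonicity and spectral envelope can be made concrete with a small numerical sketch. The Python below is illustrative only and is not taken from the article: the function names and the example values (an F0 of 200 Hz, four harmonics, the amplitude lists) are assumptions chosen for the illustration, and power is taken as proportional to amplitude squared.

```python
def harmonic_frequencies(f0, n_harmonics):
    """Component frequencies (Hz) at integer multiples of the fundamental f0."""
    return [k * f0 for k in range(1, n_harmonics + 1)]


def is_harmonic(components, f0, tol=1e-6):
    """True if every component lies at a positive integer multiple of f0."""
    for f in components:
        ratio = f / f0
        nearest = round(ratio)
        if nearest < 1 or abs(ratio - nearest) > tol:
            return False
    return True


def relative_power(amplitudes):
    """Spectral envelope of a harmonic sound: each harmonic's share of total power."""
    powers = [a * a for a in amplitudes]  # power is proportional to amplitude squared
    total = sum(powers)
    return [p / total for p in powers]


# Hypothetical example values: two sounds sharing the same F0 (and so the same
# harmonic frequencies) but differing in how power is spread across harmonics.
f0 = 200.0
freqs = harmonic_frequencies(f0, 4)
flat_envelope = relative_power([1.0, 1.0, 1.0, 1.0])
sloped_envelope = relative_power([1.0, 0.5, 0.25, 0.125])
```

In the glossary's terms, the two example sounds would be expected to share a pitch because their F0 is identical, while their differing spectral envelopes correspond to a difference in timbre.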
About this article
Cite this article
Bizley, J., Cohen, Y. The what, where and how of auditory-object perception. Nat Rev Neurosci 14, 693–707 (2013). https://doi.org/10.1038/nrn3565
DOI: https://doi.org/10.1038/nrn3565