Cortical substrates for exploratory decisions in humans

Daw, Nathaniel D.; O'Doherty, John P.; Dayan, Peter; Seymour, Ben; Dolan, Raymond J.

doi:10.1038/nature04766

Letter
Published: 15 June 2006


Nature volume 441, pages 876–879 (2006)

Abstract

Decision making in an uncertain environment poses a conflict between the opposing demands of gathering and exploiting information. In a classic illustration of this ‘exploration–exploitation’ dilemma¹, a gambler choosing between multiple slot machines balances the desire to select what seems, on the basis of accumulated experience, the richest option, against the desire to choose a less familiar option that might turn out more advantageous (and thereby provide information for improving future decisions). Far from representing idle curiosity, such exploration is often critical for organisms to discover how best to harvest resources such as food and water. In appetitive choice, substantial experimental evidence, underpinned by computational reinforcement learning² (RL) theory, indicates that a dopaminergic³,⁴, striatal⁵,⁶,⁷,⁸,⁹ and medial prefrontal network mediates learning to exploit. In contrast, although exploration has been well studied from both theoretical¹ and ethological¹⁰ perspectives, its neural substrates are much less clear. Here we show, in a gambling task, that human subjects' choices can be characterized by a computationally well-regarded strategy for addressing the explore/exploit dilemma. Furthermore, using this characterization to classify decisions as exploratory or exploitative, we employ functional magnetic resonance imaging to show that the frontopolar cortex and intraparietal sulcus are preferentially active during exploratory decisions. In contrast, regions of striatum and ventromedial prefrontal cortex exhibit activity characteristic of an involvement in value-based exploitative decision making. The results suggest a model of action selection under uncertainty that involves switching between exploratory and exploitative behavioural modes, and provide a computationally precise characterization of the contribution of key decision-related brain systems to each of these functions.
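
The abstract does not name the strategy used to characterize choices, so the following is only an illustrative sketch of one common way to trade off exploration and exploitation in a slot-machine (bandit) task. It assumes a softmax choice rule over running value estimates and labels a choice as exploitative when the chosen option has the highest current estimate, and exploratory otherwise. The function names, the learning rate `alpha`, the softmax gain `beta` and the Gaussian payoff model are all hypothetical choices made for this example, not details taken from the article.

```python
import math
import random


def softmax_probs(values, beta):
    """Return choice probabilities proportional to exp(beta * value)."""
    m = max(values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def run_bandit(true_means, n_trials=200, alpha=0.1, beta=3.0,
               noise_sd=1.0, seed=0):
    """Simulate softmax choices on a bandit with Gaussian payoffs (assumed).

    Returns the final value estimates and a per-trial log of
    (chosen option, "explore" or "exploit", reward).
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)
    log = []
    for _ in range(n_trials):
        probs = softmax_probs(estimates, beta)
        choice = rng.choices(range(len(estimates)), weights=probs)[0]
        # A choice is "exploit" if it has the highest current estimate
        # (ties count as exploit); anything else is "explore".
        best = max(estimates)
        label = "exploit" if estimates[choice] == best else "explore"
        reward = rng.gauss(true_means[choice], noise_sd)
        # Delta-rule update of the chosen option's estimate only.
        estimates[choice] += alpha * (reward - estimates[choice])
        log.append((choice, label, reward))
    return estimates, log


if __name__ == "__main__":
    estimates, log = run_bandit([1.0, 2.0, 3.0, 4.0])
    n_explore = sum(1 for _, label, _ in log if label == "explore")
    print("exploratory trials:", n_explore, "of", len(log))
```

In this sketch a larger `beta` makes choices more nearly greedy, so fewer trials are labelled exploratory, while `beta = 0` gives uniformly random choice.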

Figure 2: **Reward-related activations.**

Figure 3: **Exploration-related activity in frontopolar cortex.**

Figure 4: **Exploration-related activity in intraparietal sulcus.**

References

1. Gittins, J. C. & Jones, D. in Progress in Statistics (ed. Gani, J.) 241–266 (North-Holland, Amsterdam, 1974)
2. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT Press, Cambridge, Massachusetts, 1998)
3. Montague, P. R., Dayan, P. & Sejnowski, T. J. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16, 1936–1947 (1996)
4. Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141 (2005)
5. Delgado, M. R., Nystrom, L. E., Fissell, C., Noll, D. C. & Fiez, J. A. Tracking the hemodynamic responses to reward and punishment in the striatum. J. Neurophysiol. 84, 3072–3077 (2000)
6. Knutson, B., Westdorp, A., Kaiser, E. & Hommer, D. fMRI visualization of brain activity during a monetary incentive delay task. Neuroimage 12, 20–27 (2000)
7. McClure, S. M., Berns, G. S. & Montague, P. R. Temporal prediction errors in a passive learning task activate human striatum. Neuron 38, 339–346 (2003)
8. O'Doherty, J. P., Dayan, P., Friston, K., Critchley, H. & Dolan, R. J. Temporal difference models and reward-related learning in the human brain. Neuron 38, 329–337 (2003)
9. O'Doherty, J. P. et al. Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304, 452–454 (2004)
10. Charnov, E. L. Optimal foraging: The marginal value theorem. Theor. Popul. Biol. 9, 129–136 (1976)
11. Owen, A. M. Cognitive planning in humans: Neuropsychological, neuroanatomical and neuropharmacological perspectives. Prog. Neurobiol. 53, 431–450 (1997)
12. Daw, N. D., Niv, Y. & Dayan, P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioural control. Nature Neurosci. 8, 1704–1711 (2005)
13. Kakade, S. & Dayan, P. Dopamine: Generalization and bonuses. Neural Netw. 15, 549–559 (2002)
14. Kaelbling, L. P. Learning in Embedded Systems (MIT Press, Cambridge, Massachusetts, 1993)
15. McClure, S. M., Laibson, D. I., Loewenstein, G. & Cohen, J. D. Separate neural systems value immediate and delayed monetary rewards. Science 306, 503–507 (2004)
16. O'Doherty, J., Kringelbach, M. L., Rolls, E. T., Hornak, J. & Andrews, C. Abstract reward and punishment representations in the human orbitofrontal cortex. Nature Neurosci. 4, 95–102 (2001)
17. O'Doherty, J. Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Curr. Opin. Neurobiol. 14, 769–776 (2004)
18. Gottfried, J. A., O'Doherty, J. & Dolan, R. J. Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science 301, 1104–1107 (2003)
19. Tanaka, S. C. et al. Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nature Neurosci. 7, 887–893 (2004)
20. Miller, E. K. & Cohen, J. D. An integrative theory of prefrontal cortex function. Annu. Rev. Neurosci. 24, 167–202 (2001)
21. Ramnani, N. & Owen, A. M. Anterior prefrontal cortex: Insights into function from anatomy and neuroimaging. Nature Rev. Neurosci. 5, 184–194 (2004)
22. Koechlin, E., Ody, C. & Kouneiher, F. A. The architecture of cognitive control in the human prefrontal cortex. Science 302, 1181–1185 (2003)
23. Braver, T. S. & Bongiolatti, S. R. The role of frontopolar cortex in subgoal processing during working memory. Neuroimage 15, 523–536 (2002)
24. Platt, M. L. & Glimcher, P. W. Neural correlates of decision variables in parietal cortex. Nature 400, 233–238 (1999)
25. Sugrue, L. P., Corrado, G. S. & Newsome, W. T. Matching behaviour and the representation of value in the parietal cortex. Science 304, 1782–1787 (2004)
26. Dorris, M. C. & Glimcher, P. W. Activity in posterior parietal cortex is correlated with the relative subjective desirability of action. Neuron 44, 365–378 (2004)
27. Grefkes, C. & Fink, G. R. The functional organization of the intraparietal sulcus in humans and monkeys. J. Anat. 207, 3–17 (2005)
28. Burgess, P. W., Veitch, E., de Lacy Costello, A. & Shallice, T. The cognitive and neuroanatomical correlates of multitasking. Neuropsychologia 38, 848–863 (2000)
29. Usher, M., Cohen, J. D., Servan-Schreiber, D., Rajkowski, J. & Aston-Jones, G. The role of locus coeruleus in the regulation of cognitive performance. Science 283, 549–554 (1999)
30. Doya, K. Metalearning and neuromodulation. Neural Netw. 15, 495–506 (2002)

Acknowledgements

We thank J. Li, S. McClure, B. King-Casas and P. R. Montague for sharing their unpublished data on exploration, and Y. Niv, Z. Ghahramani and C. Camerer for discussions. Funding was from a Royal Society USA Research Fellowship (N.D.), the Gatsby Foundation (N.D., P.D.), the EU BIBA project (N.D., P.D.), and a Wellcome Trust Programme Grant (J.O.D., R.D.).

Author information

John P. O'Doherty
Present address: Division of Humanities and Social Sciences, California Institute of Technology, 1200 East California Boulevard, Pasadena, California, 91125, USA
Nathaniel D. Daw and John P. O'Doherty: These authors contributed equally to this work

Authors and Affiliations

Gatsby Computational Neuroscience Unit, University College London (UCL), Alexandra House, 17 Queen Square, WC1N 3AR, London, UK
Nathaniel D. Daw & Peter Dayan
Wellcome Department of Imaging Neuroscience, UCL, 12 Queen Square, WC1N 3BG, London, UK
John P. O'Doherty, Ben Seymour & Raymond J. Dolan

Corresponding authors

Correspondence to Nathaniel D. Daw or John P. O'Doherty.

Ethics declarations

Competing interests

Reprints and permissions information is available at npg.nature.com/reprintsandpermissions. The authors declare no competing financial interests.

Supplementary information

Supplementary Notes

This file contains Supplementary Methods, Supplementary Discussion and Supplementary Tables 1–5. (PDF 371 kb)

About this article

Cite this article

Daw, N., O'Doherty, J., Dayan, P. et al. Cortical substrates for exploratory decisions in humans. Nature 441, 876–879 (2006). https://doi.org/10.1038/nature04766

Received: 07 February 2006
Accepted: 30 March 2006
Issue Date: 15 June 2006
DOI: https://doi.org/10.1038/nature04766
