Abstract
Interest in machine-learning applications within medicine has been growing, but few studies have progressed to deployment in patient care. We present a framework, context and ultimately guidelines for accelerating the translation of machine-learning-based interventions in health care. To be successful, translation will require a team of engaged stakeholders and a systematic process from beginning (problem formulation) to end (widespread deployment).
Change history
19 September 2019
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Acknowledgements
The authors thank the participants of the 2018 MLHC Conference (http://www.mlforhc.org), in particular the organizers and participants of the pre-meeting workshop that served as the genesis of this manuscript, for valuable feedback on the initial ideas through a panel discussion.
Ethics declarations
Competing interests
J.W., F.D.-V., D.K. and K.J. are on the board of Machine Learning for Healthcare, a non-profit organization that hosts a yearly academic meeting; they are reimbursed for registration and travel expenses. F.D.-V. consults for DaVita, a healthcare company. S.T.-I. serves on the board of Scients (https://scients.org/) and is reimbursed for travel expenses. S.S. is a founder of, and holds equity in, Bayesian Health. The results of the study discussed in this publication could affect the value of Bayesian Health. This arrangement has been reviewed and approved by Johns Hopkins University in accordance with its conflict-of-interest policies. S.S. is a member of the scientific advisory board for PatientPing. M. Sendak is a named inventor of the Sepsis Watch deep-learning model, which was licensed from Duke University by Cohere Med, Inc. M. Sendak does not hold any equity in Cohere Med, Inc. M. Saeed is a founder and Chief Medical Officer at HEALTH at SCALE Technologies and holds equity in this company. P.O. consults for Roche-Genentech, from whom she has received travel reimbursement and consulting fees of less than $4,000/year. A.G., K.H., M.G. and V.L. have no conflicts to declare.
Additional information
Peer review information Joao Monteiro was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Wiens, J., Saria, S., Sendak, M. et al. Do no harm: a roadmap for responsible machine learning for health care. Nat Med 25, 1337–1340 (2019). https://doi.org/10.1038/s41591-019-0548-6
This article is cited by
- Predicting non-muscle invasive bladder cancer outcomes using artificial intelligence: a systematic review using APPRAISE-AI. npj Digital Medicine (2024)
- Deep learning-aided decision support for diagnosis of skin disease across skin tones. Nature Medicine (2024)
- A causal perspective on dataset bias in machine learning for medical imaging. Nature Machine Intelligence (2024)
- The algorithm journey map: a tangible approach to implementing AI solutions in healthcare. npj Digital Medicine (2024)
- New regulatory thinking is needed for AI-based personalised drug and cell therapies in precision oncology. npj Precision Oncology (2024)