Quantifying biogenic bias in screening libraries

Hert, Jérôme; Irwin, John J; Laggner, Christian; Keiser, Michael J; Shoichet, Brian K

doi:10.1038/nchembio.180

Article
Published: 31 May 2009


Nature Chemical Biology volume 5, pages 479–483 (2009)


Abstract

In lead discovery, libraries of 10⁶ molecules are screened for biological activity. Given that more than 10⁶⁰ drug-like molecules are thought possible, such screens might be expected never to succeed. The fact that they do, even occasionally, implies a biased selection of library molecules. We have developed a method to quantify the bias in screening libraries toward biogenic molecules. With this approach, we consider what is missing from screening libraries and how they can be optimized.
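The scale argument in the abstract can be made concrete with a short calculation. The sketch below is illustrative only: it takes the two figures quoted above (10⁶ library molecules, 10⁶⁰ possible drug-like molecules) and assumes, purely for the sake of argument, that a library is drawn uniformly at random from that space. The number of active molecules (`hypothetical_actives`) is an invented placeholder, not a value from the article.

```python
from fractions import Fraction

# Figures quoted in the abstract.
LIBRARY_SIZE = 10**6      # molecules in a typical screening library
CHEMICAL_SPACE = 10**60   # lower bound on possible drug-like molecules


def expected_hits(n_actives: int,
                  library_size: int = LIBRARY_SIZE,
                  space_size: int = CHEMICAL_SPACE) -> Fraction:
    """Expected number of actives in a library sampled uniformly at random.

    Uniform sampling is an assumption made for illustration; the article's
    point is that real libraries are NOT sampled this way.
    """
    return Fraction(n_actives) * Fraction(library_size, space_size)


# Fraction of the drug-like space that a single library covers.
coverage = Fraction(LIBRARY_SIZE, CHEMICAL_SPACE)

# Hypothetical: even if 10^40 molecules were active against a given target,
# an unbiased library would be expected to contain essentially none of them.
hypothetical_actives = 10**40
hits = expected_hits(hypothetical_actives)

print(f"fraction of space sampled: {float(coverage):.0e}")
print(f"expected hits (uniform sampling): {float(hits):.0e}")
```

Under these assumptions an unbiased library would almost never contain a hit, which is why the occasional success of real screens points to a non-random, biased choice of library molecules.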


**Figure 2: Compounds in screening libraries are biased toward biogenic molecules.**

**Figure 3: Biogenic bias increases with molecular size.**

**Figure 4: Core ring structures common among drugs and related molecules.**



Acknowledgements

This work was supported by US National Institutes of Health grant GM59957 to B.K.S. J.H. was supported by a Marie Curie fellowship from the 6th Framework Program of the European Commission; M.J.K. was supported by a US National Science Foundation graduate fellowship; C.L. was supported by a fellowship from the Max Kade Foundation.

Author information

Authors and Affiliations

Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, USA
Jérôme Hert, John J Irwin, Christian Laggner, Michael J Keiser & Brian K Shoichet


Contributions

The project was conceived of by J.H. and B.K.S. J.H. undertook most of the calculations, with molecular proof checking by J.J.I. and C.L. and algorithmic assistance from M.J.K. J.H. and B.K.S. wrote the manuscript, which was read and commented on by the other authors.

Corresponding author

Correspondence to Brian K Shoichet.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–4 and Supplementary Tables 1–3 (PDF 306 kb)


About this article

Cite this article

Hert, J., Irwin, J., Laggner, C. et al. Quantifying biogenic bias in screening libraries. Nat Chem Biol 5, 479–483 (2009). https://doi.org/10.1038/nchembio.180


Received: 30 January 2009
Accepted: 10 April 2009
Published: 31 May 2009
Issue Date: July 2009
DOI: https://doi.org/10.1038/nchembio.180

