Key Points
- Genomic islands (GIs) are large genomic regions (typically 10–200 kb in length) that are found in bacterial genomes and that have probably been horizontally acquired.
- GIs disproportionately carry genes related to various functions of medical and environmental importance and have been named accordingly as 'pathogenicity islands', antibiotic 'resistance islands' and 'metabolic islands'.
- The location of GIs can be computationally predicted by identifying one or more of the various features associated with GIs, such as sequence composition bias, known integration sites and genes of particular function, as well as abnormal phyletic patterns.
- The accuracy of GI prediction programs varies widely, with some having high precision and others having high recall.
- Many other bioinformatics tools can complement GI prediction programs, such as whole-genome alignment programs, genome viewers, genome annotators and databases of previously identified GIs.
- Although various methods exist for the identification of GIs, manual curation is still often required to verify the predictions. Although increased genomic sampling should improve the accuracy of many of the methods that are currently available, future methods that combine various GI prediction approaches and improve the identification of GI boundaries should further help researchers to identify these important genomic regions.
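
The trade-off between precision and recall noted in the key points can be made concrete with a small calculation. The sketch below assumes that predicted and reference GIs are given as inclusive (start, end) coordinate pairs and that accuracy is scored per nucleotide. The function names and the toy coordinates are illustrative only and are not taken from any published GI predictor or evaluation.

```python
def covered_positions(intervals):
    """Expand inclusive (start, end) intervals into a set of base positions."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def precision_recall(predicted, reference):
    """Score predicted GI intervals against reference intervals, per nucleotide.

    precision = correctly predicted bases / all predicted bases
    recall    = correctly predicted bases / all reference bases
    """
    pred = covered_positions(predicted)
    ref = covered_positions(reference)
    true_pos = len(pred & ref)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    return precision, recall


# Hypothetical reference island covering bases 100-199 (100 bases).
reference = [(100, 199)]

# A conservative prediction: small, but entirely inside the island.
conservative = precision_recall([(120, 139)], reference)

# A permissive prediction: covers the island plus flanking sequence.
permissive = precision_recall([(50, 299)], reference)
```

With these toy coordinates the conservative prediction has perfect precision but low recall, whereas the permissive prediction recovers the whole island at the cost of precision. Real comparisons operate on intervals spanning thousands of bases, where an interval-arithmetic approach would be more memory-efficient than expanding every position into a set.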
Abstract
Bacterial genomes contain clusters of genes that are acquired by horizontal transfer, called genomic islands (GIs). GIs are frequently associated with microbial adaptations that are of medical and environmental interest, and they have had a substantial impact on bacterial evolution. Therefore, there is growing interest in efficiently identifying GIs in newly sequenced bacterial genomes. Several computational methods for detecting GIs have been developed recently, presenting researchers with a myriad of choices. Here, we discuss the limitations and benefits of the main approaches that are available and present guidelines to aid researchers in effectively identifying these important genomic regions.
Acknowledgements
We gratefully acknowledge the Simon Fraser University and University of British Columbia's Bioinformatics Training Program, which is funded by the Canadian Institutes of Health Research (CIHR) and the Michael Smith Foundation for Health Research (MSFHR), for providing initial funding. F.S.L.B. is the recipient of an MSFHR Senior Scholar award and a CIHR New Investigator award. Support for analyses was also provided by Genome Canada and the Cystic Fibrosis Foundation.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Glossary
- Horizontal gene transfer
  Transfer of genetic material from one organism to another organism that is not its offspring.
- Pathogenicity island
  A subset of genomic islands that contribute to the pathogenicity of a bacterium.
- Genomic island
  In a bacterial genome, a cluster of genes for which there is evidence of horizontal origins.
- Mobile genetic element
  Any sequence of DNA that is physically moved within the genome of an organism or between different organisms.
- Prophage
  A viral genome that has integrated into a bacterial host genome.
- Integron
  A gene capture system that assembles tandem arrays of genes and provides them with a promoter for expression. Integrons are often found in other mobile elements.
- Conjugative transposon
  An integrated DNA element that can excise and transfer, by conjugation, to another bacterial host.
- Integrative conjugative element
  A self-transmissible mobile genetic element that is transferred by conjugation and integrates into the genome in order to replicate.
- Phyletic pattern
  The presence or absence of evolutionarily related genes or organisms.
- Integrase
  An enzyme that is often used by phages for site-specific recombination between two DNA strands, catalysing the integration or excision of DNA and resulting in the formation of a transient covalent bond with the DNA substrate.
- Transposase
  An enzyme that is encoded by transposons and insertion sequence elements and is required for site-specific recombination between two DNA elements that specifically does not involve the formation of a covalent enzyme–substrate intermediate.
- Insertion sequence element
  A short mobile DNA sequence similar to a transposon but only encoding genes for its transposition.
- k-mer
  A piece of nucleotide sequence of length k nucleotides.
- Hidden Markov Model
  A statistical model used for pattern recognition that can be used to analyse DNA sequences.
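
To illustrate how k-mers relate to the sequence composition bias used in GI prediction, the following sketch compares the k-mer frequencies of each fixed-size window with those of the whole sequence. It is a deliberately simplified illustration, not the method of any published tool: the function names, the window and step sizes, and the use of a simple sum of absolute frequency differences as the distance are all assumptions made for this example.

```python
from collections import Counter


def kmer_profile(seq, k):
    """Return the relative frequency of every overlapping k-mer in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def profile_distance(p, q):
    """Sum of absolute frequency differences between two k-mer profiles."""
    keys = set(p) | set(q)
    return sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)


def window_scores(genome, k=2, window=20, step=20):
    """Score each window by how far its k-mer profile lies from the genome-wide one."""
    background = kmer_profile(genome, k)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        segment = genome[start:start + window]
        scores.append((start, profile_distance(kmer_profile(segment, k), background)))
    return scores


# Toy sequence: an AT-rich background with a GC-rich insert at positions 60-79.
toy_genome = "AT" * 30 + "GC" * 10 + "AT" * 30
scores = window_scores(toy_genome)
most_atypical_start = max(scores, key=lambda s: s[1])[0]
```

In this toy sequence the window that starts at the GC-rich insert receives the highest score. On real genomes, windows of several kilobases and a statistically motivated threshold would be needed, and a high score alone is not proof of horizontal origin.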
Cite this article
Langille, M., Hsiao, W. & Brinkman, F. Detecting genomic islands using bioinformatics approaches. Nat Rev Microbiol 8, 373–382 (2010). https://doi.org/10.1038/nrmicro2350
DOI: https://doi.org/10.1038/nrmicro2350