Reference genome assemblies provide a map of a species’ DNA sequence and its spatial context, that is, where along the chromosomes a specific piece of DNA sequence can be found. In the past, generating reference assemblies was prohibitively expensive and labour-intensive, so they were produced only for humans and the most important model organisms, and even these contained gaps and errors. Draft genomes generated using more affordable second-generation sequencing technologies could be assembled for a larger number of species, but these were of lower quality because they were highly fragmented and their annotation was erroneous. However, a complete understanding of evolutionary processes and other fundamental questions in biology requires high-quality reference genome assemblies of all species. Technological advances, improved computational methods and the ever-decreasing cost of sequencing have enabled the Vertebrate Genomes Project (VGP), launched in 2017, to pursue the ambitious goal of producing a reference genome assembly for each of the extant vertebrate species on Earth. In the first phase of the project, the VGP has focused on testing and improving genome sequencing and assembly approaches, on assembling a first set of 260 high-quality genomes of species representing all vertebrate orders (work that is still in progress), and on the initial reporting of insights into genome evolution in vertebrates. The milestone for phase II will be the production of assemblies for about 1,159 vertebrate families; phase III will involve the generation of assemblies for more than 10,000 genera; and finally, in phase IV, assemblies will be completed for all vertebrate species.
This collection showcases articles describing the project, data and findings from the first phase of the VGP and includes useful links to the VGP resources.
The Vertebrate Genomes Project has used an optimized pipeline to generate high-quality genome assemblies for sixteen species (representing all major vertebrate classes), which have led to new biological insights.
A revised, universal nomenclature for the vertebrate genes that encode the oxytocin and vasopressin–vasotocin ligands and receptors will improve our understanding of gene evolution and facilitate the translation of findings across species.
A trio-binning approach is used to produce a fully haplotype-resolved diploid genome assembly for the common marmoset, providing insight into the heterozygosity spectrum and the evolution of the sex-differentiation region.
New reference genomes of the two extant monotreme lineages (platypus and echidna) reveal the ancestral and lineage-specific genomic changes that shape both monotreme and mammalian evolution.
Reference-quality genomes for six bat species shed light on the phylogenetic position of Chiroptera, and provide insight into the genetic underpinnings of the unique adaptations of this clade.
Methods to produce haplotype-resolved genome assemblies often rely on access to family trios. The authors present FALCON-Phase, a tool that combines ultra-long-range Hi-C chromatin interaction data with a long-read de novo assembly to extend haplotype phasing to the contig or scaffold level.