With completion of the sequencing of the human and many other genomes, molecular biology is able to return to its roots to focus on, as Francis Crick1 put it, “the description of [proteins, viruses, bacteria and chromosomes] in terms of their structure, i.e. the spatial distribution of their constituent atoms”. Significantly, this definition includes systems that range from individual globular proteins in aqueous solution or highly regular crystals, to large assemblies of macromolecules, such as virus particles and membranes, that behave like aperiodic solids2.

Nuclear magnetic resonance (NMR) spectroscopy can be used on molecules in all physical states and degrees of order, and so is fundamentally capable of determining the structures of proteins in all these environments. Although the technique is powerful, its full promise has yet to be realized. In this issue, Kainosho et al. (page 52)3 demonstrate a method to optimize the isotopic make-up of the proteins being studied. By both simplifying and sharpening the spectra, this allows far more information to be gleaned from them.

The potential of NMR spectroscopy for structure determination was obvious from the earliest results, when Pake4 showed that the distances and orientations of pairs of nuclei could be determined from the perturbations that the magnetic field of one nucleus causes in a neighbouring nucleus. The initial 1H NMR spectrum of a protein5 showed that a dispersion of chemical shift frequencies results from the local spatial relationships of atoms that are not necessarily bonded to one another; thus, the chemical shift interaction provides a means to distinguish all of the atoms in a protein.
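The distance and orientation dependence described above can be made concrete with a short numerical sketch. It uses the standard textbook expression for the through-space dipolar coupling between two nuclei, which scales as the inverse cube of their separation and as (3cos²θ − 1)/2, where θ is the angle between the internuclear vector and the magnetic field. The function names are illustrative only, the constants are rounded, and the sketch is not a reconstruction of Pake's own analysis.

```python
import math

# Physical constants (rounded; assumed values for this sketch)
MU0_OVER_4PI = 1e-7          # T m / A, magnetic constant divided by 4*pi
HBAR = 1.054571817e-34       # J s, reduced Planck constant
GAMMA_1H = 2.6752218744e8    # rad s^-1 T^-1, proton gyromagnetic ratio


def dipolar_constant_hz(r_m, gamma_a=GAMMA_1H, gamma_b=GAMMA_1H):
    """Magnitude of the dipolar coupling constant (Hz) for nuclei r_m metres apart."""
    return MU0_OVER_4PI * gamma_a * gamma_b * HBAR / (2 * math.pi * r_m ** 3)


def orientation_factor(theta_deg):
    """Angular term (3cos^2(theta) - 1)/2; theta is measured from the field direction."""
    c = math.cos(math.radians(theta_deg))
    return (3 * c * c - 1) / 2


def dipolar_coupling_hz(r_m, theta_deg):
    """Orientation-dependent coupling: distance term times angular term."""
    return dipolar_constant_hz(r_m) * orientation_factor(theta_deg)


if __name__ == "__main__":
    for r_angstrom in (1.0, 2.0, 3.0):
        d = dipolar_constant_hz(r_angstrom * 1e-10)
        print(f"1H-1H at {r_angstrom:.1f} A: about {d / 1e3:.1f} kHz")
```

Because the coupling falls off as 1/r³, doubling the separation of two 1H nuclei reduces it eightfold, and the angular term vanishes at the 'magic angle' of about 54.7°. This steep dependence is what makes the interaction such a sensitive reporter of distance and orientation.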

The Achilles' heel6 of protein NMR is the ‘size problem’, which has two distinct components. The first is obvious: the spectra of larger polypeptides are more complex, with more signals and consequently more overlap within the same limited range of frequencies. This is compounded by the second, more daunting, ‘correlation-time problem’: larger polypeptides (and their complexes) tumble more slowly in solution7, which results in broad linewidths. The correlation-time problem can be solved under certain circumstances for proteins that are immobile (on NMR timescales) by virtue of their being in the solid state or in very large complexes. But the problem has so far been intractable for larger soluble proteins in their native aqueous environment.
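A rough feel for the correlation-time problem comes from the Stokes-Einstein-Debye relation, which estimates the rotational correlation time of a rigid sphere as τc = ηV/(kBT). The sketch below makes several assumptions that are not taken from the article: the protein is treated as a hard sphere, the solvent viscosity is set to about that of water near room temperature, and the radii are purely illustrative.

```python
import math

K_B = 1.380649e-23  # J/K, Boltzmann constant


def rotational_correlation_time(radius_m, viscosity_pa_s=1.0e-3, temperature_k=293.0):
    """Stokes-Einstein-Debye estimate of tau_c (seconds) for a rigid sphere.

    radius_m, viscosity_pa_s and temperature_k are assumed, illustrative inputs.
    """
    volume = 4.0 / 3.0 * math.pi * radius_m ** 3
    return viscosity_pa_s * volume / (K_B * temperature_k)


if __name__ == "__main__":
    for radius_nm in (1.5, 2.0, 3.0):
        tau = rotational_correlation_time(radius_nm * 1e-9)
        print(f"radius {radius_nm} nm: tau_c of roughly {tau * 1e9:.1f} ns")
```

In this simple picture τc grows with the volume of the particle, and therefore roughly in proportion to its mass. In the slow-tumbling regime relevant to proteins, linewidths broaden approximately in step with τc, which is why the problem worsens steadily as proteins get larger.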

The past half-century of technological developments in NMR for investigating proteins has followed three parallel paths: instrumentation, experimental methods and isotopic labelling. The benefits of advances in all three areas are highly synergistic, despite their seemingly disparate nature. The selective replacement of hydrogen (1H) with deuterium (2H) in a protein8 was the first real step towards the resolution of overlapping NMR spectra, preceding the advent of high-field magnets and the sophisticated magnetic pulse sequences employed today. Deuterium nuclei couple much more weakly with the external magnetic field (and with the nearby 1H nuclei of interest), so that replacing most 1H by 2H has a substantial effect on the properties of the remaining 1H nuclei. In various incarnations, the incorporation of deuterium remains among the most widely used methods in protein NMR, even though, up to now, it has required several difficult trade-offs.

When uniform deuteration is performed at very high levels, there are no 1H nuclei left as sources of magnetization or the couplings that provide structural information. Intermediate levels of deuteration generate a mixture of proteins with the 2H nuclei in different positions, complicating the chemical shift patterns, and the signal intensity decreases because there are fewer 1H nuclei in the sample.
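The trade-off at intermediate deuteration levels can be illustrated with simple binomial arithmetic. The sketch below assumes that every hydrogen site in a chemical group is deuterated independently with the same probability, an idealization that ignores isotope effects and any site preferences of the biosynthetic pathway; the function names are hypothetical.

```python
from math import comb


def isotopomer_fractions(n_sites, deuteration):
    """Map 'number of 1H remaining in the group' to the fraction of molecules.

    Assumes each of the n_sites hydrogens is independently 2H with
    probability `deuteration` (a simplifying assumption).
    """
    p_h = 1.0 - deuteration
    return {
        k: comb(n_sites, k) * p_h ** k * deuteration ** (n_sites - k)
        for k in range(n_sites + 1)
    }


def mean_protons(n_sites, deuteration):
    """Average number of 1H nuclei left per group, a proxy for signal intensity."""
    return n_sites * (1.0 - deuteration)


if __name__ == "__main__":
    for level in (0.0, 0.5, 0.8, 1.0):
        ch2 = isotopomer_fractions(2, level)
        print(f"deuteration {level:.0%}: CH2 group fractions by 1H count {ch2}")
```

At 50% random deuteration, for instance, a methylene group is present as a 1:2:1 mixture of CD2, CHD and CH2 species, each with slightly different chemical shifts, while the average 1H signal per group is halved. At 100% deuteration only the CD2 species remains and there is no 1H signal at all.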

Kainosho and colleagues3 describe a strategy for optimal isotope labelling for protein NMR that addresses these problems. Their approach goes far beyond just improving sample preparation: it is a major step forward in the attack on NMR's size problem. In addition, it provides a path to structure determination with higher spatial resolution. The authors developed protocols for ‘stereo-array isotope labelling’ (SAIL), which incorporates deuterium into a protein's constituent amino acids such that each carbon or nitrogen nucleus in the final protein has at most one 1H nucleus bonded to it, the remaining hydrogen atoms having been replaced by 2H. These amino acids are then used to make the protein in a cell-free expression system that maintains the careful positioning of the isotopes. The patterns of replacement are designed to provide the most information consistent with spectral simplification and isotopic dilution (Fig. 1 on page 52). SAIL thus constitutes a general approach to overcoming the limitations of deuteration, in a way that also enhances the contributions of 13C and 15N labelling for both distance and angle measurements throughout the protein. Contemporary alternatives are more limited; their focus is on specific labelling of methyl groups in a highly deuterated background9,10,11.

The SAIL technology improves NMR spectra because it decreases the number of signals without losing sensitivity for the resonances of interest, and it produces narrower linewidths because the dilution of 1H nuclei attenuates the line broadening. Moreover, it makes the short-range distance measurements more accurate, and, combined with easier assignment of side-chain resonances, leads to higher-resolution structures. Kainosho et al. demonstrate their approach by solving the structures of two proteins — calmodulin (relative molecular mass 17,000) and maltose binding protein (41,000) — at a resolution of about 1 Å. This rivals the highest resolution of structures previously determined for smaller proteins by solution NMR or X-ray crystallography.

The combination of SAIL with other NMR technologies means that we can expect protein structure determinations whose accuracy is no longer limited by experimental methodology. Rather, the structures could reflect the fundamental dynamic properties of the backbone and side-chain atoms, portraying the essence of a working protein. Equivalent improvements in solid-state NMR studies can be readily envisaged.

Kainosho and colleagues' study is a significant advance in the field of protein NMR. It demonstrates that, if you dip your heel in the river while sailing, the protein size problem will fade away. Because the approach lends itself to automation, it will allow the study of proteins selected on the basis of their biological functions or in genuinely unbiased surveys of proteomes.