Main

Most mechanical unfolding studies have focused on small globular domains, using either the atomic force microscope (AFM)2,3,4 or optical tweezers5. By grabbing the molecule at specific residues, one can select both the unfolding reaction coordinate and the region of the molecule subjected to mechanical force. AFM studies designed to pull along different molecular axes have demonstrated that the orientation of the applied force can determine the accessibility of different unfolding pathways6,7,8. As we show here, mechanical manipulation provides a unique method to induce the selective unfolding of a particular region of a protein, making it possible to characterize its effect on the rest of the molecule and to investigate the importance of protein topology for folding cooperativity and communication between domains.

We chose to study the unfolding of a cysteine-free variant of T4 lysozyme (*T4L), a protein composed of two globular regions, which we refer to as domains (Fig. 1). These domains are discontinuous in sequence: the carboxy-terminal domain (residues 60–164) also contains the amino-terminal helix A (residues 1–12) (Fig. 1c). This re-entrant topology, termed a ‘discontinuous subdomain’9, is not uncommon among enzymes. Although traditional ensemble studies show that *T4L folds as a cooperative two-state protein10, a variety of spectroscopic techniques indicate some degree of structural and energetic independence between these domains11,12,13,14,15. In this study, we investigate the coupling between these two regions by exploiting our ability to unfold each domain selectively through the choice of pulling positions and the use of circular permutants15,16.

Figure 1: Experimental set-up and structure of *T4L variants.

a, Schematic of the optical tweezer experimental set-up. The protein is tethered between two polystyrene beads, one held by suction to a micropipette and the other trapped in an optical trap. The protein contains cysteines that permit covalent attachment to two 500 base pair dsDNA molecular handles through disulphide linkages. The dsDNA is attached to the micropipette bead via biotin/streptavidin and to the optical trap bead through a digoxigenin/anti-digoxigenin antibody association. By moving the two beads relative to each other, force and extension can be applied to the protein. b, Three-dimensional protein structure of T4L coloured blue and green to distinguish the two energetically distinct C- and N-domains, respectively14. The sites of DNA attachment (residues 16, 61 and 159) are coloured red. c, Schematic of WT*T4L and CP13*T4L variants coloured as in b.

We constructed optical tweezer samples by attaching double-stranded DNA molecular ‘handles’17,18 via thiol chemistry to two cysteines engineered at specific sites where force is to be applied (see Supplementary Information and Supplementary Table 1). A construct with cysteines at residues 16 and 159 (16,159 WT*T4L) was designed to apply force across both protein domains. Force ramp studies show that unfolding occurs at high forces (30–50 pN), whereas refolding occurs at approximately 5 pN (Fig. 2a; Supplementary Fig. 1); under these conditions, the unfolding/refolding cycle is not an equilibrium process. The sizes of the unfolding transitions, assuming a worm-like chain model and a persistence length of 0.65 nm (ref. 19), are consistent with complete unfolding between the attachment points (the change in contour length, ΔLc<meas> = 47 ± 2 nm (n = 68); ΔLc<calc> = 47.93 nm: 143 residues × 0.36 nm per residue − 3.55 nm (the distance between residues 16 and 159 in the folded protein)). These force-extension curves yield an unfolding rate extrapolated to zero force of 1.4 × 10⁻³ ± 7 × 10⁻⁴ s⁻¹ and a distance to the transition state of 0.64 ± 0.05 nm (Supplementary Table 2), similar to those obtained in previous AFM pulling studies across the whole molecule20,21.
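The calculated contour-length change quoted above follows from simple arithmetic, which can be checked with a minimal sketch. The function name and its default argument are assumptions made for illustration; only the numbers (143 residues, 0.36 nm per residue, 3.55 nm folded distance) come from the text.

```python
# Illustrative check of the calculated contour-length change quoted in the text.
# The function name and default argument are hypothetical, chosen for this sketch.

def expected_contour_length_change(n_residues, folded_distance_nm, nm_per_residue=0.36):
    """Contour length gained (nm) when the residues between the attachment points unfold.

    The folded end-to-end distance is subtracted because that length is
    already spanned by the native structure before the rip.
    """
    return n_residues * nm_per_residue - folded_distance_nm


# 16,159 WT*T4L: 143 residues between the cysteines, 3.55 nm apart when folded.
delta_lc_16_159 = expected_contour_length_change(143, 3.55)
print(f"Expected contour-length change: {delta_lc_16_159:.2f} nm")
```

A measured value that matches this number within error, as reported here (47 ± 2 nm), is what supports the interpretation of complete unfolding between the attachment points.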

Figure 2: Unfolding and refolding force-extension curves of WT*T4L and CP13*T4L variants.

Representative force-extension curves from a, 16,159 WT*T4L, b, 16,61 WT*T4L, c, 16,159 CP13*T4L and d, 16,61 CP13*T4L (red, unfolding; blue, refolding). All data shown were collected using a 50 Hz sampling rate and a pulling speed of 180 nm s⁻¹ (except for 16,61 WT*T4L, which were collected using a pulling speed of 60 nm s⁻¹). The inset in c shows the curves for 16,159 CP13*T4L in finer detail, illustrating the diverse unfolding and refolding behaviour. These curves were analysed as described previously17.

Next, we investigated the coupling between the two domains. Unlike most denaturing methods, which act globally, mechanical manipulation can be used to apply force solely to one domain. Thus, we changed the attachment points to apply force exclusively on the N-domain (16,61 WT*T4L)14,16 (Fig. 2b). Although a much smaller portion of the protein backbone is forced open (ΔLc<meas> = 14 ± 1 nm (n = 99); ΔLc<calc> = 14.75 nm) compared to 16,159 WT*T4L, the forces at which this construct unfolds are similar to, and even somewhat higher than, those of 16,159 WT*T4L at all pulling speeds (Fig. 3). This behaviour translates into comparable distances to the unfolding transition state (0.49 ± 0.05 nm) and extrapolated zero-force unfolding rates (1.3 × 10⁻³ ± 8 × 10⁻⁴ s⁻¹) (Supplementary Table 2), and indicates that these two constructs might unfold over similar trajectories. Unfortunately, the experimental observable, the change in extension upon unfolding, only yields information about the conformation of the amino acids between the pulling points, and therefore we cannot obtain any direct information about the conformation of the C-domain in this construct.

Figure 3: The dynamic force spectrum of *T4L variants.

The protein unfolding force dependence on experimental pulling speed. All points represent the average unfolding force measured at each pulling speed and the error bars are the standard error of the mean of the measured unfolding forces; the regression lines are included to guide the eye. 16,61 WT*T4L (pink circles, 460 total unfolding events from 48 individual proteins); 16,159 WT*T4L (maroon triangles, 1,189 total unfolding events from 47 proteins); 16,61 CP13*T4L (orange diamonds, 877 total unfolding events from 30 proteins); and 16,159 CP13*T4L full protein (yellow squares), C-domain (green squares) and N-domain (blue squares) (1,869 total unfolding events from 38 proteins).

We surmise, however, that if the C-domain unfolds during the observed mechanical unfolding of 16,61 WT*T4L, the overall free energy change of the transition should reflect the contribution of both domains and not merely that of the N-domain. It is impossible, however, to extract free energies directly from the work done unfolding the molecule along these non-equilibrium trajectories. We therefore turned to Crooks’ fluctuation theorem (CFT)22, a statistical mechanical result shown to be a powerful tool for obtaining unfolding free energies (ΔG<CFT>) from an ensemble of single molecule mechanical unfolding trajectories in which the system is far from equilibrium23,24. Application of CFT requires repeated determination of the work needed to unfold and refold the protein by integrating the force-extension curve with bounds that enclose the observed transitions. This integration, however, also includes the work carried out to reversibly stretch the DNA handles and the unfolded protein. Here, we implemented a new method that mathematically eliminates this offset (see Supplementary Information).
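Crooks' fluctuation theorem relates the unfolding and refolding work distributions through P_U(W)/P_R(−W) = exp[(W − ΔG)/k_BT], so the two distributions cross at W = ΔG. As a hedged illustration of how a free energy can be extracted from two sets of non-equilibrium work values, the sketch below uses the Bennett acceptance ratio, one common estimator derived from this relation. It is not necessarily the procedure used here (which is described in the Supplementary Information), the function names are hypothetical, and the work values are synthetic Gaussian data constructed to satisfy the theorem.

```python
import math
import random


def _fermi(x):
    """Return 1 / (1 + exp(x)), guarded against overflow for large x."""
    if x > 700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def bar_free_energy(unfold_work, refold_work, kT=1.0, tol=1e-9):
    """Bennett acceptance ratio estimate of the unfolding free energy.

    unfold_work: work done ON the molecule in each unfolding trajectory.
    refold_work: work done ON the molecule in each refolding trajectory
                 (usually negative, because the molecule does work on the trap).
    All values must share the units of kT.
    """
    n_u, n_r = len(unfold_work), len(refold_work)
    m = math.log(n_u / n_r)

    def imbalance(dg):
        lhs = sum(_fermi(m + (w - dg) / kT) for w in unfold_work)
        rhs = sum(_fermi(-m + (w + dg) / kT) for w in refold_work)
        return lhs - rhs  # increases monotonically with dg

    # Bracket the root generously, then bisect.
    candidates = list(unfold_work) + [-w for w in refold_work]
    lo = min(candidates) - 50.0 * kT
    hi = max(candidates) + 50.0 * kT
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic example in units of kT (assumed values, not experimental data).
# Gaussian work distributions obey Crooks' theorem when the mean dissipated
# work equals sigma**2 / (2 kT) in both directions.
rng = random.Random(0)
true_dg, sigma = 10.0, 2.0
unfold = [rng.gauss(true_dg + sigma ** 2 / 2.0, sigma) for _ in range(2000)]
refold = [rng.gauss(-true_dg + sigma ** 2 / 2.0, sigma) for _ in range(2000)]
estimate = bar_free_energy(unfold, refold)
print(f"true dG = {true_dg:.2f} kT, estimated dG = {estimate:.2f} kT")
```

Note that the mean unfolding work in this synthetic set exceeds the true free energy by the dissipated work, which is why averaging the work alone would overestimate ΔG for trajectories far from equilibrium.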

Application of CFT to 170 unfolding and 110 folding events for 16,61 WT*T4L yielded a value for ΔG<CFT> = 12.3 ± 0.6 kcal mol⁻¹ (Fig. 4a and Supplementary Fig. 2). In comparison, the ΔG<bulk> from ensemble equilibrium denaturation experiments is 14.1 ± 0.8 kcal mol⁻¹ (ref. 15). The similarity between these values indicates that when WT*T4L is pulled from residues 16 and 61 the entire protein unfolds, even though the attachment points flank only the N-domain, and points to a high degree of coupling between the domains. Notably, this powerful non-equilibrium approach made it possible to infer the folding status of regions of the protein not bounded by the pulling points.

Figure 4: The normalized probability curves of unfolding and refolding work of 16,61 WT*T4L and CP13*T4L.

The normalized probability curves of the work required for unfolding (in red) and refolding (in blue), measured as described in the main and Supplementary texts. The CFT was used to calculate the free energy from the single molecule experiments. a, The calculated free energy for 16,61 WT*T4L (ΔG<CFT> = 12.3 ± 0.6 kcal mol⁻¹) agrees well with the free energy measured in bulk solution unfolding experiments (ΔG<bulk> = 14.1 ± 0.6 kcal mol⁻¹) (ref. 15). b, The calculated free energy of 16,61 CP13*T4L (ΔG<CFT> = 3.6 ± 0.2 kcal mol⁻¹) corresponds to the free energy of the N-domain measured in bulk solution using native state hydrogen exchange (ΔG<NSHX> = 6.1 ± 1.0 kcal mol⁻¹) (ref. 14).

What structural element mediates this domain coupling? One obvious candidate is the topologically re-entrant helix A. To test this hypothesis we used a circular permutant, CP13*T4L (ref. 15), in which the first twelve residues are attached to the C terminus, creating two continuous domains discretely segregated in the sequence (Fig. 1c). This topological variant selectively alters the physical connectivity of the backbone, leaving all of the native interactions intact. CP13*T4L is active, folds cooperatively, and possesses a global stability and a structure very similar to those of WT*T4L (refs 15, 16). Native state hydrogen exchange studies indicate that the C-domain is energetically less coupled to the rest of the protein16.

Force application across both domains in 16,159 CP13*T4L resulted in more complex unfolding trajectories than its non-permuted counterpart (compare Fig. 2c to Fig. 2a). Three classes of unfolding transitions were observed, each associated with different structural transitions on the basis of ΔLc values (Supplementary Fig. 3a). Individual protein molecules exhibited all three of these unfolding transitions in different relaxation and stretching cycles. The first class (n = 741) corresponds to the complete unfolding of the protein: a single rip with ΔLc = 46 ± 3 nm. The second class (n = 309) also corresponds to complete unfolding of the molecule, but with two rips of unequal size in rapid succession (ΔLc = 17 ± 2 nm and ΔLc = 28 ± 3 nm); these double-rip transitions are consistent with unfolding of the N-domain followed by unfolding of the C-domain (Supplementary Figs 3a, 4, 5). The third, minor class of unfolding transitions (n = 181), consisted of a single rip with ΔLc = 28 ± 3 nm, consistent with unfolding of just the C-domain, indicating that during the previous stretch/relaxation cycle only the C-domain refolded (Supplementary Fig. 6).

Because a significant fraction of the unfolding events in the circular permutant involved an apparent three-state sequential unfolding, we wondered if some, or even all, of the apparently single cooperative unfolding rips also involved two successive transitions with a short-lived intermediate that was masked by our low sampling rate (50 Hz). We therefore repeated these experiments using a newer instrument with a 1 kHz sampling rate. Indeed, at this higher sampling rate, >90% of the unfolding events involve two transitions (see Supplementary Information). These higher resolution data indicate that, within this force regime, the dominant unfolding pathway for the circular permutant when pulled across both domains (16,159 CP13*T4L) is three-state. Importantly, similar studies on the wild-type topology (16,159 WT*T4L) did not uncover this three-state unfolding pathway (0 out of 136 showed two transitions at 1 kHz). Taken together, these results indicate that transferring the A-helix to the C terminus decouples the two domains and reduces their cooperativity.

The existence of an intermediate consisting of just a folded C-domain is consistent with both protein fragmentation and native state hydrogen exchange studies15,16. In addition, an isolated C-domain fragment is mechanically stable, showing cooperative unfolding in the 10 pN range (data not shown). Conversely, an isolated N-domain fragment is not cooperatively folded15,16 and we saw no evidence for the isolated mechanical unfolding of the N-domain. These data support the idea that in the circular permutant, the C-domain can refold independently of the rest of the protein and is mechanically stable even in the absence of an organized N-domain.

The refolding trajectories of 16,159 CP13*T4L also showed reduced interdomain coupling compared to 16,159 WT*T4L, displaying refolding trajectories with either one or two clear compactions (Fig. 2c and Supplementary Fig. 3b, c). Thus, the refolding of 16,159 CP13*T4L involves at least one kinetically resolvable intermediate (see Supplementary Information).

The forces at which 16,159 CP13*T4L unfolded were all significantly lower than those observed for 16,159 WT*T4L (Fig. 3). This permutant also has a much shallower response of unfolding force to experimental pulling speed compared to both WT*T4L constructs (Fig. 3), and a larger calculated distance to the transition state (Supplementary Table 2), indicating that 16,159 CP13*T4L is mechanically more compliant than wild type and explores a qualitatively different unfolding trajectory. This result is consistent with the A-helix no longer acting as a bridge between the domains, mechanically weakening the inter-domain region and giving rise to more compliant behaviour.

The re-entrant A-helix seems to be responsible for the high degree of cooperativity observed in WT*T4L, coupling the two domains and giving rise to an all-or-none, two-state unfolding behaviour. These results indicate that the topological organization of the polypeptide chain can dictate the degree of a protein’s inter-domain coupling, and, consequently, its mechanical unfolding pathway. To test this idea, we investigated the mechanical unfolding of 16,61 CP13*T4L. If our interpretation is correct, pulling this construct, in which the force acts directly only on the N-domain, should unfold exclusively that domain. This construct unfolds at low forces compared to its corresponding wild-type analogue (Fig. 3) in a single transition with ΔLc corresponding to the N-domain (ΔLc<meas> = 15 ± 1 nm (n = 84); ΔLc<calc> = 14.75 nm). To uncover the status of the C-domain, we used CFT to determine the free energy associated with this transition. In contrast to the ΔG<CFT> for 16,61 WT*T4L, the ΔG<CFT> for 16,61 CP13*T4L is 3.6 ± 0.2 kcal mol⁻¹ (Fig. 4b), comparable to the energy required to unfold only the N-domain in CP13*T4L (as determined from native state hydrogen exchange, ΔG<NSHX> = 6.1 ± 1.0 kcal mol⁻¹; ref. 16). Thus, CFT analysis shows that the unfolding of the N-domain in CP13*T4L is no longer coupled to the rest of the protein, confirming our inference that the discontinuous-domain topology of the wild-type polypeptide is responsible for mechanically coupling the two domains. Circular permutation removes this coupling and transforms a mechanically cooperative system into a non-cooperative one that goes through a long-lived structural intermediate in which only the C-domain is folded.

T4 lysozyme has been the subject of two previous mechanical unfolding studies using the AFM20,21. The first involved pulling across the whole protein from residues 21 and 124 and revealed a cooperative, single unfolding transition20. Residues 21 and 124 proved refractory to our DNA modification, but we observe a similar extrapolated unfolding rate and distance to the transition state for 16,159 WT*T4L. Recently, other researchers used the AFM to unfold T4 lysozyme and a circular permutant similar to the one we studied here21. In these AFM studies, the samples are polyproteins and hence each construct is pulled from its respective N and C terminus. Therefore, it is impossible to apply force to a single region of the protein in this experiment. For each construct, the authors observed multiple parallel unfolding pathways, and they were unable to directly correlate the heterogeneous changes in contour lengths with a particular structural transition. In our study, we find no evidence for such heterogeneous kinetic partitioning. Instead, our results are most consistent with the formation of a well-defined unfolding intermediate, one that we can associate by its change in contour length with the unfolding of the N-domain. The differences seen in the two studies may arise from the difference in loading rates afforded by the two methods, the nature of the sample (a tethered polyprotein versus single molecule tether), or the placement of the applied force.

Cooperativity is a hallmark of natural proteins. Mutations that lower cooperativity and increase the population of partially unfolded forms promote misfolding. Classic examples are variants of human lysozyme that fold stably25,26, but in which selective destabilization of one region lowers the coupling between the domains and allows for misfolding and amyloid formation25,26. Our results indicate that a discontinuous topology, although seemingly complex, may confer an advantage by promoting coupling between regions. In fact, circular permutation of enzymes is usually tolerated for both folding and function27, suggesting that a discontinuous topology may fine-tune the folding landscape such that these molecules avoid regions that may lead to kinetic trapping and frustration. The mechanical unfolding trajectories depend markedly on this topological feature.

Spontaneous folding and self-assembly are fundamental processes of bio-morphogenesis. Although driven thermodynamically, these processes can be frustrated by the formation of intermediates, which are often kinetically trapped and unproductive. Our results indicate that productive cooperative interactions among protein domains, and which regions of the folding landscape are explored, depend not only on the local details of the protein’s structure, but also on the topological organization of its polypeptide chain.

Methods Summary

Variants of *T4L with cysteines at specific sites were constructed using standard genetic engineering techniques. All cysteines were placed in solvent-exposed sites. Ensemble folding and equilibrium denaturation studies showed that all of the cysteine variants used here display minimal perturbation in either the energetics or folding kinetics of the protein (see Supplementary Table 1). These cysteines were then used to attach the protein to two DNA handles. The DNA handles were generated by PCR, using pGEMEX 1 plasmid DNA from Promega as template. One handle was synthesized using the primers 5′thiol-GCTACCGTAATTGAGACCAC-3′ and 5′biotin-CAAAAAACCCCTCAAGACCC-3′; the other was synthesized using the same 5′-thiol primer together with a 5′-digoxigenin version of the biotin primer. The chemistry and protocol for handle attachment were as described in ref. 18. The optical tweezer experiments were performed in 250 mM NaCl, 10 mM Tris pH 7.0 and 1 mM EDTA at ambient room temperature.

Single molecule protein tethers were identified by their DNA overstretching transition and those samples were repeatedly stretched and relaxed at a constant speed. Force-extension curves were analysed to obtain unfolding forces and changes in contour length17. The work done in folding or unfolding the protein was calculated by integrating the area under individual force-extension curves, using equal-force bounds that directly surrounded the refolding or unfolding transition (results described in detail in Supplementary Information).

Analysis of force-extension curves

After tethering a protein sample between the two beads as diagrammed in Fig. 1, the beads were moved away from one another at a constant speed until a predetermined maximum force (typically 70 pN) was attained. Individual protein tethers that displayed an overstretching transition for DNA at 67 pN were then repeatedly stretched and relaxed between 0 and 50 pN of force. These force extension curves were analysed as described previously17.

Online Methods

Identifying single molecule tethers

Single molecule tethers were identified by examining the DNA handle overstretching transition. Only those fibres that exhibited an overstretching transition of the expected length (230 nm) at the correct force (67 pN) for a single fibre of 1,116 bp DNA were selected for analysis28.

Alternate bounds of integration for the Crooks fluctuation theorem

We calculate the work done in unfolding or refolding a protein by integrating the area under a force-extension curve around the transition. This can be done by overlaying many force-extension curves and setting universal bounds of integration that enclose all of the unfolding and refolding events. The work values must then be corrected for the energy put into stretching the DNA handles and the unfolded polypeptide chain.

The unfolded protein’s stretching energy is calculated first by modelling the protein as a wormlike chain and then integrating this theoretical force-extension curve for the unfolded polypeptide chain, with lower and upper bounds defined by the distance between cysteines in the folded state, and the maximum force of the pulling protocol, respectively. The stretching energy of the DNA handles is calculated similarly. Here, we introduce a novel approach that mathematically eliminates the need to calculate the stretching energy of DNA. This removes a key source of uncertainty and simplifies the calculations.
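The stretching-energy calculation for the unfolded polypeptide can be sketched as follows. This is a minimal illustration, assuming the standard Marko-Siggia interpolation formula for the wormlike chain and a thermal energy of 4.11 pN nm (roughly room temperature); the function names are hypothetical. The persistence length (0.65 nm), the 16,159 geometry (143 residues at 0.36 nm per residue, 3.55 nm folded distance) and the 50 pN upper force come from the text.

```python
KT_PN_NM = 4.11  # assumed thermal energy near room temperature, in pN nm


def wlc_force(extension, contour_length, persistence_length=0.65, kT=KT_PN_NM):
    """Marko-Siggia interpolation: force (pN) at a given extension (nm)."""
    r = extension / contour_length
    return (kT / persistence_length) * (0.25 / (1.0 - r) ** 2 - 0.25 + r)


def wlc_extension(force, contour_length, persistence_length=0.65, kT=KT_PN_NM):
    """Invert the force law by bisection (force increases monotonically with extension)."""
    lo, hi = 0.0, contour_length * (1.0 - 1e-9)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if wlc_force(mid, contour_length, persistence_length, kT) < force:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def wlc_stretch_energy(x_start, x_end, contour_length,
                       persistence_length=0.65, kT=KT_PN_NM, steps=20000):
    """Trapezoidal integral of the WLC force between two extensions (pN nm)."""
    dx = (x_end - x_start) / steps
    total = 0.0
    for i in range(steps):
        xa = x_start + i * dx
        fa = wlc_force(xa, contour_length, persistence_length, kT)
        fb = wlc_force(xa + dx, contour_length, persistence_length, kT)
        total += 0.5 * (fa + fb) * dx
    return total


# Unfolded 16,159 segment: 143 residues x 0.36 nm per residue.
contour = 143 * 0.36
x_folded = 3.55                        # lower bound: cysteine separation when folded (nm)
x_max = wlc_extension(50.0, contour)   # upper bound: extension at the maximum force (nm)
energy = wlc_stretch_energy(x_folded, x_max, contour)
print(f"extension at 50 pN: {x_max:.2f} nm, stretching energy: {energy:.1f} pN nm")
```

Dividing the result by roughly 6.95 pN nm per kcal mol⁻¹ converts it to the units used for the free energies reported here.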

Our approach relies on the fact that, for an equilibrium transition like stretching and relaxing dsDNA, the work done in any trajectory that starts and ends at the same force is zero. We take advantage of this by integrating each force-extension curve individually, and setting the bounds of integration at two points at equal force, on either side of the unfolding or refolding event. As the DNA handles have not undergone any net conformational change, the only necessary correction is to subtract the energy put into stretching the unfolded protein between the native structure and the force at which the transition occurs.
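The equal-force integration described above can be illustrated with a short sketch. It assumes an idealized, noise-free unfolding curve with strictly increasing extension and a single rip; real data would need smoothing first, because noise produces spurious crossings of the bound force. The function names and the synthetic curve are hypothetical, and the correction for stretching the unfolded protein is not included.

```python
def upward_crossings(extension, force, level):
    """Extensions at which the force rises through `level` (linear interpolation)."""
    crossings = []
    for i in range(len(force) - 1):
        f0, f1 = force[i], force[i + 1]
        if f0 < level <= f1:
            t = (level - f0) / (f1 - f0)
            crossings.append(extension[i] + t * (extension[i + 1] - extension[i]))
    return crossings


def integrate_between(extension, force, x_a, x_b):
    """Trapezoidal area under the piecewise-linear curve between x_a and x_b."""
    work = 0.0
    for i in range(len(extension) - 1):
        x0, x1 = extension[i], extension[i + 1]
        lo, hi = max(x0, x_a), min(x1, x_b)
        if hi <= lo:
            continue  # this segment lies outside the integration bounds
        slope = (force[i + 1] - force[i]) / (x1 - x0)
        f_lo = force[i] + slope * (lo - x0)
        f_hi = force[i] + slope * (hi - x0)
        work += 0.5 * (f_lo + f_hi) * (hi - lo)
    return work


def work_with_equal_force_bounds(extension, force, bound_force):
    """Work between the first and last upward crossings of `bound_force`.

    Both bounds sit at the same force, one on each side of the rip, so the
    reversible handle-stretching contribution is the same at both ends.
    """
    xs = upward_crossings(extension, force, bound_force)
    if len(xs) < 2:
        raise ValueError("bound force must be crossed before and after the transition")
    return integrate_between(extension, force, xs[0], xs[-1])


# Synthetic sawtooth (assumed values): 1 pN/nm stiffness, rip of 3 pN after 10 nm.
extension = [float(x) for x in range(17)]
force = [x if x <= 10 else x - 4.0 for x in extension]
work = work_with_equal_force_bounds(extension, force, bound_force=8.5)
print(f"work between equal-force bounds: {work:.3f} pN nm")
```

Because each curve is integrated with its own bounds, the rip force can differ from one trajectory to the next without requiring universal bounds that enclose every event.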

This method depends on the ability to set bounds that enclose the structural transitions, but are also at the same force. Because of this, it is applicable only to two-state transitions. The WT and CP13 mutants of 16,61 *T4L both meet this condition, as the force-bearing region between the cysteines folds and unfolds in an all-or-nothing manner in both constructs.

Pulling using a higher time resolution (1 kHz)

Using a new optical tweezers instrument with a 1 kHz sampling rate, we applied a force-ramp protocol with a similar loading rate to the original data (16 pN s⁻¹): a 210 nm s⁻¹ pulling rate with an optical-trap spring constant of 0.075 pN nm⁻¹. Several (74) force-extension curves were recorded. When averaged down to a sampling rate of 50 Hz, the unfolding kinetics of the 16,159 CP13*T4L mutant replicated the original observed behaviour: some unfolding events were a single event with an extension change corresponding to the complete unfolding of the protein; some were split into two rips, with the N-terminal domain unfolding before the C-terminal domain; and some showed unfolding of just the C-terminal domain, due to incomplete refolding of the protein in the previous curve.
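The quoted loading rate is consistent with the product of pulling speed and trap stiffness, as the following minimal check shows. Treating the loading rate as this simple product is an assumption of the sketch (it neglects the compliance of the DNA handles and the protein), and the function name is hypothetical.

```python
def nominal_loading_rate(pulling_speed_nm_s, trap_stiffness_pn_nm):
    """Nominal loading rate (pN/s) as pulling speed times trap stiffness."""
    return pulling_speed_nm_s * trap_stiffness_pn_nm


# Values quoted for the 1 kHz instrument.
rate = nominal_loading_rate(210.0, 0.075)
print(f"nominal loading rate: {rate:.2f} pN/s")
```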

At 1 kHz, 61 (82%) of the curves showed a discernible pause at the unfolding intermediate. This is a marked increase from the 1,231 force-ramp curves recorded at 50 Hz, where only 25% of the transitions showed this intermediate. Nine force-extension curves (12%) showed unfolding of just the C-terminal subdomain: a similar fraction to the 15% originally recorded. The remaining four curves (5%) taken at 1 kHz showed just one unfolding event, corresponding to the full unfolding of the protein.

These results indicate that the original estimates were limited by the low sampling rate of the instrument. Of the 61 ramps showing a clear unfolding intermediate at 1 kHz, 34 show a pause at the intermediate state lasting for <40 ms: this is the minimum duration (defined as two sampling intervals) for which an intermediate state would have to persist to be observed at 50 Hz. Thus, averaging the data down to 50 Hz would leave only the remaining 27 of the 74 force-extension curves (36%) with a clear unfolding intermediate, which compares well with the original 25% estimate.
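The fractions quoted in this section can be reproduced from the event counts given in the text. The sketch below is only a bookkeeping check; the variable and function names are chosen for illustration.

```python
def percent(count, total):
    """Percentage of `total`, rounded to the nearest integer."""
    return round(100 * count / total)


# Counts at 1 kHz (74 curves in total).
curves_1khz = 74
with_intermediate = 61
c_domain_only = 9
single_rip = 4

# Counts at 50 Hz (three classes of unfolding transition).
full_single_50hz, double_rip_50hz, c_only_50hz = 741, 309, 181
curves_50hz = full_single_50hz + double_rip_50hz + c_only_50hz

# An intermediate must persist for two sampling intervals to be seen at 50 Hz.
min_visible_ms = 2 * 1000 / 50

# 34 of the 61 pauses are shorter than that and would be missed at 50 Hz.
short_pauses = 34
still_resolved = with_intermediate - short_pauses

print(f"1 kHz: {percent(with_intermediate, curves_1khz)}% with intermediate, "
      f"{percent(c_domain_only, curves_1khz)}% C-domain only, "
      f"{percent(single_rip, curves_1khz)}% single rip")
print(f"50 Hz: {percent(double_rip_50hz, curves_50hz)}% double rip, "
      f"{percent(c_only_50hz, curves_50hz)}% C-domain only")
print(f"resolved after averaging to 50 Hz: {still_resolved} of {curves_1khz} "
      f"({percent(still_resolved, curves_1khz)}%), threshold {min_visible_ms:.0f} ms")
```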

Similar experiments were carried out at this higher sampling rate on 16,159 WT*T4L. Unlike the permuted variant, this protein unfolded in a single transition. Out of 136 pulls, no double-rip transitions were observed, and the contour lengths of the observed transition were consistent with unfolding of the entire region between residues 16 and 159.