Main

Small-molecule activity-based probes (ABPs) have recently been described as a means to track protease activities in cells, tissues and whole animals (for review see refs. 1–3). By making use of the intrinsically unique chemical reactivity of each protease class as well as the substrate-recognition domains of individual proteases, it is possible to tailor probes to react selectively either with broad classes of proteases or with individual protease targets. Although considerable progress has been made in the development of protease-specific ABPs, for several key enzyme families suitable reagents are still lacking. For example, despite successful efforts to map the substrate specificity of caspases4,5, most currently available ABPs for this family are limited by lack of specificity and high background labeling when applied to crude proteomes. For enzymes such as legumain, a lysosomal cysteine protease thought to be important in the initial stages of antigen presentation, only a select few inhibitors have been described and none of these has been used to generate ABPs for application in studies of legumain function6,7,8.

We chose to focus our efforts on the acyloxymethyl ketone (AOMK) 'warhead' because of its reported high selectivity for cysteine proteases and low reactivity toward weak nucleophiles (for a comprehensive review, see ref. 9). Although a limited number of ABPs using the AOMK warhead have been reported10,11, a simple, general method for the synthesis of AOMK probes was lacking. We initially based our synthesis on a set of elegant protocols reported for acyloxymethyl, aminomethyl and thiomethyl ketones12,13. This approach requires the synthesis of 9-fluorenylmethyloxycarbonyl (Fmoc)-protected chloromethyl ketones for attachment to a solid support through a hydrazone linkage (Scheme 1a). However, it was difficult to obtain clean AOMK products when extended peptides were synthesized, because of displacement of the acyloxy group by the piperidine base used for Fmoc deprotection during peptide synthesis. We overcame this problem through an extensive study of alternative weak bases for Fmoc removal (Supplementary Fig. 1 online), in which we determined that treatment of resin-bound peptides with 5% diethylamine resulted in quantitative Fmoc removal without loss of the AOMK functional group. Furthermore, the PEG-based resin reported previously could be substituted with standard polystyrene-based resins for ease of handling and reduced costs.

An additional strategy was developed for compounds containing a P1 asparagine, in which an aspartic acid AOMK was linked to a Rink amide resin, thereby producing P1 asparagine upon cleavage. This synthesis produced higher yields of final products than the hydrazine resin approach but could only be used with amino acids with suitable side chain groups (Scheme 1b). A range of N-terminally capped peptide AOMKs containing either P1 amino acids only or P1–P3 amino acids were synthesized at yields (2–15%) that permitted subsequent biological studies (Scheme 1c).

Members of an initial library of AOMK probes containing a single P1 amino acid linked via a hexanoic acid spacer to a tyrosine residue (for iodination) and a biotinylated lysine residue (Fig. 1a) were radio-iodinated and tested against several purified and recombinant cysteine proteases of the CD clan, as members of this clan are known to have high specificity for cleavage at a single defined P1 amino acid residue (Fig. 1b)14. This group included the human caspase-3, which requires a P1 aspartic acid residue; the bacterial Arg- and Lys-gingipains from Porphyromonas gingivalis, which require a P1 arginine or lysine residue; and mouse legumain, which has been reported to require a P1 asparagine14. Among these proteases, Arg-gingipain, Lys-gingipain and caspase-3 were each selectively labeled by the probes containing predicted optimal P1 residues. In addition, the arginine and lysine P1 probes specifically labeled the endogenous Arg- and Lys-gingipains in crude cellular lysates from P. gingivalis, confirming their potential for use in proteomic studies (Supplementary Fig. 2 online). A CA-clan protease, cathepsin L, was also tested and was reactive with many of the probes, consistent with the broad tolerance of amino acid side chains in the P1 position of members of this clan15. As expected, the epoxide-based, papain family–specific probe JPM-OEt labeled only cathepsin L, illustrating the difference in the binding mechanism of AOMK and epoxide-based probes (Fig. 1b). Notably, in addition to being labeled by the P1 asparagine probe, purified human legumain was effectively labeled by the P1 aspartic acid probe (Fig. 1b).

Figure 1: Specificity of diverse P1 amino acid AOMK probes.

(a) Structure of the diverse P1 AOMK probes (7a–f). These reagents contain the reactive 2,5-dimethyl benzoic acid AOMK group linked to a range of P1 elements (see list of X groups) followed by an alkyl spacer, a tyrosine residue for labeling with 125I and a biotin-tagged lysine residue. (b) Labeling of multiple families of cysteine proteases with equal amounts of 125I-labeled versions of the P1 AOMK probes and the probe JPM-OEt.

Legumain is a CD-clan lysosomal enzyme that is thought to have a critical role in antigen processing16. Recent studies suggest legumain is synthesized as a pre-pro-enzyme that undergoes multiple processing events during its maturation17,18,19. It is not clear, however, which of the final or intermediate forms of legumain are proteolytically active within the cell. Labeling of tissue homogenates from wild-type and legumain-deficient knockout mice with Asp-AOMK revealed a single 38-kDa species that was absent in tissues from the legumain-deficient mice (Fig. 2a). Pretreatment with JPM-OEt identified the remaining 30-kDa species as a lysosomal cathepsin (most likely cathepsin B, based on molecular weight). Further labeling studies with Asp-AOMK showed that the 38-kDa probe-reactive enzyme was present in a wide variety of immune cell lines (Fig. 2b). Immunoprecipitation studies confirmed that the 38-kDa species was legumain (Fig. 2c). These results suggest that if intermediate forms of legumain exist, they either are not catalytically active or are rapidly processed to the final 38-kDa form. Thus, our results suggest that a single cleaved form of human legumain is the predominant mature, active form in tissues and cells. Previous studies have relied on indirect methods to measure the activity of human legumain and have therefore been unable to confirm the molecular identity of the native active species. Notably, the purified recombinant mammalian enzyme retains activity but is unable to process itself to the physiologically relevant 38-kDa form18, highlighting the importance of tools that make it possible to study the endogenous enzyme.

Figure 2: ABPs label endogenous legumain and reveal a pH switch that governs legumain specificity.

(a) Spleen lysates from wild-type and legumain-knockout mice were pretreated with the indicated inhibitor and then labeled with Asp-AOMK (7e). Probe-labeled proteins were visualized by streptavidin affinity blotting. (b) Labeling of B and T cell lysates with Asp-AOMK (7e) followed by affinity blotting. (c) Lysate from 8.1.6 cells was incubated with Asp-AOMK (7e) and then legumain was immunoprecipitated. I, input; P, pellet after immunoprecipitation; S, supernatant after immunoprecipitation. (d) Lysates from 8.1.6 cells prepared at the indicated pH were incubated with Asn-AOMK (7f) or Asp-AOMK (7e). Probe-labeled proteins were visualized by affinity blotting (above); legumain was visualized by immunoblotting (below). An endogenously biotinylated protein (*) observed at high pH values was present in mock-labeled samples (data not shown).

Recent studies suggest that legumain autoactivates through cleavage at aspartic acid18,19. Given that this cleavage occurs at acidic pH, we hypothesized that a low-pH environment would cause protonation of the side chain of aspartate, eliminating its negative charge and permitting binding to the S1 pocket of legumain. Analysis of labeling at a range of pH values indicated that legumain bound the Asp-AOMK probe at acidic pH, while showing a much broader pH window for activity against Asn-AOMK (Fig. 2d). Notably, legumain's retention of activity against Asn-AOMK at pH values at which it shows no reactivity with Asp-AOMK indicates that alterations in the specificity of legumain occur through the influence of pH on the inhibitor, not the protease. These findings suggest that legumain activation may be governed by a pH-dependent specificity switch that allows the activation to be controlled by trafficking of the protein to areas of acidic pH (such as lysosomes).
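The protonation argument above can be made roughly quantitative with the Henderson-Hasselbalch relation. The following is an illustrative sketch only: it assumes a side-chain pKa of about 3.9, a typical textbook value for an aspartate carboxylate and not a value measured for the Asp-AOMK probe in this work, and the function name is hypothetical.

```python
def fraction_protonated(ph: float, pka: float = 3.9) -> float:
    """Fraction of a carboxylic acid present in its neutral (protonated) form.

    Henderson-Hasselbalch: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    The default pKa is an assumed textbook value, not a measured one.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Scan a range of pH values similar in spirit to a pH-profile experiment.
    for ph in (4.0, 4.5, 5.0, 5.5, 6.0, 7.0):
        print(f"pH {ph:.1f}: {fraction_protonated(ph):.4f} protonated")
```

Under this assumption the protonated fraction falls roughly tenfold for each pH unit above the pKa, which agrees in direction with the loss of Asp-AOMK labeling at higher pH. The sketch does not model the enzyme or the local environment of the S1 pocket, either of which could shift the effective pKa.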

The caspases are key regulators of cell death, functioning in cascades that begin with the activation of upstream initiator caspases that subsequently activate downstream effector caspases (for review, see ref. 20). Even though this family of proteases has been the focus of intense studies for more than a decade, there are few chemical tools available to study specific pathways of apoptosis. We decided to test the selectivity of our AOMK probes by labeling a panel of recombinant human caspases. The Asp-AOMK probe efficiently labeled caspase-3, caspase-6, caspase-7 and caspase-8 but not caspase-9 (Fig. 3a). Because the absence of caspase-9 labeling indicated that the enzyme might require an extended peptide binding sequence, we synthesized bEVD-AOMK. This probe showed robust labeling of caspase-9 as well as caspase-3, caspase-7 and caspase-8. Notably, extension of the peptide scaffold reduced labeling of caspase-6, suggesting that this enzyme may have a strong specificity outside of the P1 position that does not tolerate the EVD sequence. Although the commercially available bVAD-FMK produced robust labeling of the purified recombinant caspases, it also yielded high background labeling of contaminants in the catalytic mutant caspase-3 (3-C285A), caspase-8 and caspase-9 samples (Fig. 3a).

Figure 3: AOMK-based probes selectively target caspases in crude proteomes.

(a) Activity-based labeling of recombinant, purified caspases using Asp-AOMK (7e), an extended peptide AOMK with P2 and P3 elements, bEVD-AOMK (11), and a commercially available fluoromethyl ketone (bVAD-FMK). (b) Labeling of endogenous caspases in crude cellular extracts from SHSY5Y cells that were made to undergo apoptosis by addition of cytochrome c and dATP. A control in which samples were heated to 90 °C for 5 min before labeling is included. Locations of caspase-3, caspase-7 and caspase-9 are shown based on molecular weight.

To assess probe labeling of endogenous caspases, cytosolic extracts from the neuroblastoma cell line SHSY5Y were treated with cytochrome c and dATP to initiate caspase-9 activation and subsequent activation of caspase-3 and caspase-7 (ref. 21). Treatment of cell extracts with either bVAD-FMK or bEVD-AOMK produced a triplet of labeled proteins around 17–20 kDa, corresponding to activated caspase-3 and caspase-7 as well as to forms of these proteases with and without their N-terminal peptides22,23. However, high background signal from bVAD-FMK obscured visualization of labeled caspase-9 (at 35 kDa), whereas bEVD-AOMK labeling of caspase-9 could be clearly visualized (Fig. 3b). Together these results suggest that AOMK-based probes show lower background than the more reactive FMK-based reagents. Furthermore, these data show that it is possible to modulate the specificity of an ABP to target subsets of caspase family members. Further studies are currently underway in our laboratory to generate libraries of probes as a means to identify selective reagents that will allow specific components of the apoptotic pathways to be examined in whole-cell systems.

The papain family of cysteine proteases belongs to clan CA, includes cysteine cathepsins and comprises numerous members with diverse cellular functions (for review, see ref. 24). We next applied our optimized solid-phase methodology to develop new AOMK probes to target specific cysteine cathepsins. Using a competition labeling approach in crude tissue extracts, we were able to analyze the potency and selectivity of a small library of compounds synthesized using our solid-phase methodology (Fig. 4a). Quantification of competition data provided IC50 values for all compounds against each of the cathepsin targets found in the extract (Fig. 4b). The commercially available cathepsin inhibitor III (Calbiochem; Fig. 4c) was used as a starting point for probe design, as its reactive group is similar in structure to that of AOMK. Notably, conversion of the O-acyl hydroxamate warhead of cathepsin inhibitor III to the structurally related AOMK group resulted in complete loss of inhibition (Fig. 4c). Substitution of the p-methoxy group for the 2,6-dimethyl substitution yielded a highly selective, cell-permeable (data not shown) probe for cathepsin B that had greater potency than the parent compound. This is a significant finding because all currently available cathepsin B–specific inhibitors, such as CA-074, contain a charged carboxylate that prevents them from penetrating membranes and that cannot be modified without loss of selectivity25,26. Further modulation of the P1 and P2 elements produced probes with varying degrees of potency and selectivity for cathepsin B. In particular, replacement of the P1 glycine with a basic arginine resulted in a highly potent and selective cathepsin B inhibitor, Z-FR-AOMK.

Figure 4: Optimization of cathepsin B–specific AOMK probes.

(a) Screening of an AOMK probe for selectivity and potency using a competition assay in crude tissue extracts. The compound is added to rat liver extracts and endogenous cathepsins are then labeled with the general probe [125I]DCG-04. (b) Quantification of competition from a, using a log plot of percentage competition versus probe concentration. Crude IC50 values are obtained from the linear portion of the plot. (c) Table of IC50 values (in μM) for a range of peptide AOMKs obtained by competition analysis as shown in a and b. N.I. indicates no measurable inhibition of labeling up to 50 μM.

In conclusion, we outline an optimized solid-phase synthesis method that can be used to generate diverse AOMK-based probes that target multiple cysteine protease families. We have shown that the AOMK warhead is an ideal pharmacophore for profiling cysteine protease activities in complex proteomes and confirmed that highly selective probes can be achieved using this functional group. Work is currently in progress in our laboratory to generate libraries of AOMK probes containing both natural and non-natural amino acids so as to identify additional activity-based labeling reagents that can be used for further dissection of complex proteolytic cascades.

Methods

Synthetic protocols.

See Supplementary Methods online.

Activity-based labeling of recombinant proteases.

Recombinant caspases were prepared as described27. Recombinant proteases were gifts: mouse legumain (H. Chapman, University of California San Francisco), Arg and Lys gingipains (J. Potempa, University of Georgia) and human cathepsin L (V. Turk, Jozef Stefan Institute). Caspase-3, legumain, cathepsin L, and gingipain R and K were diluted into 50 μl of the indicated reaction buffer: 100 mM Tris, 10 mM DTT, 0.1% CHAPS, 10% sucrose, pH 7.4 for caspase-3; 50 mM Tris, 10 mM DTT, 5 mM MgCl2, pH 7.6 for the gingipains; and 50 mM acetate, 2 mM DTT, 5 mM MgCl2, pH 5.5 for cathepsin L and pure legumain. Enzymes were labeled with 125I-labeled P1 AOMK probes (10^6 total c.p.m.) for 30 min at room temperature (24 °C). Total protein loaded per lane was as follows: 33 ng for caspase-3, 500 ng for the gingipains, 500 ng for cathepsin L and 3 μg for legumain. For labeling of multiple caspases with biotin-labeled probes, 100 nM of active site–titrated enzymes was incubated with 10 μM of different ABPs in 100 mM Tris, 10 mM DTT, 0.1% CHAPS, 10% sucrose, pH 7.4 for 90 min at room temperature, resolved by SDS-PAGE and transferred to a polyvinylidene difluoride (PVDF) membrane as described23. Total protein loaded per lane was 33 ng for caspase-3, 60 ng for caspase-8 and 450 ng for caspase-9. Biotinylated proteins were visualized by staining with VectaStain reagent (Vector Labs).

Labeling of caspases in crude cell extracts.

Cytosolic extracts from SHSY5Y cells prepared as described were incubated with 10 μM cytochrome c and 1 mM dATP in the presence of 1 μM bVAD-FMK (MP Biomedicals), bEVD-AOMK or Asp-AOMK28. Samples were incubated at room temperature for 90 min, after which SDS-PAGE sample buffer was added. Samples containing 50 μg of total protein per lane were resolved by SDS-PAGE, and labeled proteins were detected as described earlier for recombinant caspases.

Labeling of gingipains in crude cell lysates.

See Supplementary Methods.

Labeling of legumain in mouse tissue homogenates.

Tissues from mouse organs were Dounce-homogenized in 50 mM sodium acetate, 2 mM EDTA, 5 mM DTT (pH 5.5) and normalized to 1 mg ml−1. Lysates (50 μg total protein) of spleen from wild-type and legumain-knockout mice were incubated either with DMSO controls, JPM-OEt (50 μM), or PMSF (2 mM) for 30 min at 37 °C. Asp-AOMK was then added to a final concentration of 10 μM and incubation continued for an additional 30 min at 37 °C. Samples were resolved by SDS-PAGE and transferred to a nitrocellulose membrane, and biotinylated proteins were visualized using VectaStain reagent (Vector Labs).
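The normalization step above is a simple C1V1 = C2V2 dilution. The sketch below shows the arithmetic; the function names are hypothetical, and any measured lysate concentration passed in is an assumed input, as the protein concentrations before normalization are not reported here.

```python
def dilute_to_target(stock_mg_per_ml: float, stock_volume_ul: float,
                     target_mg_per_ml: float = 1.0) -> float:
    """Return the buffer volume (ul) to add so the lysate reaches the target.

    Uses C1 * V1 = C2 * V2, where V2 is the final volume after dilution.
    """
    if stock_mg_per_ml < target_mg_per_ml:
        raise ValueError("lysate is already more dilute than the target")
    final_volume_ul = stock_mg_per_ml * stock_volume_ul / target_mg_per_ml
    return final_volume_ul - stock_volume_ul


def volume_for_mass(mass_ug: float, conc_mg_per_ml: float = 1.0) -> float:
    """Return the volume (ul) containing mass_ug of protein (1 mg/ml == 1 ug/ul)."""
    return mass_ug / conc_mg_per_ml
```

For a lysate normalized to 1 mg ml−1, `volume_for_mass(50.0)` gives 50 μl, the volume that contains the 50 μg of total protein used per labeling reaction.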

Labeling of legumain in immune cell extracts.

Frozen cell pellets from the B cell lines 2.2.93, 8.1.6, Raji and 9.5.3 and the Jurkat T cell line were lysed in 50 mM citrate phosphate buffer, 1% CHAPS, 0.5% Triton, 5 mM DTT for 10 min on ice. After centrifugation at 14,000g for 15 min at 4 °C, total protein concentration of cleared lysates was adjusted to 1 mg ml−1 by addition of citrate phosphate buffer (pH 5.8), 0.1% CHAPS, 5 mM DTT. Lysates (50 μg of total protein) were labeled with Asp-AOMK (10 μM) for 30 min at 37 °C. Samples were resolved by SDS-PAGE and visualized as described above. For subsequent studies, 8.1.6 cells were cultured in RPMI 1640 supplemented with 10% fetal calf serum (FCS), 2 mM glutamine, 100 U of penicillin per ml and 100 μg of streptomycin per ml and maintained in 5% CO2 at 37 °C.

Immunoprecipitation of labeled legumain.

Lysate from the B cell line 8.1.6 was prepared as described above, diluted to a final concentration of 1 mg ml−1 in citrate phosphate buffer (pH 5.8), 0.1% CHAPS, 5 mM DTT, and incubated with 10 μM Asp-AOMK for 30 min at 37 °C (ref. 18). Legumain was then immunoprecipitated using polyclonal anti-legumain antisera and biotinylated proteins visualized as described above (ref. 18).

pH-sensitivity studies of legumain labeling.

Equal numbers (10^7) of 8.1.6 B cells were lysed in 50 mM citrate phosphate buffer, 1% CHAPS, 0.5% Triton, 5 mM DTT of the indicated pH, then normalized to 1 mg ml−1 in citrate phosphate buffer, 0.1% CHAPS, 5 mM DTT of the same pH. Lysates (50 μg of total protein per sample) were labeled with 1 μM of the indicated probe for 30 min at 37 °C and resolved by SDS-PAGE, and biotinylated proteins were visualized as described above. Legumain was visualized using polyclonal antisera18.

Competition labeling and determination of IC50 values.

Rat liver lysates (50 μg total protein in 5 μl reaction buffer (50 mM sodium acetate, 2 mM DTT, 5 mM MgCl2, pH 5.5)) were incubated at room temperature for 30 min with 0.5 μl of each compound (diluted from 50 mM DMSO stocks) to give the desired final concentration. [125I]DCG-04 (10^6 c.p.m.) was then added and samples incubated at room temperature for 45 min. Samples were resolved by SDS-PAGE and analyzed using a Typhoon 9410 imager (Amersham Biosciences), and labeled bands were quantified with Scion Image (Scion Corp.). IC50 values were determined from the linear portion of a plot of numerical values versus inhibitor concentration.
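The crude IC50 estimate described above can be sketched as a least-squares line through the points in the linear portion of the curve, solved for 50% competition. This is a minimal illustration, assuming that percentage competition is plotted against log10 of compound concentration; the function name and the example numbers are hypothetical and are not data from this study, and choosing which points lie in the linear portion is left to the user.

```python
import math


def crude_ic50(concs_um, percent_competition):
    """Estimate IC50 (same units as concs_um) from points in the linear range.

    Fits percent competition = slope * log10(conc) + intercept by ordinary
    least squares, then solves for the concentration giving 50% competition.
    """
    if len(concs_um) != len(percent_competition) or len(concs_um) < 2:
        raise ValueError("need at least two matched (concentration, %) points")
    xs = [math.log10(c) for c in concs_um]
    ys = list(percent_competition)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("concentrations must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    if slope == 0:
        raise ValueError("flat response; IC50 is undefined")
    intercept = mean_y - slope * mean_x
    return 10 ** ((50.0 - intercept) / slope)


if __name__ == "__main__":
    # Hypothetical example values only.
    concs = [0.1, 1.0, 10.0]
    competition = [20.0, 50.0, 80.0]
    print(f"crude IC50 ~ {crude_ic50(concs, competition):.2f} uM")
```

A linear fit of this kind is only a rough estimate; a full sigmoidal dose-response fit would be an alternative where the data cover both plateaus.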

Accession codes.

BIND identifiers (http://bind.ca/): 262187, 262188, 262189, 262190, 262191, 262192, 262193, 262194, 262195, 262196, 262197, 262198, 262199, 262200, 262201, 262202, 262203, 262204, 262205, 262206, 262207, 262208.

Note: Supplementary information is available on the Nature Chemical Biology website.

Scheme 1

(a) Synthesis of Fmoc-protected chloromethyl and bromomethyl ketones (2a–f) containing a range of amino acid side chains. (b) Solid-phase synthesis of P1 asparagine AOMK peptides. The Fmoc protected Asp-AOMK (4f) was synthesized from the corresponding BMK (2f) and was linked directly to a Rink amide resin through its side chain carboxylate (5f). Solid-phase peptide synthesis and resin cleavage methods outlined in c were used to produce a P1 asparagine AOMK (7f). (c) Solid-phase synthesis of peptide AOMKs using a hydrazine resin. Peptide chloromethyl ketones (2a–e) were linked to the resin through a hydrazone linkage (5a–e) and extended using the indicated optimized solid-phase peptide synthesis method (6). A range of differentially capped single amino acids and di- and tri-peptides were synthesized. Compounds are numbered and assigned lowercase letters based on the identity of the P1 side chain: a, glycine; b, arginine; c, leucine; d, lysine; e, aspartic acid; f, asparagine. AcOH, acetic acid; DCM, dichloromethane; DIC, N,N'-diisopropylcarbodiimide; Fmoc, 9-fluorenylmethoxycarbonyl; HOBT, 1-hydroxybenzotriazole; TFA, trifluoroacetic acid; THF, tetrahydrofuran; RT, room temperature; Z, benzyloxycarbonyl.