Main

Targeted proteomics is a powerful approach that enables quantitative analysis of peptides from complex biological samples with high sensitivity and specificity1,2. However, a major bottleneck limiting wider application of targeted proteomics has been the identification of optimal proteotypic peptides that are readily detectable by the mass spectrometer as well as the characteristic fragmentation patterns of these peptides.

Because of differences in physiochemical properties, different peptides from the same protein can produce drastically different signal intensities when measured with a mass spectrometer1. Peptides are referred to as 'proteotypic' if they (i) are unique to a given protein, (ii) have good response characteristics in the mass spectrometer and (iii) have a fragmentation pattern with salient features to accurately detect and quantify. Traditional strategies for identifying proteotypic peptides and their fragmentation patterns have relied on the combination of experimental data with bioinformatic analyses. A common approach has been to use peptides catalogued in the course of 'shotgun' proteomic experiments conducted by data-dependent acquisition3,4. This approach assumes that the peptides most frequently identified in shotgun experiments will produce the best response in a targeted proteomics setting. This assumption also underlies the application of machine-learning methods, which aim to predict proteotypic peptides (but not their fragmentation spectra) de novo5,6. Complicating these efforts, a large subset of the human proteome is absent from fragmentation spectra databases, and this deficit is particularly acute for low-abundance proteins such as transcription factors and kinases. To generate such peptide fragmentation data, large-scale efforts aim to synthesize predicted proteotypic peptides and empirically determine their fragmentation patterns7. However, which, if any, of these approaches is best suited for sensitive targeted proteomic analyses is unknown.

Here we report an empirically driven approach for generating both optimal proteotypic peptides and their fragmentation patterns in a scalable, economical and generalizable fashion. Rather than relying on sparsely populated spectral databases3,4, prediction algorithms5,6, costly peptide synthesis7 or the costly purchase of full-length proteins8, we leveraged the rich collection of tagged cDNA clones that are currently available for most human and model-organism proteins9,10 to generate in vitro–synthesized full-length protein samples, followed by tryptic digestion and mass spectrometry analysis using selected reaction monitoring (SRM). Because all monitored tryptic peptides for each protein originate from the same full-length protein molecules, we can compare the relative intensities of different peptides to identify those that provide the most sensitive proxy for the target protein. In addition to determining the relative peptide response, we can identify in parallel the fragmentation patterns of these peptides in a triple-quadrupole mass spectrometer using SRM (Fig. 1).

Figure 1: Development of targeted proteomics assays using enriched in vitro–synthesized full-length proteins.

(a) Transcription factor family membership for the proteins for which targeted assays were built. (b) Schematic of the synthesis, enrichment, digestion and analysis of proteins to identify proteotypic peptides and their fragmentation patterns. (c) Target protein enrichment and purity were analyzed for 46 samples by immunodetection with an antibody to schistosomal GST (left) and silver staining (right). (d) SRM chromatographic traces from the NFIA peptide EDFVLTVTGK. Insert, magnification of the chromatographic peak marked by the arrowhead. (e) EWSR1 peptide intensities (arbitrary units).

To demonstrate our approach, we studied transcription factors, a diverse class of low-abundance proteins with a paucity of spectral data in public databases (Supplementary Fig. 1). We selected 96 human transcription factor proteins spanning all major structural families11 (Fig. 1a). For each of these proteins, we obtained full-length cDNA clones contained in an in vitro transcription and translation–compatible vector with an in-frame C-terminal Schistosoma japonicum glutathione S-transferase (GST) tag12 (Supplementary Data 1). We then optimized in vitro protein production and purification in a 96-well plate format. We tested different protein production, capture, wash and digestion conditions to develop a protocol that gave maximal protein yield at the highest possible purity (Online Methods). To verify that enriched full-length proteins were produced, we performed silver-staining and western-blotting analyses for 46 of the 96 proteins (Fig. 1c and Supplementary Fig. 2). For nearly all of the tested proteins, the target protein and the two endogenous glutathione-binding proteins GSTM3 and EEF1G were the top three most intense bands as determined by silver staining, indicating that SRM signal contamination should be minimal. Of the tested clones, 96% (44 of 46 clones) produced highly enriched proteins with the correct molecular weight. The remaining two samples produced multiple species of different molecular weights, likely originating from alternative methionine start codons.

For each protein, we selected peptides and fragment ions to measure using Skyline13, an open-source application for building SRM methods and analyzing the resulting mass spectrometry data. We focused our analysis on predicted fully tryptic peptides with lengths of 7–23 amino acids. For each doubly charged monoisotopic precursor, we monitored singly charged monoisotopic y3 to yn − 1 product ions using a TSQ-Vantage triple-quadrupole mass spectrometer. We imported these measurements into Skyline to identify the relative peptide responses and their fragmentation patterns (Fig. 1d,e). An annotated Skyline file containing the measured peptides and fragment ions for all 96 proteins is available at http://proteome.gs.washington.edu/supplementary_data/IVT_SRM/.
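The peptide-selection step can be illustrated with a minimal in silico digest. This is a sketch only: in the study the selection was done in Skyline, the function name `tryptic_peptides` is hypothetical, and the rule of not cleaving before proline is an assumption that the text does not state. The example sequence is a made-up string, not a real protein.

```python
def tryptic_peptides(sequence, min_len=7, max_len=23):
    """Return fully tryptic peptides (no missed cleavages) of 7-23 residues.

    Assumed cleavage rule: cut C-terminal to K or R unless the next
    residue is P. The source only says "fully tryptic".
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder if the sequence does not end in K/R.
        peptides.append(sequence[start:])
    return [p for p in peptides if min_len <= len(p) <= max_len]


# Hypothetical toy sequence used only to exercise the function.
toy_sequence = "MAEDFVLTVTGKRPLLLEYLEEKAAK"
candidates = tryptic_peptides(toy_sequence)
```

Here the short fragment `AAK` is dropped by the length filter, and the `RP` bond is left uncleaved under the assumed proline rule.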

To quantify the amount of each protein synthesized, we spiked heavy isotope–labeled forms of the schistosomal GST peptides LLLEYLEEK and IEAIPQIDK into each in vitro synthesis reaction. We measured the light-to-heavy isotope ratio of these two peptides and calibrated this ratio using an absolute quantification curve containing the same amount of the heavy isotope–labeled peptides but different known quantities of the light isotope–labeled peptide (Supplementary Fig. 3 and Supplementary Note 1). Using this approach we determined that all of the 96 tested proteins produced at least 0.5 nM of product (Fig. 2a).

Figure 2: Targeted assays can be efficiently developed using in vitro–synthesized proteins and applied to measure proteins in vivo.

(a) Absolute quantity of each in vitro–synthesized protein sample, as measured using a tryptic peptide contained in the C-terminal schistosomal GST tag. (b) Number of peptides per protein empirically assessed with salient features to accurately detect and quantify the target proteins (peptides with a quality score of either 1 or 2). (c) Proteotypic peptides identified using in vitro–synthesized CTCF were monitored in K562 nuclear extracts. The relative contribution of each fragment ion to each peptide peak is displayed as different colors. (d) For each CTCF proteotypic peptide, the relative signal intensity observed using in vitro–synthesized protein is displayed alongside the relative signal intensity observed using K562 nuclear extract. Peptides not observed (n.o.) in K562 nuclear extracts are indicated. (e) Measured relative abundance of four transcription factors between the fibroblast (BJ), hepatic carcinoma (HepG2), erythroleukemia (K562) and neuroblastoma (SKNSH) human cell lines. Error bars, s.d. (n = 6).

To determine peptide-signal quality we manually analyzed chromatographic data for each peptide. Each peptide was given a quality score between 1 and 4, with 1 being the highest quality (Online Methods). For subsequent analysis we considered only peptides with a quality score of either 1 or 2. On average we identified eight peptides per protein with a quality score of 1 or 2. All but two of the proteins assayed (CEBPG and HMGA1) had at least one peptide with a quality score of 1 or 2 (Fig. 2b and Supplementary Data 1). Although sufficient quantities of both CEBPG and HMGA1 protein were produced using our in vitro approach (Supplementary Fig. 2) and the proteins were sufficiently digested as indicated by the mass spectrometry responses of the GST peptides, none of the monitored tryptic peptides from these two proteins gave a good response in the mass spectrometer. This suggests that a small minority of transcription factor proteins may not be amenable to proteomic analysis using trypsin-based digestion.

To determine our fragmentation-pattern quality, we compared our observed peptide fragmentation patterns with those contained in the US National Institute of Standards and Technology (NIST) spectral database. Of the 760 peptides in our dataset with a quality score of 1 or 2, only 18% (136 peptides) were represented in the NIST database (Online Methods). Of these, all had high spectral similarity scores, with 93% having dot-products >0.85 (Supplementary Fig. 4). This finding corroborates both our data and the NIST database and highlights the scarcity of proteotypic peptides in large spectral databases.

We next determined the utility of predictor algorithms and shotgun analyses to identify optimal proteotypic peptides. A comparison of our empirical ranking of proteotypic peptides with peptide rank predictions from the ESPPredictor algorithm6 revealed Spearman correlations from −0.45 to 0.85 with an average correlation of 0.47 (Supplementary Data 2 and Supplementary Fig. 5). Similarly, roughly half of the optimal proteotypic peptides from our experiments were undetected in shotgun analyses of the identical samples (Supplementary Fig. 6). Although these approaches are better than selecting proteotypic peptides at random, our results suggest that current predictor algorithms and spectral counting approaches provide imperfect ranking and identification of optimal proteotypic peptides, potentially limiting the utility of large-scale peptide-synthesis efforts that rely on such approaches as a first-round filter7.

Finally, we sought to confirm the utility of proteotypic peptides identified using our approach for in vivo analyses and to determine how the in vitro–derived intensity rankings compared with those from complex biological samples. To test this, we first monitored all 12 of the quality score 1 and 2 peptides from the genomic master regulatory transcription factor CTCF in trypsin-digested nuclear lysate from erythroleukemia cells (K562). Using the fragmentation patterns identified in vitro, we identified corresponding chromatographic peaks for six of these CTCF peptides in K562 nuclear extract (Fig. 2c). The relative intensity of these peptides in vitro and in vivo closely matched, confirming the relevance of the rank order of peptides identified empirically using in vitro–synthesized protein (Fig. 2d). Next, we selected top-ranking peptides from four transcription factors and used these to generate nuclear abundance measurements of these factors across four distinct cell types (Fig. 2e). The relative abundance measurements were consistent with previous reports on the tissue distribution of these transcription factors using RNA abundance14,15.

In summary, we demonstrated and validated a rapid and cost-efficient method for empirical identification of optimal proteotypic peptides and their fragmentation patterns using in vitro–synthesized proteins. Our method can be applied to generate assays to identify and quantify structurally diverse low-abundance proteins, such as human transcription factors, in unfractionated cellular extracts.

Methods

Clones and plasmids.

All of the clones used are from the pANT7_cGST clone collection distributed by the Arizona State University Biodesign Institute plasmid repository. These full-length cDNA clones contain a T7 transcriptional start sequence as well as an internal ribosome entry site (IRES), which is compatible with in vitro transcription-translation reagents11. Additionally, each clone contains an in-frame fused C-terminal Schistosoma japonicum GST tag. Each bacterial stock clone was grown overnight in 5 ml of Luria broth with 100 μg ml−1 ampicillin (LB-amp). Plasmid DNA was extracted using the manufacturer's mini-prep protocol with the exception of an additional wash with PE buffer (Qiagen). All plasmid stocks were Sanger sequenced (University of Washington High Throughput Genomics Unit) using an M13 priming site upstream of the T7 promoter to confirm the identity of the insert and to ensure that there was no contamination of the plasmid stocks.

Peptides.

We obtained 0.1 mg of FasTrack crude 'heavy' [13C6,15N2]L-lysine–labeled LLLEYLEEK and IEAIPQIDK peptides to use as internal standards (Thermo). The LLLEYLEEK peptide was resuspended in 75% acetonitrile and 0.1% formic acid in H2O. The IEAIPQIDK peptide was resuspended in 5% acetonitrile in H2O. Unlabeled AQUA Ultimate light LLLEYLEEK and IEAIPQIDK peptides, provided at a concentration of 5 pmol μl−1 (assessed by the manufacturer by amino acid analysis), were obtained to use as calibration standards (Thermo).

Protein production and purification.

Protein production and purification were optimized to be performed in 96-well plate format. Different protein production conditions, capture conditions, wash conditions and digestion conditions were tested to identify a protocol that gave maximal protein yield at the highest possible purity. The final protocol takes one person 2 d to transform a 96-well plate of plasmids into desalted peptide samples ready for mass-spectrometry analysis with a cost of less than $20 per protein (Supplementary Fig. 7).

Protein production conditions.

Proteins were synthesized from plasmid DNA using the Pierce Human In vitro Protein Expression kit (Thermo) according to the manufacturer's protocol with some slight modifications. Briefly, 1 μg of plasmid DNA was transcribed at 32 °C for 70 min in a 20-μl transcription reaction supplemented with 0.3 μl RNase inhibitors (Thermo). Two microliters of the transcription reaction was then added to a 23 μl translation reaction mix and incubated at 30 °C for 2 h. The translation reaction was then spiked with an additional 2 μl of the transcription reaction and incubated at 30 °C for an additional 2 h.

Protein capture conditions.

To enrich the GST-fusion protein, we used 2 ml of glutathione sepharose 4B beads (GE), washed 3 times with 15 ml 1× Dulbecco's phosphate-buffered saline (DPBS; Gibco) and resuspended in 12.5 ml of 1× DPBS. A 125-μl aliquot of the washed bead slurry was added to each well of eight 12-well strip-tubes such that each well received 20 μl of packed beads. Completed translation reactions were added to the beads and the bead-protein mixture was rocked end-over-end for 16 h at 4 °C.
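The bead volumes above can be checked with a short calculation. This is a sketch under one assumption: that 12.5 ml is the total volume of the final slurry (packed beads plus DPBS), which the text does not state explicitly. Variable names are illustrative.

```python
# Values taken from the protocol text; all volumes in microliters.
packed_beads_ul = 2000.0      # 2 ml of glutathione sepharose 4B beads
slurry_volume_ul = 12500.0    # assumed total slurry volume (beads + DPBS)
aliquot_ul = 125.0            # slurry added to each well
wells = 8 * 12                # eight 12-well strip-tubes

# Packed beads delivered per well if the slurry is evenly suspended.
packed_per_well_ul = aliquot_ul * packed_beads_ul / slurry_volume_ul

# Total slurry consumed by a full plate, and what is left over.
slurry_used_ul = wells * aliquot_ul
slurry_spare_ul = slurry_volume_ul - slurry_used_ul
```

Under this assumption each well receives 20 μl of packed beads, matching the text, and a full plate uses 12 ml of the 12.5 ml slurry.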

Bead wash conditions.

Bead washing was staggered to ensure that only two 12-well strip-tubes were washed at a given time. Limiting the number of tubes washed at a time reduced the total wash time for each reaction to less than 25 min. The bead-protein mixture was sedimented at 500g for 2 min using a swinging plate rotor. The supernatant was removed and 150 μl of wash buffer (1× DPBS supplemented with 863 mM NaCl) was added to the beads. The beads were mixed by inverting several times and sedimented at 500g for 2 min. The beads were washed twice with 150 μl wash buffer (1× DPBS supplemented with 863 mM NaCl) each and twice with 150 μl 50 mM ammonium bicarbonate (pH 7.8) each. After the last wash, the beads were resuspended in 100 μl Elution Buffer (0.05% PPS silent surfactant (Protein Discovery), 5 nM heavy isotope–labeled GST peptide LLLEYLEEK (+8 Da) (Thermo), 5 nM heavy isotope–labeled GST peptide IEAIPQIDK (+8 Da) (Thermo) and 50 mM ammonium bicarbonate (pH 7.8)) and stored at 4 °C until all eight 12-well strip-tubes had been washed. Ten microliters of each enriched protein sample was added to 4 μl 4× LDS buffer (Invitrogen) and saved for silver staining and western blotting.

Protein digestion.

Bead-bound protein samples were boiled at 95 °C for 5 min, reduced with 5 mM dithiothreitol (DTT) at 60 °C for 30 min and alkylated with 15 mM iodoacetamide (IAA) at 25 °C for 30 min in the dark. Proteins were then digested with 400 ng trypsin (Promega) at 37 °C for 2 h while shaking. Beads were then sedimented at 500g for 2 min and the supernatant, which contained the digested peptides, was transferred to a new 96-well collection plate. The beads were washed once with 150 μl 50 mM ammonium bicarbonate pH 7.8, and the supernatant from this wash was combined with the previous supernatant. The pH of the supernatant sample was adjusted to <3.0 by adding 5 μl of 5 M HCl, and the sample was incubated at 25 °C for 20 min. The digested samples were desalted using a 96-well Oasis MCX plate 30 mg per 60 μm (Waters) following the manufacturer's protocol with minor modifications. Briefly, the cartridge was conditioned using 1 ml methanol, 1 ml 10% ammonium hydroxide in H2O, 2 ml methanol and finally 3 ml 0.1% formic acid in H2O. The samples were then loaded onto the cartridge and washed with 1 ml 0.1% formic acid in H2O and 1 ml of 0.1% formic acid in methanol. The peptides were eluted from the cartridge with 600 μl 10% ammonium hydroxide in methanol, collected in a 1 ml round-bottom 96-well collection plate and evaporated using a SpeedVac (Labconco) set to 50 °C. Peptide samples were evaporated down to 10–30 μl of volume then resuspended in 50 μl 0.1% formic acid in H2O. These peptide samples were stored at −20 °C until injected into the mass spectrometer.

Silver staining and immunoblotting.

Undigested protein extract from each of the fractions was boiled in 1× LDS buffer (Invitrogen) and separated on a 4–12% bis-Tris denaturing and reducing SDS-PAGE (Invitrogen). Gels were then subjected to either silver staining (Invitrogen) or transferred onto a nitrocellulose membrane (Bio-Rad) for immunoblotting. Membranes were blocked with 5% non-fat dry milk (Safeway) in TBS-tween buffer and probed for schistosomal GST (GE 27-4577-01). All primary incubations were done at 4 °C overnight using a 1:1,000 dilution. Secondary incubations were performed in 5% non-fat dry milk in TBS-tween using 1:10,000 diluted peroxidase-conjugated rabbit anti-goat IgG (H+L) (Pierce). Membranes were visualized using an ECL plus western blotting kit (Amersham) and detected with radiographic film (Thermo).

Nuclear protein extraction.

Nuclear proteins from K562, HepG2 and SKNSH cancer cell lines and the BJ fibroblast cell line were isolated in three biological replicates as previously described16. BJ cells were grown in MEM (Gibco) supplemented with 10% fetal bovine serum (PAA), non-essential amino acids (Gibco), sodium pyruvate (Gibco), 1.5 mg ml−1 NaHCO3, penicillin and streptomycin (Gibco). HepG2 cells were grown in MEM supplemented with 10% FBS, non-essential amino acids, sodium pyruvate, penicillin and streptomycin. K562 and SKNSH cells were grown in RPMI (Gibco) supplemented with 10% FBS, sodium pyruvate, L-glutamine (Gibco), penicillin and streptomycin. SKNSH cells were treated with 6 μM retinoic acid for 48 h before they were collected. K562 nuclear extraction was performed by resuspending cells at 2.5 × 10^6 cells per ml in buffer A (15 mM Tris pH 9.0, 15 mM NaCl, 60 mM KCl, 1 mM EDTA pH 8.0, 0.5 mM EGTA pH 8.0, 0.5 mM spermidine) containing 0.05% NP-40 (Roche). After an 8-min incubation on ice, nuclei were pelleted at 400g for 7 min and washed once with buffer A. SKNSH, HepG2 and BJ nuclei were isolated in a similar fashion, but with the use of cell line–specific NP-40 concentrations and cytoplasmic lysis times (SKNSH was 0.05% NP-40 for 5 min; HepG2 was 0.1% NP-40 for 8 min; and BJ was 0.5% NP-40 for 40 min). Nuclei were then resuspended in buffer A containing 0.2% NP-40, sonicated at setting output 3 for 30 s, digested with benzonase for 15 min at 4 °C, digested with DNaseI for 15 min at 37 °C and finally digested with trypsin. Samples were brought to 6 mM MgCl2 before digestion with 0.375 U μl−1 benzonase (Fisher Scientific). Samples were brought to 6 mM CaCl2 and 90 mM NaCl before digestion with DNaseI (Sigma). DNaseI digestion reactions were stopped using 50 mM EDTA. Nuclear protein samples were digested with trypsin as described above using a 50:1 protein:trypsin ratio. After digestion and MCX cleanup, each sample was resuspended in 0.1% formic acid in H2O to a final concentration of 10,000 nuclei per μl.

Targeted proteomic mass spectrometry.

Peptide samples were analyzed with a TSQ-Vantage triple-quadrupole instrument (Thermo) using either a nanoLC separation system (Eksigent) or a nanoACQUITY UPLC (Waters). A 5-μl aliquot of each sample was separated on a 16-cm-long 75-μm inner diameter packed column (Polymicro Technologies) using Jupiter 4u Proteo 90A reverse-phase beads (Phenomenex). Peptides were separated using a 27.5-min gradient from 2% acetonitrile in 0.1% formic acid to 23% acetonitrile in 0.1% formic acid. The gradient was followed by a wash for 10.5 min at 80% acetonitrile in 0.1% formic acid and a column re-equilibration at 2% acetonitrile in 0.1% formic acid for 12 min. Ions were isolated in both Q1 and Q3 using 0.7 FWHM resolution. Peptide fragmentation was performed at 1.5 mTorr in Q2 using calculated peptide-specific collision energies17. Data were acquired using a scan width of 0.002 mass to charge ratio (m/z) and a dwell time of 10 ms.

Each protein sample was injected separately. For the target protein, all monoisotopic, +2 charge state, fully tryptic peptides from 7 to 23 amino acids in length were tested. In addition, the heavy and light forms of the schistosomal GST peptides LLLEYLEEK and IEAIPQIDK and the light form of the endogenous glutathione-binding protein GSTM3 peptide IAAYLQSDQFCK were tested. Peptides that flanked the target and fusion protein were not tested. For all peptides, the monoisotopic, +1 charge state y3 to yn − 1 fragment ions were monitored. All cysteines were monitored as carbamidomethyl cysteines. All methods were designed such that no more than 240 transitions were monitored in a given run. Quality control runs were acquired after every 8 injections to monitor column stability.
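The transition list described above can be sketched as follows. In the study the transitions were generated in Skyline; this standalone version is only illustrative. It uses standard monoisotopic residue masses, treats every cysteine as carbamidomethylated as stated in the text, and the function names and the greedy batching strategy in `split_into_methods` are assumptions. The GST and GSTM3 control peptides that shared each run are not counted here.

```python
# Standard monoisotopic residue masses (Da); C includes carbamidomethylation.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 160.03065, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def precursor_mz(peptide, charge=2):
    """m/z of the monoisotopic precursor at the given charge state."""
    neutral = sum(MONO[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(peptide, n):
    """m/z of the singly charged monoisotopic y_n fragment ion."""
    return sum(MONO[aa] for aa in peptide[-n:]) + WATER + PROTON


def transitions(peptide):
    """(ion name, Q1 m/z, Q3 m/z) for y3 through y(n-1)."""
    q1 = precursor_mz(peptide)
    return [(f"y{n}", q1, y_ion_mz(peptide, n))
            for n in range(3, len(peptide))]


def split_into_methods(peptides, max_transitions=240):
    """Greedily group peptides so no method exceeds max_transitions."""
    methods, current, count = [], [], 0
    for pep in peptides:
        n = len(transitions(pep))
        if current and count + n > max_transitions:
            methods.append(current)
            current, count = [], 0
        current.append(pep)
        count += n
    if current:
        methods.append(current)
    return methods


nfia = transitions("EDFVLTVTGK")  # NFIA peptide shown in Fig. 1d
```

For the ten-residue NFIA peptide this yields seven transitions (y3 to y9) from a doubly charged precursor near m/z 554.80.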

SRM data analysis.

Targeted proteomic data were analyzed using the software package Skyline13 (http://proteome.gs.washington.edu/software/skyline/). Chromatographic data from each peptide were manually analyzed to determine the quality of the peptide signal. Scoring of peptide quality was done by assessing the following requirements: (A) a prominent chromatographic peak with a signal intensity of at least 60,000 (total peak area under the curve (AUC) for all contributing fragment ions (arbitrary units)); (B) two or more data points were collected across the peak; (C) three or more fragment ions not including y3 co-eluted to contribute to this peak signal; and (D) chromatographic peak had a Gaussian elution profile. Based on these requirements, peptides were given a quality score between 1 and 4 with 1 being the highest score. Peptides that had chromatographic traces that met all of these requirements were given a quality score of 1. Peptides were given a quality score of 2 if they met requirement A with a signal intensity of at least 20,000 and requirement B but either had only three fragment ions including y3 contributing to the peak or had an abnormal peak shape. Peptides were given a quality score of 3 if (i) more than one chromatographic peak was detected that met requirements B, C and D; (ii) requirement B was not met; or (iii) requirements A and D were not met. Peptides not classified as having a quality score of 1, 2 or 3 were given a quality score of 4.
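The scoring rules can be written down as a small decision function. Scoring in the study was manual, so this is only a sketch of one possible reading: the `PeakSummary` fields, the function name and the order in which the rules are checked (score 1, then 2, then 3) are all assumptions, and the Gaussian-shape judgment is reduced to a Boolean supplied by the analyst.

```python
from dataclasses import dataclass


@dataclass
class PeakSummary:
    total_area: float        # summed AUC of contributing fragment ions
    points_across_peak: int  # data points collected across the peak
    coeluting_ions: tuple    # e.g. ("y4", "y5", "y6")
    gaussian_shape: bool     # analyst's judgment of the elution profile
    candidate_peaks: int = 1 # peaks meeting requirements B, C and D


def quality_score(peak):
    """Return a quality score from 1 (best) to 4 under the assumed rule order."""
    non_y3 = [ion for ion in peak.coeluting_ions if ion != "y3"]
    req_a = peak.total_area >= 60_000
    req_b = peak.points_across_peak >= 2
    req_c = len(non_y3) >= 3
    req_d = peak.gaussian_shape
    single_peak = peak.candidate_peaks == 1
    only_three_with_y3 = (len(peak.coeluting_ions) == 3
                          and "y3" in peak.coeluting_ions)

    if single_peak and req_a and req_b and req_c and req_d:
        return 1
    if (single_peak and peak.total_area >= 20_000 and req_b
            and (only_three_with_y3 or not req_d)):
        return 2
    if not single_peak or not req_b or (not req_a and not req_d):
        return 3
    return 4


# Hypothetical peak summaries, not measurements from the study.
strong = PeakSummary(80_000, 12, ("y4", "y5", "y6", "y7"), True)
weak = PeakSummary(25_000, 10, ("y3", "y4", "y5"), True)
```

Making the rule order explicit exposes cases the prose leaves open, such as a clean single peak with an area between 20,000 and 60,000, which falls through to a score of 4 under this reading.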

Chromatographic peak intensities from all monitored transitions of a given peptide were integrated and summed to give a final peptide peak height. Fragment ion chromatographic traces that were clearly contaminated by some other ion were removed from this analysis (these fragment ions are noted in Supplementary Data 1 as an absence of a monitored fragment ion for a given peptide).

Absolute quantification.

Absolute quantification of the GST peptides LLLEYLEEK and IEAIPQIDK was performed using a calibration curve of light isotope–labeled peptides with each sample containing the same amount of heavy peptide. The calibration points used were 40 nM, 12.5 nM, 5 nM, 2.5 nM, 1 nM, 0.5 nM, 0.25 nM and 0.1 nM each of the light LLLEYLEEK and IEAIPQIDK peptides. All peptide standards were mixed with identical quantities of heavy isotope–labeled LLLEYLEEK and IEAIPQIDK peptides (5 nM each) and a Bovine QC standard mix (25 nM) (Michrom) in 2% acetonitrile and 0.1% formic acid in water. Peptide standards were measured in triplicate and a linear regression of the data points was used to calibrate the GST peptide light-to-heavy ratio of all other samples.
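The calibration step can be sketched as an ordinary least-squares fit of the light-to-heavy ratio against the known light-peptide concentration, followed by inversion of the fitted line. The concentrations come from the text; the ratios below are idealized placeholders (concentration divided by the 5 nM heavy spike), not measured values, and the unweighted fit and function names are assumptions.

```python
def fit_line(x, y):
    """Unweighted least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ratio_to_concentration(ratio, slope, intercept):
    """Convert a measured light-to-heavy ratio to nM using the fitted line."""
    return (ratio - intercept) / slope


LIGHT_NM = [40, 12.5, 5, 2.5, 1, 0.5, 0.25, 0.1]  # calibration points (nM)
HEAVY_NM = 5.0                                    # heavy spike in every standard

# Placeholder ratios for a perfectly linear response; real data would be
# the triplicate measurements of each standard.
ideal_ratios = [c / HEAVY_NM for c in LIGHT_NM]

slope, intercept = fit_line(LIGHT_NM, ideal_ratios)
example_nm = ratio_to_concentration(0.1, slope, intercept)
```

With an ideal response, a light-to-heavy ratio of 0.1 maps back to 0.5 nM of light peptide.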

Relative quantification between nuclei.

Six replicate measurements comprising two technical replicates each of three biological replicates were made for each protein in each of the four cell types. The peptides monitored were the HIST4 peptide DNIQGITKPAIR, the GATA2 peptide GAECFEELSK, the CTCF peptide CPDCDMAFVTSGELVR, the CREB1 peptide ILNDLSSDAPGVPR and the EZH2 peptide EFAAALTAER. For each replicate, the intensity of the target peptide was normalized to the intensity of the HIST4 peptide to control for any variance in the autosampler and/or chromatography. Additionally, as the amount of histone 4 protein should be constant between the four cell types, this normalization should correct for any errors in the measurement of nuclei used for digestion. The mean and s.d. of these normalized intensities were then calculated for each protein in each of the cell types.
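The normalization can be sketched as a per-replicate ratio to the HIST4 peptide followed by summary statistics. The function name is hypothetical, the intensities are placeholders rather than data from the study, and the use of the sample standard deviation (n − 1 denominator) is an assumption because the text does not say which estimator was used.

```python
from statistics import mean, stdev


def normalized_abundance(target_intensities, hist4_intensities):
    """Mean and s.d. of target/HIST4 ratios across paired replicates."""
    ratios = [t / h for t, h in zip(target_intensities, hist4_intensities)]
    return mean(ratios), stdev(ratios)


# Placeholder intensities for six replicates (arbitrary units).
target = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
hist4 = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
avg, sd = normalized_abundance(target, hist4)
```

Dividing within each replicate before averaging is what lets the histone signal absorb injection-to-injection variation.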

Shotgun proteomic mass spectrometry.

Peptide samples were analyzed with an LTQ-VELOS instrument (Thermo) using an 1100 binary pump and autosampler (Agilent). Five microliters of each sample was separated on a 16-cm-long 75-μm inner diameter packed column (Polymicro Technologies) using Jupiter 4u Proteo 90A reverse-phase beads (Phenomenex). Peptides were separated using a 25-min gradient from 8.75% acetonitrile in 0.1% formic acid to 33% acetonitrile in 0.1% formic acid. The gradient was followed by a wash for 15 min at 65% acetonitrile in 0.1% formic acid and a column re-equilibration at 8.75% acetonitrile in 0.1% formic acid for 15 min. Spectra were acquired in data-dependent acquisition (DDA) mode. Raw spectral files were searched using the Sequest algorithm18 and spectra identified using Percolator19 with a false discovery rate cutoff of 1% were used for analysis.

Database spectra analysis.

The 2011_05_26 release of the Homo sapiens Ion Trap library of peptide tandem mass spectra was downloaded from NIST (http://peptide.nist.gov/) on 12 September 2011. Of the 1,421 peptides monitored in our dataset, 189 had spectra available in the NIST database. Dot-products were calculated using Skyline for all peptides with four or more monitored fragment ions (186 peptides)20.
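A spectral similarity score of this kind can be illustrated with a normalized dot product (cosine similarity) between observed SRM fragment intensities and library intensities. This is a generic sketch: the exact calculation Skyline performs may differ (for example in intensity scaling), the function name is hypothetical, and the intensities are placeholders.

```python
import math


def normalized_dot_product(observed, library, min_ions=4):
    """Cosine similarity over the monitored fragment ions.

    `observed` and `library` map ion names (e.g. "y5") to intensities.
    Ions missing from the library are treated as zero intensity.
    """
    if len(observed) < min_ions:
        raise ValueError("need at least four monitored fragment ions")
    ions = sorted(observed)
    obs = [observed[ion] for ion in ions]
    lib = [library.get(ion, 0.0) for ion in ions]
    numerator = sum(o * l for o, l in zip(obs, lib))
    denominator = math.sqrt(sum(o * o for o in obs) * sum(l * l for l in lib))
    return numerator / denominator if denominator else 0.0


# Placeholder intensities; the library spectrum is a scaled copy.
observed = {"y4": 10.0, "y5": 20.0, "y6": 30.0, "y7": 40.0}
library = {"y4": 20.0, "y5": 40.0, "y6": 60.0, "y7": 80.0}
score = normalized_dot_product(observed, library)
```

Because the score depends only on relative intensities, two spectra that differ by a constant scale factor score 1.0.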

ESPPredictor scores.

ESPPredictor scores6 were calculated using the Gene Pattern web-tools interface (http://www.broadinstitute.org/cancer/software/genepattern/modules/ESPPredictor.html). Proteins that had eight or more peptides, with one-third of these peptides having a quality score of 1 or 2, were analyzed using the ESPPredictor algorithm (75 total proteins).
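The comparison between predicted scores and empirical peptide intensities rests on a rank correlation, which can be sketched in plain Python. This is a generic Spearman implementation with average ranks for ties, not the authors' code; the function names and the example values are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Placeholder values: predicted scores and measured intensities per peptide.
predicted = [0.9, 0.7, 0.4, 0.1]
measured = [5.0e5, 2.0e5, 9.0e4, 3.0e4]
rho = spearman(predicted, measured)
```

A per-protein value computed this way is 1 when prediction and measurement rank the peptides identically and −1 when the rankings are reversed.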