Transcript abundance can be determined by various methods, including reverse transcription (RT)-PCR, microarray analysis, sequencing of expressed sequence tags (ESTs), serial analysis of gene expression (SAGE) and massively parallel signature sequencing (MPSS)1,2, most of which rely on 3′ end–related sequences. But for the identification of transcription start sites (TSSs) and their associated promoters, 5′ end–specific signature sequences are required for higher-level annotation of expression profiles. Therefore, we and others began cloning short sequence tags from the 5′ ends of cDNAs, using cap analysis of gene expression (CAGE)3 and 5′-SAGE4,5. In these techniques, linkers are attached to the 5′ ends of full-length enriched cDNAs to introduce a recognition site for the restriction endonuclease MmeI adjacent to the 5′ ends. MmeI cleaves the cDNA 20 and 18 nucleotides downstream (3′) of its recognition site on the two strands, creating a two-base overhang. After amplification, the sequencing tags are concatenated for high-throughput sequencing (Fig. 1). Here we present a CAGE protocol that has been used extensively for high-throughput analysis of mouse and human transcripts. The method includes new features for improved library construction, such as the use of random priming6 for gene discovery from nonpolyadenylated RNA, simplified full-length cDNA enrichment by multiwell filtration-based cap-trapping and a pooling strategy for high-throughput CAGE library preparation. The application of this CAGE protocol will contribute to genome annotation, gene discovery and expression profiling7.

Figure 1: Preparation of CAGE libraries.
Key steps in CAGE library preparation are given along with references to the related steps in the protocol.
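
The MmeI cleavage geometry described in the introduction can be illustrated with a minimal Python sketch. It assumes the commonly cited MmeI recognition site TCCRAC (R = A or G), which is not stated in this protocol, and uses a made-up linker and cDNA sequence; the function name mmei_cut is hypothetical.

```python
import re

MMEI_SITE = re.compile(r"TCC[AG]AC")  # assumed MmeI site, R = A or G
TOP_CUT = 20      # nucleotides 3' of the site, top strand
BOTTOM_CUT = 18   # nucleotides 3' of the site, bottom strand


def mmei_cut(seq):
    """Return (top_fragment, overhang) for the first MmeI site in seq.

    top_fragment is the top-strand sequence up to the top-strand cut;
    overhang is the two-base 3' overhang carried by that fragment.
    """
    match = MMEI_SITE.search(seq.upper())
    if match is None:
        raise ValueError("no MmeI site found")
    top_end = match.end() + TOP_CUT
    bottom_end = match.end() + BOTTOM_CUT
    if top_end > len(seq):
        raise ValueError("sequence too short downstream of the site")
    return seq[:top_end], seq[bottom_end:top_end]


# Hypothetical construct: a linker ending in an MmeI site, then the cDNA 5' end.
linker = "GGATCCTCCGAC"
cdna = "ACGTACGTACGTACGTACGTACGTACGT"
fragment, overhang = mmei_cut(linker + cdna)
tag = fragment[len(linker):]
print(tag, overhang)
```

In this toy case the tag is the first 20 nucleotides of the cDNA, and the last two of those bases form the overhang to which the second linker is later ligated.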

Materials

Reagents

  • (+)-biotin hydrazide long-arm variant (Vector Laboratories)

  • Biotin solution D: 4 M guanidine thiocyanate, 25 mM sodium citrate (pH 7.0), 0.5% sodium N-lauroyl sarcosinate (Sarkosyl), 1.5% (wt/vol) freshly dissolved biotin

  • 1× BW buffer (Dynabeads M-280 streptavidin bind and wash buffer): 5 mM Tris-HCl (pH 7.5), 0.5 mM EDTA (pH 8.0), 1 M NaCl, 0.1 mg/ml bovine serum albumin (BSA)

  • Buffer PE (pH 7.7), ethanol-based wash buffer (Qiagen)

  • CL4B buffer: 1 mM EDTA, 10 mM Tris-HCl (pH 8.0), 100 μg/ml BSA, 0.1% sodium azide

  • CTAB-urea solution: 1% cetyltrimethylammonium bromide (CTAB; cationic detergent)8, 4 M urea, 50 mM Tris-HCl (pH 7.0), 1 mM EDTA

  • dNTP–5m-dCTP mix: solution containing dATP, dGTP, dTTP and 5-methyl-dCTP, each at a concentration of 10 mM

  • Dynabeads M-280 streptavidin (10 mg/ml, 6.7 × 10^8 beads/ml; Dynal)

  • 2× GC buffer I from TaKaRa LA Taq thermostable DNA polymerase kit (Takara)

  • LoTE (low-salt Tris-EDTA) buffer: 3 mM Tris-HCl (pH 7.5), 0.2 mM EDTA (pH 7.5)

  • Oligonucleotides (see Supplementary Table 1 online)

  • Polyacrylamide gel elution buffer: 0.5 M NH4OAc, 10 mM Mg(OAc)2, 1 mM EDTA (pH 8.0), 0.1% SDS

  • Release buffer (prepare fresh): 50 mM NaOH, 5 mM EDTA

  • Sephacryl S-400 HR resin (Amersham Biosciences)

  • Sodium periodate (NaIO4), reagent grade (ICN)

  • Sorbitol-trehalose mix: 2:1 (vol/vol) of 4.9 M sorbitol and saturated solution (80%) of D-(+)-trehalose (>99.5% by high performance liquid chromatography (HPLC); Fluka)

  • Streptavidin-agarose (Upstate)

  • 0.1× TE buffer: 1 mM Tris-HCl (pH 7.5), 0.1 mM EDTA

  • Wash buffer 1: 4.5 M NaCl, 50 mM EDTA (pH 8.0); wash buffer 2: 0.3 M NaCl, 1 mM EDTA; wash buffer 3: 0.4% SDS, 0.5 M NaOAc, 20 mM Tris-HCl (pH 8.5), 1 mM EDTA; wash buffer 4: 0.5 M NaOAc, 10 mM Tris-HCl (pH 8.5), 1 mM EDTA

For additional reagents and equipment, see Supplementary Methods online.
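
As a worked example of preparing one of the buffers above, the following Python sketch computes stock volumes for LoTE buffer using C1 × V1 = C2 × V2. The stock concentrations (1 M Tris-HCl, 0.5 M EDTA) and the 100-ml batch size are assumptions chosen for illustration; the protocol does not specify them.

```python
def stock_volume_ml(final_mM, stock_mM, total_ml):
    """Volume of stock needed for a given final concentration (C1*V1 = C2*V2)."""
    return final_mM * total_ml / stock_mM


def recipe(final_mM, stocks_mM, total_ml):
    """Return the volume (ml) of each stock plus water to reach total_ml."""
    volumes = {
        name: stock_volume_ml(conc, stocks_mM[name], total_ml)
        for name, conc in final_mM.items()
    }
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


# LoTE composition as listed under Reagents.
LOTE_mM = {"Tris-HCl (pH 7.5)": 3.0, "EDTA (pH 7.5)": 0.2}
# Assumed stock solutions (not given in the protocol).
STOCKS_mM = {"Tris-HCl (pH 7.5)": 1000.0, "EDTA (pH 7.5)": 500.0}

lote = recipe(LOTE_mM, STOCKS_mM, total_ml=100.0)
for component, volume in lote.items():
    print(f"{component}: {volume:.2f} ml")
```

The same helper can be reused for the other buffers by swapping in their listed final concentrations and whatever stocks are at hand.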

Procedure

First-strand cDNA synthesis

  1. 1

    To 50 μg of total RNA, add NaCl to a final concentration of 0.25 M and combine with an equal volume of isopropanol. Precipitate the RNA at −20 °C for 30 min and centrifuge at 15,000g for 20 min. Wash the pellet twice with 80% ethanol and dissolve the pellet in 20 μl of water.

    Steps 1–4 describe reverse transcription using random primers; a protocol for reverse transcription using oligo(dT) primers is available in Supplementary Methods online.

    Critical Step

    Use siliconized tips and tubes for RNA or single-stranded DNA, and work under RNase-free conditions until the completion of the biotinylation reaction. We commonly use total RNA for the preparation of CAGE libraries, as the cap-trapper method is very effective in removing rRNA contamination from the final CAGE libraries. Moreover, we have observed that the full-length gene discovery rates can be improved by omitting mRNA purification steps.

  2. 2

    Transfer the RNA solution to 96-well PCR microtiter Plate A, and add 2 μl of 6 μg/μl random primer N20. Use different microtiter wells for each RNA sample.

    Steps 1–30 of the CAGE library preparation are performed in 96-well plates as outlined in Supplementary Figure 1 online. After Step 30, samples may be mixed to prepare pooled CAGE libraries, typically containing three to six samples per library depending on experimental needs and the complexity of the starting materials. Note, however, that this protocol does not require the preparation of pooled libraries or the use of 96-well plates. All steps can be performed for a single library preparation using the same volumes.

  3. 3

    In a separate 96-well PCR microtiter Plate B, prepare a reaction mix without RNA (amounts given are per microtiter well).

  4. 4

    Heat Plate A to 65 °C for 10 min and then place immediately on ice. Quickly transfer the contents from Plate B to Plate A. Carry out reverse transcription in a thermal cycler as follows: 30 s at 25 °C, 30 min at 42 °C, 10 min at 50 °C, 10 min at 56 °C; hold at 4 °C until further processing.
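
The reverse transcription program in Step 4 can be written down as a simple data structure, which is convenient for checking the total run time or for transcribing the program into a thermal cycler. This is only a sketch; the variable names are hypothetical and the final 4 °C hold is left out because it has no fixed duration.

```python
# (temperature in deg C, duration in seconds), in the order given in Step 4.
RT_PROGRAM = [
    (25, 30),        # 30 s at 25 deg C
    (42, 30 * 60),   # 30 min at 42 deg C
    (50, 10 * 60),   # 10 min at 50 deg C
    (56, 10 * 60),   # 10 min at 56 deg C
]

total_seconds = sum(duration for _, duration in RT_PROGRAM)
total_minutes = total_seconds / 60

for temperature, duration in RT_PROGRAM:
    print(f"{temperature} C for {duration / 60:g} min")
print(f"total: {total_minutes:g} min before the 4 C hold")
```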

Treatment with Proteinase K

  1. 5

    To each well in Plate A add an aliquot of a master mix containing 2 μl 0.5 M EDTA and 3 μl of 10 μg/μl Proteinase K (QIAGEN). Incubate the reactions at 45 °C for 20 min.

Purification of first-strand cDNA by precipitation with CTAB

  1. 6

    Assemble the QIAvac 96 purification manifold. Aliquot 30 μl of 5 M NaCl to each of the QIAGEN collection microtubes (Plate C).

  2. 7

    Transfer the reaction from Plate A to the collection microtubes (Plate C). Wash Plate A twice with 150 μl of CTAB-urea solution, each time transferring the solution to the corresponding collection microtubes.

    We use CTAB8 purification to remove polysaccharides and other byproducts created by protein digestion. We do not recommend use of ethanol here, as it may result in formation of high-molecular-weight precipitates that inhibit the cap-trapping step8.

  3. 8

    Incubate Plate C at 65 °C for 10 min in an air incubator and then allow it to cool to 15–25 °C for 10 min.

  4. 9

    Transfer the samples from Plate C to the QIAquick 96-well plate (QIAGEN) and apply vacuum to remove the solution. Wash once with 1,000 μl of buffer PE applying vacuum and wash again using 700 μl of buffer PE. Carry out the last wash with 1,000 μl of 80% ethanol and apply vacuum for 10 min to remove the ethanol completely.

  5. 10

    Insert a fresh set of QIAGEN collection microtubes (Plate D). Apply 60 μl water prewarmed to 65 °C to the 96-well plate and recover the resuspended DNA in Plate D using vacuum. Repeat, using an additional 60 μl of water prewarmed to 65 °C.

    This procedure should yield 100 μl of RNA-cDNA hybrid solution.

    Pause point

    Store the sample at −20 °C.

Oxidation of the diol groups of RNA

  1. 11

    Set up the following reaction in Plate D and incubate on ice for 45 min in absolute darkness until stopping the reaction. Stop the reaction by adding 2 μl of 80% glycerol.

    The diol groups at both ends of mRNA are oxidized to prepare for the reaction with biotin hydrazide. Note that the 5′ end of mRNA can be biotinylated only if it has an intact cap structure.

  2. 12

    To purify the RNA-cDNA hybrid from the reaction, assemble a Microcon YM-100 centrifugal filter unit into the Microcon 96-well retentate assembly plate.

  3. 13

    Apply the reaction (Step 11) to the centrifugal filter unit. Centrifuge at 500g for 10 min at 15–25 °C. Apply 200 μl of water and centrifuge for 10 min. Repeat this wash twice.

  4. 14

    Invert the column and recover the nucleic acid (5 μl) by centrifugation at 1,000g for 1 min at 15–25 °C. Apply 45 μl of water to the column and shake gently to wash the column. Recover the solution by centrifugation at 1,000g for 1 min. Transfer the recovered solution (total 50 μl) to the QIAGEN collection microtubes (Plate E).

Biotinylation and RNase I digestion

  1. 15

    Mix the following components and incubate at 15–25 °C for 10–16 h.

    The biotin-streptavidin interaction can be used for effective full-length cDNA selection, also known as the cap-trapper method9,10.

  2. 16

    Mix the following components and incubate at 37 °C for 30 min.

    RNase I cleaves single-stranded RNA but does not cleave double-stranded full-length RNA-cDNA hybrids.

  3. 17

    Add 3 μl of 10% SDS and 3 μl of 10 μg/μl Proteinase K to the RNase I reaction and incubate at 65 °C for 1 h. Place on ice, add 250 μl 0.1× TE buffer and purify the sample as described in Steps 12–14, using 0.1× TE buffer instead of water. Collect the samples in a fresh plate (Plate F).

Capture and purification of full-length RNA-cDNA hybrid

  1. 18

    Prepare the streptavidin-agarose conjugate for the capture-release step.

    1. i

      For each sample, mix 250 μl of streptavidin-agarose slurry with 2.5 μl tRNA (20 μg/μl). Incubate the master mix on ice for 20 min with occasional shaking.

    2. ii

      Add 250 μl of the treated streptavidin-agarose slurry to each well of the 96-well UNIFILTER GF/C 800 filter microplate (Plate G). Centrifuge at 540g for 20 s.

    3. iii

      Apply 200 μl wash buffer 1 and centrifuge again. Repeat the wash step twice.

  2. 19

    To bind full-length biotinylated cDNA to the beads, add the 50 μl cDNA sample (Plate F) to the prepared streptavidin-agarose conjugate in Plate G. Recover the remaining sample from Plate F with an additional 50 μl of wash buffer 1 and apply to the corresponding wells in Plate G.

  3. 20

    Incubate Plate G at 15–25 °C for 30 min with occasional gentle shaking.

  4. 21

    Centrifuge the plate at 540g for 20 s at 15–25 °C. Apply 200 μl of wash buffer 1 and centrifuge again; repeat the wash step. Wash once with wash buffer 2, twice with 200 μl wash buffer 3 and three times with 200 μl wash buffer 4.

  5. 22

    Apply 100 μl of freshly prepared release buffer to the washed beads in Plate G and incubate at 15–25 °C for 5 min. Recover by centrifugation. Repeat the release step twice and combine the released supernatant fractions in Plate H.

  6. 23

    To neutralize the released fractions and treat with RNase I, mix the following components on ice and then incubate at 37 °C for 10 min.

    Critical Step

    Handle the cDNA gently until the CAGE linker is ligated to ssDNA.

  7. 24

    Add 2 μl Proteinase K (10 μg/μl) and 9 μl of 10% SDS to each sample; incubate at 65 °C for 1 h. Purify the ssDNA from the reactions as described in Steps 12–14, using 0.1× TE buffer instead of water for the wash.

  8. 25

    Further purify the ssDNA by fractionation on Sephacryl S-400 HR resin.

    1. i

      Prepare S-400 spin columns by aliquoting 500 μl of Sephacryl S-400 HR resin into each well of the 96-well UNIFILTER GF/C 800 microtiter plate (Plate I). Centrifuge at 580g at 4 °C for 1 min. Place the UNIFILTER GF/C 800 filter plate on a new microtiter plate to collect waste.

    2. ii

      Apply 50 μl of single-stranded DNA (ssDNA) sample from Plate H to the center of the S-400 packed resin in Plate I. Apply an additional 50 μl 0.1× TE buffer and centrifuge the plate at 580g at 4 °C for 2 min. This process should yield a 100-μl sample.

    Critical Step

    This purification is important to remove all contaminants before CAGE adaptor ligation.

  9. 26

    Purify the sample as described in Steps 12–14, recovering the purified sample in Plate J.

  10. 27

    Transfer a 5-μl aliquot of the recovered solution from Plate J (Step 26) to Plate K. Heat the ssDNA solution at 65 °C for 5 min and then place immediately on ice.

    Critical Step

    Denaturation steps at 65 °C for ssDNA and 37 °C for CAGE linkers are essential to break secondary structures and linker-linker interaction before use.

Preparation and ligation of the double-stranded CAGE linkers

  1. 28

    Mix the AATAG-GN5, AATAG-N6 and AATAG-down oligonucleotides (Supplementary Table 1) at a ratio of 4:1:5 to prepare a solution containing 2 μg/μl of the mixture of oligonucleotides in 100 mM NaCl. Anneal the oligonucleotides as follows: 5 min at 65 °C, 5 min at 45 °C, 10 min at 37 °C, 10 min at 25 °C.

    Biotinylated primers are used here to facilitate purification of amplified tags at later stages. Optionally, different labels can be introduced at this point to mark the origin of a CAGE tag in pooled CAGE libraries. Supplementary Table 2 online lists 15 different CAGE linkers, each of which introduces an MmeI site next to the 5′ end of the cDNA, a 5-base-pair (bp) signature and an XmaJI cloning site. Labeling of tags is not required for the preparation of individual CAGE libraries.

    Critical Step

    Oligonucleotides should be of the highest quality, and it may be necessary to verify that they are phosphorylated where required (AATAG-down and XmaJI-up). Further, we recommend purifying all oligonucleotides by PAGE before use to ensure that no truncated linkers are used.

  2. 29

    Dispense 2.5 μl (0.08 μg/μl) aliquots of CAGE linker to Plate L, incubate at 37 °C for 5 min and then place on ice. Transfer the linker from Plate L to the ssDNA in Plate K on ice and add the ligation kit solutions to make up the following reaction. Carry out ligation at 16 °C overnight.

    Critical Step

    See Step 27.

  3. 30

    Add the following components to the ligation reaction and incubate at 45 °C for 15 min.

    After Proteinase K treatment, retain 20 μl of the 30-μl sample at −20 °C to repeat the CAGE tag production procedure if necessary.

    As an option, from this point the samples can be mixed: for example, three to six samples can be used in the preparation of one pooled CAGE library. After sequencing, individual tags can be assigned to their parental RNA source by the 5-bp labels in the linkers.

    Note that all the following steps are described for only one sample whether this sample represents a mixed or an individual CAGE library.

    Critical Step

    Broken ssDNA can ligate to CAGE linkers resulting in CAGE tags that do not correspond to cap sites. Make sure that all reagents are of the highest possible grade and nuclease-free.

  4. 31

    Extract the DNA with phenol-chloroform; back extract using CL4B buffer and chloroform. The first extraction should yield 60 μl and the second 40 μl of solution.

  5. 32

    Purify the cDNA from the ligation mix using Sephacryl S-400 HR resin (see Supplementary Methods). Dissolve the resulting precipitated cDNA in 34 μl of water.

    Critical Step

    This purification of cDNA is essential to remove excess linkers and linker dimers.
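
The optional pooling described in Step 30 relies on assigning each sequenced tag back to its RNA source by the 5-bp label in the linker. A minimal Python sketch of that assignment is given below. The label sequences and sample names are placeholders (the actual labels are listed in Supplementary Table 2), and the sketch assumes that each read has already been trimmed so that the label occupies its first five bases, which may not match a given sequencing setup.

```python
from collections import defaultdict

# Placeholder 5-bp labels and sample names, for illustration only.
LABEL_TO_SAMPLE = {
    "AATAG": "sample_A",
    "CCGTA": "sample_B",
    "GGTCA": "sample_C",
}
LABEL_LEN = 5


def demultiplex(reads, label_to_sample):
    """Group tags by sample using the first LABEL_LEN bases of each read."""
    bins = defaultdict(list)
    for read in reads:
        label, tag = read[:LABEL_LEN], read[LABEL_LEN:]
        sample = label_to_sample.get(label, "unassigned")
        bins[sample].append(tag)
    return dict(bins)


reads = ["AATAGACGTACGT", "CCGTATTTTGGGG", "AATAGGGGGCCCC", "NNNNNACGT"]
by_sample = demultiplex(reads, LABEL_TO_SAMPLE)
for sample, tags in sorted(by_sample.items()):
    print(sample, tags)
```

Reads whose label is not recognized are kept in an "unassigned" bin rather than discarded, so that the rate of unreadable labels can be monitored.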

Synthesis of second-strand cDNA

  1. 33

    Combine the following reagents and incubate the reaction at 65 °C for 1 min. Add 2 μl of ELONGASE enzyme mix (Invitrogen) and mix by gentle vortexing.

  2. 34

    Carry out the extension reaction as follows: 5 min at 65 °C, 30 min at 68 °C, 10 min at 72 °C; hold at 4 °C.

    Three microliters of the reaction mixture can be used to measure second-strand synthesis yields by radioactive9 or nonradioactive methods11.

    Troubleshooting

  3. 35

    After second-strand synthesis, set up a reaction with Proteinase K as follows and incubate at 45 °C for 15 min. Then extract the double-stranded DNA (dsDNA) with phenol-chloroform (1:1; vol/vol); back extract using 40 μl of LoTE buffer and chloroform.

  4. 36

    Purify the reaction products on Sephacryl S-400 HR resin as described in the Supplementary Methods. Precipitate the dsDNA with isopropanol and dissolve the pellet completely in 20 μl of LoTE buffer.

    Pause point

    Samples can be stored in ethanol at −20 °C.

MmeI digestion of dsDNA

  1. 37

    Combine the following components and incubate at 37 °C for 30 min.

    Critical Step

    An excess of MmeI in the reaction blocks cleavage. Prepare a 100×-diluted solution of MmeI by mixing the enzyme with NEB buffer 4 and water. Reactions using MmeI should be performed at or near stoichiometric concentrations.

  2. 38

    Treat with Proteinase K and extract with phenol-chloroform (as described in Step 35). Recover the DNA by precipitation with isopropanol and dissolve the pellet completely in 2 μl of LoTE buffer.

    Pause point

    Samples can be stored in isopropanol at −20 °C.

Preparation and ligation of second linker XmaJI

  1. 39

    Prepare double-stranded second linker XmaJI by annealing 20 μg of each oligonucleotide XmaJI-up and XmaJI-down (see Supplementary Table 1) in 100 μl of 100 mM NaCl: 5 min at 65 °C, 5 min at 45 °C, 10 min at 37 °C, 10 min at 25 °C; hold at 4 °C.

    The XmaJI linker has a 2-bp overhang to be ligated to the cohesive end created by MmeI digestion.

    Critical Step

    See Step 28.

  2. 40

    To create a construct suitable for amplification, anneal the XmaJI linker to the MmeI-digested full-length cDNA: incubate the mixture at 65 °C for 2 min and then immediately place on ice.

  3. 41

    Ligate the XmaJI linker to full-length cDNA by combining the following components. Incubate at 16 °C overnight and then heat the reaction at 65 °C for 5 min and place on ice.

  4. 42

    Purify the dsDNA by biotin selection, releasing the captured DNA from the beads using an excess of free biotin10,12.

    1. i

      Prepare magnetic Dynabeads M-280 streptavidin by blocking nonspecific binding sites: wash 200 μl streptavidin beads three times with 200 μl of 1× BW buffer. Resuspend the beads into 100 μl of 2× BW buffer, add 2 μl of tRNA (20 μg/μl) and incubate with rotation at 15–25 °C for 15 min.

    2. ii

      Bind the ligated DNA to the prepared beads according to the manufacturer's instructions.

    3. iii

      Purify the CAGE construct by washing the beads twice with 1× BW buffer including 1× BSA at 0.1 mg/ml final concentration, twice with 1× BW buffer and twice with 200 μl of LoTE buffer.

    4. iv

      Collect streptavidin beads using a magnetic particle concentrator, add 50 μl of biotin solution D and incubate at 45 °C for 30 min.

    5. v

      Collect the beads using a magnetic particle concentrator and transfer the eluted dsDNA into a fresh 1.5-ml siliconized microcentrifuge tube. Repeat the elution cycle three times, then wash once with 50 μl of LoTE buffer and combine all fractions.

    6. vi

      Precipitate the DNA with isopropanol and dissolve the pellet completely in 50 μl of LoTE buffer.

dsDNA purification by column chromatography and RNase I treatment

  1. 43

    Purify the double-stranded CAGE construct using MicroSpin G-50 spin columns (Amersham Biosciences), according to the manufacturer's instructions.

  2. 44

    Precipitate the purified DNA with isopropanol and dissolve the pellet completely in 45 μl of LoTE buffer.

    Pause point

    Samples can be stored in isopropanol at −20 °C.

  3. 45

    To further purify the tag, treat with RNase I as follows to remove tRNA. Carry out the RNase I digestion at 37 °C for 15 min.

  4. 46

    Treat the reaction with Proteinase K and extract with phenol-chloroform (as described in Step 35). Recover the DNA by precipitation with isopropanol and dissolve the pellet completely in 24 μl of LoTE buffer.

    Pause point

    Samples can be stored in isopropanol at −20 °C.

First amplification

  1. 47

    Set up three separate amplification reactions in 0.6 ml PCR tubes to determine the optimal number of amplification cycles required (usually 15, 20 and 25 cycles).

    To determine which number of cycles produces the best yield, analyze the reaction products by polyacrylamide gel electrophoresis. We advise keeping the number of cycles as low as possible. Note that in the amplification steps biotinylated primers are used to facilitate easy removal of linker fragments from purified CAGE tags (Fig. 1).

  2. 48

    For the first amplification use the following program, and increase the cycle number as appropriate (cycle number of 15 is shown here).

  3. 49

    Set up and carry out 20 separate amplification reactions with 16 μl of template DNA from Step 46, using the optimal number of cycles as determined in Step 48. Store the remaining DNA solution (8 μl) at −20 °C to repeat the process in case of sample loss.

    Critical Step

    Use the lowest PCR cycle number possible to minimize the chance of PCR-induced sequence errors.
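
The template bookkeeping in Step 49 (24 μl recovered at Step 46, 16 μl spread over 20 reactions, 8 μl kept in reserve) can be checked with a short Python sketch. The per-reaction volume of 0.8 μl follows from 16 μl divided by 20 reactions; the function name is hypothetical.

```python
def split_template(total_ul, n_reactions, per_reaction_ul):
    """Return (volume used, volume left in reserve), both in microliters."""
    used = round(n_reactions * per_reaction_ul, 3)
    if used > total_ul:
        raise ValueError("not enough template for the requested reactions")
    return used, round(total_ul - used, 3)


# Values from Steps 46 and 49: 24 ul of template, 20 reactions.
used_ul, reserve_ul = split_template(total_ul=24.0, n_reactions=20,
                                     per_reaction_ul=0.8)
print(f"used {used_ul} ul, reserve {reserve_ul} ul")
```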

Purification of PCR products

  1. 50

    Pool the PCR products, treat with Proteinase K and extract with phenol-chloroform (as described in Step 35). Recover the DNA by precipitation with isopropanol and completely dissolve the pellet in 50 μl of LoTE buffer.

    Pause point

    Samples can be stored in isopropanol at −20 °C.

  2. 51

    Purify the dsDNA by column chromatography using MicroSpin G-50 spin column according to the manufacturer's instructions.

  3. 52

    Precipitate the purified dsDNA with isopropanol and dissolve the pellet completely in 40 μl of LoTE buffer.

    Pause point

    Samples can be stored in isopropanol at −20 °C.

  4. 53

    Mix the DNA with loading buffer and apply the entire sample to a 12% polyacrylamide gel. Run the gel at 170 V for 3 h, stain and cut out the 125-bp band.

    Troubleshooting

  5. 54

    Elute the DNA from the PAGE gel using polyacrylamide gel elution buffer (see Supplementary Methods).

  6. 55

    Extract the pooled fractions of DNA recovered from the gel with phenol-chloroform and back extract with LoTE buffer and chloroform. Recover the DNA by precipitation with isopropanol under standard conditions. Completely dissolve the pellet in 24 μl of LoTE buffer.

    Pause point

    Samples can be stored in isopropanol at −20 °C.

Second amplification

  1. 56

    Set up and carry out 20 amplification reactions using the conditions described for the first amplification reaction (Steps 47–49). Use 16 μl of the DNA sample from Step 55 as template (0.8 μl per reaction). Store the remaining 8 μl of DNA solution at −20 °C as a backup for rescuing the sample, if necessary.

    The PCR cycle number should be determined experimentally. In our experience, the optimal amplification cycle number in the second PCR is between 5 and 11.

    Critical Step

    Use the lowest PCR cycle number possible to minimize the chance of PCR-induced sequence errors.

  2. 57

    Purify the amplification products as described for the products of the first PCR (Steps 50–52), but dissolve the pellet in 25 μl of LoTE buffer.

    Here 0.5 μl of DNA solution may be used for dilution to determine the dsDNA concentration by measuring the optical density or using any other method11. On average we obtain 15 μg (5–25 μg) of 125-bp tag DNA.

  3. 58

    Set up the following digestion reaction with XmaJI and incubate at 37 °C for 1 h.

    XmaJI digestion releases CAGE tags with compatible ends for concatenation.

  4. 59

    Treat the products with Proteinase K and extract with phenol-chloroform (as described in Step 35). Recover the DNA by precipitation with isopropanol and dissolve the pellet completely in 10 μl LoTE buffer.

    Pause point

    Samples can be stored in isopropanol at −20 °C.
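
To relate the yield quoted in Step 57 (on average 15 μg, range 5–25 μg, of 125-bp tag DNA) to a molar amount, the following Python sketch uses the common approximation of about 650 g/mol per base pair of dsDNA. That average mass is an assumption that is not part of the protocol, so the results should be read as rough estimates.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair; a rule-of-thumb value (assumption)


def dsdna_pmol(mass_ug, length_bp):
    """Approximate picomoles of dsDNA fragments from mass and length."""
    # ug -> g is 1e-6 and mol -> pmol is 1e12, giving a net factor of 1e6.
    return mass_ug * 1e6 / (length_bp * AVG_BP_MASS)


for mass_ug in (5, 15, 25):
    print(f"{mass_ug} ug of 125-bp DNA is about "
          f"{dsdna_pmol(mass_ug, 125):.0f} pmol")
```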

Purification of released CAGE tags

  1. 60

    Purify the CAGE tags by biotin selection.

    1. i

      Prepare 300 μl magnetic Dynabeads M-280 streptavidin as described in Step 42(i), but do not add tRNA.

    2. ii

      Bind the CAGE tags to the beads according to the manufacturer's instructions.

    3. iii

      Collect the supernatant and wash the beads once with 50 μl 1× BW buffer.

    4. iv

      Release the captured CAGE tags from the beads with an excess of biotin as described in Steps 42(iv) and 42(v).

    5. v

      Extract the pooled CAGE tags as in Step 55, but dissolve the pellet completely in 40 μl of LoTE buffer.

    It is essential to obtain pure CAGE tags completely free of interfering byproducts such as linker fragments that can inhibit the concatenation reaction.

  2. 61

    Mix the DNA with loading buffer and apply the entire sample to a 12% polyacrylamide gel. Run the gel at 170 V for about 2 h, stain and cut out the 38-bp DNA band containing CAGE tags. Recover the DNA as described in Step 54. Because the tags are short (38 bp), elute by incubating at 37 °C instead of at 65 °C. Dissolve the DNA in 4 μl of LoTE buffer.

    Analyze dsDNA concentration as described in Step 57.

Concatenation of CAGE tags

  1. 62

    Set up the concatenation reaction and incubate at 16 °C for 20 min.

    Concatenation permits efficient cloning of 15 or more tags, which are then subjected to sequencing.

  2. 63

    Treat the products with Proteinase K and extract with phenol-chloroform (as described in Step 35). Recover the DNA by precipitation with isopropanol and dissolve the pellet completely in 21 μl of LoTE buffer.

    Analyze 1 μl of the ligated tags by electrophoresis through a 1% agarose gel to determine the size of the product. In our experience, >100 ng DNA and fragments of >50 bp can be observed in a 1% agarose gel by staining with GelStar (Cambrex, Inc.).

    Troubleshooting

  3. 64

    Mix the DNA with loading buffer and apply the entire sample to a 12% polyacrylamide gel. Run the gel at 170 V for 3 h, stain and cut out the DNA in the size range of >500 bp. Recover the concatenated DNA as described in Step 54 (longer elution times are beneficial). Dissolve every fraction in 2 μl of water.

    Alternatively, HPLC can be used for purification of DNA fragments as outlined in Supplementary Table 3 online.

  4. 65

    To clone the concatemers into the XbaI site of linearized plasmid pZErO-2 (Invitrogen), digest the plasmid with XbaI and treat the linearized DNA with calf intestinal alkaline phosphatase. Then set up the following ligation reaction. Incubate the reaction at 16 °C overnight.

    Note that digestion with XbaI and with XmaJI produces fragments with compatible cohesive overhanging ends, which allows the concatemers to be cloned into the vector. To minimize background, prepare the vector using the Plasmid-Safe ATP-Dependent DNase Kit (EPICENTRE).

  5. 66

    Treat the products with Proteinase K and extract with phenol-chloroform (as described in Step 35). Recover the DNA by precipitation with isopropanol and dissolve the pellet in 5.5 μl of water.

  6. 67

    Transform 20 μl of electrocompetent E. coli (for example, Electromax DH10B), using 1 μl of the recombinant DNA plasmid (Step 66). Plate transformants on LB plates containing kanamycin and streptomycin. Store the remaining plasmid DNA from Step 66 at −20 °C as a backup.

    Troubleshooting

  7. 68

    Carry out sequencing of the clones under conditions suitable for G-C–rich regions and palindromic secondary structures13,14.

    Troubleshooting
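
The compatibility of XbaI and XmaJI ends noted in Step 65 can be checked with a short Python sketch. The recognition sequences used here (XbaI: T^CTAGA; XmaJI: C^CTAGG) are the commonly cited ones and are not given in this protocol, so treat them as assumptions to verify against the enzyme supplier's data sheet; the function names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Enzyme -> (recognition site, cut offset from the 5' end of each strand).
SITES = {"XbaI": ("TCTAGA", 1), "XmaJI": ("CCTAGG", 1)}


def overhang(enzyme):
    """5' overhang left by a palindromic site cut symmetrically."""
    site, cut = SITES[enzyme]
    if site != reverse_complement(site):
        raise ValueError("site is not palindromic")
    return site[cut:len(site) - cut]


def hybrid_junction(left_enzyme, right_enzyme):
    """Sequence formed when a left-cut end is ligated to a right-cut end."""
    left_site, left_cut = SITES[left_enzyme]
    right_site, right_cut = SITES[right_enzyme]
    return (left_site[:len(left_site) - left_cut]
            + right_site[len(right_site) - right_cut:])


xbai_end = overhang("XbaI")
xmaji_end = overhang("XmaJI")
junction = hybrid_junction("XbaI", "XmaJI")
print(xbai_end, xmaji_end, junction)
```

Under these assumptions both enzymes leave the same four-base overhang, and the vector-insert junction matches neither recognition site, so it would not be recut by either enzyme.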

Troubleshooting

[Step 34]

Problem: There is insufficient enrichment of full-length cDNA.

Solution: Ensure that high-quality RNA is being used for the preparation of CAGE libraries. Verify the efficiency and specificity of the cap-trapping reaction. When using total RNA, the ribosomal cDNA band should disappear after cap-trapping. Confirm that the cDNA size does not change as a result of degradation caused by bad reagent quality. Make sure that all reagents are nuclease-free. Handle the DNA gently after full-length cDNA selection (after Step 17), as shearing the DNA may result in DNA degradation, particularly for oligo(dT)-primed cDNA.

[Step 53]

Problem: There is low resolution of the polyacrylamide gel.

Solution: Perform a test run of the gel to troubleshoot the separation range on your system.

[Step 63]

Problem: The concatemers are too short.

Solution: Reasons for short concatemers include low ligase activity, contamination with partially digested tags, the presence of linkers and other factors inhibiting the ligation reaction. Verify the reaction conditions and perform an additional concatenation reaction under improved conditions. Keep the ligation reaction time brief to avoid ligating all the active ends.

[Step 67]

Problem: There is a high rate of clones without an insert.

Solution: Check the vector used for cloning. Vector pZErO-2 should be prepared by restriction digestion with XbaI, including dephosphorylation of the 5′ ends.

Problem: The titer of the library is low.

Solution: Perform additional transformations using the ligation reaction products saved at Step 67. Additional tags can be prepared from the materials saved at Steps 49 and 56. First, amplify the DNA saved at Step 56, which is the template for the second PCR amplification. If even more tags are needed, use the DNA saved at Step 49.

[Step 68]

Problem: There is a high content of linker sequences or linker dimers in the library.

Solution: Check whether the open tip mini column works well (Step 32). Perform chromatography purification using Sephacryl S-400 HR resin twice to remove the linker excess. Repeat library preparation with the materials put aside at Step 30. It is advisable to test the column system in advance.

Problem: The sequenced library is highly redundant.

Solution: Note that depending on the sample, some RNA preparations contain highly expressed transcripts. Otherwise, make sure to keep the number of PCR amplification cycles as small as possible (Steps 48 and 56). Be sure to use nondegraded high-quality RNA samples for the CAGE procedure (Step 1).
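
Library redundancy, as discussed in the last troubleshooting entry, can be estimated once tags have been extracted from the sequence reads. The Python sketch below is one simple way to do this and is not part of the original protocol: it defines redundancy as the fraction of tags that duplicate an already observed tag, and the tag sequences are made up.

```python
from collections import Counter


def redundancy_report(tags):
    """Summarize how often identical tags recur in a library."""
    counts = Counter(tags)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tags supplied")
    unique = len(counts)
    top_tag, top_count = counts.most_common(1)[0]
    return {
        "total": total,
        "unique": unique,
        "redundancy": 1 - unique / total,   # fraction of repeated tags
        "top_tag": top_tag,
        "top_fraction": top_count / total,  # share of the most frequent tag
    }


# Made-up tags for illustration.
tags = ["ACGTACGT", "ACGTACGT", "TTTTGGGG", "GGGGCCCC"]
report = redundancy_report(tags)
print(report)
```

A single dominant tag points to a highly expressed transcript, whereas uniformly elevated duplication across many tags is more consistent with excessive PCR cycling.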

Comments

Current methods in expression profiling focus mainly on sequence information from the 3′ ends of mRNA. Although convenient to use and relatively easy to obtain, 3′ end–related sequences are of little use where expression levels are to be correlated to the mechanisms that control their transcription. Thus, we and others have suggested that attention be shifted from 3′ end–based methods to new approaches using signature sequences from the true 5′ ends of mRNAs. The discovery and mapping of TSSs, together with promoter identification in genome sequences, have become possible with CAGE and 5′-SAGE. As a consequence, we are now compiling the tools necessary for analysis of expression patterns based on 5′ end–derived sequence tags. It will be essential to encourage as many laboratories as possible to use these new approaches to build reference data sets for easier annotation and analysis of CAGE and/or 5′-SAGE tags. This goal can be achieved only by a large-scale production of many more CAGE or 5′-SAGE libraries, as many of the entries presently available in the public databases do not have correct 5′-end sequences. We therefore hope that this detailed protocol will allow the research community to create a large reference set of annotated mouse, human and any other eukaryotic organism CAGE tags. These reference tags make a substantial contribution to other approaches as well, for example, tiling arrays that cannot be used to identify the borders of transcripts. More than 14,400,000 CAGE tags have been obtained from 169 CAGE libraries to date leading to a massive discovery of alternative TSSs and the identification of many new transcripts15,16. These results have proven the value of the CAGE approach, which is finding increased applications in many areas of study.

Note: Supplementary information is available on the Nature Methods website.

Source

This protocol was provided directly by the authors listed on the title page. For further details on standard molecular procedures, see Sambrook, J. & Russell, D.W., Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, USA, 2001; http://www.cshlpress.com/link/molclon3.htm).