Main

Sophisticated mechanisms regulating RNA may explain the gap between the great complexity of cellular functions and the limited number of primary transcripts. Regulation by miRNAs underscores this possibility, as each miRNA is believed to bind directly to many mRNAs to regulate their translation or stability1,2, and thereby control a wide range of activities, including development, immune function and neuronal biology3,4,5. Many miRNAs are evolutionarily conserved, although others are species-specific (including human miRNAs not conserved in chimpanzee)6, consistent with roles ranging from generating cellular to organismal diversity.

Despite the biological importance of miRNAs, determining their targets remains a major challenge. The problem stems from the discovery that functional mRNA regulation requires interaction with as few as 6 nucleotides of miRNA seed sequence7. Such 6-mers are present on average every 4 kilobases (kb), so that miRNAs could regulate a broad range of targets; however, the full extent of their action is not known. Bioinformatic analysis has greatly improved the ability to predict bona fide miRNA binding sites8,9,10, principally by constraining searches to evolutionarily conserved seed matches in 3′ untranslated regions (UTRs). Nonetheless, different algorithms produce divergent results with high false-positive rates3,10,11,12. In addition, many miRNAs are present in closely related miRNA families, complicating interpretation of loss-of-function studies in mammals13,14, although such studies have been informative for several miRNAs3,15,16,17. miRNA overexpression or knockdown studies, most recently in combination with proteomic studies11,18, have led to the conclusion that individual miRNAs generally regulate a relatively small number of proteins at modest levels (<2-fold), although the false-positive rate of target predictions remains high (up to 66%)11, and the data sets analysed have been of limited size (5,000 proteins). Similar high false-positive rates have been observed when miRNAs were co-immunoprecipitated with Ago proteins19,20,21,22,23. A critical caveat common to all of these studies is their inability to definitively distinguish direct from indirect miRNA–target interactions. At the same time, as therapeutic antisense strategies become more viable17,24,25, knowledge of direct miRNA target sites has become increasingly important.
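The figure of one seed match roughly every 4 kb follows from simple counting. The sketch below is illustrative only and assumes a random sequence with four equally likely, independent bases (real 3′ UTRs are compositionally biased, so actual spacing will differ); the function name is hypothetical.

```python
def expected_spacing(k: int, alphabet_size: int = 4) -> int:
    """Expected distance (nt) between occurrences of one specific k-mer.

    Assumes each base is independent and equally likely, so a given
    k-mer starts at any position with probability alphabet_size ** -k.
    """
    return alphabet_size ** k


# A 6-mer seed match is expected about once every 4,096 nt (~4 kb);
# longer matches are correspondingly rarer.
for k in (6, 7, 8):
    print(f"{k}-mer: one match per {expected_spacing(k):,} nt")
```

Under the same assumption, each additional matched nucleotide makes a chance occurrence fourfold rarer, which is one reason seed-only searches over long transcripts yield many candidate sites.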

Recently, we developed HITS-CLIP to directly identify protein–RNA interactions in living tissues in a genome-wide manner26,27. This method28,29 uses ultraviolet irradiation to covalently crosslink RNA–protein complexes that are in direct contact (approximately over single ångström distances) within cells, allowing them to be stringently purified. Partial RNA digestion reduces bound RNA to fragments that can be sequenced by high-throughput methods, yielding genome-wide maps and functional insights26,30. Recent X-ray crystal structures of an Ago–miRNA–mRNA ternary complex31 suggest that Ago may make sufficiently close contacts to allow Ago HITS-CLIP to simultaneously identify Ago-bound miRNAs and the nearby mRNA sites. Here we use Ago HITS-CLIP to define the sites of Ago interaction in vivo, decoding a precise map of miRNA–mRNA interactions in the mouse brain. This provides a platform that can establish the direct targets on which miRNAs act in a variety of biological contexts, and the rules by which they do so.

Ago RNA targets in the mouse brain

HITS-CLIP experiments rely on a means of purifying RNA-binding proteins (RNABPs)26,27,28,29. To purify Ago bound to mouse brain RNAs, we ultraviolet-irradiated P13 neocortex and immunoprecipitated Ago under stringent conditions. After confirming the specificity of Ago immunoprecipitation (Fig. 1a), we radiolabelled RNA, further purified crosslinked Ago–RNA complexes by SDS–polyacrylamide gel electrophoresis and nitrocellulose transfer, and visualized them by autoradiography. We observed complexes of two different modal sizes (110 kDa and 130 kDa; Fig. 1b and Supplementary Fig. 1), suggesting that Ago (97 kDa) was crosslinked to two different RNA species. Polymerase chain reaction with reverse transcription (RT–PCR) amplification revealed that the 110-kDa lower band harboured 22-nucleotide crosslinked RNAs and the upper band both 22-nucleotide and larger RNAs (Fig. 1c). These products were sequenced with high-throughput methods26 and found to correspond to miRNAs and mRNAs, respectively (Supplementary Table 1), suggesting that Ago might be sufficiently close to both miRNA and target mRNAs to form crosslinks to both molecules in the ternary complex (Fig. 1d). Such a result would allow the search for miRNA binding sites to be constrained to both the subset of miRNAs directly bound by Ago and to the local regions of mRNAs to which Ago crosslinked, potentially reducing the rate of false-positive predictions of miRNA binding.

Figure 1: Argonaute HITS-CLIP.

a, Immunoblot (IB) analysis of Ago immunoprecipitates (IP) from P13 mouse neocortex using pre-immune IgG as a control or anti-Ago monoclonal antibody 2A8 blotted with 7G1-1* antibody (Supplementary Methods). b, Autoradiogram of 32P-labelled RNA crosslinked to mouse brain Ago purified by immunoprecipitation. RNA–protein complexes of 110 kDa and 130 kDa are seen with 2A8 but not control immunoprecipitation. c, PCR products amplified after linker (36 nucleotides) ligation to RNA products excised from b. Products from the 110-kDa RNA–protein complex were 22-nucleotide miRNAs, and those from 130-kDa complexes were predominantly mRNAs. d, Illustration showing proposed interpretation of data in c. Ago (drawn based on structure 3F73 in the Protein Data Bank)31 binds in a ternary complex to both miRNA and mRNA, with sufficiently close contacts to allow ultraviolet crosslinking to either RNA—mRNA tags will be in the immediate vicinity of miRNA binding sites. e, f, Reproducibility of all Ago–miRNA tags (e; shown as log2(normalized miRNA frequency) per brain) or all tags within Ago–mRNA clusters (f; see Supplementary Figs 2 and 3). We estimated that sequencing depth was near saturation (Supplementary Fig. 16). g, Location of reproducible Ago–mRNA tags (tags in clusters; BC ≥2) in the genome. Annotations are from RefSeq: ‘others’ are unannotated EST transcripts; non-coding RNAs are from lincRNAs or FANTOM3. h, Top panel: the position of robust Ago–mRNA clusters (BC = 5) in transcripts is plotted relative to the stop codon and 3′ end (presumptive poly(A) site, as indicated). Data are plotted as normalized density relative to transcript abundance for Ago–mRNA clusters (blue) or control clusters (red) (s.d. is shown in light colours; see Supplementary Methods). Regions with significant enrichment relative to control are indicated with black bars (>3 s.d.; P < 0.003). 
Cluster enrichment 1 kb downstream from the stop codon appears to be due to a large number of transcripts with 1 kb 3′ UTRs (data not shown). Bottom panel: all individual clusters (BC = 5) are shown (each is a different colour).


To differentiate robust from nonspecific or transient Ago–RNA interactions, we compared the results from biological replicate experiments done with two different monoclonal antibodies (Supplementary Figs 1–3). The background was further reduced by in silico random CLIP, a normalization algorithm that accounted for variation in transcript length and abundance (Supplementary Figs 4 and 5). The set of Ago-crosslinked miRNAs and mRNAs was highly reproducible. Among biological triplicates or among 5 replicates done with two antibodies, the Pearson correlation coefficient was R² > 0.9 and >0.83, respectively, for Ago–miRNA CLIP (Fig. 1e and Supplementary Fig. 2) and R² ≥ 0.8 and ≥0.65, respectively, for Ago–mRNA CLIP (Fig. 1f and Supplementary Fig. 3). We identified 454 unique miRNAs crosslinked to Ago in mouse brain, with Ago–miR-30e being the most abundant species (14% of total tags; Supplementary Fig. 2); these results were consistent with previous estimates assessed by cloning frequency32 or bead-based cytometry33, although the correlation with published results (R² = 0.2–0.32; Supplementary Fig. 6) was not as high as among our biological replicates. These discrepancies might be due to differences in the ages of brain used, regulation of Ago–mRNA interactions, and/or increased sensitivity allowed by stringent CLIP conditions and consequent improved signal/noise. To facilitate the analysis of large numbers of Ago–mRNA CLIP tags (1.5 × 10⁶ unique tags; Supplementary Table 1) we analysed overlapping tags (clusters)26, which were normalized by in silico random CLIP and sorted by biological complexity26 (‘BC’, a measure of reproducibility between biological replicates; see Supplementary Figs 5 and 7). A total of 1,463 robust clusters (BC = 5; that is, harbouring CLIP tags in all five biological experiments using both antibodies) mapped to 829 different brain transcripts, and 990 clusters had at least 10 tags (Supplementary Fig. 7).
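The grouping of overlapping tags into clusters, and the scoring of each cluster by biological complexity, can be sketched as follows. This is a minimal illustration rather than the pipeline used here: the `Tag` record, the `cluster_tags` function and the example coordinates are hypothetical, intervals are assumed to be half-open, and the in silico random CLIP normalization is omitted.

```python
from typing import NamedTuple


class Tag(NamedTuple):
    chrom: str
    strand: str
    start: int      # 0-based, inclusive
    end: int        # exclusive
    replicate: str  # label of the biological experiment


def cluster_tags(tags):
    """Merge overlapping tags on the same strand into clusters.

    Each cluster records its span, its member tags and 'bc', the number
    of distinct replicates contributing at least one tag.
    """
    clusters = []
    ordered = sorted(tags, key=lambda t: (t.chrom, t.strand, t.start, t.end))
    for tag in ordered:
        last = clusters[-1] if clusters else None
        if (last and last["chrom"] == tag.chrom
                and last["strand"] == tag.strand
                and tag.start < last["end"]):
            # Tag overlaps the open cluster: extend it.
            last["end"] = max(last["end"], tag.end)
            last["tags"].append(tag)
        else:
            clusters.append({"chrom": tag.chrom, "strand": tag.strand,
                             "start": tag.start, "end": tag.end,
                             "tags": [tag]})
    for cluster in clusters:
        cluster["bc"] = len({t.replicate for t in cluster["tags"]})
    return clusters


# Hypothetical tags: two overlap on the + strand, from different replicates.
example = [
    Tag("chr8", "+", 100, 130, "A"),
    Tag("chr8", "+", 120, 150, "B"),
    Tag("chr8", "+", 300, 330, "A"),
    Tag("chr8", "-", 110, 140, "C"),
]
for c in cluster_tags(example):
    print(c["chrom"], c["strand"], c["start"], c["end"], "BC =", c["bc"])
```

In this scheme a threshold such as BC ≥ 2 or BC = 5 is simply a filter on the `bc` field of each cluster.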

Ago–mRNA HITS-CLIP tags were enriched in transcribed mRNAs (Fig. 1g). The pattern of tags mirrored the results of functional assays with miRNAs34, which show no biological activity when seed sites are present in 5′ UTRs (1% Ago–mRNA tags), and high efficacy in 3′ UTRs (40% tags including 8% within 10 kb downstream of annotated transcripts, regions likely to have unannotated 3′ UTRs26,35). In addition, an extensive set of tags was identified in other locations, including coding sequence (25%), a location for which there is emerging evidence of miRNA regulation36,37,38,39,40, introns (12%), and non-coding RNAs (4%), indicating that these sites may provide new insights into miRNA biology. Within mRNAs, Ago–mRNA clusters were highly enriched in the 3′ UTR (60%; Supplementary Fig. 8), especially around stop codons (with a peak 50 nucleotides downstream) and at the 3′ end of transcripts (70 nucleotides upstream of presumptive poly(A) sites; P < 0.003, Fig. 1h), consistent with bioinformatic observations from microarray data34. Taken together, these data suggested that Ago–mRNA clusters might be associated with functional binding sites.

To examine the relationship between Ago–mRNA clusters and potential sites of miRNA action, we first performed an unbiased search for all 6–8-nucleotide sequence motifs within Ago–mRNA clusters using linear regression analysis (Supplementary Methods). The six most enriched motifs corresponded to seed sequences present within the most frequently crosslinked miRNAs in Ago–miRNA CLIP (Ago–miRNAs; Supplementary Table 2). The most significant match corresponded to the seed sequence of miR-124, a well-studied brain-specific miRNA (P = 8.3 × 10⁻⁵⁸; because miR-124 is only the eighth most frequently crosslinked Ago–miRNA, these data may indicate over-representation of miR-124 seed matches in the genome (Supplementary Tables 2 and 3), or contributions from currently unknown rules of Ago binding). To define more precisely the region of mRNA complexed with Ago, we examined the width of 61 robust Ago–mRNA clusters (BC = 5; total tags >30) relative to their peaks (determined by cubic spline interpolation; Supplementary Methods). We found that Ago bound within 45–62 nucleotides of cluster peaks ≥95% of the time (Fig. 2a), and we defined this region as the average Ago–mRNA footprint.

Figure 2: Distribution of mRNA tags correlates with seed sequences of miRNAs from Ago CLIP.

a, Ago–mRNA cluster width and peaks. The peaks of 61 robust clusters (BC = 5, peak height >30, with single peaks) were determined, and the position of tags (brown lines and fraction plotted as brown graph) and width of individual clusters (green lines and fraction plotted as green graph) are shown relative to the peaks (Supplementary Methods). The minimum region of overlap of all clusters (100%) was within -24 and +22 nucleotides of cluster peaks, and ≥95% were within -30 and +32 nucleotides, suggesting that the Ago footprint on mRNA spans 62 nucleotides (or, more stringently, 46 nucleotides). b, A linear regression model was used to compare miRNA seed matches enriched in the stringent Ago footprint region with the frequency of miRNAs experimentally determined by Ago–miRNA HITS-CLIP (see also Supplementary Fig. 9). c, The positions of conserved core seed matches (in position 2–7) from the top 30 Ago-bound miRNAs (independent of peak height; purple colours; including miR-124, red; or, bottom 30 miRNAs, expressed at extremely low levels in brain (Supplementary Fig. 2); black to grey colours) are plotted relative to the peak of 134 robust clusters (BC = 5, peak height >30). A total of 118 of 171 seed matches are within the Ago 62-nucleotide footprint. d, The positions of conserved miR-124 seed matches (bottom panel; each is represented by a different colour) were plotted relative to the peak position of all Ago–mRNA clusters (BC ≥2). Top panel: distribution of mir-124 seed matches (plotted relative to cluster peak, normalized to number of clusters; red graph). The pale colour indicates s.d. Excess kurtosis (k) indicates that seed sites are present in a sharp peak relative to a normal distribution.


Within Ago–mRNA footprints (11,118 clusters; BC ≥ 2) we found a high correlation between the frequency of Ago–miRNAs and the frequency of their seed matches (Fig. 2b and Supplementary Fig. 9). The seed matches were near the peaks of 134 robust clusters (Fig. 2c and Supplementary Table 3), and their distribution was leptokurtic (with a higher and sharper peak than a normal distribution; excess kurtosis (k) = 1.08, versus 0 in a normal distribution). As a control, seed matches for a ‘negative’ group of miRNAs (the lowest ranks from the Ago–miRNA list) were uniformly distributed across these clusters (Fig. 2c; k = -1.35, versus -1.2 in uniform distribution). Taken together, these results indicate that the Ago–mRNA footprint is rich in, and may predict, miRNA binding sites with enhanced specificity over other approaches (at least a threefold improvement in false positives; Supplementary Methods).
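The reference values quoted for excess kurtosis (0 for a normal distribution, -1.2 for a uniform one) can be checked with a short calculation. The sketch below uses the population (biased) moment estimator, which is an assumption, as the estimator used for the figures is not stated here; the positions are synthetic and for illustration only.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3.

    Positive values indicate a sharper peak (leptokurtic) than a normal
    distribution; a flat, uniform spread gives about -1.2.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


# Synthetic seed-match positions relative to a cluster peak.
uniform_positions = list(range(-500, 501))          # evenly spread
peaked_positions = [0] * 90 + [-30] * 5 + [30] * 5  # concentrated at the peak

print(round(excess_kurtosis(uniform_positions), 3))  # close to -1.2
print(round(excess_kurtosis(peaked_positions), 3))   # strongly positive
```

A set of seed-match positions concentrated at the cluster peak therefore scores well above 0, whereas positions scattered evenly across the window score near -1.2.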

Ago HITS-CLIP and miR-124

To explore further the relationship between the Ago–mRNA footprint and miRNA binding we focused on miR-124 sites. There was a marked enrichment of conserved miR-124 seed matches in Ago–mRNA clusters (BC ≥2; Fig. 2d, estimated false-positive rate of 13%; Supplementary Methods). Eighty-six per cent of the predicted miR-124 binding sites were present within the Ago footprint region, again in a tight peak region showing leptokurtic distribution (Fig. 2d; k = 11.63). Although some predicted seeds outside of the Ago footprint might correspond to false positives, we noted small secondary peaks at ±50 nucleotides outside the Ago footprint (Fig. 2d), indicating the possibility of cooperative secondary miR-124 binding sites in some transcripts, consistent with previous data34. Relative to more stringent analyses (Fig. 2c), our analysis at this threshold (BC ≥ 2) was more sensitive and sufficiently specific such that we used it for subsequent analyses (Supplementary Fig. 7).

We searched published examples of miR-124-regulated transcripts for Ago–mRNA clusters harbouring miR-124 seed matches within the predicted 62-nucleotide Ago footprint (referred to as Ago–miR-124 ternary clusters). We identified such ternary clusters in 5 of 5 transcripts in which miR-124 seed sites had been well defined by functional studies (including mutagenesis of seed sequences in full-length 3′ UTR; Fig. 3 and Supplementary Fig. 10). In each of these transcripts, there were many predicted miRNA target sites in the 3′ UTR relative to the small number of Ago–mRNA ternary clusters found, suggesting that there may be a significant number of false-positive predictions from bioinformatic algorithms (Fig. 3 and Supplementary Fig. 10). For example, the 3′ UTR of Itgb1 mRNA has 50 predicted miRNA target sites, including two miR-124 sites, but only 5 Ago–mRNA ternary clusters (Fig. 3a). Using the Ago footprint to predict which miRNAs bound at these sites (Ago ternary map; Supplementary Methods) we identified three as miR-124 sites, one of which was not predicted computationally because the seed sequence is not conserved (Fig. 3a); similar observations were made in the Ctdsp1 3′ UTR (Supplementary Fig. 10). Previous luciferase assays demonstrated that miR-124 suppression of Itgb1 (to 35% control levels) was partially reversed (to 85% of control levels) by mutating both of the two predicted seed sequences41; our observation of an Ago–miR-124 ternary cluster at this third non-conserved site may explain the partial rescue. Conversely, in the Ptbp1 3′ UTR, the absence of any Ago footprint at a predicted miR-9 seed site was consistent with previous studies which found this site to be non-functional42 (Fig. 3b). Additionally, Ptbp1 has seven predicted miR-124 seed sites, of which five were previously tested and only two found to be functional in luciferase assays42; only these two sites harboured Ago–miR-124 ternary clusters (Fig. 3b).
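The definition of a ternary cluster (a seed match lying within the Ago footprint around a cluster peak) can be sketched as a simple window search. This is an illustration under stated assumptions: the function names are hypothetical, the window of 30 nt upstream and 32 nt downstream is taken from the ≥95% footprint in Fig. 2a, only the first eight nucleotides of miR-124 are used and should be checked against miRBase, and the example transcript is synthetic.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_match(mirna: str, start: int = 2, end: int = 7) -> str:
    """mRNA motif complementary to miRNA positions start..end (1-based)."""
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement


def seed_sites_in_footprint(mrna, peak, motif, upstream=30, downstream=32):
    """0-based start positions of motif lying wholly inside the footprint."""
    mrna = mrna.upper().replace("T", "U")
    lo = max(0, peak - upstream)
    hi = min(len(mrna), peak + downstream)
    hits = []
    pos = mrna.find(motif, lo)
    while pos != -1 and pos + len(motif) <= hi:
        hits.append(pos)
        pos = mrna.find(motif, pos + 1)
    return hits


# Assumed 5' end of miR-124 (verify against miRBase before real use).
MIR124_5P = "UAAGGCAC"
motif = seed_match(MIR124_5P)          # core 6-mer match, positions 2-7
print(motif)

# Synthetic transcript: one site near the peak, one far outside it.
utr = "A" * 40 + motif + "A" * 40 + motif
print(seed_sites_in_footprint(utr, peak=42, motif=motif))
```

Only the first site falls inside the window in this example; the second would be treated as a seed match without supporting Ago footprint evidence.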

Figure 3: Ago–miRNA ternary clusters in validated miR-124 mRNA targets.

a, Ago–mRNA CLIP tags (top panel, raw tags, one colour per biological replicate; second panel, ternary map of Ago–mRNA normalized clusters around top predicted miRNA sites) compared with predicted miRNA sites (using indicated algorithms) for the 3′ UTR of Itgb1 (third panel, see Supplementary Methods; colours indicate predicted top 20 miRNAs as in Fig. 6; grey bars indicate miRNAs ranking below the top 20 (per heat map) in Ago–miRNA CLIP). All predicted miR-124 6–8-mer seeds (conserved (red) or non-conserved (yellow) sites) are shown. The bottom panel shows data from luciferase assays in which mutagenesis of predicted miR-124 seeds at the indicated positions had the indicated effects on rescuing miR-124-mediated suppression (35% of baseline luciferase levels)41. b, Ago–miRNA ternary maps compared with previously reported functional data for Ptbp1 (ref. 42) (see Supplementary Fig. 10 for maps of Ctdsp1 (ref. 44) and Vamp3 (ref. 44)).


To extend these observations, we compiled brain-expressed transcripts from a meta-analysis of five published microarray experiments which identified transcripts downregulated after miR-124 overexpression in HeLa and other cell lines (Supplementary Methods)7,11,21,42,43. Brain-expressed transcripts that had predicted 3′ UTR miR-124 seeds were also suppressed at the mRNA and protein level by miR-124 overexpression (Fig. 4), consistent with previous experiments11,18. However, transcripts with Ago–miR-124 ternary clusters had a significantly greater tendency to be downregulated at the mRNA and protein level (P < 0.01, Kolmogorov–Smirnov test, Fig. 4). Ago–miR-124 ternary clusters had much greater predictive value (true positive rate = 73%) and specificity (92.5%) than analysis of conserved seed sequences alone (Supplementary Fig. 11). We validated these studies experimentally by examining Ago–mRNA clusters that appeared de novo in HeLa cells (which do not express significant (<0.03%) miR-124) after miR-124 transfection. Applying these data to the meta-analysis, mRNAs for which 3′ UTRs harboured new Ago–miR-124 ternary clusters after miR-124 transfection showed an even greater enrichment in miR-124-dependent changes in transcript (Fig. 4c) and protein (Fig. 4d) levels (P < 0.01, Kolmogorov–Smirnov test).
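The Kolmogorov–Smirnov comparison used above asks whether two sets of fold changes are drawn from the same distribution, by taking the largest gap between their empirical cumulative distributions. The sketch below computes only the test statistic in plain Python; the P value is omitted (a library routine such as `scipy.stats.ks_2samp` could supply it), and the log2 fold-change values are hypothetical numbers chosen for illustration, not data from this study.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the empirical
    cumulative distribution functions of the two samples.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:  # the maximum is always attained at a data point
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


# Hypothetical log2 fold changes after miRNA overexpression.
with_ternary_cluster = [-0.9, -0.7, -0.6, -0.4, -0.3, -0.1]
seed_only = [-0.4, -0.2, -0.1, 0.0, 0.1, 0.2]

print(round(ks_statistic(with_ternary_cluster, seed_only), 3))
```

A larger statistic indicates that the cumulative curves, such as those plotted in Fig. 4, are further apart.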

Figure 4: Meta-analysis of Ago–mRNA clusters in large-scale screens of miR-124-regulated targets.

a, Transcripts with predicted (conserved) miR-124 seeds (purple line) showed miR-124 suppression relative to all transcripts expressed in brain and cell lines (blue line) or those with no miR-124 seed sequences (green line). Transcripts with Ago–miR-124 ternary clusters (containing both miR-124 seed sequences and Ago–mRNA CLIP tags; red line) showed further miR-124 suppression. b, Similar results were seen when analysing miR-124-dependent protein suppression (identified by SILAC)11, with discrimination by the presence of Ago–miR-124 ternary clusters especially evident where there were smaller numbers of transcripts showing larger changes (log2 <-0.4; inset). c, Transcripts expressed in miR-124-transfected HeLa cells that harbour new Ago–miR-124 clusters (red line; or a subset of transcripts also harbouring Ago–miR-124 clusters in mouse brain; yellow line), compared with previous analysis7 of regulated transcripts in miR-124-transfected HeLa cells. d, As in c, plotted for predicted protein levels, compared with previous data11.


We next examined Ago–miR-124 ternary clusters present in validated individual transcripts identified from among 168 candidate miR-124-regulated transcripts7. These targets were analysed previously44 using a rigorous three-part strategy to validate experimentally 22 of them (although miR-124 binding sites were not generally defined). Sixteen out of twenty-two harboured Ago–miR-124 ternary clusters in the 3′ UTR (Fig. 5), and in five additional transcripts with low expression levels, Ago–mRNA CLIP tags were identified at predicted miR-124 seed sites. For transcripts of even moderate abundance (with normalized probe intensities ≥700; average for P13 brain transcripts 1,255; Supplementary Methods), we identified all ten predicted targets (Fig. 5). These data indicate that identifying Ago–miR-124 ternary clusters markedly enhances the sensitivity and specificity of detecting bona fide miR-124 targets.

Figure 5: Ago–miR-124 ternary maps in brain and transfected HeLa cells.

a, Comparison of 22 validated miR-124 targets44 with Ago–miR-124 ternary clusters in brain transcripts (BC ≥2); miR-124 clusters are shown graphically above each 3′ UTR, as in Fig. 3, and corresponding conserved (red) or non-conserved (yellow) miR-124 seeds (below) were identified in 16 of 22 validated transcripts (gene names shaded red). In 5 additional transcripts (gene names shaded orange) clusters were below normalization threshold, but individual Ago–mRNA CLIP tags are shown. Tom1l1 had no detectable Ago binding, although notably a brain-expressed antisense transcript (Cox11) is present. Previously reported changes in miR-124-dependent transcript levels from miR-124-transfected HeLa cells7, and levels of the same transcript (‘Tx’) in mouse brain (normalized probe intensities), are shown. b, Columns represent Ago–miR-124 ternary clusters in HeLa cells transfected with either miR-124 (‘+miR-124’) or a control miRNA (‘-miR-124’). Seventeen de novo Ago–miR-124 ternary clusters (red; present only after miR-124 transfection) are shown, together with clusters present in both control and miR-124-transfected cells (purple), or present in control HeLa cells alone (blue).


We examined this set of 22 targets44 for Ago–mRNA clusters that appeared in the 3′ UTR after miR-124 transfection (Fig. 5). Notably, from among many potential miR-124 seed sites, 17 de novo Ago–miR-124 ternary clusters appeared after transfection and 14 of 17 were at precisely the positions predicted from the brain Ago–miR-124 maps. Genome-wide, we identified 204 de novo Ago–miR-124 clusters with conserved seeds in mouse brain transcripts; of these 98 were independently identified as mouse brain Ago–miR-124 ternary clusters. Taken together, these results confirmed that the Ago ternary map identifies functional sites of miRNA regulation.

Predicting miRNA functional networks

On the basis of the robust correlation between previously validated miR-124 functional sites and Ago HITS-CLIP, we examined brain Ago–mRNA clusters to predict binding maps for the 20 most abundant Ago–miRNAs (Supplementary Methods). These maps (Fig. 6a) show that Ago binds to target transcripts at very specific sites: on average there are only 2.6 Ago–mRNA clusters (BC ≥2) per Ago regulated transcript (2.12 per 3′ UTR) and each miRNA binds an average of only 655 targets (Supplementary Fig. 14). To explore the potential of Ago HITS-CLIP maps to define miRNA regulated transcripts, we examined the functions encoded by the predicted targets of these 20 miRNAs using gene ontology (GO) analysis: comparison of these results with predictions made using GO analysis of TargetScan predictions (Supplementary Fig. 15) demonstrates that the false discovery rate and ‘quality’ of the protein network deteriorate substantially when the Ago–mRNA map is not used. Target predictions from the Ago HITS-CLIP map suggest that diverse neuronal functions are regulated by different sets of miRNAs (Fig. 6b). The largest set of miRNA-associated functions, ‘neuronal differentiation’, illustrates interwoven but distinct pathways predicted to be regulated by three miRNAs expressed in neurons (Fig. 6c). The Ago–RNA ternary map corresponds remarkably well with the current view of miR-124, miR-125 and miR-9 biology, including actions to promote neurite outgrowth and differentiation by inhibiting Ago–miR-124 targets (Fig. 6c; discussed in Supplementary Fig. 15).

Figure 6: Ago–miRNA ternary maps.

a, Genome-wide views of Ago–miRNA ternary maps for the top 20 miRNAs from Ago HITS-CLIP (colours represent individual miRNA targets as indicated in b) are shown for the Itgb1 gene (top panel), the local gene region (middle panel; all transcripts are expressed in P13 brain except those outlined in grey boxes), showing tags in neighbouring 3′ UTRs, and for all of chromosome 8 (bottom panel). b, Heat map derived from gene ontology (GO) analysis of transcripts identified as targets of each of the top 20 miRNAs. The tree shows the hierarchical clustering of miRNAs based on GO (Supplementary Methods). Significant clusters are outlined with black boxes (see Supplementary Fig. 12). c, Ago HITS-CLIP targets are shown for the most significant pathways (neuronal differentiation/cytoskeleton regulation; on the basis of false discovery rate (FDR), panel b) for miR-124, miR-9 and miR-125 in mouse brain. Actin cytoskeleton pathways are shown based on the KEGG database (http://www.genome.jp/kegg/).


Discussion

Ago–miRNA–mRNA ternary maps identify functionally relevant miRNA binding sites in living tissues, and were developed in the context of several recent studies. Crystallographic structures of Ago–miRNA–mRNA ternary complexes31 demonstrated close contacts between all three molecules, consistent with the ability of CLIP, which requires close protein–RNA contacts28,29, to detect both Ago–miRNA and Ago–mRNA interactions. The development of HITS-CLIP set the stage for generating and analysing genome-wide RNA–protein maps in the brain26 and cultured cells30. High throughput experiments and bioinformatic analysis together generated genome-wide predictions of miRNA seed sequences, particularly of miR-124. These studies demonstrated that miR-124 simultaneously represses hundreds of transcripts7,11,21,42,43,44, and provided a genome-wide ‘gold standard’ with which to compare Ago HITS-CLIP data. This allowed estimates of specificity, false-positive and false-negative rates (93%, 13–27% and 15–25%, respectively; Supplementary Methods) that, although limited to one Ago–miRNA–mRNA data set, indicate that experimental Ago HITS-CLIP data outperforms bioinformatic predictions alone (Fig. 4 and Supplementary Fig. 11).

Although we used seed-driven approaches to validate targets, not all Ago binding need be constrained by these rules. Twenty-seven per cent of Ago–mRNA clusters have no predicted seed matches among the top 20 Ago–miRNA families. Such orphan clusters might bind other miRNAs, or miRNAs that follow other rules of binding, such as wobble or bulge nucleotides40,45,46 (S.W.C. and R.B.D., unpublished observations). Orphan clusters also provide another means of estimating the false-negative rate (15%; Supplementary Methods), which compares favourably with previous studies in which false-negative rates were between 50% and 70%11,18,20.

Ago HITS-CLIP resolves some obstacles that have arisen in efforts to understand miRNA action. It has been difficult to discriminate direct from indirect actions of miRNAs, and to extrapolate miRNA overexpression studies in tissue culture to organismal miRNA action. Target RNAs have previously been identified by immunoprecipitation, microarray analysis21,44 and reporter validation assays, with the concern that low stringency immunoprecipitation of non-crosslinked RNA–protein complexes47, including Ago–miRNAs48, may purify indirect targets. This has spurred interest in efforts to explore miRNA-target identification by covalently crosslinking, using formaldehyde or 4-thio-uridine-modified RNA in culture to identify transcripts complexed with Ago, miRNAs and additional proteins48,49. HITS-CLIP offers a clear means of identifying direct Ago targets and identifying specific interaction sites, which in turn offers the possibility of specifically targeting miRNA activity.

Ago HITS-CLIP complements bioinformatic approaches to miRNA target identification by restricting the sequence space to be analysed to the 45–60-nucleotide Ago footprint. For highly conserved 3′ UTRs, such as those of the RNABPs Ptbp2, Nova1 and Fmr1, many miRNA sites are predicted using algorithms that rely on sequence conservation, but each has only one Ago–mRNA CLIP cluster (Supplementary Fig. 10). In fact, miRNA selectivity is very high, such that on average transcripts have between one and three major Ago-binding sites in a single tissue (Fig. 3 and Supplementary Figs 10 and 14). Ago–mRNA binding sites themselves have no apparent sequence preference (data not shown), suggesting that accessibility may rely on additional RNABPs. Such a mechanism, which may be assessed by overlaying HITS-CLIP maps of different RNABPs26, could provide a means of dynamically regulating miRNA binding and regulation1.

By simultaneously generating binding maps for multiple miRNAs, Ago HITS-CLIP offers a new approach to understanding combinatorial control of target RNA expression. At the same time, analysis of a single miRNA, miR-124, demonstrated that its expression not only induced Ago to bind miR-124 sites, but reduced or precluded Ago binding to sites occupied in untransfected cells (Fig. 5), perhaps reflecting competition for a limited capacity for miRNA binding on a given 3′ UTR50. Such Ago occlusion has important mechanistic, experimental and clinical implications wherever studies manipulating miRNA levels are envisioned.

Methods Summary

Ago HITS-CLIP was performed in biological replicates as described26,27 (using monoclonal antibody 2A8 or 7G1-1* as described in Supplementary Methods). High-throughput sequencing was performed with an Illumina Genome Analyser.

Affymetrix exon arrays (MoEx 1.0 ST) were used to measure transcript abundance in P13 mouse brain and data were analysed with Affymetrix Power Tools. Bioinformatics analysis used the UCSC genome browser, miRBASE, BioPython, Scipy and GoMiner, as described in Supplementary Methods. Additional data access can be found at our project website http://ago.rockefeller.edu/.