Main

Several lines of evidence suggest that improved information about the cannabinoid receptor type 1 (CB1/Cnr1) gene and its human variants might add to our understanding of vulnerabilities to addictions. This Gi/Go-coupled receptor is abundant in brain regions important for drug reward and drug ‘memories’, including hippocampus, striatum, and cerebral cortex.1, 2, 3, 4 CB1/Cnr1 appears to mediate virtually all of the rewarding actions of active marijuana components.5 CB1/Cnr1 receptors also modulate the neurotransmitter dopamine in brain circuits that are important for the rewarding effects of most abused substances.6 CB1/Cnr1 knockout (KO) mice or mice treated with CB1/Cnr1 antagonists display less reward from several addicting compounds including cannabinoids, morphine, nicotine and ethanol.5, 7, 8, 9

CB1/Cnr1 is also a candidate for involvement in other important phenotypes. Personality changes that have been reported in human cannabinoid users make CB1/Cnr1 variants candidates for contributions to heritable differences in personality in drug-free individuals.10 Evidence that CB1/Cnr1 variants could play roles in schizophrenia includes observations of elevated cerebrospinal fluid levels of endogenous CB1/Cnr1 ligands including anandamide, altered frequencies of CB1/Cnr1 genetic markers and elevated frequencies of marijuana use in at least some studies of schizophrenic samples.11, 12, 13, 14, 15 The CB1/Cnr1 human chromosome 6q14–15, 91.8–96.1 cM locus also lies within 15 cM of the D6S474 and D6S424 markers that have been linked to schizophrenia in some samples.16, 17, 18

Despite such interesting features, many details of the structure of the human CB1/Cnr1 mRNA and gene remain unclear. Current GenBank CB1/Cnr1 cDNA variants are termed 1 (5.4 kb; gi15208646) and 2 (1.2 kb; gi15208647). Aceview lists a similar CB1/Cnr1 mRNA ‘a’ variant as well as a ‘b’ variant of 1868 bp. Both databases indicate shifted open reading frames and shorter coding regions for the 2/b variants.19, 20, 21, 22

Some CB1/Cnr1 genomic variation has been examined. Initial association studies have used several of these variants as markers. Current NCBI and Celera databases list 46 CB1/Cnr1-associated single-nucleotide polymorphisms (SNPs) and other variants (Celera: http://www.celeradiscoverysystem.com/index.cfm; NCBI: http://www.ncbi.nlm.nih.gov/). Among the most-studied are a nonsynonymous 1359A>G variant in codon 453 of the CB1/Cnr1 protein23 and an (AAT)n trinucleotide simple sequence length polymorphism (SSLP) whose position had not been well characterized.15 While associations of these CB1/Cnr1 markers with substance dependence,24, 25, 26 P300 event-related potentials27 and schizophrenia13 have been reported, these associations have not always been confirmed in other populations.15, 28, 29

During initial evaluations of the CB1/Cnr1 locus, we noted discrepancies between the sum of previously identified CB1/Cnr1 exon sequences and the sizes of CB1/Cnr1 mRNAs. To improve understanding of this locus, its human variants, and the possible roles for these human variants in addiction vulnerabilities, we now report assessments of CB1/Cnr1 mRNA sizes, CB1/Cnr1 genomic sequences that include novel exonic sequences, novel CB1/Cnr1 splice variants and novel CB1/Cnr1 polymorphisms. We identify functional activities of CB1/Cnr1 promoter/enhancer sequences that include candidate regulatory motifs in CB1/Cnr1 5′ flanking regions. We describe patterns of linkage disequilibrium across this locus. We report association between CB1/Cnr1 haplotypes and substance abuse phenotypes for which CB1/Cnr1 variation is a good a priori candidate in three different samples. We describe studies of the haplotype-specific expression of CB1/Cnr1 mRNAs that appear to confirm the functional impact of the CB1/Cnr1 haplotypes that are associated with addiction vulnerabilities.

Materials and methods

Genomic, cDNA and EST sequence assemblies

Database searches reveal sequences of bacterial artificial chromosomal (BAC) clones AL121835/gi13560256 (116.691 kb) and AL136096/gi8217458 (112.212 kb) that provide a genomic contig that includes CB1/Cnr1 sequences. In all, 66 expressed sequence tags (ESTs) with >93% homology with the relevant 80 kb of these BAC clone sequences were also used for CB1/Cnr1 DNA sequence assemblies.

Human hippocampal poly A+ RNA (Clontech, Palo Alto, CA, USA) was reverse transcribed to obtain single-stranded cDNA, and 5′RACE was performed. The universal forward primer and a gene-specific reverse primer (5′-TGGGCCTGGTGACAATCCTCTTATAGGC-3′) that corresponded to sequences in the CB1/Cnr1 coding region were used (SMART RACE cDNA Amplification Kit, Clontech, Palo Alto, CA, USA). The nested universal primer and a nested gene-specific reverse primer (5′-CCACCCAGTTTGAACAGAAACACGTTGC-3′) were used in second round 5′RACE. Product bands were cloned into pCR2.1-TOPO (Invitrogen, Carlsbad, CA, USA) and sequenced using an ABI capillary sequencer. Analyses revealed 5′RACE products that corresponded to novel exons that we term exons 1–3 as well as the previously described main coding exon (GI:38683843), which we now term exon 4.

Reverse transcription-polymerase chain reactions (RT-PCR)

Single-strand cDNA that was reverse transcribed from human hippocampal poly A+ RNA was used for RT-PCR reactions using oligonucleotides with sequences based on the three postulated novel exons as follows: A fragment (exon 1): 5′-GCCTCCCGCACGCTACTCCC-3′ and 5′-CTGGTCCTCGGGACAGAAGCTCCC-3′; B fragment (exon 3): 5′-TGAGGGAAGATGGCATAAGGAATGGG-3′ and 5′-TGCCAATGCTACTTCCATGTCTGAGACC-3′; C fragment (exon 4): 5′-GATGGCCTTGCAGATACCACCTTCCG-3′ and 5′-CCACCCAGTTTGAACAGAAACACGTTGC-3′. Three fragment products, termed A, B and C, were sequenced. Quantitative PCR was performed for 24 cycles using a modified ABI PCR System 9700 with denaturation at 95°C for 15 s, annealing at 56°C for 15 s and extension at 72°C for 60 s.

Northern blotting

A and B fragment RT-PCR products and a 730 bp amplification product of exon 4, produced using oligonucleotides 5′-AGCATGTTTCCCTCTTGTGAAGGCACT-3′ and 5′-CACGTATCCACTGCTTGTCCA-3′, were cloned into pCR2.1-TOPO and confirmed by sequencing. Inserted fragments were digested and radiolabeled using [α-32P]dCTP (ICN Biomedicals, CA, USA) and the Megaprime DNA labeling system (Amersham Bioscience, NJ, USA). Probes were hybridized to nylon membranes, which had been blotted with 40 μg human brain total RNA or 10 μg human hippocampal poly A+ RNA (Clontech, Palo Alto, CA, USA). After hybridization, membranes were washed twice for 20 min with 0.1 × SSC/0.1% SDS at 25°C, rinsed once with water, exposed to phosphoimaging plates, stripped of hybridization probe by boiling in 0.1 × SSC/0.1% SDS solution for 5 min and rehybridized. Blots were hybridized serially with exon 4, then exon 3 then exon 1 hybridization probes.

Cloning 5′ flanking region segments and assessing their promoter activities

Sequences of 1, 2 and 3 kb located 5′ to exon 1 of the CB1/Cnr1 gene were amplified using forward primer for 1.0 kb: 5′-CGCAGCCAGGTAGCGAACG-3′; forward primer for 2.0 kb: 5′-GATAACCTTTTCTAACCACCCACCTAG-3′; forward primer for 3.0 kb: 5′-ATCCTTGACCTCTAAATGGAAAGTCAG-3′ and reverse primer: 5′-GCGGAAAAGAAGTGGAGAAG-3′. Sequences 1.1 and 2.5 kb 5′ to exon 3 were amplified using forward primer for 1.1 kb: 5′-GATGGCCTCCATTTCCTCATCTG-3′; forward primer for 2.5 kb: 5′-TCAGGGAGCACCAGTCCTTCC-3′ and reverse primer: 5′-CAAGGAGCAAGAGTTGACTCCTGAG-3′. These amplimers were each fused to luciferase in pGL3-Enhancer (Promega, Madison, WI, USA) to generate pCB1.0, pCB2.0, pCB3.0, pCB1.1 and pCB2.5 recombinant plasmids, respectively. These plasmids, the promoterless negative control (pGL-blank), the positive SV-40 promoter-containing control vector pGL-SV40 and the transfection-efficiency control plasmid pRL-TK were transfected into NG108-15, N1E-115, SK-N-SH and CHO-K1 cells (ATCC, Manassas, VA, USA) grown to 80% confluence in six-well plates in DMEM (Invitrogen, Carlsbad, CA, USA) or F12K (Sigma, Germany). Cells were transfected using 18 h, 37°C incubations with SuperFect reagent as described by the manufacturer (QIAGEN, Valencia, CA, USA). Cells were incubated for another 48 h to allow expression, then collected and lysed. Lysate luciferase activities were quantitated using the dual-luciferase reporter system (Promega, Madison, WI, USA) and a TD-20/20 luminometer (Turner Designs, Fresno, CA, USA) as described by the manufacturer.
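The normalization implied by the use of the pRL-TK transfection-efficiency control can be illustrated with a minimal sketch. It assumes the conventional dual-luciferase calculation (firefly signal divided by Renilla signal for each well, then expressed relative to a control construct); the function names and all readings below are hypothetical and are not taken from this study.

```python
from statistics import mean


def relative_luciferase(firefly, renilla):
    """Normalize a firefly reading to the co-transfected Renilla (pRL-TK) reading."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def percent_of_control(sample_wells, control_wells):
    """Mean normalized activity of a construct as a percentage of a control construct.

    Each well is a (firefly, renilla) pair of luminometer readings.
    """
    sample = mean(relative_luciferase(f, r) for f, r in sample_wells)
    control = mean(relative_luciferase(f, r) for f, r in control_wells)
    return 100.0 * sample / control


# Hypothetical readings (arbitrary light units), for illustration only.
pcb10_wells = [(1200.0, 100.0), (1300.0, 100.0)]  # e.g. a pCB1.0-like construct
sv40_wells = [(2000.0, 100.0), (1800.0, 90.0)]    # e.g. a pGL-SV40-like control

if __name__ == "__main__":
    print(f"{percent_of_control(pcb10_wells, sv40_wells):.1f}% of control")
```

Dividing by the Renilla signal before averaging keeps well-to-well differences in transfection efficiency from inflating or deflating the apparent promoter activity.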

Primer extension

An oligonucleotide (5′-AAGGGAAGGCGCTGGCGC-3′) complementary to exon 1 sequences was labeled with [γ-32P]ATP (ICN Biomedicals, CA, USA) and T4 polynucleotide kinase (Invitrogen, Carlsbad, CA, USA) and allowed to hybridize to sequences in 5 μg human hippocampal poly A+ RNA. Primer extension was performed as described by the manufacturer (Promega, Madison, WI, USA). Radiolabeled primer extension products were separated using 6% polyacrylamide gels containing 8 M urea and coelectrophoresed size standards and detected by autoradiography.

Human subjects

Polysubstance abusers and controls were research volunteers of self-reported European-American or African-American ancestry recruited at the NIDA-IRP in Baltimore, Maryland under informed consent from the NIDA/NIH Institutional Review Board and characterized using quantity-frequency and DSM-III-R and DSM-IV criteria as previously described.30 Tokyo-area research volunteers without medical or psychiatric diagnoses and volunteers with diagnoses of alcohol dependence based on DSM-IV criteria were recruited under informed consent from the University of Tsukuba.

Analyses of allelic variants

Genomic DNAs were prepared from blood by phenol extraction methods and from frozen brain samples using Puregene (Gentra, Minneapolis, MN, USA). Several polymorphic regions of CB1/Cnr1 were amplified by PCR (Table 1). Genotyping of the rs1049353 polymorphism at nucleotide position 1359 in codon 453 (Thr) was carried out as described.15, 31 Genotyping −23866A>C (rs754387, hCV9662507), −10908G>A (rs806381, hCV1652582), 3813A>G and 4894A>G (rs806368, hCV8943804) polymorphisms used PCR-restriction fragment length polymorphism methods and 2–4% agarose gels (Table 1). Genotyping the −3180T ins/del polymorphism in exon 3 used primer extension methods (Table 2, SnapShot, ABI, Foster City, CA, USA). Lengths of PCR fragments containing (AAT)n polymorphisms were measured using an ABI autosequencer. SNPs hCV9046296, hCV9046297, rs754387 (hCV9662507), rs2180619 (hCV15841551), rs2273512 (hCV16178942), hCV11418432, rs6454674 (hCV11418433), hCV11418434, hCV27392043, hCV9863392, rs806381 (hCV1652582), rs806380 (hCV1652583), rs806379 (hCV1652584), rs1535255 (hCV8943758), rs2023239 (hCV11600616) and rs806378 (hCV8943766) were genotyped using TaqMan 5′ nuclease assays (Table 3, ABI, Foster City, CA, USA).

Table 1 Oligonucleotides used to amplify CB1/Cnr1 gene fragments, polymorphisms, restriction enzymes used and allele sizes
Table 2 Oligonucleotides used for SnapShot −3180T I/D genotyping
Table 3 Oligonucleotides used for TaqMan SNP genotyping

Brain regional RNA analyses

Total and poly A+ RNAs were extracted from frozen brain regional samples obtained from the University of Maryland brain tissue bank using oligo-dT cellulose (Fastrack, Invitrogen) and RNAzol B (Tel-Test Inc., Friendswood, TX, USA). Single-stranded cDNA was synthesized from mRNAs from individuals heterozygous for CB1/Cnr1 exon 3 haplotypes using oligo-dT priming and Thermoscript reverse transcriptase reactions as described (Invitrogen, Carlsbad, CA, USA).

Allele-specific expression levels were assessed in two ways. Expression of mRNA and cDNA with different alleles at rs806378 (hCV8943766) was assessed using oligonucleotides listed in Table 3 and TaqMan/real-time PCR 5′ nuclease assays as described (ABI, Foster City, CA, USA). Expression of mRNA and cDNA with different alleles at the −3180T I/D polymorphism was assessed using oligonucleotides listed in Table 2 and primer extension methods (SnapShot, ABI, Foster City, CA, USA). The Ct values (TaqMan) and peak heights (SnapShot) were compared to allelic expression signals from standards constructed by pooling DNA from individuals with known haplotypes, and normalized allelic ratios were calculated. For several samples, genotype distributions were also confirmed by resequencing genomic DNA and PCR products produced from the cDNA (3100 Genetic Analyzer, ABI, Foster City, CA, USA).
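One way such normalized allelic ratios could be computed is sketched below. The sketch assumes that the pooled-DNA standard represents a known 1:1 allelic mixture and, for the TaqMan data, that amplification efficiency is close to a doubling per cycle; the exact normalization procedure is not specified here, and the function names and input values are hypothetical.

```python
def normalized_peak_ratio(sample_a, sample_b, standard_a, standard_b):
    """Allele A : allele B ratio from primer-extension peak heights.

    The cDNA peak-height ratio is divided by the ratio measured for a
    standard assumed to contain both alleles in equal amounts, which
    corrects for unequal signal from the two extension products.
    """
    return (sample_a / sample_b) / (standard_a / standard_b)


def normalized_ct_ratio(ct_a, ct_b, std_ct_a, std_ct_b, efficiency=2.0):
    """Allele A : allele B ratio from allele-specific real-time PCR Ct values.

    A lower Ct means more template, so the ratio is
    efficiency ** -(delta_Ct_sample - delta_Ct_standard).
    efficiency=2.0 assumes perfect doubling per cycle (an assumption).
    """
    delta_sample = ct_a - ct_b
    delta_standard = std_ct_a - std_ct_b
    return efficiency ** -(delta_sample - delta_standard)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(normalized_peak_ratio(200.0, 1000.0, 900.0, 1000.0))
    print(normalized_ct_ratio(27.0, 25.0, 25.5, 25.5))
```

A ratio near 1 would indicate similar expression from the two alleles, while a ratio well below 1 would indicate reduced expression of allele A relative to allele B.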

Statistical analyses

Statistical procedures used Arlequin ver 1.1 (http://anthropologie.unige.ch/arlequin). Deviations of genotype frequencies from Hardy–Weinberg values were tested, maximum-likelihood haplotype frequencies were estimated from the observed data using an expectation–maximization (EM) algorithm and standardized linkage disequilibrium values (D′=D/Dmax) were calculated. D′ values were shown as graphic maps (http://www.well.ox.ac.uk/asthma/GOLD/docs/ldmax.html). Differences in observed allele and genotype frequencies and estimated haplotype frequencies between groups were tested using analogs of Fisher's exact test on 2 × 2 or 2 × (number of genotypes or haplotypes) contingency tables. P-values <0.05 were considered nominally significant.
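For readers unfamiliar with these quantities, the following minimal sketch shows how D′ = D/Dmax can be obtained from haplotype and allele frequencies and how a two-sided Fisher's exact test on a 2 × 2 table can be computed. It is a plain textbook implementation, not the Arlequin procedure (which uses analogs of the exact test), and the function names and example inputs are hypothetical. It requires Python 3.8 or later for `math.comb`.

```python
from math import comb


def d_prime(p_ab, p_a, p_b):
    """Standardized linkage disequilibrium D' = D / Dmax for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return 0.0 if d_max == 0 else d / d_max


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)  # tolerance for floating-point ties
    )


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(d_prime(0.5, 0.5, 0.5))
    print(fisher_exact_2x2(3, 1, 1, 3))
```

For the allele comparisons described above, the rows of the 2 × 2 table would be the two groups (for example abusers and controls) and the columns the counts of chromosomes carrying or lacking a given allele or haplotype.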

Sequence accession numbers

CB1/Cnr1 exons 1, 1a, 2, 3 and 3a are listed under GenBank accession numbers AY505113, AY505115 and AY505116, respectively, while novel SNP 3813A>G is listed as #ss 20398892 and −3180T ins/del is listed as #ss 20398893.

Results

Genomic assemblies produced a scaffold on which we can place the results of subsequent analyses (Figure 1a). Assembly of these genomic sequences reveals that the previously described but unlocalized CB1/Cnr1 (AAT)n SSLP resides 18 086 bp 3′ to the CB1/Cnr1 exon 4 translational start site.

Figure 1

Schematic diagram of CB1/Cnr1 gene structure and variants. (a) CB1/Cnr1 gene structures. Top: CB1/Cnr1 gene. Middle: CB1 mRNAs (CB1A-E; predicted sizes on right). Bottom: alignment of bacterial artificial chromosome sequences. (b) Nucleotide sequences of CB1/Cnr1 exons and splice junctions.

Northern analyses of human hippocampal poly A+ and total fetal brain RNAs using the 730 bp hybridization probe that corresponds to the main CB1/Cnr1 coding exon indicate a major mRNA of ca. 5.8 kb (Figure 2a). Much more modest hybridization corresponding to ca. 9.0 kb and ca. 4.2 kb species is also noted in polyA-selected RNA. Northern analyses using the 420 bp fragment from novel exon 3 sequences as probes reveal hybridization to only the 9.0 kb species in polyA+ RNA. Finally, Northern analyses using the 220 bp fragment from exon 1 sequence reveal the 5.8 kb hybridizing species in polyA+ RNA (Figure 2a). These results appeared to define several distinct CB1/Cnr1 isoforms with both previously described and novel 5′ untranslated region variants.

Figure 2

Northern and quantitative RT-PCR analyses of CB1/Cnr1 variants. (a) Northern analyses of CB1/Cnr1 expression in human adult hippocampal poly A+ mRNA (lanes 1, 3 and 5) and whole fetal brain total RNAs (lanes 2, 4, 6). Blots were hybridized serially with probes corresponding to: lanes 1, 2: exon 4 (main coding exon, 730 bp probe); lanes 3, 4: exon 3 (420 bp probe); lanes 5, 6: exon 1 (220 bp probe). Exposures were negative following stripping of the blot between hybridizations. Left edge: mRNA size calculations derived from coelectrophoresed mRNA size standards. The large size of CB1/Cnr1 mRNA is consistent with those of a number of G protein-coupled neurotransmitter receptor genes including the mu opioid receptor,32 dopamine D4 receptor,33 GABAB receptor 2,34 5-HTR1F35 and mGluR3,36 although the exons encoding full-length 5′ UTR have not been identified for all of these genes. (b) Gel fluorogram displaying RT-PCR products from amplification of human hippocampal poly A+ RNA with oligonucleotides specific for: lane 1: A fragment from exon 1, 220 bp; lane 2: B fragment from exon 3, 420 bp; lane 3: C fragment from exon 4, 570 bp, main coding exon; lane 4: 1 kb DNA ladder marker. The bands from top to bottom are 1 kb, 850 bp, 650 bp, 500 bp, 400 bp, 300 bp, 200 bp and 100 bp, respectively. (c) Quantitation of densities of fluorescence noted in (b).
(d) Gel fluorogram displaying RT-PCR products corresponding to CB1A (lane 1 indicates a mixture of CB1A, CB1B, CB1C and CB1D, primers: 5′-CTCCCTGCAGAGCTCTCCGTAGTC-3′ and 5′-CCACCCAGTTTGAACAGAAACACGTTGC-3′), CB1B (lane 2, primers: 5′-GCCACCCCTTCCTTCTCCAC-3′ and 5′-ACGATGCTGATTATGTGACTCC-3′), CB1C (lane 4, primers: 5′-GCCACCCCTTCCTTCTCCAC-3′ and 5′-GCGCTCCTCGCCTGGAGTGG-3′), CB1D (lane 3, primers: 5′-GCCACCCCTTCCTTCTCCAC-3′ and 5′-CCAAACGCAACTCTGCCAGGTC-3′) and CB1E (lane 5, primers: 5′-TATCTTCCGTCTGCTCTCAAGC-3′ and 5′-ACGATGCTGATTATGTGACTCC-3′) in mRNAs extracted from hippocampus, substantia nigra, cerebellum, amygdala, caudate/putamen and hypothalamus. Lane M: the same 1 kb DNA ladder marker as used in (b).

5′RACE and sequencing results reveal three novel exons that we term exons 1, 2 and 3 located 5′ to the previously described main CB1/Cnr1 coding exon (Figure 1a). We estimate the sizes of these exons as 302, 38 and 3500 bp, respectively. These exons thus also define introns I, II and III of 1452 bp, 12.2 and 2.3 kb, respectively. Initial results from 5′RACE using hippocampal mRNA and oligonucleotides corresponding to exon 3 sequences yield reproducible products that correspond to an exon 3 length of 3500 bp as well as an additional 90 bp at the 5′ end of these sequences. This additional 90 bp has no match to current genomic sequence assemblies in this region. It does display repetitive sequence, matching more than 20 other genomic sequence locations (BLAST; NCBI, data not shown).

Alternative splice donor/acceptor sites are found within exons 1 and 3. We term the products of these alternate splice donor/acceptor sites 1a and 3a. Each splicing site contains canonical GT-AG splice donor/acceptor motifs (Figure 1b). These genomic assemblies thus identify five CB1/Cnr1 mRNA variants that we term CB1A, CB1B, CB1C, CB1D and CB1E.

The first splice donor site for transcripts CB1A and CB1B lies 247 bp 3′ from the exon 1 transcriptional initiation site that we have tentatively identified (see below). CB1A is formed when the splice acceptor at the 5′ end of exon 4 is used. CB1B mRNA includes both exons 3a and 4. CB1C mRNA contains an additional exon 2 compared with CB1B. The first splice donor site for CB1D transcripts lies 303 bp from the transcriptional initiation site; they thus utilize exon 1. These mRNAs all contain the same exon 4 sequences. Finally, CB1E transcripts appear to be initiated at an alternative site at the 5′ end of exon 3. The sequences at the 5′ flank of this exon 3 lack canonical splice donor/acceptor sequences. CB1E transcripts also include exon 4 sequences.

We examined the relative abundance of these CB1/Cnr1 transcripts using quantitative RT-PCR and amplimers specific for several CB1/Cnr1 exons (Figure 2b, c). These results support the dominance of the mRNAs containing exon 4 in mRNA from hippocampus and most brain regions. Abundance of the shorter exon 1 amplimer was also greater than that of the longer exon 3 amplimer, consistent with the dominance of transcripts containing exon 1 in hippocampal RNAs. Comparisons of RT-PCR products from mRNAs extracted from several brain regions appear to indicate that mRNAs from caudate, substantia nigra and amygdala display denser relative expression of CB1E transcripts that contain exon 3. In contrast, the low level of these exon 3-containing CB1E transcripts in hippocampus is also mirrored in mRNAs from cerebellum (Figure 2d). These data support the idea that the novel CB1/Cnr1 exons 1 and 3 reported herein may play major roles in forming the CB1/Cnr1 mRNAs that are expressed differentially in several human brain regions.

As noted above, the current GenBank CB1/Cnr1 cDNA variant termed 1 (5.4 kb; gi15208646) contains the exon 4 sequences that encode the 472 amino-acid CB1/Cnr1 receptor that has been identified in numerous previous studies. We have also identified these sequences in numerous resequencing studies. Variant 1 also encodes 5′ untranslated region sequences that include 63 bp that we identify in exon 4 and 41 bp that we identify in exon 1; it thus corresponds to the transcript for the physiologically active CB1 receptor. Variant 1 has 3′ untranslated region sequences that end at a poly A+ addition site that we have used to demark the 3′ end of exon 4 in Figure 1. By contrast, variant 2 (1.2 kb; gi15208647) is reported to encode a shorter 411 amino-acid open reading frame due to frameshift caused by absence of internal segments. These segments are absent from previously reported variant 1 and in all exon 4 resequencing that we have performed.

BLAST searches using 80 kb of genomic sequence from the CB1/Cnr1 gene region that derive from the 3′ 40 kb of gi8217458 and from the 5′ 40 kb of gi13560256 identify 66 expressed sequence tags (ESTs). In total, 53 of these ESTs are found in exon 4. EST 2013134/gi3835480 sequences lie within exon 3. None is found in exon 1 (see below). However, 13 additional ESTs do localize to eight genomic sites close to CB1/Cnr1 exonic sequences. One of these ‘orphan’ ESTs, EST11838167/gi19722632, lies in the sequences that we identify here as intron 3 located 1 kb 5′ from the start of the CB1/Cnr1 exon 4. Three sites in intron 2 harbor orphan ESTs 2908795/gi5440254, 2908778/gi5440237, 1291384/gi2432268, 1141727/gi2230418, 141563/gi2230254 and 3413819/gi6300263. Two orphan ESTs lie at a single site in intron 1, 2248927/gi4294885 and 1463677/gi2786781. Finally, four orphan ESTs lie at three different sites in the 5′ flank of exon 1: 9822800/gi16177345, 16729975/gi27824175, 1165660/gi2261563 and 22199/gi314412. Several lines of evidence fail to provide support for the idea that several of these orphan ESTs provide CB1/Cnr1 exonic sequences, at least those expressed in hippocampus. First, repeated attempts to amplify cDNAs that would correspond to CB1/Cnr1 mRNAs that utilize such exons have failed. PCR reactions using oligonucleotides that have successfully amplified sequences within these potential exons have failed to amplify sequences that include these exons and sequences from either CB1/Cnr1 exons 1 or 4 (data not shown). Second, no previously reported CB1/Cnr1 cDNAs contain these EST sequences, although one does contain partial sequences of exon 1. Third, a 1.659 kb cDNA termed BC037581/gi22902375 has >96% homology with the ESTs gi16177345, gi4294885 and gi2786781. BC037581 encodes a site for poly A+ addition, but lacks an open reading frame, consistent with the idea that this sequence and the ESTs that it contains could well encode the 3′ end of an overlapping gene.

Primer extensions from internal exon 1 sequences indicate that the 5′ end of many of these products lies 105 bp 5′ to the primer annealing site. This location corresponds to a 302 bp length for exon 1 (Figure 3a). The transcriptional start site predicted from these data is located near a GC island and lacks obvious CAAT or TATAA box sequences. Sequencing results of nested 5′RACE products are also consistent with this predicted transcriptional start site (Figure 3b). 5′RACE using oligonucleotides complementary to 5′ sequences from exon 3 also produces products that terminate in areas devoid of obvious CAAT or TATAA box sequences (Figure 3c).

Figure 3

Defining CB1/Cnr1 transcriptional initiation sites by primer extension and 5′RACE. (a) Gel autoradiogram of products of primer-extension reactions from sequences in exon 1 using human hippocampal poly A+ RNA (lane 1) or Escherichia coli RNA (lane 2). M: DNA size markers; sizes calculated from standards are displayed on the right. (b) Gel fluorograms of 5′RACE products (arrows) produced by amplification from two different CB1/Cnr1 exon 1 sites. Lane 1: amplimers from 5′-GACTACGGAGAGCTCTGCAGGGAG-3′; lane 2: amplimers from 5′-CTGGTCCTCGGGACAGAAGCTCCC-3′. (c) Gel fluorograms of 5′RACE products produced by amplification from two different CB1/Cnr1 exon 3 sites. Lane 1: amplimers from 5′-CAAGGAGCAAGAGTTGACTCCTGAG-3′; lane 2: amplimers from 5′-GATCTGTTAGATGAATGTGACCAGCC-3′.

The CB1/Cnr1 exon 1 and exon 3 transcriptional initiation sites predicted from these experiments would thus lie 19.9 and 6.0 kb 5′ from the translational CB1/Cnr1 ATG start codon, respectively.
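These distances can be checked approximately against the exon and intron sizes estimated above. The short sketch below simply sums the reported segment lengths upstream of exon 4; it treats the rounded intron sizes (12.2 and 2.3 kb) as exact, which is an assumption, and the variable and function names are hypothetical.

```python
# Segment sizes (bp) as estimated in the text, in 5' to 3' order.
# Intron II and III sizes are rounded values (12.2 kb and 2.3 kb).
UPSTREAM_SEGMENTS = [
    ("exon 1", 302),
    ("intron I", 1452),
    ("exon 2", 38),
    ("intron II", 12200),
    ("exon 3", 3500),
    ("intron III", 2300),
]


def distance_to_exon4(first_segment):
    """Sum of segment lengths from the 5' end of first_segment to the 5' end of exon 4."""
    names = [name for name, _ in UPSTREAM_SEGMENTS]
    start = names.index(first_segment)
    return sum(size for _, size in UPSTREAM_SEGMENTS[start:])


if __name__ == "__main__":
    for segment in ("exon 1", "exon 3"):
        print(segment, distance_to_exon4(segment), "bp to the 5' end of exon 4")
```

The sums, about 19.8 kb from the start of exon 1 and 5.8 kb from the start of exon 3, reach only the 5′ end of exon 4. The remaining difference from the 19.9 and 6.0 kb figures presumably reflects untranslated sequence in exon 4 that precedes the ATG, together with rounding of the intron sizes; that reading is an assumption rather than a reported result.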

Luciferase reporter gene analyses of the activities of sequences that extend 1.0, 2.0 and 3.0 kb 5′ from the exon 1 transcriptional initiation site reveal significant promoter/enhancer-like activities (Figure 4). In SK-N-SH and CHO-K1 cells, 1.0 kb 5′ flanking sequences of exon 1 display robust activities that are almost 2/3 as high as those of the highly active positive control SV40 promoter sequences. Further, when the 1 kb promoter region fragment is expressed in the NG108-15 and N1E-115 cells, which express endogenous CB1/Cnr1,24, 25, 26, 37 it confers luciferase expression levels even higher than those of SV40 promoter controls. The 2.0 kb sequences display activities that are lower than those of the 1.0 and 3.0 kb fragments in each of these three cell lines. Conceivably, negative regulatory sequences may reside ca. 2 kb 5′ to the exon 1 transcriptional initiation site.

Figure 4

Relative luciferase activities conferred by CB1/Cnr1 5′ flanking/promoter region sequences. Data derive from cells expressing negative control pGL-blank, pGL-SV40, pCB1.0, pCB2.0, pCB3.0, pCB1.1 and pCB2.5 reporter plasmids and transfection efficiency control plasmid pRL-TK.

These activities of exon 1 flanking sequences contrast with the much lower activities of both 1.1 and 2.5 kb sequences from the 5′ flanks of exon 3 (Figure 4). This contrast of results is consistent with the higher abundance of CB1A, -B, -C and -D mRNA transcripts that contain exon 1 sequences and the relatively low abundance of the CB1E transcripts that appear to use an exon 3 transcriptional initiation site in mRNAs extracted from several brain regions.

Since significant positive elements of the CB1/Cnr1 core promoter region may thus lie within 1.0 kb of the 5′ flank of exon 1, we performed searches of the transcription factor binding site motifs in sequences from this region. These searches identified AP2, AP4, USF, HSF and GATA sites that are each good candidates to serve as CB1/Cnr1 regulatory promoter/enhancer elements (data not shown).

We searched for mutations in CB1/Cnr1 exons by assembling ESTs and sequence variants over this region and by resequencing exons 3 and 4 from 16 chromosomes. Resequencing identified a novel exon 3 T ins/del variant lying −3180 bp 5′ to the exon 4 translation initiation site. These studies also confirmed the presence of the previously described exon 4 A>G SNP that lies in the codon for Thr 453, 1359 bp from the translational start site. We also identified additional exon 4 G>A and A>G polymorphisms that lie in sequences that encode 3′ untranslated regions 3813 and 4894 bp from the translational start site, respectively. The former remains a novel SNP and the latter SNP was subsequently described as rs806368 (hCV8943804). These studies failed to identify previously described CB1/Cnr1 missense variations that produce Phe200Leu, Ile216Val and Val246Ala (GenBank http://www.ncbi.nlm.nih.gov/, Accession: U73304 and AF107262). To increase the sensitivity of this approach, we sought each of these Phe200Leu, Ile216Val and Val246Ala variations by genotyping 800 chromosomes from a mixed European- and African-American sample. These studies again failed to identify any chromosomes that displayed these variants (data not shown).

The distributions of alleles at these polymorphic sites reveal no significant deviations from the Hardy–Weinberg equilibria. However, there are ethnic differences in allele frequencies for the 3′ (AAT)n variant and several tested SNPs (Table 4).
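As an illustration of what a Hardy–Weinberg check involves for a single biallelic SNP, a minimal sketch follows. It uses a simple one-degree-of-freedom chi-square goodness-of-fit test rather than the exact procedure implemented in Arlequin, so it is a substitute technique shown for clarity only; the function name and genotype counts are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for one biallelic marker.

    n_aa, n_ab, n_bb: observed counts of the three genotypes.
    Returns (chi-square statistic, P-value with 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts, for illustration only.
    print(hwe_chi_square(25, 50, 25))  # genotype counts in exact HW proportions
    print(hwe_chi_square(50, 0, 50))   # complete absence of heterozygotes
```

A P-value above the chosen threshold, as in the first example, would correspond to no significant deviation from Hardy–Weinberg expectations.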

Table 4 CB1/Cnr1 polymorphisms: allelic frequencies in control individuals of self-reported European-American, African-American and Japanese ethnicities

Analyses of linkage disequilibria reveal several blocks of restricted haplotype diversity that are most marked in regions extending from the 5′ flank of this gene to portions of intron 2 (Figure 5, lower left aspects of the diagonal). These patterns differ slightly between African- and European-American samples, especially in an area in the middle of intron 2.

Figure 5

GOLD plots of linkage disequilibrium maps at the CB1/Cnr1 locus in two populations. (a) European-American sample (upper and left). (b) African-American sample (lower and right). Markers: 1. hCV9046296, 2. hCV9046297, 3. hCV9662507, 4. hCV15841551, 5. hCV16178942, 6. hCV11418432, 7. hCV11418433, 8. hCV11418434, 9. hCV27392043, 10. hCV9863392, 11. hCV1652582, 12. hCV1652583, 13. hCV1652584, 14. hCV8943758, 15. hCV11600616, 16. hCV8943766, 17. −3180 T I/D, 18. hCV1652590, 19. 3813G>A, 20. hCV8943804, 21. 18 kb (AAT)n.

Since variation in responses to marijuana and other drugs could produce differences in the heritable components of addiction vulnerability, we compared allele and haplotype frequency differences between polysubstance abusers and control individuals from European- and from African-American ancestries (Figure 6). We initially assessed allele frequencies at the 3′ (AAT)n, the previously studied exon 4 SNP 1359G>A and additional exon 4 SNPs 4894A>G and 3813G>A in European- and African-American polysubstance users and matched control individuals based on previous suggestions of linkage or association of 3′ flanking and/or exon 4 markers with several human phenotypes (Table 5, Figure 6). We identified no significant associations using these markers individually. An exon 4 haplotype, 1359A/3813A/4894G, displayed significant allelic frequency differences between African-American abusers and controls (P<0.0001), but showed no significant differences in European-American samples.

Figure 6

Schematic of CB1/Cnr1 genomic region, markers and regional association with polysubstance abuse vulnerability. Top: Schematic representation of CB1/Cnr1 genomic region with positions of markers indicated. Boxes are exons, darkest region represents the protein coding sequences of exon 4. The locations for markers are indicated as distances from translation start site in this figure. The distances from transcript start site for markers are x +20791 (x=position at SNPs). Bottom: Subtracted values of the differences in allelic frequencies between polysubstance abuser and control individuals in European-American (closed symbols) and African-American (open symbols) research volunteers corresponding to the markers noted in the top figure. These data document the center of the association peak and the area in which the phase of association is the same across the sample populations tested.

Table 5 CB1/Cnr1 allele and haplotype frequencies in European- and African-American polysubstance abusers and controls

Examination of variants in 5′ flanking regions 5′ to exon 1 also produced no consistent associations, although studies with several markers in the CB1/Cnr1 5′ flanking region produced significant results in single populations. A haplotype rs754387 (hCV9662507)=C and rs2180619 (hCV15841551)=C in the 5′ flanking region of the CB1/Cnr1 gene displayed nominally significant differences between European-American abusers and controls (P=0.01). However, comparisons in African-American samples yielded no significant differences (Figure 6).

These results contrasted with striking findings made using SNPs and haplotypes that lie within the 5′ CB1/Cnr1 introns and exons. SNPs rs2273512 (hCV16178942), hCV11418432, rs6454674 (hCV11418433), hCV11418434, hCV9863392, rs806381 (hCV1652582), rs806380 (hCV1652583), rs806379 (hCV1652584), rs1535255 (hCV8943758), rs2023239 (hCV11600616), rs806378 (hCV8943766) and the insertion/deletion variant −3180 T Ins/Del that lie in the long intron 2/exon 3 region of the 5′ aspect of CB1/Cnr1 provide interesting variants in these regions. Frequencies of the rs2023239 (hCV11600616) variant at the 3′ end of intron 2 show significant differences between abusers and controls sampled from both European- and African-American populations (P=0.03). The flanking SNPs rs806379 (hCV1652584), rs1535255 (hCV8943758) and −3180T Ins/Del each displayed nominally significant differences in one of the two populations sampled. We thus studied the haplotype: rs806379 (hCV1652584)=T, rs1535255 (hCV8943758)=A and rs2023239 (hCV11600616)=G, which we term haplotype TAG. The TAG haplotype displayed strong association with polysubstance abuse in both European-American (P=0.00003) and African-American (P=0.007) samples (Table 5, Figure 7).

Figure 7

TAG haplotype frequency difference between polysubstance abusers and controls of European-American (169 vs 322), African-American (85 vs 212) and Japanese (285 vs 463) self-reported ethnicities. Scales differ due to differences in haplotype frequencies. P-values derive from comparing haplotype frequencies between polysubstance abusers and controls. With our sample sizes, power was 0.99 for European-American, 0.70 for African-American and 0.4 for Japanese samples under dominant models (0.2, 0.8 and 0.2 for recessive models). Power would be significantly reduced for subgroup analyses.

We sought replication of association between this haplotype and addiction in a separate sample of substance-dependent individuals. We identified this TAG haplotype in 3.1% of a sample of Japanese alcoholics vs only 0.2% of Japanese controls. This difference was again highly significant (P=0.000008, Figure 7).
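One way to test a haplotype frequency difference of this kind is an exact test on a 2×2 table of haplotype counts (TAG vs other haplotypes, cases vs controls). The sketch below is illustrative only: it is not necessarily the test used in this study, the function name `fisher_exact_two_sided` is hypothetical, and the counts in the usage lines are invented placeholders rather than data from this report.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. cases, controls); columns are haplotype classes
    (e.g. TAG, all other haplotypes). All four entries are counts.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x first-column counts in row 1,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the
    # observed one (small tolerance guards against float rounding).
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Hypothetical chromosome counts (NOT the study's data):
# rows = cases, controls; columns = TAG, other haplotypes.
p_value = fisher_exact_two_sided(18, 552, 2, 924)
print(f"two-sided P = {p_value:.2g}")
```

An exact test is used in this sketch because a rare haplotype gives small expected cell counts, where chi-square approximations can be unreliable.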

To seek allele-specific differences in expression of mRNAs corresponding to the TAG haplotype, we compared levels of expression of CB1/Cnr1 exon 3 transcripts containing this haplotype in studies of 39 mRNAs prepared from cerebral cortices and midbrains. Seven of the nine individuals heterozygous for the TAG haplotype displayed striking differences between the expression of the mRNAs encoded by the TAG and the alternate haplotypes. Only low if any levels of the mRNA encoded by the TAG allele could be identified in any of these seven samples. By contrast, we could not identify significant differences in expression from the mRNAs transcribed from the two alleles in mRNA from brains of 12 individuals heterozygous for other CB1/Cnr1 haplotypes (Figure 8).
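The comparison described above amounts to computing, for each heterozygous brain sample, the ratio of the signal from one allele's mRNA to the signal from the other. A minimal sketch follows; the function names, the pseudocount and the twofold threshold are assumptions made for illustration, not parameters taken from this study.

```python
from math import log2


def allelic_ratio(allele_a_signal, allele_b_signal, pseudocount=1.0):
    """Ratio of expression signals for the two alleles of one heterozygote.

    The pseudocount (an assumption of this sketch) keeps the ratio finite
    when one allele's transcript is undetectable.
    """
    if allele_a_signal < 0 or allele_b_signal < 0 or pseudocount < 0:
        raise ValueError("signals and pseudocount must be non-negative")
    denominator = allele_b_signal + pseudocount
    if denominator == 0:
        raise ValueError("denominator is zero; use a positive pseudocount")
    return (allele_a_signal + pseudocount) / denominator


def is_imbalanced(ratio, fold_threshold=2.0):
    """True when the two alleles differ by at least fold_threshold-fold.

    The twofold default is a hypothetical cutoff, not one from the paper.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return abs(log2(ratio)) >= log2(fold_threshold)


# Hypothetical signals for one sample: allele A barely detected.
ratio = allelic_ratio(3.0, 120.0)
print(round(ratio, 3), is_imbalanced(ratio))
```

Working on the log scale treats a fourfold excess and a fourfold deficit symmetrically, which suits ratios that can deviate from 1 in either direction.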

Figure 8

Ratio between the expression of mRNAs corresponding to each of the two alternative CB1/Cnr1 exon 3 haplotypes in mRNAs extracted from brains of individuals heterozygous for the TAG (left) and other haplotypes (right). Each gray dot represents the ratio of mRNA expression from the two alleles in one heterozygous human brain.

Discussion

The CB1/Cnr1 gene, which encodes the G-protein-coupled cannabinoid receptor that provides the major brain site for marijuana actions, is thus more complex than previously described. The updated picture of CB1/Cnr1 presented in this study describes a gene that displays at least four exons that are expressed in hippocampus and other brain regions, spans 25 kb and produces several transcripts. The gene has novel SNPs, a T ins/del in its exon 3 that encodes 5′ untranslated region (UTR) sequences and a 3813G>A polymorphism in exon 4 sequences that encode 3′ UTR sequences. Characterization of this locus now locates the previously described (AAT)n polymorphism more than 12 kb from the main CB1/Cnr1 amino-acid coding exon 4. These data, and the association data reported here, provide an improved basis to understand the nature of the human CB1/Cnr1 locus and possible roles for its functional variants in human addictions.

This picture of CB1/Cnr1 appears more consistent with other brain gene precedents than the previous sketches of this gene. Many genes expressed in brain display long 5′ untranslated mRNA regions, leading to generally longer length for brain transcripts than those expressed predominantly in other tissues.38 On this background, current Aceview CB1/Cnr1 depictions with short 5′ UTR sequences would be unusual. Northern results also document larger CB1/Cnr1 mRNA sizes than can be accounted for in the CB1/Cnr1 genomic structure currently described in Aceview. Both of these concerns are addressed by the results of current primer extension and 5′RACE experiments that document the sequences of more 5′ CB1/Cnr1 exons. These additional exons appear to provide a better match for the sizes of the mRNAs identified in Northern blots. While we do not know their functional significance, these mRNA variants could regulate mRNA stability, subcellular localization, translational efficacies or other features.39, 40, 41 Each of the principal mRNA variants displays the same exon 4 that encodes the CB1/Cnr1 protein and some of its mRNA's 3′ untranslated sequences. This exon has been extensively studied and found to display few protein-coding variants.42

There are limitations to the present data concerning the CB1/Cnr1 locus. While this picture of the gene accounts for many ESTs that map well to the CB1/Cnr1 genomic exons that we describe here, we also note 13 EST sequences that map to the ‘introns’ that fall between exons 1 and 4 or to the CB1/Cnr1 gene's 5′ flank. Repeated attempts to amplify cDNAs that would correspond to CB1/Cnr1 mRNAs that utilize such exons have failed (unpublished observations). Conceivably, exons of another gene or pseudogene might interdigitate with CB1/Cnr1 genomic exons in this region. Recent observations that a BC037581 cDNA clone contains what appear to be 3′ sequences of another gene as well as at least three ESTs are consistent with this idea. However, the current observations have focused on mRNAs from hippocampus and other brain regions. It is also conceivable that CB1/Cnr1 transcripts from other tissues could contain sequences that correspond to the additional genomic sites that encode the additional ESTs, all of which derive from non-brain tissues. The four exons that we currently identify may thus represent the CB1/Cnr1 genomic structure expressed in brain regions that include hippocampus, cerebellum, amygdala, caudate putamen and substantia nigra, but other exons might be expressed in other tissues.

The studies here provide some of the first opportunities to begin to elucidate candidate promoter regions for the CB1/Cnr1 gene. Our current picture of the CB1/Cnr1 gene provides dual sites for transcriptional initiation and two regions for possible promoter/enhancer activity. Sequences in as little as 1.0 kb of 5′ flanking region can mediate reporter gene expression in cells that normally express CB1/Cnr1, as well as in cells that do not. Variants in the potentially cis-acting sequences in these regions include variants in USF and ADR1 transcription factor binding sites. Assessments of the levels of expression of the different CB1/Cnr1 transcripts clearly suggest high activity of the promoter/enhancer sequences that mediate transcriptional initiation from exon 1. They also support much lower levels of activity driven from sequences in the 5′ flank of exon 3 in some of the brain regions with the highest overall levels of CB1/Cnr1 expression, the hippocampus and cerebellum, and more moderate levels of such activity in brain regions with moderate levels of CB1/Cnr1 expression. The patterns of activity in these reporter gene constructs encourage us to think that these sequences are likely to play important roles in the regulation of levels of CB1/Cnr1 expression in vivo but do not exclude the possibility that more 5′ or even intronic regions could also contain additional regulatory elements. Observations that 2 kb segments of exon 1 flanking sequence display promoter activities lower than those of the 1 and 3 kb fragments in three cell lines support ideas that the −1 to −3 kb region of the 5′ flank of exon 1 might contain negative regulatory sites. It is important to note that the combined 5′RACE and primer extension data reported here provide good support for the exon 1 transcriptional initiation sites proposed. However, we do not have complete confidence in our identification of all CB1/Cnr1 transcriptional start sites, especially for the start of transcription of the CB1E transcript that begins with exon 3. Misassembly of genomic sequences and consequent failure to provide a genomic fit with the 90 bp additional sequence from 5′RACE experiments provides one plausible explanation for these uncertainties.

Common human CB1/Cnr1 polymorphisms can be observed in European-American, African-American and Japanese individuals. We can readily identify several previously reported polymorphisms, although we fail to find variation at several other previously reported sites. Some reported variants may thus be rare in the general population and consequently unlikely to contribute substantially to the risks of developing common addictive disorders in the general population. These low frequencies of missense CB1/Cnr1 mutations thus focus our attention on variants in genomic sequences that might confer CB1/Cnr1 regulation. The rich pattern of CB1/Cnr1 transcriptional start and splice variants identified here also focuses attention on the possibility that functionally important variation at this locus could also derive from differential initiation or splicing of CB1/Cnr1 transcripts.

Most prior CB1/Cnr1 association studies use markers in the main coding exon 4 and 3′ flanking region. The restricted extent of linkage disequilibrium extending from variants found in this region clearly indicates that such markers might not display optimal sensitivity in excluding disease association with variants that lie in other exons. Assessments of the allelic frequencies of several of the polymorphisms studied here and of the patterns of linkage disequilibrium identified in several distinct populations also provide additional cautions for use of these CB1/Cnr1 locus markers. Stratification of the allelic frequencies of the CB1/Cnr1 3′ flanking sequence (AAT)n marker (Table 1) may be evident in comparing frequencies of its alleles in the different ethnic samples studied in the current report, the schizophrenic patients and controls studied by Ujike et al.13 and the European- and African-American samples studied by Covault et al.43 Linkage disequilibria may also differ from population to population, as revealed in the LD analyses of data from our European-American and African-American samples.
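For readers less familiar with linkage disequilibrium measures, pairwise LD between two biallelic markers is commonly summarized by D, D′ and r², computed from the frequency of one two-marker haplotype and the two allele frequencies. The sketch below uses these standard definitions; the function name `ld_stats` and the example frequencies are hypothetical and are not values from this study.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")
    # D is the deviation of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    # D' scales D by the largest magnitude it could take given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between alleles at the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies for illustration only.
d, d_prime, r2 = ld_stats(p_ab=0.30, p_a=0.40, p_b=0.50)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

Because D′ and r² depend on allele frequencies, the same pair of markers can show different LD values in populations whose allele frequencies differ.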

Our findings that haplotypes toward the 5′ end of the CB1/Cnr1 gene's exons and introns distinguish European-American, African-American and Japanese substance abusers from control individuals provide a novel perspective on the areas of this gene that contain variants likely to alter its function. Interestingly, the most-strongly associated haplotypes extend only modest distances from the most-strongly abuse-associated marker, rs2023239 (hCV11600616). While the functional implications of this SNP and the other variants in this abuse-associated haplotype remain unknown, much of the data in this report provides a substrate for evaluating the effects that such haplotypes could have on CB1/Cnr1 expression and regulation. Allelic differences in CB1/Cnr1 expression and regulation are good candidates to play roles in producing differences in addiction vulnerabilities. Such associations are strengthened by our observations that the addiction-associated TAG haplotype is associated with significantly reduced expression of CB1/Cnr1 exon 3 RNA in human brain. Haplotype-dependent differences in levels of expression of a CB1/Cnr1 mRNA variant that displays its most prominent expression in brain areas more linked to reward and dopaminergic function, such as the ventral striatum, fit with ideas that CB1/Cnr1 expressed in these areas might play central roles in the rewards conferred by marijuana and addictive substances of other chemical classes.5, 8, 9, 40

Variation in CB1/Cnr1 expression in brain circuits important for aspects besides drug reward, including circuits contributing to ‘drug memories’ important for establishing addictions,3, 4, 44 could also contribute to addiction vulnerability. While CB1/Cnr1 agonists found in marijuana are often cited as ‘gateways’ to ultimate use of multiple classes of addictive substances,45, 46 effects of CB1/Cnr1 variation in reward and mnemonic circuits could also directly influence reward exerted by multiple substances. Psychostimulant actions can be changed by CB1/Cnr1 agonists and antagonists,47, 48, 49 for example. We have recently identified psychostimulant regulation of CB1/Cnr1 expression in striatum and midbrain in initial results (HI and GRU, in preparation). While it would be quite interesting to further evaluate the relationship between CB1/Cnr1 variation and substance abuse subphenotypes, including the substances preferred by each individual, the current samples provide only limited power for such analyses (see Figure 7 legend).

Despite the emphasis on variation in the present report, it is also clear that the CB1/Cnr1 gene displays substantial functional conservation. There is a relative paucity of common missense variants in the CB1/Cnr1 protein. Such conservation is consistent with the idea that CB1/Cnr1 is an important brain component that has evolutionarily important roles in neurotransmission/neuromodulation mediated via endogenous CB1/Cnr1 ligands and incidental roles as the site for marijuana actions. This biological and pharmacological importance makes careful study of the CB1/Cnr1 gene of continuing interest.