Main

Bile acids (BAs) are steroidal natural products that are secreted into the gastrointestinal tract postprandially, where they act as detergents that aid in digestion and as ligands for host receptors2,3. In the gut, host-derived primary BAs are metabolized by resident microbes to form a large group of compounds called secondary BAs3. Both primary and secondary BAs regulate host metabolism2,4 and immune responses5,6,7,8.

BAs modulate the differentiation and function of T cells, including inflammatory TH17 cells and anti-inflammatory regulatory T (Treg) cells, which help to protect against extracellular pathogens and maintain host immune tolerance, respectively9,10,11,12,13. Specifically, secondary BAs such as isoallolithocholic acid (isoalloLCA) and isodeoxycholic acid (isoDCA) modulate the differentiation of Treg cells1,14,15. Furthermore, 3-oxoLCA inhibits TH17 cell differentiation by blocking the function of the nuclear hormone receptor retinoic-acid-receptor-related orphan nuclear receptor-γt (RORγt)1,16,17. 3-OxoLCA is absent from the caeca of germ-free (GF) C57BL/6 mice1, indicating that gut bacteria may synthesize 3-oxoLCA. However, it is unknown which commensal bacteria and bacterial enzymes produce 3-oxoLCA (Fig. 1a) and whether this compound or additional secondary BAs that modulate TH17 cell responses are implicated in the pathogenesis of inflammatory bowel disease (IBD).

Fig. 1: Human gut bacteria produce 3-oxoLCA, a TH17-cell-modulating BA metabolite.

a, Bacterial conversion of host-produced BAs. C. scindens, Clostridium scindens. R′, taurine or glycine. b, Representative ultra-performance liquid chromatography coupled with mass spectrometry (UPLC–MS) traces (left) and the percentage production of 3-oxoLCA (right) by human bacterial isolates. Total ion chromatograms are shown. An unknown peak of m/z 375.2 (#; retention time 5.7 min) was later identified as isoLCA (Extended Data Fig. 2c). n = 3 biological replicates per group. Data are mean ± s.e.m. The full results are provided in Supplementary Table 2. c, d, Supernatants from E. lenta DSM2243 cultured with LCA inhibited TH17 cell differentiation in vitro. Representative fluorescence-activated cell sorting (FACS) plots (c) and population frequencies (d) of mouse T cells activated and expanded in vitro are shown. Naive CD4+ T cells from wild-type C57BL/6J mice were cultured under TH17 cell polarizing conditions for 3 days and bacterial supernatants were added 18 h after T cell receptor (TCR) activation. n = 3 biologically independent samples per group. Data are mean ± s.e.m. Statistical analysis was performed using one-way analysis of variance (ANOVA) followed by Tukey’s multiple-comparison test.

Here we used a screen of human isolates to identify gut bacteria that produce 3-oxoLCA as well as an abundant gut metabolite, isolithocholic acid (isoLCA), which we demonstrate inhibits TH17 cell differentiation. Multi-omics analyses of two IBD registries revealed that 3-oxoLCA and isoLCA as well as the bacterial genes responsible for their production are negatively associated with IBD and TH17-cell-related host gene expression. Together, our data suggest that bacterial production of 3-oxoLCA and isoLCA may contribute to gut immune homeostasis in humans.

Screen for 3-oxoLCA-producing bacteria

We screened strains isolated from human stool samples for their ability to convert LCA into 3-oxoLCA. LCA is present in high concentrations in human caecal contents (mean, ~160 μM)18 and we reasoned that gut bacteria might be able to oxidize the C3-hydroxyl group of LCA to produce 3-oxoLCA (Fig. 1a). After testing stool from 15 individuals, we used the two samples that contained the highest 3-oxoLCA levels in the screen (Extended Data Fig. 1a (P3 and P27)). We established a library of 990 culturable isolates comprising a diverse array of bacteria, representing members from all of the major gut phyla (Methods and Extended Data Fig. 1b–d).

A total of 238 bacterial isolates converted LCA into 3-oxoLCA after 48 h. These producers belonged to 12 genera (Supplementary Table 2). Among these, the top producers included Gordonibacter pamelaeae P7-E3, Eggerthella lenta P7-G7, Raoultibacter massiliensis P7-A2, Collinsella intestinalis P8-C1, Adlercreutzia equolifaciens P11-C8 and Clostridium citroniae P2-B6 (Fig. 1b). Consistent with our findings, an early study showed that isolates of Eubacterium lentum (which was later reclassified as Eggerthella lenta) could produce small amounts of 3-oxoLCA in anaerobic resting cell culture19. The type strains of a subset of these organisms produced a comparable amount of 3-oxoLCA in vitro (Extended Data Fig. 1e). Taken together, these data indicate that human gut bacteria from an array of families within the Actinobacteria and Firmicutes phyla produce 3-oxoLCA. Supernatant from E. lenta cultures incubated with LCA significantly inhibited the differentiation of naive CD4+ T cells isolated from wild-type C57BL/6J mice into TH17 cells (Fig. 1c, d) without altering Treg differentiation (Extended Data Fig. 2a, b). These data suggest that human gut bacteria that produce 3-oxoLCA can suppress TH17 cell differentiation in vitro.

IsoLCA inhibits TH17 cells

We observed that G. pamelaeae, E. lenta, C. citroniae and a Ruminococcus gnavus isolate (P4-G2) produced a new peak (retention time, 5.7 min; Fig. 1b) in addition to 3-oxoLCA. As this compound had an identical mass-to-charge ratio (m/z) to LCA, and E. lenta and R. gnavus type strains convert DCA into isoDCA, the 3β-OH isomer of DCA20, we reasoned that this unknown metabolite was the LCA isomer isoLCA. Spike-in of pure isoLCA into bacterial cultures confirmed that the unknown compound was isoLCA (Extended Data Fig. 2c). After LCA (mean, ~160 μM) and DCA (mean ~200 μM), isoLCA is the most abundant BA in the caecal contents of healthy individuals (mean, ~50 μM)18. Although isoLCA is largely enterohepatically reabsorbed18, micromolar concentrations are still found in human faeces (mean, 54 μM; Extended Data Fig. 1a). IsoLCA is undetectable in the caecal contents of GF C57BL/6 mice1. Together, these data indicate that members of the microbiome produce the abundant gut metabolite isoLCA (Supplementary Table 2).

We next investigated whether isoLCA also affects TH cell differentiation. IsoLCA inhibited the differentiation of naive CD4+ T cells into TH17 cells as efficiently as 3-oxoLCA, whereas another abundant 3β-OH BA, isoDCA, did not inhibit differentiation (Fig. 2a, b). Although isoLCA caused a dose-dependent reduction in TH17 cell differentiation with no significant effect on cell viability or total cell number (Extended Data Fig. 3a–c), it had no effect on TH1 and Treg cell differentiation (Extended Data Fig. 3d–g). These data suggest that, similar to 3-oxoLCA, isoLCA may also function as a specific inhibitor of TH17 cell differentiation.

Fig. 2: The abundant gut bacterial metabolite isoLCA inhibits TH17 cell differentiation.

a, b, IsoLCA inhibited the differentiation of mouse TH17 cells in vitro. Representative FACS plots (a) and population frequencies (b) of TH17 cells that were cultured in the presence of various BAs (20 μM) and ML209 (2 μM). n = 3 biologically independent samples per condition. Data are mean ± s.e.m. Statistical analysis was performed using one-way ANOVA followed by Dunnett’s multiple-comparison test, with the vehicle set as the control. c–f, IsoLCA inhibited the differentiation of TH17 cells in vivo. The experimental schematic (c) and representative FACS plots (d), as well as population frequencies of TH17 cells (e) and Treg cells (f), in the ileal lamina propria of SFB-colonized mice are shown. C57BL/6J mice were gavaged with SFB-rich faecal pellets and maintained on control or isoLCA-containing powder chow (0.3%, w/w) for one week. n = 8 mice per group, pooled from two experiments. Data are mean ± s.e.m. Statistical analysis was performed using two-tailed unpaired t-tests. g, Three-dimensional structures of 3-oxoLCA, isoLCA and LCA showing the facial orientation of the C3 oxygenation.

Consistent with our in vitro observations, isoLCA treatment (0.3% (w/w) in chow) of C57BL/6J mice gavaged with segmented filamentous bacteria (SFB)-rich faecal material21 resulted in a significant reduction in TH17 cell differentiation without affecting the Treg cell population (Fig. 2c–f, Extended Data Fig. 3h). At steady state, isoLCA treatment reduced the levels of pre-existing TH17 cells in the ileal lamina propria of SFB-colonized C57BL/6N (B6Tac) mice compared with chow-fed mice (Extended Data Fig. 3i–k). IsoLCA treatment also significantly lowered the TH17 cell population frequency without affecting the Treg population in the ileal lamina propria of mice that were treated with anti-CD3 (ref. 22) (Extended Data Fig. 3l–o). These data indicate that isoLCA treatment suppresses TH17 cell differentiation in mice at a steady state and under inflammatory conditions.

On the basis of the structural similarities between isoLCA and 3-oxoLCA, which both possess C3 oxygenation oriented towards the β-face of the steroidal A ring, we hypothesized that isoLCA may also target RORγt (Fig. 2g). Like 3-oxoLCA, isoLCA treatment reduced RORγt reporter activity in HEK293 cells, suggesting that isoLCA inhibits the transcriptional activity of RORγt (Extended Data Fig. 3p). IsoLCA bound directly to the RORγt ligand-binding domain with equilibrium dissociation constants of 7.3 μM and 24 μM based on differential scanning fluorimetry (DSF) and surface plasmon resonance (SPR) measurements, respectively (Extended Data Fig. 3q–v). These values are within the range of physiologically relevant concentrations of isoLCA in human caecal contents18. By contrast, the structurally similar compound isoDCA did not exhibit reliable binding to the RORγt protein.
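To make the comparison between the measured dissociation constants and gut concentrations concrete, the following minimal sketch computes the equilibrium fractional occupancy θ = [L] / ([L] + Kd). It assumes simple 1:1 binding and treats the reported mean caecal isoLCA concentration (~50 μM) as the free ligand concentration; both are simplifying assumptions made here for illustration and are not claims from the study. The function and variable names are hypothetical.

```python
def fractional_occupancy(ligand_um: float, kd_um: float) -> float:
    """Fraction of receptor bound at equilibrium for simple 1:1 binding."""
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand_um must be >= 0 and kd_um must be > 0")
    return ligand_um / (ligand_um + kd_um)


# Kd values reported in the text (in micromolar).
KD_UM = {"DSF": 7.3, "SPR": 24.0}

# Approximate mean isoLCA concentration in human caecal contents (micromolar),
# used here as a stand-in for free ligand concentration (an assumption).
CAECAL_ISOLCA_UM = 50.0

occupancy = {
    method: fractional_occupancy(CAECAL_ISOLCA_UM, kd)
    for method, kd in KD_UM.items()
}

for method, theta in occupancy.items():
    print(f"{method}: Kd = {KD_UM[method]} uM, occupancy = {theta:.2f}")
```

Under these assumptions the occupancy is roughly 0.87 with the DSF value and 0.68 with the SPR value. The concentration that actually reaches RORγt inside T cells is not known from these data, so the numbers only illustrate why a Kd in the low-micromolar range is compatible with the luminal concentrations cited.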

We next performed RNA-seq analyses of CD4 T cells isolated from wild-type (WT) or RORγt-deficient (KO) mice. We identified 291 genes that were differentially regulated by isoLCA or 3-oxoLCA treatment. Consistent with the interactions of these compounds with RORγt, a subset of isoLCA- or 3-oxoLCA-affected genes (46 genes) was also similarly regulated by RORγt (Extended Data Fig. 3w). Gene Ontology enrichment analysis of these 46 genes revealed that isoLCA and 3-oxoLCA treatment altered the expression of genes involved in IL-17-mediated signalling and cytokine production pathways (Extended Data Fig. 3x). These analyses indicate that isoLCA, similar to 3-oxoLCA, affects the TH17 cell program by directly binding to RORγt protein and suppressing its transcriptional activity, leading to changes in multiple immune-related processes.

HSDHs produce 3-oxoLCA and isoLCA

We previously showed that gut bacteria convert DCA into 3-oxoDCA using a 3α-hydroxysteroid dehydrogenase (3α-HSDH), and 3-oxoDCA into isoDCA using a 3β-hydroxysteroid dehydrogenase (3β-HSDH)20. We reasoned that an analogous biosynthetic pathway was responsible for the conversion of LCA into 3-oxoLCA and then isoLCA (Fig. 3a). We therefore incubated the 990 isolates used in the first screen with 3-oxoLCA (100 μM) as the substrate. A total of 266 isolates converted 3-oxoLCA to isoLCA, and 54 isolates demonstrated more than 50% conversion. Overall, the producers belonged to 15 bacterial genera (Supplementary Table 2). Several strains—including Lactobacillus rogosae P2-F2, Lachnospira pectinoschiza P2-A2 and Catenibacterium mitsuokai P1-A4—exhibited more than 80% conversion of 3-oxoLCA to isoLCA (Fig. 3b). The type strains of a subset of these isolates produced a comparable amount of isoLCA in vitro (Extended Data Fig. 2d). These data also suggest that collaborative metabolism by several bacterial species could contribute to the production of isoLCA.
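The screening thresholds described above (more than 50% and more than 80% conversion) can be sketched as a small tally. This is an illustration only: it assumes that percentage conversion is the product concentration divided by the 100 μM of substrate supplied, which may differ from how the UPLC–MS data were actually quantified, and the isolate names and concentrations are made-up placeholders.

```python
SUBSTRATE_UM = 100.0  # 3-oxoLCA supplied to each culture, from the text


def percent_conversion(product_um: float, substrate_um: float = SUBSTRATE_UM) -> float:
    """Product formed as a percentage of the substrate supplied (assumed definition)."""
    if substrate_um <= 0:
        raise ValueError("substrate_um must be positive")
    return 100.0 * product_um / substrate_um


def tally_producers(product_by_isolate: dict[str, float],
                    thresholds: tuple[float, ...] = (0.0, 50.0, 80.0)) -> dict[float, int]:
    """Count isolates whose conversion exceeds each threshold."""
    conversions = [percent_conversion(p) for p in product_by_isolate.values()]
    return {t: sum(1 for c in conversions if c > t) for t in thresholds}


# Hypothetical isoLCA concentrations (micromolar) after 48 h; not real data.
example_products = {
    "isolate_A": 92.0,
    "isolate_B": 61.0,
    "isolate_C": 12.0,
    "isolate_D": 0.0,
}

counts = tally_producers(example_products)
print(counts)
```

Applied to a full library, the count above the 0% threshold would correspond to the number of producers and the higher thresholds to the strong producers.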

Fig. 3: Bacterial HSDHs convert LCA to 3-oxoLCA and isoLCA.

a, Proposed biosynthetic pathway for the conversion of LCA to 3-oxoLCA and isoLCA. b, Representative UPLC–MS traces (left) and the percentage of production of isoLCA (right) by human bacterial isolates incubated with 3-oxoLCA (100 μM) for 48 h. Total ion chromatograms are shown. Clostridium innocuum (bold font), which encodes 3β-HSDH homologues in HMP2 metagenomes (Methods, Supplementary Table 9), was selected for in vitro testing (Supplementary Table 2). n = 3 biological replicates per group. Data are mean ± s.e.m. c, d, Heterologous expression of candidate HSDHs from E. lenta DSM2243, B. fragilis NCTC9343 and R. gnavus ATCC29149 in E. coli. E. coli lysate was incubated with either 100 μM LCA (c) or 100 μM 3-oxoLCA (d) as a substrate. E. coli containing an empty vector was used as a control. Data are reported as the percentage of conversion to product (either 3-oxoLCA or isoLCA). n = 3 biological replicates per group. Data are mean ± s.e.m. Protein gels are provided in Extended Data Fig. 4, Supplementary Fig. 1. e, B. fragilis Δ3538, B. fragilis Δ3932 or the type strain B. fragilis NCTC9343 was incubated with 3-oxoLCA (100 μM) for 48 h. Representative total ion chromatogram UPLC–MS traces (left) and the percentage production of the target molecule isoLCA (right) are shown. n = 3 biological replicates per group. Data are mean ± s.e.m. DNA gels are provided in Supplementary Fig. 1. f, Cladogram of E. lenta and related human isolates and their production of 3-oxoLCA and isoLCA. El, E. lenta; Es, Eggerthella sinensis; Gs, Gordonibacter sp.; and Gp, G. pamelaeae. The E. lenta isolates shown in red (E. lenta 28B, E. lenta DSM15644 and E. lenta Valencia) that lack a homologue of Elen_0690 did not synthesize 3-oxoLCA from LCA. All of the strains were incubated with 100 μM LCA for 48 h. n = 3 biological replicates per group. Data are mean ± s.e.m.

We next sought to identify bacterial enzymes that convert LCA into 3-oxoLCA and isoLCA. Heterologous expression of E. lenta and R. gnavus HSDH candidate genes20 followed by incubation with LCA or 3-oxoLCA revealed that Elen_0690 and Rumgna_02133 convert LCA to 3-oxoLCA, whereas Elen_1325 and Rumgna_00694 convert 3-oxoLCA to isoLCA (Fig. 3c, d, Extended Data Fig. 4a, b). Thus, we propose that Elen_0690 and Rumgna_02133 encode 3α-HSDHs and Elen_1325 and Rumgna_00694 encode 3β-HSDHs.

Although the majority of the identified isoLCA-producing bacteria are Gram-positive Firmicutes or Actinobacteria, we also found that the prevalent Gram-negative human gut commensal23 Bacteroides fragilis is a robust isoLCA producer. To identify the B. fragilis 3β-HSDH, we heterologously expressed candidate genes identified using BLASTP searches and secondary structure homology predictions (Supplementary Table 3). Incubation of cell lysates with 3-oxoLCA followed by quantification of isoLCA enabled us to identify BF3538 and BF3932 as 3β-HSDHs that produce isoLCA (Fig. 3d, Extended Data Fig. 4c). Of the two genes, only when BF3538 was deleted24 did B. fragilis cultures lose the ability to convert 3-oxoLCA to isoLCA (Fig. 3e). These data indicate that BF3538 encodes a 3β-HSDH that is responsible for isoLCA production in B. fragilis cells.

To determine whether the two 3α-HSDHs that we identified (Elen_0690 and Elen_0360) are functional in growing bacteria, we used a collection of Eggerthella and Gordonibacter human isolates with sequenced genomes. Through comparative genomics, we identified three E. lenta strains (E. lenta Valencia, E. lenta 28B and E. lenta DSM15644) that lack a homologue of Elen_0690 and two Gordonibacter strains (G. pamelaeae 3C and G. sp. 28C) that lack a homologue of Elen_0360 (Supplementary Table 4). Neither 3-oxoLCA nor isoLCA was detected when E. lenta Valencia, E. lenta DSM15644 or E. lenta 28B was cultured with LCA, whereas G. pamelaeae 3C and G. sp. 28C produced similar amounts of 3-oxoLCA and isoLCA from LCA as control strains containing homologues of both genes (Fig. 3f). These data support the hypothesis that Elen_0690 and its homologues, but not Elen_0360 and its homologues, encode 3α-HSDHs that are responsible for converting LCA to 3-oxoLCA. Similarly, we assessed whether gene-level differences could help confirm the HSDHs in R. gnavus. After incubating 13 strains of R. gnavus with LCA for 48 h, we determined that six R. gnavus strains did not convert LCA to either 3-oxoLCA or isoLCA. These results are consistent with BLASTP search results indicating that these six strains do not possess a 3α-HSDH homologue (Rumgna_02133; Supplementary Table 5, Extended Data Fig. 4i, j). Together, these results reveal the biosynthetic pathway for the conversion of LCA to 3-oxoLCA and isoLCA by E. lenta and R. gnavus, and for the conversion of 3-oxoLCA to isoLCA by B. fragilis.

TH17 cell differentiation was significantly reduced in cells treated with E. lenta DSM2243 (3α-HSDH+) + LCA supernatant compared with those that were treated with supernatant from either E. lenta DSM15644 (3α-HSDH−) + LCA or E. lenta DSM2243 alone (Extended Data Fig. 4k, l). These data suggest that the presence of 3α-HSDH in E. lenta affects the ability of this organism to modulate TH17 cell differentiation. Moreover, co-incubation of E. lenta DSM2243 (3α-HSDH+) with B. fragilis NCTC9343 resulted in the conversion of LCA to both 3-oxoLCA and isoLCA, whereas no conversion was observed when E. lenta DSM15644 (3α-HSDH−) was co-incubated with B. fragilis ΔBF3538 (3β-HSDH−) (Extended Data Fig. 4m). These data support a model of synergy for isoLCA production between strains with 3α- and 3β-HSDH activity.

Bacteria produce 3-oxoLCA and isoLCA in vivo

We next assessed whether gut bacteria could metabolize LCA in vivo. C57BL/6 GF mice were colonized with E. lenta DSM2243 (3α-HSDH+) or E. lenta DSM15644 (3α-HSDH−). As LCA is absent from GF animals, colonized mice were then fed chow alone or chow supplemented with LCA (0.3%, w/w)25 (Fig. 4a). Significantly higher levels of 3-oxoLCA were detected in the caecal contents of 3α-HSDH+-colonized mice than in the caecal contents of 3α-HSDH−-colonized mice (mean 34 pmol mg−1 wet mass versus mean 6 pmol mg−1 wet mass, P < 0.0001) (Fig. 4b).

Fig. 4: 3-OxoLCA and isoLCA modulate the TH17 cell response in vivo and are negatively correlated with CD in humans.

a, The experimental scheme for gnotobiotic experiments. Colonized mice were fed control chow or chow containing 0.3% LCA (w/w) and their caecal contents were analysed using UPLC–MS for LCA metabolites. b, c, GF C57BL/6 mice were monocolonized with the E. lenta type strain (DSM2243) or a strain lacking a 3α-HSDH (DSM15644) (b) and were co-colonized with DSM2243 and B. fragilis NCTC9343 or monocolonized with B. fragilis (c). n = 5 and n = 6 for the monocolonized and monocolonized + LCA groups, respectively (b), and n = 7, n = 7 and n = 6 mice for the GF, E. lenta + B. fragilis and B. fragilis groups, respectively (c). IsoLCA was not detected in the experiments in b. d, Experimental schematic for the gnotobiotic experiments. GF mice were treated with a BA-containing diet with or without prior colonization with BA-modifying bacteria. After C. rodentium infection, both IL-17A-producing CD4 T cell percentages in the colonic lamina propria and BA levels in the caecal contents of treated animals were analysed. e, IsoLCA feeding suppressed TH17 differentiation in Citrobacter-infected GF mice. n = 8 per group. Data are mean ± s.e.m. pooled from two experiments. Statistical analysis was performed using two-tailed unpaired t-tests. f, LCA treatment did not significantly affect TH17 cell levels in Citrobacter-infected GF mice treated with LCA-containing diets. Mice were sorted into quartile groups on the basis of LCA levels in the caecal contents (Methods; Q1, lowest BA quartile, n = 5 mice; Q2, n = 6; Q3, n = 6; Q4, highest BA quartile, n = 5 mice). g, Population frequencies of TH17 cells were reduced in mice colonized with human gut bacteria producing higher compared with lower levels of 3-oxoLCA and isoLCA. Mice were sorted into quartile groups on the basis of 3-oxoLCA + isoLCA levels in the caecal contents (Methods; Q1, lowest BA quartile, n = 11 mice; Q2, n = 12; Q3, n = 11; Q4, highest BA quartile, n = 11 mice). Data are mean ± s.e.m. 
Statistical analysis was performed using two-tailed unpaired t-tests (e), or one-way ANOVA with Dunnett’s (c) or Tukey’s (b, f and g) multiple-comparison test. Data are pooled from two (b and c), three (e and f) and six (g) experiments. h–k, 3-OxoLCA (h, j) and isoLCA (i, k) were significantly depleted in patients with CD relative to control individuals in the PRISM cohort (n = 68, n = 52 and n = 34 for individuals with CD, with UC and without IBD, respectively) (h, i), and in dysbiotic CD samples relative to non-dysbiotic control samples in the HMP2 cohort (n = 47, n = 169, n = 12, n = 110 and n = 122 for individuals with dysbiotic CD, with non-dysbiotic CD, with dysbiotic UC, with non-dysbiotic UC and without IBD, respectively) (j, k). The percentage of zeros is shown on the x axis. Box plots show median and bottom and top quartiles, with outliers outside of box plot whiskers (which indicate the inner fences of the data). Statistical analysis was performed using a linear mixed model and its coefficient and significance (FDR-adjusted P values) are shown (Supplementary Table 6).

We next examined whether collaborative metabolism by gut bacteria can enhance the production of 3-oxoLCA and isoLCA in vivo. C57BL/6 GF mice were co-colonized with E. lenta DSM2243 and B. fragilis NCTC9343 and then fed chow supplemented with LCA (0.3%, w/w). Co-colonized mice had a higher total concentration of 3-oxoLCA and isoLCA in the caecal contents compared with mice colonized with E. lenta DSM2243 or B. fragilis alone (Fig. 4b, c). The apparent production of 3-oxoLCA by B. fragilis in colonized mice may be explained by the in vivo activation of reversible 3α-HSDH genes, as putative HSDH genes from the type strain converted 3-oxoLCA to LCA in vitro (Extended Data Fig. 4c), and several B. fragilis isolates were found to convert LCA to 3-oxoLCA in vitro (Supplementary Table 2).

Next, we examined whether human gut bacteria that produce 3-oxoLCA and isoLCA can modulate TH17 cell levels in vivo. Infection of GF mice with Citrobacter rodentium led to robust increases in the levels of TH17 (IL-17A+IFNγ−), IFNγ+ TH17 (IL-17A+IFNγ+) and TH1 (IL-17A−IFNγ+) cells compared with control mice (Fig. 4d, Extended Data Fig. 5a). Treatment of C. rodentium-infected GF mice with isoLCA resulted in reduced TH17 and IFNγ+ TH17 cell levels (Fig. 4e, Extended Data Fig. 5b, c). By contrast, treatment with LCA did not significantly affect the numbers of these cells (Fig. 4f, Extended Data Fig. 5d, e). IsoLCA feeding did not affect TH1 and Treg cells (Extended Data Fig. 5b, c), whereas LCA reduced the abundance of these cells (Extended Data Fig. 5d, e).

We next co-colonized GF mice with bacteria that produce either high or low levels of 3-oxoLCA and isoLCA in vitro before subjecting these animals to LCA feeding and C. rodentium infection (Fig. 4d). Notably, we observed substantial mouse-to-mouse variation in the levels of BAs in mice infected with Citrobacter (Extended Data Fig. 5f). Although the precise origin of this BA variability is currently unclear, differences in food uptake, high rates of recycling of BAs through enterohepatic recirculation and possible altered host metabolism caused by C. rodentium infection are potential contributors2,3. Regardless, colonization of GF mice with high producers, compared with colonization with low producers, resulted in more robust production of 3-oxoLCA and isoLCA in the caecum (Extended Data Fig. 5f).

Notably, mice with high caecal 3-oxoLCA and isoLCA levels exhibited significantly reduced TH17 and IFNγ+ TH17 cell frequencies compared with those with lower 3-oxoLCA and isoLCA levels (quartile 4 versus quartiles 1–3; Fig. 4g, Extended Data Fig. 5g). Furthermore, high producers of 3-oxoLCA and isoLCA more robustly suppressed both TH17 and IFNγ+ TH17 cell levels compared with low producers (Extended Data Fig. 5h). TH1 and Treg cell percentages, which were significantly reduced in LCA-fed GF mice, were not reduced by these bacterial colonizations (Extended Data Fig. 5g, h). C. rodentium load was not affected by 3-oxoLCA and isoLCA production and reduced TH17 cell frequencies (Extended Data Fig. 5i). Together, these data suggest that human gut bacteria that convert LCA to 3-oxoLCA and isoLCA negatively regulate TH17 cell levels in mice in vivo.
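The quartile comparison described above (quartile 4 versus quartiles 1–3) can be illustrated with a short sketch. It assumes a simple rank-based split into four near-equal groups, which may not match the exact quartile definition given in the Methods, and every mouse identifier and value below is a made-up placeholder rather than study data.

```python
from statistics import mean


def assign_quartiles(levels: dict[str, float]) -> dict[str, int]:
    """Rank mice by caecal BA level and split them into quartiles 1 (lowest) to 4 (highest)."""
    ranked = sorted(levels, key=levels.get)
    n = len(ranked)
    return {mouse: 1 + (4 * i) // n for i, mouse in enumerate(ranked)}


# Hypothetical caecal 3-oxoLCA + isoLCA levels (pmol per mg wet mass).
ba_levels = {"m1": 2.0, "m2": 35.0, "m3": 8.0, "m4": 50.0,
             "m5": 12.0, "m6": 20.0, "m7": 5.0, "m8": 41.0}

# Hypothetical TH17 frequencies (% of CD4+ T cells) for the same mice.
th17_percent = {"m1": 14.0, "m2": 11.0, "m3": 13.0, "m4": 6.0,
                "m5": 12.5, "m6": 12.0, "m7": 15.0, "m8": 7.0}

quartiles = assign_quartiles(ba_levels)

# Compare the highest-BA quartile with the remaining three quartiles.
q4_mean = mean(th17_percent[m] for m, q in quartiles.items() if q == 4)
q1_to_q3_mean = mean(th17_percent[m] for m, q in quartiles.items() if q < 4)

print(f"Q4 mean TH17: {q4_mean:.1f}%; Q1-Q3 mean TH17: {q1_to_q3_mean:.1f}%")
```

A real analysis would follow the grouping with the statistical test named in the figure legend; the sketch stops at the group means because the underlying data are not reproduced here.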

3-OxoLCA and isoLCA are altered in IBD

Consistent with the function of 3-oxoLCA and isoLCA in mice, the levels of 3-oxoLCA and isoLCA were significantly decreased in patients with Crohn’s disease (CD) (n = 68) from the Prospective Registry in IBD Study at MGH (PRISM) IBD cohort26 relative to control individuals without IBD (n = 34; Fig. 4h, i, Supplementary Table 6, Extended Data Fig. 6). Furthermore, within the Integrative Human Microbiome Project (HMP2/iHMP) IBD cohort27, the levels of 3-oxoLCA and isoLCA were significantly decreased in patients with CD in a dysbiotic state (n = 48) compared with their non-dysbiotic baselines (n = 169; Fig. 4j, k, Supplementary Table 6, Extended Data Fig. 7). These data indicate that the anti-inflammatory metabolites 3-oxoLCA and isoLCA are negatively associated with CD in humans.

We identified a subset of TH17-cell/IL-17-related genes27,28 (n = 21) that were differentially expressed in patients with IBD from the HMP2 cohort (Methods, Supplementary Tables 7, 8). We next correlated these genes with LCA, 3-oxoLCA and isoLCA along with two other control BAs—chenodeoxycholic acid (CDCA) and DCA. Twenty out of the 21 genes with significant correlations after adjustment for false-discovery rate (FDR) were upregulated in IBD and also displayed a significant negative correlation with 3-oxoLCA and/or isoLCA, but not with the three other BAs (Extended Data Fig. 8). These data imply that 3-oxoLCA and isoLCA may specifically contribute to IBD by biasing the TH17-cell–IL-17 signalling axis.
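For readers unfamiliar with FDR adjustment, the following sketch shows one common procedure, Benjamini–Hochberg, applied to a list of P values. The text does not state which FDR method was used for the gene–metabolite correlations, so the choice of Benjamini–Hochberg is an assumption made here for illustration, and the example P values are invented.

```python
def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four gene-metabolite correlations.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)

# Flag correlations that remain significant at an FDR threshold of 0.05.
significant = [p <= 0.05 for p in adj_p]
print(adj_p, significant)
```

Each raw P value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.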

We further examined the associations between 3α- and 3β-HSDH-related microbial features and 3-oxoLCA and isoLCA during gut inflammation. Interestingly, both E. lenta and R. gnavus 3α-HSDH homologues were significantly depleted in samples from patients with CD and ulcerative colitis (UC) in a dysbiotic state compared with non-dysbiotic control samples (Methods, Supplementary Table 9, Extended Data Fig. 9a), and 3β-HSDH homologues were significantly depleted in samples from patients with UC in a dysbiotic state (Methods, Supplementary Table 9, Extended Data Fig. 9b). When we controlled for phylum-level taxonomic changes, we still observed significantly lower levels of 3-oxoLCA and isoLCA and the 3α-HSDH gene in patients with CD and UC in a dysbiotic state (Extended Data Fig. 9c–f). Moreover, we found that most 3α-/3β-HSDH homologues were positively correlated with 3-oxoLCA and isoLCA in patients with CD in a dysbiotic state (Extended Data Fig. 10a). Furthermore, 3-oxoLCA and isoLCA were significantly associated with species that exhibited both 3α-/3β-HSDH activity and differential abundance between dysbiosis states within each IBD phenotype (Methods, Extended Data Fig. 10b, Supplementary Table 10). Collectively, these results suggest that the decreased abundances of 3-oxoLCA and isoLCA in dysbiotic IBD are linked to changes in the abundance of the 3α-/3β-HSDHs and the species that encode these enzymes.

Discussion

Here we identified gut bacteria and enzymes that produce 3-oxoLCA and the abundant gut metabolite isoLCA, BAs that inhibit TH17 cell function. Notably, the majority of the 3-oxoLCA- and isoLCA-producing strains that we found were not uncovered in bioinformatic searches, highlighting the importance of in vitro approaches for the discovery of bacteria with specific enzymatic activities. The findings that levels of 3-oxoLCA and isoLCA as well as the bacterial genes that are responsible for their biosynthesis are reduced in patients with IBD suggest that bacterial production of these molecules may help to maintain homeostatic immune balance in the gut. In other research supporting this hypothesis, E. lenta and the genes encoding 3α-HSDHs and 3β-HSDHs in this species were recently found to be correlated with early remission among patients with IBD treated with anti-cytokine therapy29. Negative correlations between 3-oxoLCA and isoLCA and host genes in the TH17-cell–IL-17 signalling axis further imply that these metabolites modulate the immune response at least in part by regulating TH17 cell function in humans. Consistent with the hypothesis that these metabolites promote human health, increased levels of 3-oxoLCA- and isoLCA-producing bacteria were found in individuals over the age of 100 compared with older and young control individuals30. Our findings add to a growing list of gut microbe-metabolite pairs that control host immune responses by directly modulating a distinct subset of immune cells. Given the growing recognition of the importance of BA molecules in regulating host physiology and immune responses, gaining a deeper understanding of the role of host-microbiota networks in mediating BA biotransformations will offer us opportunities to devise therapeutic interventions for diseases such as IBD, metabolic diseases and cancers of the enterohepatic system.

Methods

Mice

Conventionally reared animals were housed in an individually ventilated cage system (Tecniplast) at 20–22 °C and 40–55% humidity and under a 12 h–12 h light–dark cycle at the specific-pathogen-free New Research Building facility of Harvard Medical School. Male C57BL/6J mice (aged 6 to 8 weeks) were purchased from Jackson Laboratory. Male SFB-containing C57BL/6N mice (aged 6 to 8 weeks) were purchased from Taconic Biosciences. GF C57BL/6NCrl mice were purchased from Charles River Laboratories and maintained in GF isolators at Harvard Medical School. For the BA feeding experiments, irradiated powder meal (Teklad Global 19% protein extruded diet, 2019) was evenly mixed with a measured amount of BA, provided in glass feeder jars, and replenished when necessary. Colonization of mice with SFB was performed using fresh faecal samples from 2–4-month-old Il23r−/−Rag2−/− double-knockout mice of either sex, which are known to carry higher levels of SFB compared with conventional C57BL/6N mice. Faecal samples were homogenized in sterile 1× DPBS using a 100 μm cell strainer and a 5 ml syringe plunger. Approximately a quarter of the mouse faecal pellet in 200 μl 1× DPBS was introduced into each mouse using a 20 G gavage needle. Successful colonization was assessed by qPCR with the following primers: SFB-F, 5′-GACGCTGAGGCATGAGAGCAT-3′; SFB-R, 5′-GACGGCACGAATTGTTATTCA-3′. For anti-CD3 experiments, B6Tac mice were fed a control powder chow or a powdered chow containing isoLCA (0.3%, w/w) 4 days before injection with anti-CD3 antibodies (10 μg per mouse). Then, 3 days later, the mice were euthanized and ~10 cm of distal small intestines was harvested for lamina propria immune cell analysis. For gnotobiotic experiments, age- and sex-matched GF mice were orally gavaged with bacterial cultures and maintained in the Isocage system (Tecniplast). 
Control powder meal (Teklad Global 19% protein extruded diet, 2019) or a chow evenly mixed with 0.3% LCA (w/w) was autoclaved and provided to mice during the experiment. Successful colonization was assessed using qPCR (see Supplementary Table 1 for primer sequences and Supplementary Table 11 for qPCR data). For C. rodentium infection experiments in GF mice, age- and sex-matched GF C57BL/6NCrl mice (aged 4 to 6 weeks) were placed on control, isoLCA-containing (0.08% and 0.4%, w/w) or LCA-containing (0.012%, 0.06% and 0.3%, w/w) diets 3 days before being orally infected with approximately 1 × 10⁶ colony-forming units of C. rodentium, and were euthanized 5 days after infection. For C. rodentium infection experiments in gnotobiotic mice, age- and sex-matched GF C57BL/6NCrl or Swiss Webster mice (aged 4 to 6 weeks) were colonized with co-cultures of E. lenta DSM2243 and B. fragilis wild type (high producer group), E. lenta DSM15644 and B. fragilis 3β-HSDH knockout (low producer group) or C. citroniae human isolate P2-B6 and B. fragilis 3β-HSDH knockout (low producer group) strains for 4 days and put on an LCA-containing diet (0.3%, w/w) for an additional 3 days. Colonized mice were then orally infected with 1 × 10⁴ colony-forming units of C. rodentium and euthanized 5 days after infection. C. rodentium was prepared as previously described31. Mice were orally gavaged with 200 μl of PBS containing the indicated colony-forming units. Mice were kept in the Isocage system (Tecniplast) during the experiment. Both male and female mice were used in GF and gnotobiotic experiments. All of the animal procedures were approved by the Institutional Animal Care and Use Committee at Harvard Medical School. Sample sizes were determined by the magnitude and consistency of measurable differences based on similar previous studies to ensure statistical and biological significance1,15. Mice used in the in vivo testing of BAs were randomly assigned to experimental groups. 
Investigators were not blinded to the experimental groups due to different diet treatments and bacterial colonization conditions in the animal experiments.

Chemical synthesis of 3-oxoLCA and isoLCA

3-OxoLCA was prepared on a large scale by the oxidation of LCA according to a previously reported protocol1. Detailed synthesis methods and characterization data of isoLCA are included in Supplementary Figs. 2–6.

Isolation of lamina propria lymphocytes

Collected intestines were cut open and rinsed in ice-cold PBS. Associated fat was carefully removed, and the tissues were incubated in prewarmed 1× HBSS (without calcium and magnesium) supplemented with 1 mM dithiothreitol, 2 mM EDTA and 0.5% fetal bovine serum (FBS) at 37 °C for 20 min in a shaking incubator. The tissues were next briefly rinsed in warm RPMI and dissociated in digestion medium (RPMI supplemented with 50 μg ml−1 Liberase, 50 μg ml−1 DNase I and 1% FBS) at 37 °C for 40 min in a shaking incubator. Mononuclear cells were collected at the interface of a 40%/80% Percoll gradient (GE Healthcare). Cells were then analysed using flow cytometry. The distal 10 cm of the small intestine was considered to be the ileum.

In vitro T cell culture

Naive CD4+ (CD4+CD25−CD62L+CD44−) T cells were isolated from the spleens and the lymph nodes of C57BL/6J mice (aged 6 to 8 weeks) by FACS sorting. Then, 96-well flat-bottom plates were precoated with 50 μl of hamster IgG (MP Biomedicals) at 37 °C for 1 h. After multiple washes with 1× DPBS, 40,000 naive CD4+ T cells were seeded into T cell medium (RPMI supplemented with 10% fetal bovine serum, 25 mM glutamine, 55 µM 2-mercaptoethanol, 100 U ml−1 penicillin, 100 mg ml−1 streptomycin) and activated (TCR activation) with soluble anti-CD3 (145-2C11, 0.25 µg ml−1) and anti-CD28 (37.51, 1 µg ml−1) antibodies. For TH1 cell differentiation, 100 U ml−1 of IL-2 (Peprotech) and 10 ng ml−1 of IL-12 (Peprotech) were added. For TH17 cell differentiation, IL-6 (eBioscience, 20 ng ml−1) and human TGF-β1 (Peprotech, 0.3 ng ml−1) were added. For Treg culture, 100 U ml−1 of IL-2 (Peprotech) and human TGF-β1 (Peprotech, 5 ng ml−1) were added. Bacterial culture supernatants or small molecules including BAs and ML209, a highly specific RORγt inhibitor32, were added 18 h after TCR activation. Cells were collected and assayed by flow cytometry on day 3.

Flow cytometry

Cells collected from in vitro culture or in vivo mice experiments were stimulated with 50 ng ml−1 phorbol 12-myristate 13-acetate (Sigma-Aldrich) and 1 µM ionomycin (Sigma-Aldrich) in the presence of GolgiPlug (BD) for 2 h to determine cytokine expression. After stimulation, the cells were stained with various surface marker antibodies supplemented with LIVE/DEAD Fixable dye for dead cell exclusion. After washing, the cells were then fixed, permeabilized with the FoxP3/transcription factor staining kit (eBioscience) and intracellularly stained for cytokines and/or transcription factors. The following antibodies were used at the indicated dilutions for staining: anti-IL-17A (1:200, eBio17B7, eBioscience, 25-7177-82), anti-FoxP3 (1:100, FJK-16s, eBioscience, 11-5773-82), anti-RORγt (1:100, B2D, eBioscience, 17-6981-82), anti-IFNγ (1:200, XMG1.2, eBioscience, 48-7311-82), anti-CD3ε (1:400, 145-2C11, eBioscience, 48-0031-82), anti-CD25 (1:200, PC61.5, eBioscience, 25-0251-82), anti-CD62L (1:400, MEL-14, eBioscience, 11-0621-85), anti-CD4 (1:400, RM4-5, eBioscience, 56-0042-82), anti-CD45 (1:400, 30-F11, BioLegend, 103128), anti-CD8α (1:200, 53-6.7, BioLegend, 100744), anti-CD19 (1:200, 6D5, BioLegend, 115540), anti-CD44 (1:400, IM7, BioLegend, 103032), anti-CD4 (1:400, RM4-5, BD, 566407). Live/Dead Fixable viability dye Aqua (Thermo Fisher Scientific) was used at 1:500 dilution. Flow cytometry data were acquired on the LSR II flow cytometer or Symphony flow cytometer (both BD) and data were analysed using FlowJo software (TreeStar) according to the gating strategy shown in Supplementary Fig. 7.

Luciferase reporter assay

The transcriptional activity of the fusion protein of RORγt ligand-binding domain and GAL4-DNA-binding domain is reported by luciferase expression and reporter assays were conducted as previously described1. In brief, 50,000 human embryonic kidney HEK293FT (Invitrogen) cells per well were plated in 96-well plates in antibiotic-free Dulbecco’s modified Eagle medium containing 1% fetal calf serum. Then, 16 h later, cells were transfected with a DNA mixture containing 50 ng of firefly luciferase reporter plasmid (Promega pGL4.31 (luc2P/Gal4UAS/Hygro)), 5 ng of Renilla luciferase plasmid (Promega pRL-CMV) and 50 ng of plasmid encoding a fusion protein of the Gal4 DNA-binding domain and the human RORγt ligand-binding domain in each well. Transfections were performed using GeneJuice (Millipore) according to the manufacturer’s instructions. BAs or vehicle control were added 8 h after transfection and luciferase activity was measured 18 h later using the luciferase assay kits (Biotium).

RNA-seq analysis

Total RNA was isolated using the Qiagen RNeasy Plus Mini Kit according to the manufacturer’s protocol and quantified using the Agilent TapeStation RNA assay on the Agilent 4200 TapeStation instrument. Libraries were prepared using the KAPA mRNA HyperPrep kit according to the manufacturer’s instructions. In brief, 50 ng of total RNA per sample was used for mRNA capture and cDNA synthesis, followed by adapter ligation and amplification. After clean-up, the resulting purified libraries were analysed using the Agilent High Sensitivity D1000 ScreenTape assay on the Agilent 4200 TapeStation instrument. The libraries were next pooled equimolarly and run on the Illumina NextSeq 500 instrument with three runs: a Mid-Output 150-cycle kit and two High-Output 150-cycle kits (to obtain sufficient counts of paired-end 75 bp reads). The pool was loaded for these runs at 2.1 pM, with 5% PhiX spiked in as a sequencing control. The basecall files were demultiplexed through the Harvard BPF Genomics Core’s pipeline, and the resulting FASTQ files were used in the subsequent analysis. Raw sequencing reads were aligned to the Ensembl reference genome GRCm38 and gene counts were quantified using Salmon (v.1.2.1)33. Rstudio (v.4.0.2) and DESeq2 (v.1.28.1) were used for differential expression analysis34 using the Wald test with Benjamini–Hochberg correction to determine adjusted P < 0.05. Pairwise comparisons between DMSO-treated WT cells and isoLCA- or 3-oxoLCA-treated WT cells or DMSO-treated RORγt knockout cells identified 291 differentially expressed genes. Gene Ontology analysis was performed using the PANTHER Overrepresentation Test (released 1 May 2021)35. Heat maps were generated using pheatmap (v.1.0.12).

Bacterial cultures

Culturing of human gut bacteria was performed in an anaerobic chamber (Coy Laboratory Products) with a gas mixture of 5% hydrogen and 20% carbon dioxide (balance nitrogen) unless otherwise stated.

Human stool microbial isolation and cultivation

Faecal samples were obtained from patients with UC who have undergone faecal microbiota transplant under an Institutional Review Board-approved protocol and informed consent was obtained at Weill Cornell Medicine IRB 1404014982. Human isolate screening was performed using a published protocol36 with the following modifications. Two frozen faecal samples with the highest levels of 3-oxoLCA in the cohort were chosen (P3 (faecal 3-oxoLCA 44 pmol mg−1), and P27 (faecal 3-oxoLCA at 83 pmol mg−1), around 0.1 g per sample). The faecal samples were divided in half. One half was homogenized in reduced PBS (Genesee Scientific) and serially diluted and plated directly onto Cullen–Haiser Gut (CHG) agar37, which consists of brain–heart infusion medium (Bacto BHI, BD) supplemented with 1% BBL vitamin K1-hemin solution (BD), 1% trace minerals solution (ATCC), 1% trace vitamins solution (ATCC), 5% FBS (Genesee), 1 g l−1 cellobiose (Sigma-Aldrich), 1 g l−1 maltose (Sigma-Aldrich) and 1 g l−1 fructose (Sigma-Aldrich), and further supplemented with 0.5% (w/v) arginine (Sigma-Aldrich), and cultured at 37 °C. The other half was treated with an equal volume of 70% (v/v) ethanol (Sigma-Aldrich) for 4 h at room temperature under ambient aerobic conditions to kill vegetative cells, washed three times with PBS and plated on CHG agar containing 0.1% sodium taurocholate (TCA, Sigma-Aldrich) anaerobically for spore germination. The picked colonies were restreaked to confirm purity and then cultured in 600 μl CHG medium containing 0.5% arginine. Pure human isolates (990 in total) were archived and stored as glycerol stocks at −80 °C in eleven 96-well plates. To assess the diversity of the cultured isolates, we performed Genewiz 16S-EZ sequencing. Individual colonies were incubated in 600 μl CHG medium with 0.5% arginine for 48 h, after which 100 μl aliquots from each fresh culture from the same individual were pooled together. 
DNA extracts were prepared using the Allprep Bacterial DNA/RNA/Protein Kit (Qiagen) according to the manufacturer’s instructions and further submitted to Genewiz for bacterial 16S-EZ sequencing (V3 and V4 hypervariable regions) using the Illumina MiSeq system with 2 × 250 bp configuration and data analysis. To screen human isolates for LCA metabolism, isolates were retrieved from the stock plates and cultured in 600 μl CHG medium containing 0.5% arginine for 48 h at 37 °C in 96-well plates. Each isolate as well as the negative controls were then diluted 1:10 in new medium containing 100 μM LCA or 100 μM 3-oxoLCA for an additional 48 h. Cultures (0.2 ml) were collected and extracted. This experiment was conducted once per substrate for all of the isolates from the original 11 library plates. After BA analysis (see below), we prioritized the positive metabolizers and performed 16S ribosomal RNA gene sequencing (universal 16S-F, 5′-GAGTTTGATCCTGGCTCAG-3′; universal 16S-R, 5′-GGCTACCTTGTTACGACTT-3′) to enable taxonomic characterization for individual isolates. Positive producer function was verified in biological triplicate using single culture conditions (see below).

Single cultures

Individual strains were plated from glycerol stocks onto CHG agar supplemented with 0.5% (w/v) arginine and grown for 3 days. Colonies were then inoculated into 3 ml of CHG liquid medium in Falcon round-bottom polystyrene tubes and grown for 48 h at 37 °C to provide starter cultures, which were diluted 1:100 in triplicate into 5 ml fresh CHG medium containing either 100 μM of the corresponding substrate (either LCA (Sigma-Aldrich) or 3-oxoLCA (Steraloids)). Cultures were grown for 48 h at 37 °C. An aliquot of culture (0.5 ml) was collected and used for BA quantification. The experiments were performed in triplicate and repeated twice unless otherwise stated.

Co-cultures

Starting from single colonies, individual bacterial strains were grown anaerobically for 48 h in 3 ml of CHG medium at 37 °C. These starter cultures were normalized to an optical density at 600 nm (OD600) of 0.1 by dilution into fresh medium. Then, 10 μl of each normalized starter culture was diluted in 5 ml of CHG medium containing 0.75% (w/v) arginine. LCA (100 μM final concentration) or T-LCA (100 μM final concentration, Steraloids) was then added into the medium. Cultures were grown for 48 h at 37 °C and 0.5 ml aliquots were collected for BA analysis.

Bacterial supernatant assay for in vitro T cell cultures

Seed cultures from brain–heart infusion medium (Bacto) supplemented with 5 mg l−1 hemin, 2.5 µl l−1 vitamin K, 500 mg l−1 cysteine HCl (BHI+) and 1% arginine were diluted to OD600 = 0.1 in ISP2 + 1% arginine minimal medium, which consists of 4 g l−1 yeast extract (Bacto), 10 g l−1 malt extract (Sigma-Aldrich), 4 g l−1 dextrose (Sigma-Aldrich), 10 g l−1 arginine, pH 7.2, containing 800 μM LCA and grown for 8 h. Supernatants were collected by centrifugation (12,000g for 10 min) and subsequently passed through 0.2 μm syringe filters. Then, 10 μl of supernatant was added to 200 μl T cell culture.

Escherichia coli heterologous expression and lysis assays

Candidate genes were placed into pET28 expression vectors under an isopropyl β-d-1-thiogalactopyranoside (IPTG)-inducible operon for heterologous expression. Plasmids were transformed into BL21 expression cell lines containing either the pLysS (Elen and Rumgna) or Rosetta (BF) enhancement cassettes. A negative control of the empty pET28a vector was transformed into both cell lines. All cells were cultured at 37 °C until an OD600 between 0.6 and 1.0 was reached. Expression was induced by the addition of IPTG (500 μM final concentration). The cultures were incubated overnight at 18 °C and collected the next day by centrifugation at 4,100g before the pellets were stored at −80 °C. The pellets were thawed and resuspended in lysis buffer (50 mM sodium phosphate buffer/ 300 mM NaCl/10% glycerol, pH 8). Cells were lysed by sonication and the lysate clarified by centrifugation at 18,200g for 45 min. The lysate was then incubated with 100 μM substrate (LCA or 3-oxoLCA) for 6 h at 37 °C. The mixtures were then frozen to quench the reaction and stored at −80 °C until extraction and analysis (described below). Soluble expression was confirmed by SDS–PAGE or immunoblotting (Extended Data Fig. 4d–g, Supplementary Fig. 1). Initial gel analysis was performed using SDS–PAGE immediately after lysate clarification using Coomassie blue staining to visualize protein bands. In the case of the B. fragilis candidates and select candidates from E. lenta and R. gnavus, an anti-His tag (Cell Signaling, 2365S) immunoblot was performed with transfer verified by subsequent Amido Black total protein staining.

BA analyses

BA analyses were performed using a previously reported method38. Stock solutions of all BAs were prepared by dissolving compounds in molecular-biology-grade DMSO (Sigma-Aldrich). These solutions were used to establish standard curves. Glycocholic acid (GCA) (Sigma-Aldrich) was used as the internal standard. HPLC-grade solvents were used for preparing and running UPLC–MS samples. All data were analysed using Agilent ChemStation and are expressed as the percentage conversions to the predicted product(s) (LCA, 3-oxoLCA, isoLCA) or concentration in μM. Note that the four isomers of LCA that have been reported in human faeces—LCA, isoLCA, alloLCA and isoalloLCA—were separable by UPLC–MS1.

Sample preparation for native bacterial culture and E. coli cell lysis

Bacterial cultures or cell lysates were acidified to pH 1 using HCl (Sigma-Aldrich) and extracted twice with ethyl acetate (Sigma-Aldrich). The organic phase was collected and dried using a SpeedVac (Thermo Fisher Scientific) for 96-well plate cultures or a TurboVap (Biotage) for bacterial tube cultures or microcentrifuge tube lysates. Dried extracts were solubilized in 75% HPLC-grade methanol (EMD Millipore) in distilled H2O and analysed using UPLC–MS (Agilent Technologies 1290 Infinity II UPLC system coupled online to an Agilent Technologies 6120 Quadrupole LC–MS spectrometry system in negative electrospray mode) using a previously published method39,40 with the following modifications. Extracted BA solutions were injected onto a Phenomenex 1.7 μm, C18 100 Å, 100 × 21 mm LC column at a flow rate of 0.350 ml min−1 using 0.05% formic acid in water as mobile phase A and acetone as mobile phase B. The following gradient was applied: 0–1 min, 25–60% B; 1–5 min, 60–70% B; 5–6 min, 70–100% B; 6–7 min, 100% B isocratic; 7–8 min, 100–25% B; 8–10 min, 25% B isocratic.
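
As an illustration only, the gradient programme above can be written as a table of breakpoints and queried for the mobile-phase composition at any time point. The sketch below assumes that each ramp is linear between its stated end points, which the text does not specify; the names GRADIENT and percent_b are hypothetical.

```python
# Gradient breakpoints (time in min, % mobile phase B), transcribed from the text.
# Assumption: each ramp is linear between consecutive breakpoints.
GRADIENT = [
    (0.0, 25.0),
    (1.0, 60.0),
    (5.0, 70.0),
    (6.0, 100.0),
    (7.0, 100.0),
    (8.0, 25.0),
    (10.0, 25.0),
]


def percent_b(t_min):
    """Return % B at time t_min (minutes) by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-10 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

For example, percent_b(3.0) falls halfway along the 1–5 min ramp from 60% to 70% B.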

Sample preparation for mouse and human tissue BAs

BAs were extracted from mouse caecal and human faecal samples and quantified using UPLC-MS as previously reported38. GCA or β-muricholic acid (βMCA, Steraloids) was used as the internal standard for mouse and human samples, respectively. The limits of detection of individual BAs in tissues (in pmol mg−1 wet mass) were as follows: β-muricholic acid, 0.10; GCA, 0.25; T-LCA, 0.04; LCA, 0.12; 3-oxoLCA, 0.18; and isoLCA, 0.29.

Genomic and meta-omic sequence analysis

B. fragilis NCTC9343 3β-HSDHs

BLASTP searches of the B. fragilis NCTC9343 genome were performed using the JGI Integrated Microbial Genomes and Microbiomes database (v.5.0 in March 2020)40,41 with 3β-HSDHs Elen_1325 and Rumgna_00694 as query sequences using an E-value cut-off of 1 × 10−2. All candidate genes with E values below 1 × 10−15 were selected for heterologous expression assays. Secondary structure prediction analysis using the JPRED4 server42 was then performed on the remaining hits. The predicted structures of the known 3β-HSDHs Elen_1325 and Rumgna_00694 were compared with those of the remaining hits. The best match to the known 3β-HSDHs—BF3538 (CAH09226.1)—was also selected for heterologous expression. The B. fragilis Δ3538 and B. fragilis Δ3932 mutant strains were constructed using a reported method with slight modifications24. In brief, the 1 kb regions upstream and downstream of BF3538 or BF3932 were PCR-amplified, cloned into pLGB30 and transformed into E. coli S17-1(λpir) chemically competent cells. E. coli S17-1(λpir) cells containing the desired plasmid were cultured and conjugated into the recipient strain (B. fragilis NCTC9343) and selected by tetracycline, and later counter-selected by rhamnose on BHI+ with 10% horse blood (Quad five, 210-500) plates. Knockout strain colonies were confirmed using PCR (Extended Data Fig. 4h, Supplementary Fig. 1) and sequencing. The loss of function of the knockout strain was confirmed using UPLC–MS with 100 μM 3-oxoLCA as the substrate. A list of the primers used is provided in Supplementary Table 1.

Comparative genomic analysis

In E. lenta isolates, genetic variation between the E. lenta DSM2243 type strain and other human isolates was determined using comparative genomic analysis pipelines as previously reported in ElenMatchR v.1.0.9003 (ref. 43). Elen_0360 (ACV54351.1), Elen_0690 (ACV54671.1) and Elen_1325 (ACV55294.1) were used as query sequences for Eggerthellaceae isolates (Supplementary Table 4). A phylogenetic tree was created using PhyloPhlAn44 and visualized with Ggtree45. In R. gnavus isolates, Rumgna_02133 (3α-HSDH, A7B3K3.1) and Rumgna_00694 (3β-HSDH, A7AZH2.1) were used as query genes to perform BLASTP searches (Supplementary Table 5).

Analysing the effect of BAs on CD4 T cells in gnotobiotic mice

Caecal levels of LCA, 3-oxoLCA and isoLCA in C. rodentium-infected GF or gnotobiotic mice were analysed using UPLC–MS. For isoLCA-fed GF mice, data were pooled from two experiments to generate bar graphs (Fig. 4e, Extended Data Fig. 5b) as well as scatter plots (Extended Data Fig. 5c). For LCA-fed GF mice, data pooled from three experiments were sorted into the following quartile groups based on the levels of LCA in caecal contents: quartile 1 (Q1): 24.14–59.46 pmol mg−1 wet mass LCA, 5 mice; Q2, 65.97–332.53 pmol mg−1 wet mass LCA, 6 mice; Q3, 371.99–2,162.68 pmol mg−1 wet mass LCA, 6 mice; and Q4, 2,238.142–5,389.45 pmol mg−1 wet mass LCA, 5 mice (Fig. 4f, Extended Data Fig. 5d). For LCA-fed mice colonized with 3-oxoLCA- and isoLCA-producing bacteria, data pooled from six experiments were sorted into the following quartile groups based on the levels of 3-oxoLCA and isoLCA in caecal contents: Q1, 0–46.58 pmol mg−1 wet mass 3-oxoLCA + isoLCA, 11 mice; Q2, 47.91–92.78 pmol mg−1 wet mass 3-oxoLCA + isoLCA, 12 mice; Q3, 101–174.7 pmol mg−1 wet mass 3-oxoLCA + isoLCA, 11 mice; and Q4, 177.26–591.59 pmol mg−1 wet mass 3-oxoLCA + isoLCA, 11 mice (Fig. 4g, Extended Data Fig. 5g). One-way ANOVA followed by Tukey’s multiple-comparison test was used to compare the frequencies of CD4 T cells belonging to each quartile group. The population frequencies of the indicated CD4 T cells were also plotted against the LCA metabolite levels and linear regression analyses were performed in Extended Data Fig. 5c, e, f, h.

Identification of LCA derivatives from PRISM and HMP2 cohorts

The raw LC–MS data were acquired using the same C18-negative mode LC–MS methods described in the HMP2 and PRISM studies26,27. Peaks of unknown ID were confirmed using authentic standards run alongside the quality-control reference stool pool generated in the HMP2 study. The LCA derivatives were confirmed by matching their m/z in negative mode and retention time, and subsequently verified using LC–MS/MS (Supplementary Fig. 8). Extracted ion chromatograms were generated using QualBrowser (Xcalibur v.4.1.31.9; Thermo Fisher Scientific). The commercial standards used were: LCA (Sigma-Aldrich, L6250), 3-oxoLCA (Steraloids, C1750-000), isoLCA (Steraloids, C1475-000), isoalloLCA (Steraloids, C0700-000), alloLCA (Steraloids, C0680-000), DCA (Sigma-Aldrich, D2510), 3-oxoDCA (Steraloids, C1725-000) and isoDCA (Steraloids, C1165-000). LCA peak in PRISM: FFA_Cluster_0731, m/z = 375.2898 at 12.42 min, and in HMP2: C18n_QI48, m/z = 375.2905 at 11.98 min; 3-oxoLCA in PRISM: FFA_Cluster_0722, m/z = 373.2744 at 12.63 min, and in HMP2: C18n_QI6169, m/z = 373.2749 at 12.2–12.35 min; isoLCA in PRISM: FFA_Cluster_0733, m/z = 375.2901 at 11.73 min, and in HMP2: C18n_QI6230, m/z = 375.2906 at 11.31 min (Supplementary Fig. 8, Supplementary Table 12).

Statistical analysis of PRISM and HMP2 IBD multi-omic datasets

Data overview

We used two publicly available IBD metabolomics datasets for determining the differential abundance (DA) of BAs in disease/dysbiotic conditions, specifically (1) the Prospective Registry in IBD Study at MGH (PRISM)26 and (2) the IBDMDB study within the integrative Human Microbiome Project (HMP2)27. Additional multi-omic profiles from the HMP2 were further used to associate metabolite abundance with microbial species, gene products and host gene expression.

The PRISM dataset used is a cross-sectional cohort incorporating individuals who were diagnosed with CD (n = 68) and UC (n = 53), and control individuals without IBD (n = 34). As with all metabolomics here, PRISM stool samples were analysed with metabolomic profiling using a combination of four LC–MS methods. Paired metagenomic profiles from PRISM samples were not used in this study. Differential abundance in the PRISM cohort was determined as described below based on diagnosis (that is, comparing the CD and UC subpopulations with controls). Metabolomics profiles from the PRISM cohort were taken from the associated publication’s supporting information26.

The IBDMDB HMP2 comprises a longitudinal cohort containing 132 participants with CD (n = 67), UC (n = 38) and control individuals without IBD (n = 27) who were followed for up to one year each. Taxonomic and functional profiles for HMP2 metagenomes (MGX), metabolomes (MBX) and host transcriptomes (HTX) were downloaded from http://ibdmdb.org in July 2020. These were based on 1,595 MGX samples, 546 MBX samples from 106 individuals (CD, n = 50; UC, n = 30; without IBD, n = 26), and 254 HTX samples from 90 individuals (CD, n = 43; UC, n = 25; without IBD, n = 22). MGX samples had been previously profiled for microbial taxonomic composition using MetaPhlAn (v.2.6.0)46 and for UniRef90-level47 gene functional content using HUMAnN (v.2.11.0)48. MGX and MBX samples were strictly matched for multi-omic association if they were derived from the same individual and sampling time point (yielding 461 samples from 106 participants: 50 with CD, 30 with UC and 26 control individuals without IBD). MGX and HTX samples were matched more leniently to compensate for the smaller total number of HTX samples. Specifically, we considered the first pair of MGX: HTX samples from each individual that were separated by no more than 2 weeks (yielding 71 samples from 71 individuals: n = 33 with CD, n = 21 with UC and n = 17 control individuals without IBD).
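
The lenient MGX:HTX matching rule can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: it assumes that each sample carries a collection date, and it interprets the 'first pair' as the chronologically earliest pair within the 2-week window. The function name first_matched_pair and the sample identifiers in the usage note are hypothetical.

```python
from datetime import date, timedelta


def first_matched_pair(mgx, htx, max_gap=timedelta(weeks=2)):
    """Pick one MGX:HTX pair for a single participant.

    mgx, htx: lists of (sample_id, collection_date) tuples.
    Returns (mgx_id, htx_id) for the earliest pair whose collection dates
    differ by no more than max_gap, or None if no such pair exists.
    """
    candidates = [
        (min(m_date, h_date), abs(m_date - h_date), m_id, h_id)
        for m_id, m_date in mgx
        for h_id, h_date in htx
        if abs(m_date - h_date) <= max_gap
    ]
    if not candidates:
        return None
    # Earliest pair first; ties are broken by the smaller time gap.
    _, _, m_id, h_id = min(candidates)
    return m_id, h_id
```

Called with mgx = [("M1", date(2015, 1, 1)), ("M2", date(2015, 3, 1))] and htx = [("H1", date(2015, 2, 20))], the sketch pairs M2 with H1, because M1 lies more than 2 weeks from every HTX sample.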

Identifying differentially abundant metabolites

We used separate statistical models and definitions of disease activity when determining the DA status of metabolites in the PRISM and HMP2 cohorts owing to their cross-sectional versus longitudinal designs. Specifically, PRISM participants classified as having CD or UC were compared with control individuals without IBD, whereas HMP2 participants were compared between ‘active’ (dysbiotic) and ‘inactive’ (non-dysbiotic) states within individual time courses as described previously27. Before statistical model fitting, gut metabolome profiles of PRISM and HMP2 participants were (1) median-normalized to reduce technical sample-to-sample variation; (2) prevalence-filtered to remove low-confidence features (requiring >30% non-zero values); and (3) log-transformed for variance stabilization (replacing zero values with half the smallest non-zero measurement on a per-feature basis).
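
The three preprocessing steps can be illustrated with the short sketch below. The exact form of the median normalization is not given in the text, so the version here (dividing each sample by the median of its non-zero abundances) is an assumption, as are the use of the natural logarithm and the function name preprocess.

```python
import math
from statistics import median


def preprocess(table, min_prevalence=0.30):
    """table: dict mapping feature -> list of abundances (one per sample).

    Returns a dict of log-transformed values for the retained features.
    """
    features = list(table)
    n_samples = len(next(iter(table.values())))

    # (1) Median normalization (assumed form): scale each sample by the
    # median of its non-zero abundances.
    sample_medians = []
    for j in range(n_samples):
        nonzero = [table[f][j] for f in features if table[f][j] > 0]
        sample_medians.append(median(nonzero) if nonzero else 1.0)
    normalized = {
        f: [v / m for v, m in zip(table[f], sample_medians)] for f in features
    }

    out = {}
    for f, values in normalized.items():
        nonzero = [v for v in values if v > 0]
        # (2) Prevalence filter: require >30% non-zero values.
        if len(nonzero) / n_samples <= min_prevalence:
            continue
        # (3) Log transform; zeros are replaced with half the smallest
        # non-zero value of this feature (natural log is an assumption).
        pseudo = min(nonzero) / 2
        out[f] = [math.log(v if v > 0 else pseudo) for v in values]
    return out
```

A feature detected in only one of four samples (25% prevalence) would be dropped by step (2), whereas a feature with a single zero would be kept and its zero replaced in step (3).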

Differential abundance over disease phenotype (diagnosis) was determined within the PRISM cohort by evaluating the following linear model for each metabolite in base R v.4.0.2:

metabolite ~ (intercept) + diagnosis + age + antibiotics + immunosuppressant + mesalamine + steroids

Diagnosis was coded as a categorical variable (CD, UC, non-IBD control) with non-IBD control as the reference state. Age was coded as a continuous covariate and four medication exposures (use of antibiotics, immunosuppressants, mesalamine and steroids) were coded as binary covariates. Individual medications within these broad classes (such as specific antibiotic treatments) were insufficiently numerous to merit separate coding.

DA status over disease activity (dysbiosis) was determined within the longitudinally sampled HMP2 cohort by evaluating the following linear mixed-effects model for each metabolite using R’s nlme package:

metabolite ~ (intercept) + diagnosis + diagnosis:dysbiosis + age + antibiotics + (1|participant)

Diagnosis was coded as described above in the context of the PRISM cohort. Dysbiosis within diagnosis (diagnosis:dysbiosis) was used to determine DA status, with per-diagnosis non-dysbiotic samples serving as the reference state. Age at study consent was included as a continuous covariate and per-sample antibiotics exposure as a binary covariate. A per-individual random effect was included to compensate for repeated sampling and to ‘absorb’ potential confounders that were invariant over individuals (such as recruitment site).

Model coefficients of the diagnosis (PRISM) and diagnosis:dysbiosis (HMP2) terms were interpreted as DA effect sizes, and their associated two-tailed P values were used to determine statistical significance (Wald’s test). Where applicable, simultaneously derived P values were adjusted for multiple hypothesis testing using the Benjamini–Hochberg FDR method (Fig. 4h–k, Supplementary Table 6).
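
For readers unfamiliar with the Benjamini–Hochberg procedure, a minimal standard-library sketch is given below. It is not the code used in the study (the analyses were run in R); the function name benjamini_hochberg is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that adjusted
    # values never decrease with increasing rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each raw P value is multiplied by m/rank, where m is the number of tests and rank is its position in ascending order, and the result is capped by the adjusted value of the next-largest P value.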

Identifying differentially abundant microbial features and metabolite associations

3α- and 3β-HSDH homologues were identified based on mapping query sequences to known protein family clusters as defined in UniRef49 (release 2014_07). We first identified the UniRef90 annotations (that is, protein sequences with >90% amino acid identity and >80% coverage) of the genes identified as 3α- and/or 3β-HSDHs (Elen_0690 and Elen_1325 from E. lenta; Rumgna_02133 and Rumgna_00694 from R. gnavus ATCC29149; BF3538 from B. fragilis NCTC9343). We refer to these as ‘query UniRef90s’. To identify homologues of a given query UniRef90s, we collected all UniRef90 families belonging to the same UniRef50 family as the query (that is, a set of proteins expected to have >50% identity and high coverage of the query) (Supplementary Table 9). We then estimated the per-sample abundance of 3α-HSDH and 3β-HSDH homologues in HMP2 metagenomes by summing over the abundances of homologous UniRef90 sequences (which had been pre-computed using HUMAnN) (Supplementary Table 9, Extended Data Fig. 9e, f).
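
The homologue aggregation step amounts to a lookup followed by a sum, as in the sketch below. The UniRef identifiers in the usage note are placeholders and the function name homologue_abundance is hypothetical; in the study, per-sample UniRef90 abundances came from HUMAnN output.

```python
def homologue_abundance(query_uniref90s, uniref90_to_50, sample_abundance):
    """Sum the abundance of all UniRef90 families that share a UniRef50
    family with any of the query UniRef90s.

    query_uniref90s: iterable of query UniRef90 identifiers.
    uniref90_to_50: dict mapping UniRef90 id -> UniRef50 id.
    sample_abundance: dict mapping UniRef90 id -> abundance in one sample.
    """
    query_50s = {uniref90_to_50[q] for q in query_uniref90s}
    return sum(
        abundance
        for uniref90, abundance in sample_abundance.items()
        if uniref90_to_50.get(uniref90) in query_50s
    )
```

With a placeholder mapping in which UniRef90_A and UniRef90_B both belong to UniRef50_X, a query of UniRef90_A returns the summed abundance of A and B and ignores families from other UniRef50 clusters.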

We tested 3α-/3β-HSDH homologues for differential abundance over dysbiosis states following a very similar approach to the one introduced above in the context of HMP2 metabolomics. Gene abundance values were similarly zero-smoothed and log-transformed before linear model fitting within the MaAsLin 2 package50. The same random effects model formulation applied to HMP2 metabolomics was applied here within MaAsLin. We also applied this modelling approach to the abundances of microbial species for which corresponding strains had been found to express 3α-/3β-HSDH activity in vitro.

To associate microbial features (genes and species) with metabolites of interest within the HMP2 dataset, we computed Spearman correlations between the residual abundances of genes and metabolites from the previously described linear models (Supplementary Table 10, Extended Data Fig. 10b). This procedure helps to identify correlation between features that cannot be explained by the confounding effects of covariates included in the models. By contrast, a metabolite and gene that both correlate strongly with dysbiosis would be expected to also correlate with one another in the raw data, but this would not be suggestive of a direct link between their changing abundances. Here, model residuals reflect variation in gene and metabolite abundance after subtracting differences due to dysbiosis (in addition to diagnosis, age, antibiotics use and per-individual variation) and, therefore, any remaining correlation cannot be attributed to those variables. Two-tailed P values associated with these Spearman correlation coefficients were subjected to FDR correction according to the procedures introduced above for model coefficients.
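
The logic of correlating residuals rather than raw abundances can be illustrated with a deliberately simplified sketch. It replaces the mixed-effects models used in the study with an ordinary least-squares fit on a single covariate (dysbiosis), which is an assumption made purely for brevity; all function names are hypothetical.

```python
def ols_residuals(y, x):
    """Residuals of y after a least-squares fit on a single covariate x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [yi - (my + slope * (xi - mx)) for xi, yi in zip(x, y)]


def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(a, b):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5
```

In a toy dataset in which a gene and a metabolite both increase with dysbiosis but vary in opposite directions within each dysbiosis state, the raw Spearman correlation is positive whereas the correlation of the residuals is negative, which is the distinction that the residual-based approach is designed to capture.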

Identifying differentially abundant host transcripts and metabolite associations

Using paired MBX and HTX samples from the HMP2 dataset, we identified human genes that were differentially expressed with respect to diagnosis. Note that, because this analysis considers only one HTX sample per HMP2 participant, we focus on per-individual diagnosis as a phenotype rather than per-sample dysbiotic state (Supplementary Tables 7, 8, Extended Data Fig. 8). We performed initial normalization on raw sample-by-gene HTX count data using the voom method implemented in R’s limma package51,52. We then used the normalized counts as a basis for linear modelling within MaAsLin 2 to detect differential gene expression:

gene ~ (intercept) + diagnosis + antibiotics + age

That is, the transformed abundance of each gene was modelled as a function of diagnosis, consent age and antibiotics use as defined above for the HMP2 cohort. The coefficient of the diagnosis term and its FDR-adjusted P value were used to determine the effect size and statistical significance of potential differential expression. Note that differential expression analysis was performed in MaAsLin 2 rather than limma voom itself (1) for consistency with other linear modelling analyses in this work and (2) to enable export of model residuals for multi-omic correlation analysis. From the subset of human differentially expressed genes with FDR-adjusted P < 0.25, we selected a subset that was previously identified as TH17-cell related27,28. We then followed the approach outlined above in the context of microbial features to associate residual expression of these human genes with residual metabolite expression using Spearman correlation.

DSF and SPR analysis of RORγ protein and BAs

Human RORγ LBD (ligand-binding domain) Ala265–Pro491 was cloned into pSJ2 and pNic_NT6HB vectors to provide constructs with an N-terminal His-tag and AVI-tag. RORγ LBD was expressed in the E. coli strain Rosetta 2 (DE3) with 0.1 mM IPTG overnight induction. The constructs were purified sequentially by affinity chromatography on Ni Sepharose resin and size exclusion on a Superdex 75 PG gel-filtration column. Purified protein constructs were concentrated and stored in a buffer with 20 mM Tris pH 8.0, 150 mM NaCl and 0.5 mM TCEP. DSF was performed using the Thermo Fisher Scientific QuantStudio 7 Flex Real-Time PCR System. Experiments were carried out in 384-well plates with 10 µl reaction volumes. Assay buffer was 20 mM HEPES (pH 7.5), 200 mM NaCl, 0.5 mM TCEP and 1% (v/v) DMSO. RORγ protein (0.04 mg ml−1, 1.35 µM) was mixed with 14-point serial dilutions of BAs ranging from 40 µM to 4.9 nM. The reactions were incubated at room temperature for 30 min before being measured. The Tm shifts were calculated using the Protein Thermal Shift software from the RT–PCR instrument. Graphs of the results were generated using GraphPad Prism with dose–response fitting. SPR was performed using the Cytiva Biacore T200 system using SA chips that immobilize the N-terminally biotinylated RORγ protein. The reaction buffer was 20 mM HEPES (pH 7.5), 200 mM NaCl, 0.5 mM TCEP and 1% (v/v) DMSO. Twelve-point serial dilutions of each BA ranging from 133 µM to 0.75 nM were injected over the chip with solvent corrections performed between each set. A flow cell left blank was used for referencing the sensorgrams. Affinity and kinetics were analysed using the instrument’s software.
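
The dilution factors are not stated explicitly, but the reported end points are consistent with a twofold series for DSF (40 µM divided by 2 to the power 13 is approximately 4.9 nM) and a threefold series for SPR (133 µM divided by 3 to the power 11 is approximately 0.75 nM). The sketch below reproduces both series under that inference; the function name serial_dilution is hypothetical.

```python
def serial_dilution(top_nM, n_points, fold):
    """Concentrations (nM) of an n-point serial dilution starting at top_nM."""
    return [top_nM / fold ** k for k in range(n_points)]


# Inferred (not stated in the text): twofold steps for DSF, threefold for SPR.
dsf_series = serial_dilution(40_000, 14, 2)
spr_series = serial_dilution(133_000, 12, 3)
```

The last element of dsf_series is about 4.88 nM and the last element of spr_series is about 0.751 nM, matching the rounded values reported above.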

Quantification of SFB and C. rodentium load

Faecal samples were disrupted in 500 μl of 143 mM Tris buffer (pH 8.0) containing 143 mM NaCl, 14.3 mM EDTA and 5.7% SDS. Phenol–chloroform–isoamyl alcohol mix (25:24:1; 500 μl) was added and vigorously mixed by vortexing. The supernatant (250 μl) was recovered after centrifugation, and bacterial genomic DNA was precipitated with 2-propanol. Pellets were rinsed with 70% ethanol, air-dried and resuspended in 500 μl Tris-EDTA buffer (pH 8.0). DNA samples (1 μl) were used for qPCR using the following primers for quantification of the bacterial load: SFB_736F, GACGCTGAGGCATGAGAGCAT; SFB_844R, GACGGCACGGATTGTTATTCA53; espB_F, ATGCCGCAGATGAGACAGTTG; espB_R, CGTCAGCAGCCTTTTCAGCTA54. Next, copy numbers of SFB 16S rRNA and C. rodentium espB genes were calculated using a standard-curve method and normalized to faecal mass.
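
A generic standard-curve calculation is sketched below for orientation. It assumes a linear relationship between Ct and log10 copy number fitted to a dilution series of a standard, and it scales the per-reaction copy number by the ratio of resuspension volume (500 μl) to template volume (1 μl) before dividing by faecal mass. The scaling step, the function names and all numerical values in the usage note are illustrative assumptions and are not taken from the study.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(ct_values)
    mx = sum(log10_copies) / n
    my = sum(ct_values) / n
    sxx = sum((x - mx) ** 2 for x in log10_copies)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    return slope, my - slope * mx


def copies_per_mg(ct, slope, intercept, faecal_mass_mg,
                  template_ul=1.0, resuspension_ul=500.0):
    """Gene copies per mg of faeces for one sample.

    Assumption: the 1 ul template is representative of the 500 ul DNA
    resuspension, and losses during extraction are ignored.
    """
    copies_in_reaction = 10 ** ((ct - intercept) / slope)
    total_copies = copies_in_reaction * resuspension_ul / template_ul
    return total_copies / faecal_mass_mg
```

For a hypothetical standard series with Ct values of 30.0, 26.7, 23.4 and 20.1 at 10^3 to 10^6 copies, a sample with Ct 26.7 from 50 mg of faeces would correspond to roughly 10^5 copies per mg under these assumptions.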

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.