Virtually every environment on Earth is teeming with microbial life, from the human digestive tract to hydrothermal vents miles beneath the ocean’s surface. These microorganisms are vital components of natural ecosystems: microbial activity drives Earth’s biogeochemical cycles1, fertilizes crops2 and directly influences human health and well-being3. Those functions are typically performed not by a single species, but rather by a diverse community composed of numerous interacting species. For example, there is a growing realization that numerous human illnesses, such as inflammatory bowel disease, are associated with an altered microbial community, rather than with any single pathogen4. The ability to predict the structure of these complex, multispecies communities is crucial for understanding how such communities form and function, managing natural communities and rationally designing functional communities de novo5–7.

Modelling and predicting microbial community structure is often pursued using bottom-up approaches that assume that species interact in a pairwise manner8–11. However, pair interactions may be modulated by the presence of additional species12,13, an effect that can significantly alter community structure14 and may be common in microbial communities15. While it has been shown that pairwise models can provide a reasonable fit to sequencing data of intestinal microbiomes16,17, their predictive power remains uncertain, as it has rarely been directly tested experimentally (refs 18,19 are notable exceptions).

Current approaches to modelling microbial communities commonly employ a specific parametric model, such as the generalized Lotka–Volterra (gLV) model20–22. Generating predictions from such models requires fitting a large number of parameter values from empirical data, which is often challenging and prone to over-fitting. In addition, the exact form of the interactions needs to be assumed, and a failure of the model can reflect a misspecification of the type of pairwise interaction, rather than the presence of higher-order interactions23.

Here, we take an alternative approach in which qualitative information regarding the survival of species in competitions between small sets of species (for example, pairwise competitions) is used to predict survival in more diverse multispecies competitions (Fig. 1). While this approach forgoes the ability to predict exact species abundances, it does not require specification and parameterization of the exact form of interactions. Therefore, it is robust to model misspecification, and requires only survival data, which can be more readily obtained than exact parameter values.

Figure 1: A bottom-up approach to predicting community composition from qualitative competitive outcomes.

a,b, Qualitative information regarding the survival of species in competitions between small sets of species, such as pairwise competitions (a), is used to predict survival in more diverse multispecies competitions, such as trio competitions (b). The particular pairwise outcomes illustrated here reflect the true outcomes observed experimentally in one set of three species (see Fig. 3b).

Intuitively, competitions typically result in the survival of a set of coexisting species, which cannot be invaded by any of the species that went extinct during the competition. To identify sets of species that are expected to coexist and exclude additional species, we first use the outcomes of pairwise competitions. We propose the following assembly rule: in a multispecies competition, species that all coexist with each other in pairs will survive, whereas species that are excluded by any of the surviving species will go extinct. This rule formalizes an intuitive expectation regarding how communities may assemble, and can be used to systematically predict community structure from pairwise outcomes (Methods and Supplementary Fig. 1). Importantly, the rule predicts the likely outcomes of competition, rather than the only possible ones. For example, for a limited range of parameter values, even the simple gLV model can generate outcomes that are inconsistent with this assembly rule24.

Results

To directly assess the predictive power of this approach, we used a set of eight heterotrophic soil-dwelling bacterial species as a model system (Fig. 2a and Methods). Competition experiments were performed by co-inoculating species at varying initial fractions, and propagating them through five growth–dilution cycles (Supplementary Fig. 2). During each cycle, cells were cultured for 48 h and then diluted by a factor of 1,500 into fresh media, which corresponds to ~10.6 cellular divisions per growth cycle and ~53 cellular divisions over the entire competition period. The overall competition time was chosen such that species extinctions would have sufficient time to occur, while new mutants would typically not have time to arise and spread. Community compositions were assessed by measuring the culture optical density, as well as by plating on solid agar media and counting colonies, which are distinct for each species25. These two measurements quantify the overall abundance of microorganisms in the community and the relative abundances of individual species, respectively. All experiments were carried out in duplicate.
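The number of divisions quoted above follows from the dilution factor. The short sketch below reproduces the arithmetic, assuming that each culture regrows to roughly the same density by the end of every cycle; the function name is illustrative only.

```python
import math

DILUTION_FACTOR = 1500  # fold dilution at the end of each cycle (from the text)
N_CYCLES = 5            # number of growth-dilution cycles (from the text)


def divisions_per_cycle(dilution_factor: float) -> float:
    """Doublings needed to regrow a culture after a given fold dilution.

    Assumes the culture returns to its pre-dilution density each cycle.
    """
    return math.log2(dilution_factor)


per_cycle = divisions_per_cycle(DILUTION_FACTOR)
total = N_CYCLES * per_cycle
print(f"{per_cycle:.1f} divisions per cycle, {total:.0f} in total")
```

Under that assumption, the result agrees with the ~10.6 divisions per cycle and ~53 divisions overall stated in the text.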

Figure 2: Pairwise competitions resulted in stable coexistence or competitive exclusion.

a, Phylogenetic tree of the set of eight species used in this study. The tree is based on the full 16S gene and the branch lengths indicate the number of substitutions per base pair. b, Coexistence was observed for 19 of the 28 pairs, whereas competitive exclusion was observed for 9 of the 28 pairs. c, Changes in relative abundance over time in one pair where competitive exclusion occurred and one coexisting pair. The y axis indicates the fraction of one of the competing species. In the exclusion example (right panel), the species fraction increased for all initial conditions, resulting in the exclusion of the competitor. In contrast, in the coexistence case (left panel), fractions converged to an intermediate value and both species were found at the end of the competition. Blue and red arrows to the right indicate the qualitative competitive outcome, with the star marking the final fraction in the case of coexistence. Error bars represent the standard deviation of the posterior beta distribution of the fractions, based on colony counts averaged across replicates. d, Network diagram of the outcomes of all pairwise competitions.
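As an illustration of how error bars of this kind can be obtained from colony counts, the sketch below computes the mean and standard deviation of a beta posterior for a species fraction. The uniform Beta(1, 1) prior and the function name are assumptions made for this example; the caption does not state which prior was used.

```python
import math


def beta_posterior_mean_sd(count, total, prior_a=1.0, prior_b=1.0):
    """Posterior mean and s.d. of a species fraction from colony counts.

    count: colonies of the focal species; total: all colonies counted.
    The Beta(prior_a, prior_b) prior is an assumption (uniform by default).
    """
    a = prior_a + count
    b = prior_b + (total - count)
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, sd


# Hypothetical plate: 50 of 100 colonies belong to the focal species.
mean, sd = beta_posterior_mean_sd(50, 100)
print(f"fraction = {mean:.3f} +/- {sd:.3f}")
```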

Pairwise competitions resulted in stable coexistence or competitive exclusion of one of the species. We performed competitions between all species pairs and found that in the majority of the pairs (19/28 = 68%, Fig. 2b) both species could invade each other, and thus stably coexisted. In the remaining pairs (9/28 = 32%) competitive exclusion occurred, where only one species could invade the other. Time trajectories from one coexisting pair and one pair where exclusion occurred are shown in Fig. 2c, and outcomes for all pairs are shown in Fig. 2d. Species’ growth rate in monoculture was correlated with their average competitive ability, but, in line with previous reports26, it did not reliably predict the outcome of specific pair competitions (Supplementary Fig. 3).

Next, we measured the outcome of competition between all 56 three-species combinations. These competitions typically resulted in a stable community whose composition was independent of the starting fractions (Supplementary Table 1). However, 2 of the 56 trios displayed inconsistent results with high variability between replicates. This variability probably resulted from rapid evolutionary changes that occurred during the competition (Supplementary Fig. 4). All but one of the other trio competitions resulted in stable communities with a single outcome, independent of starting conditions. This raises the question of whether this unique outcome could be predicted based on the experimentally observed outcomes of the pairwise competitions.

Trios were grouped by the topology of their pairwise outcome network, which was used to predict their competitive outcomes. The most common topology involved two coexisting pairs, and a pair where competitive exclusion occurs (30/56 = 54%). To illustrate this scenario, consider a set of three species, labelled A, B and C, where species A and C coexist with B in pairwise competitions, whereas C is excluded when competing with A. In this case, our proposed assembly rule predicts that the trio competition will result in the survival of species A and B, and exclusion of C (Fig. 3a). This predicted outcome occurred for the majority of the experimentally observed trios (Fig. 3b), but some trio competitions resulted in less intuitive outcomes (Fig. 3c). For example, 1 of the 30 trios with this topology led to the extinction of A and the coexistence of B and C (Fig. 3c). The experimentally observed outcomes of competition in this trio topology highlight that our simple assembly rule typically works, and the failures provide a sense of alternative outcomes that are possible given the same underlying topology of pairwise outcomes. Unpredicted outcomes may occur due to several mechanisms, which are considered in the Discussion.

Figure 3: Observed and predicted outcomes of trio competitions.

Changes in species fraction were measured over time for several trio competitions. a–c, Trios involving two coexisting pairs and one pair where competitive exclusion occurs. In these plots, each triangle is a simplex denoting the fractions of the three competing species. The simplex vertices correspond to a community composed solely of a single species, whereas edges correspond to a two-species mixture. The edges thus denote the outcomes of pair competitions, which were performed separately. Trajectories (grey arrows) begin at different initial compositions, and connect the species fractions measured at the end of each growth cycle. Dots mark the final community compositions. a, Schematic example, showing that only species A and B are predicted to coexist for this pattern of pairwise outcomes. b, Example of a trio competition that resulted in the predicted outcome. c, An example of an unpredicted outcome. d–f, Similar to a–c, but for trios where all species coexist in pairs. g–k, All trio layouts and outcomes, grouped by the topology of the pairwise outcomes network. With the exception of one trio, all trio competitions resulted in a unique outcome. Dots denote the final community composition (not exact species fractions, but rather species survivals). One trio displayed bistability, which is indicated by two dots representing the two possible outcomes. Two trios displayed inconsistent results with high variability between replicates, which is indicated by a question mark.

Another frequent topology was coexistence between all three species pairs (15/56 = 27%), in which case none of the species is predicted to be excluded in the trio competition (Fig. 3d). Such trio competitions resulted in either the coexistence of all three species, as predicted by our assembly rule (Fig. 3e), or the exclusion of one of the species (Fig. 3f). Overall, 5 different trio layouts, and 11 competitive outcomes were observed (Fig. 3g–k). Notably, all observed trio outcomes across all topologies can be generated from simple pairwise interactions, including the outcomes that were not correctly predicted by our assembly rule24. An incorrect prediction of our simple assembly rule is therefore not necessarily caused by higher-order interactions.

Overall, survival in three-species competitions was well predicted by pairwise outcomes. The assembly rule predicted species survival across all the three-way competitions with an 89.5% accuracy (Fig. 4a), where accuracy is defined as the fraction of species whose survival was correctly predicted. To gauge how this accuracy compares with what is attainable when pairwise outcomes are unknown, we considered a null model in which the only information available is the average probability that a species will survive in a trio competition (note that our simple assembly rule does not assume that this probability is available). Using this information, trio outcomes could only be predicted with 72% accuracy (Fig. 4a and Methods). We further compared the observed accuracy to the accuracy expected when species interact solely in a pairwise manner, according to the gLV equations with a random interaction matrix (Methods). We found that the observed accuracy is consistent with the accuracy obtained in simulations of competitions that parallel our experimental setup (P = 0.29, Fig. 4b). Survival of species in pairwise competition is therefore surprisingly effective in predicting survival when species undergo trio competition.

Figure 4: Survival in trio competitions is well predicted by pairwise outcomes.

a, Prediction accuracy of the assembly rule and the null model, where predictions are made solely based on the average probability that species survive in trio competitions. b, The distribution of accuracies of predictions made using the assembly rule from gLV simulations that mirror our experimental design. The experimentally observed accuracy is consistent with those found in the simulations.

Nonetheless, there are exceptional cases where qualitative pairwise outcomes are not sufficient to predict competitive outcomes of trio competitions. Accounting for such unexpected trio outcomes may improve prediction accuracy for competitions involving a larger set of species. We encode unexpected trio outcomes by creating effective modified pairwise outcomes, which replace the original outcomes in the presence of an additional species. For example, competitive exclusion will be modified to an effective coexistence when two species coexist in the presence of a third species despite one of them being excluded from the pair competition. The effective, modified outcomes can be used to make predictions using the assembly rule as before (Methods and Supplementary Fig. 1). By accounting for unexpected trio outcomes, the assembly rule extends our intuition, and predicts community structure in the presence of potentially complex interactions.

The ability of the assembly rule to predict the outcomes of more diverse competitions was assessed by measuring survival in competitions between all seven-species combinations, as well as the full set of eight species (Fig. 5a). Using only the pairwise outcomes, survival in these competitions could only be predicted with an accuracy of 62.5%, which is barely higher than the 61% accuracy obtained when using only the average probability that a species will survive these competitions (Fig. 5b). A considerably improved prediction accuracy of 86% was achieved by incorporating information regarding the trio outcomes (Fig. 5b). As in the trio competitions, the observed accuracies are consistent with those obtained in gLV simulations that parallel the experimental setup, both when predicting from pairwise outcomes alone (P = 0.53) and when combining them with trio outcomes (P = 0.21, Fig. 5c).

Figure 5: Predicting survival in more diverse competition required incorporating the outcomes of the trio competitions.

a, Species survival when competing all eight species, and all sets of seven species. Filled and empty squares indicate survival and extinction, respectively. Survival is predicted either using only pair outcomes, or using both pair and trio outcomes. b, Prediction accuracy of the null model and the assembly rule, using either pair outcomes only, or pair and trio outcomes. c, The distribution of accuracies of prediction made using the assembly rule from gLV simulations that mirror our experimental design. In these simulations, predictions were made using either pair outcomes only, or pair and trio outcomes. In both cases, the experimentally observed accuracies are consistent with those found in the simulations.

Discussion

Our assembly rule makes predictions that match our intuition, but there are several conditions under which these predictions may be inaccurate. First, community structure can be influenced by initial species abundances27, as has recently been demonstrated in pairwise competitions between bacteria of the genus Streptomyces28. Our assembly rule may be able to correctly predict the existence of multiple stable states, as it identifies all putative sets of coexisting, non-invasible species in a given species combination. However, we did not have sufficient data to evaluate the rule’s accuracy in such cases, as multistability was observed in only one of all our competition experiments.

Complex ecological dynamics, such as oscillations and chaos, can also have a significant impact on species survival29,30, making it difficult to predict the community structure. These dynamics can occur even in simple communities containing only a few interacting species. For example, oscillatory dynamics occur in gLV models of competition between as few as three species24, and have been experimentally observed in a cross-protection mutualism between a pair of bacterial strains31. In contrast, our competitions predominantly resulted in a unique and stable final community. This occurred despite the fact that we observed complex interspecies interactions involving interference competition and facilitation (Supplementary Fig. 4). These results indicate that complex ecological dynamics may in fact be rare, though it remains to be seen whether they become more prevalent in more diverse assemblages. Relatedly, prediction is challenging in the presence of competitive cycles (for example, ‘rock–paper–scissors’ interactions), which often lead to oscillatory dynamics, and are thought to increase species survival and community diversity32,33. Such non-hierarchical relationships are absent from our competitive network, and thus their effect cannot be evaluated here.

In the absence of multistability or complex dynamics, our approach may still fail when competitive outcomes do not provide sufficient information regarding the interspecies interactions. This could be due to higher-order interactions, which only manifest in the presence of additional species, or because only qualitative information regarding survival is utilized. The observed accuracy of the assembly rule was consistent with the one found in gLV simulations, but this does not necessarily indicate that our species interact in a linear, pairwise fashion. In fact, fitting the gLV model directly to our pairwise data does not improve predictability (Supplementary Fig. 6). Determining whether, in any particular competition, predictions fail due to insufficient information regarding the strength of linear, nonlinear or higher-order interactions will require more detailed measurements.

Controlling and designing microbial communities has numerous important applications, ranging from probiotic therapeutics to bioremediation and biomanufacturing5. The ability to predict what community will be formed by a given set of species is crucial for determining how extinctions and invasions will affect existing communities, and for engineering desired communities. Our results suggest that, when measured in the same environment, community structure can be predicted from the outcomes of competitions between small sets of species, demonstrating the feasibility of a bottom-up approach to understanding and predicting community structure. While these results are encouraging, they were obtained using a small set of closely related species in well-controlled laboratory settings. It remains to be seen to what extent these results hold in other systems and in more natural settings, involving more diverse assemblages that contain additional trophic levels, in the presence of spatial structure and over evolutionary timescales.

Methods

Species and media

The eight soil bacterial species used in this study were Enterobacter aerogenes (Ea, ATCC#13048), Pseudomonas aurantiaca (Pa, ATCC#33663), Pseudomonas chlororaphis (Pch, ATCC#9446), Pseudomonas citronellolis (Pci, ATCC#13674), Pseudomonas fluorescens (Pf, ATCC#13525), Pseudomonas putida (Pp, ATCC#12633), Pseudomonas veronii (Pv, ATCC#700474) and Serratia marcescens (Sm, ATCC#13880). All species were obtained from ATCC. The base growth media was M9 minimal media25, which contained 1× M9 salts (Sigma Aldrich, M6030), 2 mM MgSO4, 0.1 mM CaCl2 and 1× trace metals (Teknova, T1001). For the final growth media, the base media was supplemented with 1.6 mM galacturonic acid and 3.3 mM serine as carbon sources, which correspond to approximately 10 mM of carbon for each of these substrates. These carbon sources were chosen from a set of carbon sources commonly used to characterize soil microorganisms (Biolog, EcoPlate) to ensure that each of the eight species survives in monoculture. Nutrient broth (0.3% yeast extract, 0.5% peptone) was used for initial inoculation and growth before experiments. Plating was done on 10 cm Petri dishes containing 25 ml of nutrient agar (nutrient broth with 1.5% agar added).

Competition experiments

Frozen stocks of individual species were streaked out on nutrient agar Petri plates, grown at room temperature for 48 h and then stored at 4 °C for up to two weeks. Before competition experiments, single colonies were picked and each species was grown separately in 50 ml Falcon tubes, first in 5 ml nutrient broth for 24 h and next in 5 ml of the experimental M9 media for 48 h. During the competition experiments, cultures were grown in Falcon flat-bottom 96-well plates (BD Biosciences), with each well containing a 150 μl culture. Plates were incubated at 25 °C without shaking, and were covered with a lid and wrapped in Parafilm. For each growth–dilution cycle, the cultures were incubated for 48 h and then serially diluted into fresh growth media by a factor of 1,500.

Initial species mixtures were prepared by diluting each species separately to an optical density (OD) of 3 × 10⁻⁴. Different species were then mixed by volume to the desired composition. This mixture was further diluted to an OD of 10⁻⁴, from which all competitions were initialized. For each set of competing species, competitions were conducted from all the initial conditions in which each species was present at 5%, except for one more abundant species. For example, for each species pair there were two initial conditions with one species at 95% and the other at 5%, whereas for the eight species competition there were eight initial conditions each with a different species at 65% and the rest at 5%. For a few species pairs (Fig. 2a,b), we conducted additional competitions starting at different initial conditions. All experiments were carried out in duplicate.
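The set of initial conditions described above can be enumerated programmatically. The following is a minimal sketch; the function name and the dictionary representation of a mixture are choices made for this illustration.

```python
def initial_conditions(species, minority_fraction=0.05):
    """Enumerate starting mixtures: one dominant species, all others at 5%.

    Returns one dict (species -> starting fraction) per choice of
    dominant species, so a set of n species yields n initial conditions.
    """
    n = len(species)
    majority_fraction = 1.0 - minority_fraction * (n - 1)
    return [
        {s: (majority_fraction if s == dominant else minority_fraction)
         for s in species}
        for dominant in species
    ]


# Species abbreviations are taken from the Methods.
for mixture in initial_conditions(["Ea", "Pa"]):
    print(mixture)
```

For a pair this gives the two 95%/5% mixtures, and for all eight species it gives eight mixtures with the dominant species at 65%.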

Measurement of cell density and species fractions

Cell densities were assessed by measuring OD at 600 nm using a Varioskan Flash plate reader. Relative abundances were measured by plating on nutrient agar plates. Each culture was diluted by a factor between 10⁵ and 10⁶ in phosphate-buffered saline, depending on the culture’s OD. For each diluted culture, 75 μl was plated onto an agar plate. Colonies were counted after 48 h incubation at room temperature. A median of 85 colonies were counted per plate. To determine species extinction in competition between a given set of species, we combined all replicates and initial conditions from that competition, and classified as extinct any species whose median abundance was less than 1%, which is just above our limit of detection.
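The extinction criterion can be expressed compactly in code. The sketch below assumes that "abundance" refers to the relative abundance (fraction) of each species in a sample, and that the samples passed in already pool all replicates and initial conditions; the function name and data layout are illustrative.

```python
from statistics import median


def extinct_species(samples, threshold=0.01):
    """Return the species whose median final fraction is below `threshold`.

    samples: list of dicts (species -> final fraction), pooling all
    replicates and initial conditions of one competition.
    """
    species = samples[0].keys()
    return {
        s for s in species
        if median(sample[s] for sample in samples) < threshold
    }


# Hypothetical final fractions from three pooled samples of one trio.
samples = [
    {"A": 0.60, "B": 0.40, "C": 0.000},
    {"A": 0.70, "B": 0.295, "C": 0.005},
    {"A": 0.50, "B": 0.48, "C": 0.020},
]
print(extinct_species(samples))
```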

Assembly rule predictions and accuracy

For any group of competing species, predictions were made by considering all possible competitive outcomes (for example, survival of any single species, any species pair, and so on). An outcome was predicted to be possible if it was consistent with our assembly rule (Supplementary Fig. 1). For any given competition, there may be several such feasible outcomes; however, a unique outcome was predicted for all our competition experiments.
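One way to read this procedure is as an enumeration of all candidate survivor sets, keeping those in which every pair of members coexists and every non-member is excluded by at least one member. The sketch below implements that reading; it is an interpretation of the rule as stated in the text, not the authors' released code, and the function names and the encoding of pairwise outcomes (the string "coexist" or the name of the winner) are assumptions made for this example.

```python
from itertools import combinations


def coexist(a, b, outcomes):
    """True if species a and b coexist in pairwise competition."""
    return outcomes[frozenset((a, b))] == "coexist"


def excludes(winner, loser, outcomes):
    """True if `winner` competitively excludes `loser` in a pair."""
    return outcomes[frozenset((winner, loser))] == winner


def predicted_survivors(species, outcomes):
    """List every survivor set that is consistent with the assembly rule."""
    feasible = []
    for size in range(1, len(species) + 1):
        for subset in combinations(species, size):
            members = set(subset)
            members_coexist = all(
                coexist(a, b, outcomes) for a, b in combinations(subset, 2)
            )
            outsiders_excluded = all(
                any(excludes(m, s, outcomes) for m in members)
                for s in species if s not in members
            )
            if members_coexist and outsiders_excluded:
                feasible.append(members)
    return feasible


# The A, B, C example from the Results: A and C each coexist with B,
# whereas C is excluded by A.
outcomes = {
    frozenset(("A", "B")): "coexist",
    frozenset(("B", "C")): "coexist",
    frozenset(("A", "C")): "A",
}
print(predicted_survivors(["A", "B", "C"], outcomes))
```

For this topology the only feasible set contains A and B, matching the prediction described in the Results. Exhaustive enumeration scales exponentially with the number of species, which is unproblematic for sets of eight.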

Pairwise outcomes were modified using trio outcomes as follows. Exclusion was replaced with coexistence for pairs that coexisted in the presence of any additional species. Coexistence was replaced with exclusion whenever a species went extinct in a trio competition with two species with which it coexisted when competed in isolation. Only modifications caused by the surviving species or an invading species were considered. Therefore, a new set of modified pairwise outcomes was generated for each putative set of surviving species being evaluated.

The prediction accuracy was defined as the fraction of species whose survival was correctly predicted. When the assembly rule identified multiple possible outcomes, which occurred only in the gLV simulations, accuracy was averaged over all such feasible outcomes. In addition, when the competitive outcome depended on the initial condition, accuracy was averaged across all initial conditions.
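The accuracy measure can be sketched as follows, including the averaging over multiple feasible outcomes; averaging across initial conditions would repeat the same calculation for each observed outcome. The function name and the representation of outcomes as sets of surviving species are assumptions of this example.

```python
def prediction_accuracy(species, feasible_outcomes, observed_survivors):
    """Fraction of species whose survival or extinction was predicted correctly.

    feasible_outcomes: list of predicted survivor sets (usually one).
    When several are given, accuracy is averaged over them.
    """
    def accuracy_of(predicted):
        correct = sum(
            (s in predicted) == (s in observed_survivors) for s in species
        )
        return correct / len(species)

    scores = [accuracy_of(predicted) for predicted in feasible_outcomes]
    return sum(scores) / len(scores)


# Hypothetical trio: A and B predicted to survive, but B and C observed.
print(prediction_accuracy(["A", "B", "C"], [{"A", "B"}], {"B", "C"}))
```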

For reference, we computed the accuracy of predictions made based on the probability that a species will survive a competition between the same number of species. For example, for predicting trio outcomes, we used the proportion of species that survived, averaged across all trio competitions. Using this information, the highest accuracy would be achieved by predicting that all species survive in all competitions, if the average survival probability is >0.5, and predicting that all species go extinct otherwise.
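The null-model accuracy follows directly from the average survival probability: if that probability is p, predicting that every species survives is correct for a fraction p of species, and predicting that every species goes extinct is correct for a fraction 1 − p. A minimal sketch, with an illustrative function name and hypothetical inputs:

```python
def null_model_accuracy(n_competitors, n_survivors):
    """Best accuracy attainable from the average survival probability alone.

    n_competitors: number of species entering each competition.
    n_survivors: number of species surviving each competition.
    """
    p_survive = sum(n_survivors) / sum(n_competitors)
    if p_survive > 0.5:
        return p_survive        # predict that every species survives
    return 1.0 - p_survive      # predict that every species goes extinct


# Hypothetical data: four trio competitions with 2, 2, 3 and 1 survivors.
print(null_model_accuracy([3, 3, 3, 3], [2, 2, 3, 1]))
```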

Simulated competitions

To assess the assembly rule’s expected accuracy in a simple case in which species interact in a purely pairwise manner, we simulated competitions using gLV dynamics:

$$\dot{x}_i = r_i x_i \left( 1 - x_i - \sum_{j \neq i} \alpha_{ij} x_j \right) \qquad (1)$$

where x_i is the density of species i (normalized to its carrying capacity), r_i is the species’ intrinsic growth rate and α_ij is the interaction strength between species i and j. For each simulation, we created a set of species with random interactions where the α_ij parameters were independently drawn from a normal distribution with a mean of 0.6 and a standard deviation of 0.46. Results were insensitive to variations in growth rates, thus they were all set to 1 for simplicity. These parameters recapitulate the proportions of coexistence and competitive exclusion observed in our experiments, and yield a distribution of trio layouts similar to the observed one (Supplementary Fig. 7). The probability of generating bistable pairs in these simulations is low (~3.7%, corresponding to one bistable pair in a set of eight species), and we further excluded the bistable pairs that were occasionally generated by chance, as we had not observed any such pairs in the experiments.
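The link between the interaction-strength distribution and the pairwise outcomes can be made explicit. The sketch below assumes that the interaction terms act as competition coefficients, so that species i can invade a resident species j at carrying capacity only if the coefficient describing the effect of j on i is below 1 (the standard two-species Lotka–Volterra criterion). Function names are illustrative, and the resampling of bistable pairs mentioned above is not implemented.

```python
import math
import random

ALPHA_MEAN, ALPHA_SD = 0.6, 0.46  # distribution parameters from the text


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def pair_outcome_probabilities(mean=ALPHA_MEAN, sd=ALPHA_SD):
    """Expected frequency of each pairwise outcome for random interactions."""
    p_invade = normal_cdf(1.0, mean, sd)  # P(alpha < 1): invasion succeeds
    return {
        "coexistence": p_invade ** 2,
        "exclusion": 2 * p_invade * (1 - p_invade),
        "bistability": (1 - p_invade) ** 2,
    }


def random_interaction_matrix(n, rng, mean=ALPHA_MEAN, sd=ALPHA_SD):
    """n x n matrix of interaction strengths with a zero diagonal."""
    return [[0.0 if i == j else rng.gauss(mean, sd) for j in range(n)]
            for i in range(n)]


def classify_pair(alpha, i, j):
    """Return 'coexist', 'bistable' or the index of the excluding species."""
    i_invades = alpha[i][j] < 1.0  # i grows when rare in a culture of j
    j_invades = alpha[j][i] < 1.0
    if i_invades and j_invades:
        return "coexist"
    if not i_invades and not j_invades:
        return "bistable"
    return i if i_invades else j


probabilities = pair_outcome_probabilities()
print({name: round(p, 3) for name, p in probabilities.items()})
alpha = random_interaction_matrix(8, random.Random(0))
print(classify_pair(alpha, 0, 1))
```

Under these assumptions the bistability probability is close to the ~3.7% quoted above, and the coexistence and exclusion frequencies are close to the experimentally observed 68% and 32%.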

The accuracy of the assembly rule in gLV systems was estimated by running simulations that parallel our experimental setup. A set of eight species with random interaction coefficients was generated, and the pairwise outcomes were determined according to their interaction strengths. These outcomes were used to generate predictions for the trio competitions using our assembly rule. Next, all three-species competitions were simulated with the same set of initial conditions used in the experiments. Finally, the predicted trio outcomes were compared with the simulation outcomes across all trios to determine the prediction accuracy. Thus, a single accuracy value was recorded for each set of eight simulated species. Similarly, for each simulated eight-species set, the pair and trio outcomes were used to generate predictions for the seven-species and eight-species competitions, and their accuracy was assessed by comparing them to the outcomes of simulated competitions. Prediction accuracy distributions were estimated using Gaussian kernel density estimation from the accuracy values of 100 simulated sets of eight species.
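A single simulated competition can be sketched with a simple numerical integrator. The example below makes several assumptions that are not specified in the text: forward-Euler integration with a fixed step, competitive interaction terms that reduce growth, continuous growth without explicit dilution cycles, an illustrative total starting density, and a 1% relative-abundance threshold for survival borrowed from the experimental criterion. All growth rates are set to 1, as in the text.

```python
def simulate_glv(alpha, x0, dt=0.01, t_max=200.0, survival_fraction=0.01):
    """Integrate gLV competition and return the indices of surviving species.

    alpha[i][j] is the competitive effect of species j on species i.
    x0 holds starting densities, normalized to carrying capacity.
    """
    n = len(x0)
    x = list(x0)
    for _ in range(int(t_max / dt)):
        growth = [
            x[i] * (1.0 - x[i]
                    - sum(alpha[i][j] * x[j] for j in range(n) if j != i))
            for i in range(n)
        ]
        # Euler step; densities are clipped at zero to stay non-negative.
        x = [max(x[i] + dt * growth[i], 0.0) for i in range(n)]
    total = sum(x)
    return {i for i in range(n) if x[i] / total >= survival_fraction}


# Hypothetical pair in which species 0 excludes species 1, started from
# both 5%/95% mixtures at a low total density (0.01 is illustrative).
alpha = [[0.0, 0.5],
         [1.5, 0.0]]
for fractions in ([0.05, 0.95], [0.95, 0.05]):
    x0 = [0.01 * f for f in fractions]
    print(simulate_glv(alpha, x0))
```

Running the same function over all trios of a random eight-species interaction matrix, and comparing the survivors with the assembly-rule predictions, would yield one accuracy value per simulated species set.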

One-sided P-values evaluating the consistency of the experimentally observed accuracies with the simulation results were defined as the probability that a simulation would yield an accuracy that is at least as high as the experimentally observed one.
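A P-value of this kind can be estimated empirically as the fraction of simulated accuracies that are at least as high as the observed one. The sketch below uses that empirical fraction, which is an assumption; the text does not state whether the probability was computed from the raw simulation values or from the kernel density estimate. The function name and the accuracy values are hypothetical.

```python
def one_sided_p_value(simulated_accuracies, observed_accuracy):
    """Fraction of simulations with accuracy >= the observed accuracy."""
    at_least_as_high = sum(
        a >= observed_accuracy for a in simulated_accuracies
    )
    return at_least_as_high / len(simulated_accuracies)


# Hypothetical accuracies from four simulated species sets.
print(one_sided_p_value([0.80, 0.85, 0.90, 0.95], 0.90))
```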

Code availability

An implementation of the assembly rule and the gLV simulations as well as routines for evaluating the rule’s accuracy are freely available online at https://bitbucket.org/yonatanf/assembly-rule.

Data availability

The data that support the findings of this study are available from the corresponding authors on reasonable request.

Additional information

How to cite this article: Friedman, J., Higgins, L. M. & Gore, J. Community structure follows simple assembly rules in microbial microcosms. Nat. Ecol. Evol. 1, 0109 (2017).