Introduction

The burden of cancer incidence and mortality is growing rapidly worldwide, owing to the aging of the world population and improvements in life expectancy, particularly in middle- and high-income countries [1].

Positive behavior changes can substantially reduce cancer burden [2, 3]. In relation to diet, the most recent report by the World Cancer Research Fund International (WCRF) recommends consuming a diet rich in whole grains, vegetables, fruit, and beans, with a sub-goal of consuming at least 30 g/day of dietary fiber from whole foods [4]. Pulses/legumes (i.e. the dry edible seeds of non-oilseed legumes, like dry beans, chickpeas, dry peas and lentils) are an excellent source of protein, carbohydrates, fatty acids and dietary fiber [5, 6]. They also contain several non-nutrients that have been shown to have interactive bioactive properties [5,6,7].

Despite the potential benefit of legumes, the evidence on the relationship between legume consumption and risk of specific cancer sites is limited and inconclusive [7,8,9,10,11]. Most studies used fiber intake (including those from legumes) or an overall dietary pattern, including legumes, as exposure, whereas only a few of them evaluated the association between legume consumption and cancer risk [7, 11,12,13,14]. This led the 2018 WCRF report to define the impact of legumes on the risk of the three most common cancers (breast, colorectal and prostate) as “limited-no conclusion” [4] indicating a need for more robust studies focusing specifically on legume consumption.

The aim of this study was therefore to quantify the role of legumes on the risk of cancer at several sites using an integrated network of case-control studies.

Materials and methods

Study design and population

This work is based on data from a series of case-control studies conducted between 1991 and 2009 in various areas of northern (provinces of Pordenone, Gorizia, Padua, Forlì and in the urban areas of Milan and Genoa), central (the provinces of Rome and Latina) and southern (the urban area of Naples) Italy, and in the Canton of Vaud, Switzerland [15].

The original studies were conducted using comparable study designs, inclusion criteria and data collection tools. They enrolled incident, histologically confirmed cases of oral cavity, esophageal, stomach, colorectal, larynx, breast, endometrial, ovarian, prostate and kidney cancers, diagnosed within one year from the interview. Each study enrolled controls in the same hospitals, among patients admitted for acute, non-neoplastic conditions unrelated to smoking, alcohol consumption or long-term modification of diet (i.e. traumas, other orthopedic disorders, acute surgical conditions and miscellaneous other illnesses, including eye, nose, ear, skin or dental disorders). In some studies, controls were frequency-matched by sex, age group and area of residence, while no studies used an individual matching design.

The original studies were conducted to evaluate the association between lifestyle factors, such as smoking, alcohol intake and dietary habits, and cancer risk. For that purpose, a case-control design is particularly efficient, as it considerably reduces the time needed to collect cases relative to the latency between the exposure and the occurrence of the disease.

Data from twelve studies that gathered information on legume consumption and also collected comprehensive dietary information, enabling the calculation of energy intake, were incorporated into this study [16,17,18,19,20,21,22,23,24,25,26,27]. These comprised two studies on oral and pharyngeal cancers [16, 17], two studies on esophageal cancers [18, 19], and one study each on stomach cancer [20], colorectal cancer [21], laryngeal cancer [22], breast cancer [23], endometrial cancer [24], ovarian cancer [25], prostate cancer [26] and kidney cancer [27]. Subjects with unreliable energy intakes, defined as intakes <500 kcal/day or >5000 kcal/day, were excluded from the analysis. Further details on the cases and controls enrolled in each study are reported in the Supplementary Information.

Data collection

Trained interviewers asked participants to report sociodemographic information, height, weight, smoking habits, food and beverage consumption (including alcoholic beverages), physical activity, medical history, and family history of cancer. Information was collected using a structured questionnaire which included a validated food frequency questionnaire (FFQ) evaluating portion sizes and frequency of consumption of 78 foods, food groups or recipes [28]. Consumption of food and beverages referred to the year preceding the hospital admission.

The FFQ contained a single question on legume consumption, which included both fresh and dried legumes. Participants were asked to report the size of the portion consumed (small, medium, large), assuming a medium portion of 100 grams for fresh legumes and 40 grams for dried legumes. Small and large portions were set to 0.66 and 1.33 times the medium portion, respectively. Frequency of consumption was collected as the number of portions per week. Legume consumption was then expressed as the number of medium portions consumed in a week, and used in the analysis either as a continuous variable or categorized into 3 levels of consumption, i.e. <1, 1 or ≥2 portions per week.
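The conversion described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names are hypothetical, and the handling of values between 1 and 2 medium portions (assigned here to the "1 portion" level) is an assumption, since the text does not state the exact cut-offs.

```python
# Hypothetical sketch of the portion conversion described in the text.
# Scaling factors for small and large portions are taken from the paper.
PORTION_FACTOR = {"small": 0.66, "medium": 1.0, "large": 1.33}


def medium_portions_per_week(portion_size: str, portions_per_week: float) -> float:
    """Express weekly legume intake as a number of medium portions."""
    return PORTION_FACTOR[portion_size] * portions_per_week


def consumption_level(medium_portions: float) -> str:
    """Assign one of the three consumption levels used in the analysis.

    Assumption: values in [1, 2) are assigned to the "1" level.
    """
    if medium_portions < 1:
        return "<1"
    if medium_portions < 2:
        return "1"
    return ">=2"


if __name__ == "__main__":
    # Illustrative participant: two large portions per week.
    weekly = medium_portions_per_week("large", 2)
    print(weekly, consumption_level(weekly))
```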

Statistical analysis

The association between legume consumption and different cancer sites was evaluated by the odds ratio (OR) and corresponding 95% confidence intervals (CI), which were estimated through multiple logistic regression models. The ORs were estimated for different levels of legume consumption, including at least 1 or ≥2 portions per week, with the reference category being <1 portion per week. We also estimated the OR for each additional portion consumed per week. To capture any potential nonlinear relationship, the number of legume portions per week was incorporated in the model as a natural cubic spline with three equally-spaced knots positioned at the quartiles of the distribution.
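As a reminder of how an OR and its CI follow from a fitted logistic regression, the sketch below exponentiates a coefficient and its Wald limits. The use of Wald intervals is an assumption (the text does not state how the CIs were derived), and the function name and the example coefficient are hypothetical.

```python
import math


def odds_ratio_ci(beta: float, se: float, z: float = 1.96):
    """Return (OR, lower, upper) from a logistic regression coefficient.

    beta: estimated log-odds ratio for the exposure term
    se:   its standard error
    z:    normal quantile (1.96 gives an approximate 95% Wald interval)
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


if __name__ == "__main__":
    # Hypothetical coefficient, not taken from the study.
    or_, lower, upper = odds_ratio_ci(beta=-0.30, se=0.07)
    print(f"OR = {or_:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```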

Each model was adjusted for a series of non-dietary covariates including sex, age (<40, 40–44, 45–50, 70–74 and ≥75 years), study center, years of education (<13 vs ≥13 years), smoking (current, ever, never), alcohol intake (study-specific tertiles), body mass index (<18.5, 18.5–24.9, 25–29.9, ≥30 kg/m²), diabetes, hypertension, dyslipidemia, physical activity at work (sedentary, light, moderate, vigorous and very vigorous) and leisure-time physical activity (<2, 2–4, 5–7 and >7 h per week). To assess whether the relationship between legume consumption and cancer was independent of other dietary factors, an additional adjustment was made for consumption of raw and cooked vegetables, fruit and processed meat, and for energy intake (all in study-specific tertiles). Estimates for breast, endometrial and ovarian cancers were also adjusted for age at menarche (<10, 10–15.9, ≥16 years), menopausal status (pre-, peri- and post-menopause) and number of children (none, 1 and ≥2 children).

Completeness was above 95% for the majority of covariates in all studies. However, some covariates exhibited lower completeness. For instance, the completeness rate for physical activity was approximately 70% in the study on laryngeal cancer and around 60% in the studies on oral, pharyngeal and esophageal cancers. In the study on stomach cancer, the completeness rates for raw vegetables and processed meat ranged from 60 to 70%. Additionally, in the studies examining oral cavity, pharyngeal, stomach, esophageal and laryngeal cancers, the completeness rate for cooked vegetables was approximately 80 to 90% (Table 1).

Table 1 Distribution of sociodemographic characteristics and selected dietary intakes among cancer cases and controls according to cancer site.

A multiple imputation technique using a fully conditional specification (FCS) method was implemented to account for missing values under the missing at random assumption [29]. Five completed data sets were generated for each cancer site and used to obtain five different estimates and the corresponding standard errors, which were then combined using Rubin's rules [30]. A complete-case analysis was also carried out and its results were compared with those of the main analysis.
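The pooling step can be illustrated with a minimal Python sketch of Rubin's rules. It assumes that the quantity being pooled is a log-OR with its standard error from each completed data set; the function name and the example values are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Combine estimates from m imputed data sets with Rubin's rules.

    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)
    # Within-imputation variance: average of the squared standard errors.
    within = mean([se ** 2 for se in std_errors])
    # Between-imputation variance: sample variance of the m estimates.
    between = variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical log-ORs and standard errors from five imputed data sets.
    log_ors = [-0.15, -0.13, -0.16, -0.14, -0.15]
    ses = [0.035, 0.036, 0.034, 0.035, 0.036]
    est, se = pool_rubin(log_ors, ses)
    print(f"pooled OR = {math.exp(est):.2f}, pooled SE (log scale) = {se:.3f}")
```

The total variance always exceeds the average within-imputation variance when the estimates differ, which is how the uncertainty due to missing data enters the pooled confidence interval.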

All models included vegetable, fruit and whole bread intakes to control for confounding related to the fact that legume consumers tend to have a healthier diet than non-consumers. This also implies that legume consumers have a higher fiber intake, in part because legumes are an important source of fiber and in part because of the high consumption of other fiber-rich foods. Thus, to evaluate the contribution of legumes to the total intake of dietary fiber, we computed the percentage of total dietary fiber obtained from legumes among cases and controls in each study.

To evaluate whether a sex-difference in the association between legume consumption and cancer risk exists, we tested the “sex-by-legume consumption” interaction in the regression models using the likelihood ratio test (LRT) between the model with and the model without the interaction term. We rejected the null hypothesis of no difference if the p-value of the LRT was <0.05.
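The likelihood ratio test described above compares the maximized log-likelihoods of the two nested models. The sketch below is a stdlib-only illustration with a hypothetical function name; it covers 1 or 2 degrees of freedom (one interaction term for continuous consumption, two for the three-level variable), which is an assumption about how the interaction was coded.

```python
import math


def lrt_p_value(loglik_reduced: float, loglik_full: float, df: int):
    """Likelihood ratio test between two nested models.

    Returns (statistic, p-value). Closed-form chi-square tail
    probabilities are used, so only df = 1 or df = 2 is supported.
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        p = math.exp(-stat / 2.0)
    else:
        raise ValueError("this sketch supports only df = 1 or df = 2")
    return stat, p


if __name__ == "__main__":
    # Hypothetical log-likelihoods without and with the interaction term.
    stat, p = lrt_p_value(loglik_reduced=-2510.4, loglik_full=-2509.9, df=1)
    print(f"LRT = {stat:.2f}, p = {p:.3f}, reject H0: {p < 0.05}")
```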

For cancer sites such as the oral cavity, pharynx, esophagus, and larynx, where subjects were enrolled in both Italy and Switzerland, we calculated the OR separately according to the country of enrollment. Additionally, we obtained a pooled estimate using a two-stage meta-analytic approach, based on the DerSimonian-Laird estimator [31]. The study was approved by the ethical committees of the hospitals involved, and all participants gave informed consent.
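The second stage of the two-stage approach can be sketched as a DerSimonian-Laird random-effects pooling of country-specific log-ORs. This is a minimal illustration rather than the authors' implementation; the function name and the example inputs are hypothetical.

```python
import math


def dersimonian_laird(estimates, std_errors):
    """Random-effects pooling with the DerSimonian-Laird estimator.

    Returns (pooled estimate, its standard error, tau^2).
    """
    k = len(estimates)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights include tau^2 in each study's variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2


if __name__ == "__main__":
    # Hypothetical log-ORs per portion for two countries.
    pooled, se, tau2 = dersimonian_laird([-0.05, -1.02], [0.10, 0.43])
    print(f"pooled OR = {math.exp(pooled):.2f}, tau^2 = {tau2:.3f}")
```

With only two strata, as in the Italy and Switzerland comparison, the between-study variance is estimated very imprecisely, so the pooled interval can be wide even when one stratum shows a clear association.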

Results

This work included a total of 10,482 cancer cases (1292 cancers of oral cavity, 488 esophageal cancers, 225 stomach cancers, 1914 colorectal cancers, 604 laryngeal cancers, 2554 breast cancers, 357 endometrial cancers, 1028 ovarian cancers, 1270 prostate cancers, and 750 kidney cancers).

Table 1 gives the frequency distribution of legume consumption and of the main covariates among the cases and controls included in the analysis of oral cavity, esophageal, stomach, colorectal, laryngeal, prostate and kidney cancers. Corresponding information for female cancers is reported in Table 2. Around 30–40% of the cases and controls consumed at least one portion of legumes per week, with a generally higher frequency of consumption among controls. The only exception was endometrial cancer, for which cases consumed more legumes than controls.

Table 2 Distribution of sociodemographic characteristics and selected dietary intakes among female cancer cases and controls according to cancer site.

Table 3 presents the OR for each cancer site, obtained through two different sets of adjustments. The first set of adjustments includes the main sociodemographic characteristics and non-dietary risk factors, while the second set additionally includes dietary covariates. Although most of the estimates were below unity, indicating a potential protective effect, only colorectal cancer showed a significant association. Compared with the reference category (<1 portion per week), the OR was 0.74 (95% CI: 0.65–0.86) for consuming at least one portion of legumes and 0.65 (95% CI: 0.55–0.77) for consuming two portions, and the estimate for each additional portion, with consumption modeled as a continuous variable, was 0.85 (95% CI: 0.79–0.90). After further adjusting for dietary covariates, these estimates remained largely unchanged. The OR was 0.79 (95% CI: 0.68–0.91) for one portion, 0.68 (95% CI: 0.57–0.82) for two portions and 0.87 (95% CI: 0.81–0.93) for an additional portion. Similar estimates were obtained in the complete-case analysis: 0.79 (95% CI: 0.68–0.91) for one portion per week, 0.67 (95% CI: 0.55–0.81) for two portions per week and 0.86 (95% CI: 0.80–0.92) for an increment of one portion per week (Supplementary Information).

Table 3 Odds ratio for cancer according to legume consumption by cancer site.

When the analysis was stratified according to the country of enrollment, we found an inverse association also for laryngeal cancer among subjects enrolled in Switzerland (OR per portion: 0.36, 95% CI: 0.16–0.85) (Supplementary Information). Nonetheless, the pooled estimates obtained from the two-stage meta-analysis resulted in no significant associations.

There were no significant sex differences in the association between legume consumption and colorectal cancer (Supplementary Information).

Figure 1 shows the exposure-risk relationship: the risk of colorectal cancer decreased as the number of portions of legumes consumed per week increased. Some decrease in risk was also observed for esophageal, stomach, ovarian, prostate and kidney cancers; however, the CIs were relatively wide and crossed unity.

Fig. 1 Exposure-risk analysis of legume consumption and cancer risk by cancer site.

The mean daily dietary fiber intake varied between 21 and 26 g, depending on the specific cancer site being studied, with minimal differences between cases and controls. Legumes contributed only about 5% of the total dietary fiber intake among legume consumers (Supplementary Information).

Discussion

Our findings indicate that a moderate consumption of legumes is associated with a significantly decreased risk of colorectal cancer. In line with our findings, a recent meta-analysis of observational studies (n = 14: 3 cohort studies, 11 case-control studies) found a decreased risk of colorectal adenoma for the highest versus lowest intake of legumes (OR = 0.83) [11]. In the Polyp Prevention Trial, an increased consumption of legumes was also associated with a reduced risk of advanced adenoma recurrence. The OR in individuals in the highest quartile of change in dry bean intake from baseline (median change: +41.5 g/day) versus the lowest quartile (−5.7 g/day) was 0.35 [9].

The OR for the highest vs the lowest level of consumption also indicated a possible decreased risk of esophageal cancer (OR: 0.55), which, however, was not confirmed when legume consumption was evaluated as a continuous variable (portions per week). Previous case-control studies have reported ORs of 0.54–0.62 for esophageal and laryngeal cancer with the highest intake of legumes [32, 33]. A case-control study of 11 cancer sites conducted in Uruguay between 1996 and 2004, including 3539 cancer cases and 2032 hospital controls, reported an OR of 0.54 for esophageal and 0.55 for laryngeal cancer for the highest as compared to the lowest tertile of consumption [34]. Other studies from the United States (Connecticut and Los Angeles) reported significant inverse associations between legume intake and risk of esophageal cancer (particularly a decreased risk of esophageal squamous cell carcinoma), although the legume group within these studies included beans and nuts [35, 36].

With regard to other cancer sites, we did not find any significant association between legume consumption and cancer of oral cavity and pharynx, stomach, larynx, breast, endometrium, prostate, ovary and kidney. Previous studies for these cancer sites reported mixed results, some reporting weak/moderate associations (OR ranging from 0.42 to 0.84) or null associations [37,38,39,40,41].

Several mechanisms could explain a possible protective effect of legume intake on cancer risk [42,43,44,45]. Legumes are recognized as a protein source but are often overlooked as a source of fiber, with 100 g of cooked legumes containing, at a minimum, 5 g of dietary fiber [7]. The beneficial effects of legume consumption are likely related to their fiber content, and this is particularly true for colorectal cancer. When entering the large bowel, fiber increases stool weight, dilutes colonic contents and stimulates bacterial anaerobic fermentation. This process reduces contact between the intestinal contents and the mucosa and leads to the production of short-chain fatty acids (SCFAs). SCFAs reduce cell proliferation, the first biological mechanism promoting carcinogenesis. They also reduce colonic pH, thereby inhibiting the histone deacetylase enzyme and decreasing the conversion of primary to secondary bile acids (deoxycholic acid and lithocholic acid), which are cytotoxic to colonocytes [6, 42]. Furthermore, dietary fiber is a substrate for the gut microbiota, affecting its amount and composition and favouring anti-inflammatory strains, which have local and systemic health benefits via modulation of the immune system, production of microbial metabolites, conversion of polyphenols into biologically active forms and modification of tissue-specific strains in distant organs [6, 42]. In our study, however, legume consumption contributed only approximately 5% of the total intake of dietary fiber among the study subjects (Supplementary Information). Nevertheless, even after accounting for other sources of dietary fiber, such as vegetables and fruit, there was still a consistent inverse association between legume consumption and colorectal cancer. Beyond fiber, other bioactive compounds in legumes, such as phenolics, may also play a role in the inhibition of colorectal cancer [42].

Dietary fiber and proteins from legumes also contribute to lowering the glycaemic load of the diet [6, 21], thus preventing hyperglycemia and hyperinsulinemia [26, 45]. Hyperglycemia and hyperinsulinemia are both sustained by excess body fat and the consequent changes in hormonal status, growth factors, inflammatory markers, and oxidative stress, all of which are contributing factors in the development of chronic diseases, including cancers [6,7,8]. Pulses have been linked to improvements in these markers [42].

Legumes are also rich in vitamins (e.g. B vitamins), minerals (e.g. iron, calcium and zinc) and a series of biologically active compounds, known as phytochemicals, which also have antitumor effects [44, 46]. These compounds include tannins, flavonols, isoflavones, phenolic acids and phytic acids [42]. For example, phytates are excreted in the urine, where they inhibit the formation of kidney stones [47], which have been related to kidney cancer [48]. Legumes are also a good source of folate, which may protect against cancers of the upper digestive tract, the colon and several other sites [13, 33].

In addition to the direct cancer-preventive effects of legume intake, indirect effects may also be at work. Higher intake of legumes may replace other sources of protein such as meat, or high glycaemic index carbohydrates, both of which have been linked to several cancers [45].

Strengths and weaknesses

In this work, we quantified the association between legume consumption and several cancer sites using a series of case-control studies. In these studies, the same validated and reproducible questionnaires have been used to collect information on legume consumption and to measure potential confounders. Several confounders have been considered including age, education, overweight/obesity, smoking, physical activity, presence of comorbidities, alcohol, consumption of fruit, vegetables and processed meat, energy intake and for female hormone-related cancers also age at menarche, menopausal status and number of children.

The study also has some limitations. The first lies in the potentially inaccurate measurement of legume consumption in a case-control design. Second, the inverse association between legume consumption and cancer risk may be at least partially attributable to a generally healthier diet among legume consumers, who also had a high intake of fiber from other dietary sources. Third, there was a substantial predominance of males in the majority of the studies included in the analysis. Fourth, the analysis is based on hospital controls. Thus, the distribution of legume consumption in controls may not fully reflect that in the population which produced the cases. Finally, although the majority of studies included more than 1000 cases, for some cancer sites only a few cases were in the highest category of consumption (i.e. ≥2 portions).

Conclusions

Our results indicate an inverse association between legume intake and colorectal cancer risk. No consistent associations were found for cancer of oral cavity and pharynx, esophagus, stomach, larynx, breast, endometrium, ovary, prostate and kidney.