Study cohort description
Across two EU FP7 projects (i.e., NeuroFAST and MyNewGut), a total of 100 premenopausal women were recruited at the Unit of Endocrinology and Prevention and Care of Diabetes of the S. Orsola Polyclinic University Hospital in Bologna, Italy. Anthropometric and laboratory parameters as well as psychometric results of all recruited participants are reported in Table 1. The whole study cohort included 63 OB (with BMI from 25.6 to 39.8 kg/m2) and 37 NW (with BMI from 18.5 to 24.6 kg/m2) women. OB women showed significantly higher body weight, BMI, waist and hip circumference, and waist-to-hip ratio compared to NW (p < 0.001, Wilcoxon rank-sum test). Systolic and diastolic blood pressure was higher in OB compared to NW (p ≤ 0.001), and three OB women were under antihypertensive treatment. OB women exhibited higher total cholesterol and triglycerides levels compared to NW (p ≤ 0.04), while HDL-cholesterol was higher in NW group (p = 0.003). Glucose metabolism indices (i.e., blood glucose, insulin, glycated hemoglobin and the area under the curve (AUC) of glucose and insulin during oral glucose tolerance test) were higher in OB compared to NW women (p < 0.001). Five OB and one NW women had fasting glycaemia (≥ 100 mg/dl) but no one had a diagnosis of type 2 diabetes according to American Diabetes Association diagnostic criteria . Finally, the OB group showed a significantly higher rate of insulin-resistant participants than NW (44.4% in OB group vs 0% in NW group, p < 0.001), consistent with a dependent relationship between insulin resistant/sensitive phenotype and BMI status. Correction for effect size indicated good correlation (Cramer’s correlation coefficient = 0.478, p < 0.0001). As for the psychometric questionnaires, some of them were discarded for incomplete data: 12 BITE (Bulimic Investigatory Test Edinburgh), 10 TFEQ (three-factors eating questionnaire), and 4 PSS (perceived stress scale) were incompletely filled in by the participants. One OB participant exhibited high scores in the two BITE subscales, indicating a high probability of fulfilling the criteria for a diagnosis of eating disorder. The mean scores obtained in the TFEQ subscales were as follows: TFEQ UE (uncontrolled eating), 16.91 ± 5.37; TFEQ CR (cognitive restraint), 14.35 ± 4.27; TFEQ EE (emotional eating), 7.59 ± 2.98. Among the 100 participants who completed YFAS, 14 (all belonging to the OB group) received a diagnosis of FA, while the mean FA symptom score of the whole cohort was 2.28 ± 1.55. None of the NW women showed high FA scores. The mean PSS score was 16.03 ± 5.91; two OB patients exhibited high PSS scores. Compared to NW, OB participants exhibited significantly higher scores in BITE severity (p < 0.001), TFEQ CR (p < 0.001), TFEQ EE (p < 0.001), and FA symptom score (p < 0.001), as well as in FA diagnosis (p < 0.01).
Gut microbiota profiling and cluster identification
16S rRNA gene sequencing yielded a total of 6.5 million sequence reads, with an average of 73,152 (± 38,578, sd) paired-end reads per sample, for 11,874 operational taxonomic units (OTUs) grouped at 97% of sequence identity. Considerable differences were identified in the GM diversity and structure of OB and NW women (Fig. 1).
Hierarchical Ward linkage clustering based on the Spearman correlation coefficient of the proportion of OTUs allowed the identification of four participant clusters (named C1 to C4) (Fig. 2). Albeit in the absence of statistical significance (p = 0.38, Fisher’s exact test), one cluster included 48% of NW women (C1) while the remaining three clusters (C2-C4) comprised mostly OB women (72%, C2; 64%, C3 and C4). Interestingly, the four clusters also differed in biodiversity, with C1 and C3 showing the highest values, while C2 and C4 the lowest (p < 8 × 10−4, Kruskal–Wallis test). More precisely, the biodiversity in C2 was lower with respect to C1 and C3 (p < 3 × 10−4, Wilcoxon rank-sum test), while the cluster C4 showed lower levels compared to C1 (p = 0.05) (Fig. 2). To identify trends in the GM structure across the whole dataset, co-abundance associations of genera were first established, and correlated bacterial taxa were subsequently clustered into five co-abundance groups (CAGs) (Additional file 1: Fig. S1), named according to the dominant (i.e., the most abundant) genus in each group: Bifidobacterium (violet), Ruminococcus (blue), Dorea (green), Prevotella (light blue), and Bacteroides (pink). Wiggum plots were then generated to depict the GM compositional relationships for each of the four participant divisions—identified by OTU clustering—showing a peculiar abundance pattern of the five CAGs (Fig. 3). Each cluster (C1 to C4) constitutes a steady state, representing a group of individuals characterized by a GM layout significantly different from the other groups (p < 0.001, permutational MANOVA test on unweighted UniFrac data). However, it should be noted that the clusters did not significantly segregate in the weighted UniFrac-based PCoA (p > 0.05; data not shown), suggesting that the differences were not related to abundant GM components. These results were confirmed by comparisons of the relative abundances of the main genera (see Additional file 1: Table S1 for further details). The microbiota variation from the group comprising most of the NW women (i.e., C1) to the groups including predominantly OB women (C2-C4) was accompanied by distinctive CAGs dominance. Specifically, the cluster C1 was characterized by the co-presence of all 5 CAGs and a higher relative abundance of Prevotella, while in clusters C2-C4, the lack of at least one of the 5 CAGs identified was observed. In cluster C2, despite the absence of the Bifidobacterium CAG, a representation of the other four CAGs was preserved. On the other hand, cluster C3 lost the Bifidobacterium CAG but showed an over-representation of Prevotella and Ruminococcus CAGs. Finally, cluster C4 was characterized by a loss of Bacteroides while being enriched in Bifidobacterium CAG.
Associations between the microbiota clusters and clinical and behavioral measures in normal-weight and overweight/obese women
Associations of demographic and clinical variables and eating behavior with the major axes of unweighted uniFrac PCoA analysis are shown in Fig. 3 and listed in Table 2 (see also Additional file 1: Fig. S2). In particular, based on a quantile (median) age-adjusted regression analysis when considering the whole cohort, a shift of the GM structure towards negative PCo2 values (as the low-diversity cluster C4) was associated with higher BITE symptom score—indicative of binge eating behavior—and TFEQ UE score—indicative of uncontrolled eating.
When comparing the baseline clinical conditions across the four clusters of participants, we found no difference in anthropometric data, as well as in biochemical profile and inflammatory biomarkers. All the parameters considered were similar in all groups except for uric acid, which was higher in C2 than in C4 (p = 0.02, Wilcoxon rank-sum test), although the values were still within the normal range (Table 3). Although statistical significance was not reached, cluster C2 participants also tended to show higher levels of triglycerides, insulin (AUC), and thyroid stimulating hormone compared to C3 or C4 (p ≤ 0.08). Finally, when comparing C1-C4 groups, no difference in insulin-resistant rate was found (p = 0.47).
As for the psychometric measures, the C2 cluster showed the highest BITE severity score (p = 0.1, Kruskal–Wallis test), while C4 the highest BITE symptom score (p = 0.1), consistent with the association found with the PCo2 axis (Fig. 3).
To further explore the relationship between GM clusters and eating behavior, OB women were next stratified according to the diagnosis of uncontrolled eating behavior, based on the YFAS questionnaire and taking into account the contribution of stress symptoms induced by eating behavior (see the “Methods” section for further details), in the following three groups: low addictive eating behavior (i.e., with 2 or fewer symptoms, O_LA) and high addictive eating behavior (i.e., with 3 or more symptoms) with (O_DHA) or without (O_HA) FA diagnosis (Additional file 1: Fig. S3). C1 showed the lowest proportion of O_DHA women (7%), while C2 the highest (36%). On the other hand, C1 shared similar proportions of O_HA women with cluster C2 (C1, 19% vs C2, 18%). C4 showed the highest percentage of O_HA women (36%), followed by C3 (27%). The proportion of O_LA women was the highest in C1 (26%), while similar in the other clusters (C2, 18% vs C3, 18% vs C4, 14%). None of the NW women showed uncontrolled eating behavior.
Diet impact on the gut microbiota of normal-weight and overweight/obese women
In order to identify the food types that contributed (p < 0.05, permutational correlation test) to the GM ordination, the food data from FFQs were superimposed on the unweighted UniFrac PCoA plot of Fig. 3 (Fig. 4a). A greater consumption of seasonings and condiments, olive oil, fried potatoes, and sausages, as well as sweetened drinks, milk, and yoghurt was associated with the GM configuration of cluster C2. On the other hand, the cluster C4 was characterized by higher consumption of cheese, while C1 and C3 by lower consumption of all the above-mentioned foods. The fiber intake (grams/1000 kcal) showed a positive correlation with the first PCoA axis and appeared to be higher in women with C1 cluster (Fig. 4b). The other three clusters showed comparable fiber intake values, with C2 being lower than C1 (p = 0.03, Wilcoxon rank-sum test). An opposite trend was observed for total energy intake (kcal/day), being negatively correlated to PCo1, and higher in clusters C2, C3, and C4 (Fig. 4b). Consistent with a greater propensity to uncontrolled eating (TFEQ UE) and exacerbated BITE symptom score, cluster C4 showed a higher energy intake than C1 (p = 0.04). The average frequency values of daily food consumption for each of the four GM groups are shown in Additional file 1: Table S2, together with additional information on each food category. When focusing on the intake of macronutrients (Fig. 4c), increased carbohydrate intake and reduced fat intake were observed in C4 compared with C1 (p = 0.05 and 0.04, respectively). A lower fat intake was also observed in C3 than in C1 (p = 0.01).
FFQ data were further explored in a Correspondence Analysis, where the first axis, describing over 13.7% of the dataset variance, contained most of the discriminating food types identified in the previous correlation analysis of FFQ data on the microbiota PCoA (i.e., cheese, sweetened drinks, seasonings and condiments). Application of Ward linkage clustering and Euclidean distance metrics to this axis allowed identifying three dietary groups: D1 (“low protein/high carbohydrate”), D2 (“high protein/low carbohydrate”), and D3 (“high fat/high protein”) (Fig. 5a). In particular, D1 was characterized by a greater consumption of sweet snacks, biscuits and eggs, D2 of salty snacks, fried food, meat, sliced ham, and homemade sandwiches, while D3 of dairy products (i.e., cheese, milk and yoghurt). The healthy food diversity (HFD) index, based on the number, distribution, and health value of consumed food , was subsequently calculated for each dietary group. According to HFD, D2, and D3 were the most diversified diets, while D1 the least (p = 0.0002, Kruskal–Wallis test) (Fig. 5b).
By matching the stratifications of women in dietary and microbiota groups, redundant combinations associated with the OB phenotype were sought (Additional file 1: Table S3). In particular, the combination of the less diversified D1 diet and C2 microbiota was the most prevalent among OB women (25% of the OB dataset), especially in O_LA (14%) and O_DHA (6%) women, followed by the combinations D1-C4 (6%) in O_HA women. Interestingly, none of the OB women possessed the combination D2-C4. As for the NW sub-cohort, the three dietary groups were found to be equally distributed within the C1 microbiota configuration, with the combination D1-C1 being the most prevalent (19% of the NW dataset). Three out of five NW women with the C4 microbiota configuration (8%) were associated with the less diversified D1 diet, while the remaining two (5%) with the D2 diet.
Species-level microbiome signatures of obesity and uncontrolled eating behavior
A subset of 45 DNA samples (31 from OB and 14 from NW women) was subjected to shotgun metagenomic sequencing, for a total of 15 Gb of paired-end reads. The metagenomics dataset was dominated by 8 bacterial species, which contributed 52.5–56.6% to ecosystem variability and were variously distributed among the four clusters (C1-C4): Faecalibacterium prausnitzii, Bifidobacterium adolescentis, Bifidobacterium longum, Ruminococcus bromii, Eubacterium rectale, Akkermansia muciniphila, Bacteroides vulgatus, and Subdoligranulum spp. (Fig. 6a). In particular, the cluster C1 was found to be enriched in R. bromii when compared to C2 (p = 0.03, Wilcoxon rank-sum test), as well as in F. prausnitzii compared to C3 and C4 (p < 0.03) (Fig. 6b). On the other hand, with respect to C1, the configuration C2 was enriched in Ruminococcus torques (p = 0.05), a mucolytic bacterial species known to compromise gut barrier integrity . Moreover, C2 showed the lowest levels of the mucin degrader A. muciniphila with respect to C1 and C3 (p < 0.02) as well as R. bromii compared to C1 and C4 (p < 0.03). As for the other GM configurations predominantly characterizing OB women with uncontrolled eating behavior (i.e., C3 and C4), C3 showed higher values of A. muciniphila and Subdoligranulum spp. compared to C2 (p < 0.02), while C4 was enriched in E. rectale with respect to C3 (p = 0.008), as well as in B. adolescentis and Bifidobacterium bifidum compared to the other three clusters (p < 0.05), probably due to the greater consumption of cheese (as revealed by the analysis of FFQs).
The transcriptionally active fraction of the gut microbiome and fecal lipidomic profiles in obesity and uncontrolled eating behavior
RNA sequencing was performed on the same samples subjected to metagenomics to investigate the active species-level fraction of the GM clusters and their transcriptional activity.
According to our findings (Additional file 1: Fig. S4), the most transcriptionally active fraction of the C1 configuration—mainly comprising NW women—included eight Bacteroides spp. (i.e., B. faecis, B. finegoldii, B. cellulosilyticus, B. massiliensis, B. coprophilus, B. dorei, B. plebeius, B. vulgatus), two Bifidobacterium spp. (B. dentium and B. animalis), Coprococcus catus, Lachnospiraceae bacterium (5_1_63FAA), two Roseburia spp. (R. intestinalis and R. inulinivorans), and Escherichia coli. Regarding the configurations that included proportionately more OB women (i.e., clusters C2, C3, C4), the active microbiome fraction was found to be overall depleted of Bacteroides spp., while enriched in generally subdominant bacterial species (e.g., Anaerostipes hadrus and Anaerostipes finegoldii, as well as Gordonibacter pamelaeae in C4). More precisely, the C2 configuration—including the highest proportion of O_DHA women—showed a transcriptionally active fraction mainly composed of A. hadrus, four Bacteroides spp. (i.e., B. thetaiotaomicron, B. fragilis, B. eggerthii, B. caccae), three Bifidobacterium spp. (i.e., B. pseudocatenulatum, B. breve, B. pseudolongum), Klebsiella pneumoniae, Lactobacillus ruminis, and Streptococcus thermophilus. On the other hand, when compared to the other clusters (i.e., C1, C2, C4), the transcriptionally active fraction of the C3 configuration—including mostly O_HA women—was found to be the most biodiverse in terms of active species, being primarily characterized by the overabundance of three Alistipes spp. (i.e., A. finegoldii, A. onderdonkii, A. shahii), Megamonas hypermegale, two Coprococcus spp. (i.e., C. sp. ART55 1, C. eutactus), Parabacteroides distasonis, Barnesiella intestinihominis, three Bacteroides spp. (i.e., B. nordii, B. ovatus, B. vulgatus), three Ruminococcus spp. (i.e., R. torques, R. obeum, R. bromii), R. intestinalis, Methanobrevibacter smithii, two Eubacterium spp. (E. eligens, E siraeum), and E. coli. Finally, the C4 configuration—including the highest percentage of O_HA women—was found to be the least transcriptionally diversified, being characterized by few but extremely active bacterial species, including G. pamelaeae, Lactobacillus casei/paracasei, B. bifidum, Adlercreutzia equolifaciens, E. rectale, Roseburia hominis and S. thermophilus.
We next evaluated the core species and gene distribution within the four GM configurations, focusing on the KEGG pathways involved in carbohydrate, amino acid, lipid, and xenobiotic metabolism (Additional file 1: Fig. S4 and Fig. S5). We found that only C1 and C2 covered all the aforementioned metabolic activities with a discrete number of bacterial species. On the other hand, the C3 cluster showed a higher biodiversity but mainly related to carbohydrate and amino acid metabolism, whereas C4 was the poorest consortium, and both clusters shared almost no xenobiotic metabolism. As for the distribution of KEGG pathways, glycolysis had the highest transcript abundance and, together with nucleotide sugar and pyruvate metabolism pathways, was over-transcribed in all samples. Other metatranscriptome pathways actively transcribed across the entire dataset included the breakdown of simple sugars (e.g., fructose, mannose and galactose) and the non-oxidative pentose phosphate cycle—supporting nucleic acid synthesis—as well as essential and sulfur-containing amino acid metabolism (e.g., glycine, serine and threonine, cysteine and methionine). Regarding cluster-specific features (Additional file 1: Fig. S6), C1 metatranscriptome was particularly enriched in the abovementioned housekeeping functions, suggesting an overall efficient basal metabolism by the corresponding GM layout. On the other hand, the cluster C2 showed a peculiar high transcript abundance for propanoate metabolism. The transcriptomic landscape of C3 cluster was overall the most active and diversified, with a high abundance of transcripts involved in the metabolism of several amino acids, fatty acid biosynthesis, butanoate metabolism, and pentose/glucuronate interconversion. The C4 transcriptome showed an increased abundance of transcripts devoted to glycerolipid and glycerophospholipid metabolism, suggesting alterations in fatty acid digestion. It is also worth noting that both C3 and C4 clusters showed increased transcription of genes involved in the biosynthesis of aromatic amino acids (i.e., phenylalanine, tyrosine, tryptophan), which are known to be involved in gut-brain communication .
Finally, a lipidomics analysis performed on the same stool samples allowed us to identify some discriminant metabolites (Additional file 1: Fig. S7) . In particular, higher SCFA (i.e., butyrate, acetate, propionate) levels were found in C1 and C2 compared to C3 and C4 (p ≤ 0.01, Kruskal–Wallis test). Cholesterol-to-coprostanol conversion was also reflected quite well within the former clusters. Clusters C3 and C4 were enriched in a number of converted sterols, such as coprostanol, 5β-sitostanol, and 5β-campestanol (p ≤ 0.04). On the other hand, the highest levels of cholesterol were associated with C2 (p ≤ 0.05, Wilcoxon rank-sum test). In accordance with the definition proposed by Matysik et al. , C2 included many non- and low-cholesterol converters, whereas many high converters and no non-converters were included in C3. As for bile acids, higher total amounts were found in C2 and C4, with the former being particularly enriched in cholic acid and chenodeoxycholic acid (p ≤ 0.03, Kruskal–Wallis test).
When seeking for correlations between lipidomics measures and the transcriptionally active fraction of GM (meaning both pathways and species) by means of sPLS regression (Additional file 1: Table S4 and Fig. S8), we found the strongest correlations between features of GM clusters that included predominantly OB women (especially C4). In particular, 5β-sitostanol (enriched in C3 and C4) positively correlated with the secondary bile acid biosynthesis by F. prausnitzii (which showed transcriptional activity for a specific gene involved in this pathway in C4—see next paragraph), as well as with B. longum pathways (related to galactose metabolism, glycolysis/gluconeogenesis, and cysteine and methionine metabolism, and actively transcribed across the entire dataset) and glyceroplipid metabolism by R. bromii (abundantly transcribed in C4 as discussed above). The latter, together with galactose metabolism by B. longum, also showed the strongest inverse correlations with the SCFAs acetate and propionate, consistent with their lower levels in C3 and C4.
Transcriptional variation in microbial genes related to metabolic homeostasis
Food intake and energy expenditure—ClpB and bile acids
Several GM-derived metabolites and bacterial proteins have been suggested to dialogue with the brain and regulate energy intake. Among them, the chaperon protein ClpB (caseinolytic peptidase B) has been proved to mimic the anorexigenic POMC (pro-opiomelanocortin)-derived alpha-MSH (alpha-melanocyte-stimulating) hormone, well known for its ability to influence host appetite . On the other hand, bile acids contribute to the regulation of host energetic homeostasis due to their role in lipid absorption, as well as by activating host receptors involved in thermogenesis . We therefore specifically assessed the transcriptional levels of ClpB and enzymes involved in the microbial metabolism of bile acids across the four GM clusters (Additional file 1: Fig. S9). ClpB was actively transcribed by A. muciniphila only in C1 configuration (mainly including NW women), by L. ruminis in C2 (i.e., mainly in O_DHA women), and by A. equolifaciens in C1, C3 and C4 clusters. As for bile acid metabolism, we found a cluster-specific transcriptional layout for three enzymes, i.e., choloylglycine hydrolase (K01442), 7-alpha-hydroxysteroid dehydrogenase (K00076), and 3-dehydro-bile acid delta 4,6-reductase (K07007). The microbial gene coding for K01442, the primary bile acid-deconjugating enzyme, was found to be actively transcribed by several microbial components of the clusters C1 and C3, namely R. intestinalis, C. catus, B. animalis, Coprococcus comes, Eubacterium ventriosum, Dorea longicatena, and A. shahii for C1 and R. obeum, M. smithii, A. finegoldii, P. distasonis, Eubacterium hallii, A. onderdonkii, and A. equolifaciens for C3. In contrast, K01442 was exclusively transcribed by A. hadrus and L. casei/paracasei in clusters C2 and C4, respectively. As for K00076, involved in secondary bile acid biosynthesis, it was found to be exclusively transcribed by E. coli in C1 cluster, while no transcriptional activity was observed for the configurations that included proportionately more OB women (clusters C2-C4), regardless of eating behavior. Finally, K07007, another enzyme involved in secondary bile acid biosynthesis, was exclusively transcribed by C. catus and M. smithii in C1, by R. obeum, A. shahii and E. rectale in C2, and by R. bromii, Coprococcus sp. ART55_1 and M. hypermegale in C3. As for C4, only a weak transcriptional activity by F. prausnitzii and R. hominis was observed.
Neuroendocrine signaling—tryptophan metabolites, opioids, endocannabinoids, and GABA
Patterns of microbial transcriptional activity specifically involved in the metabolism of tryptophan (Trp), endocannabinoids (eCBs), and opioids, as well as in the biosynthesis of GABA, bioactive molecules able to interact with the central nervous system and influence ingestive behavior [61, 62], were subsequently investigated (Additional file 1: Fig. S9).
The clusters C1 and C4 showed comparable Trp-related transcriptional activity lower than C2 and C3, with only slight transcriptional activity responsible for the conversion of indole-3-pyruvic acid to indole-3-lactic acid (K03778), attributed to R. intestinalis and R. hominis, respectively. On the other hand, the configurations that included predominantly OB women (i.e., clusters C2, C3, C4) shared increased transcriptional activity for the conversion of pyruvate into acetyl-CoA by other genes (K00170 and K00172), although attributable to different bacterial actors. Interestingly, the C3 cluster—which mainly included O_DHA women—was characterized by the highest microbial activity devoted to directly converting Trp to indole through tryptophanase (K01667), with A. finegoldii, A. muciniphila, A. shahii, B. ovatus, and A. onderndonkii being the most transcriptionally active species.
With regard to the eCB system, we queried our transcriptome dataset for microbial enzymes involved in the biosynthesis of precursors of endogenous ligands of cannabinoid receptors (i.e., anandamide and 2-arachidonoylglycerol). According to our findings, the enzyme triacylglycerol lipase (K01046), involved in the formation of diacylglycerol, a precursor of 2-arachidonoylglycerol , was actively transcribed exclusively in C1 by B. dentium. C2 showed a residual activity by B. dentium, while no transcriptional activity was observed in C3 and C4 clusters.
As for the opioid metabolism, the C1 configuration showed the highest transcriptional levels of β-glucuronidase (K01995), the microbial enzyme involved in the metabolic pathways of morphine , by B. dentium and B. longum, as well as by B. finegoldii. On the other hand, Bifidobacterium and Bacteroides spp. were not found to be transcriptionally active in configurations that included proportionately more OB women, with only a slight activity by E. coli in C2, F. prausnitzii in C3 and R. hominis in C4.
Finally, the glutamate decarboxylase gene involved in GABA production (K01580) was actively transcribed in C1 by B. faecis, B. cellulolyticus, B. uniformis, and B. finegoldii. On the other hand, for the GM configurations that included predominantly OB women, the major contribution to K01580 transcription was provided by B. dentium and B. fragilis in C2, while Bacteroides egghertii, B. nordii, B. caccae, B. ovatus and A. finegoldii in C3. Interestingly, low to zero transcriptional activity was observed within C4.