Comprehensive multi omics explore the microbial function in metabolic pathway flow during altered diet - npj Science of Food

In summary, this study provides exploratory insights into the complex interactions among diet, gut microbiota, and host metabolism, using a multi-omics framework. By examining changes in microbial composition and metabolites across dietary interventions, our findings generate hypotheses regarding the potential resilience of the gut ecosystem under fiber reintroduction. While these results are preliminary and limited by small sample sizes and a lack of mechanistic validation, they highlight key microbial taxa and pathways that warrant further investigation. This work serves as a conceptual foundation for future mechanistic studies aimed at understanding how dietary modulation may influence host physiology through the microbiota.

Female BALB/c mice (3 weeks old, 12-15 g) were purchased from Hunan Sleek Jingda (SLAC), Changsha, China. All mice were raised in plastic cages covered with metal fences under specific pathogen-free conditions. The mice were housed in individually ventilated cages (IVCs) with free access to food and water. The environmental conditions of housing were maintained as follows: temperature (25 ± 5 °C), humidity (60-70%), and light (12/12 h light/dark cycle). Before commencing the experiment, mice were provided with a period of 5 days to adapt to the experimental environment.

After adaptive feeding, the mice were randomly divided into two groups, with eight mice in each group. According to the experimental design, the two groups were fed different types of diets. The two groups are as follows: the standard diet (SD) group, healthy mice with an AIN-93G diet. The Alter group, healthy mice with a high protein diet (HPD) for 4 weeks, followed by a high fiber diet (HFiD) for four weeks. HPD and HFiD are both modified based on the AIN-93G standard diet. The diet information of the two groups of mice was shown in Table S2. And the content of nutrients and energy in different diets were shown in Table S3. All mice were allowed to freely eat during the feeding phase. Dietary intervention was performed on mice after adapting to the environment. Collect fecal samples from 0 weeks before dietary intervention, and then collect fecal samples every 2 weeks after dietary intervention.

After immobilizing the mice, lifting their tail and gently pressing the lower abdomen to collect fresh feces into sterile EP tubes, which were then stored at -80 °C for further analysis. We randomly selected six mice in each group and performed LC-MS/MS non-targeted metabolomics on fecal samples from 2, 4, 6, and 8 weeks after dietary intervention.

Three mice were randomly selected in each group, and a total of 30 fecal samples from 0, 2, 4, 6, and 8 weeks of dietary intervention were subjected to 16S rRNA gene sequencing. And the other two mice were selected, and a total of 16 fecal samples from 2, 4, 6, and 8 weeks of dietary intervention were subjected to shotgun metagenomic sequencing. Five and four mice were selected from the mice with different dietary interventions for quantitative detection of microorganisms in feces by qPCR and ELISA analysis of specific metabolites.

All food was produced by Beijing HFK Bioscience Co., Ltd., Beijing, China. The production numbers were 20220249 and 20220274. The diet formulation for the SD group was complied with the American Institute of Nutrition's food-grade ingredient formulation standard AIN-93G. High protein and high fiber diets were modified according to AIN-93G criteria. The study was approved by the Laboratory Animal Ethics Committee of Xiangya Hospital of Central South University (No. 2022030735).

The fecal samples were accessed using 16S rRNA gene sequencing. In detail, DNA extraction total genomic DNA samples were extracted using the OMEGA Soil DNA Kit (M5635-01) (Omega Bio-Tek, Norcross, GA, USA), following the manufacturer's instructions. PCR amplification of the bacterial 16S rRNA genes V3-V4 region was performed using the forward primer 338F (5'-ACTCCTACGGGAGGCAGCA-3') and the reverse primer 806R (5'-GGACTACHVGGGTWTCTAAT-3'). Sample-specific 7 bp barcodes were incorporated into the primers for multiplex sequencing. The PCR components contained 5 μl of buffer (5×), 0.25 μl of Fast pfu DNA Polymerase (5 U/μl), 2 μl (2.5 mM) of dNTPs, 1 μl (10 μM) of each forward and reverse primer, 1 μl of DNA template, and 14.75 μl of ddHO. Thermal cycling consisted of initial denaturation at 98 °C for 5 min, followed by 25 cycles consisting of denaturation at 98 °C for 30 s, annealing at 53 °C for 30 s, and extension at 72 °C for 45 s, with a final extension of 5 min at 72 °C. PCR amplicons were purified with Vazyme VAHTSTM DNA Clean Beads (Vazyme, Nanjing, China) and quantified using the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen, Carlsbad, CA, USA). After the individual quantification step, amplicons were pooled in equal amounts, and pair-end 2 × 250 bp sequencing was performed using the Illumina NovaSeq platform with NovaSeq 6000 SP Reagent Kit (500 cycles).

The sequenced 16S rRNA gene sequence is screened with low quality (Discards bases with Phred score <4, and removes sequences with any ambiguous bases), merged, truncated to 400 bp, and then the chimera was removed, as referred to our previous article. Finally, the remaining sequences were clustered into operational taxonomic unit (OTU). To reduce false positives of obtaining data, only the OTUs mapped reads more than ten were retained and rarefied to the minimum read counts using R package "picante". By rarefication, each sample has 21,506 effective sequences (reads), and the sequencing depth reached saturation (Fig. S12). Next, the taxonomic classification was performed according to the Silva database (version 138). To investigate the variation of microbial communities across samples, the Shannon-Wiener and Gini-Simpson indexes were calculated. At the same time, the Bray-Curtis distance metrics were calculated based relative abundance of OTUs and visualized via nonmetric multidimensional scaling (NMDS) by R package "vegan".

To study the microbial function alteration, the shotgun metagenomic sequencing was carried out. Total microbial genomic DNA was extracted using the QIAGEN DNeasy PowerWater Kit (14900-100-NF). The extracted microbial DNA was processed to construct metagenome shotgun sequencing libraries with insert sizes of 400 bp by using Illumina TruSeq Nano DNA LT Library Preparation Kit. Each library was sequenced by the Illumina HiSeq X-ten platform (Illumina, USA) with a PE150 strategy.

To ensure the data accuracy and obtain the clean reads, raw sequences were processed to remove low-quality (average quality <35 or contains more than 16 unclear bases) reads using Fastp (version 0.21.0) and FastUniq (version 1.1.0) to eliminate duplicates in paired short DNA sequence reads in a FASTQ format. The sequences from humans were filtered out by mapping to the mice reference genome (mm39) using Bowtie2 (version 2.3.5) with sensitive mode. Afterwards, the clean reads were assembled into contigs using Megahit (version 1.2.9) with default settings. The contigs greater than 500 bp were used to perform the gene prediction with MetaGeneMark (version 3.38). After that, we used hmmer (version 3.3.1) based on the AntiFam database to remove spurious genes. A gene catalog was established using mmseq2 (version 13.45111) based on minimum sequence identity >95% and clustering threshold >90%. The function annotation for the gene catalog was performed by kofam-scan (version 1.3.0), and only the annotation with e-value less than 0.00001 was remained. For each gene, reads from that sample were mapped using BWA (version 0.7.17), and calculated the gene abundance according to the mapped read number divided by gene length. Functional composition analysis was performed based on gene abundance. And then we performed Permutational multivariate analysis of variance (PERMANOVA) using the R package "vegan" and uniform manifold approximation and projection (UMAP) using the Python package "UMAP" to analyze the functional composition of gut microbial community at the gene level.

For traceability analysis of differential genes, we extracted all original gene sequences first. Secondly, the differential gene sequences were mapped to the NCBI non-redundant database by DIAMOND (version 2.0.2). At last, we utilized MEGAN (version 6.20.11) software to determine its belonging classification of microbes.

Metagenome assembled genomes (MAGs) were generated using MetaBAT2 (version 2.12.1). The quality of MAGs was evaluated with CheckM (version 1.2.0) for completeness and contamination as a genome. The high or median quality MAGs (HM-MAGs) with >50% completeness and <5% contamination were remained. The remained HM-MAGs were clustered to species-level genome bins (SGBs) based on average nucleic acid identity ≥95% using dRep (version 3.4.0). The GTDB-tk or a set of five linoleic and isomers of conjugated linoleic acids discernible in the data (as informed by the metabolomic analysis) (version 1.7.0) was used for taxonomic identification with Genome Taxonomy Database (r202). The abundance of each species was calculated using CoverM (version 0.6.1).

Genomic DNA was isolated from mouse feces TIANGEN Fecal DNA Extraction Kit (#DP328-02, TIANGEN, China). Quantitative real-time PCR (RT-qPCR) was conducted using Universal SYBR Green Fast qPCR Mix (#RK21203, ABclonal, China) in combination with specific primers listed in Table S4 and the extracted DNA templates. Amplification was carried out on the ROCHE LightCycler® 480 System (Rotor-Gene 6000 Software, Sydney, Australia). The relative bacterial abundance was determined using the -ΔCt method, with universal 16S rRNA gene primers for Eubacteria serving as the internal reference.

The sample was stored at -80 °C and thawed on ice prior to analysis. A 400 μL solution of methanol and water (7:3, v/v), which included an internal standard, was combined with 20 mg of the sample and vortexed for 3 min. Subsequently, the mixture underwent sonication in an ice bath for 10 min, followed by an additional vortex for 1 min. It was then placed at -20 °C for 30 min. The sample was centrifuged at 12,000 rpm for 10 min at 4 °C, after which the sediment was discarded. The supernatant was centrifuged again at the same speed for 3 min, and a 200 μL aliquot was collected for LC-MS analysis.

For LC-MS, we employed a Waters ACQUITY UPLC BEH C18 column (1.8 µm, 2.1 mm × 100 mm) maintained at 40 °C, with a flow rate of 0.4 mL/min and an injection volume of 2 μL. The solvent system consisted of water (0.1% formic acid) and acetonitrile (0.1% formic acid). The elution process began with 5% mobile phase B, followed by a linear gradient to 90% mobile phase B over 11 min, maintained for 1 min, and then quickly reverted to 5% within 0.1 min, followed by a 1.9-min hold to restore initial conditions.

For TOF MS scans, the mass range was established at 50-1000 Da, with an accumulation time of 200 ms and dynamic background subtraction activated. Product ion scan parameters included a mass range of 25-1000 Da, an accumulation time of 40 ms, collision energy set to 30 or -30 V, and a collision energy spread of 15. Additional specifications comprised a resolution of UNIT, a charge state of 1 to 1, intensity at 100 cps, exclusion of isotopes within 4 Da, a mass tolerance of 50 mDa, and monitoring a maximum of 12 candidate ions per cycle.

The quality control (QC) samples were prepared by pooling extracts from all study samples to evaluate the reproducibility of sample processing. Multiple QC samples were interspersed during instrumental analysis to monitor analytical repeatability. The results showed high overlap in total ion chromatogram (TIC) curves of metabolite detection, with consistent retention times and peak intensities, indicating good signal stability for the same sample analyzed at different times (Fig. S13A). Additionally, Pearson correlation analysis of QC samples demonstrated strong consistency, confirming the overall stability of the detection process (Fig. S13B).

The original data of mass spectrometry was converted into mzML format by ProteoWizard, and the peak extraction, alignment and retention time correction were carried out by the XCMS program. The "SVR" method is used to correct the peak area, and the peaks with a missing rate >50% in each group of samples are filtered. The screened peaks were corrected, and the identification information of metabolites was obtained by searching the self-built database in the laboratory, integrating the public database and the metDNA method.

The analysis software used for the identification of non-target metabolites is MetDNA. Priority of data using different databases: Metlin (http://metlin.scripps.edu/index.php), HMDB (https://hmdb.ca/) and Kegg (https://www.kegg.jp/). Mona (https://mona.fiehnlab.ucdavis.edu/), MassBank (http://www.massbank.jp/). followed by MetDNA construction and derivation, and finally Pubchem, ChemSpider. Non-target qualitative provides substances with a score of more than 0.5. Detailed matching rules are as follows: mz error of first-level and second-level matching is 25 ppm, rt error of first-level and second-level matching is 6 s, allowable error of parent ion Q1 searching is 25 ppm, and allowable error of MS2 searching is 50 ppm. rt carries out searching and filtering, with an allowable error of 60 s and the lowest second-level score of 0.3. The final score consists of three parts: fragment score and forward searching. The weight of the reverse search score is: the proportion of debris score = 0.1, the proportion of forward search score = 0.3, and the proportion of reverse search score = 0.6.

Levels of tryptamine, indolepyruvate, and indole-3-acetaldehyde in mouse fecal samples were measured using commercial ELISA kits (#F0233, F0442, F0450; FANNKEW Bioscience, China) following the manufacturer's instructions. In this format, antigen in the sample competes with a fixed amount of enzyme-labeled antigen (HRP-conjugated) for binding to a limited amount of specific antibody precoated on the 96-well plate. Briefly, standards and samples were diluted as required and added to the plate along with the HRP-conjugated antigen. After incubation at 37 °C for 1 h, the plate was washed five times with washing buffer to remove unbound components. Then, the TMB substrate solution was added and incubated in the dark for 15 min, followed by the addition of the stop solution. Absorbance was measured at 450 nm. The optical density is inversely proportional to the concentration of the target analyte in the sample. A standard curve was generated, and sample concentrations were interpolated accordingly. All samples were run in triplicate.

In order to identify microbial features associated with diets and the metabolites, we measured probabilities of co-occurrence between species (based on metagenomic data) and either all metabolites. The Mmvec, a neural network solution inspired by deep-learning processing, to build a log-transformed conditional probability matrix from each cross-omics feature pair and apply singular value decomposition in order to represent co-occurrence in the form of biplots. We chose the model where accuracy was highest for different initialization conditions for the gradient descent algorithm (batch-size of 1000, 2500, and 5000, learn-rate of 0.001, epochs 9999), with low cross-validation error and model likelihood.

After obtaining the co-occurrence probabilities between metabolites and microbes significantly enriched under specific dietary conditions, we retained, for each metabolite, the microbes with the highest and lowest co-occurrence probabilities. To ensure biological relevance, we further filtered these pairs by requiring that the microbe with the highest co-occurrence probability be enriched under the same dietary condition as the metabolite, while the microbe with the lowest co-occurrence probability be enriched under the opposite dietary condition.

To clarify where metabolites among diets come from, we utilized MetOrigin, an online tool for the integrative analysis of the metabolites and discriminating between host, food, microbiota, and co-metabolism activities associated with microbial metabolites. MetOrigin obtains the source information of metabolites by integrating multiple databases and cross-validating database IDs, chemical formulas, and compound names. In addition, the enrichment analysis of metabolite data was performed using MetaboAnalyst, which is a comprehensive platform dedicated to metabolomics data analysis. The hypergeometric test was used to calculate p values and corrected with the Benjamini and Hochberg (BH) method.

For parametric feature-wise multivariable testing, we used MaAsLin2, which finds associations between microbial features and metadata of interest (including responsive microbes and metabolites on diets). To focus on the leading microbes and decrease the false positive, we only consider the microbes with relative abundance more than 0.1% in each sample. MaAsLin2 can utilize a transformed generalized linear model to associate each feature iteratively with covariates.

The Procrustes Analysis applied by the R package vegan of the correlation among OTUs, genes and metabolites was performed based on the principal component analysis (PCA) matrix obtained from the package sklearn (version 1.1.1) in Python. Using PCA for each data, we kept ten dimensions, which in addition can keep more than 95% information of each dataset. All statistical tests were using "scipy" or "statsmodels" package in Python. Most data visualization was applied using "ggplot2" in R or "matplotlib" in Python. In addition, we used GraPhlAn to construct the cladogram. To evaluate whether the sample size was sufficient to detect meaningful biological effects, we performed a power analysis using the "effsize" R package. And the detailed results are in Table S5.

Comprehensive multi omics explore the microbial function in metabolic pathway flow during altered diet - npj Science of Food

POPULAR CATEGORY

misc

entertainment

corporate

research

wellness

athletics