Disparities in incidence and outcome of rectal cancer are multifactorial in etiology but may be due, in part, to differences in gut microbiome composition. We used a series of robust statistical approaches to assess baseline gut microbiome composition in a diverse cohort of patients with rectal cancer receiving definitive treatment.
Microbiome composition was compared by age at diagnosis (< 50 vs ≥ 50 years), race and ethnicity (White Hispanic vs non-Hispanic), and response to therapy. Alpha diversity was assessed using the Shannon, Chao1, and Simpson diversity measures. Beta diversity was explored using both Bray-Curtis dissimilarity and Aitchison distance with principal coordinate analysis. To minimize false-positive findings, we used two distinct methods for differential abundance testing: LinDA and MaAsLin2 (all statistics two-sided, Benjamini-Hochberg corrected false discovery rate < 0.05).
Among 64 patients (47% White Hispanic) with median age 51 years, beta diversity metrics showed significant clustering by race and ethnicity (p < 0.001 by both metrics) and by age of onset (Aitchison p = 0.022, Bray-Curtis p = 0.035). White Hispanic patients had enrichment of bacterial family Prevotellaceae (LinDA fold change 5.32, MaAsLin2 fold change 5.11, combined adjusted p = 0.0007). No significant differences in microbiome composition were associated with neoadjuvant therapy response.
We identified distinct gut microbiome signatures associated with race and ethnicity and age of onset in a diverse cohort of patients undergoing definitive treatment for rectal cancer.
INTRODUCTION
The increasing incidence of early-onset colorectal cancer (EOCRC), defined as a diagnosis of CRC in patients younger than 50 years, has become a growing concern over the last four decades.[1] This trend is particularly associated with rectal tumors, with notable racial and ethnic disparities in presentation and outcome.[2] For instance, Black individuals have the highest EOCRC incidence and mortality rates,[3] whereas Hispanic patients, despite lower overall incidence, tend to be diagnosed at younger ages compared to non-Hispanic White individuals.[4,5]
The mechanisms behind this increasing incidence and observed disparities are not understood. Genomic studies have not yet identified meaningful molecular differences by age, and heritable conditions only account for a fraction of cases in young patients.[6] This scenario suggests a potential role for other factors, such as the gut microbiome, in the pathogenesis of EOCRC.[7] Previous studies have linked specific microbes with CRC, with some data supporting causality in animal models.[8,9] Regarding age, prior research suggests that younger patients tend to gain “harmful” taxa in contrast to older patients who lose “beneficial” taxa.[10,11] Furthermore, healthy humans have distinct microbiome profiles by race and ethnicity, with one study showing an overabundance of CRC-associated bacteria in Black individuals.[12,13]
However, there remains a scarcity of research on microbiome variations by race and ethnicity among patients with cancer, in part due to minority underrepresentation in clinical trials and biospecimen repositories.[14,15] Understanding microbiome differences, such as differential taxa abundance or degree of diversity, could provide insight into the mechanisms underlying the increasing incidence of rectal cancer and the disparities in outcomes. As such, we assessed differences in gut microbiome composition and association with race and ethnicity, age of onset, and treatment outcome among a diverse rectal cancer patient cohort.
METHODS
Patient Cohort
The University of Texas Southwestern institutional review board approved this prospective study, and consent was obtained from all participating patients. Adults (age > 18 years) with newly diagnosed stage II–IV rectal adenocarcinoma undergoing definitive-intent treatment were included in this study. Specifically, this included patients with limited stage IV disease who were treated with curative local therapy including radiation. Patients who had undergone up-front abdominoperineal resection or diverting colostomy for obstruction were excluded due to the effects of these surgeries on their microbiomes, which could skew analyses. Data collected included cancer TNM stage (AJCC 8th edition), age at diagnosis, sex, self-reported race and ethnicity, body mass index (BMI), treatment regimen, medical oncology facility, detailed antibiotic use, and follow-up magnetic resonance imaging (MRI) and pathology reports. Antibiotic use was stratified into three groups: (1) patients with no antibiotics or a single cefazolin administration more than 90 days prior to stool collection; (2) those with a single cefazolin administration within the past 90 days; and (3) patients who received a broad-spectrum antibiotic active against anaerobes in the previous 180 days. Parkland Health and Hospital System is the sole safety-net health system in Dallas, TX and provides oncology care for a diverse population of uninsured persons in North Texas.
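The three-level antibiotic stratification can be written out as a small rule, shown here only as an illustrative sketch in Python. The function name, its arguments, the inclusive day boundaries, and the precedence of group 3 over group 2 are assumptions made for illustration; the text does not specify how overlapping exposures or boundary days were handled.

```python
from datetime import date
from typing import Optional


def antibiotic_group(
    stool_date: date,
    cefazolin_date: Optional[date] = None,
    broad_spectrum_date: Optional[date] = None,
) -> int:
    """Return 1, 2, or 3 for the three exposure groups described in the text.

    Assumptions (not stated in the source): group 3 takes precedence over
    group 2, and the day windows are inclusive of the boundary day.
    """
    # Group 3: broad-spectrum, anaerobe-active antibiotic in the previous 180 days.
    if broad_spectrum_date is not None:
        days = (stool_date - broad_spectrum_date).days
        if 0 <= days <= 180:
            return 3
    # Group 2: single cefazolin administration within the past 90 days.
    if cefazolin_date is not None:
        days = (stool_date - cefazolin_date).days
        if 0 <= days <= 90:
            return 2
    # Group 1: no antibiotics, or cefazolin more than 90 days before collection.
    return 1
```

For example, a patient whose only exposure was cefazolin 31 days before stool collection would fall into group 2 under these assumptions.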
Response Grading
Response was defined primarily pathologically, with radiographic grading used for patients without primary surgery. The pathologic specimen was graded via the modified Ryan scheme for tumor regression, which includes the following four categories: complete response, near complete response, partial response, and poor or no response.[16] For patients who did not undergo primary surgery, response was graded based upon restaging MRI performed after chemotherapy and radiation. MRI tumor regression grade includes the following categories: complete response, near complete response, moderate response, and slight response. The reason for the patient not undergoing primary surgery was also recorded. For the purposes of our study, both radiographic and pathologic response were dichotomized into complete or near complete versus partial, poor, moderate, or slight response.
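The dichotomization of the two grading schemes can be sketched as a lookup, given here in Python as an illustration rather than the study's actual code. The table and function names are hypothetical, and the label "partial or poor" for the less favorable group is borrowed from the contrast named in the statistical analysis.

```python
# Hypothetical lookup tables; category names follow the text.
FAVORABLE = "complete or near complete"
UNFAVORABLE = "partial or poor"

PATHOLOGIC_GRADES = {  # modified Ryan tumor regression categories
    "complete response": FAVORABLE,
    "near complete response": FAVORABLE,
    "partial response": UNFAVORABLE,
    "poor or no response": UNFAVORABLE,
}

MRI_GRADES = {  # MRI tumor regression grade categories
    "complete response": FAVORABLE,
    "near complete response": FAVORABLE,
    "moderate response": UNFAVORABLE,
    "slight response": UNFAVORABLE,
}


def dichotomize_response(grade: str, modality: str) -> str:
    """Map a pathologic or MRI regression grade to the binary response label."""
    tables = {"pathologic": PATHOLOGIC_GRADES, "mri": MRI_GRADES}
    key = modality.strip().lower()
    if key not in tables:
        raise ValueError(f"unknown modality: {modality!r}")
    label = grade.strip().lower()
    if label not in tables[key]:
        raise ValueError(f"{grade!r} is not a {key} grade")
    return tables[key][label]
```

Keeping one table per modality makes a mismatched grade (for example, an MRI-only category supplied as a pathologic grade) raise an error instead of being silently classified.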
Stool Collection and Processing
Patients provided stool samples prior to radiation initiation, which were subsequently stored at −80°C. Genomic DNA was extracted from these samples using the MagAttract Power Microbiome DNA/RNA KF kit (Qiagen) and Kingfisher Flex (Thermo Fisher Scientific). From each sample, 16S rRNA genes (variable region 4, V4) were amplified using uniquely bar-coded primers. Polymerase chain reaction (PCR) mixtures consisted of Accuprime Pfx Supermix, primers, and template. Following amplification, PCR products were verified, cleaned, and normalized. Pooled samples were sequenced using Illumina MiSeq (PE-250).[17] After sequencing, raw sequences were quality filtered and primer mismatches or ambiguous bases were removed. Alignment and read count from FASTQ files were conducted with DADA2 in R.[18] Taxa were assigned to the genus level with the Silva nr 99 v138.1 training data, and species were assigned with exact matches to ASVs with the Silva species assignment v138.1 dataset.[19]
Statistical Analysis
Differential abundance (DA) testing was conducted at all taxonomic levels except domain using LinDA and MaAsLin2, with 20% abundance filtering performed at each level.[20–22] Four contrasts were investigated: early vs average onset, White Hispanic vs non-Hispanic race and ethnicity, complete or near complete response vs partial or poor response, and broad-spectrum vs cefazolin vs no antibiotic use. Fold change p-values from both methods were combined using the Cauchy combination test[23] and adjusted for multiple testing with the Benjamini-Hochberg method (p < 0.05 considered significant). In all DA tests, mixed-effects models were used with sequencing batch as a random effect and antibiotic use as a fixed effect.[24]
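As an illustration of the p-value handling described above, the following Python sketch combines two per-taxon p-values with an equal-weight Cauchy combination and then applies the Benjamini-Hochberg adjustment. It is a minimal sketch, not the study's code: the function names are illustrative, and equal weighting of the two methods is an assumption.

```python
import math


def cauchy_combine(pvalues):
    """Equal-weight Cauchy combination of p-values for one taxon.

    The statistic is the mean of tan((0.5 - p) * pi); the combined p-value
    is the upper tail of a standard Cauchy distribution at that statistic.
    """
    stat = sum(math.tan((0.5 - p) * math.pi) for p in pvalues) / len(pvalues)
    return 0.5 - math.atan(stat) / math.pi


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical (LinDA, MaAsLin2) p-value pairs for three taxa.
pairs = [(0.001, 0.002), (0.20, 0.35), (0.04, 0.03)]
combined = [cauchy_combine(pair) for pair in pairs]
adjusted = benjamini_hochberg(combined)
significant = [q < 0.05 for q in adjusted]
```

When both methods report the same p-value, the combined value equals that p-value, which is a convenient sanity check on the combination step.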
Baseline patient characteristics and potential associations with response were assessed using Fisher exact tests or Wilcoxon rank-sum tests. The Wilcoxon rank-sum test additionally evaluated alpha diversity indices (Shannon, Chao1, and Simpson, significance considered at p < 0.05). Beta diversity was assessed with principal coordinate analysis (PCoA) of both the Bray-Curtis dissimilarity and the robust Aitchison[25] distance (10% abundance filtering, counts adjusted for batch effects using the ConQuR[26] package with default settings for logistic LASSO correction including batch size).
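For reference, the three alpha diversity indices can be computed from a vector of per-taxon counts as in the Python sketch below. This is illustrative only: the Simpson index is given in its 1 − D (Gini-Simpson) form and Chao1 in its bias-corrected form, both of which are assumptions, since the text does not state which variants were used.

```python
import math


def shannon(counts):
    """Shannon index: -sum(p * ln p) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Gini-Simpson index: 1 - sum(p^2). Assumed variant."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate. Assumed variant.

    S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)), where F1 and F2 are the numbers
    of taxa observed exactly once and exactly twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical count vector for one sample.
sample = [5, 1, 1, 2, 0]
indices = (shannon(sample), simpson(sample), chao1(sample))
```

A perfectly even community of four taxa gives a Shannon index of ln 4 and a Gini-Simpson index of 0.75, which is a quick way to check an implementation.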
Significant clustering on PCoA was assessed through permutational multivariate ANOVA (PERMANOVA), and differences in dispersion were evaluated using the betadisper function, both from the vegan package. All permutation tests were performed with 100,000 permutations, and all calculations were done in R (version 4.2.1). The sequencing data generated in this study are publicly available in the NIH SRA (SUB12910880), and all code for computations is available in the public GitHub repository (DavidHein96/microbiome_crc_workflow).
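To make the beta diversity workflow concrete, the Python sketch below computes Bray-Curtis dissimilarity and a simple Aitchison distance, then runs a one-way PERMANOVA on the resulting distance matrix. It is a simplified illustration and not a reimplementation of the vegan routines: the Aitchison distance here uses a pseudocount before the centered log-ratio transform instead of the robust variant used in the study, the model has a single grouping factor with no covariates or batch correction, and all function names and the pseudocount value are assumptions.

```python
import math
import random


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two count vectors."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(a + b for a, b in zip(x, y))


def clr(x, pseudocount=0.5):
    """Centered log-ratio transform; the pseudocount is an assumed zero fix."""
    logs = [math.log(v + pseudocount) for v in x]
    mean = sum(logs) / len(logs)
    return [value - mean for value in logs]


def aitchison(x, y, pseudocount=0.5):
    """Euclidean distance between CLR-transformed vectors."""
    return math.dist(clr(x, pseudocount), clr(y, pseudocount))


def distance_matrix(samples, metric):
    n = len(samples)
    return [[metric(samples[i], samples[j]) for j in range(n)] for i in range(n)]


def pseudo_f(dist, labels):
    """One-way PERMANOVA pseudo-F from a distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        pairs = sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += pairs / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(dist, labels, n_perm=999, seed=0):
    """Return (pseudo-F, permutation p-value) using shuffled group labels."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: two groups dominated by different taxa.
samples = [[10, 1, 1], [12, 1, 2], [11, 2, 1], [9, 1, 1],
           [1, 10, 1], [2, 12, 1], [1, 11, 2], [1, 9, 1]]
labels = ["A"] * 4 + ["B"] * 4
f_stat, p_value = permanova(distance_matrix(samples, bray_curtis), labels)
```

Passing `aitchison` in place of `bray_curtis` to `distance_matrix` gives the compositional version of the same test. A significant PERMANOVA result can reflect differences in group dispersion as well as location, which is why a separate dispersion test accompanies it in the analysis.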
RESULTS
Between October 2020 and August 2022, 64 patients met inclusion criteria for response analyses, including 30 (47%) White Hispanic patients (Table 1), and approximately half (47%) of patients were diagnosed before the age of 50. Median age was 49 years for White Hispanic and 52.5 years for non-Hispanic patients (p = 0.13, Wilcoxon). Overall, 52% had a complete or near complete response to therapy when pathologic and radiographic responses were dichotomized (Table 1). White Hispanic patients had a significantly higher median BMI than non-Hispanic patients (29.55 vs 25.4, p = 0.008).
When comparing race and ethnicity, we found that White Hispanic patients had significant enrichment of bacterial family Prevotellaceae (LinDA fold change = 5.32, MaAsLin2 fold change 5.11, combined adjusted p < 0.001, Figs. 1 and 2, Supplemental Table S1) compared to non-Hispanic patients. Additionally, both beta diversity metrics showed significant clustering by race and ethnicity (Figs. 3A and B, p < 0.001 in both metrics, PERMANOVA). In contrast, there were no differences in alpha diversity (Fig. 4A).
We did not observe any significantly differentially abundant taxa by CRC age of onset category. We did, however, observe significantly lower Shannon diversity in EOCRC patients (p = 0.029; Fig. 4B) and both beta diversity metrics showed modest clustering (Aitchison p = 0.022, Bray-Curtis p = 0.035; Figs. 3C and D).
Variables associated with response were initial T stage (p = 0.042, Table 1) and medical oncology facility (p = 0.123, Table 1). These variables were included as covariates in the DA analysis for response. However, both before and after controlling for T stage and facility, there were no significant differences in taxa abundance or beta and alpha diversity between complete or near complete and partial or poor responders (Figs. 3E, 3F, and 4C).
Finally, patients with broad spectrum antibiotic use were enriched in family Enterococcaceae (LinDA fold change = 4.15, MaAsLin2 fold change 2.79, combined adjusted p = 0.005, reference group no recent antibiotic use, Supplemental Table S2).
DISCUSSION
In this study of baseline microbiome features in patients with rectal cancer receiving definitive therapy, we found clustering of microbiome composition by race and ethnicity with significant enrichment of Prevotellaceae among White Hispanic patients. Abundance of Prevotella, the major genus within Prevotellaceae, has been associated with positive traits, such as plant-based diets and improvements in glucose metabolism.[27,28] However, other studies have documented negative associations with Prevotellaceae, including increased risk of inflammatory disease, higher rates of chemotherapy-induced toxicity, and decreased response to chemotherapy in CRC mouse models.[27–31] The various reported effects of Prevotellaceae are likely due to their large strain diversity.[32] Related to our findings, a study of Hispanic individuals showed that a higher Prevotella to Bacteroides ratio was associated with obesity.[33] Interestingly, an analysis of healthy subjects in the American Gut Project did not find Prevotellaceae to be differentially abundant across race and ethnicity.[12] In our cohort, White Hispanic patients had significantly higher BMI, which is a potential risk factor for colorectal cancers.[34]
We found a modest degree of clustering by beta diversity (comparison of gut microbiome populations between groups) and no significant differences in specific taxa abundance between early- and average-onset patients. A prior meta-analysis suggested that younger patients have increased abundance of “carcinogenic” taxa.[10] Multiple studies have compared microbiome composition in early- versus average-onset CRC to determine whether microbiome profiles can serve as a diagnostic biomarker and have prognostic or predictive value.[10,11,35,36] The results are discordant, with some studies suggesting association of EOCRC with specific bacterial colonization and others showing no differences, including our cohort.
Antibiotics can have profound effects on gut microbiome populations. We separated patients who received multiple doses of broad-spectrum antibiotics due to active infection from those who received only a single intravenous administration of cefazolin as procedural prophylaxis. We found that exposure to broad-spectrum antibiotics, specifically those that are effective in killing anaerobic taxa, was associated with expansion of Enterococcaceae. Interestingly, Enterococcus spp. expansion was also observed in adult and pediatric stem cell transplant patients who received broad-spectrum antibiotics and developed the posttransplant immune-mediated complication graft-versus-host disease (GVHD), a result that was phenocopied in a preclinical GVHD model.[37,38] A recent study demonstrated that an Enterococcus faecalis–derived metabolite was able to promote colorectal cancer progression in vitro.[39]
Limitations of this study include modest sample size, use of the shorter-term end point of treatment response,[40] and observational study design. The negative findings on treatment response are complicated by the wide range of treatments received and varying baseline characteristics of our patients. A lack of overlap on and incomplete capture of confounding variables precludes our ability to perform further analysis on treatment response. Future studies with larger sample sizes, longitudinal sampling and microbiome profiling, and use of in vitro and in vivo laboratory approaches to investigate causality are warranted.
Most microbiome studies use a single DA method, which can result in a high false-discovery rate.[41–44] A strength of our study is that we report only taxa identified as enriched or depleted by both DA methods and that we used mixed-effects models to control for antibiotic use and batch effects. The choice of these two DA methods adds robustness to our results because they use different normalization methods: cumulative sum scaling in MaAsLin2 and centered log ratio with correction for library size bias in LinDA. Furthermore, we used both a traditional (Bray-Curtis dissimilarity) and a compositional (Aitchison distance) beta diversity metric and controlled for batch effects using ConQuR.
CONCLUSION
We identified microbiome composition differences by race and ethnicity in a diverse cohort of patients undergoing definitive treatment for rectal cancer. Future studies are warranted to examine potential mechanisms by which gut microbiome composition may affect CRC carcinogenesis and treatment effect. This could lead the way to therapeutic interventions that improve outcomes, particularly in traditionally understudied and underserved populations.
Supplemental Material
Supplemental materials are available online with the article.
Nina N. Sanford and Andrew Y. Koh contributed equally to this work.
Hein DM, Coughlin LA, Poulides N, Koh AY, Sanford NN. Assessment of distinct gut microbiome signatures in a diverse cohort of patients undergoing definitive treatment for rectal cancer
Source of Support: Andrew Y. Koh received financial support from the National Institutes of Health (grant K24 AI123163) and the University of Texas Southwestern Medical Center and Children's Health Cellular and Immuno Therapeutics Program. Andrew Y. Koh and Nina N. Sanford received financial support from Harold C. Simmons Comprehensive Cancer Center Early Onset Colorectal Cancer Pilot Grant.
Conflict of Interest: Andrew Y. Koh is a consultant for Prolacta Biosciences, received research funding from Novartis and Prolacta, and is a cofounder of Aumenta. The remaining authors have no disclosures.