The response to radiotherapy can vary greatly among individuals, even though advances in technology allow for the highly localized placement of therapeutic doses of radiation to a tumor. This variability in patient response to radiation is biologically driven, but the individuality of tumor and healthy tissue biology is not used to create individual treatment plans. Biomarkers of radiosensitivity, whether intrinsic or from hypoxia, would move radiation oncology from precision medicine to precise, personalized medicine. Charged particle radiotherapy allows for even greater dose conformity, but the biological advantages of charged particle radiotherapy have not yet been exploited. With appropriate biomarkers, one could envision biologically based clinical trials, the identification of patients for whom charged particles are most appropriate, and particle-selection strategies. Initially, biomarkers for low–linear energy transfer (LET) radiation responses should be tested against charged particles. Biomarkers of tumor radioresistance to low-LET radiations could be used to identify patients for whom the enhanced relative biological effectiveness (RBE) of charged particles would be more effective than low-LET radiations and those for whom specific DNA-repair inhibitors, in combination with charged particles, may also be appropriate. Furthermore, heavy charged particles can overcome the radioresistance of hypoxic tumors when used at the appropriate LET. Biomarkers for hypoxia could identify hypoxic tumors and, in combination with imaging, define hypoxic regions of a tumor for specific ion selection. Moreover, because of the enhanced RBE for charged particles, the risk for adverse healthy tissue effects may be greater, even though charged particles have greater tumor conformality. There are many validated healthy-tissue biomarkers available to test against charged particle exposures.
Lastly, newer biological techniques, as well as newer bioinformatic and computational methods, are rapidly changing the landscape for biomarker identification, validation, and clinical trial design.
In biology and medicine, the term biomarker describes a biological indicator of a normal or particular pathogenic biological process, such as an indicator of a disease or disease subtype, a prognostic indicator for the response to treatment irrespective of the therapy, or an indicator for choosing a specific therapy. Many researchers also use biomarker as shorthand to describe an association with a phenotype. The regional definitions for biomarkers are similar. In the United States, biomarker is defined by the US Food and Drug Administration as an objectively measured and evaluated indicator of normal or pathogenic processes or biological responses to a therapeutic intervention. The definition of a genomic biomarker by the International Conference on Harmonisation (ICH), which includes members of government and industry from around the world, is a measurable DNA and/or RNA characteristic that is an indicator of normal or pathogenic processes and/or response to therapeutic interventions (ICH E15). Conditions that support designation as a biomarker include epidemiologic evidence of disease association or disease causality. Depending on the setting in which the biomarker candidate was identified, changes in the candidate should, for example, reflect changes in prognosis or clinical outcomes. The biomarker candidate should be validated by examination in a similar, but independent, data set. Furthermore, predictive biomarkers should test negative in data sets in which the clinical conditions are similar but where a different treatment was used. For example, a candidate biomarker found in data from patients treated with radiation or with a pharmacologic agent should test negative in data from patients treated with surgery alone.
Most putative biomarkers are published and then vanish because they are rarely tested outside the data set from which they were derived. Many have failed for a variety of reasons unrelated to their initial identification. Simply put, biomedical research and medical records may not connect. In the past, biomarkers have come from specimens of convenience that may have been poorly annotated, were from heterogeneous tissues, or had errors in the clinical phenotype. Obstacles to biomarker development include lack of initial input into trial design and specimen access; the type of specimen available (fresh, frozen, fixed, or archival); and the size of the specimen. Beyond sample acquisition, data analysis may be complex because of variabilities in platforms, formats for data integration, ever-changing analytic tools (machine learning, artificial intelligence), different infrastructures and data storage, and patient privacy. Because data scientists are making rapid progress on those fronts through newer analytics and computational infrastructure (eg, graphical user interface–based programming and supercomputing), the ever-increasing volume of clinically annotated genomic data provides the opportunity to digitally test candidate biomarkers using independent data sets generated from properly annotated patient-derived specimens or preclinical models of disease. However, the final hurdle for designation as a biomarker is the prospective analysis of a candidate through biologically driven trials.
Biomarkers Associated with Radioresponse
The rationales for using radiosensitivity biomarkers include modifying total dose or dose fractionation schemes and determining rational chemotherapy combinations appropriate to the underlying tumor biology. Although predictive biomarkers for HER2 or EGFR mutations are routinely used to identify the response to targeted agents in breast and lung cancer, respectively, no such biomarker exists for radiation oncology, with the possible exception of human papillomavirus (HPV) status in head and neck squamous cell carcinoma (HNSCC) [1, 2]. Indeed, clinical trials for both dose de-escalation in HPV+ and dose escalation in HPV− oropharyngeal cancers have opened. Although HPV status is prognostic, it is used in disease management and may have predictive value as well. Interestingly, EGFR overexpression, common to HNSCC, as measured by immunohistochemistry (IHC), is a strong predictor of locoregional recurrence after either surgical radiotherapy or induction chemotherapy [3–6]. However, the role of EGFR expression as a biomarker remains controversial [7–9].
Other candidate biomarkers of tumor radiosensitivity have been identified and are being evaluated. Not surprisingly, these single-gene candidate biomarkers are associated with the DNA damage response (DDR). Moeller et al reported that overexpression of the Ku80 protein in tumor specimens acquired from 89 patients with HNSCC who were treated with intensity-modulated radiotherapy was associated with locoregional failure and poor survival. High expression of MRE11, a member of the MRN complex (MRE11, RAD50, and NBN), which recruits ATM to the sites of DNA double-strand breaks, correlates with local recurrence-free survival in patients with breast cancer who have received adjuvant radiotherapy and in patients with invasive bladder cancer who have received radiotherapy, but not in a cystectomy cohort. Subsequent independent studies have validated MRE11 as a predictive biomarker for radiotherapy. Copy number gain of NBN, another DDR gene, is a biomarker candidate for 5-year biochemical relapse-free survival in patients with prostate cancer treated with radiotherapy, but it was not associated with outcome in a prostatectomy cohort. AIMP3 expression in bladder cancer, determined by IHC, predicted good outcomes after radiotherapy but was not predictive in patients who underwent cystectomy. AIMP3 is a tumor suppressor and upstream regulator of p53, but the picture for p53 as a biomarker of radioresponse is not clear, likely because it extensively regulates a number of signaling pathways across a number of disease sites.
Although many of the DDR genes identified above as potential biomarkers were examined by IHC, the radioresponse of tumor cells and tumors has been predominantly studied by transcriptome analysis. Initial studies highlighted the role of DDR genes and pathways as key to the response to radiation exposures and suggested that those genes could be targeted to enhance radioresponse [16, 17]. Further transcript analyses of radiation-sensitivity signatures were conducted with clonogenic survival as the metric (surviving fraction at 2 Gy or 8 Gy) [18, 19] based on the radioresponse of all or part of the NCI-60 panel of cell lines (Division of Cancer Treatment & Diagnosis, National Cancer Institute, Frederick, Maryland). Interestingly, Amundson et al and Torres-Roca et al developed biosignatures of radiosensitivity that had no overlapping content. Studies applying the radiosensitivity index (RSI) across a number of clinical cohorts have corroborated some associations [19–23], but other investigators have concerns about its performance in other cell lines, and the RSI does not reliably predict local control [25, 26]. Such “negative” results should be considered only as constraints on the utility of the RSI and not as a reason to discount the continued development and refinement of the approach. The RSI, or similar approaches, may be tissue dependent and, thus, not applicable across tumor types. Such approaches may also be endpoint dependent, predictive of radioresponse in one case and prognostic in another.
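The clonogenic endpoints named above (surviving fraction at 2 Gy or 8 Gy) can be illustrated with a short numerical sketch. It assumes the widely used linear-quadratic survival model, which the signature studies cited here are not stated to have used, and the cell-line names and the alpha and beta values are hypothetical, chosen only to show how a radiosensitive and a radioresistant line separate on this metric.

```python
import math


def surviving_fraction(dose_gy, alpha, beta):
    """Linear-quadratic surviving fraction after a single dose in Gy.

    alpha (Gy^-1) and beta (Gy^-2) are cell-line-specific parameters.
    """
    return math.exp(-(alpha * dose_gy + beta * dose_gy ** 2))


# Hypothetical parameters for illustration only; not taken from any study.
cell_lines = {
    "sensitive_line": {"alpha": 0.45, "beta": 0.045},
    "resistant_line": {"alpha": 0.15, "beta": 0.015},
}

for name, params in cell_lines.items():
    sf2 = surviving_fraction(2.0, params["alpha"], params["beta"])
    sf8 = surviving_fraction(8.0, params["alpha"], params["beta"])
    print(f"{name}: SF2 = {sf2:.3f}, SF8 = {sf8:.3g}")
```

In a signature-discovery setting, values such as SF2 would serve as the phenotype against which gene expression is correlated; the model above is only one way to summarize a survival curve.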
Despite concerns about the translatability of biosignatures developed from cell lines grown in 2-dimensional tissue culture, researchers have successfully developed prognostic indicators of therapeutic outcome after concurrent chemoradiotherapy or radiotherapy alone for cancers including breast, pancreatic, lung adenocarcinoma, and glioma/glioblastoma [21–23, 27–30]. Khodarev et al identified an interferon-related DNA damage-resistance signature (IRDS) from cells isolated and cultured from a radioresistant SCC61 tumor. Weichselbaum et al identified IRDS+ and IRDS− states in a variety of tumor types using expression signatures from part of the NCI-60 cell line panel and data from a variety of tumors. With 7 genes from the original 52 genes that described the IRDS signature, they could predict response to adjuvant chemotherapy. The authors argued that this signature was better described as a DNA-damaging agent-response signature.
Charged Particle Biomarkers
The focus on radiosensitivity and the DNA-damage response genes and pathways in this review was intentional. Criteria for patient stratification for charged particle therapy should be based on biological advantages gained through advantageous dose distributions that limit dose to organs at risk. The use of biomarkers to identify the potential for advantageous biological responses could be general as well as specific. A general approach would simply identify patients who are more radioresistant and for whom charged particles would be more effective, particularly because charged particles produce greater levels of complex and clustered DNA lesions, which are more difficult to repair, as the localized ionization density rises with increasing particle LET. In this first pass at patient selection, the mechanism underlying radioresistance does not need to be understood. More-specific selection could examine biomarkers associated with DNA-repair pathways, such as homologous recombination repair; in a battery of lung tumor cell lines, homologous recombination deficiencies rendered cells vulnerable to proton radiotherapy, a vulnerability not seen when cells were irradiated with x-rays. Beyond intrinsic radiosensitivity, specific biomarkers could guide the use of targeted agents that render tumor cells more radiosensitive, for example, a PARP inhibitor in tumors that display “BRCA-ness.”
Although the enhanced ability of heavy charged particles to kill cells should lead to a survival benefit, other attributes of heavy charged particles can also be exploited. High-LET radiation is especially effective at overcoming the therapeutic resistance of hypoxic cells and tumors. Baumann et al demonstrated the clinical benefit of overcoming hypoxia with x-rays: the hypoxic cell radiosensitizer nimorazole, in combination with conventional radiotherapy for advanced laryngeal and pharyngeal cancers, raised overall survival from 27% to 44%. Identifying hypoxic regions is not straightforward. More than 30 hypoxia gene signatures have been described. Their usefulness may be tissue specific [36–38] because they do not necessarily validate when tested across different tumor types. Nonetheless, even if specific to a given tissue, a hypoxia biomarker/biosignature could be used to stratify patients onto clinical trials, including trials with hypofractionated schedules. Furthermore, a biosignature for hypoxia could be used to select other ions besides 12C because the LET of 12C ions used in a spread-out Bragg peak will not completely overcome hypoxia. For example, 16O ions, with their higher LET, are more effective at overcoming hypoxia but may cause excessive adverse healthy tissue responses along the entry pathway. However, targeting regions of hypoxia as part of a multi-ion therapeutic approach is feasible and under discussion at the National Institute of Radiological Sciences (Chiba, Japan) and the Heidelberg Ion-Beam Therapy Center (Heidelberg, Germany). In addition, although there is no therapeutic 16O beam, research is ongoing to characterize such a beam, along with other ions, for their potential therapeutic benefit.
Biomarkers of Adverse Healthy-Tissue Responses
Healthy-tissue toxicity limits the dose that can be used in radiotherapy because the damage to healthy tissues within the treatment field can be debilitating to the patient's quality of life or even be life threatening. Some of the dramatic improvements seen in radiotherapy outcomes have simply been due to limiting treatment fields to spare as much healthy tissue as possible from intermediate or high doses, sometimes at the cost of a low-dose bath over a large volume of healthy tissue. Charged particles offer far greater conformality than conventional radiotherapy, including hypofractionated approaches, such as stereotactic ablative radiotherapy. The Dutch health care governance bodies have adopted a model for patient selection for proton use based on normal tissue complication probabilities (NTCP) [39, 40]. In that approach, patients are selected for proton therapy when they are most likely to experience fewer serious adverse events with state-of-the-art proton therapy than with photon therapy, as judged by predictive NTCP models that take into account both dose to, and volume of, at-risk healthy tissues. The predictive marker upon which patient-selection decisions are made is based on dosimetric signature comparisons of proton versus photon treatment plans. Treatment plan optimization is critical for the success of such an approach, which could be extended beyond protons to other charged particles. NTCP modeling, based on optimized treatment plan comparisons, would be highly appropriate for artificial intelligence analytic approaches (see below). Patient-selection strategies for charged particle clinical trials could be optimized further by integrating genetic information, such as biomarkers of radiotherapy toxicity.
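The plan-comparison logic of NTCP-based selection can be sketched in a few lines. The example below assumes the Lyman-Kutcher-Burman formulation, which is one common NTCP model and is not necessarily the one used in the Dutch approach; the dose-volume histograms, the parameters (TD50, m, n), and the selection threshold are hypothetical values chosen for illustration.

```python
import math


def geud(doses_gy, volume_fractions, n):
    """Generalized equivalent uniform dose from a differential DVH."""
    total = sum(volume_fractions)
    weighted = sum((v / total) * d ** (1.0 / n)
                   for d, v in zip(doses_gy, volume_fractions))
    return weighted ** n


def lkb_ntcp(doses_gy, volume_fractions, td50, m, n):
    """Lyman-Kutcher-Burman NTCP: normal CDF of (gEUD - TD50) / (m * TD50)."""
    t = (geud(doses_gy, volume_fractions, n) - td50) / (m * td50)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def select_for_protons(ntcp_photon, ntcp_proton, threshold):
    """Select protons when the predicted NTCP reduction meets the threshold."""
    return (ntcp_photon - ntcp_proton) >= threshold


# Hypothetical organ-at-risk DVHs (dose bins in Gy, fractional volumes).
photon_dvh = ([10.0, 30.0, 50.0], [0.5, 0.3, 0.2])
proton_dvh = ([5.0, 20.0, 50.0], [0.6, 0.25, 0.15])

# Hypothetical model parameters; n = 1 makes gEUD equal to the mean dose.
params = {"td50": 40.0, "m": 0.3, "n": 1.0}

ntcp_photon = lkb_ntcp(*photon_dvh, **params)
ntcp_proton = lkb_ntcp(*proton_dvh, **params)
selected = select_for_protons(ntcp_photon, ntcp_proton, threshold=0.05)
print(f"photon NTCP = {ntcp_photon:.3f}, proton NTCP = {ntcp_proton:.3f}, "
      f"select protons: {selected}")
```

Because the decision rests on the difference between two optimized plans, any weakness in plan optimization propagates directly into the selection, which is why plan quality matters so much in this approach.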
The Radiogenomics: Assessment of Polymorphisms for Predicting the Effects of Radiotherapy program began in 2004 as a collaboration between scientists and clinicians in the oncology departments at Cambridge University (Cambridge, England) and the University of Manchester (Manchester, England). In 2009, the Radiogenomics Consortium was established by the National Cancer Institute and now includes more than 200 investigators at more than 130 institutions, who are focused on developing the large-scale genome-wide association studies necessary to identify genetic factors associated with response to radiation therapy [41–43]. The Radiogenomics Consortium holds that the healthy tissue response across the radiotherapy population is a complex polygenic trait in which many common, single-nucleotide polymorphisms (SNPs) and rare variants, combined, modulate the tissue response to radiation exposures, such as those used clinically. Identifying such variants requires a whole-genome approach to identify the SNPs associated with a particular phenotype. Genome-wide association studies can require very large sample sets, depending on the phenotype tested and the penetrance of particular SNPs. The Radiogenomics Consortium has identified 17 SNPs associated with healthy-tissue radiation toxicities to date. Many SNPs in genes associated with the DDR, inflammatory response, cytokine release, myotubule formation, or the regulation of histone methylation are represented. Ultimately, the Radiogenomics Consortium's goal is to build predictive models of healthy-tissue toxicity, which would include genomic, dosimetric, and clinical variables. Prospective trials based on biomarkers are underway and include the now-complete, prospective clinical trial for breast fibrosis that followed apoptosis in CD8+ cells and the subsequent REQUITE validation study (see http://www.requite.eu/node/135).
The REQUITE project will validate known predictors of adverse reactions through a prospective observational study of 5300 patients with prostate, breast, or lung cancers. Extending that approach to charged particle radiotherapy, that is, using radiation-toxicity biomarkers as part of the NTCP modeling to select patients for protons or heavier charged particles, is highly appropriate because the risk of adverse events, although tempered by the reduced volumes of healthy tissue irradiated, may be greater, given the relative biological effectiveness of charged particles, including protons. Finally, a cautionary note: because of the potential to combine immunotherapy with conventional x-ray therapy, stereotactic ablative radiotherapy, or charged particle radiotherapy (the latter 2 employing high-dose limited-fractionation schemes), it may be especially prudent to consider SNP biomarkers of adverse healthy-tissue response found in genes that are associated with or drive inflammatory and immune responses.
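The polygenic view of healthy-tissue toxicity described in this section is often operationalized as a weighted risk-allele score. The sketch below shows that arithmetic only; the SNP identifiers and per-allele weights are hypothetical placeholders, not the variants or effect sizes reported by the Radiogenomics Consortium, and treating an untyped SNP as zero risk alleles is a simplifying assumption.

```python
def polygenic_risk_score(genotype, weights):
    """Sum of risk-allele counts (0, 1, or 2) weighted per SNP.

    genotype: dict mapping SNP id to risk-allele count for one patient.
    weights: dict mapping SNP id to a per-allele effect size.
    SNPs missing from the genotype are counted as 0 (an assumption).
    """
    score = 0.0
    for snp, weight in weights.items():
        count = genotype.get(snp, 0)
        if count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {snp}: {count}")
        score += weight * count
    return score


# Hypothetical SNP ids and weights, for illustration only.
weights = {"snp_a": 0.10, "snp_b": 0.25, "snp_c": 0.05}
patient = {"snp_a": 2, "snp_b": 1}

print(f"risk score = {polygenic_risk_score(patient, weights):.2f}")
```

A score of this kind could enter a toxicity model as one covariate alongside dosimetric and clinical variables, which is the combination of inputs the consortium's predictive models aim for.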
Artificial Intelligence in Biomarker Development
Artificial intelligence (AI) is a process through which machines mimic human cognitive functions, such as learning and problem solving. AI technologies have progressed greatly in recent years and have proven both transformative and disruptive in many fields, such as computer vision, decision making, natural language processing, audio processing, and automobile autopiloting. In addition, AI is expected to have a significant effect on health care, especially in the areas of individualized and precision medicine.
Given the recent advances in big-data science and the capture of large volumes of clinically annotated “omics” data (genomics, epigenomics, proteomics, and metabolomics) in precision medicine, data-mining algorithms are in great demand to establish genotype-phenotype relationships, particularly where there are multiomics data sets. Importantly, a single patient-derived sample set could have a mutation analysis, copy number variant analysis, and transcriptome and epigenomic data. Machine learning, a subset of AI, has been applied to large data sets to recognize complex patterns based on empiric data [46, 47], so that the machine can learn from prior data to make predictions or decisions about future data [48, 49]. Machine learning approaches, such as artificial neural networks, support vector machines, and decision trees, among others, have already been applied to clinicopathologic and genomic data sets to predict cancer survival or disease recurrence rates [50–52], as examples. In addition, AI is being applied in radiation oncology to predict radioresistance based on multiomics data using more-sophisticated AI techniques, such as deep learning.
In photon therapy, partially because of the large amount of accumulated data (including imaging, dosimetry, and outcome), AI has been explored for a number of applications, such as automatic treatment planning [53, 54] and treatment-outcome predictions [55–57]. Among those developments, “radiomics” has become a promising field that involves extracting large amounts of quantitative features from medical images and mining those features for clinical decision support [58–62]. When treatment response and therapeutic efficacy can be predicted early by radiomics models, intensified treatment, such as additional radiation therapy and systemic therapy, could be applied in time to improve the overall treatment outcome. Recently, deep learning has been applied to different tasks in photon therapy, such as automated adaptation for lung cancer and toxicity prediction for cervical cancer. Unlike radiomics-based methods, deep learning extracts features from input images and data automatically, avoiding handcrafted feature extraction. Because deep learning models contain many parameters, large training data sets are needed; transfer learning is therefore often adopted in radiation therapy applications, in which data sets are relatively small compared with those in other domains.
Furthermore, AI has been explored in charged particle therapy. Using patient-specific feature sets and a library of historic plans, Valdes et al demonstrated the applicability of clinical decision support for proton-treatment planning for postoperative oropharyngeal cancers. Gueth et al developed a machine learning–based, patient-specific, prompt-γ dose-monitoring method in proton therapy through Monte Carlo simulations. Sun et al explored 3 machine-learning algorithms to predict monitor units for a compact proton machine. For treatment-outcome prediction, attempts have been made to consolidate data from different proton centers. Using Proton Collaborative Group data, a machine-learning algorithm was developed to predict treatment response to proton therapy for patients with prostate cancer. With the continuously growing interest in, and practice of, charged particle therapy, more outcome data will be accumulated. Soon, AI will have an indispensable role in recommending treatment modality and predicting treatment outcome and toxicity. In addition to imaging and dosimetry data, incorporating genomic, pathologic, and clinical data (electronic health records) could further improve the predictive power of AI for charged particle therapy.
For charged particle radiotherapy and, importantly, for clinical trial design, patient selection should be biologically based. The biologic basis for charged-particle radiotherapy trials is the physics of the interaction of charged particles with biologic materials compared with x-rays. Biomarkers of radioresistance should be used to direct individuals to heavy ion therapy to take advantage of the induction of clustered and complex DNA lesions afforded by high-LET particles. Because the number of heavy ion centers is extremely limited, LET painting strategies for proton therapy, in which RBEs are greater than 1.1, are also appropriate for radioresistant tumors.
Because of the inherent radioresistance of hypoxic tumors, biomarkers of hypoxia could also be used to direct patients to heavy ion radiotherapy. Biomarkers of hypoxia could be used for ion selection, that is, the use of 16O as a boost to hypoxic regions of a tumor. Although protons alone offer no biophysical advantage to hypoxic tumors, the combination of proton LET-painting schemes and hypoxic-cell radiosensitizers may be an effective strategy.
Lastly, because radiotherapy is ultimately driven by the response of healthy tissues, models of NTCP based on dose and healthy tissue volume constraints, which include biomarkers of risk for adverse healthy-tissue responses specific to the anatomy in the radiation field, should be pursued. Increasingly, AI is likely to have a large role in such approaches. Similar to biomarkers of radioresistance, the current and future biomarkers of adverse healthy-tissue responses should be readily applicable to charged-particle exposures.
ADDITIONAL INFORMATION AND DECLARATIONS
Conflicts of Interest: The authors have no conflicts of interest to disclose.
Acknowledgments: This work was supported by the National Institutes of Health (grant P20 CA183639 to H Choy).