Pencil beam (PB) analytical algorithms have been the standard of care for proton therapy dose calculations. The introduction of Monte Carlo (MC) algorithms may provide more robust and accurate planning and can improve therapeutic benefit. We conducted a dosimetric analysis to quantify the differences between MC and PB algorithms in the clinical setting of dose-painted nasopharyngeal cancer intensity-modulated proton radiotherapy.
Plans of 14 patients treated with PB analytical algorithm optimized and calculated (PBPB) were retrospectively analyzed. The PBPB plans were recalculated using MC to generate PBMC plans and, finally, reoptimized and recalculated with MC to generate MCMC plans. The plans were compared across several dosimetric endpoints and correlated with documented toxicity. Robustness of the planning scenarios (PBPB, PBMC, MCMC) in the presence of setup and range uncertainties was compared.
A median decrease of up to 5 Gy (P < .05) was observed in coverage of planning target volume high-risk, intermediate-risk, and low-risk volumes when PB plans were recalculated using the MC algorithm. This loss in coverage was regained by reoptimizing with MC, albeit with a slightly higher dose to normal tissues but within the standard tolerance limits. The robustness of both PB and MC plans remained similar in the presence of setup and range uncertainties. The MC-calculated mean dose to the oral avoidance structure, along with changes in global maximum dose between PB and MC dosimetry, may be associated with acute toxicity-related events.
Retrospective analyses of plan dosimetry quantified a loss of coverage with PB that could be recovered under MC optimization. MC optimization should be performed for the complex dosimetry in patients with nasopharyngeal carcinoma before plan acceptance and should also be used in correlative studies of proton dosimetry with clinical endpoints.
With the advent of advanced radiation treatment facilities across the world, proton beam radiotherapy is increasingly being used for radical-intent therapy of head-and-neck cancers. It offers the benefit of lower integral body dose in the absence of an exit beam compared with conventional photon beam radiotherapy [1, 2]. Head-and-neck radiotherapy presents a unique treatment planning challenge with multiple prescribed dose levels and several critical organs close to targets. The presence of bone soft-tissue interfaces, oral cavity, and pharyngeal airway can also add to the inconsistency with pencil beam (PB) analytical dose calculation in head-and-neck proton beam radiotherapy. Hence, the accuracy of dose delivery and dose calculation is important in patient management. Changes in critical organ dosimetry may influence risks of acute and chronic toxicity. Similarly, failure to provide optimal target coverage may prove detrimental to tumor control.
PB analytical dose engines have been the standard for proton radiotherapy because they offer fast optimization and efficient computation. However, their accuracy in heterogeneous phantom tissues is compromised compared with recent commercial Monte Carlo (MC) dose engines [4, 5]. Furthermore, PB treatment planning is often used in lieu of MC treatment planning to gain speed and efficiency and to achieve clinical dose constraints. Maes et al. observed deficiencies in planned dosimetry for lung cancer proton radiotherapy plans due to target heterogeneity and recommended integrating the MC algorithm into clinical planning workflows. This dosimetric comparison between MC- and PB-optimized plans has not been quantified in the clinical setting of head-and-neck cancer. The impact of and tradeoffs between PB and MC treatment-planning approaches may provide useful decision support to head-and-neck radiation oncologists.
The purpose of this investigation was to evaluate and quantify dosimetric differences between PB and MC treatment planning approaches in a retrospective series of patients with nasopharyngeal cancer. We also correlated these changes in dosimetry with clinical toxicity endpoints and evaluated the robustness of planning approaches under range and patient setup uncertainties.
Patients and Methods
The treatment-planning data from 14 patients diagnosed with nasopharyngeal cancer who received definitive proton pencil beam scanning (PBS) radiotherapy were reviewed and analyzed. These patients were enrolled into an Institutional Review Board–approved trial registry at the University of Washington. For this cohort of patients, the gross tumor volumes ranged from 23 to 63 cm3 with a median volume of 39 cm3 for the primary tumor. The nodal GTV ranged between 2 and 20 cm3 with a median volume of 11.5 cm3. All patients successfully completed therapy with a minimum of 3 months' follow-up. Toxicity data (CTCAE V 4.0) were collected from patient charts for correlation with the treatment-planning dosimetry. Patient characteristics are summarized in Table 1.
Patients were simulated using the Optima CT580 (GE Healthcare, Waukesha, Wisconsin) and scanned headfirst in supine position, immobilized on a BoS table (Qfix, Avondale, Pennsylvania) with thermoplastic mask, custom moldcare cushion, and custom oral stent. Helical planning computed tomography (CT) scans were acquired and reconstructed on a 65-cm transverse field-of-view (FOV) with 2.5 mm slice thickness. Contrast-enhanced helical CT scans over 35 cm FOV and metal artifact-reduced CT reconstructions over 50 cm FOV were also generated and fused for treatment planning. Contouring of targets and organs at risk (OARs) was completed in MIM 6.6 (MIM Software, Cleveland, Ohio).
Treatment Planning and Optimization Techniques
Analytical PB-based treatment planning was performed using PB dose version 4.1, while MC-based treatment planning was performed using the MC Dose Version 4.0 algorithm in the RayStation 6.0 SP1 (RS6) (RaySearch Laboratories, Stockholm, Sweden). The planning target volume (PTV) margin expansion was not beam-specific and nominally consisted of an isotropic expansion of 3 to 5 mm. Clinician-defined margins accounted for nasal cavity filling changes in certain cases. Plans were routinely optimized and subsequently evaluated for dosimetric differences with a nasal cavity override turned on versus off, in order to simulate the worst-case scenario in the presence versus absence of nasal cavity obstruction. In general, cases that necessitated plan adaptation were the result of significant decreases in nasal cavity filling during treatment.
Planning configurations varied from 2 to 5 beams but primarily consisted of 3 beams arranged either in Y (PA, anterior obliques) or inverted-Y (AP, posterior obliques) orientations. In cases of complex bilateral target geometry, single-field optimization with split target junction matching was utilized to enhance sparing of critical structures. Due to the shallow proximal extent of nasopharyngeal tumors, range shifters were used in every case, and the air gap between range shifter and patient surface was minimized while avoiding collisions. Objectives for target and normal tissue dosimetry were matched to the original clinically approved plan in each case. Optimization consisted of a hybrid set of PTV and/or robust clinical target volume (CTV) objectives, which varied for each clinical case. Robustness settings for CTV objectives included range (±3% to 3.5%) and setup (3 to 5 mm) perturbations. The reoptimization with the MC algorithm used the same robustness evaluation criteria as the clinically approved base plan, PBPB. Dose painting prescriptions for definitive proton beam therapy regimens included 66 to 70 Gy(RBE) to the PTV high risk (PTV HR), 60 to 63 Gy(RBE) to the PTV intermediate risk (PTV IR), and 54 Gy(RBE) to the PTV low risk (PTV LR) in 33 fractions. The dose calculation grid during optimization and final computation was fixed at an isotropic 2 mm for all plans, per our institutional standard for head-and-neck disease sites. The critical organ doses were kept below standard-of-care guidelines. All doses were calculated and reported with an RBE of 1.1, per current clinical practice.
Clinically approved PB-optimized plans were recalculated using the RS6 MC algorithm to obtain PBMC plans. For MC treatment planning, under consistent optimization objectives relative to the clinical PB plan, a sampling history of 10 000 ions/spot was imposed along with a sufficient number of ions to yield 0.5% statistical uncertainty in the final calculated dose (MCMC). The robustness criteria used for baseline PB optimization were retained for MC optimization. The PBPB and MCMC plans were normalized to achieve prescribed dose coverage to 95% of the planning target volumes. The PBMC plans were not normalized, as they were recalculated using the same beam spot size and spot weight as the original PBPB plan. The MC reoptimization of PB plans used a minimum of 100 iterations. The statistical uncertainty was fixed at 0.5% for all MC calculations (PBMC and MCMC).
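The normalization step described above can be illustrated with a minimal sketch. It assumes equal-volume voxels inside the target, so that D95 is the dose met or exceeded by 95% of the target voxels; the function names and the toy dose values are hypothetical and do not represent the RayStation implementation or study data.

```python
import math

def d_percent(doses, percent):
    """Minimum dose received by the hottest `percent` of equal-volume voxels."""
    ordered = sorted(doses, reverse=True)
    # Number of voxels that make up `percent` of the volume (at least one).
    n = max(1, math.ceil(len(ordered) * percent / 100.0))
    return ordered[n - 1]

def normalize_to_d95(doses, prescription):
    """Scale every voxel dose so that D95 equals the prescription dose."""
    scale = prescription / d_percent(doses, 95)
    return [d * scale for d in doses], scale

# Hypothetical 20-voxel target (Gy(RBE)); illustrative values only.
target = [70.0] * 18 + [66.5, 60.0]
scaled, scale = normalize_to_d95(target, prescription=70.0)
print(round(scale, 4), round(d_percent(scaled, 95), 2))
```

A recalculated plan such as PBMC would skip `normalize_to_d95` and report `d_percent` directly, which is how a loss of coverage becomes visible.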
Dosimetric and Statistical Analysis
The differences in target coverage and normal tissue dose were tabulated and compared between the PBPB, PBMC and MCMC plans. The following plan metrics defined in RS6 were evaluated: PTV HR Conformity Index (CI) and Homogeneity Index (HI), PTV HR D95, PTV IR D95, PTV LR D95, Brainstem D0.03 cm3, Spinal cord D0.03 cm3, and total parotid DMean.
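As an illustration of the OAR metrics listed above, the following sketch computes a near-maximum dose such as D0.03 cm3 and a mean dose from a list of voxel doses. It assumes the isotropic 2 mm dose grid and whole-voxel counting; a treatment planning system may interpolate the dose-volume histogram instead, so the helper names and the sample values are hypothetical.

```python
import math

def voxel_volume_cm3(spacing_mm):
    """Volume of one isotropic voxel in cm^3, given its edge length in mm."""
    return (spacing_mm / 10.0) ** 3

def near_max_dose(doses, volume_cm3, spacing_mm=2.0):
    """Dose covering the hottest `volume_cm3` of a structure (e.g. D0.03 cm3)."""
    # Whole voxels needed to reach the requested volume (assumption: no interpolation).
    n = max(1, math.ceil(volume_cm3 / voxel_volume_cm3(spacing_mm)))
    ordered = sorted(doses, reverse=True)
    return ordered[min(n, len(ordered)) - 1]

def mean_dose(doses):
    """Arithmetic mean dose over equal-volume voxels."""
    return sum(doses) / len(doses)

# Hypothetical brainstem voxel doses in Gy(RBE); illustrative values only.
brainstem = [54.0, 53.5, 53.0, 52.0, 40.0, 30.0]
print(near_max_dose(brainstem, 0.03), round(mean_dose(brainstem), 2))
```

With a 2 mm grid each voxel is 0.008 cm3, so 0.03 cm3 corresponds to the four hottest voxels in this simplified counting scheme.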
The CI was calculated using the equation
The HI was calculated using the equation
Pairwise differences in planned dose and the dose-volume parameters among the 3 groups of plans (PBPB, PBMC, MCMC) were evaluated using nonparametric Wilcoxon signed rank tests.
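The paired comparison can be sketched as follows. This is a standard-library illustration of an exact two-sided Wilcoxon signed-rank test, not the OriginPro routine used in the study; zero differences are dropped and tied magnitudes share average ranks, which are assumptions about tie handling. The function name and the sample D95 values are hypothetical.

```python
def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test; returns (W_plus, p_value)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks2 = [0] * n  # ranks stored doubled so average ranks remain integers
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks2[order[k]] = i + j + 2  # twice the average of ranks i+1..j+1
        i = j + 1
    w_plus2 = sum(r for r, d in zip(ranks2, diffs) if d > 0)
    total2 = sum(ranks2)
    # counts[s]: number of sign assignments whose doubled positive-rank sum is s
    counts = [0] * (total2 + 1)
    counts[0] = 1
    for r in ranks2:
        for s in range(total2, r - 1, -1):
            counts[s] += counts[s - r]
    # Two-sided p: assignments at least as far from the null mean as observed
    dev = abs(2 * w_plus2 - total2)
    extreme = sum(c for s, c in enumerate(counts) if abs(2 * s - total2) >= dev)
    return w_plus2 / 2.0, extreme / 2 ** n

# Hypothetical paired PTV HR D95 values (Gy); not study data.
pbpb = [70.0, 69.5, 70.0, 68.9, 70.0, 69.8]
pbmc = [65.0, 65.5, 64.2, 66.1, 63.9, 65.3]
print(wilcoxon_signed_rank(pbpb, pbmc))
```

Enumerating the sign distribution exactly is practical for a cohort of 14 patients, where a normal approximation would be less reliable.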
To assess treatment plan robustness to setup and range uncertainties, the perturbed dose distributions for the 3 planning methods were generated and compared. Perturbation scenarios consisted of under-/over-ranging beams by scaling the CT density ±3% and shifting beam isocenters ±3 mm in the medial/lateral (x), anterior/posterior (y), and superior/inferior (z) directions. Two representative worst-case dose perturbations scenarios described by (+3%, x + 3 mm, y + 3 mm, z + 3 mm) and (–3%, x − 3 mm, y − 3 mm, z − 3 mm) were compared and analyzed. Relative changes in dosimetry between the worst-case perturbed plan and nominal plan were analyzed to assess the variability in robustness across PBMC and MCMC plans. Pairwise differences between PBMC and MCMC perturbed dose deviations from nominal plans were evaluated with nonparametric Wilcoxon signed rank tests.
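The perturbation scenarios can be enumerated with a short sketch. It assumes that every sign combination of the ±3% density scaling and ±3 mm isocenter shifts is a candidate scenario, from which the two representative worst cases are selected; the dictionary keys, function names, and dose values are hypothetical.

```python
from itertools import product

def perturbation_scenarios(range_pct=3.0, shift_mm=3.0):
    """All sign combinations of a range error and x/y/z isocenter shifts."""
    signs = (+1, -1)
    return [
        {"range_pct": sr * range_pct, "x_mm": sx * shift_mm,
         "y_mm": sy * shift_mm, "z_mm": sz * shift_mm}
        for sr, sx, sy, sz in product(signs, repeat=4)
    ]

def relative_change_pct(perturbed, nominal):
    """Percentage change of a dose metric relative to the nominal plan."""
    return 100.0 * (perturbed - nominal) / nominal

scenarios = perturbation_scenarios()
# The two representative worst cases: all-positive and all-negative perturbations.
worst_cases = [scenarios[0], scenarios[-1]]
print(len(scenarios), worst_cases)

# Hypothetical nominal and perturbed PTV HR D95 values in Gy; not study data.
print(round(relative_change_pct(67.97, 70.0), 2))
```

The same `relative_change_pct` helper applies to target D95 and to OAR maxima, which allows PBMC and MCMC deviations to be compared on a common scale.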
Dosimetric changes in OARs between the original clinical PB plan and the PBMC recalculated plan were correlated to toxicity. The OAR dose parameters included mean doses to oral avoidance and laryngo-pharynx avoidance structures, as well as global dose maximum. Toxicity scoring was dichotomized based on incidence of acute grade 3 or higher radiation dermatitis, acute grade 3 or higher radiation mucositis, completion of prescribed chemotherapy, and requirement for interventions (emergency department, intensive care unit, inpatient care, feeding procedure, unscheduled intravenous fluids, etc). Differences in OAR dosimetry between patients who experienced toxicity and those who did not were evaluated with nonparametric Mann-Whitney tests. For nonparametric test results, P values ≤ .05 were considered statistically significant. All statistical analyses were performed using OriginPro 9.1 (OriginLab Corporation, Northampton, Massachusetts).
The Figure illustrates isodose distributions overlaid on a patient planning CT under three planned dose conditions: (A) PBPB, (B) PBMC, and (C) MCMC. The line dose profile (D) in craniocaudal and transverse planes reveals a consistent deviation in coverage with PBMC compared with the baseline PBPB plan. This deviation in dosimetry is recovered with the MCMC plan, both on the line profile of Figure D and the dose-volume histogram on Figure E.
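The unpaired comparison between toxicity groups can be sketched as an exact permutation version of the Mann-Whitney test. This is a standard-library illustration rather than the OriginPro procedure used in the study; ties receive midranks, and the function names and dose-change values are hypothetical.

```python
from itertools import combinations

def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j + 2) / 2.0
        i = j + 1
    return ranks

def mann_whitney_exact(group_a, group_b):
    """Exact two-sided test by enumerating group labels; returns (U_a, p)."""
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    n_a, n = len(group_a), len(pooled)
    observed = sum(ranks[:n_a])
    expected = n_a * (n + 1) / 2.0  # rank sum of group A under the null
    dev = abs(observed - expected)
    hits = total = 0
    for idx in combinations(range(n), n_a):
        total += 1
        if abs(sum(ranks[i] for i in idx) - expected) >= dev - 1e-12:
            hits += 1
    return observed - n_a * (n_a + 1) / 2.0, hits / total

# Hypothetical % change in global Dmax for 4 vs 10 patients; not study data.
incomplete = [2.8, 3.1, 1.9, 4.0]
completed = [-1.2, -0.5, 0.3, -2.0, -1.0, 0.1, -0.8, -1.5, 0.6, -0.3]
print(mann_whitney_exact(incomplete, completed))
```

For a 4-versus-10 split there are only 1001 possible labelings, so full enumeration is inexpensive and avoids large-sample approximations.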
Table 2 lists target and critical structure dose differences between the PBPB, PBMC, and MCMC treatment plans. The PTV volumes of the patients ranged from 73 to 179 cm3 with a median of 144 cm3 for the high-risk region, 131 to 391 cm3 with a median of 233 cm3 for the intermediate-risk region, and 237 to 514 cm3 with a median of 337 cm3 for the low-risk region. For PBPB plans, the PTV HR D95 ranged from 67 to 70 Gy with a median of 70 Gy, PTV IR D95 ranged between 62 and 64 Gy with a median of 64 Gy, and PTV LR D95 had a median of 55 Gy within a narrow range of 54 to 55 Gy. Under PBMC plans, the PTV HR D95 ranged from 63 to 66 Gy with a median of 65 Gy, and PTV IR D95 ranged from 56 to 60 Gy with a median of 58 Gy. Both PTV HR and PTV IR D95 on the PBMC plans showed a statistically significant drop in target coverage relative to PBPB plans (P < .005). The median percentage drop in coverage for CTV primary was 4.7% (interquartile range [IQR], 3.4%–4.9%) and for CTV neck was 4.3% (IQR, 3.0%–6.1%). MC recalculation also appeared to produce cold spots of 4 to 6 Gy within the high-risk CTV, which would be considered clinically significant.
The PBMC plan had a 5.8% (IQR, 5.2%–7.0%) drop in coverage relative to prescription for PTV HR compared with the PBPB plan. The drop in coverage was 6.1% (IQR, 4.8%–9.9%) and 7.3% (IQR, 5.4%–10.2%) for PTV IR and PTV LR, respectively. Interestingly, when the drop in coverage from baseline PBPB plans to MC-recalculated PBMC plans was stratified by beam number, the median percentage drop was 12.2%, 11.7%, and 9.6% for 5-beam plans and 5.5%, 6.0%, and 7.1% for 3-beam plans for PTV HR, IR, and LR, respectively.
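The summary statistics above can be reproduced in form, though not in value, with the following sketch. It assumes the drop in coverage is the difference in D95 between PBPB and PBMC expressed as a percentage of the prescription dose, and it uses the inclusive quartile convention, which may differ from the convention used in the study. All names and numbers are hypothetical.

```python
from statistics import median, quantiles

def coverage_drop_pct(d95_pbpb, d95_pbmc, prescription):
    """Drop in D95 from PBPB to PBMC as a percentage of the prescription."""
    return 100.0 * (d95_pbpb - d95_pbmc) / prescription

def median_iqr(values):
    """Median with first and third quartiles (inclusive method)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), (q1, q3)

# Hypothetical (PBPB D95, PBMC D95) pairs in Gy for a 70 Gy prescription.
pairs = [(70.0, 65.8), (70.0, 66.2), (69.5, 65.0), (70.0, 64.9), (69.8, 66.0)]
drops = [coverage_drop_pct(a, b, 70.0) for a, b in pairs]
med, (q1, q3) = median_iqr(drops)
print(round(med, 1), round(q1, 1), round(q3, 1))
```

Grouping `pairs` by beam count before calling `median_iqr` would give the per-configuration medians reported for 3-beam and 5-beam plans.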
The HI for PTV HR was also significantly lower on the PBMC plan (P < .005). For MCMC plans, PTV D95 metrics, PTV CI, and PTV HI were concordant with the values achieved in PBPB plans, with no statistically significant differences (P > .09). The total parotid Dmean was lower in the PBMC plan with respect to the PBPB plan (P = .005) and higher in the MCMC plans (P = .002). The PTV CI, brainstem D0.03 cm3, and spinal cord D0.03 cm3 were not statistically different among the dose calculation and optimization combinations. The median global Dmax was consistent among all plans.
Table 3 shows delta values representing the percentage change in PTV HR D95, PTV IR D95, PTV LR D95, and maximum dose to the brainstem (0.03 cm3) in PBMC and MCMC treatment plans following range and isocentric setup perturbations: (+3% density/under-range, x + 3 mm, y + 3 mm, z + 3 mm), (–3% density/over-range, x − 3 mm, y − 3 mm, z − 3 mm). All changes were calculated relative to unperturbed nominal plans.
The median decrease in PTV HR D95 coverage was 2.9% for both methods; in PTV IR D95 it was 3.4% for PBMC and 3.7% for MCMC; and in PTV LR D95 it was 2.7% and 3% for the PBMC and MCMC plans, respectively. The worst-case change in brainstem maximum dose was a median increase of 3.4% in the PBMC plan and 3.5% in the MCMC plan. No changes in dosimetric parameters were statistically significant, suggesting that robustness to range and setup uncertainties was similar between PB- and MC-based treatment planning.
Table 4 compares the median change in dose when the PBPB plan was recalculated to give the PBMC plan. Four of 14 patients (29%) were unable to complete the prescribed course of chemotherapy. There was an increase in global Dmax of 2.8% for the suboptimal chemotherapy group compared with a drop of 1.2% for those who completed chemotherapy, but this numerical difference was not statistically significant (P = .28). Similarly, the mean doses to the oral avoidance structure and the laryngeal avoidance structures did not show any significant difference between these groups. A total of 10 patients (71%) developed significant mucositis. There was a greater numerical drop of global Dmax and oral avoidance structure mean dose in the group who did not experience significant mucositis. Eight patients (57%) experienced unscheduled events, such as unplanned inpatient care, feeding intervention, unscheduled intravenous fluid administration, and emergency department visits. A numerically lower dose in the recalculated plans was observed but did not reach statistical significance.
It has been well demonstrated that MC dose calculation algorithms are a step toward improved proton beam radiotherapy dosimetric accuracy in the presence of tissue inhomogeneities. However, analytical PB dose computation engines remain popular due to fast optimization and widespread commercial availability in clinical treatment planning systems. Air gap influences differences between PB and MC dosimetry at the surface and at depths <5 cm, due to how the nuclear halo and lateral scatter of secondary protons are modeled in the analytical dose engine. In general, analytical PB algorithms underestimate dose at shallower depths for the same spot position and intensity distribution. The effect scales with air gap and becomes significant with air gaps >15 cm. Beam central axis obliquity relative to the patient surface also influences dosimetric differences between algorithms at shallow depths. This phenomenon was studied at great length in our prior studies [5, 7]. At depths >5 cm, dose differences are primarily driven by scatter through tissue inhomogeneities.
Yepes et al. conducted the largest comparison (125 patients) of MC dosimetry against standard treatment planning system analytic PB algorithms and reported absolute dose differences as high as 10 Gy in head-and-neck disease sites. However, they employed a nonclinical MC planning algorithm. Our study is the first to report on MC planning using a Food and Drug Administration-approved and commercially available treatment planning system for head-and-neck cancers. Furthermore, this study looked at the influence of dose calculation algorithms in dose painting plans defined by multiple nested target regions (PTV HR, PTV IR, PTV LR) in the uniquely challenging nasopharynx disease site. We have also shown that the target dose coverage lost when recalculating an analytical PB-optimized plan with MC, which represents a more accurate estimate of plan dosimetry, can be regained through MC optimization. Hence, we highlight the need for clinically adopting MC planning in sites like the nasopharynx with complex dosimetry. Our results suggest overestimation of target dose coverage by analytical PB dose engines. Based on these findings, we advocate for the adoption of MC in routine clinical treatment planning and the potential for correlating dose differences between dose algorithms with clinical endpoints.
The median high-risk target volume in this cohort was 144 cm3, which is close to that of the head-and-neck cohort reported by Yepes et al. (178 cm3). Across 3-level dose-painted plans, we consistently observed a drop of 4 to 5 Gy in coverage of all targets between PB and MC dosimetry. This difference remained statistically significant for all dose levels. The loss of target coverage was regained in the MC-optimized plans, with all plans being able to achieve similar or better dose coverage compared with the baseline PBPB plans. It was noted that the drop in target coverage might be higher for greater beam numbers when MC recalculation of PB-optimized plans is evaluated.
This increase in target dose coverage with MC plans was accompanied by a similar increase in normal tissue dosimetry with a sample median increase of 3 Gy in the brainstem maximum dose and median increase of 3 Gy in mean parotid doses. Despite an increase in normal tissue doses, all plans met clinical goals for standard-of-care head-and-neck plans. The global Dmax in MC-optimized plans remained statistically similar to PB-optimized plans despite numerical differences in parotid dosimetry.
The secondary analyses explored whether improved target coverage remains robust under the influence of range and setup uncertainties. We also tested whether higher perturbed dose to normal tissue adversely affected clinical planning goals and acceptance criteria. Our results revealed that variations in dose parameters under range/setup perturbation remain similar for target and critical structures irrespective of the optimization process. A study by Schuemann et al. reported that the MC algorithms produced more accurate dosimetry in complex head-and-neck geometries and reduced uncertainties due to range differences and tissue heterogeneity. We similarly observed in our cohort that MC-optimized plans were robust in the presence of range and setup uncertainties.
Dosimetric changes between the PB optimized plan and the MC recalculated plan were correlated with incidence of toxicity based on CTCAE V 4.0. A trend toward lower global maximum doses was observed in MC recalculated plans for patients with lower reported toxicity incidence, though this trend did not reach statistical significance. There have been few reported series of patients with nasopharyngeal carcinoma treated with proton therapy [10–12]. Most series report local control ranging from 75% to 100% and survival ranging from 70% to 90%. They also report toxicity and breaks during therapy. From our small patient sample, oral avoidance Dmean from MC dosimetry, along with global Dmax changes between PB and MC dosimetry, may correlate with toxicity endpoints and should be explored further in future studies. With improved dosimetric accuracy from MC-based treatment planning, better estimates of toxicity risk factors during and after proton therapy may be identified.
Beyond increased accuracy of physical dose with MC, studies are also seeking to account for spatially variant linear energy transfer and relative biological effectiveness of proton treatment plans when correlating dosimetry with toxicity. These approaches require implementation of radiobiological models capable of predicting the location of toxicity [7, 13]. One key limitation of this work is that we have compared only the fixed relative biological effectiveness dose changes between algorithms. Due to the limited sample size, we could not definitively correlate toxicity incidence with plan dosimetry. A larger multi-institutional cohort may be required to address this question. With more patients, we plan to conduct a comprehensive study on proton MC dosimetry, extracted from PBMC and MCMC plans, for predicting toxicity in nasopharyngeal cancer patients.
Retrospective analyses of nasopharynx treatment plans showed statistically inferior dosimetric endpoints, affecting both target coverage and OAR doses, when using the analytical PB algorithms for treatment planning. The impact on dosimetric endpoints could be reversed through MC-based treatment planning. An MC optimization should be performed to characterize complex dosimetry in patients with nasopharyngeal carcinoma before plan acceptance and should also be used in correlative studies of proton dosimetry with clinical endpoints. Clinical integration of MC planning may lead to improved treatment tolerance and outcomes in patients with head-and-neck cancer.
ADDITIONAL INFORMATION AND DECLARATIONS
Conflicts of Interest: The authors have no conflicts of interest to disclose.
Ethical approval: All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.