Evaluation of Computed Tomography Scanners for Feasibility of Using Averaged Hounsfield Unit–to–Stopping Power Ratio Calibration Curve

Purpose: The purpose of this study was to quantify the variability of stoichiometric calibration curves for different computed tomography (CT) scanners and determine whether an averaged Hounsfield unit (HU)–to–stopping power ratio (SPR) calibration curve can be used across multiple CT scanners. Materials and Methods: Five CT scanners were used to scan an electron density phantom to establish HU values of known material plugs. A stoichiometric calibration curve was calculated for each CT scanner and for the averaged HU values. The water-equivalent thickness (WET) of animal tissue surrogates calculated by the treatment planning system (TPS) was compared with the WET values measured with a multilayered ionization chamber. The calibration curves were optimized to reduce the percentage of difference between measured and TPS-calculated WET values. A second set of tissue surrogates was then used to evaluate the overall range of uncertainty for the optimized CT-specific and averaged calibration curves. Results: Overall, the average variation in HU for all 6 calibration curves before optimization was 8.3 HU. For both the averaged and CT-specific calibrations, the root mean square error (RMSE) of the percentage of difference between TPS-calculated and measured WET values before optimization was 4%. After the optimization, the RMSE of the percentage of difference between the TPS-calculated and measured WET values was reduced to less than 1.5% for both averaged and CT-specific calibration curves. When evaluated with the second set of tissue surrogates, the overall RMSE of the percentage of difference between the TPS and measured WET values was 2.1% for both averaged and CT-specific calibration curves. Conclusion: Averaged CT calibration curves can be used to map the HU-to-SPR in TPSs if the variations in HU values across all scanners are relatively small. Performing tissue surrogate optimization of the HU-to-SPR calibration curve has been shown to reduce the overall uncertainty of the calibration for averaged and CT-specific calibration curves and is recommended, especially if an averaged HU-to-SPR calibration curve is used.


CT Scanners and CT Calibration Phantom
There are 5 CT scanners in service at the Department of Radiation Oncology at the University of Maryland: 4 Brilliance Big Bore scanners (Philips Medical Systems, Best, the Netherlands) and 1 Definition Edge scanner (Siemens, Erlangen, Germany). All the Philips CT scanners are located in the photon clinic, and the Siemens CT scanner is located in the proton clinic. To establish the HU values with known materials, an electron density phantom (model 062M, CIRS, Carlton, Victoria, Australia) containing 7 tissue-equivalent electron density plugs and 1 water vial [8] in an abdominal configuration, with 2 nested disks combined, was imaged with all 5 CT scanners. The elemental composition and density of the plugs were obtained from the phantom manufacturer. All 5 CT scanners used similar scanning protocols: 120 kVp (peak kilovoltage), 3-mm slice thickness, and 500-mm field of view. All the images were transferred to the Varian Eclipse TPS (version 13.7) to obtain the measured HU values from the plugs. We obtained 6 sets of HU values for the electron density phantom: 5 from the individual CT scanners and 1 from averaging the HU values of the 5 CT scanners.

Stoichiometric Calibration
To generate the HU-to-SPR calibration curves for the 5 CT scanners and the averaged HU values, the stoichiometric calibration method proposed by Ainsley and Yeager [3] was used. To summarize briefly, the electron density phantom and the plugs were scanned, and the measured HU values for the plugs were obtained. The measured HU values for each plug were used to determine the parameterization coefficients A, B, C, and D via a linear regression fit (equation 1). In that fit, the coefficient D is the intercept and $\rho_e^i$ is the relative electron density to water for the ith plug. The coefficients A, B, and C represent the parameterized contributions to the linear attenuation coefficient from the photoelectric effect, coherent scattering, and Compton scattering, respectively. $\tilde{Z}_i$ and $\hat{Z}_i$ are the effective atomic numbers of plug i (equations 2 and 3), which are weighted by
$$\lambda_j = \frac{w_j Z_j / A_j}{\sum_j w_j Z_j / A_j}$$
The terms $w_j$, $Z_j$, and $A_j$ are the mass fraction, the atomic number, and the atomic mass of element j in plug i. By taking the parameterized coefficients and reference tissue compositions from ICRU reports [9,10] into consideration, theoretical HU values were computed for each reference tissue. The stopping power ratio, $SPR_{calc}$, for each reference tissue was then calculated using the Bethe-Bloch equation [11]:
$$SPR_{calc} = \rho_e \, \frac{\ln\left[\dfrac{2 m_e c^2 \beta^2}{I_{tissue}\,(1-\beta^2)}\right] - \beta^2}{\ln\left[\dfrac{2 m_e c^2 \beta^2}{I_{water}\,(1-\beta^2)}\right] - \beta^2} \qquad (4)$$
where $m_e c^2$ is the rest mass energy of the electron, $\beta$ is the speed of the proton relative to the speed of light c, and $I_{tissue}$ and $I_{water}$ are the mean ionization energies of the reference tissue and water, respectively. The ionization energy for each element was adopted from Seltzer and Berger [12], and the mean ionization energy of a tissue composition was calculated using the Bragg additivity rule [1]. A fixed proton energy of 240 MeV was used to compute the relativistic speed of the proton in equation 4, assuming the SPR is approximately constant with the proton energy in the therapeutic region [2,10].
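The HU parameterization with coefficients A through D can be sketched in the form conventionally used for stoichiometric calibration. This is an illustration under stated assumptions, not a reproduction of the fit used in this study: the exponents 3.62 and 1.86 are taken from the widely used Schneider-type formulation, and writing the left-hand side as the attenuation relative to water, HU/1000 + 1, is likewise an assumption.

```latex
% Sketch only: exponents and left-hand side are assumed, see lead-in.
\begin{align}
\frac{HU_i}{1000} + 1 &= \rho_e^i \left( A\,\tilde{Z}_i^{3.62} + B\,\hat{Z}_i^{1.86} + C \right) + D \\
\tilde{Z}_i &= \Big[ \sum_j \lambda_j Z_j^{3.62} \Big]^{1/3.62} \\
\hat{Z}_i &= \Big[ \sum_j \lambda_j Z_j^{1.86} \Big]^{1/1.86} \\
\lambda_j &= \frac{w_j Z_j / A_j}{\sum_j w_j Z_j / A_j}
\end{align}
```

Because the regressors $\rho_e^i \tilde{Z}_i^{3.62}$, $\rho_e^i \hat{Z}_i^{1.86}$, and $\rho_e^i$ are known for every plug, the model is linear in A, B, C, and D, which is why an ordinary linear regression suffices.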
Finally, the theoretical HU for each reference tissue was plotted against its calculated SPR to generate the HU-to-SPR calibration curve. Of all the reference tissues, selected tissues were connected piecewise in the general regions that represent lung, tissue, water, and bone to create a calibration curve. The curve was extrapolated to HU = −1000 to include air. This process was repeated for each of the 5 CT scanners, and the average of the HU values computed for each CT scanner was calculated and used to plot the average HU-to-SPR curve.
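The SPR calculation and the piecewise curve can be illustrated with a minimal Python sketch. The mean ionization energy of water (75 eV) and the curve nodes below are assumed placeholder values for illustration only; they are not the values used in this study, and the function names are hypothetical.

```python
import math
import numpy as np

M_E_C2_EV = 0.51099895e6   # electron rest energy in eV
M_P_C2_MEV = 938.272       # proton rest energy in MeV
I_WATER_EV = 75.0          # ASSUMED mean ionization energy of water in eV


def beta_squared(kinetic_mev):
    """Return beta^2 for a proton of the given kinetic energy."""
    gamma = 1.0 + kinetic_mev / M_P_C2_MEV
    return 1.0 - 1.0 / gamma**2


def spr_bethe(rho_e, i_tissue_ev, kinetic_mev=240.0, i_water_ev=I_WATER_EV):
    """Stopping power ratio to water from the Bethe-Bloch expression."""
    b2 = beta_squared(kinetic_mev)

    def stopping_number(i_ev):
        return math.log(2.0 * M_E_C2_EV * b2 / (i_ev * (1.0 - b2))) - b2

    return rho_e * stopping_number(i_tissue_ev) / stopping_number(i_water_ev)


# HYPOTHETICAL curve nodes (HU, SPR); real nodes come from the reference tissues.
curve_hu = np.array([-1000.0, -100.0, 0.0, 100.0, 1500.0])
curve_spr = np.array([0.001, 0.93, 1.0, 1.07, 1.85])


def hu_to_spr(hu):
    """Piecewise-linear lookup of SPR for a given HU value."""
    return np.interp(hu, curve_hu, curve_spr)
```

Connecting the nodes with `np.interp` mirrors the piecewise construction described above; HU values outside the node range are clamped to the end values rather than extrapolated.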

Evaluation of HU-to-SPR Calibration Curves
After the initial HU-to-SPR calibration, a set of animal tissue samples, referred to as preoptimized tissue surrogates, was used to evaluate and optimize all 6 HU-to-SPR calibration curves (5 CT scanner-specific curves and 1 averaged calibration curve). For the optimization of the calibration curves, the preoptimized tissue surrogates were water and the fat (adipose tissue), muscle, intact stomach, liver, femur, and head of a pig. Except for water, all tissue surrogates were placed in a hermetically sealed bag and frozen. All the tissues were scanned on the Siemens Definition Edge CT scanner only, because of the relative ease of access to that scanner. The same CT protocols used to scan the CIRS phantom were used to acquire images of the tissue surrogates. Once scanned, the images were imported to the Varian Eclipse TPS, and all the HU-to-SPR calibration curves were applied to the images to obtain the WET value for each tissue. Next, the WET value for each tissue surrogate was measured with a multilayered ionization chamber (MLIC; Giraffe, IBA Dosimetry GmbH, Schwarzenbruck, Germany) with a proton pencil-beam energy of 240 MeV. The WET values were obtained by taking the difference between the distal 80% of the Bragg peak ($R_{80}$) measured with the MLIC without and with the tissue in the beam:
$$WET_{measured} = R_{80}^{\text{without tissue}} - R_{80}^{\text{with tissue}}$$
The uncertainty in the HU-to-SPR calibration curves was then defined as the percentage of the difference between the TPS-based and measured WET values:
$$\frac{WET_{TPS} - WET_{measured}}{WET_{measured}} \times 100$$
The measurement uncertainty of the MLIC was 0.4 mm, based on internal experience from current and other studies. To minimize setup errors for the MLIC measurements, BBs were placed on the tissue surrogates to help with the setup.
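The two quantities defined above translate directly into code. The following is a minimal sketch; the function names and the numbers in the usage note are hypothetical and are not measurements from this study.

```python
def wet_from_r80(r80_without_mm, r80_with_mm):
    """Measured WET: shift of the distal 80% range when the tissue is in the beam."""
    return r80_without_mm - r80_with_mm


def wet_percent_difference(wet_tps_mm, wet_measured_mm):
    """Percentage of difference of the TPS-based WET relative to the measured WET.

    Positive: the TPS value is larger than the measurement.
    Negative: the TPS value is smaller than the measurement.
    """
    return (wet_tps_mm - wet_measured_mm) / wet_measured_mm * 100.0
```

For example, with made-up ranges of 300.0 mm without tissue and 250.0 mm with tissue, the measured WET is 50.0 mm; a TPS value of 52.0 mm would then correspond to a percentage of difference of about +4%.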
Based on these uncertainty results, all 6 HU-to-SPR calibration curves were optimized to minimize the percentage of difference between the TPS-based and measured WET values. During the adjustment, only the SPR values were changed, whereas the HU values remained fixed. To better fit the curves to the measured results, 2 arbitrary points were added to the curves (indicated as Fill in in Table 3).

Evaluation of the Overall Uncertainty of the Optimized Calibration Curves
After the tissue surrogate-based HU-to-SPR calibration curve optimization was performed for all 6 calibration curves, a separate tissue surrogate sample, which we call the postoptimized tissue surrogate, was used to evaluate the overall uncertainty of the calibration curves. The postoptimized tissue surrogate included muscles, ribs, fat (adipose tissue), and cartilage from a pig; an intact leg (including muscle, cartilage, and bone) from a chicken; and air cavities. All the tissues were placed in a Tupperware container (Tupperware Brands, Orlando, Florida) and frozen for CT scanning. The surrogate tissues were deliberately arranged at random in the container, with air cavities, to simulate true human anatomy as closely as possible. The postoptimized tissue surrogate sample was imaged with all 5 CT scanners with the protocols described in the previous section. Once imaged, the tissue surrogate was irradiated with a 240 MeV proton beam at 6 different locations within the container, using the MLIC to determine 6 WET measurement points for each CT image. The proton beam energy of 240 MeV was chosen to ensure that there was enough energy to penetrate through the tissue surrogates and to maintain a stable and high dose rate. The locations of the 6 measurement points are identified in Table 4 (the xy-plane is the axial plane, the xz-plane is the coronal plane, and the yz-plane is the sagittal plane). Five measurements were made along the x-direction (the thin side of the tissue sample) and 1 along the z-direction (the thicker side of the tissue sample). The relative positions of the axial measurements are indicated in Table 5.
To evaluate the overall uncertainty of the averaged CT calibration curve, the optimized, averaged HU-to-SPR calibration curve was applied to the images from the 5 CT scanners. The WET values predicted by the TPS for each tissue surrogate were obtained and evaluated against the measured WET values. For the CT-specific calibration evaluation, the CT-specific, optimized HU-to-SPR calibration curves were used for the given CT scan to obtain the WET values from the TPS for each scanner.
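How a calibration curve turns CT numbers into a WET value can be sketched as follows. This is a simplified illustration, assuming a straight ray sampled at uniform spacing and a piecewise-linear curve; it is not the algorithm of the Eclipse TPS, and the curve nodes and function name are hypothetical.

```python
import numpy as np


def wet_along_ray(hu_samples, step_mm, curve_hu, curve_spr):
    """Sum SPR times step length over HU samples taken along a beam path."""
    hu = np.asarray(hu_samples, dtype=float)
    spr = np.interp(hu, curve_hu, curve_spr)  # apply the HU-to-SPR curve
    return float(np.sum(spr) * step_mm)


# HYPOTHETICAL three-node curve: air, water, dense bone.
demo_hu = [-1000.0, 0.0, 1000.0]
demo_spr = [0.0, 1.0, 1.5]

# Four water-like samples at 2.5 mm spacing give 10 mm of WET.
wet_water = wet_along_ray([0.0, 0.0, 0.0, 0.0], 2.5, demo_hu, demo_spr)
```

Applying the averaged curve and a CT-specific curve to the same HU samples and comparing both results against the measured WET is the comparison described above.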

Evaluation of Preoptimized Tissue Surrogate-Adjusted Calibration Curves
For the evaluation of the 6 calibration curves, 10 ICRU reference tissues, representing lung, soft tissue, and bone, were chosen. The selected reference tissues were lung-inflated, adipose tissue (ICRU 49), water, cartilage, humerus, sacrum (female), rib-10th, mandible, cortical bone (ICRU 49), and cortical bone. A plot of all 6 preoptimized tissue surrogate-adjusted calibration curves is shown in Figure 1a. The overall average variation (2σ) of the HU values for the 6 preoptimized tissue surrogate-adjusted calibration curves was 8.3 HU, with maximum and minimum variations of 22.2 HU and 2.2 HU, respectively (see Table 1).
A comparison of the TPS-calculated WET to the measured WET from the tissue surrogates is shown in Figure 1b, with the tabulated, preoptimized values shown in Table 2. The tissues that showed the largest WET percentage of difference were femoral head (range, 6.8% to 7.6%), fat (range, −6.8% to −6.4%), and intact pig head (range, 5.0% to 5.5%). A positive WET percentage of difference indicates that the measured WET value was smaller than the TPS WET value; a negative value indicates that the measured WET value was larger than the TPS WET value. The root mean square errors (RMSEs) of the WET uncertainty for the preoptimized tissue surrogates for the averaged and the CT-specific calibration curves were both 4%. The overall RMSEs for both the averaged and CT-specific calibration curves were also 4%. It was determined that the overall SPR uncertainty without the animal tissue surrogate adjustment was approximately 4% for this study.
Figure 1c is the plot of all 6 postoptimized tissue surrogate-adjusted calibration curves. For better agreement with the measured WET values, 2 fill-in points were added to the postoptimized tissue surrogate-adjusted calibration curves. One fill-in point was added between the lung-inflated and adipose tissue, and the second was added between the water and the cartilage. The postoptimized tissue surrogate-adjusted calibration curves showed that the high-density bone region had lower SPR values than the preoptimized tissue surrogate-adjusted calibration curves, indicating that the unadjusted stoichiometric calibration may overestimate the SPR for high-density materials. The SPR values for the low-density lung region did not show much difference between the preoptimized and postoptimized tissue surrogate-adjusted calibration curves. Furthermore, the SPR values for the water/tissue region (−100 HU to 100 HU) showed a flatter calibration curve for the postoptimized tissue surrogate-adjusted calibration curves.
The overall average variation (2σ) of HU values for the 6 calibration curves was 5.0 HU, with maximum and minimum variations of 11.1 HU and 1.1 HU, respectively, not counting the fill-in points (see Table 3). The tabulated postoptimized tissue surrogate-adjusted results of the WET percentage of difference are shown in Table 2. The tissues that showed the largest WET percentage of difference were fat (range, −1.5% to −1.4%), stomach (range, 1.5% to 1.6%), and heart (range, 3.0% to 4.4%). The RMSEs for the WET percentage of difference after the adjustment for the averaged and CT-specific calibration curves were 1.3% and 1.5%, respectively. The overall RMSE for both averaged and CT-specific calibration curves was 1.5%. It was determined that the overall SPR uncertainty after the animal tissue surrogate adjustment was 1.5%.
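The two summary statistics used in this section can be sketched in Python. The reading of the "overall average variation (2σ)" as twice the per-tissue standard deviation across curves, averaged over tissues, and the use of the population standard deviation are assumptions made for illustration; the function names and the numbers in the usage note are hypothetical.

```python
import math
import numpy as np


def rmse(percent_differences):
    """Root mean square of a set of WET percentages of difference."""
    values = list(percent_differences)
    return math.sqrt(sum(v * v for v in values) / len(values))


def mean_two_sigma(hu_table):
    """Average over tissues of 2 x the standard deviation across curves.

    hu_table: rows are reference tissues, columns are calibration curves.
    ASSUMPTION: population standard deviation (ddof=0).
    """
    table = np.asarray(hu_table, dtype=float)
    return float(np.mean(2.0 * np.std(table, axis=1)))
```

Because the RMSE squares each term, percentages of difference of +2% and −2% contribute equally, so opposite signs do not cancel as they would in a simple mean.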

Evaluation of Postoptimized Tissue Surrogate-Adjusted Calibration Curve Using Postoptimized Animal Tissues
A separate set of animal tissue surrogates was used to evaluate the overall uncertainty of the postoptimized tissue surrogate-adjusted calibration curves. Figure 2 shows the CT images of the animal tissue surrogate and the measurement setup. Table 4 shows the tabulated results of the overall uncertainty using the postoptimized tissue surrogate-adjusted calibration curves. The first set of evaluations was performed by applying the averaged CT calibration curve to the CT images acquired with each of the individual CT scanners. The second set of evaluations was performed by applying the CT-specific calibration curves, shown in Table 3, to the respective CT images. The measured WET values from the separate tissue surrogates were compared against the WET values from the TPS to determine the overall uncertainty of the CT calibration curves. For the averaged CT calibration, the overall SPR uncertainty ranged from −3.7% to 3.4%, and the overall RMSE was 2.1%. For the CT-specific calibration, the overall SPR uncertainty ranged from −4.3% to 2.4%. The overall uncertainty for each calibration was 2.1%.

Range Uncertainty
The variability of the stoichiometric calibration curves for 5 CT scanners as well as the average of all 5 individual scanner curves was evaluated. We determined that an averaged CT calibration curve was appropriate for all 5 CT scanners used in this study without increasing the uncertainty in the determination of the proton SPR. Before the tissue surrogate optimization, the overall WET percentage of difference between the TPS values and the measured values ranged from −6.8% to 7.6%, with an RMSE of 4.0%. After the tissue surrogate optimization of all the calibration curves (ie, 5 CT-specific calibration curves and the averaged calibration curve), the RMSE was reduced to 1.5%.
Within the Department of Radiation Oncology, University of Maryland, there are 5 CT scanners on which a planning CT scan can be performed for proton treatment. Although that is convenient for the patient and the clinical workflow, it poses a challenge when it comes to selecting the appropriate HU-to-SPR calibration curve. To accommodate all 5 CT scanners for proton-planning purposes, there must be multiple HU-to-SPR calibration curves, with the risk of choosing an incorrect calibration curve when the planning CT image is imported into the TPS. The current study showed that, among the 5 CT scanners, the CT-specific calibration curves showed minimal variation in the HU values (see Table 2). Because of the minimal variation in the HU values from the 5 CT scanners, an averaged HU-to-SPR calibration curve can be used to describe the HU-to-SPR relation among the 5 CT scanners, if the CT scanners are properly calibrated and show minimal HU variability.
The fill-in points in the postoptimized tissue surrogate-adjusted calibration curves require some explanation. The 2 fill-in points were required to increase the accuracy of the calibration curve with respect to the measured WET values. The number and placement of fill-in points need to be carefully determined. The fill-in points in this study were selected by evaluating the WET values for multiple tissue samples. The most difficult part of that process is that the fill-in points may improve the WET values for low-density tissues but increase the errors for high-density tissues. For that reason, both the low- and high-density WET values must be checked carefully when selecting such fill-in points. For the set of tissue surrogates used here, those 2 fill-in points provided the optimal calibration curves, which minimized the difference between the TPS-based and measured WET values.

Comparison of Proton Uncertainty Reported in Literature
Other investigators have studied the overall uncertainty of HU-to-SPR calibration curves. Cheng et al [5] examined the HU-to-SPR calibration curves from 18 CT scanners with 5 RMI 467 phantoms and investigated the effects of dosimetric uncertainties. The HU-to-SPR calibration curves were generated by mapping the scanned HU values from the RMI phantoms and calculating the SPR of the tissue substitute inserts in the RMI phantom with the Bethe-Bloch formalism. The stoichiometric calibration method was not used to generate the calibration curves. To evaluate the dosimetric uncertainty of the calibration curves, the investigators generated 3 calibration curves that represented minimum, maximum, and average HU values for all the curves measured from the scanners. Those curves were compared against the clinically accepted calibration curves at the investigators' institution. Prostate and head-and-neck cases were used to evaluate the dosimetric uncertainties. The investigators reported a dosimetric uncertainty for the prostate plan of 1% for all volumes of interest using the 3 calibration curves (minimum, maximum, and average HU-to-SPR calibration curves) when compared against the clinically accepted calibration curve. For the head-and-neck case, they reported a dosimetric uncertainty of 5%, with more than 10% uncertainty for the optic nerves and cochlea. The current study has demonstrated a lower overall uncertainty (2.1%) using the averaged calibration curve than was found in the Cheng et al [5] study. Furthermore, Cheng et al [5] evaluated the calibration curves against the clinically accepted calibration curve as their reference. Because they did not report the uncertainty for the reference curve, it is not possible to determine the overall dosimetric uncertainty for that study. Lastly, there was no measurement-based dosimetric uncertainty evaluation of the TPS using the generated calibration curves.
A recent study [13] evaluated dosimetric uncertainty from dual-energy CT (DECT) and single-energy CT (SECT) for proton therapy treatment. A CT phantom was scanned with a SECT at the clinical energy spectrum and scanned again with lower-

Table 4. The overall uncertainty of the calibration curves. There were 6 points of measurements, which are shown in the "Relative location" column. For the averaged computed tomography (CT) calibration evaluation, the averaged CT calibration curve was applied for all 5 CT images to obtain the water-equivalent thickness (WET) values from the treatment planning system (TPS). For the CT-specific calibration evaluation, the CT-specific calibration curves were applied to a given CT scan to obtain the WET values from the TPS.
Columns: Relative location (cm); Averaged CT calibration (%); CT-specific calibration (%).

Witt et al [14] investigated the difference between the stoichiometric calibration and a calibration curve from beam range measurement with tissue substitutes from phantom model 467 (Gammex, Middleton, Wisconsin). The measurement-based calibration curve was obtained by irradiating the tissue substitutes with 284 MeV carbon ions and measuring the range using a Peakfinder (PTW-Freiburg, Freiburg, Germany). The stoichiometric calibration showed negligible differences up to 150 HU. The slopes of the calibration curves started to diverge from 150 HU to 2500 HU. A favorable agreement was reported up to 1200 HU. Above 1200 HU, the curves proceeded to diverge, with the measured curve showing lower SPR values than the stoichiometric curve, which was consistent with our findings. In our study, the postoptimized tissue surrogate-adjusted calibration curves of the high-density bone region had lower SPR values than the stoichiometric calibration curves (see Figure 1), in agreement with the observations of Witt et al [14], indicating that the stoichiometric calibration method may overestimate the SPR values for high-density materials.

Significance of Using Animal Tissue Surrogates to Adjust Calibration Curves
In our study, optimization with animal tissue surrogates, such as pig parts (pig head, lung, fat, water, stomach, muscle, liver, and femur), reduced the uncertainty from 4% to 1.5%. Based on this study, performing stoichiometric calibrations alone was insufficient to reduce the overall uncertainty to less than 3%. The best that could be achieved was typically 3.5% [2]. Several factors contribute to this uncertainty. One of the main factors is the variation in the HU of the planning CT from scatter and beam-hardening effects in the patient's anatomy. Minor CT artifacts from high-density materials can cause the TPS to use the wrong SPR values, which could significantly increase the errors in the dose calculation. The errors in the parameterization of the stoichiometric formula to determine the theoretical HUs can add to the overall uncertainty. The stoichiometric calibration requires a fit of measured HU values with known materials to predict the HU that corresponds to the calculated SPR of each ICRU reference tissue. Uncertainty and error in the fit parameterization will propagate downstream and affect dose calculation as well. Moreover, differences between the patient's anatomic tissue composition and that of the ICRU reference tissues can increase the SPR uncertainty. For instance, femur density will vary from patient to patient, which will affect the SPR value for the same tissue type. In this study, the initial calibration curves were generated with the stoichiometric calibration method from the tissue substitutes of a known phantom. A set of animal tissue surrogates (pig tissue parts) was used not only to validate the calibration curves from the tissue substitutes but also to further optimize the curves for better agreement with the WET measurements. The current study and the Witt et al [14] study show that the stoichiometric calibration method may overestimate the SPR in the high-density region of a curve.
One of the best ways to determine the magnitude of the adjustment of the SPR in the high-density region is via measurement. In addition to optimizing the curves, new tissue surrogates were used to evaluate the overall uncertainty of the calibration curves, which is much more comprehensive than simply using another tissue substitute to validate the calibration curves. Because of the inherent errors within the stoichiometric calibration methodology, the current study shows that some type of tissue surrogate evaluation and optimization should be performed to minimize the uncertainty in the calibration curve for the CT scanner in question.

Conclusion
The current study was performed to evaluate the variability of the stoichiometric HU-to-SPR calibration curves of 5 CT scanners to determine whether an averaged HU-to-SPR calibration curve could be used to account for all 5 CT scanners. Based on the study, the overall average variation in HU values from all the calibration curves before the optimized adjustment of the calibration curves was 8.3 HU. The RMSEs of the WET percentage of difference before and after the tissue surrogate optimization for both the averaged and CT-specific calibration curves were 4% and 1.5%, respectively. The overall uncertainty evaluated with the postoptimized tissue surrogate was 2.1% for both averaged and CT-specific calibration curves. The current study shows that an averaged HU-to-SPR calibration curve can be used to account for variability among CT scanners, if the CT scanners are properly calibrated and show minimal HU variability from one CT scanner to another. Lastly, to achieve overall proton range uncertainty of less than 2%, we have demonstrated that an animal tissue surrogate-based optimization of the HU-to-SPR calibration curve should be performed.

ADDITIONAL INFORMATION AND DECLARATIONS
Conflicts of Interest: The authors have no conflicts of interest to disclose.