The goal of the College of American Pathologists Accuracy-Based Proficiency Testing Program is to promote the quality, standardization, and harmonization of clinical laboratory results through proficiency testing specimens that are free from matrix effects, have target values that are traceable to reference methods, and that probe the limitations of current methods.
To summarize the first 6 years of the Accuracy-Based Vitamin D Survey and highlight key insights from the data generated as they relate to assay performance.
Accuracy-based challenges were created by using pooled human serum samples. Certain samples were derived from participants in an institutional review board–approved protocol in which vitamin D–deficient participants were treated with ergocalciferol (vitamin D2). Reference targets for the survey were set by the Centers for Disease Control and Prevention using isotope-dilution liquid chromatography–tandem mass spectrometry. Each method was compared with the reference method procedure over the course of the program (n = 43 proficiency testing samples).
Linear regression versus the reference method procedure revealed proportional biases across the methods, ranging from 0.0% to 16.7%. Squared Pearson correlation coefficients (r2) ranged from 0.902 to 0.996. Results were influenced by the concentration of 25-hydroxyvitamin D2 as well as the C-3 epimer of 25-hydroxyvitamin D3. During the 6 years, 2 manufacturers altered their assays to match the reference method procedure more closely.
There is considerable bias, both proportional bias and sample-specific matrix effects, affecting many assays. This ongoing accuracy-based proficiency testing program for vitamin D will provide the data needed for laboratories and manufacturers to improve their assays and thereby patient care.
Vitamin D measurement techniques and proper utilization have been a source of interesting debate in the laboratory medicine community for quite some time.1–10 Vitamin D is made up of several fat-soluble secosteroids.11,12 The 2 major forms of vitamin D are vitamin D2 or ergocalciferol, which is derived from plants, and vitamin D3 or cholecalciferol, which is derived from animal sources. Vitamin D3 is also synthesized endogenously from 7-dehydrocholesterol in the skin upon exposure to ultraviolet-B light. The primary function of vitamin D is to help maintain calcium homeostasis. This is accomplished by enhancing calcium absorption in the small intestine. Vitamin D undergoes 25-hydroxylation in the liver by vitamin D 25-hydroxylase to form 25-hydroxyvitamin D [25(OH)D]. This product is complexed to its binding protein, vitamin D–binding protein, in the circulation and is stored in adipose tissue. It represents the major storage form of vitamin D, making it the ideal marker to assess vitamin D status.11–13 To become the active hormone, 25(OH)D reacts with the enzyme alpha-1 hydroxylase in the kidney and other target cells to produce 1,25-dihydroxyvitamin D [1,25(OH)2D]. This hormone then acts on the intestine, bone, and kidney to increase calcium levels in the circulation. An additional enzyme, 24-hydroxylase, converts 25(OH)D and 1,25(OH)2D to 24R,25-dihydroxyvitamin D [24,25(OH)2D] and 1,24,25-trihydroxyvitamin D, which are thought to be mostly inactive metabolites. Cashman et al14 have shown that 24,25(OH)2D may serve as a biomarker for nutritional status and also interferes with some immunoassays to cause a positive bias. Another possible source of interference is the C-3 epimer of 25(OH)D, which can be measured by using high-performance liquid chromatography–tandem mass spectrometry (LC-MS/MS).
The epimer concentration is relatively constant throughout life, but constitutes a larger fraction of total 25(OH)D in infancy (mean of 10.1%, up to 61%) than in adulthood (mean of 2.3%–3.2%).15–18
While LC-MS/MS has made significant inroads as a measurement technique, traditional immunoassays are still the mainstay in most clinical laboratories. There is a lack of consensus among the various professional societies on how to diagnose and treat vitamin D deficiency.3,13,19 To confound the problem further, there is also a lack of standardization among diagnostic manufacturers for vitamin D testing. To investigate the performance of clinical assays for the measurement of 25(OH)D, the College of American Pathologists (CAP) initiated the Accuracy-Based Vitamin D (ABVD) Survey.
Accuracy-based proficiency testing surveys, a subset of those available from CAP, provide minimally processed human samples that have target concentrations determined by reference measurement procedure.20–24 As a result, individual laboratories can compare their results to the most accurate result possible. Grading is independent of instrument peer groups, allowing health care systems and instrument vendors to compare the performance of multiple instrument systems across sites.25 This article summarizes the data gathered during the first 6 years of the ABVD Program.
MATERIALS AND METHODS
Specimens
Proficiency testing samples were prepared from pooled off-the-clot serum obtained from several donors, some of whom received oral ergocalciferol (vitamin D2) at least 24 hours before their blood draw (under an Institutional Review Board [IRB]–approved protocol at the University of Washington, Seattle). Pools were created by using the C37-A guideline from the Clinical and Laboratory Standards Institute, which describes the preparation and validation of homogenized commutable frozen human serum pools as secondary reference materials.26 Briefly, blood is drawn into a plastic blood collection bag and kept on ice. Plasma is separated within 5 minutes and transferred to a glass bottle. The plasma is allowed to clot at room temperature for 3 to 4 hours and the resulting serum is pooled and gently stirred for 18 hours at 4°C. The samples are then filtered, aliquoted, and frozen. The entire process is completed within 56 hours. Recently, in collaboration with the National Institutes of Health's Vitamin D Standardization Program, the specimens used in the ABVD Survey were demonstrated to be commutable and fit-for-purpose for proficiency testing of all clinical assays tested.7
Reference Measurements
Reference concentrations for 25(OH)D2, 25(OH)D3, and the C-3 epimer of 25(OH)D3 were determined by the isotope-dilution LC-MS/MS reference method procedure performed at the Fat-soluble Nutrients Laboratory, Nutritional Biomarkers Branch at the Centers for Disease Control and Prevention (Atlanta, Georgia).22
Data Analysis
Each laboratory provided results for total 25(OH)D and, if measured, for 25(OH)D2 and 25(OH)D3.25 Method means, standard deviations, and coefficients of variation were determined internally by the CAP. Summary data from the ABVD Survey were collected from participant summary reports spanning 2011–2017. Fractional deviation was calculated in Microsoft Excel 2010 (Microsoft, Redmond, Washington) as follows: (Observed Value – Reference Value) ÷ Reference Value. The data, which represent results from up to 320 laboratories using 9 different automated platforms and LC-MS/MS as a general methodology (Table 1), depending on the year, were analyzed by standard linear regression in Microsoft Excel. Multiple linear regression in the software package R (v.3.2.3) was used to predict the contribution of 25(OH)D2, 25(OH)D3, and the C-3 epimer of 25(OH)D3 to the measurement of the concentration of total 25(OH)D observed for each method. If the contribution of the epimer was determined to be statistically significant when 25(OH)D2 and 25(OH)D3 were included in the model, models with and without the C-3 epimer are presented. The multiple linear regression equations were then used to predict the concentrations that would be reported by each method for ABVD-08, which was included in the 2018-A mailing.
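As an illustration of the fractional deviation formula above, the following is a minimal Python sketch. It is not the spreadsheet used in the study; the function name and the example concentrations are hypothetical and do not come from the survey.

```python
def fractional_deviation(observed, reference):
    """Return (Observed Value - Reference Value) / Reference Value."""
    if reference == 0:
        raise ValueError("reference value must be nonzero")
    return (observed - reference) / reference


# Hypothetical concentrations in ng/mL (illustrative only, not survey data).
method_mean = 23.0       # mean reported by one method for one sample
reference_target = 20.0  # target assigned by the reference method procedure

deviation = fractional_deviation(method_mean, reference_target)
print(f"{deviation:.2f}")  # 0.15, ie, a 15% positive bias for this sample
```

A positive value indicates that the method reads higher than the reference target for that sample, and a negative value indicates that it reads lower.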
RESULTS
During the first 6 years of the ABVD Program, there were 43 samples distributed to participants. Data were provided to participants via participant summary reports and an example of these data is presented in Figure 1. The reports provided participants with descriptive data for each of the methods so that laboratories could review their results in comparison to other laboratories running the same method. The Accuracy-Based Surveys also provide reference target values, which have been established by a reference method procedure.
Sample Participant Summary Report. A representative table from the College of American Pathologists participant summary reports is shown (sample Accuracy-Based Vitamin D [ABVD-02] from 2013). Listed for each company or method category are the number of laboratories responding, as well as the mean, median, standard deviation (SD), coefficient of variation (CV), and low and high values for total 25-hydroxyvitamin D [25(OH)D]. The all-methods mean and related data for all methods are listed in the penultimate row, and the target concentration determined by the reference method procedure is shown in the bottom row. Abbreviation: CDC, Centers for Disease Control and Prevention.
To evaluate each method, simple linear regression was performed for each methodology compared with the reference method procedure (Figure 2, A through H). The squared Pearson correlation coefficients (r2) of the mean total 25(OH)D measurements for each methodology (ie, the mean across all laboratories submitting data) versus the reference method were in the range of 0.9021 to 0.9962. The slopes of the linear regression varied from 1.00 to 1.17 and the y-intercepts varied from 0.50 to 4.7 ng/mL.
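The relationship between the regression slope and proportional bias can be sketched as follows. This is an ordinary least-squares fit written in plain Python, offered as an assumption-laden illustration rather than the Excel analysis used in the study; the function name and all concentrations are hypothetical.

```python
def simple_linear_regression(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept, with r squared."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical values in ng/mL (illustrative only, not survey data).
reference = [10.0, 20.0, 30.0, 40.0]    # reference method targets
method_mean = [13.0, 24.0, 35.0, 46.0]  # method means for the same samples

slope, intercept, r_squared = simple_linear_regression(reference, method_mean)
proportional_bias_pct = (slope - 1.0) * 100.0

print(f"slope={slope:.2f}, intercept={intercept:.2f}, r2={r_squared:.3f}")
# slope=1.10, intercept=2.00, r2=1.000
print(f"proportional bias={proportional_bias_pct:.1f}%")
# proportional bias=10.0%
```

Under this reading, a slope of 1.17 corresponds to a proportional bias of about 17%, while the y-intercept captures a constant offset that is independent of concentration.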
A through H, Correlation with reference method. The mean observed concentration of total 25-hydroxyvitamin D [25(OH) vitamin D] for each sample for each company or method category within each mailing is compared to the reference method procedure (2011–2017). Values for Siemens (G) obtained before recalibration are shown in gray. Abbreviation: LC-MS/MS, liquid chromatography–tandem mass spectrometry.
To test the hypothesis that immunoassays were variably able to detect 25(OH)D2, we obtained IRB approval to administer 100,000 units of ergocalciferol to participants with vitamin D deficiency [defined as a plasma 25(OH)D concentration of less than 20 ng/mL]. Samples drawn from these research subjects were included in pools sent to survey participants such that approximately 30% of the samples sent out had low to moderate concentrations of vitamin D2 metabolites [4.4–23.5 ng/mL, making up 24%–57% of the total 25(OH)D]. When evaluating the relative bias for each sample versus the concentration of 25(OH)D2 (Figure 3, A through H), it is apparent that the Abbott (Figure 3, A), Beckman (Figure 3, B), Diasorin (Figure 3, C), and Roche (Figure 3, F) methods failed to fully detect 25(OH)D2.
A through H, The influence of 25(OH) vitamin D2 on observed concentrations. For each sample (2011–2017), the fractional deviation from the reference method procedure is plotted against the percentage of the total 25-hydroxyvitamin D [25(OH) vitamin D] that is composed of 25(OH) vitamin D2. Fractional deviation was calculated as follows: (Observed Value – Reference Value) ÷ Reference Value. Values for Siemens obtained before recalibration are shown in gray. Abbreviation: LC-MS/MS, liquid chromatography–tandem mass spectrometry.
We also tested the hypothesis that C-3 epimeric 25(OH)D3 had effects on vitamin D results by comparing the relative and absolute bias with the amount of C-3 epimeric D3 in each sample (Figure 4, A through F). Somewhat surprisingly, the Abbott (Figure 4, A and D) and Siemens (Figure 4, C and F) methods appear to be affected by the epimer. Less surprisingly, the LC-MS/MS (Figure 4, B and E) methods also appear to be significantly affected by the D3 epimer, with the mean bias of the LC-MS/MS methods matching the mean proportion of epimer across the samples [5.9% of the total 25(OH)D, on average].
A through F, Influence of epimeric 25(OH)D3 on observed concentrations. For each sample (2011–2017), the fractional deviation from the reference method procedure is plotted against the percentage of the total 25-hydroxyvitamin D [25(OH)D] that is composed of epimeric 25(OH)D3 (top panels) or the absolute deviation is plotted against the concentration of the C-3 epimer in each sample (bottom panels). Abbreviation: LC-MS/MS, liquid chromatography–tandem mass spectrometry.
When the ABVD Program was initiated, we hypothesized that instrument manufacturers and those performing laboratory-developed procedures would use the results of the proficiency testing survey to modify their calibration when necessary so that laboratories would provide more relevant concentrations for patient care. To test this hypothesis, we examined the fractional deviation of each measurement from the reference method for each methodology across 43 samples during a 6-year period (Figure 5, A through H). The results illustrate different sample-specific biases across the methods, with a consistently positive bias for the LC-MS/MS (Figure 5, E) methods and a consistently negative bias for Roche (Figure 5, F) and Diasorin (Figure 5, C). Importantly, 2 instrument manufacturers, namely, Siemens (Figure 5, G) and IDS iSYS (Figure 5, D), made changes to their assays that greatly improved the agreement between their assays and the reference measurement procedure.
A through H, Method performance over time. For all of the samples used in the Accuracy-Based Vitamin D (ABVD) Survey from 2011 to 2017, the fractional deviation from the reference method procedure is plotted against the ordinal sample number over time (N = 43 samples). Some methods did not have enough participating laboratories to generate a relevant mean. Arrows for Siemens and IDS iSYS indicate when recalibration was performed. Abbreviation: LC-MS/MS, liquid chromatography–tandem mass spectrometry.
Finally, we hypothesized that from the results of the 43 samples, we would be able to predict the results of a proficiency testing sample with a significant amount of 25(OH)D2. To test this hypothesis, we used multivariable linear regression to determine the influence of 25(OH)D2, 25(OH)D3, and epimeric 25(OH)D3 concentrations on the measured concentration by each method (Table 2). We then used the resulting β-coefficients and intercepts to predict the mean concentration that would be reported for each method for the ABVD-08 sample that was sent out in the 2018-A mailing. For at least 1 model for each method, the predicted 25(OH)D results were within 15% of the observed 25(OH)D results, suggesting that the β-coefficients for 25(OH)D2, which are estimates of the recovery of each assay for 25(OH)D2, are approximately correct.
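The modeling approach described above can be sketched in plain Python as follows. This is not the R analysis used in the study: the solver, function names, coefficients, and every concentration below are hypothetical, and the example data are generated without noise from assumed coefficients purely to show how a β-coefficient for 25(OH)D2 can be read as a recovery estimate.

```python
def fit_ols(rows, y):
    """Least-squares fit with an intercept via the normal equations (X'X)b = X'y."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Augmented normal-equation matrix [X'X | X'y].
    A = [[sum(xr[i] * xr[j] for xr in X) for j in range(p)]
         + [sum(xr[i] * yi for xr, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(coeffs, row):
    """Apply intercept and beta-coefficients to one sample's analyte concentrations."""
    return coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], row))


# Hypothetical reference concentrations in ng/mL: (25(OH)D2, 25(OH)D3, C-3 epimer).
samples = [(0.0, 20.0, 1.0), (5.0, 15.0, 0.5), (10.0, 25.0, 2.0),
           (20.0, 10.0, 1.5), (0.0, 30.0, 0.8), (15.0, 18.0, 1.2)]
# Hypothetical method means simulated from assumed coefficients (no noise):
# intercept 1.0, 50% recovery of D2, 100% of D3, 80% cross-reactivity with the epimer.
observed = [1.0 + 0.5 * d2 + 1.0 * d3 + 0.8 * epi for d2, d3, epi in samples]

coeffs = fit_ols(samples, observed)
new_sample = (12.0, 14.0, 1.0)      # hypothetical sample with substantial 25(OH)D2
predicted = predict(coeffs, new_sample)
observed_new = 20.5                 # hypothetical reported method mean
within_15_pct = abs(predicted - observed_new) / observed_new <= 0.15
print([round(c, 3) for c in coeffs], round(predicted, 2), within_15_pct)
```

In this sketch, a fitted β-coefficient of 0.5 for 25(OH)D2 would suggest that the method recovers roughly half of the 25(OH)D2 present. Real survey data would include measurement noise, so fitted coefficients would carry uncertainty that this example does not show.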
DISCUSSION
Harmonization of clinical laboratory results is important to maintain consistency of care from one institution to the next and to enable proper use of concentration guidelines established when using other methods. When available, reference method procedures and reference materials can be used to standardize assays, helping laboratories and manufacturers achieve the highest level of accuracy. By providing commutable samples (ie, samples that react in exactly the same manner as patient samples) with reference method procedure–defined concentrations, accuracy-based proficiency testing programs have delivered an important resource that allows laboratories and manufacturers to ensure that their results can be confidently used in the context of guidelines established elsewhere. Even in the absence of reference method procedures, accuracy-based proficiency testing can help bolster the harmonization of measurements over time.
By leveraging the results from many different laboratories around the world for each proficiency testing sample, we demonstrated sample-specific differences between each method and the reference method procedure. While some of the differences were explained by suboptimal cross-reactivity with 25(OH)D2 or unexpected cross-reactivity with epimeric 25(OH)D3, there are clearly other interferences in each of the assays that likely include different lipids and 24,25-dihydroxyvitamin D.14 In the future, it might be possible to use targeted or global metabolomics to identify what other molecules specifically cause the observed sample-specific matrix effects in this and similar accuracy-based proficiency testing surveys.
The underrecovery of 25(OH)D2 could be clinically significant in some populations in the United States, where the only pharmaceutical-grade preparation of vitamin D continues to be ergocalciferol. Unlike other proficiency testing programs, this survey was able to carefully evaluate the influence of endogenous 25(OH)D2 by using an IRB-approved protocol that administered pharmaceutical-grade ergocalciferol to vitamin D–deficient subjects. The results confirm previous findings27 and, owing to the number of samples sent out over many years, allowed us to calculate the average recovery of 25(OH)D2 in patient samples. Most of the ligand-binding assays were found to underrecover this important analyte.
Another important lesson learned during the first 6 years of the ABVD Survey is that mass spectrometry is not perfect. In fact, many laboratories using LC-MS/MS failed to report values within the acceptable range (within 25% of the target value). There are 3 likely explanations for this. First, many laboratories that use LC-MS/MS make their own calibrators in-house, which can lead to significant bias relative to the reference method procedure if the calibrators are made improperly. Second, LC-MS/MS assays have variable lower limits of quantification for 25(OH)D2, which can lead to a significant bias in samples with low total 25(OH)D and a high percentage of 25(OH)D2, as is commonly seen in patients receiving ergocalciferol therapy. Third, insufficient chromatographic separation of the C-3 epimer of 25(OH)D3 leads to overrecovery of 25(OH)D3, which is most significant in neonates and infants, but will also lead to bias when compared to the reference method procedure in these pooled adult serum samples. The data from the ABVD Survey will hopefully be used by laboratories to refine their methods to reduce bias.
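The acceptability criterion mentioned above (within 25% of the target value) can be expressed as a short check. This is a minimal sketch and not the CAP grading implementation; the function name, the inclusive treatment of the boundary, and the example values are assumptions.

```python
def within_acceptable_range(reported, target, tolerance=0.25):
    """Return True if reported lies within +/- tolerance * target of the target.

    Treating the boundary as acceptable is an assumption of this sketch.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    return abs(reported - target) <= tolerance * target


# Hypothetical results in ng/mL against a hypothetical target of 20 ng/mL.
target = 20.0
for reported in (16.0, 24.9, 26.0):
    print(reported, within_acceptable_range(reported, target))
# 16.0 True
# 24.9 True
# 26.0 False
```

With a 20 ng/mL target, the acceptable interval in this sketch spans 15 to 25 ng/mL, so a laboratory with a bias of about 6% from unresolved epimer would have less remaining margin for calibration error.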
The most encouraging lesson learned might be the fact that 2 manufacturers made significant changes to their immunoassays, which substantially reduced the bias observed between each assay and the reference method procedure. These efforts are to be applauded by the laboratory community and will hopefully stand out as something other manufacturers could model as we attempt to improve patient care.
In conclusion, the first 6 years of the ABVD Survey have provided an important window into the quality of patient care surrounding vitamin D status, have allowed us to characterize the cross-reactivity of each assay for endogenous 25(OH)D2, and may have helped 2 manufacturers improve their assays. It is anticipated that once laboratories and manufacturers fully embrace the importance of harmonized or standardized clinical biomarker results, other accuracy-based proficiency testing programs will have similar success.
Author notes
Dr Hoofnagle received grant support from Waters, Inc. The other authors have no relevant financial interest in the products or companies described in this article.
Presented at the 2017 Meeting of the Vitamin D Standardization Program; November 2017; Bethesda, Maryland.
Competing Interests
All authors are current or former members of the College of American Pathologists Accuracy-Based Testing Committee (ABTC).