Use of cystatin C for glomerular filtration rate estimation (eGFR) has garnered heightened interest as a means to avoid race-based medicine, since eGFRcys equations do not require specification of race. Before considering more widespread use of cystatin C, it is important to confirm that assays provide accurate measurements of cystatin C concentration, to ensure accurate GFR estimates.
This study aimed to determine whether the accuracy of cystatin C measurements in laboratories participating in the College of American Pathologists (CAP) Cystatin C (CYS) survey has improved since 2014.
Two fresh frozen serum pools, the first from healthy donors without chronic kidney disease (CKD), and the second from patients with CKD, along with a synthetically prepared elevated cystatin C pool, were sent to laboratories participating in the 2019 CYS-A survey. Target values were established by using 2 immunoassays and a bracketed 2-point calibration with diluted ERM-DA471/IFCC reference material.
For the healthy donor fresh frozen pool (ERM-DA471/IFCC-traceable target of 0.725 mg/L), the all-method mean (standard deviation, coefficient of variation) was 0.731 mg/L (0.071, 9.7%). For the CKD pool (ERM-DA471/IFCC-traceable target of 2.136 mg/L), the all-method mean was 2.155 mg/L (0.182, 8.4%). For the synthetically spiked pool (ERM-DA471/IFCC-traceable target of 1.843 mg/L), the all-method mean was 1.886 mg/L (0.152, 8.1%). This represents marked improvement in accuracy and between-method agreement compared to the 2014 CAP survey.
Manufacturers have markedly improved accuracy and between-method agreement of cystatin C measurement procedures since 2014, which allows for greater confidence in estimated GFR relying on cystatin C.
Cystatin C, a low-molecular-weight protein freely filtered by the kidneys, is an established alternative to creatinine for glomerular filtration rate (GFR) estimation.1–10 Current Kidney Disease: Improving Global Outcomes (KDIGO) guidelines recommend use of estimated GFR based on cystatin C (eGFRcys) in situations where estimated GFR based on creatinine (eGFRcr) may be less accurate.11 Additionally, concomitant use of both creatinine and cystatin C measurements for GFR estimation (eGFRcr-cys) has been shown to provide greater accuracy than reliance on either biomarker alone.2,12
In the last year, with an increased spotlight on issues related to racial injustice, heightened scrutiny on practices that involve race-based medicine has led to criticism of eGFRcr owing to the need to specify whether the patient is Black or non-Black for computation of the eGFR.13–16 This has led to calls to discontinue race-based reporting of eGFRcr, and a vigorous debate on how to best achieve this goal.17 One of many options under discussion is the eGFRcys equation, which does not include a race parameter.17
Before recommending more widespread use of serum cystatin C concentration, it is important to ensure that cystatin C measurements in clinical laboratories are accurate, and that different measurement procedures are in agreement with one another. Disagreement would lead to variation in eGFR values reported by different laboratories, which could be misinterpreted as changes in actual GFR. In 2014, an assessment of laboratories' performance in the College of American Pathologists (CAP) Cystatin C Proficiency Testing Program (CYS survey) revealed a concerning lack of measurement procedure agreement as well as substantial biases from target for some measurement procedures.18 This report concluded that improvements in cystatin C measurement across procedures were needed in order for cystatin C to provide reliable estimates of GFR for clinical care. To determine whether performance of cystatin C assays has improved, we examined proficiency testing data from the 2019 CAP CYS survey to assess the state of cystatin C assay performance across laboratories and to determine whether broader use of cystatin C for GFR estimation can be recommended at this time.
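To illustrate how a measurement bias carries through to a reported eGFR, the sketch below passes a biased cystatin C value through the 2012 CKD-EPI cystatin C equation. The equation coefficients come from the published CKD-EPI formula rather than from this report, and the patient (60-year-old man, true cystatin C of 1.50 mg/L) and the ±10% biases are illustrative assumptions, not survey data.

```python
def egfr_cys_2012(scys_mg_l: float, age_years: float, female: bool) -> float:
    """2012 CKD-EPI cystatin C equation, in mL/min/1.73 m^2 (no race term)."""
    ratio = scys_mg_l / 0.8
    egfr = (
        133.0
        * min(ratio, 1.0) ** -0.499
        * max(ratio, 1.0) ** -1.328
        * 0.996 ** age_years
    )
    return egfr * 0.932 if female else egfr


def egfr_shift_percent(true_scys, bias_percent, age_years=60, female=False):
    """Percent change in eGFR caused by a percent bias in measured cystatin C."""
    reference = egfr_cys_2012(true_scys, age_years, female)
    biased = egfr_cys_2012(true_scys * (1 + bias_percent / 100), age_years, female)
    return 100 * (biased - reference) / reference


# Hypothetical patient and hypothetical assay biases (assumptions for illustration).
for bias in (-10, 10):
    shift = egfr_shift_percent(1.50, bias)
    print(f"{bias:+d}% cystatin C bias -> {shift:+.1f}% change in eGFR")
```

Because the exponent above 0.8 mg/L is larger than 1 in magnitude, a given percentage bias in cystatin C produces a somewhat larger percentage shift in eGFR in the opposite direction, which is why between-method disagreement matters clinically.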
METHODS
The 2019 CAP CYS-A survey samples were shipped in April 2019 and had 207 participating laboratories reporting cystatin C results. Laboratories were instructed to treat survey samples like routine clinical samples and measure each sample a single time, and report the concentration back to CAP. This survey included 3 proficiency testing samples. Samples CYS-01 and CYS-03 were pooled, off-the-clot, fresh frozen serum without additives or additional manipulation, and CYS-02 was created by synthetically spiking pooled serum from CYS-01 with purified cystatin C. Samples CYS-01 and CYS-02 were prepared by Aalto Scientific, Ltd. CYS-01 was obtained from 22 donors, resulting in a cystatin C level in the normal range. Blood was collected from donors in serum tubes without anticoagulant or preservatives, and immediately centrifuged after collection. The supernatant was collected and allowed to clot at room temperature for at least 24 hours. After recentrifugation, the supernatant was collected, then frozen at −20°C; final serum volumes ranged from 205 to 251 mL per donor. Once all participant samples were collected, units were thawed at 2°C to 8°C and pooled. A portion of the CYS-01 pool was then spiked with purified cystatin C from human serum to create CYS-02, a pooled sample with an elevated cystatin C concentration. Cystatin C was purified by Aalto Scientific from human serum, using a proprietary tangential flow filtration method, which separates proteins by molecular weight. Both the CYS-01 and CYS-02 pools were then individually aliquoted into testing vials, and stored at −10°C to −20°C until shipment to participating laboratories. CYS-03 was obtained from approximately 12 patients with known chronic kidney disease (CKD) at the Kidney Disease and Blood Pressure Center at Tufts Medical Center (Boston, Massachusetts), and thus had an elevated concentration of cystatin C. 
Blood was collected from each participant in five 7.5-mL double-gel serum separator tubes without anticoagulant or preservatives, and immediately centrifuged at 1500g for 10 minutes at room temperature in a swinging bucket centrifuge. Tubes were same-day overnight shipped with a refrigerated gel pack to the University of Minnesota Advanced Research and Diagnostic Laboratory (ARDL). At ARDL, serum from each participant was combined into a single large aliquot tube and frozen at −80°C. Once serum from all participants was collected, the serum aliquots were thawed, pooled, and aliquoted into individual testing vials, then frozen at −80°C. These testing vials were then shipped with a refrigerated gel pack overnight to Aalto Scientific, where they were stored at −10°C to −20°C until shipment to participating laboratories. All testing vials were shipped from Aalto Scientific to individual participating laboratories overnight with a refrigerated gel pack, which allows for partial thawing during transit. Participating laboratories were then instructed to store the samples at 2°C to 8°C for up to 3 days. The relevant institutional review boards reviewed and approved these blood collections.
Target values for each sample were established with the ERM-DA471/IFCC reference material (obtained from the Joint Research Centre), as previously described.18 The 3 CYS survey samples and 1:2 and 1:6 gravimetric dilutions of ERM-DA471/IFCC reference material in normal saline were measured in triplicate on 2 different days (6 total measurements per sample), using 2 different measurement procedures (Siemens ProSpec instrument using Siemens reagents and calibrators, and the Roche COBAS instrument using Gentian reagents and calibrators). Mean values of the reference material dilutions were used to create a 2-point calibration line for each measurement procedure, by plotting that procedure's measured results versus the calculated ERM-DA471/IFCC reference material dilution concentrations. Using the 2-point ERM-DA471/IFCC calibration lines, ERM-DA471/IFCC-traceable concentrations were computed for each of the 3 survey pools. For each survey pool, the target values from the 2 measurement procedures were averaged to obtain a final target value for each survey sample. Since the certified cystatin C mass concentration of undiluted ERM-DA471/IFCC has an expanded uncertainty of ±2.7% (k = 2), uncertainty of these calculated target values is estimated to be about 4%.18
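The 2-point calibration and averaging steps can be sketched as follows. This is a minimal illustration, not the authors' SAS workflow: the certified concentration of 5.48 mg/L, the reading of "1:2" and "1:6" as dilution factors of 2 and 6, the procedure names, and every measured value are assumptions chosen only to show the arithmetic.

```python
from statistics import mean

# Assumed certified concentration of undiluted ERM-DA471/IFCC (not stated in the text).
ERM_CERTIFIED_MG_L = 5.48
# Assumed interpretation of the 1:6 and 1:2 dilutions as dilution factors of 6 and 2.
ASSIGNED_LOW = ERM_CERTIFIED_MG_L / 6
ASSIGNED_HIGH = ERM_CERTIFIED_MG_L / 2


def calibration_line(measured_low, measured_high, assigned_low, assigned_high):
    """Slope and intercept of the line through the two dilution points,
    mapping a procedure's measured result to a traceable concentration."""
    slope = (assigned_high - assigned_low) / (measured_high - measured_low)
    intercept = assigned_low - slope * measured_low
    return slope, intercept


def traceable_value(measured, slope, intercept):
    """Apply the calibration line to a measured pool result."""
    return slope * measured + intercept


# Hypothetical mean results (mg/L) of the 6 replicate measurements per procedure.
procedures = {
    "procedure_A": {"dilutions": (0.95, 2.80), "pool": 2.20},
    "procedure_B": {"dilutions": (0.88, 2.66), "pool": 2.07},
}

per_procedure_targets = []
for name, data in procedures.items():
    low, high = data["dilutions"]
    slope, intercept = calibration_line(low, high, ASSIGNED_LOW, ASSIGNED_HIGH)
    per_procedure_targets.append(traceable_value(data["pool"], slope, intercept))

# Final target value: average of the two procedure-specific traceable values.
final_target = mean(per_procedure_targets)
print(f"final target: {final_target:.3f} mg/L")
```

Each procedure gets its own line because its raw results may sit on a different scale; averaging takes place only after both have been mapped onto the reference material scale.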
All analyses and graphics were completed by using SAS (Cary, NC). Data were analyzed for all measurement procedures combined, and 5 reagent/calibrator groups with at least 10 users were subgrouped for analysis and comparison: Roche COBAS c Series; Gentian; Diazyme Laboratories; and 2 Siemens subgroups, namely Siemens Nephelometer Systems and Siemens Nephelometer Systems IFCC-traceable. Siemens Nephelometer Systems users self-categorized on the basis of whether their method calibration was traceable to the ERM-DA471/IFCC reference material. A 2-pass 3 standard deviation (SD) outlier screen was applied at both the all-method analysis and the subgroup analysis. The Shapiro-Wilk test was used to assess the distribution of results for each subgroup. Analysis of variance (ANOVA) was used to test for differences in means between the measurement procedures for each sample.
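A 2-pass 3 SD outlier screen can be sketched as below. The exact rule used in the survey software is not described here, so this assumes that each pass recomputes the mean and sample SD from the results still retained and drops anything more than 3 SD from that mean; the data are invented for illustration.

```python
from statistics import mean, stdev


def three_sd_screen(values, passes=2, k=3.0):
    """Iteratively drop results lying more than k SDs from the current mean.

    Assumption: the mean and sample SD are recomputed on each pass from the
    results that survived the previous pass.
    """
    kept = list(values)
    for _ in range(passes):
        if len(kept) < 3:
            break  # too few results to estimate an SD meaningfully
        center, spread = mean(kept), stdev(kept)
        kept = [v for v in kept if abs(v - center) <= k * spread]
    return kept


# Invented results (mg/L): a tight cluster plus one gross and one moderate outlier.
cluster = [0.70, 0.72, 0.74, 0.73, 0.71, 0.75, 0.69, 0.76, 0.72, 0.74] * 2
results = cluster + [5.0, 1.2]

after_one_pass = three_sd_screen(results, passes=1)
after_two_passes = three_sd_screen(results, passes=2)
print(len(results), len(after_one_pass), len(after_two_passes))
```

In this example the gross outlier inflates the SD enough to mask the moderate one on the first pass; the second pass, run on the cleaned data, removes it. That masking effect is the usual reason for running the screen twice.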
RESULTS
A total of 207 laboratories returned single assay results for the 2019 CAP CYS-A survey; for CYS-01 and CYS-02 the 3SD outlier screen removed 2 outliers each, and therefore 205 laboratory-reported results are presented for these 2 survey samples. The all-method mean, SD, coefficient of variation (CV), target value, and percentage bias of all-method mean from target are shown in the Table for each of the three 2019 CYS survey samples. For comparison, analogous sample data from the CAP 2014 CYS survey are also shown.18 The CVs for all three 2019 CYS survey samples were lower than 10% and were lower than the CVs for all samples in the 2014 CYS survey, indicating an improvement during a 5-year period in all-method variability. Additionally, the percentage biases [standard error] of the all-method means from target values were low for all three 2019 CYS survey samples (0.83% bias [0.68%] from the 2019 CYS-01 target of 0.725 mg/L; 0.89% bias [0.59%] from the 2019 CYS-03 target of 2.136 mg/L; and 2.33% bias [0.58%] from the 2019 CYS-02 target of 1.843 mg/L), which was also an improvement from the percentage biases seen in the 2014 CYS survey (−4.73% bias [1.09%] from the 2014 CYS-WC2 target of 2.370 mg/L; and −6.88% bias [1.2%] from the 2014 CYS-WC1 target of 0.960 mg/L).
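The percentage biases and standard errors quoted above can be reproduced from the reported all-method means, SDs, and targets. The sketch below assumes that the standard error is the standard error of the mean (SD divided by the square root of the number of results) expressed as a percentage of the target, and that all 207 results were retained for CYS-03; neither assumption is stated explicitly in the text.

```python
from math import sqrt


def percent_bias(all_method_mean, target):
    """Bias of the all-method mean from the target, as a percentage of target."""
    return 100 * (all_method_mean - target) / target


def percent_bias_se(sd, n, target):
    """Standard error of the mean, as a percentage of target (assumed definition)."""
    return 100 * (sd / sqrt(n)) / target


# All-method statistics reported for the 2019 survey; n for CYS-03 is assumed.
samples = {
    "CYS-01": {"mean": 0.731, "sd": 0.071, "n": 205, "target": 0.725},
    "CYS-03": {"mean": 2.155, "sd": 0.182, "n": 207, "target": 2.136},
    "CYS-02": {"mean": 1.886, "sd": 0.152, "n": 205, "target": 1.843},
}

for name, s in samples.items():
    bias = percent_bias(s["mean"], s["target"])
    se = percent_bias_se(s["sd"], s["n"], s["target"])
    print(f"{name}: bias {bias:.2f}% [SE {se:.2f}%]")
```

Under these assumptions the computed values round to the figures reported for all 3 samples.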
Table. Comparison of 2019 Cystatin C (CYS) Survey and 2014 CYS Survey Statistics for All Measurement Procedures

Figure 1 demonstrates the between-method agreement and variability of individual measurement procedures for the larger subgroups (≥10 participants), relative to the ERM-DA471/IFCC-traceable target values for each sample (horizontal lines). The Shapiro-Wilk test demonstrated normal distributions for all samples and measurement procedures, except for the Siemens Nephelometer Systems IFCC-traceable measurement procedure for sample CYS-02. The between-measurement procedure agreement has substantially improved when compared to 2014 CYS survey data.18 Means for individual measurement procedures for the 2014 CYS-WC1 healthy donor pool ranged from 0.780 to 1.061 mg/L (81.3%–110.5% of the target), whereas the measurement procedure means for the 2019 CYS-01 healthy donor pool ranged from 0.685 to 0.775 mg/L (94.5%–106.9% of the target). Individual measurement procedure means for the 2014 CYS-WC2 CKD pool ranged from 2.052 to 2.909 mg/L (86.6%–122.7% of the target), with the 2019 CYS-03 CKD pool measurement procedure means ranging from 2.056 to 2.250 mg/L (96.3%–105.3% of the target). Lastly, the 2014 CYS-01 synthetically spiked sample had measurement procedure means of 1.365 to 2.203 mg/L (target not determined), with the 2019 CYS-02 synthetically spiked pool ranging from 1.806 to 1.926 mg/L (98.0%–104.5% of the target). Therefore, all three 2019 CYS survey samples demonstrated improved between-method agreement when compared to the analogous 2014 CYS survey samples. In terms of individual measurement procedure variability in the 2019 survey, the Roche COBAS c Series, Siemens Nephelometer Systems IFCC, and Gentian had relatively small interquartile ranges (IQRs) for all 3 samples, indicative of lower variability in results across laboratories. In contrast, the Diazyme and Siemens Nephelometer Systems had the largest IQRs across the 3 survey samples, indicating greater variability in results from participants.
Additionally, in comparing the 2 Siemens Nephelometer Systems subgroups, the IFCC-traceable measurement procedure is closer to the ERM-DA471/IFCC-traceable target values for all 3 survey samples than the nontraceable Siemens Nephelometer Systems measurement procedure, confirming an expected improvement in accuracy with standardization. The nontraceable Siemens Nephelometer Systems measurement procedure results are consistent with prior reports of negative bias with nontraceable Siemens measurement procedures.18–20
Figure 1. Boxplots sorted by method-specific groups for each sample in the 2019 cystatin C CAP survey. Lower and upper limits of each box represent the 25th to 75th percentile. Within each box, the diamond represents the mean and the bar represents the median. Whiskers extend to the values within 1.5 times the IQR above and below the box limits. Any dots represent values above or below the 1.5*IQR limits. The horizontal line across each set of boxplots represents the ERM-DA471/IFCC-traceable target value for each sample. CYS-01 is pooled serum from healthy (non-CKD) donors, CYS-02 is pooled serum synthetically spiked with cystatin C, and CYS-03 is pooled serum from donors with CKD. Siemens Nephelometer Systems users were asked to self-categorize on the basis of whether their measurement procedure had an IFCC-traceable calibration (method 5) or not (method 6). Abbreviations: CAP, College of American Pathologists; CKD, chronic kidney disease; IQR, interquartile range.
The improvement in between-measurement agreement and accuracy from 2014 to 2019 is further highlighted in Figure 2 for 4 commonly used measurement procedures. The percentage bias relative to ERM-DA471/IFCC-traceable target values for the CKD and healthy (non-CKD) donor pooled serum survey samples in 2014 and 2019 are shown for comparison. In 2014, the percentage bias for the CKD sample ranged from −14.4% to +21.7% for the 4 methods, whereas in 2019 the percentage bias range was much smaller, from −2.9% to +4.3%. Similarly, the healthy (non-CKD) donor pool sample had a percentage bias range of −19.8% to +9.5% in 2014, with a narrower range of −5.3% to +5.9% in 2019. Therefore, measurement procedures have improved in terms of between-method agreement and accuracy.
Figure 2. Comparison of percentage bias from the ERM-DA471/IFCC-traceable target value for 4 commonly used methods in 2014 versus 2019. Left panel represents the pooled serum survey samples collected from patients with CKD (2014 CYS-WC2 and 2019 CYS-03), whereas the right panel represents pooled serum survey samples from healthy, non-CKD donors (2014 CYS-WC1 and 2019 CYS-01). For 2019, since Siemens Nephelometer Systems users were split into ERM-DA471/IFCC-traceable and nontraceable method groups, the IFCC-traceable subgroup was used to construct this plot. Abbreviations: CKD, chronic kidney disease; WC, wild card.
It was also of interest to determine whether the biases seen with the synthetically spiked CYS-02 sample matched biases detected from the off-the-clot CYS-03 sample collected from donors with CKD, because collecting samples from patients with CKD on a regular basis for every proficiency testing challenge is logistically and economically impractical. Figure 3 demonstrates an excellent agreement in the biases seen between the 2 samples, which suggests that a synthetic sample is an adequate surrogate for detecting cystatin C assay biases in patient samples.
Figure 3. Comparison of the absolute bias (mg/L) for 2019 cystatin C survey sample CYS-02 (synthetically spiked) to the absolute bias (mg/L) seen for 2019 cystatin C survey sample CYS-03 (obtained from CKD donors). Line of identity (x = y) is shown for reference. Color-coded symbols represent individual bias results for measurement procedures with 10 or more participants.
DISCUSSION
In the last 5 years, the between-method agreement and accuracy of cystatin C measurement procedures have substantially improved. Specifically, the ranges of individual measurement procedure means and differences from their targets were smaller for all 2019 CYS survey samples than for 2014 CYS survey samples, and CVs and percentage biases from target values were lower in both normal and elevated 2019 CAP CYS survey samples than in 2014 CYS survey results. Among the most commonly used measurement procedures, Roche, Gentian, and the Siemens measurement procedure with IFCC-traceable calibration demonstrated the lowest variability (low IQRs) in results among participants. In addition, our data demonstrate that the degree of bias in a synthetically prepared survey sample closely approximates the bias seen in a pooled off-the-clot serum sample from patients with CKD, providing support for the use of synthetically prepared samples as surrogates for patient samples in cystatin C proficiency testing.
While performance of cystatin C measurement procedures improved substantially during this 5-year period, caveats and concerns remain. First, despite the improved performance of cystatin C measurement procedures, the accuracy and between-method agreement of cystatin C measurement fall well short of those achieved for creatinine,21,22 which remains the most widely accepted filtration marker for GFR estimation. Virtually all creatinine measurement procedures have been standardized to an isotope-dilution mass spectrometry (IDMS) reference measurement procedure, with a serum-based reference material (SRM 967 from the National Institute of Standards and Technology) available to manufacturers and laboratories that is traceable to IDMS. While a certified reference material for cystatin C (ERM-DA471/IFCC) is available, there are currently no certified reference measurement procedures for cystatin C to definitively establish target values for the cystatin C reference material.
Second, while all cystatin C assay manufacturers now have measurement procedure “kits” that are traceable to the ERM-DA471/IFCC reference material, there is still distribution of non–ERM-DA471/IFCC traceable measurement procedure “kits” to users of older Siemens platforms, such as the Advia Centaur or Siemens Dimension Vista. Therefore, users of older Siemens platforms should be aware that use of eGFR equations established from assays standardized to ERM-DA471/IFCC reference material will not provide accurate eGFR estimates; prior published data estimate that nontraceable Siemens cystatin C results are biased approximately 12% to 17% low.18–20 Relatedly, we suspect that some Siemens participants in the 2019 CYS survey may not be aware of the reference material–traceability status of their measurement procedure calibration, and may have selected the incorrect Siemens reagent/instrument group for the 2019 CYS survey. This could explain the nonnormal distribution for survey sample CYS-02 for the Siemens Nephelometer Systems IFCC results, and could also explain the higher IQRs seen for the Siemens Nephelometer Systems for all three 2019 CYS survey samples, if the results represent a mixture of measurement procedures with nontraceable and IFCC-traceable calibration. Data shown in Figure 3 would support this hypothesis, as there are 2 distinct clusters for the Siemens Nephelometer Systems users, with one cluster showing a negative bias (consistent with non-traceable calibration) and the other cluster showing minimal bias (consistent with IFCC-traceable calibration).
There are some limitations to this study. First, both Gentian and Diazyme reagents can be used on multiple instruments; we did not have data on the specific instrument used by each laboratory for these reagents and therefore cannot draw conclusions about performance of these reagents on specific instrument platforms. However, it is remarkable that the IQRs for Gentian reagents were relatively low, given the number of potential instruments used by clinical laboratories. By contrast, the larger IQRs seen with Diazyme reagents may reflect issues of discordance in reagent performance across various instruments. Second, laboratories using Siemens reagents/instruments were responsible for self-selecting whether they should be included in the IFCC-traceable or nontraceable measurement procedure groups; we cannot verify whether their self-selected measurement procedure groupings were accurate. In fact, it appears from our data that a subset of laboratories that categorized as nontraceable had results more consistent with an IFCC-traceable calibration.
CONCLUSIONS
In conclusion, harmonization and accuracy of cystatin C measurement procedures are both important to ensure accurate estimates of GFR when using standardized equations across different reagent and/or instrument platforms. Our data demonstrate marked improvement in both metrics during a 5-year period, providing greater confidence in the clinical utility and reliability of cystatin C measurements. This has important implications for avoiding race-based medicine when estimating GFR, as creatinine-based eGFR has faced increased criticism for its inclusion of a race parameter. However, establishment of certified reference measurement procedures for cystatin C is needed to approach the accuracy and between-method agreement already achieved for standardized creatinine measurement procedures.
We are grateful to the blood donors at Tufts Medical Center (Boston, Massachusetts) with chronic kidney disease (CKD) who provided blood samples. We also thank the clinical laboratories that participated in the College of American Pathologists' CYS survey and therefore contributed to this study, the CKD-EPI research group at Tufts Medical Center, staff at the Advanced Research and Diagnostic Laboratory (ARDL) at the University of Minnesota, and the staff and committee members at College of American Pathologists who supported this project. Lastly, we are grateful to Greg Rynders, BS, now retired from ARDL, for his expertise and contributions to data collection for this manuscript.
Author notes
Authors received grants supporting this work from the National Institute of Diabetes and Digestive and Kidney Diseases (R01DK097020, U01DK053869, R01DK116790). Karger has received research support from Siemens Healthcare Diagnostics and is a consultant for Roche Diagnostics Corporation. The other authors have no relevant financial interest in the products or companies described in this article.
The Advanced Research and Diagnostic Laboratory (ARDL) at the University of Minnesota has received free or considerably discounted reagents for measurement of cystatin C and other CKD-related biomarkers for a variety of National Institutes of Health–funded research studies from Siemens Healthcare Diagnostics.