Context.—

Reliable quantification of α-fetoprotein (AFP) is critical for clinical diagnosis. Accuracy in AFP analysis relies on traceability to reference materials with confirmed commutability.

Objective.—

To assess the commutability of the reference materials for AFP. We screened for appropriate reference materials for the calibration of clinical AFP analysis and for application in an external quality assessment scheme. The feasibility of using water to dilute a reference material from the World Health Organization was also evaluated.

Design.—

Patient serum samples with various levels of AFP were randomly interspersed among AFP reference materials from the World Health Organization, the Beijing Center for Clinical Laboratories, and Beijing Controls and Standards Biotechnology and quality controls from Bio-Rad. The samples were analyzed on 5 different platforms to assess the comparability of the results and commutability of the reference materials.

Results.—

Significant variations in AFP measurement were observed among the 5 instrument platforms. The Beijing Center for Clinical Laboratories and Beijing Controls and Standards Biotechnology reference materials were commutable across all the instrument platforms. The World Health Organization AFP 72/225 reference material diluted with distilled water was also commutable at high concentrations. The Bio-Rad quality control materials for AFP were commutable among 4 out of 5 instrument platforms.

Conclusions.—

Our results suggested that the Beijing Center for Clinical Laboratories and Beijing Controls and Standards Biotechnology materials were commutable across all 5 instrument platforms, whereas the Bio-Rad quality controls were limited by the concentration of AFP and the instrument platforms used. Caution needs to be taken in using water to dilute the World Health Organization 72/225 reference material because its commutability is limited to high concentrations.

α-Fetoprotein (AFP) is one of the most commonly used markers in the clinical diagnosis of cancer. Its expression is highly specific for liver, and it is commonly used in screening of high-risk populations for primary liver cancer and in diagnosis and monitoring of therapeutic outcome and tumor recurrence. In addition, AFP levels in maternal blood and amniotic fluid are used for screening of developmental abnormalities in the fetus. Accurate measurement of AFP, therefore, weighs heavily in decision making for diagnosis and patient management.1 

Achieving accuracy within clinically meaningful limits is challenging because measurement procedures and instruments differ among clinical laboratories. Although international standards have helped in the standardization and harmonization of clinical measurement, their use is limited to certain measurands because of the lack of commutability. Because no reference measurement procedure is available, accuracy in AFP analysis relies on traceability to international standard materials. The AFP reference material 72/225, provided by the World Health Organization (WHO), has been used for calibration by the majority of instrument platforms; however, the commutability of this reference material is yet to be confirmed.

Participating in external quality assessment schemes (EQAS) helps improve the reliability of the measurement of AFP. Results of EQAS from the College of American Pathologists, the Beijing Center for Clinical Laboratories (BCCL), and Stichting Kwaliteitsbewaking Medische Laboratoriumdiagnostiek of the Netherlands indicate significant variability in the measurement of AFP using different instrument platforms.2  This suggests that commutability of the EQA materials and the WHO reference standard is questionable, and this may potentially affect the comparability of the results.

In the present study, the comparability of AFP measurements using 5 different instrument platforms was evaluated using 24 human serum specimens. Reference materials from WHO, BCCL, and Beijing Controls and Standards Biotechnology (BCSB) and quality control materials from Bio-Rad (Hercules, California) were used in the commutability study so as to screen for appropriate reference materials for EQAS. In addition, the main factors contributing to the variation were explored to guide further standardization.

Patient Samples

Twenty-four human serum specimens with low, medium, and high levels of AFP were collected from Beijing Chao-yang Hospital, Capital Medical University (Beijing, China), after approval from its human ethics committee (document 2015-7-24-11). All 24 specimens were nonhemolyzed and nonlipemic, with AFP levels ranging from 11.44 to 786.58 ng/mL as determined using the Abbott Architect i2000 (Abbott Diagnostics, Lake Forest, Illinois). For every serum sample, a minimum of 2.5 mL was collected, mixed, aliquoted into 5 parts, and stored at −80°C until analysis.

Reference Materials

Standard materials for AFP analysis included WHO 72/225 (60 500 ng/mL), purchased from the National Institute for Biological Standards and Control (Hertfordshire, United Kingdom); 3 candidate materials with different concentrations of AFP from BCCL (Beijing, China); a certified AFP standard (GBW: E090088) from BCSB (Beijing, China); and 3 Bio-Rad quality control materials with low, medium, and high levels of AFP (Lyphochek Tumor Marker Plus Control 54581, 54582, and 54583, respectively).

Instrument Platforms and Analysis

The 5 analytical systems used in this study included the Architect i2000SR Immunoassay Analyzer (Abbott Diagnostics), the UniCel DXI 800 Immunoassay System (Beckman Coulter, Brea, California), the Cobas e601 Analyzer (Roche Diagnostics, Indianapolis, Indiana), the Advia Centaur XP Immunoassay System (Siemens, Erlangen, Germany), and the Lumo A2000 Immunoassay System (Autobio, Zhengzhou, China), each with the manufacturer's own reagents and calibrators. The instruments were calibrated according to each manufacturer's specifications in the instructions for use.

In addition, the WHO standard material 72/225 was diluted with distilled water, which is commonly used to dilute calibrators, to 10 different levels (966.2, 724.1, 482.9, 241.7, 180.8, 120.7, 60.5, 30.1, 18.1, and 6.0 ng/mL). These standards, together with the 3 reference materials from BCCL, 1 from BCSB, and 3 quality control materials from Bio-Rad, were randomly interspersed among the 24 patient serum samples. The AFP levels in these samples and standards were then measured in triplicate, in the same analytical sequence, on the 5 instrument platforms to assess the feasibility of using water to dilute this widely used standard.
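As a rough illustration of the dilution arithmetic, the sketch below computes the overall dilution factor from the stated stock concentration (60 500 ng/mL) to a target level. The text does not say whether the dilutions were single-step or serial, so the single-step split in `volumes_for` and its final volume are assumptions made only for the example.

```python
STOCK_NG_ML = 60500.0  # stated concentration of WHO 72/225

# Target levels listed in the text (ng/mL)
TARGETS_NG_ML = [966.2, 724.1, 482.9, 241.7, 180.8, 120.7, 60.5, 30.1, 18.1, 6.0]


def dilution_factor(stock, target):
    """Overall fold dilution needed to bring stock down to target."""
    return stock / target


def volumes_for(stock, target, final_volume_ml):
    """Stock and diluent volumes for a single-step dilution (C1*V1 = C2*V2).

    The single-step scheme and the final volume are hypothetical.
    """
    v_stock = final_volume_ml * target / stock
    return v_stock, final_volume_ml - v_stock


if __name__ == "__main__":
    for level in TARGETS_NG_ML:
        print(f"{level:7.1f} ng/mL -> 1:{dilution_factor(STOCK_NG_ML, level):.0f}")
```

The lowest level requires roughly a 10 000-fold dilution, so in practice a serial scheme would likely be preferred to limit pipetting error.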

Statistical Analysis

Microsoft Excel 2010 (Microsoft, Redmond, Washington) was used to process the data, using formulas provided in the Clinical and Laboratory Standards Institute (CLSI) guideline EP30-A (formerly CLSI C53-A).3  Results from each platform were individually compared with those from the other 4 systems, and general linear regression analysis was performed to calculate the slopes (b), intercepts (a), correlation coefficients (r), and residuals (Sres). The average coefficient of variation (CV) of the 24 human serum specimens across all 5 platforms was used as the comparability measure. If 2CV was lower than the allowable total error (TEa), the results were regarded as comparable. Analysis of variance (randomized block design) was performed to assess the significance of the variation among platforms using SPSS (version 17.0; SPSS, Chicago, Illinois).
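The comparability criterion and the pairwise regression statistics can be sketched in a few lines of Python. This is not the authors' Excel workbook. It assumes that the CV is computed per sample from the sample standard deviation across platforms and then averaged, and that Sres denotes the residual standard deviation. All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def sample_cv(values):
    """Percent CV of one sample's results across platforms."""
    return stdev(values) / mean(values) * 100.0


def mean_cv(results_by_sample):
    """Average of the per-sample CVs; each inner list is one sample on every platform."""
    return mean(sample_cv(v) for v in results_by_sample)


def is_comparable(avg_cv, tea):
    """Criterion from the text: comparable if 2 x CV is lower than TEa."""
    return 2.0 * avg_cv < tea


def linear_fit(x, y):
    """Ordinary least squares of y on x: slope b, intercept a, r, residual SD."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    r = sxy / math.sqrt(sxx * syy)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_res = math.sqrt(ss_res / (n - 2))
    return b, a, r, s_res
```

For example, `is_comparable(12.7, 21.9)` returns `False`, because 2 × 12.7 = 25.4 is not below 21.9.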

The commutability of AFP reference materials was evaluated by Deming regression, which was carried out following CLSI guideline EP30-A, to obtain the regression and the 95% prediction intervals. The reference materials that fell within the 95% prediction intervals were regarded as commutable between 2 instrument platforms.3,4  To estimate the matrix-related biases for noncommutable materials, the predicted logarithms were back transformed and the variations of measured values from the predicted values were calculated.4 
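A minimal sketch of this kind of assessment follows, under several stated assumptions. It fits a Deming regression to log10-transformed patient results, with an error-variance ratio `delta` of 1 (an assumption). It builds an approximate 95% prediction interval using the familiar ordinary-least-squares width formula, which is a simplification and not necessarily the exact EP30-A computation. It then checks whether a reference material falls inside the interval. The default `t_crit` of 2.074 is the two-sided 95% Student t value for 22 degrees of freedom (24 samples). The function names are hypothetical.

```python
import math
from statistics import mean


def deming_fit(x, y, delta=1.0):
    """Deming regression; delta is the assumed ratio of y- to x-error variance."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    root = math.sqrt((syy - delta * sxx) ** 2 + 4.0 * delta * sxy ** 2)
    slope = (syy - delta * sxx + root) / (2.0 * sxy)
    return slope, my - slope * mx


def prediction_interval(x, y, x0, t_crit=2.074, delta=1.0):
    """Approximate 95% prediction interval (low, centre, high) for y at x0."""
    n = len(x)
    slope, intercept = deming_fit(x, y, delta)
    mx = mean(x)
    sxx = sum((xi - mx) ** 2 for xi in x)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(ss_res / (n - 2))  # residual SD about the fitted line
    centre = intercept + slope * x0
    half = t_crit * s * math.sqrt(1.0 + 1.0 / n + (x0 - mx) ** 2 / sxx)
    return centre - half, centre, centre + half


def assess_material(patient_x, patient_y, rm_x, rm_y, t_crit=2.074):
    """Return (commutable, matrix_bias_percent) for one reference material.

    Inputs are concentrations in ng/mL; the fit is done on log10 values.
    """
    lx = [math.log10(v) for v in patient_x]
    ly = [math.log10(v) for v in patient_y]
    low, centre, high = prediction_interval(lx, ly, math.log10(rm_x), t_crit)
    commutable = low <= math.log10(rm_y) <= high
    predicted = 10.0 ** centre  # back-transform the predicted logarithm
    bias_pct = (rm_y - predicted) / predicted * 100.0
    return commutable, bias_pct
```

Calling `assess_material` once per platform pair and per material would give a table of "commutable" marks of the kind summarized in the Results, with `bias_pct` as a rough estimate of matrix-related bias for materials that fall outside the interval.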

Effect of the diluents—distilled water and the provided diluents from individual instrument suppliers—on the accuracy of AFP measurement was assessed using WHO AFP standard 72/225. The certified concentrations of AFP and the detected levels on individual platforms were compared to calculate the percentage deviations and linear regression.
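The percentage deviation described here is a simple calculation, sketched below. The replicate values in the example are invented for illustration and are not study data.

```python
def percent_deviation(measured, certified):
    """Deviation of a measured value from the certified value, in percent."""
    return (measured - certified) / certified * 100.0


def mean_percent_deviation(replicates, certified):
    """Percent deviation of the mean of replicate measurements."""
    return percent_deviation(sum(replicates) / len(replicates), certified)


if __name__ == "__main__":
    # Hypothetical triplicate readings for a level certified at 6.0 ng/mL
    print(round(mean_percent_deviation([6.5, 6.6, 6.7], 6.0), 1))
```

A positive value indicates overestimation relative to the certified level and a negative value indicates underestimation.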

Comparability of the Analytical Results of AFP

The levels of AFP in 24 patient serum samples determined by the 5 different instrument platforms are summarized in Table 1 and detailed in Supplemental Table 1 (see supplemental digital content containing 3 tables at www.archivesofpathology.org in the October 2017 table of contents). The median values of the 24 samples were calculated for individual analytical systems. The results suggested that the measurements of AFP levels by the Cobas platform were generally high, and those by the A2000 platform were low. The average CV in the detected values for all 24 samples on 5 platforms was 12.7% (Supplemental Digital Content Table 2); thus, 2CV was 25.4%, which was higher than a TEa of 21.9%. The TEa was selected based on the biological variation.5  The results suggested significant variations in the detected values of AFP across different platforms (P < .05).

To evaluate the performance of individual instrument platforms, the detected values for the 24 samples were compared between 2 systems, covering all possible combinations, by linear regression analysis. Thus, for each platform, 4 pairwise comparisons were performed and the average Sres, r, b, and a values were calculated (Table 1; Supplemental Digital Content Table 3). The data indicated apparent systematic deviations (2% to 13%) among the 5 platforms, with average residuals ranging from 11.60 to 20.02.

Commutability of Reference Materials

The variation in AFP results among 5 instrument platforms in common international use necessitated screening for reference materials, which ideally should show robust equivalence of the mathematical relationships among results obtained with different measurement procedures. We therefore tested reference materials from BCSB, BCCL, Bio-Rad, and WHO; their commutability scores are detailed in Table 2.

Commutability was evaluated by pairwise comparison, between any 2 measurement procedures, of the measured values of the reference materials with known AFP concentrations; the plots are shown in Figure 1, A through J. The results indicated that the reference materials from BCCL and BCSB were commutable across all 5 instrument platforms, with 10 “commutable” marks out of 10 pairwise comparisons (Table 2; Figure 1, A through J). The Bio-Rad quality control materials were commutable on 4 of the 5 platforms tested (Table 2; Figure 1, A, B, C, E, F, and H). The WHO AFP standard 72/225 diluted with distilled water was commutable at high concentrations but not at the low level (6.0 ng/mL), and acceptable equivalence in its measured values was limited to 3 comparison pairs (Table 2; Figure 1, E, F, and I). The matrix bias ranged from 20% to 66%.

Effects of the Diluents on the Analytical Results of AFP

Poor concordance of the clinical analysis results from different measurement procedures can also be attributed to the matrix differences for the reference materials, the diluents used, and the impact of dilution on the measurand in solution.6  We therefore examined the effects on the measurands caused by dilution using distilled water or diluents provided by the suppliers of individual instrument platforms. The WHO AFP standard 72/225 was used in this analysis. The impacts were assessed by calculating the deviation of the measured values from the certified levels, and the averages of 3 replicates are shown in Table 3. The results indicated obvious deviations in the measured values obtained on all 5 platforms at a certified level of 6.0 ng/mL. Using distilled water as a diluent generally exaggerated the deviation, except for platform A2000, which underestimated AFP concentrations irrespective of the diluent used (Table 3). At higher concentrations of AFP, the Architect system overestimated the AFP level when the samples were diluted with distilled water, but underestimated when its respective diluent was used. Other instruments were not sensitive to the change of diluent (Table 3; Figure 2, A through I).

The diagnosis of some liver diseases, abnormal pregnancy, and certain malignancies relies on accurate measurement of serum AFP level.7  Achieving comparable results regardless of instrument platforms and measurement procedures has been a fundamental goal of laboratory medicine to avoid misdiagnosis.1  The strategy of comparing 2 measurement procedures has been recommended by CLSI guideline EP09,8  and is particularly applicable to a scenario in which a validated measurement procedure is used to evaluate another protocol. However, it is challenging to use this scheme for a measurand for which no higher-order measurement procedure is available. Thus, in this study, we assessed the comparability of AFP measurements on 5 instrument platforms. Although analytical performance specifications based on direct or indirect clinical outcomes are highly recommended for the assessment of between-method comparability, these can be affected by current measurement quality, characteristics of the investigated population, and health care setting.9  Therefore, the desirable TEa value was based on biological variation, which is a more general approach to recommending analytical specifications.5,10  To assess the comparability of analytical results, twice the average CV of the measured values of all samples on the 5 instrument platforms was compared with this TEa.

Although all 5 measurement procedures tested in this study are believed to have calibration traceable to the WHO AFP standard 72/225, the 2CV value (25.4%) was higher than the desirable TEa (21.9%), which was calculated based on a previously used formula10  [$\mathrm{TEa} = 1.65 \times 0.5\,CV_I + 0.25\,(CV_I^2 + CV_G^2)^{0.5}$]. Significant system error was observed in the measured values and poor correlations were found between platforms. The poor comparability is a result not only of calibration, but also of the questionable specificity of the measurement procedures. Antibodies used by different suppliers may recognize different epitopes of AFP, as previously observed for other analytes.11,12 
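The desirable TEa formula, 1.65 times half the within-subject CV plus one quarter of the root sum of squares of the within-subject and between-subject CVs, can be checked numerically. The text does not state the CVI and CVG inputs. The values of 12.2% and 45.6% used below are therefore assumptions, chosen only because they reproduce the reported 21.9%.

```python
import math


def desirable_tea(cv_i, cv_g):
    """Desirable total allowable error (%) from biological variation.

    cv_i: within-subject CV (%); cv_g: between-subject CV (%).
    """
    imprecision = 0.5 * cv_i                       # desirable imprecision
    bias = 0.25 * math.sqrt(cv_i ** 2 + cv_g ** 2)  # desirable bias
    return 1.65 * imprecision + bias


if __name__ == "__main__":
    # Assumed inputs, not taken from the article
    tea = desirable_tea(12.2, 45.6)
    print(round(tea, 1))       # 21.9
    print(2 * 12.7 < tea)      # False: 2CV (25.4%) exceeds the TEa
```

Keeping the imprecision and bias terms separate makes it easy to see that, with these assumed inputs, the bias allowance contributes slightly more to the total than the imprecision allowance does.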

Comparability of the clinical analysis relies on the standardization of the measurement procedures, and traceability of the standard used in calibration is required. Poor commutability of the reference materials may result in noncomparable results across different platforms. The CLSI guideline EP14 recommends the evaluation of a protocol for matrix effect based on a reference procedure.13  In this study, we selected 5 measurement procedures commonly used for AFP analysis and assessed the commutability of reference materials following the CLSI EP30-A protocol.3,13  The results indicate that the reference materials from BCCL and BCSB are commutable across all 5 procedures, representing ideal reference materials for future standardization of AFP measurement and for EQAS. However, the quality control materials from commercial suppliers, which have been frequently used in EQAS, were found to be commutable only between certain measurement procedures (Table 2; Figure 1, A through C, E, F, and H).

Precise determination of low levels of AFP in blood is particularly important for clinical diagnosis and monitoring of therapeutic outcomes.2  The commonly used reference material and the international standard for AFP, WHO 72/225, originates from human cord blood,14  and is prepared at a high concentration (60 500 ng/mL). Minimizing the impact of the dilution on AFP quantification determines the accuracy of calibration of the measurement procedures and the analysis of patient samples. Our data suggest that dilution with distilled water led to poor commutability at a low level (6.0 ng/mL) between different measurement protocols, with a matrix bias of 20% to 66% (Table 2; Figure 1, E, F, and I). The diluents recommended by the suppliers of the individual instrument platforms were not ideal either, because deviations from the certified value were observed. It is, therefore, important in clinical analysis to determine the matrix effect on the commutability of the WHO standard 72/225 in the range of concentrations examined.

In summary, the standardization of AFP analysis is yet to be accomplished, and further studies are required to identify the optimal reference materials that are robustly commutable across different measurement procedures. The frozen mixture of human serum exhibits better commutability than the candidate materials originating from other sources. If the WHO AFP standard 72/225 is used, appropriate diluents are needed to ensure accurate calibration.

1. Miller WG, Jones GR, Horowitz GL, Weykamp C. Proficiency testing/external quality assessment: current challenges and future directions. Clin Chem. 2011;57(12):1670-1680.
2. Houwert AC, Lock MT, Lentjes EG. Alphafetoprotein in the Dutch External Quality Assurance programme: a need for improvement. Ann Clin Biochem. 2012;49(pt 3):273-276.
3. Vesper H, Emons H, Gnezda M, et al. Characterization and Qualification of Commutable Reference Materials for Laboratory Medicine: Approved Guideline. Wayne, PA: CLSI; 2010.
4. Zhang S, Zeng J, Zhang C, et al. Commutability of possible external quality assessment materials for cardiac troponin measurement. PLoS One. 2014;9(7):e102046.
5. Trape J, Franquesa J, Sala M, et al. Determination of biological variation of alpha-fetoprotein and choriogonadotropin (beta chain) in disease-free patients with testicular cancer. Clin Chem Lab Med. 2010;48(12):1799-1801.
6. Vesper HW, Miller WG, Myers GL. Reference materials and commutability. Clin Biochem Rev. 2007;28(4):139-147.
7. Houwert AC, Giltay JC, Lentjes EG, Lock MT. Hereditary persistence of alpha-fetoprotein (HPAFP): review of the literature. Neth J Med. 2010;68(11):354-358.
8. Peng Z, Mao J, Li W, Jiang G, Zhou J, Wang S. Comparison of performances of five capillary blood collection tubes. Int J Lab Hematol. 2015;37(1):56-62.
9. Sandberg S, Fraser CG, Horvath AR, et al. Defining analytical performance specifications: consensus statement from the 1st Strategic Conference of the European Federation of Clinical Chemistry and Laboratory Medicine. Clin Chem Lab Med. 2015;53(6):833-835.
10. Fraser CG, Hyltoft Petersen P, Libeer JC, Ricos C. Proposals for setting generally applicable quality goals solely based on biology. Ann Clin Biochem. 1997;34(pt 1):8-12.
11. Rej R, Drake P. The nature of calibrators in immunoassays: are they commutable with test samples?: must they be? Scand J Clin Lab Invest Suppl. 1991;205:47-54.
12. Panteghini M. Selection of antibodies and epitopes for cardiac troponin immunoassays: should we revise our evidence-based beliefs? Clin Chem. 2005;51(5):803-804.
13. Korzun WJ, Nilsson G, Bachmann LM, et al. Difference in bias approach for commutability assessment: application to frozen pools of human serum measured by 8 direct methods for HDL and LDL cholesterol. Clin Chem. 2015;61(8):1107-1113.
14. Griffiths BW, Pringle DN, Godard A. Development of a provisional reference human cord serum standard for alpha-fetoprotein determination. Clin Biochem. 1980;13(6):279-284.

Author notes

From the Department of Clinical Laboratory, Beijing Center for Clinical Laboratories, Beijing Chaoyang Hospital, Beijing, China (Drs Yue, Zhang, and Wang and Ms Chen); and the Department of Clinical Laboratory, Beijing Chaoyang Hospital Jing-Xi Campus, Capital Medical University, Beijing, China (Dr Xu). Drs Yue and Zhang contributed equally to this manuscript.

Supplemental digital content is available for this article at www.archivesofpathology.org in the October 2017 table of contents.

The authors have no relevant financial interest in the products or companies described in this article.

This study was supported by the National High Technology Research and Development Program of China (2011AA02A116) and the National Clinical Key Specialty Construction Project.
