Misdiagnosis of antiphospholipid syndrome can occur owing to the wide diversity of antiphospholipid (aPL) assays and a lack of international calibrators and harmonized reference intervals.
This study assessed laboratory practices for reporting results and establishing reference intervals for immunoglobulin (Ig) G/IgM anti-cardiolipin (aCL) and anti–beta-2 glycoprotein I (anti-β2GPI) assays.
Supplemental questions related to reporting and establishing reference ranges for aPL assays were sent as part of the Antiphospholipid Antibody (ACL)-B 2019 College of American Pathologists (CAP) proficiency testing survey. The response rate and methods assessment details were determined, as well as qualitative and quantitative results for 3 test samples.
The number of participants reporting results varied by antibody type: IgG aCL (n = 489), IgM aCL (n = 476), IgG anti-β2GPI (n = 354), and IgM anti-β2GPI (n = 331). The enzyme-linked immunosorbent assay (ELISA) was the most used method (up to 58.6%, 260 of 444); others included multiplex (18.9%–23.9%), fluorescence enzyme immunoassay (14.4%–17.6%), and chemiluminescence immunoassay (6.5%–9.0%). More respondents reported quantitative than qualitative results, and manufacturer cutoff ranges were used by 92.9% and 94.2% of respondents for aCL and anti-β2GPI, respectively. Despite variation in the use of semiquantitative ranges, qualitative negative/positive reporting of the test samples achieved almost 100% consensus, in contrast to the wide range of quantitative results obtained for each analyte across different kits.
ELISA remains the most used method for detecting aPL antibodies, with most laboratories reporting quantitative results based on manufacturers' suggested reference ranges. The categorization of quantitative results as equivocal, weak positive, or positive for responders using kits from the same manufacturer was variable.
The 2006 revised Sapporo laboratory criteria for antiphospholipid syndrome (APS) recommend testing for specific antiphospholipid (aPL) antibodies, namely, lupus anticoagulant as well as immunoglobulin (Ig) G and IgM antibodies to cardiolipin (aCL) and beta-2 glycoprotein I (anti-β2GPI).1 This revised guidance was a significant step in improving the diagnosis and management of patients with APS. Compared with the original Sapporo laboratory classification criteria published in 1999, the revised document added anti-β2GPI IgG and IgM tests, recommended the use of medium- or high-titer aPL antibodies for APS diagnosis, and changed the minimum interval to document antibody persistence from 6 to 12 weeks.1,2 Other improvements for aPL interpretation included categorizing patients with APS according to the presence or absence of other acquired or inherited thrombotic risk factors and subclassifying them according to the extent of aPL positivity, since evidence suggests that multiple aPL positivity is associated with a more severe course of the disease.3–5
The committee also attempted to clarify the definition of medium and high antibody titers and introduced a statement that the threshold for medium antibody titers should be above 40 IgG phospholipid units (GPL) or IgM phospholipid units (MPL) (for aCL assays), or greater than the 99th percentile of the reference population (for aCL and anti-β2GPI assays), based on available data of their clinical associations.6,7 However, there were several practical difficulties related to the use of these thresholds by researchers and clinicians. At that time, there were no available international reference reagents to calibrate anti-β2GPI assays, although recent efforts in this regard have been made.8,9 Also, the suggested aCL threshold of 40 GPL or MPL units often differs from the 99th percentile value derived from the reference population.10–12 This is further complicated by high interassay variability, particularly among manual and automated solid-phase assays, which often results in dramatic differences in threshold calculations from assay to assay.13,14
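To illustrate how a percentile-based threshold can diverge from the fixed 40 GPL threshold, the following minimal Python sketch derives a 99th percentile from a reference population. The reference values are invented for illustration, and the interpolation method used for the percentile is an assumption; the criteria discussed above do not prescribe one.

```python
import statistics

FIXED_THRESHOLD_GPL = 40  # from the text: medium titer defined as above 40 GPL


def percentile_99(reference_values):
    """Nonparametric 99th percentile; the interpolation method is an assumption."""
    return statistics.quantiles(reference_values, n=100, method="inclusive")[98]


# Invented IgG aCL results (GPL) for a hypothetical reference population of 200 donors.
reference = [2.0 + 0.1 * i for i in range(198)] + [30.0, 55.0]
cutoff = percentile_99(reference)
print(f"99th percentile: {cutoff:.1f} GPL vs fixed threshold: {FIXED_THRESHOLD_GPL} GPL")
```

In this invented population the percentile-derived cutoff falls well below 40 GPL, so a result between the two values would be graded differently depending on which threshold a laboratory adopts.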
The diversity of methods used by kit manufacturers and laboratories in establishing reference intervals and the impact of these factors on patient diagnosis and quality assessment programs remain poorly defined. Clinical laboratories performing aCL and anti-β2GPI solid-phase immunoassays in the United States and other countries subscribe to the College of American Pathologists (CAP) Antiphospholipid Antibody (ACL) proficiency testing survey that evaluates their performance with regard to the testing quality of patient specimens. The aims of this study by the CAP Diagnostic Immunology and Flow Cytometry Committee (DIFCC) were to identify the diversity of immunologic methods used to detect aCL and anti-β2GPI IgG and IgM antibodies and to determine how participating clinical laboratories establish reference intervals and report results for aPL antibodies.
MATERIALS AND METHODS
CAP External Quality Assessment—Questionnaire Development
As part of the quality assessment of antiphospholipid testing, the CAP oversees external quality assessments (EQAs) on a semiannual basis. This process involves sending participating laboratories a total of 6 well-characterized serum samples from patients with APS or normal controls each year: 3 samples in February (survey A) and the remaining 3 in October (survey B). Testing for the criteria analytes (IgG aCL, IgM aCL, IgG anti-β2GPI, and IgM anti-β2GPI) and the noncriteria analytes (IgA aCL; IgA anti-β2GPI; IgG, IgM, and IgA anti-phosphatidylserine; and IgG, IgM, and IgA anti-phosphatidylserine/prothrombin) is available as part of the evaluation process for each survey. This peer-group assessment requires each participating laboratory to test these EQA samples in a similar fashion to samples obtained as a part of normal clinical testing. Participants report individual analytes tested and their quantitative and/or qualitative results as well as reagents and/or instrumentation used. This EQA allows for the comparison of results of any analyte across laboratories, using the same or different reagents or instrumentation, and provides a means to evaluate long-term performance.
As an adjunct to the CAP 2019 ACL-B survey, a short questionnaire was developed by members of the CAP DIFCC. Participants were instructed to complete the questionnaire and return it along with the normal report of quantitative and/or qualitative results of analyte testing. The questions addressed the immunologic methods used for detecting and/or quantifying aCL and anti-β2GPI IgG and IgM antibodies, establishing reference ranges, and reporting results for these tests in the evaluation of APS (Supplemental Table 1; see supplemental digital content at https://meridian.allenpress.com/aplm in the June 2024 table of contents). The standard reporting units used for IgG/IgM aCL were the GPL/MPL units, which are defined as the binding activity of 1 μg/mL of affinity-purified IgG or IgM aCL antibody, respectively. There is no standard unit for IgG/IgM anti-β2GPI, so a general term, arbitrary unit (AU), was used for all anti-β2GPI assay reporting.
Qualitative and Quantitative Analysis
The CAP 2019 ACL-B survey included 3 well-characterized specimens (ACL-04, ACL-05, and ACL-06) that were evaluated to determine the impact that variation of the qualitative interpretation of quantitative results had on proficiency survey consensus. The reporting mechanism through the CAP website, at that time, allowed for quantitative reporting and grading of qualitative interpretations into several broad groups: negative, indeterminate, and positive. Therefore, grading of quantitative and qualitative results was reported independently and was not automatically linked. While qualitative reporting as negative, indeterminate/equivocal, or positive was standard, the characterization of the “weak positive” grade was included for the supplemental survey. Total participant consensus for any given analyte for a particular survey specimen (ie, 100% agreement) was defined as all laboratories characterizing that specimen with the same qualitative grade. For a given sample, the qualitative interpretation reported by 80% or more of participants was regarded as acceptable to define consensus among participating laboratories. Median values were calculated for all methods, while mean, SD, and coefficient of variation were analyzed by manufacturer peer group to present the heterogeneity in the distribution of results. All descriptive statistics were calculated after applying a nonparametric outlier screening procedure based on the interquartile range (IQR) of the distribution, with outlying values classified as more extreme than 1.5 times the IQR. The quantitative analysis did not include results for peer groups containing fewer than 10 laboratories or those that reported more than 50% of results as greater or lesser than the analytical measuring range. All figures were generated in Microsoft Excel, and all statistical analyses were performed with SAS 9.4 (SAS Institute, Cary, North Carolina).
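The outlier screening and peer-group statistics described above can be sketched in Python as follows. This is a minimal illustration, not the SAS procedure used in the study: it assumes Tukey-style fences (values more than 1.5 times the IQR below the first or above the third quartile are dropped) and an "inclusive" quartile method, neither of which is specified in the text. The function names and the example values are hypothetical, and the exclusion of peer groups with more than 50% of results outside the analytical measuring range is not modeled.

```python
import statistics

MIN_PEER_GROUP_SIZE = 10   # from the text: peer groups with fewer than 10 labs are excluded
IQR_MULTIPLIER = 1.5       # from the text: 1.5 times the IQR


def iqr_screen(values):
    """Drop values beyond 1.5 x IQR from the quartiles (assumed Tukey-style fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - IQR_MULTIPLIER * spread, q3 + IQR_MULTIPLIER * spread
    return [v for v in values if low <= v <= high]


def peer_group_summary(values):
    """Return n, mean, SD, and CV (%) after outlier screening, or None for small groups."""
    if len(values) < MIN_PEER_GROUP_SIZE:
        return None
    kept = iqr_screen(values)
    mean = statistics.mean(kept)
    sd = statistics.stdev(kept)
    return {"n": len(kept), "mean": mean, "sd": sd, "cv_percent": 100 * sd / mean}


# Invented results for one hypothetical kit; 400.0 stands in for an outlying report.
example = [58.0, 60.0, 61.0, 62.0, 63.0, 64.0, 65.0, 66.0, 68.0, 70.0, 400.0]
summary = peer_group_summary(example)
print(summary)
```

Because the coefficient of variation divides the SD by the mean, the same absolute scatter yields a larger CV when the mean is close to zero, which is worth keeping in mind when reading CVs for negative samples.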
RESULTS
Survey Participants, Response Rates, Analytes, and Methods Reported
The number of participants in the CAP 2019 ACL-B quality assessment for the criteria aPL antibodies varied by analyte and isotype (Table 1). Overall, more responders participated in the proficiency test survey for aCL IgG and IgM antibodies (aCL IgG: n = 489, aCL IgM: n = 476) than for anti-β2GPI autoantibodies (anti-β2GPI IgG: n = 354 and anti-β2GPI IgM: n = 331). Among all proficiency test survey participants, enzyme-linked immunosorbent assay (ELISA) was most commonly used across all analytes, with up to 58.6% (260 of 444) of participants reporting ELISA for aCL IgM; other methods included the automated Multiplex (range, 18.9%–23.9%), fluorescence enzyme immunoassay (14.4%–17.6%), chemiluminescence assay (CIA; 6.5%–9.0%), and all other methods (1.5%–2.8%). The response rates of the supplemental survey varied for the different analytes, ranging from 79.8% to 82.2% (aCL IgG: 80.2% [392 of 489]; aCL IgM: 79.8% [380 of 476]; anti-β2GPI IgG: 81.4% [288 of 354]; anti-β2GPI IgM: 82.2% [272 of 331]).
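The response rates above follow directly from the reported counts. The short Python check below recomputes them from the respondent and participant numbers given in the text; only the variable names are introduced here.

```python
# Counts taken from the text: (supplemental-survey respondents, proficiency-test participants).
counts = {
    "aCL IgG": (392, 489),
    "aCL IgM": (380, 476),
    "anti-β2GPI IgG": (288, 354),
    "anti-β2GPI IgM": (272, 331),
}

# Response rate as a percentage, rounded to one decimal place as in the text.
response_rates = {
    analyte: round(100 * responded / participants, 1)
    for analyte, (responded, participants) in counts.items()
}

for analyte, rate in response_rates.items():
    print(f"{analyte}: {rate}%")
```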
Qualitative and Quantitative Results of Test Samples
With respect to the samples evaluated in the CAP 2019 ACL-B survey, ACL-04 and ACL-05 were obtained from 2 male patients with catastrophic APS with moderate to high IgG aCL and anti-β2GPI levels, while ACL-06 was a pooled sample from normal human serum that was negative for all antibodies. Qualitative consensus based on the reporting options available at the time of the CAP 2019 ACL-B survey was met, and agreement was extremely high for all survey samples. For IgG aCL, the level of agreement for samples ACL-04, ACL-05, and ACL-06 was 97.8%, 99.1%, and 99.4%, respectively. The levels of agreement for samples ACL-04, ACL-05, and ACL-06 were similarly high for IgM aCL (100.0%, 100.0%, 99.0%, respectively), IgG anti-β2GPI (99.2%, 100.0%, 99.2%, respectively), and IgM anti-β2GPI (98.2%, 98.7%, 100.0%, respectively). Among laboratories using the same commercial kit method, agreement was also quite high.
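The agreement percentages reported here rest on the consensus rule given in the methods: the qualitative grade reported by 80% or more of participants defines consensus. A minimal Python sketch of that rule follows; the function name and the example grades are hypothetical and do not reproduce the survey data.

```python
from collections import Counter

CONSENSUS_THRESHOLD = 0.80  # from the text: 80% or more of participants


def qualitative_consensus(grades):
    """Return (modal grade, share of labs reporting it, whether consensus is met)."""
    modal_grade, count = Counter(grades).most_common(1)[0]
    share = count / len(grades)
    return modal_grade, share, share >= CONSENSUS_THRESHOLD


# Invented grades for one hypothetical specimen: 97 positive, 2 indeterminate, 1 negative.
grades = ["positive"] * 97 + ["indeterminate"] * 2 + ["negative"]
grade, share, met = qualitative_consensus(grades)
print(f"{grade}: {share:.1%} agreement, consensus met: {met}")
```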
In contrast, quantitative values across all testing methods were widely variable for IgG aCL and anti-β2GPI. For IgG aCL, the range of median values for ACL-04, ACL-05, and ACL-06 obtained from different kits was 60.6 to 529.1, 80.3 to 1251.4, and 1.0 to 10.1, respectively. For IgG anti-β2GPI, the range of median values was 39.8 to 2086.5, 100.8 to 5014.4, and 0.8 to 2.5, respectively. All samples were expected to be negative for IgM aCL (0.5–2.8, 0.7–5.7, and 0.6–5.2, respectively) and IgM anti-β2GPI (1.2–2.8, 1.6–3.3, and 0.4–2.0, respectively), and the range of median values for these analytes was accordingly less varied across kits. Mean values obtained by kit K11, using CIA methodology, were invariably higher than values reported by other methods (Table 2). Agreement of quantitative measurements among participating laboratories using the same assay was generally good, with coefficient of variation values below 20% for several kits, with the exception of assay K16 (26.5% and 28.7% for ACL-04 IgG aCL and IgG anti-β2GPI, respectively). Coefficient of variation values were highest for negative samples, as one would expect given their low numerical values.
Positive and Negative Cutoff Reporting of aPL Antibodies
Of the 392 respondents, 364 (92.9%) reported using the manufacturer's recommended reference intervals for the aCL antibody assays; the remaining 28 participants (7.1%) established ranges that differed from those of the kit used. For the anti-β2GPI kits, a slightly higher proportion of respondents (94.2%, 276 of 293) used manufacturers' suggested reference ranges, compared to 5.8% (17 of 293) who used in-house–established cutoff values. Across all kits, the numerical range of the negative cutoffs for IgG aCL (10–23 GPL), IgM aCL (7–20 MPL), and IgG anti-β2GPI (5–25 AU) was narrower than that of the corresponding positive cutoffs (IgG aCL, 10–80 GPL; IgM aCL, 7–80 MPL; IgG anti-β2GPI, 8–40 AU), whereas for IgM anti-β2GPI the negative and positive cutoff ranges were similar (4–20 AU for both) (Figure, A through D).
Concordance of the most frequently reported (modal) cutoff values varied among kit manufacturers and across analytes (Table 3). While concordance could be as high as 100% for kit-analyte pairs, for the negative cutoff values, the lowest concordance was 71.4% for IgG aCL, 40.9% for IgM aCL, 60.0% for IgG anti-β2GPI, and 60.0% for IgM anti-β2GPI, depending on the assay. The lowest concordance values for positive cutoff values of these different analytes were 54.5%, 50.0%, 60.0%, and 55.6%, respectively (Figure, A through D). When comparing the distribution of positive and negative cutoffs among laboratories using the same kit, there was overall very little variation from laboratory to laboratory. The IQR value of negative cutoffs for all analytes was almost always zero, and the same was also true for positive cutoffs for all IgG and IgM anti-β2GPI and some IgG and IgM aCL assays. However, among laboratories testing for IgG and IgM aCL using kits K12 and K19, the first and third quartiles of positive cutoff values were 20 to 80 GPL and 30 to 80 MPL, respectively, indicating a wide spread of implemented positive cutoffs for these 2 assays (Table 3).
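The two summaries used in this section, concordance with the modal cutoff and the quartiles of reported cutoffs, can be sketched in Python as follows. The function names and cutoff lists are hypothetical and invented for illustration; the quartile method is an assumption, since the text does not state how percentiles were computed.

```python
import statistics
from collections import Counter


def modal_concordance(cutoffs):
    """Return the modal cutoff and the percentage of labs reporting exactly that value."""
    modal_value, count = Counter(cutoffs).most_common(1)[0]
    return modal_value, 100 * count / len(cutoffs)


def cutoff_quartiles(cutoffs):
    """Return (Q1, Q3, IQR) of reported cutoffs; the quantile method is an assumption."""
    q1, _, q3 = statistics.quantiles(cutoffs, n=4, method="inclusive")
    return q1, q3, q3 - q1


# Invented positive cutoffs (GPL) for two hypothetical kits.
tight_kit = [20] * 19 + [15]                  # nearly every lab uses the same cutoff
spread_kit = [20] * 6 + [40] * 3 + [80] * 11  # labs split between low and high cutoffs

for name, cutoffs in (("tight", tight_kit), ("spread", spread_kit)):
    print(name, modal_concordance(cutoffs), cutoff_quartiles(cutoffs))
```

An IQR of zero means that at least the middle half of laboratories share one cutoff, whereas a wide IQR, as in the second invented kit, signals that laboratories using the same kit have implemented materially different positive cutoffs.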
Borderline and Low/Weak Positive Cutoff Reporting of aPL
Most respondents reported only a simple negative to positive cutoff; however, for IgG aCL, 35.2% (138 of 392) also reported equivocal range cutoffs and 34.2% (134 of 392) reported weak/low-positive cutoffs (Table 4). This was similar for IgM aCL (35.0% [133 of 380] and 33.2% [126 of 380], respectively) but much lower for IgG anti-β2GPI (21.9% [63 of 288] and 4.9% [14 of 288], respectively) and IgM anti-β2GPI (20.2% [55 of 272] and 4.8% [13 of 272], respectively). Most respondents used the manufacturer's recommendation for the equivocal and weak or low-positive cutoffs. For IgG aCL, only 8.0% (11 of 138) and 11.2% (15 of 134) of respondent laboratories used laboratory-established equivocal and weak positive ranges, respectively. Laboratory-established equivocal and weak positive ranges were used at similar rates for the other analytes: IgM aCL (7.5% [10 of 133] and 11.1% [14 of 126], respectively), IgG anti-β2GPI (4.8% [3 of 63] and 21.4% [3 of 14], respectively), and IgM anti-β2GPI (5.5% [3 of 55] and 15.4% [2 of 13], respectively). The manufacturer kits for which some laboratories used laboratory-established ranges included K3, K4, K8, K11, and K12, while for other assays, respondents used only manufacturer-recommended equivocal and weak positive ranges.
The agreement of participant-reported ranges with the most frequently reported (modal) equivocal and weak positive reference ranges varied according to manufacturer kit. Agreement with the modal reference range was highest for equivocal IgG and IgM anti-β2GPI cutoffs (97.0%–100.0% agreement), compared to lower agreements for equivocal IgG and IgM aCL cutoffs (40.0%–100.0%), IgG and IgM aCL weak positive cutoffs (33.3%–100.0%), and IgG and IgM anti-β2GPI weak positive cutoffs (50.0%). When agreement was calculated to include laboratories that reported values within 1 unit of the lower and/or upper limit of the modal reference range, the level of agreement increased for many kits.
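The effect of allowing a 1-unit tolerance around the modal reference range can be sketched in Python as follows. The function name and the example ranges are hypothetical and invented for illustration. The text refers to the "lower and/or upper limit"; requiring both limits to fall within the tolerance is one assumed reading.

```python
from collections import Counter


def modal_range_agreement(ranges, tolerance=0):
    """Percentage of labs whose (lower, upper) range matches the modal range.

    With tolerance=1, a range counts as agreeing when both limits are within
    1 unit of the modal limits (an assumed reading of the text).
    """
    (modal_low, modal_high), _ = Counter(ranges).most_common(1)[0]
    agreeing = sum(
        abs(low - modal_low) <= tolerance and abs(high - modal_high) <= tolerance
        for low, high in ranges
    )
    return 100 * agreeing / len(ranges)


# Invented equivocal ranges (lower, upper) for one hypothetical kit.
ranges = [(10, 15)] * 4 + [(10, 14)] * 3 + [(11, 15)] * 2 + [(12, 20)]
exact = modal_range_agreement(ranges)
within_one = modal_range_agreement(ranges, tolerance=1)
print(f"exact agreement: {exact}%, within 1 unit: {within_one}%")
```

In this invented example most disagreement with the modal range comes from limits that differ by a single unit, so the tolerance raises agreement sharply; a laboratory with a genuinely different range still counts as discordant.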
DISCUSSION
From the evaluation of supplemental questions on the CAP ACL-B 2019 survey related to aPL testing and the establishment or validation of reference ranges, we found that ELISA remains the most common method for quantifying criteria aPL antibodies and that validation of the manufacturer's suggested reference ranges remains the most common approach to qualitative interpretation and reporting. Based on practices reported in the survey, qualitative categorization of results as equivocal or weak positive among participants using kits from the same manufacturer was variable, and this variability was more pronounced for aCL assays than for anti-β2GPI assays. Despite variation in the use of different levels of positivity to characterize results, the qualitative reporting of the test samples included in the survey achieved almost 100% consensus in all cases. Notably, this very high qualitative consensus was in contrast to the wide range of quantitative results obtained for each analyte across different kits.
Most respondents in the survey used a simple positive and negative cutoff to report results for these criteria analytes, but several laboratories used equivocal and weak positive ranges as well. It is important to realize that the main purpose of the original and revised classification criteria was to create a common ground for conducting clinical research, exchanging and comparing results, and analyzing data originating from different cohorts, and not primarily for use as diagnostic criteria.15,16 Therefore, in reality, the diagnosis of APS may still be made by the attending physician for patients who do not fulfill the requirement of 1 laboratory and 1 clinical classification criterion. Further, recent guidelines have highlighted the importance of higher levels of antibodies and double or triple aPL positivity for determining risk for severe outcomes in patients with APS, indicating that a more nuanced characterization of positivity is important for disease management.17
The definition of medium-positive antibody titers depends on the performance characteristics of the particular assay, the calculation method used to determine results, and the reference population that is being tested.1,13 The committee overseeing the revised classification criteria mentioned the lack of suitable evidence and specifically commented that these values are to be used “until an international consensus is reached.” The publication also mentions that the measurement of aCL and anti-β2GPI antibodies should be performed by “standardized ELISA” tests. In reality, the standardization of these assays is still far from complete. Further, while newly developed immunoassays show a comparable diagnostic outcome for identifying patients with APS, their observed differences in analytical measuring ranges and improved performance characteristics for individual analytes can also contribute to standardization issues.14,18–20
Our study shows that although most laboratories indicated in the survey that they report negative and positive cutoffs using the manufacturer's ranges, the level of agreement with the modal reference range in several kit/analyte pairs was quite low. This was also true for equivocal and weak positive cutoffs; however, when we calculated agreement to include values within a single unit of the upper and lower limits of these ranges, the level of agreement improved markedly. This indicates that different laboratories using the same kit might have different reference ranges for negative, equivocal, weak positive, and strong positive, but there is significant overlap from laboratory to laboratory. When we looked at the percentile distribution of positive and negative cutoffs, we noted very low IQR values, indicating a lack of variation in the middle of the distribution (25th to 75th percentile) for most kits. Kits K12 and K19 were notable exceptions; K19 had a relatively small number of participants, whereas K12 was used by more than 100 participants. Various studies have highlighted the impact that population differences have when calculating reference ranges by percentile and how these values often differ from the manufacturer-recommended reference ranges, even for US Food and Drug Administration–cleared kits.21–23 While reference range validation within a test population is recommended to facilitate appropriate diagnostic assessment by end-user laboratories, this practice necessarily results in a wide array of reference ranges being used for the same assay across laboratories.13 As a result, substantial confusion may be created for clinicians on a day-to-day basis and for researchers conducting clinical trials in patients with APS.
While this quality survey included only 3 samples and so could not be broadly representative, it is interesting to speculate how the divergent approach to the calculation and implementation of negative, equivocal, low-positive, and high-positive cutoffs may affect qualitative reporting of quantitative results in the clinical setting. A further analysis of CAP surveys over time could help us to answer this question, simulating grading using the reported ranges for the various cutoffs across kits and analytes to examine qualitative agreement. External quality control programs have reported considerable variation in qualitative interpretations for identical numerical values across all respondents, but this was without consideration of assay type and method for calculating cutoffs.24 Recent reports have highlighted the vast numerical difference obtained when trying to harmonize semiquantitative reference ranges among various aPL assays, using a clinical approach versus using regression analysis of standard material or population data.25,26 The contribution of the reporting method to interassay variability is not often considered when compared to other factors affecting harmonization of aPL assays such as manufacturer, methodology (eg, ELISA, multiplex, CIA, fluorescence enzyme immunoassay), calibration methods, and source of kit reagents, among others.15,27 It seems clear that consensus for establishing and reporting results for aPL antibodies for optimal evaluation remains a challenge.
CONCLUSIONS
Given that the revised 2006 Sapporo consensus statement for APS is widely used by clinicians in the evaluation of patients,1 it is important that laboratories integrate testing and reporting of aCL and anti-β2GPI IgG and IgM antibody tests with the lupus anticoagulant test for best practice. Thus, the laboratory interpretative comments should reflect the analytes in the testing panel, reference ranges, units, recommended follow-up to document persistence, clinical significance, and limitations.
Author notes
Supplemental digital content is available for this article at https://meridian.allenpress.com/aplm in the June 2024 table of contents.
Competing Interests
The authors have no relevant financial interest in the products or companies described in this article.