Total serum bilirubin (TSB) analysis is pivotal for diagnosing neonatal hyperbilirubinemia. Because of a routine change in laboratory equipment, our TSB assay changed from a diazo to a vanadate oxidase method. Upon implementation, TSB results were substantially higher in newborns than expected based on the validation.
The objective was to investigate the clinical application of TSB and intermethod differences in neonates, and their impact on phototherapy treatment.
The diazo and vanadate methods were compared directly using neonatal and adult samples. Anonymized external quality control data were analyzed to explore interlaboratory differences among 8 commercial TSB assays. Clinical patient data were extracted from the medical records to investigate the number of newborns receiving phototherapy.
The mean bias of the vanadate versus the diazo TSB method was +17.4% and +3.7% in neonatal and adult samples, respectively. External quality control data showed that the bias of commercial TSB methods compared with the reference method varied from −3.6% to +20.2%. Within-method variation ranged from 5.2% to 16.0%. After implementation of the vanadate TSB method, the number of neonates treated with phototherapy increased approximately threefold.
Currently available TSB assays lack harmonization for the diagnosis of neonatal hyperbilirubinemia. Between-methods differences are substantially larger in neonatal than in adult samples, highlighting the importance of including neonatal samples during assay validation. Close collaboration between laboratory specialists and clinicians is essential to prevent overtreatment or undertreatment upon the implementation of novel analyzers or assays. Also, harmonization of TSB assays, with an emphasis on neonatal application, is warranted.
Neonatal jaundice is a usually benign condition with an estimated incidence of more than 60% in full-term newborns.1 It is caused by an increase of unconjugated bilirubin, which is largely formed upon the degradation of erythrocytes. As unconjugated bilirubin is neurotoxic, excess levels require prompt intervention to avoid permanent neurologic damage or kernicterus. Treatment is required in approximately 2% of jaundiced neonates and includes (intensive) phototherapy and in severe cases also exchange transfusion.2 Risk factors for developing severe neonatal hyperbilirubinemia (SNH) include a younger gestational age, ethnicity, and blood group antagonisms.
The diagnosis of hyperbilirubinemia relies on total serum bilirubin (TSB) analysis.2 Most clinical guidelines use nomograms with medical decision points for phototherapy and exchange transfusion based on TSB, gestational age, postnatal age in hours, and the risk category of SNH. Although these nomograms were first derived by Bhutani et al3 in 1999, the TSB cutoff points had already been published4 in 1994. In 2009, Wennberg et al5 reviewed the TSB thresholds and concluded that these values were based primarily on expert consensus with little evidentiary support, resulting in TSB limits with high sensitivity but very poor specificity for bilirubin encephalopathy.
The revised American Academy of Pediatrics guideline6 on the management of hyperbilirubinemia in newborns born at 35 or more weeks gestational age was published in September 2022. Compared with the previous version dating from 2004, TSB thresholds were updated and were increased by 1 or 2 mg/dL (to convert to μmol/L, multiply by 17.1) depending on gestational age and presence of risk factors for SNH. This increase is based on an altered risk assessment for developing SNH versus TSB levels, and not on the analytical performance of TSB assays. Although the 2022 guideline states that TSB thresholds are based on expert opinion rather than strong evidence, in our experience clinicians strongly adhere to the published TSB thresholds in the decision to initiate treatment.
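The unit conversion quoted above can be sketched in a few lines. This is only an illustration of the arithmetic; the function names are hypothetical, and the factor 17.1 is the one given in the text.

```python
# Conversion factor quoted in the text: mg/dL x 17.1 = umol/L.
MGDL_TO_UMOLL = 17.1


def mgdl_to_umoll(tsb_mgdl: float) -> float:
    """Convert a bilirubin concentration from mg/dL to umol/L."""
    return tsb_mgdl * MGDL_TO_UMOLL


def umoll_to_mgdl(tsb_umoll: float) -> float:
    """Convert a bilirubin concentration from umol/L to mg/dL."""
    return tsb_umoll / MGDL_TO_UMOLL


# The 1 or 2 mg/dL threshold increases mentioned above, expressed in umol/L.
for delta_mgdl in (1.0, 2.0):
    print(f"{delta_mgdl:.0f} mg/dL = {mgdl_to_umoll(delta_mgdl):.1f} umol/L")
```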
Despite this recent update in TSB thresholds, it is still unknown how modern, automated total bilirubin assays, which have undergone many improvements and (re)calibrations, relate to the diazo-based assay that was used to derive the original TSB decision points. Previous studies showed that the recalibration of bilirubin assays by Roche and Ortho resulted in a reduction in the percentage of newborns with clinically relevant hyperbilirubinemia.7,8 However, it is unknown whether this led to undertreatment, or whether there was overtreatment prior to the recalibration.
Another confounding factor in the TSB analysis is variation among different analytical platforms and methodologies. Most clinical laboratories use automated methods based on the diazo (Jendrassik-Grof) or vanadate oxidase chemical reactions, or measure TSB using whole blood co-oximetry on blood gas analyzers.9,10 Despite the existence of standardized reference materials and a reference measurement procedure,11 variation among analytical methods remains.9,12 TSB measurement using high-pressure liquid chromatography (HPLC) is often considered the gold standard for TSB determination because it allows separate quantitation of different bilirubin forms. However, its accuracy and precision are too limited to consider HPLC a true reference method.13 Furthermore, the HPLC method is laborious and unavailable for most laboratories.14 How modern TSB assays relate to the HPLC gold standard is therefore unknown.
Here we describe the impact of a routine change in laboratory equipment on the diagnosis of neonatal hyperbilirubinemia. In February 2019, the Roche Modular chemistry analyzers were replaced by Siemens Atellica Solution systems, and for TSB, this meant a switch from a diazo- to a vanadate oxidase–based assay. As required per ISO 15189:2012 accreditation,15 assay verification was performed that included analysis of imprecision and method comparison. For the latter, it is common practice to use anonymized excess material from adult patients, as the volume of neonatal samples is often limited. On average, a positive bias of 6.2% was found for Siemens Atellica TSB compared with Roche Modular (data not shown). This was accepted, as it was deemed appropriate considering the known analytical differences and was also within the desirable analytical performance specification based on the 20% within-subject biological variation for adults.16,17
After implementation of the Siemens Atellica in clinical routine, neonatologists noticed higher TSB levels in their patients. Based on information provided by the manufacturer, these differences were ascribed to known assay differences, which might be exaggerated during phototherapy because of different reactivity of bilirubin photoisomers, and were not further investigated at first. However, over time, several cases were presented in which Siemens Atellica neonatal TSB levels were above treatment thresholds for phototherapy or exchange transfusion, whereas they were below the cutoff in another hospital using the Roche method. As treatment had not yet started in these patients, this could not be attributed to phototherapy. Also, neonatologists occasionally experienced a shortage of phototherapy lamps. Altogether, this suggested that methodologic differences for TSB in neonates were larger than expected based on adult samples and prompted further investigation.
A comparison of neonatal samples was performed between the Roche diazo and Siemens vanadate oxidase TSB methods. General differences among 8 commercial TSB assays were further explored based on the results of an external quality assessment (EQA) program with bilirubin levels in the neonatal range. Furthermore, clinical data were obtained from patient records to investigate the effect of the change in TSB method on the number of newborns receiving phototherapy for SNH.
MATERIALS AND METHODS
Method Comparison
TSB was compared in 72 neonatal and 33 adult Li-heparin plasma samples using the diazo method on Roche Modular or Cobas analyzers (Roche Diagnostics, Basel, Switzerland) and the vanadate oxidation method on the Siemens Atellica analyzer (Siemens Healthineers, Erlangen, Germany). Only excess material of preexisting anonymized and nonhemolyzed samples was used. Patient informed consent was therefore not required. Samples were compared either directly (ie, on the day of collection) or after freezing. For the Roche assay, the instructions for use (IFU) stated that samples were stable at −25°C. As this was not specified for the Siemens assay, a freeze-thawing experiment was performed using 7 neonatal samples. Three freeze-thawing cycles, each performed 1 week apart, did not influence TSB results (data not shown). Data were analyzed using Deming regression in EP Evaluator 12 (Data Innovations Inc, Colchester, Vermont).
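For readers unfamiliar with Deming regression, the following is a minimal sketch of the textbook closed-form estimator. It is not the EP Evaluator implementation; the function name, the default error-variance ratio of 1.0, and the paired values are assumptions made purely for illustration.

```python
from math import sqrt
from statistics import mean


def deming_regression(x, y, delta=1.0):
    """Return (slope, intercept) of a Deming fit of y on x.

    delta is the assumed ratio of the error variance in y to the error
    variance in x; delta = 1.0 corresponds to orthogonal regression.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    s_yy = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    if s_xy == 0:
        raise ValueError("x and y are uncorrelated; slope is undefined")
    diff = s_yy - delta * s_xx
    slope = (diff + sqrt(diff ** 2 + 4 * delta * s_xy ** 2)) / (2 * s_xy)
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical paired TSB results in mg/dL (not data from this study).
diazo = [5.1, 8.4, 11.9, 14.8, 18.2]
vanadate = [6.0, 9.9, 13.8, 17.5, 21.1]
slope, intercept = deming_regression(diazo, vanadate)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Unlike ordinary least squares, this estimator allows for measurement error in both methods, which is why it is commonly preferred for method comparison.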
In the Siemens method, bilirubin is oxidized by vanadate at pH 2.9 to produce biliverdin. In the presence of a detergent, both conjugated and unconjugated bilirubin are oxidized. The bilirubin concentration is directly related to a decrease in optical density of the yellow color, which is measured at 451/545 nm. In the Roche method, conjugated and unconjugated bilirubin are coupled with 3,5-dichlorophenyl diazonium in a strongly acidic medium at pH 1.0 to form azobilirubin. The color intensity of the red azo dye is directly related to the total bilirubin and is measured photometrically at 546/600 nm.
According to the IFU, both TSB methods are standardized to the American Association for Clinical Chemistry reference method according to Doumas et al.18 Additionally, the Siemens IFU claims traceability to standardized reference material from the National Institute of Standards and Technology (NIST SRM 916). Traceability to a higher-order reference material is not mentioned in the Roche IFU. Both Siemens and Roche IFU refer to the same thresholds from the 2004 American Academy of Pediatrics guideline19 for interpretation of neonatal TSB results.
EQA Program
TSB results across commercially available analyzers were compared using anonymized data from the Dutch External Quality Assessment Organization in Medical Laboratories (Stichting Kwaliteitsbewaking Medische Laboratoria, SKML, Nijmegen, The Netherlands) from the neonatal bilirubin survey 2021.3. The survey consisted of 6 samples of human serum spiked with unconjugated bilirubin, resulting in bilirubin levels ranging from 13.6 to 25.4 mg/dL as assigned using the reference method.18 In total, 66 laboratories were included with 131 analyzers covering 9 different platforms from 6 manufacturers (Table 1). As the Leica method was used by a single laboratory only, it was not further included in the analysis.
For all results, the percentage bias compared with the reference method was calculated. Per method, the mean bias was calculated over all 6 samples. Additionally, for each method the difference between the lowest and highest results was calculated per sample. The within-method variation was determined by dividing this absolute within-method difference by the average result per sample, and the resulting values were averaged over all 6 samples to obtain an overall within-method variation. As most Siemens Atellica and Advia users were known to apply a correction factor, results were calculated both with and without this factor to obtain more accurate data on the native method differences. The correction factors were obtained from the individual laboratories through personal communication.
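The calculations described above can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure used: the function names and example values are hypothetical, and the mean bias is taken here as the average of the per-sample mean biases, which is one possible reading of the description.

```python
from statistics import mean


def percent_bias(result, reference):
    """Percentage bias of one result relative to the reference value."""
    return 100.0 * (result - reference) / reference


def mean_bias(results_by_sample, reference_values):
    """Mean percentage bias of one method over all samples.

    results_by_sample holds, per EQA sample, the results reported by
    all laboratories using the method.
    """
    per_sample = [
        mean(percent_bias(r, ref) for r in results)
        for results, ref in zip(results_by_sample, reference_values)
    ]
    return mean(per_sample)


def within_method_variation(results_by_sample):
    """Average over samples of (highest - lowest) / mean result, in percent."""
    per_sample = [
        100.0 * (max(results) - min(results)) / mean(results)
        for results in results_by_sample
    ]
    return mean(per_sample)


# Hypothetical results (mg/dL) from 3 laboratories for 2 samples.
reference = [13.6, 25.4]
method_results = [[14.0, 14.6, 15.1], [26.0, 27.2, 28.3]]
print(f"mean bias: {mean_bias(method_results, reference):.1f}%")
print(f"within-method variation: {within_method_variation(method_results):.1f}%")
```

Note that this range-based measure is not a coefficient of variation in the statistical sense; it reflects the spread between the most extreme laboratories per sample.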
Clinical Patient Data
A single-center, retrospective cohort study was conducted at Rijnstate Hospital, a 780-bed teaching hospital in the Netherlands including a specialized neonatal care unit. All hospitalized newborns treated for hyperbilirubinemia between April 2016 and December 2018 (Roche Modular), February 2019 and October 2021 (Siemens Atellica raw), and January 2022 and December 2022 (Siemens Atellica corrected) were screened for eligibility. Newborns had to be referred from primary care, have a gestational age of at least 37 weeks, and be less than 2 weeks of age at referral. Pseudonymized clinical data were extracted from the hospital electronic patient records using CTcue data-mining software (CTcue BV, Amsterdam, The Netherlands). The following variables were collected: age, sex, birth weight, gestational age, direct antiglobulin test results (if performed), ABO-antagonism test results (if performed), duration of phototherapy, and length of stay. Data were analyzed using SPSS version 22.0 (IBM Corp, Armonk, New York). Descriptive statistics are presented as mean with SD for normally distributed continuous data, median and interquartile range for skewed continuous variables, and numbers and percentages for dichotomous and categorical variables. Student t tests and Mann-Whitney U tests were used to evaluate differences in mean and median values between the groups. The χ2 test or Fisher exact test was used to evaluate differences in percentages between the groups. The study was approved by the Institutional Review Board of Rijnstate Hospital.
RESULTS
Method Comparison
Figure 1, A, shows the correlation of the Siemens versus the Roche TSB for neonatal and adult samples. The results of the Deming regression analyses are presented in Table 2. For all samples, the Siemens method gave higher TSB results than the Roche method. However, the between-methods difference was considerably larger in neonatal samples than in adult samples (Figure 1, B, +17.4% and +3.7%, respectively).
TSB in EQA Program
Next, the overall performance of 8 commercially available TSB assays was compared using data from an EQA program (Figure 2). The mean bias as compared with the reference method value and the coefficient of variation per method are included in Table 1. The percentage bias and mean values for each of the 6 EQA samples are presented per method in Supplemental Tables 1 and 2 (see supplemental digital content at https://meridian.allenpress.com/aplm in the February 2024 table of contents). Interestingly, most Siemens Atellica and Advia users were known to use a correction factor. Therefore, for the Siemens Atellica and Advia groups, both the corrected and uncorrected results are displayed. On average, the correction factor was 0.85 (range, 0.82–0.88) and 0.88 (range, 0.85–0.90) for Atellica and Advia analyzers, respectively.
Clearly, the uncorrected Siemens vanadate oxidase method shows a strong positive bias compared with the expert value as determined using the reference method. The observed bias was comparable to the bias found in real neonatal samples. Interestingly, the bias observed in the EQA samples seems to overestimate the bias observed in adult samples (Figure 1, A and B).
The results of most other participants and methods were within 10% of the expert value. The mean bias ranged from −3.6% (Radiometer) to +20.2% (Siemens Atellica, uncorrected), indicating that the intermethod differences in TSB can be almost 25% (Table 1). The within-method coefficient of variation ranged from 5.2% (Siemens Dimension and Advia) to 16% (Roche), showing that even within the same method and manufacturer, differences among laboratories can be substantial and may lead to different patient management.
Clinical Patient Data
In 2 time periods of 33 months each, 296 newborns were included in the study: 57 in the period the Roche Modular analyzers were used and 239 after the change to Siemens Atellica. The number of births in the hospital’s surrounding area was nearly identical in the 2 periods: 10 492 and 10 493, respectively.20 Later, a third period of 12 months was included to analyze the effect of a correction factor that was implemented for neonatal TSB (vide infra). The baseline characteristics of the newborns are found in Table 3. Statistical analysis was performed only on the comparison of the first 2 groups, as the study focused mainly on the effects of the assay change. Also, the third group was much smaller.
Male to female ratio and birth weight were similar between groups. A significant difference in age at referral was observed; the median age of patients referred in the Siemens Atellica–raw period was 1 day less than in the Roche period. Additionally, newborns had a slightly higher gestational age at referral in the Siemens Atellica–raw period.
After the routine change in laboratory equipment, there was an almost threefold increase in the number of newborns treated for SNH (Figure 3, A). The duration of phototherapy was significantly longer in the Roche group (P = .04) despite a similar median value of 2 days in both groups (Figure 3, B). The length of stay tended to be slightly longer in the Siemens Atellica–raw group, but this was not statistically significant.
Coinciding with the change in laboratory method, there was an increase in the number of requests for the direct antiglobulin test and testing for ABO antagonism (Table 3). Both tests are requested in the follow-up of SNH in order to identify a potential cause. No significant difference was found in the number of positive results for these tests between both groups, indicating that there was no change in the incidence of conditions associated with SNH. The number of exchange transfusions was identical in both periods (1 each).
DISCUSSION
In this paper we describe how the routine change in laboratory equipment resulted in a dramatic increase in the number of newborns admitted and treated for hyperbilirubinemia. This was caused by a switch in TSB methods, from a diazo- (Roche) to a vanadate oxidase–based technique (Siemens). Analyses of EQA results from 8 commercially available TSB methods confirmed a substantial between-methods variability and also a considerable within-method variation for several assays and manufacturers. Consequently, icteric newborns are likely to receive different treatments in different hospitals, depending on the particular type of TSB assay in use. This is despite the fact that manufacturers claim that their assay is fit for use in the neonatal population and that assays are standardized to a reference procedure or reference material.
Although slight methodologic differences between the Roche and Siemens TSB assays were found during assay validation using leftover adult samples, the differences turned out to be much larger in neonatal samples. Because of the large positive bias of the Siemens vanadate TSB method, which was specifically found in neonatal samples, many more newborns were treated with phototherapy than during the period when the Roche assay was used. At the time of hospital admission, newborns were 1 day younger and had a slightly higher gestational age during the Siemens period compared with the Roche period (Table 3). Although the difference in mean TSB levels on admission between the 2 periods was not statistically significant (Table 3), the combination of clinical risk factors and higher absolute TSB values may have resulted in earlier exceedance of the phototherapy threshold for individual patients during the Siemens period.
Although the between-methods variation for TSB was previously described,7–9,12,14,21–25 our study provides a novel combination of method comparison with unique neonatal samples, EQA results, and real-life clinical data. Participation in an EQA scheme is important to uncover between-methods variability and to help improve diagnostic assays. A prerequisite, however, is that samples are commutable and are representative of the specific patient population. Unfortunately, no proven commutable EQA samples are available to date for neonatal TSB.9 Whereas Hulzebos et al9 described improvement in interlaboratory variation for TSB in 2021 compared with 2009 based on EQA data, our data show that the true variation is likely obscured because of the application of correction factors. A national study to further uncover these factors is currently being undertaken.
Furthermore, a recent study by Thomas et al12 also described considerable differences in TSB results among laboratories, which were ascribed to a lack of standardization in calibrator values. This is consistent with our findings, although we additionally found that the between-methods variation is substantially larger in neonatal than in adult samples.
Despite the fact that differences in TSB methods are well known, there seems to have been little effort toward improved harmonization or standardization. IFUs from both Siemens and Roche describe traceability to the same reference method. Although this perhaps leads to acceptable intermethod differences for adults, this is clearly not the case for newborns. The observed differences are likely related to a lack of knowledge of the measurand. Different (un)conjugated forms of bilirubin circulate in the bloodstream (ie, α, β, γ, and δ bilirubin), and bilirubin is also extensively metabolized into different photo-isomers and oxidation products.9,14,26 Presumably, these distinct bilirubin forms react differently in each analytical technique. Especially in neonates, this may influence TSB measurement, as different metabolites may be present compared with adults because of liver immaturity and altered bilirubin metabolism.27 Also, the international reference standard may not accurately reflect the different physiological bilirubin forms present in neonates. Although in vitro diagnostic companies may therefore have a TSB assay that is, strictly speaking, traceable to a higher-order international standard, harmonization is yet to be achieved, especially in neonatal samples.
A complicating factor in dealing with these TSB differences in newborns is the fact that many countries use the same TSB thresholds, which date back almost 30 years and have only recently been revised based on expert opinion.6 For the Roche method, it is known that analytical adjustments have resulted in lower absolute TSB results over the years.8,21 As it is unknown how either method relates to TSB thresholds in the nomograms, it cannot be determined whether the Roche method leads to undertreatment or the Siemens method to overtreatment.
Based on the professional experience and clinical judgement of our neonatologists, as well as patient outcome obtained during routine follow-up of jaundiced newborns, TSB levels as measured using the Roche method seemed clinically appropriate. We were, however, hesitant at first to introduce a lowering correction factor on the Siemens TSB results because of a risk of undertreatment and consequently an increased risk of permanent complications from hyperbilirubinemia. In contrast, overtreatment should also be avoided because of the potential side effects of phototherapy, such as interference with crucial bonding between infant and parents.28,29 After careful consideration of all the data presented in this study and weighing the risks of overtreatment or undertreatment, we decided to correct the Siemens TSB results with a factor of 0.85 for newborns younger than 3 months of age. The factor was based on EQA data from our laboratory (n = 126; 7 surveys of 6 samples each and measured on 3 Atellica systems) and confirmed by the comparison of Roche and Siemens TSB results in neonatal samples. As the Roche method was found to have a slight negative bias in the EQA data (Table 1), we decided not to base the correction factor directly on the Roche method. For older children, the intermethod differences in TSB are unknown, and for adults, there was no need for correction based on the acceptable comparability of TSB results (Figure 1, A and B). Upon implementation of the correction factor, a decrease in the number of newborns treated with phototherapy was observed (Figure 3, A; Table 3).
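The effect of the correction can be illustrated with a short sketch. The factor of 0.85 and the mean neonatal bias of +17.4% are taken from the text; the function name, the 90-day approximation of "3 months," and the example concentration are assumptions for illustration only and do not describe how the correction is configured in the laboratory information system.

```python
CORRECTION_FACTOR = 0.85    # factor reported in the text
AGE_LIMIT_DAYS = 90         # assumption: "3 months" approximated as 90 days
MEAN_NEONATAL_BIAS = 0.174  # +17.4% mean bias of vanadate versus diazo


def reported_tsb(raw_tsb, age_days,
                 factor=CORRECTION_FACTOR, age_limit_days=AGE_LIMIT_DAYS):
    """Return the TSB result to report for a raw vanadate measurement.

    The factor is applied only below the age limit; results for older
    patients are returned unchanged.
    """
    if age_days < age_limit_days:
        return raw_tsb * factor
    return raw_tsb


# Hypothetical neonate whose diazo result would have been 15.0 mg/dL.
diazo_equivalent = 15.0
raw_vanadate = diazo_equivalent * (1 + MEAN_NEONATAL_BIAS)
corrected = reported_tsb(raw_vanadate, age_days=3)
print(f"raw vanadate: {raw_vanadate:.2f} mg/dL, corrected: {corrected:.2f} mg/dL")
```

Because 1.174 × 0.85 ≈ 0.998, a factor of 0.85 brings an average neonatal vanadate result back to within about 0.2% of the corresponding diazo result, which is consistent with the statement that the factor was confirmed by the method comparison.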
As an alternative to TSB, measurement of unbound bilirubin seems promising, as this is probably the most critical parameter for the risk of bilirubin encephalopathy in newborns.30,31 Unfortunately, unbound bilirubin is not routinely measured, requires a dedicated analyzer, and is currently unavailable in Europe. Also, there are no thresholds for phototherapy or exchange transfusion based on free bilirubin.
In order to reduce variability in TSB results, and consequently in the treatment of newborns with hyperbilirubinemia, harmonization and standardization of the TSB assay are imperative. Although development of assay-specific cutoff values would likely also result in reduced treatment variation, it would probably lead to more questions from clinicians regarding which cutoff to use, leading to an increased risk of erroneous interpretation. Therefore, in our opinion, efforts from both laboratory specialists and manufacturers should be directed at harmonization of TSB assays. A national initiative to achieve more concurrent neonatal TSB results was recently started, and results are expected within a few years. Hopefully, this will result in a significantly reduced interlaboratory variation, although the uncertainty around the accuracy of the TSB thresholds remains.
In summary, our data highlight the importance of harmonization and standardization of TSB assays. Especially for neonatal samples, the differences between and within currently available commercial assays are substantial, thereby leading to different neonatal care across hospitals. During assay validation, clinical laboratory professionals need to be aware of the differences that can occur between neonatal and/or pediatric samples versus adult samples. For those assays that have a specific neonatal and/or pediatric application, such as neonatal TSB, samples from these patient groups need to be included during assay verification. The required performance specifications, for example total allowable error, also need to be considered in light of the clinical application for the specific patient group. If this cannot be achieved, for example because of a lack of samples during verification, careful monitoring should be performed upon implementation of the assay or analyzer into routine clinical care.
Finally, our data underline the importance of close collaboration between laboratory specialists and clinicians, and of integrating laboratory expertise into the development of clinical guidelines and thresholds, to prevent overtreatment or undertreatment upon the implementation of new analyzers, assays, or guidelines.
We would like to thank the Dutch External Quality Assessment Organization in Medical Laboratories (SKML; Stichting Kwaliteitsbewaking Medische Laboratoria) for providing the anonymized external quality assessment (EQA) data from the neonatal bilirubin survey. Furthermore, we would like to thank the Dutch Siemens Atellica and Advia users for sharing their raw total bilirubin data from the EQA samples and their correction factors.
Author notes
Supplemental digital content is available for this article at https://meridian.allenpress.com/aplm in the February 2024 table of contents.
The authors have no relevant financial interest in the products or companies described in this article.