The time has come to regulate clinical immunohistochemistry (IHC) as an assay rather than a stain. Since IHC originally evolved as an extension of special stains in anatomic pathology,1 regulating IHC as a stain made sense. A lot has changed since then.1 In this article, we share our perspective explaining why IHC testing should be regulated similarly to immunoassays in clinical pathology. Similar checklist requirements should apply because the same quality assurance (QA) principles and methods are relevant to both. Right now, clinical IHC and clinical immunoassay checklist requirements bear little resemblance to each other. Contemplating such a change has far-reaching implications and will take several years to implement. It also requires the participation of the in vitro diagnostics industry. In the interest of patient care, it is time to start the discussion. The College of American Pathologists (CAP) could take a leading role.
Also see p. 1232.
Context is important, and context has changed. IHC employed as a tissue stain differs significantly from IHC used as an assay. IHC was introduced into routine diagnostic surgical pathology almost half a century ago to improve cell and tissue recognition.1 The original intent was to employ it as a more specific “special stain” in much the same fashion as methyl green pyronine in histopathology (plasma cells) or Sudan black in cytopathology (acute myeloblastic leukemia).
The context began to change in the 1990s when IHC methods were adapted to testing for estrogen receptors and progesterone receptors in formalin-fixed, paraffin-embedded (FFPE) sections, replacing cytosol-based assays. A greater sense of urgency was imparted in 1998 with US Food and Drug Administration (FDA) approval of HercepTest (Dako, now Agilent) alongside the related drug, Herceptin (trastuzumab; Genentech, now Roche), heralding the era of companion diagnostics. The need for more rigorous methods of analytic standardization became paramount; namely, the requirements of an assay, not a stain. Numerous additional biomarker-targeted drugs followed. FDA approval of HercepTest established a model for the development of a burgeoning series of IHC-based companion diagnostic tests, none of which meet the demands of a fit-for-purpose assay that is both accurate and quantitative. The necessity to accurately distinguish human epidermal growth factor receptor 2 (HER2) 0 from HER2 1+ is only the most recent example of the need to achieve substantially higher levels of precision and accuracy than we are currently capable of. Without intervention, there will continue to be an ever-increasing gulf between IHC’s current performance capabilities and what is required.
Despite exponential growth in IHC testing, the QA methods are still those of a histology stain. IHC QA has not sufficiently adapted over the decades to fit the many new purposes to which it is applied. To explain what is missing, compare the CAP checklist requirements of clinical IHC assays to clinical immunoassays relating to assay controls and calibration (Table).
Both types of laboratory testing require the use of daily positive controls. Of necessity, every IHC laboratory uses a different (nonstandardized) control tissue. Clinical immunoassay requirements, on the other hand, go much further. The CAP Master Chemistry & Toxicology checklist’s item CHM.13900 requires controls with analyte concentrations at relevant decision points. For IHC, control samples can express any target concentration, even though high-concentration samples beyond the assay linear range make for insensitive controls. CHM.14000 requires identifying a control range, which is quantified and recorded. For IHC, pass/fail decisions are subjective and dependent on visual inspection. CHM.14300 requires Levey-Jennings graphing (or similar) so that assay performance trends can be spotted early. For IHC, control quantification and tracking do not exist. These additional clinical pathology requirements (CHM.13900, CHM.14000, CHM.14300) allow personnel to detect assay problems objectively before they affect patient results.
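To make the idea of a quantified control range and Levey-Jennings tracking concrete, the following is a minimal sketch in Python. It assumes that each run yields one numeric control reading; the baseline values, the function names, and the particular rule set (a small subset of the commonly used Westgard rules) are illustrative assumptions, not requirements taken from the CAP checklist.

```python
from statistics import mean, stdev

def control_limits(baseline):
    """Return (mean, SD) of a baseline series of control readings."""
    return mean(baseline), stdev(baseline)

def levey_jennings_flags(values, center, sd):
    """Label each control reading using three simple Westgard-style rules.

    1_3s: one reading beyond +/-3 SD (reject the run)
    2_2s: two consecutive readings beyond 2 SD on the same side (reject)
    1_2s: one reading beyond +/-2 SD (warning only)
    """
    flags = []
    prev_z = None
    for v in values:
        z = (v - center) / sd
        if abs(z) > 3:
            flag = "reject:1_3s"
        elif (prev_z is not None and abs(z) > 2 and abs(prev_z) > 2
              and z * prev_z > 0):
            flag = "reject:2_2s"
        elif abs(z) > 2:
            flag = "warn:1_2s"
        else:
            flag = "ok"
        flags.append((v, round(z, 2), flag))
        prev_z = z
    return flags

# Hypothetical baseline readings (arbitrary units) used to set the range.
baseline = [100, 102, 98, 101, 99, 103, 97, 100, 101, 99]
center, sd = control_limits(baseline)

# Hypothetical daily control readings checked against that range.
for reading, z, flag in levey_jennings_flags([101, 96, 105, 104, 99], center, sd):
    print(reading, z, flag)
```

Plotting the z-scores against run date gives the familiar Levey-Jennings chart; the point of the sketch is only that a numeric control turns a subjective pass/fail decision into a recorded value that can be trended.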
The difference between the 2 laboratory testing disciplines is even larger in the context of assay calibration. Calibration is a foundational principle in the clinical laboratory, ensuring the accurate reporting of patient results. It is the accepted process for standardizing laboratory testing across the globe. Calibration is also essential for linking a clinical trial assay with the subsequent commercial assays upon which patient care depends. While we would not calibrate a Giemsa or periodic acid–Schiff stain, IHC is different. Assays require calibration even if they are qualitative, with test results reported as positive or negative. For example, qualitative clinical immunoassays such as an HIV serology test are calibrated at the threshold concentration separating positive and negative.
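As a rough illustration of what calibrating a qualitative assay at its threshold can look like, the sketch below derives a cutoff from replicate readings of a threshold calibrator and reports each sample as a signal-to-cutoff ratio. The function names and numeric values are hypothetical, and real assays define their cutoff (and any equivocal zone) according to the manufacturer's instructions.

```python
from statistics import mean

def cutoff_from_calibrator(calibrator_signals):
    """Cutoff signal taken as the mean of replicate threshold-calibrator readings."""
    return mean(calibrator_signals)

def classify(sample_signal, cutoff):
    """Return (signal-to-cutoff ratio, qualitative call) for one sample."""
    ratio = sample_signal / cutoff
    return ratio, ("positive" if ratio >= 1.0 else "negative")

# Hypothetical replicate readings of a calibrator set at the decision threshold.
cutoff = cutoff_from_calibrator([0.48, 0.52, 0.50])

# Hypothetical patient sample signals.
for signal in (0.20, 0.75):
    print(signal, classify(signal, cutoff))
```

Because the positive or negative call is anchored to a calibrator rather than to visual judgment, two laboratories using the same calibrator should place the threshold at the same analyte concentration.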
As a result of the disparity in CAP checklist requirements, pathologists are not sufficiently empowered with the information they need to identify IHC assay problems. Approximately 90% of IHC errors are false-negative results caused by weak or absent staining.2 Such errors are obvious to managers of external QA (EQA) programs that incorporate a central review of stained slides. These programs include Nordic Immunohistochemical Quality Control (NordiQC, https://www.nordiqc.org), United Kingdom National External Quality Assessment–Immunocytochemistry/In-situ-Hybridization (UKNEQAS-ICC/ISH, www.ukneqasiccish.org), and Canadian Biomarker Quality Assurance (CBQA, www.cbqa.ca).
Broad variability can be observed by directly evaluating hundreds of laboratories’ stained slides.2–4 Approximately 10% to 30% of the labs either fail or demonstrate borderline performance.2–4 Without the comparison to a large peer group, the pathologist at his or her institution is challenged to distinguish a true-negative from a false-negative test result. A high positive batch control (per the current CAP checklist) will detect complete failure of an entire staining run, but there are many other causes of a false-negative test result. The additional checklist requirements associated with clinical immunoassays (Table) specify a more comprehensive quality system that detects these other sources of assay variability. The comprehensive quality system provides information to keep laboratory testing aligned with peers, even without the EQA peer data. This system has worked successfully for decades in clinical pathology.
Pathologists require objective information about their laboratory’s IHC stains daily or weekly so that corrective action can be taken promptly. In clinical immunoassay testing, such information is delivered in a timely fashion via quantitative controls and calibrators (Table). As a result, EQA programs are not the primary source of objective assay performance feedback in clinical pathology. They are a secondary check.
So we ask: “What is the explanation for the striking differences in the clinical IHC and clinical immunoassay checklists?” There are a number of possible responses.
First, it might be argued that an upgraded checklist is not needed for clinical IHC because trained anatomic pathologists will detect errors such as false-negative or false-positive staining. Unfortunately, the data say otherwise. To put it in perspective, almost all IHC assays are qualitative, with a binary output (positive or negative). Just by guessing, laboratories (pathologists) have a 50/50 chance of reporting correct IHC test results without examining (or even performing) the test. The data show that IHC testing improves those odds by another 20% to 40%. Different labs testing the exact same FFPE tissue samples generally disagree 10% to 30% of the time.2–4 Since the EQA surveys use the exact same samples, the errors are not preanalytic (although preanalytic or fixation issues undoubtedly worsen the problem). Qualitative clinical immunoassays, following the more stringent regulatory requirements, perform at least 10-fold better, with interlaboratory concordance rates around 99%.
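The arithmetic behind these figures can be laid out in a few lines. The sketch below uses only the percentages quoted above and assumes that the 20% to 40% improvement is meant in percentage points added to the 50% chance baseline; the variable names are illustrative.

```python
CHANCE_CONCORDANCE = 50  # percent correct when a binary result is guessed

def discordance(concordance_pct):
    """Percent of results that disagree, given percent that agree."""
    return 100 - concordance_pct

# IHC: chance baseline plus the quoted 20 to 40 percentage-point improvement.
ihc_concordance = [CHANCE_CONCORDANCE + gain for gain in (20, 40)]
ihc_discordance = [discordance(c) for c in ihc_concordance]

# Qualitative clinical immunoassays: about 99% interlaboratory concordance.
immunoassay_discordance = discordance(99)

# Ratio of IHC discordance to immunoassay discordance.
fold = [d / immunoassay_discordance for d in ihc_discordance]

print("IHC concordance (%):", ihc_concordance)
print("IHC discordance (%):", ihc_discordance)
print("Fold difference in discordance:", fold)
```

Under that reading, the implied IHC concordance of 70% to 90% matches the 10% to 30% interlaboratory disagreement reported by the EQA programs, and the discordance gap relative to immunoassays is 10-fold or more.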
Second, it might be argued that clinical IHC is simply a stain, not an assay. However, clinical IHC and clinical immunoassays use similar reagents, detect similar types of proteins, follow similar reaction steps, and are governed by similar types of immunologic and enzymatic reactions. Therefore, both could be held to a similar standard in the analytic phase.
Third, there are of course the logistic and economic concerns that implementing the proposed upgraded QA system is too difficult or would cost too much. Change is difficult, particularly when it requires relearning or retooling from a practice that has not changed in decades. We suspect that the costs of test errors to both individual patients and the healthcare industry in general far outweigh the additional difficulty or expense. It is hard to envision why those barriers of cost or difficulty apply to clinical IHC but not to clinical immunoassays.
Fourth, the samples (specimens) utilized in clinical IHC and clinical immunoassay are different. That is true and it does lead to dramatically different (preanalytic) sample handling requirements. However, all the checklist requirements in the Table relate to the analytic part of the test or assay. Therefore, they are relevant regardless of sample type.
Fifth, it might be argued that translating the QA requirements of immunoassays to clinical IHC is not technically feasible. This is no longer true. IHC calibrators are now incorporated into international studies in collaboration with NordiQC, UKNEQAS-ICC/ISH, and CBQA.5 These studies will establish IHC target analytic sensitivity intervals, which will help pathologists track their own laboratory’s assay performance. Quantitative IHC controls suitable for Levey-Jennings graphical analysis have been cleared by the FDA.6
Finally, the authors recognize that the far-reaching changes proposed herein will take time. Within this overall context, it is clear that use of IHC as a type of “special stain” in anatomic pathology is likely to persist for the foreseeable future and that QA methods and controls have yielded some improvements in reproducibility in a “fit-for-purpose” approach.1,2 In this modality the IHC test result is interpreted and reported by the anatomic pathologist in concert with the morphologic features of the case, which is reflected in assignment as Class I IHC by the FDA. Equally, it is clear that with the advent of predictive biomarkers and companion diagnostics, the basic IHC method has been employed to “fit a different purpose,” having requirements for quantifiable results. For this new purpose, the intended use of IHC is not simply as a stain, but rather as an assay. Typically it produces “stand-alone” results, and is therefore defined by the FDA as Class II or III IHC with more stringent requirements.7 It is in this latter application, the development of a “fit-for-purpose” IHC assay, that higher levels of IHC quality assurance will have the greatest impact, enhancing the utility of predictive IHC biomarkers that are used to stratify patients into specific treatment protocols. Therefore, predictive markers might be phased in first, followed by the remaining IHC applications as laboratories develop experience in the use of new control systems and enhanced QA practices.
It is time to envision and plan the IHC CAP checklist of 2026 and beyond. CAP took a proactive role early in the evolution of clinical pathology standards through its Clinical Pathology Standards Program. A similar model might be applicable now for anatomic pathology as we turn a page in developing IHC standards of practice. The pathology community can develop novel strategies to solve these problems by articulating new regulatory goals. We can explore the opportunity to engage with industry in creating the new tools that pathologists require in caring for their patients. The ability of industry to develop improved hardware and software for image analysis and quantification methods goes hand in glove with improved technical standards in IHC, and will prove essential to the reading and interpretation of a quantitative assay.1,7,8
A further benefit of utilizing a standardized analytic process for IHC is that it becomes possible to separate preanalytic from analytic variation in IHC results. A better understanding of the impact of preanalytic variation that results from the use of nonstandardized FFPE tissues may lead to methods that control for and compensate for preanalytic variables, including the use of internal reference standards, rendered quantifiable by the availability of true quantitative external controls.7,8
In the final analysis, IHC must be addressed by integrating preanalytic, analytic, and postanalytic phases in a total test approach.1,7 By these means, the direct measurement of protein within cells and tissues, “in-situ proteomics,” is now becoming possible. The tools are becoming available. Essential among these tools is an improved regulatory process encompassing the methods and goals.
As future directions in IHC regulation are considered, the team making those recommendations might optimally include both clinical and anatomic pathologists. Across the world, we are, after all, pathologists first and foremost. While this topic relates to anatomic pathology, there is clear advantage in “borrowing” QA principles and practice that have proven successful in clinical pathology.
The authors have no relevant financial interest in the products or companies described in this article.