Context.—

Quantitative imaging is a promising tool that is gaining wide use across several areas of pathology. Although quantitative analysis has been increasingly adopted for morphologic and immunohistochemical evaluation, its adoption for fluorescence in situ hybridization (FISH) on formalin-fixed, paraffin-embedded tissue has been limited because of complexity and a lack of practice guidelines.

Objective.—

To perform human epidermal growth factor receptor 2 (HER2) FISH validation in breast carcinoma in accordance with the American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP) 2018 guideline.

Design.—

Clinical validation of HER2 FISH was performed using the US Food and Drug Administration–approved dual-probe HER2 IQFISH (Dako, Carpinteria, California) with digital scanning performed on a PathFusion (Applied Spectral Imaging, Carlsbad, California) system. Validation parameters evaluated included z-stacking, classifier, accuracy, precision, software, and hardware settings. Finally, we evaluated the performance of digital enumeration on clinical samples in a real-world setting.

Results.—

The accuracy samples showed a final concordance of 95.3% to 100% across HER2 groups 1 to 5. During clinical implementation for HER2 groups 2, 3, and 4, we achieved a final concordance of 76% (95 of 125). Of these cases, only 8% (10 of 125) had discordances with clinical impact that could be identified algorithmically and triaged for manual review.

Conclusions.—

Digital FISH enumeration is a useful tool to improve the efficacy of HER2 FISH enumeration and capture genetic heterogeneity across HER2 signals. Excluding cases with high background or poor image quality and manual review of cases with ASCO/CAP group discordances can further improve the efficiency of digital HER2 FISH enumeration.

Determination of human epidermal growth factor receptor 2 (HER2) amplification status by fluorescence in situ hybridization (FISH) analysis is an essential part of the management of invasive breast carcinoma to determine therapeutic eligibility for HER2-directed therapy and is considered the gold standard. Thus far, the scoring of HER2 FISH has mostly been done by manual enumeration. Although manual scoring is the mainstay of HER2 enumeration, it suffers from several limitations: it is very time and labor intensive, requires a high level of expertise for interpretation, tends toward biased inclusion of cells with higher HER2 signal copies, and limits the number of cells that can be scored, to name a few. In recent years, with the emergence of digital image analysis platforms, there has been increasing interest in adopting this technology for signal enumeration on FISH slides. Quantitative image analysis (QIA) or digital analysis can be used to aid in the detection and classification of HER2 FISH slides and overcome at least some of the limitations of manual scoring by providing unbiased scores, increasing the number of cells scored, providing an alternative/complement to manual scoring methods, and allowing the ability to archive more than single select images. An automated scanning system can also decrease the amount of time a pathologist needs to sign out a case while providing reproducible results. The College of American Pathologists (CAP) has revised HER2 testing guidelines multiple times during the last 15 years,1–4 and more recently it issued guidelines for implementation of whole slide imaging systems for diagnostic purposes in pathology and QIA for HER2 immunohistochemistry (IHC) for breast cancer.5,6 Although no guidelines currently exist for QIA specifically for FISH testing, with additional inclusion of FISH-specific parameters these guidelines for IHC can also be extrapolated to FISH QIA validations.

In this study, we report our experience with implementation of clinical validation of QIA/digital analysis for reporting HER2 FISH using the most recent American Society of Clinical Oncology (ASCO)/CAP 2018 guideline,2  incorporating applicable key elements required for assay validation as suggested in the above-referenced CAP guideline for whole slide imaging and HER2 IHC QIA.

FISH and Manual Enumeration

HER2 FISH was performed on 4- to 5-μm–thick formalin-fixed, paraffin-embedded tissues using the US Food and Drug Administration–approved dual-color HER2 IQFISH probe (Dako, Carpinteria, California) as per manufacturer instructions. Each FISH slide was scanned manually in its entirety, and representative tumor nuclei were enumerated in each case by a technologist and reviewed by a pathologist with expertise in solid tumor FISH.

For QIA, slides were prescanned at ×4 for tissue matching with corresponding hematoxylin-eosin (H&E) or HER2 IHC slides. Regions of interest were selected and scanned at ×100 magnification using a PathFusion system (Applied Spectral Imaging, Carlsbad, California) available in our laboratory. Single filter images were taken at several different levels throughout the 4- to 5-μm thickness (z-stacking) and then combined into 1 image by the software. For digital counts, a minimum of 2 regions of tumor and 8 digital frames per slide were analyzed. PathFusion analysis performs nuclear segmentation, spot counting, and cell classification. Nuclear segmentation is the identification and circling of nuclei by the software. Spot counting involves counting red and green signals that meet a predetermined pixel-size threshold associated with the size of the probe being evaluated. Cell classification involves tabulation of the red and green signal spot counts and sorting them into predefined groups. The FISH images that are initially generated by the PathFusion system require review and manual editing by a trained technologist. Editing typically involves correcting automated classification errors due to overlapping nuclei, inaccurate nuclear segmentation, or erroneous inclusion of nontumor nuclei. Once the nuclei are edited from the FISH images, the classification scores are incorporated into a validation report.

Z-Stacking/Normal Range Validation

HER2 FISH assay reference ranges recommended by the ASCO/CAP 2018 guideline2  for manual enumeration were used for digital enumeration. Because digital FISH enumeration uses z-stacking to mimic the manual focal adjustments made by the enumerator, validation of z-stacking was performed to: (1) help ensure that the z-stacking levels used by digital imaging capture hybridization signals through the entire depth of a nucleus and are similar to the number of signals seen on manual enumeration; (2) determine the number of cells that should be digitally enumerated to provide the most accurate result.

z-Stacking was set during development phase to 9 focal planes with a 0.7-μm interval between planes. This covers a 6.3-μm thickness, and given the 3- to 5-μm thickness specified by the HER2 IQFISH PharmaDx manufacturer’s instructions, should be sufficient to capture all signals in enumerated nuclei.

A minimum of 100 to 300 cells (range, 102–894) and 50 cells were scored on each slide digitally and manually, respectively, by 2 independent scorers on normal breast tissue. The scores were averaged and the 95th percentile calculated.

To standardize the results, cells/frames were deleted from right to left (to mimic the technologist stopping earlier), and results were recorded at 300, 200, 150, and 100 cell points for digital enumeration. Digital and manual scores were compared.
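The truncation step described above can be illustrated with a minimal sketch. It assumes that per-cell signal counts are available as a simple list in scan order and that "stopping earlier" amounts to keeping only the first N cells; the function name and the sample data are hypothetical and are not part of the PathFusion software.

```python
from statistics import mean

def averages_at_cutoffs(signals_per_cell, cutoffs=(300, 200, 150, 100)):
    """Average copy number using only the first N cells for each cutoff.

    signals_per_cell: per-cell signal counts in scan order (assumed layout).
    Cutoffs larger than the number of available cells are skipped.
    """
    results = {}
    for n in cutoffs:
        if len(signals_per_cell) >= n:
            results[n] = mean(signals_per_cell[:n])
    return results

# Hypothetical per-cell HER2 counts for one normal breast slide (300 cells).
cells = [2, 2, 1, 2, 3, 2] * 50
by_cutoff = averages_at_cutoffs(cells)
```

Comparing the values in `by_cutoff` across cutoffs gives a quick view of how stable the average is as fewer cells are counted.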

Number of Cells to be Enumerated

Current ASCO/CAP guideline and New York State regulations require enumeration of 20 to 40 cells and 50 cells, respectively, for manual HER2 FISH. A literature review for the optimal number of cells for enumeration for HER2 FISH showed that the average margins of error of HER2/CEP17 ratio and HER2 copy number were 0.40 and 0.53, respectively, when counting 20 cells. These decreased to 0.20 and 0.26 when counting 100 cells.7 

We evaluated invasive tumor in 16 samples at different cell cutoffs similarly to the method used for the normal breast range samples to arrive at the decision of the number of cells to be enumerated digitally.

Software Settings

The software settings were optimized during the development phase, with adjustments for reoptimization done quarterly or upon changing of the light bulb or source.

Classifier Setting

The classifier was programmed to sort cells into 3 categories based on average copy numbers per cell: less than 4 HER2, 4 to 5.9 HER2, and 6.0 or more HER2 in the development workup in line with the classification groups of the ASCO/CAP 2018 guideline. For CEP17, the classifier binned nuclei as 0 (not present) or greater than 0 (present). Cells with no CEP17 or HER2 present were not enumerated. Halfway through the validation process, working with the instrument vendor, a new classifier was designed that sorted cells into individual ASCO/CAP group designations. All possible signal combinations up to 10 HER2 and/or 10 CEP17 were evaluated to verify that each cell was correctly classified. Then the new classification system was checked on 2 to 5 samples from each group to verify that the group designation did not change. This classifier was then used for the remainder of the validation and for clinical implementation.
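As an illustration of the two classification schemes, the sketch below bins a single HER2 count into the three development-phase categories and assigns an ASCO/CAP 2018 dual-probe group from average HER2 and CEP17 copy numbers. The function names are hypothetical, the group assignment is shown at the sample level for simplicity, and this is not the vendor's classifier.

```python
def her2_bin(her2_copies):
    """Development-phase HER2 categories: <4, 4 to 5.9, and >=6 copies."""
    if her2_copies < 4.0:
        return "<4"
    if her2_copies < 6.0:
        return "4-5.9"
    return ">=6"

def asco_cap_group(mean_her2, mean_cep17):
    """ASCO/CAP 2018 dual-probe group (1 to 5) from average copy numbers."""
    if mean_cep17 <= 0:
        # Mirrors the rule that cells without CEP17 signal are not enumerated.
        raise ValueError("CEP17 must be present to compute a ratio")
    ratio = mean_her2 / mean_cep17
    if ratio >= 2.0:
        return 1 if mean_her2 >= 4.0 else 2
    if mean_her2 >= 6.0:
        return 3
    if mean_her2 >= 4.0:
        return 4
    return 5
```

For example, `asco_cap_group(4.5, 3.0)` has a ratio of 1.5 with 4.5 HER2 copies and therefore falls in group 4.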

Hardware/Monitor Requirements

The hardware and monitor quality necessary for digital review of samples was assessed to standardize across all users and for optimal image resolutions.

Accuracy

Accuracy of the PathFusion Imaging platform for use with HER2 FISH slides was established by comparison of results obtained using the digital QIA method to manual FISH scores. In the absence of the CAP guidelines for FISH QIA, the 2019 CAP guideline for QIA of HER2 IHC for breast cancer5  was extrapolated to FISH for the purpose of this validation.

Case Selection

A total of 168 cases were included in the study. A minimum of 40 clinical samples were selected from group 1 (n = 43), group 4 (n = 46), and group 5 (n = 42). Group 2 (n = 14) and group 3 (n = 23) tend to be uncommon, and all available cases were included for these groups. All cases were manually reviewed by a minimum of 2 reviewers and confirmed for group classification either before or after complementary HER2 IHC testing. Because the ASCO/CAP guidelines require equivocal HER2 FISH samples to have HER2 IHC performed and a second blinded FISH evaluation in the strongest circumferential membranous staining area of 2+ IHC staining, it was possible for samples to be assigned more than 1 group designation. For our validation, the ASCO/CAP group designation during the original manual H&E-guided enumeration was used. Digital FISH scans were performed in either tumor-enriched areas identified by an H&E slide, or, in the case of FISH groups 2 to 4, digital FISH scanning was performed in the strongest membranous protein-expressing areas marked on the HER2 IHC slides when HER2 IHC staining was 2+/equivocal. The technologist performing the digital scans was blinded to the manual results.

Discordance Definition

A sample was considered discordant between manual and QIA if on average: (1) there was a ±1.0 or more copy number difference without a change of group status, or (2) a sample changed group designation and had a ±1.0 or more copy number difference. However, if the group designation changed but the average difference was less than ±1.0 copy number the sample was considered “borderline but concordant.”

The only exception to the ±1.0 copy number rule was group 1 and group 3 cases with an average HER2 copy number of 6.0 or more, because higher HER2 copy numbers tend to have a higher level of variability in enumeration. Of note, such variability is also seen on manual scoring and has been reported in densely clustered FISH signals by others as well.8 These were considered discordant only if the group designation changed. There is a lack of literature consensus on the cutoff values for clinically significant changes between digital and manual results. Our lab uses the ±1 or more copy number cutoff because a normal cell has 2 copies of the gene of interest and if you lose 1 copy (2:1 control gene to target gene) you have a ratio of 0.5, which is a widely used criterion for gene deletion. On the reverse side, if you gain a copy of the gene, the cell is then defined as having trisomy of that gene and it becomes a clinically significant finding. The discordant samples were submitted for a blinded review by 2 pathologists to determine if the digital or manual count was more accurate. Concordance between digital and manual methods was calculated. Linear regression models and Bland-Altman plots were evaluated for correlation between the 2 methods and to identify outliers.
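The discordance rules above can be sketched as a small decision function. This is an illustrative reading of the criteria, not the laboratory's software: the function name is hypothetical, a single copy number difference is passed in (the same rule could be applied to CEP17), and the high-copy exception is assumed to be keyed to the manual result.

```python
def classify_agreement(manual_group, digital_group,
                       manual_her2, digital_her2, threshold=1.0):
    """Label a manual/digital pair using the study's discordance definition."""
    group_changed = manual_group != digital_group
    difference = abs(manual_her2 - digital_her2)

    # Exception: group 1 and 3 cases with >=6.0 HER2 copies are judged on
    # group designation only (assumption: based on the manual result).
    if manual_group in (1, 3) and manual_her2 >= 6.0:
        return "discordant" if group_changed else "concordant"

    if difference >= threshold:
        return "discordant"
    if group_changed:
        return "borderline but concordant"
    return "concordant"
```

A case scored 4.02 manually (group 4) and 3.98 digitally (group 5) would therefore be labeled "borderline but concordant".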

Precision

Instrument and operator precisions were evaluated as follows:

  • Instrument within-run precision: One HER2 FISH slide from each of the 5 groups was scanned in triplicate on the same day by the same operator.

  • Instrument between-run precision: One HER2 FISH slide from each of the 5 groups was scanned on 3 different days by the same operator.

  • Operator precision: One HER2 FISH slide from each of the 5 groups was scanned and analyzed by a minimum of 3 different operators.

Z Stacking/Normal Range

Differences between the 95th percentile cutoffs for each of the digital and the manual cell points ranged from −0.02 to 0.07 (Table 1). The average HER2 and CEP17 copy number varied from 2.00 to 2.07 and 1.85 to 1.90, respectively, which is close to the biologically expected normal of 2.0 copies per cell, establishing equivalence between digital and manual enumeration.

Table 1

The 95th Percentile Cutoffs for 300, 200, 150, and 100 Cell Counts for Digital and Manual Scores


We also compared the digital and manual counts for each individual case by creating a Bland-Altman plot. The average results between manual and digital were plotted against the difference between the 2 results (Figure 1). There was no significant bias for either HER2 or CEP17, although there was a slight trend for HER2 to have a higher copy number on the digital method. This corresponds with the 2 outliers seen on the HER2 graph. A review of these cases showed that there were cells with artifact and overlapping nuclei that were not edited and may reflect expected variance in clinical practice.
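For readers unfamiliar with the method, a Bland-Altman comparison can be sketched as follows. The sign convention (digital minus manual) and the 1.96 standard deviation limits of agreement are assumptions of this sketch, and the sample values are invented for illustration only.

```python
from statistics import mean, stdev

def bland_altman(manual, digital):
    """Return the per-case means and differences plus bias and limits of agreement.

    Differences are digital minus manual (assumed convention); the limits
    are bias +/- 1.96 standard deviations of the differences.
    """
    if len(manual) != len(digital) or len(manual) < 2:
        raise ValueError("need two equally sized series with at least 2 cases")
    diffs = [d - m for m, d in zip(manual, digital)]
    means = [(m + d) / 2 for m, d in zip(manual, digital)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return {"means": means, "diffs": diffs,
            "bias": bias, "limits": (bias - spread, bias + spread)}

# Hypothetical average HER2 copy numbers for four reference-range cases.
result = bland_altman([2.0, 2.1, 1.9, 2.0], [2.1, 2.1, 2.0, 2.2])
```

Cases whose difference falls outside `result["limits"]` would be the outliers to review.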

Figure 1

Reference range cases. A, Manual versus digital average CEP17 copy number. B, Manual versus digital average human epidermal growth factor receptor 2 copy number.


Number of Cells to Be Enumerated

In evaluating invasive tumor in 16 samples at different cell cutoffs (50, 100, 150, and 200), we found a difference of 0.92 or less in HER2 copy number, with an average difference of 0.11, between all cell cutoffs. Four samples resulted in a change in group designation; in 3 of these cases, this change occurred between 100 and 50 cell counts. The fourth case was borderline (HER2 of 3.98 at 100 cells versus 4.02 at 200 cells). Stability in FISH group designation was seen in all other cases once the minimum number of cells counted was above 100.

Classifier Setting

The PathFusion HER2 classifier program correctly classified cells into different groups based on the 2018 ASCO/CAP guideline for breast carcinoma. Samples enumerated with the CEP17 greater-than-0 classifier maintained the same group designation and/or had less than ±1.0 copy number difference (excluding group 1 or 3 cases with HER2 copy number ≥6) when enumerated with the PathFusion HER2 Classifier.

Software Settings

The settings were optimized during the development phase as a cell diameter of 75 to 300 and cell circularity of 1.5, z-stack of 9 planes at a 0.7-μm depth, with exposure times of 30 msec, 50 msec, and 75 msec for 4′,6-diamidino-2-phenylindole (DAPI), fluorescein isothiocyanate, and Texas Red, respectively. Per the laboratory’s standard operating procedure, the software exposure settings are adjusted quarterly to help correct for aging light sources or with a change of light bulb/source. There was 1 adjustment during the validation that changed the exposure times to 100 msec, 300 msec, and 300 msec for DAPI, fluorescein isothiocyanate, and Texas Red, respectively, per the manufacturer’s recommendation. The increase in DAPI exposure improved the prescan quality and autofocus portions of the digital scan with better segmentation of the cells by the software. Image quality improved significantly upon reoptimization.

Hardware/Monitor Requirements

A range of screen sizes and resolutions was evaluated (Table 2). Screens 24 inches or larger with a 1920 × 1080 resolution or higher were determined to be optimal. However, screens/laptops with smaller size and resolution were also deemed acceptable (Figure 2). The only major changes on smaller screens were the loss of the helpful close-up zoom box on the display, which was not critical for enumeration, and the need for additional scrolling to fully view all regions of the FISH image.

Table 2

Computer Monitors Evaluated During Validation

Figure 2

Screen size and pixel resolution. A, 15.6-inch screen. B, 24-inch screen. Both screen sizes were acceptable for visualization of probe signals.


Accuracy

For the final analysis, 21 of 189 cases (11.1%) were excluded because of inherent issues with the cases. Reasons for these included ink artifacts; quality issues, such as photobleaching; green mist; extensive stroma; and limited tumor/scattered tumor with fewer than 100 evaluable tumor cells (Figure 3). Six cases were excluded for circling errors or tissue selection errors on manual enumeration. No cases met the ASCO/CAP guideline for genetic heterogeneity.

Figure 3

Types of excluded fluorescence in situ hybridization (FISH) cases. A, Low cellularity/tumor on a FISH 4’, 6-diamidino-2-phenylindole scan. B, Indistinct cell borders on FISH slide. C, Limited and/or widely scattered tumor on hematoxylin-eosin slide, which is highlighted in the black circle with inset showing the entire tumor and the manual circling. D, Extensive stroma on FISH slide. E, Red ink artifact on FISH slides. F, Green mist on FISH slide (original magnifications ×5 [A], ×100 [B, D, E, and F], ×20 [C], and ×2 [C, inset]).


ASCO/CAP requires a 95% concordance for HER2− cases and a 90% concordance for HER2+ cases for HER2 IHC QIA. We observed an initial concordance between 78.3% and 100% for the individual groups and an overall concordance of 87.5% (147 of 168). The concordance rate for HER2+ cases (group 1) was 95.3% and that for HER2− cases (group 5) was 98.2%. A total of 85.7% (18 of 21) of the initial discordant cases were resolved in favor of the digital count upon a second blinded pathologist review, with a final concordance of 95.3% to 100% across individual groups (Table 3). Details of discordant cases are provided in Table 4, with example cases depicted in Figure 4.

Table 3

Summary of Accuracy Concordance

Table 4

Overview of Discordant Cases

Figure 4

Discordant cases. A, Selective counting on fluorescence in situ hybridization (FISH) where manual scores captured cells with higher human epidermal growth factor receptor 2 (HER2) copies (white arrows), and the digital scores also included cells with lower HER2 copies (red arrows). B, Case that did not qualify as genetic heterogeneity but had ∼1% of tumor (location shown with blue circle and displayed on inset, hematoxylin-eosin stain [H&E]), with a higher average HER2 copy number (2.1 versus 4.3) and no expression of HER2 protein on immunohistochemistry staining (C), which has been highlighted by a black arrow. D, H&E. Example of a case with some signal heterogeneity (3.7 versus 4.9 HER2) between the larger and smaller circles. A representative FISH scan is shown (E); the left panel is from the smaller circle and the right panel is from the larger circle. F, FISH scan of alphoid signals complicating CEP17 enumeration (original magnifications ×100 [A, E, and F], ×5 [B and D], ×10 [C], and ×40 [B, inset]).


The linear regression R² values were 0.816 for HER2 and 0.7748 for CEP17, indicating good correlation between the manual and digital methods (Figure 5). Outlier scores for cases were identified on Bland-Altman plots (Figure 6). The HER2 and CEP17 Bland-Altman plots each had a few outliers. All the HER2 outliers occurred in 10 or more HER2 copy number samples with clumped signal, which intrinsically have more variability in scoring even on manual enumeration. The CEP17 outliers were more evenly distributed, but the most significant outliers occurred at higher copy numbers. There was a trend for decreased average CEP17 and HER2 copy numbers with the digital method, −0.13 and −0.97, respectively. The CEP17 trend was not significant for FISH enumeration. The HER2 trend, although concerning, was more significant in cases with higher HER2 copies. Upon eliminating the highly amplified samples (≥10 HER2 copies) the difference was reduced (−0.56) and was within the acceptable variance for FISH enumeration (Figure 6, C).
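As a reminder of how the coefficient of determination relates to paired manual and digital scores, the sketch below computes it for a simple linear fit, where it equals the squared Pearson correlation. The function name and the sample values are illustrative only and do not reproduce the study data.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # For one predictor, R^2 is the squared Pearson correlation coefficient.
    return sxy ** 2 / (sxx * syy)

# Hypothetical paired averages: manual scores (x) versus digital scores (y).
perfect = r_squared([1, 2, 3, 4], [2, 4, 6, 8])
partial = r_squared([1, 2, 3], [1, 3, 2])
```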

Figure 5

Linear regression plots for accuracy cases. A, Manual versus digital average human epidermal growth factor receptor 2 (HER2) copy number. B, Manual versus digital average CEP17 copy number.

Figure 6

Bland-Altman plots for accuracy cases. A, Manual versus digital average human epidermal growth factor receptor 2 (HER2) copy number. B, Manual versus digital average CEP17 copy number. C, Manual versus digital average HER2 copy number excluding the highly amplified (amp) cases.


Precision

Instrument Within-Run Precision

One HER2 FISH slide from each of the 5 groups was scanned in triplicate on the same day by the same operator. The overall group designation remained the same for each sample with a 100% within-run precision.

Instrument Between-Run Precision

One HER2 FISH slide from each of the 5 groups was scanned on 3 different days by 1 operator. The overall group designation remained the same for each sample except for 1 group 2 sample, which showed borderline but concordant results per the discordant criteria established in the accuracy section. An additional sample from group 2 was subsequently tested, and the overall group designation remained the same between all 3 replicates for a 100% between-run precision.

Operator Precision

One HER2 FISH slide from each of the 5 groups was scanned and analyzed by a minimum of 3 different operators. The overall group designation remained the same in 22 of 25 counts for an overall operator precision of 88%. These discordances were specific to certain technologists and were easily addressed with additional training on the removal of nonneoplastic cells and correct classifier editing.

Clinical Implementation

To ensure similar results during clinical implementation and transition to the clinical laboratory, we used a side-by-side digital and manual review of all group 2, 3, and 4 cases requiring a second blinded IHC-guided scoring. In this phase, besides looking for concordance, we also monitored challenges for clinical implementation and specifically identified cases that would not be suitable for QIA and needed to be flagged for manual review. In our experience, we encounter genetic heterogeneity in approximately 0.5% of breast carcinoma cases,9  and these areas correspond to high protein expression that would in most instances be identified on IHC-directed FISH enumeration. Because IHC was not performed on all cases, the clinical implementation was limited to groups 2, 3, and 4 undergoing a second blinded IHC-directed FISH enumeration. Being a reference laboratory with a high volume of cases being reflexed to referral FISH testing, these groups comprise approximately 24% of our testing.10  Furthermore, because of their borderline nature, these cases also prove to be the most challenging cases for manual enumeration.

A total of 129 consecutive new samples prospectively collected (after completion of the validation phase) that were reflexed to IHC-guided scoring were evaluated by both manual and digital review as part of the clinical workflow. Four cases (3.1%) were unsuitable for digital review because of high background and poor image quality or fewer than 100 evaluable tumor cells. Of the remaining 125 cases evaluable by both digital and manual enumeration, 13 (10.4%) were borderline but concordant and 82 (65.6%) were concordant, for an overall concordance of 76%. The discordant cases comprised 16 group discordances (12.8%), 7 (range, 1.0–1.54 difference; mean, 1.20) HER2 copy number discordances (5.6%), and 7 (range, 1.02–2.76 difference; mean, 1.56) CEP17 copy number discordances (5.6%). However, only 10 (8%) of the discordant cases had a clinically significant discordance, that is, amplified versus not amplified (Table 5). The reasons for discordances were signal heterogeneity with the presence of scattered equivocal or amplified cells, inadequate matching of highest areas of protein expression to areas where the manual count was performed, improved unbiased digital counting, and cases with indistinct cell borders or artifact that should have been excluded from digital enumeration.
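The percentages in the preceding paragraph can be re-derived from the reported case counts. The sketch below is only an arithmetic check; the dictionary keys and the function name are labels chosen for illustration.

```python
# Case counts reported for the 125 evaluable clinical cases.
counts = {
    "concordant": 82,
    "borderline_concordant": 13,
    "group_discordant": 16,
    "her2_copy_discordant": 7,
    "cep17_copy_discordant": 7,
}

def concordance_summary(case_counts):
    """Return total cases, agreeing cases, and percent concordance."""
    total = sum(case_counts.values())
    agree = case_counts["concordant"] + case_counts["borderline_concordant"]
    return total, agree, round(100 * agree / total, 1)

total, agree, percent = concordance_summary(counts)
```

The five categories sum to the 125 evaluable cases, and the 95 concordant or borderline-concordant cases give the reported 76%.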

Table 5

Concordance of Digital and Manual Enumeration for Clinical Cases


We compared the digital enumeration results to the original and IHC-guided manual enumeration results to create criteria that would allow us to capture any clinically discordant cases. We evaluated the 10 cases that changed clinical significance between the second manual count and the digital count in this cohort. Of the 10 cases, 7 also had clinically significant group changes between the original and the digital counts and could be captured by implementing a manual review of cases with group changes with clinical impact, that is, amplified to not amplified or vice versa. The remaining 3 cases had borderline HER2 copy numbers. To ensure these had the most accurate results we decided to require a minimum of 200 cells and to score additional tumor regions for large tumors (Figure 7). Based on the cohort results, we anticipate these extra steps will only be required in ∼10% of cases and the manual reflex for clinically significant changes would occur in ∼16% of cases.
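The triage criteria described in this section can be sketched as a short decision function. This is a hedged reading of the text, not a reproduction of Figure 7: the function name, the order of the checks, and the returned labels are assumptions, and "borderline" is passed in as a flag because no numeric definition is given here.

```python
def triage_case(image_quality_ok, evaluable_cells,
                original_amplified, digital_amplified, borderline_her2):
    """Suggest a next step for one case (illustrative labels only)."""
    # Cases with high background, poor images, or <100 tumor cells
    # were unsuitable for digital review.
    if not image_quality_ok or evaluable_cells < 100:
        return "manual enumeration"
    # A change with clinical impact (amplified versus not amplified)
    # between the original manual and the digital result goes to manual review.
    if original_amplified != digital_amplified:
        return "manual review"
    # Borderline HER2 copy numbers call for at least 200 cells and,
    # for large tumors, additional tumor regions.
    if borderline_her2 and evaluable_cells < 200:
        return "score more cells or regions"
    return "report digital result"
```

For instance, a good-quality case with 150 evaluable cells, no change in amplification status, and a borderline HER2 count would be sent back for additional scoring before reporting.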

Figure 7

Algorithm for triaging cases for manual review. Abbreviation: HER2, human epidermal growth factor receptor 2.


Our validation study showed an excellent concordance of greater than 95% between manual and QIA across all groups of HER2 classification meeting the ASCO/CAP requirements of concordance. The QIA also showed equivalence for reference range, within-run, between-run, and observer precision to the manual methods. For our validation, at least 7 observers formally trained for both digital and manual FISH enumeration were involved in digital and manual enumeration of the slides, capturing interobserver variability that occurs in clinical practice. In this study, we used the scanner available in our laboratory that uses automated nuclear segmentation and spot counting with an option for manual review and editing before finalizing the results on a case. We also used clinical samples in our laboratory, thus providing a simulation of the actual clinical practice in a high-volume laboratory with similar results. Importantly, the exercise of clinical implementation allowed us to understand and create a clinical algorithm that would help triage and identify cases with potential clinical impact and flag them for additional manual review. This was very helpful in understanding the limitations of digital enumeration and excluding cases with high background, artifact, indistinct cell borders, and scattered tumor cells from digital review. Furthermore, cases with clinically significant group changes between the original manual enumeration and IHC-guided digital enumeration are best scored manually.

FISH QIA has thus far used different technologies, such as DAPI-based tile sampling using equal-sized tiles that include 1 to 2 nuclei or nuclear sampling using nuclear segmentation followed by spot counting of fluorescence signals by Metafer 4 (MetaSystems) and EIKONA3D Alpha Tec Ltd.11–13 The Visia imaging D-Sight platform provides the option of screening at ×20 followed by scanning at ×100 for a focused scoring in frames in the region of interest captured using z-stacking.14 Other platforms include the Leica HER2 FISH system for Bond.15 More recently, convolutional neural network and difference of Gaussian-based signal enumeration methods have been explored.8 Top hat and bottom hat filters for image analysis are yet another promising technique.16

All these methods, like ours, have demonstrated a high degree of concordance between manual and digital enumeration, providing support for the adoption of QIA for FISH enumeration. Some of these also had the advantage of manual editing that improved the accuracy of HER2 analysis and was an important component of the analysis by eliminating truncated and nontumor nuclei, similar to the practice of manual enumeration. Although all these studies have shown a slightly lower concordance for equivocal cases, understanding the limitations of digital analysis and appropriate selection of cases can reduce these errors. These errors are mostly caused by issues related to the tissue, inappropriate selection of tumor regions for scoring or analysis methods, or, importantly, genetic heterogeneity. The option of scanning the whole slide at ×20 can overcome this limitation.8,9,14,16  FISH enumeration can also be further simplified by matching areas of highest protein expression for digital enumeration in place of scanning the entire tumor on the FISH slide, which is very time intensive. This can be a useful tool to capture genetic heterogeneity.9  Furthermore, digital images provide for longer and more accessible image archival.

Lastly, the scope of our validation project did not include an analysis of cost and time savings. During the initial phase of transition, digital analysis may take longer than manual enumeration; we are hopeful that with familiarity and team effort, each user will save time and the accuracy of FISH testing will improve, specifically for equivocal cases.

In conclusion, this validation study demonstrates the utility of QIA in providing highly reliable HER2 FISH analysis by a more objective enumeration of a greater number of cells, removing manual scoring bias and, importantly, improving standardization across scoring, with similar results to manual methods.

1. Wolff AC, Hammond ME, Hicks DG, et al. Recommendations for human epidermal growth factor receptor 2 testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists clinical practice guideline update. J Clin Oncol. 2013;31(31):3997–4013.
2. Wolff AC, Hammond MEH, Allison KH, et al. Human epidermal growth factor receptor 2 testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists clinical practice guideline focused update. Arch Pathol Lab Med. 2018;142(11):1364–1382.
3. Wolff AC, Hammond ME, Schwartz JN, et al. American Society of Clinical Oncology/College of American Pathologists guideline recommendations for human epidermal growth factor receptor 2 testing in breast cancer. Arch Pathol Lab Med. 2007;131(1):18–43.
4. Bast RC, Ravdin P, Hayes DF, et al. 2000 update of recommendations for the use of tumor markers in breast and colorectal cancer: clinical practice guidelines of the American Society of Clinical Oncology. J Clin Oncol. 2001;19(6):1865–1878.
5. Bui MM, Riben MW, Allison KH, et al. Quantitative image analysis of human epidermal growth factor receptor 2 immunohistochemistry for breast cancer: guideline from the College of American Pathologists. Arch Pathol Lab Med. 2019;143(10):1180–1195.
6. Evans AJ, Brown RW, Bui MM, et al. Validating whole slide imaging systems for diagnostic purposes in pathology. Arch Pathol Lab Med. 2022;146(4):440–450.
7. Polonia A, Caramelo A. HER2 in situ hybridization test in breast cancer: quantifying margins of error and genetic heterogeneity. Mod Pathol. 2021;34(8):1478–1486.
8. Hofener H, Homeyer A, Forster M, Drieschner N, Schildhaus HU, Hahn HK. Automated density-based counting of FISH amplification signals for HER2 status assessment. Comput Methods Programs Biomed. 2019;173:77–85.
9. Wilcock DM, Sirohi D, Coleman JF, Gulbahce HE. Digital imaging correlation of immunohistochemistry and fluorescence in situ hybridization in breast carcinoma cases with HER2 genetic heterogeneity. Hum Pathol. 2022;126:129–135.
10. Lin L, Sirohi D, Coleman JF, Gulbahce HE. American Society of Clinical Oncology/College of American Pathologists 2018 focused update of breast cancer HER2 FISH testing guidelines: results from a national reference laboratory. Am J Clin Pathol. 2019;152(4):479–485.
11. Furrer D, Jacob S, Caron C, Sanschagrin F, Provencher L, Diorio C. Validation of a new classifier for the automated analysis of the human epidermal growth factor receptor 2 (HER2) gene amplification in breast cancer specimens. Diagn Pathol. 2013;8:17.
12. Theodosiou Z, Kasampalidis IN, Karayannopoulou G, et al. Evaluation of FISH image analysis system on assessing HER2 amplification in breast carcinoma cases. Breast. 2008;17(1):80–84.
13. Schunck C, Mohammad E. Automated analysis of FISH-stained HER2/neu samples with Metafer. Methods Mol Biol. 2011;724:91–103.
14. van der Logt EM, Kuperus DA, van Setten JW, et al. Fully automated fluorescent in situ hybridization (FISH) staining and digital analysis of HER2 in breast cancer: a validation study. PLoS One. 2015;10(4):e0123201.
15. Ohlschlegel C, Kradolfer D, Hell M, Jochum W. Comparison of automated and manual FISH for evaluation of HER2 gene status on breast carcinoma core biopsies. BMC Clin Pathol. 2013;13:13.
16. Reljin B, Paskas M, Reljin I, Konstanty K. Breast cancer evaluation by fluorescent dot detection using combined mathematical morphology and multifractal techniques. Diagn Pathol. 2011;6(suppl 1):S21.

Author notes

The authors have no relevant financial interest in the products or companies described in this article.

Results from this study were presented in part at the 2022 College of American Pathologists annual meeting; October 10, 2022; New Orleans, Louisiana.

Sirohi is now at the Department of Pathology, University of California San Francisco.