The pathologist, sometimes referred to as the doctor's doctor, is the integrator of patient information. The pathologist serves other clinicians as a consultant, helping them understand the nature of a patient's disease and advising on its optimal management.1  We are good at what we do. A pathologist converts a largely 2-dimensional image into a classification that supports standard-of-care management of the patient.

However, we have lately been faced with a degree of expertise migration that is taking us out of our most proficient range. Although immunohistochemistry (IHC), in its most common use, helps us make a binary diagnostic decision, assays have recently been developed, and approved by the US Food and Drug Administration (FDA), that extend that decision into a recommendation for optimal therapy. These IHC companion diagnostic tests are not new (arguably, they began when Harvey et al2  showed the value of IHC for interpretation of estrogen receptor [ER] expression and the subsequent recommendation for endocrine therapy in breast cancer). But 3 new companion diagnostic tests are pushing the limits of pathologist interpretation of IHC stains because they require assessment of subjective parameters in largely nonreproducible assays.

The first of these is the set of programmed death ligand-1 (PD-L1) IHC tests that have been approved as companion diagnostics for immunotherapy.3  At first this seemed to be a great opportunity for pathologists to extend their expertise and add another companion diagnostic assay to their test menu. However, it was rapidly demonstrated that some aspects of the tests, developed by pharmaceutical companies and their diagnostic company partners, were not reproducible.4,5  Although the IHC tests are FDA approved and are frequently requested by our oncology colleagues, they ultimately put pathologists in an awkward position: we are left to explain to these colleagues why different tests and different cut points are needed for different immunotherapy drugs. For example, with the Agilent 22C3 assay for pembrolizumab, a lung cancer is called positive when at least 50% of tumor cells express PD-L1 (tumor proportion score ≥50%), whereas a breast cancer tested with the same assay for the same drug must have a combined positive score of 10 or more to be considered positive. Furthermore, we must explain that, in some cases, we cannot perform the requested FDA-approved test because our laboratories lack the autostaining platform required by the FDA-approved assay.6  The optimal solution for assessment of PD-L1 by IHC, from the pathologists' point of view, is yet to be determined.
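For readers less familiar with the two scoring conventions, the arithmetic behind these cut points can be stated compactly. The sketch below is a minimal illustration, not part of any validated assay workflow; the cell counts are invented and the function names are ours, but the formulas follow the published 22C3 scoring definitions (tumor proportion score as the percentage of viable tumor cells with membrane staining; combined positive score as PD-L1–positive tumor cells, lymphocytes, and macrophages divided by viable tumor cells, multiplied by 100 and capped at 100).

```python
# Minimal sketch of the two 22C3 scoring conventions discussed above.
# Cell counts would come from a pathologist's read; the values here are illustrative.

def tumor_proportion_score(pdl1_pos_tumor_cells: int, viable_tumor_cells: int) -> float:
    """TPS: percentage of viable tumor cells with membrane PD-L1 staining."""
    return 100.0 * pdl1_pos_tumor_cells / viable_tumor_cells

def combined_positive_score(pdl1_pos_tumor_cells: int,
                            pdl1_pos_lymphocytes: int,
                            pdl1_pos_macrophages: int,
                            viable_tumor_cells: int) -> float:
    """CPS: PD-L1-positive tumor cells, lymphocytes, and macrophages
    divided by viable tumor cells, x100, capped at 100."""
    raw = 100.0 * (pdl1_pos_tumor_cells + pdl1_pos_lymphocytes
                   + pdl1_pos_macrophages) / viable_tumor_cells
    return min(raw, 100.0)

# The same hypothetical read, two indications, two different decision rules:
tps = tumor_proportion_score(pdl1_pos_tumor_cells=550, viable_tumor_cells=1000)
cps = combined_positive_score(550, 60, 40, 1000)
print(f"NSCLC (pembrolizumab): TPS = {tps:.0f}% -> {'positive' if tps >= 50 else 'negative'}")
print(f"TNBC  (pembrolizumab): CPS = {cps:.0f}  -> {'positive' if cps >= 10 else 'negative'}")
```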

Close on the heels of the unresolved PD-L1 issue was the approval of Ki-67 IHC as a companion diagnostic for the CDK4/6 inhibitor abemaciclib to treat patients with high-risk, ER+ breast cancer on the basis of the monarchE trial.7  Here the FDA approval was based on the prognostic (not predictive) value of the assay and prescribed a 20% cut point, which has been shown by the International Ki67 in Breast Cancer Working Group to be nonreproducible.8  Although the Working Group showed that we pathologists are concordant and accurate when Ki-67 is less than 5% or more than 30%, the 20% cut point showed wide variability among pathologists, yet it was FDA approved.
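To see why the 20% cut point is so much more fragile than the extremes, consider a toy calculation. The numbers below are invented for illustration and are not drawn from the Working Group's data; they simply show that a modest inter-observer spread flips the treatment-relevant call only when the true value sits near the threshold.

```python
# Toy illustration (not Working Group data): a modest inter-observer spread
# flips the category only near the 20% cut point.

CUT_POINT = 20.0  # percent Ki-67-positive tumor nuclei

def ki67_category(score: float) -> str:
    return "high (>=20%)" if score >= CUT_POINT else "low (<20%)"

# Hypothetical paired reads of the same three slides by two pathologists.
paired_reads = [
    ("clearly low",   3.0,  4.5),   # both well below 5%: call is stable
    ("borderline",   18.0, 23.0),   # straddles 20%: call flips between readers
    ("clearly high", 35.0, 41.0),   # both well above 30%: call is stable
]

for label, read_a, read_b in paired_reads:
    agree = ki67_category(read_a) == ki67_category(read_b)
    print(f"{label:12s}: A {read_a:4.1f}% -> {ki67_category(read_a):12s} | "
          f"B {read_b:4.1f}% -> {ki67_category(read_b):12s} | agree: {agree}")
```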

As if these 2 new, questionably reproducible diagnostic tests were not enough, a third has been added. Recently, the antibody-drug conjugate trastuzumab deruxtecan was shown to be effective in patients with low human epidermal growth factor receptor 2 (HER2) expression (defined as HER2 = 1+ or HER2 = 2+ without ERBB2 gene amplification by fluorescence in situ hybridization).9  This has triggered FDA approval of an interpretation of the HER2 assay outside the dynamic range for which it was designed. An assay designed to separate high levels of HER2 (millions of molecules per cell) from lower levels (hundreds of thousands of molecules per cell or fewer) will now be used to distinguish tens of thousands from hundreds of thousands.10,11  Pathologists all understand the dynamic range of assays, and most know that an assay not optimized for the proper dynamic range is likely to fail. Failure, in this case, will be the pathologist's inability to distinguish HER2 = 0 from HER2 = 1+ in a reproducible manner.

The Rimm lab recently collected College of American Pathologists IHC Committee Survey data from 2019–2020 showing that survey cases in the HER2-low range had a high level of discordance among laboratories. Because this part of the surveys (the 0 versus 1+ cut point) is not graded (only consensus 3+ versus consensus 0 or 1+ is graded), it may not have come to the attention of many pathologists. But if we are required to distinguish 0 from 1+, the data in Fernandez et al12  show we will be unable to achieve consensus. That study engaged 18 pathologists from 15 institutions to read HER2 IHC on 170 breast core biopsies from the Yale archives (New Haven, Connecticut), scored according to the American Society of Clinical Oncology/College of American Pathologists guideline.13  Of 125 cases scored either 0 or 1+ by at least one pathologist, only 24 showed unanimous agreement for a score of 0, and only one case showed unanimous agreement for a score of 1+ (many cases called 1+ by some pathologists were called either 0 or 2+ by others). In an attempt to make oncologists aware of this issue, this work was recently published in JAMA Oncology.12  We are thus left with 3 FDA-approved assays that require the pathologist to make distinctions that cannot be made in a reproducible manner.
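To make the concordance problem concrete, the sketch below shows one simple way such multi-reader scores can be tabulated for unanimity and majority agreement. The score matrix is invented for illustration and is not the Fernandez et al data.

```python
# Sketch of tabulating reader agreement on HER2 IHC scores across cases.
# The score matrix below is invented for illustration only.
from collections import Counter

# rows = cases, columns = pathologists; entries are HER2 IHC scores
scores_by_case = [
    ["0",  "0",  "0",  "0"],    # unanimous 0
    ["0",  "1+", "0",  "1+"],   # split across the 0 vs 1+ cut point
    ["1+", "1+", "2+", "1+"],   # 1+ by most readers, 2+ by one
    ["1+", "1+", "1+", "1+"],   # unanimous 1+
]

def summarize(case_scores):
    counts = Counter(case_scores)
    unanimous = len(counts) == 1
    majority, n = counts.most_common(1)[0]
    return unanimous, majority, n / len(case_scores)

for i, case in enumerate(scores_by_case, start=1):
    unanimous, majority, frac = summarize(case)
    print(f"case {i}: scores={case} unanimous={unanimous} "
          f"majority call={majority} ({frac:.0%} of readers)")

n_unanimous = sum(summarize(c)[0] for c in scores_by_case)
print(f"{n_unanimous} of {len(scores_by_case)} cases are unanimous")
```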

This is the pathologist's conundrum: Should we simply acquiesce, pretend that we can make these diagnoses in a reproducible manner, and not “trouble” our oncology colleagues with this mess, or should we be honest, discuss this problem with them, and enlist them as allies to find a rational solution? Many pathologists may be concerned that this much sharing is bad for building client confidence and bad for career advancement, because it means admitting that we cannot provide a reproducible result for an FDA-approved test. Furthermore, some oncologists may minimize or even dismiss the importance of the issues we as pathologists face; they want solid results from solid tests and trust that FDA approval will ensure this. So, what should we do? It would be easiest to remain silent on the topic so as not to jeopardize pathologists' future collaborations with drug companies. However, we also should not be forced to issue diagnoses that are no more reproducible than the flip of a coin yet lead to $100 000 therapies. We would argue that the solution is to produce solid science that proves either the reproducibility or the nonreproducibility of these tests and then to enlist our oncology colleagues to help us require rigorous and reproducible assays prior to FDA approval. Perhaps these assays are not a conundrum but rather a call to action for pathologists to raise awareness that FDA approval of an assay that cannot reliably and reproducibly provide the information it is intended to provide is potentially as dangerous as approval of a bad drug.

1. AACR Pathology Task Force. Pathology: hub and integrator of modern, multidisciplinary [precision] oncology. Clin Cancer Res. 2022;28(2):265–270.
2. Harvey JM, Clark GM, Osborne CK, Allred DC. Estrogen receptor status by immunohistochemistry is superior to the ligand-binding assay for predicting response to adjuvant endocrine therapy in breast cancer. J Clin Oncol. 1999;17(5):1474–1481.
3. Doroshow DB, Bhalla S, Beasley MB, et al. PD-L1 as a biomarker of response to immune-checkpoint inhibitors. Nat Rev Clin Oncol. 2021;18(6):345–362.
4. Rimm DL, Han G, Taube JM, et al. A prospective, multi-institutional, pathologist-based assessment of 4 immunohistochemistry assays for PD-L1 expression in non-small cell lung cancer. JAMA Oncol. 2017;3(8):1051–1058.
5. Tsao MS, Kerr KM, Kockx M, et al. PD-L1 immunohistochemistry comparability study in real-life clinical samples: results of Blueprint phase 2 project. J Thorac Oncol. 2018;13(9):1302–1311.
6. Torlakovic E, Lim HJ, Adam J, et al. “Interchangeability” of PD-L1 immunohistochemistry assays: a meta-analysis of diagnostic accuracy. Mod Pathol. 2020;33(1):4–17.
7. Johnston SRD, Harbeck N, Hegg R, et al. Abemaciclib combined with endocrine therapy for the adjuvant treatment of HR+, HER2−, node-positive, high-risk, early breast cancer (monarchE). J Clin Oncol. 2020;38(34):3987–3998.
8. Nielsen TO, Leung SCY, Rimm DL, et al. Assessment of Ki67 in breast cancer: updated recommendations from the International Ki67 in Breast Cancer Working Group. J Natl Cancer Inst. 2021;113(7):808–819.
9. Modi S, Jacot W, Yamashita T, et al. Trastuzumab deruxtecan in previously treated HER2-low advanced breast cancer. N Engl J Med. 2022;387(1):9–20.
10. McCabe A, Dolled-Filhart M, Camp RL, Rimm DL. Automated quantitative analysis (AQUA) of in situ protein expression, antibody concentration, and prognosis. J Natl Cancer Inst. 2005;97(24):1808–1815.
11. Onsum MD, Geretti E, Paragas V, et al. Single-cell quantitative HER2 measurement identifies heterogeneity and distinct subgroups within traditionally defined HER2-positive patients. Am J Pathol. 2013;183(5):1446–1460.
12. Fernandez AI, Liu M, Bellizzi A, et al. Examination of low ERBB2 protein expression in breast cancer tissue. JAMA Oncol. 2022;8(4):1–4.
13. Wolff AC, Hammond MEH, Allison KH, et al. Human epidermal growth factor receptor 2 testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists Clinical Practice Guideline focused update. Arch Pathol Lab Med. 2018;142(11):1364–1382.

The authors have no relevant financial interest in the products or companies described in this article.